In [2]:
import sys
sys.path.append("../..")
from bayesianquilts.tf.parameter import Interactions, Decomposed


Let's consider extending the linear regression problem

$$ y = \vec{x} \cdot \vec{\beta} $$

so that $\vec{\beta}$ can vary across groups. Let's assume that $\vec{\beta}$ is $100$-dimensional.

Let's say you have some number of **discrete** factors over which you would like to vary $\beta$, for example, you might have sex, race, and whether one is currently a smoker. Let's assume you have 5 possible values for race, two for sex, and two for smoker status. Then you have $2\times 5\times 2=20$ different possible groups over which to fit your model, and $20\times 100=2000$ total regression parameters.
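
As a quick check of the arithmetic above, the groups can be enumerated directly. This is a minimal sketch using only the standard library; the variable names are illustrative and are not part of `bayesianquilts`.

```python
from itertools import product

# Number of levels for each discrete factor, as stated in the text.
levels = {"sex": 2, "race": 5, "smoking": 2}
beta_dim = 100  # dimension of the regression vector

# Every combination of factor levels defines one group.
groups = list(product(*(range(n) for n in levels.values())))
n_groups = len(groups)
n_params = n_groups * beta_dim

print(n_groups, n_params)  # 20 2000
```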

A common approach is to divide the data into these 20 groups and fit 20 separate models. Even in this simple example one can see how this procedure would be statistically problematic: each model sees only a fraction of the data, yet still has to estimate all 100 coefficients. We would like to define a way of regularizing such a problem.

# Parameter decomposition method

In dividing the data into 20 groups, one doesn't allow the groups to share information. Instead, let's consider decomposing the parameter $\beta$ in terms of the order of interactions, so $\beta=\beta_0 + \beta_{sex}+\beta_{race}+\beta_{smoking} + \beta_{sex, race} + \ldots $

## Why?

We want to do this because it lets us regularize higher-order terms more strongly, shrinking the higher-order contributions toward zero by default. This procedure partially pools the information in the dataset, so that the model is effectively lower-order except where the data supports higher-order contributions.
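
One way to picture order-dependent regularization is a prior scale that shrinks as the interaction order grows. The sketch below is purely illustrative: the geometric schedule and the names `base_scale` and `decay` are assumptions made for this example and are not taken from `bayesianquilts`.

```python
from itertools import combinations

factors = ["sex", "race", "smoking"]

# Hypothetical schedule: the prior scale shrinks geometrically with order,
# so higher-order components are pulled harder toward zero.
base_scale = 1.0
decay = 0.5

prior_scale = {}
for order in range(len(factors) + 1):
    for subset in combinations(factors, order):
        prior_scale[subset] = base_scale * decay ** order

for subset, scale in prior_scale.items():
    print(subset, scale)
```

With these made-up values the global term gets scale 1.0, main effects 0.5, pairwise interactions 0.25, and the three-way interaction 0.125.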

The `Interactions` and `Decomposed` classes work together to create this decomposition:

In [3]:
interaction = Interactions(
    [
        ('sex', 2),
        ('race', 5),
        ('smoking', 2)
    ], exclusions=[]
    )

print(interaction)

Interaction dimenions: [('sex', 2), ('race', 5), ('smoking', 2)]


In [4]:
beta = Decomposed(
    param_shape=[100],
    interactions=interaction,
    name='beta'
)
print(beta)

Parameter shape: [100] 
Interaction dimenions: [('sex', 2), ('race', 5), ('smoking', 2)] 
Component tensors: 8 
Effective parameter cardinality: 2000 
Actual parameter cardinality: 5400



We have created a representation for $\vec{\beta}$ that consists of eight component parameters:

In [5]:
print(beta._param_shapes)

{'beta__': TensorShape([1, 1, 1, 100]), 'beta__smoking': TensorShape([1, 1, 2, 100]), 'beta__race': TensorShape([1, 5, 1, 100]), 'beta__race_smoking': TensorShape([1, 5, 2, 100]), 'beta__sex': TensorShape([2, 1, 1, 100]), 'beta__sex_smoking': TensorShape([2, 1, 2, 100]), 'beta__sex_race': TensorShape([2, 5, 1, 100]), 'beta__sex_race_smoking': TensorShape([2, 5, 2, 100])}
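
The shapes and cardinalities printed above can be reproduced by enumerating every subset of the factors. This is a standard-library sketch of the bookkeeping only, assuming each component keeps a singleton axis for the factors it does not involve (as the printed shapes suggest); it is not how `Decomposed` is implemented.

```python
from itertools import product
from math import prod

dims = [("sex", 2), ("race", 5), ("smoking", 2)]
param_shape = (100,)

shapes = {}
# Each 0/1 mask selects the factors that take part in one component.
for mask in product([0, 1], repeat=len(dims)):
    active = [name for (name, _), on in zip(dims, mask) if on]
    # Inactive factors keep a singleton axis so the components broadcast together.
    index_shape = tuple(n if on else 1 for (_, n), on in zip(dims, mask))
    shapes["beta__" + "_".join(active)] = index_shape + param_shape

# Total number of stored values versus the size of the fully expanded parameter.
actual = sum(prod(s) for s in shapes.values())
effective = prod(n for _, n in dims) * prod(param_shape)
print(len(shapes), effective, actual)  # 8 2000 5400
```

The decomposition stores more numbers (5400) than the 2000 it represents; the benefit comes from regularizing the components differently, not from compression.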


The `Decomposed` class created the constituent parameters and initialized them to the default value of $\vec{0}$.

We can also choose to exclude certain interactions. Let's say we think the model should exclude the interaction between race and sex not qualified by smoking:

In [6]:
interaction = Interactions(
    [
        ('sex', 2),
        ('race', 5),
        ('smoking', 2)
    ], exclusions=[('race', 'sex')]
    )

beta = Decomposed(
    param_shape=[100],
    interactions=interaction,
    name='beta'
)
print(beta)
print(beta._param_shapes)

Parameter shape: [100] 
Interaction dimenions: [('sex', 2), ('race', 5), ('smoking', 2)] 
Component tensors: 7 
Effective parameter cardinality: 2000 
Actual parameter cardinality: 4400

{'beta__': TensorShape([1, 1, 1, 100]), 'beta__smoking': TensorShape([1, 1, 2, 100]), 'beta__race': TensorShape([1, 5, 1, 100]), 'beta__race_smoking': TensorShape([1, 5, 2, 100]), 'beta__sex': TensorShape([2, 1, 1, 100]), 'beta__sex_smoking': TensorShape([2, 1, 2, 100]), 'beta__sex_race_smoking': TensorShape([2, 5, 2, 100])}


The number of constituent tensors is now $7$: only the `sex_race` component was dropped, while the three-way `sex_race_smoking` component remains. In practice, we might wish to exclude higher-order terms in order to save memory.
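
The counts reported above can be checked with a short sketch. It assumes that an exclusion removes only the component whose factor set matches exactly, regardless of the order in which the factors are listed; that reading is inferred from the printed output, not from the library's documentation.

```python
from itertools import combinations
from math import prod

dims = {"sex": 2, "race": 5, "smoking": 2}
beta_dim = 100
exclusions = [("race", "sex")]

# Assumption: an exclusion drops only the component with exactly that factor set.
excluded = {frozenset(e) for e in exclusions}

kept = [
    subset
    for order in range(len(dims) + 1)
    for subset in combinations(dims, order)
    if frozenset(subset) not in excluded
]

# Each kept component stores one vector per combination of its factor levels.
actual = sum(beta_dim * prod(dims[f] for f in subset) for subset in kept)
print(len(kept), actual)  # 7 4400
```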

Now that we have created the decomposition, let's use it. Suppose I have a sample of four people:

1. sex=1, race=1, smoking=1
2. sex=0, race=3, smoking=0
3. sex=0, race=2, smoking=1
4. sex=1, race=0, smoking=1

and I want to retrieve the effective values of $\vec{\beta}$ for these people. The `Decomposed` class takes care of this lookup:

In [7]:
indices = [
    [1, 1, 1],
    [0, 3, 0],
    [0, 2, 1],
    [1, 0, 1]
]

beta_effective = beta.query(indices)
print(beta_effective.shape)



(4, 100)


The query returns $4 \times 100$ values, where each row is the regression parameter vector for one of the people.
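
Conceptually, the lookup sums, for each person, the entry of every component that matches that person's factor levels. The following is a pure-Python sketch of that idea, not the library's implementation: the `components` tables, the hand-set values, and the local `query` function are all hypothetical, all eight components are kept for simplicity, and the vector length is 3 instead of 100.

```python
from itertools import combinations, product

dims = {"sex": 2, "race": 5, "smoking": 2}
beta_dim = 3  # kept small for readability; the text uses 100

# One table per factor subset, mapping a tuple of levels to a vector.
# Values are zero except for two hand-set entries, so the sum is easy to follow.
components = {}
for order in range(len(dims) + 1):
    for subset in combinations(dims, order):
        level_ranges = [range(dims[f]) for f in subset]
        components[subset] = {
            lv: [0.0] * beta_dim for lv in product(*level_ranges)
        }
components[()][()] = [1.0, 1.0, 1.0]          # global term
components[("sex",)][(1,)] = [0.5, 0.0, 0.0]  # main effect for sex=1


def query(indices):
    """Return one effective vector per row of (sex, race, smoking) indices."""
    names = list(dims)
    rows = []
    for row in indices:
        person = dict(zip(names, row))
        total = [0.0] * beta_dim
        for subset, table in components.items():
            # Pick the entry for this person's levels on the subset's factors.
            vec = table[tuple(person[f] for f in subset)]
            total = [a + b for a, b in zip(total, vec)]
        rows.append(total)
    return rows


indices = [[1, 1, 1], [0, 3, 0], [0, 2, 1], [1, 0, 1]]
beta_effective = query(indices)
print(len(beta_effective), len(beta_effective[0]))  # 4 3
```

People 1 and 4 have sex=1, so their vectors pick up the extra main effect; people 2 and 3 get only the global term.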

# Parameter batches

TensorFlow Probability (TFP) works on parameters in sample batches. The `Decomposed` class handles batching. Let's generate a batch of size $5$ of the component tensors that add up to $\vec{\beta}$:

In [8]:
t, l = beta.generate_tensors(batch_shape=[5])
print([f"{k}: {v.shape}" for k, v in t.items()])

['beta__: (5, 1, 1, 1, 100)', 'beta__smoking: (5, 1, 1, 2, 100)', 'beta__race: (5, 1, 5, 1, 100)', 'beta__race_smoking: (5, 1, 5, 2, 100)', 'beta__sex: (5, 2, 1, 1, 100)', 'beta__sex_smoking: (5, 2, 1, 2, 100)', 'beta__sex_race_smoking: (5, 2, 5, 2, 100)']


Now let's query the batched parameter to get batched effective values:

In [9]:
beta.set_params(t)
effective_batched = beta.query(indices)
print(effective_batched.shape)


(5, 4, 100)


So for each of the 4 people we have a batch of 5 samples of the vector $\vec{\beta}$.

# Inflating and deflating the indices

TensorFlow has an intrinsic limitation on the tensor rank allowed in its most basic operations. To get around this limitation, we can inflate and deflate the interaction index dimensions. Let's take a look at generating deflated constituent parameters and inflating them:

In [10]:
t, n = beta.generate_tensors(batch_shape=[5], flatten_indices=True)
print([f"{k}: {v.shape}" for k, v in t.items()])

['beta__: (5, 1, 100)', 'beta__smoking: (5, 2, 100)', 'beta__race: (5, 5, 100)', 'beta__race_smoking: (5, 10, 100)', 'beta__sex: (5, 2, 100)', 'beta__sex_smoking: (5, 4, 100)', 'beta__sex_race_smoking: (5, 20, 100)']


Here the middle axes corresponding to the interaction indices have collapsed into a single axis in each of the parameter tensors. To re-inflate, we have the method:

In [11]:
t1 = beta.inflate_indices(t)
print([f"{k}: {v.shape}" for k, v in t1.items()])

['beta__: (5, 1, 1, 1, 100)', 'beta__smoking: (5, 1, 1, 2, 100)', 'beta__race: (5, 1, 5, 1, 100)', 'beta__race_smoking: (5, 1, 5, 2, 100)', 'beta__sex: (5, 2, 1, 1, 100)', 'beta__sex_smoking: (5, 2, 1, 2, 100)', 'beta__sex_race_smoking: (5, 2, 5, 2, 100)']
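
The shape bookkeeping behind deflating and inflating can be sketched in plain Python. The helper names below are hypothetical, and the row-major (C-order) mapping between level tuples and flat positions is an assumption; the library may order the collapsed axis differently.

```python
from math import prod


def deflate_shape(shape, n_batch, n_index):
    """Collapse the n_index interaction axes that follow the batch axes."""
    batch = shape[:n_batch]
    index = shape[n_batch:n_batch + n_index]
    rest = shape[n_batch + n_index:]
    return batch + (prod(index),) + rest


def flat_index(levels, index_shape):
    """Row-major position of a level tuple within the collapsed axis."""
    pos = 0
    for level, size in zip(levels, index_shape):
        pos = pos * size + level
    return pos


def unflat_index(pos, index_shape):
    """Inverse of flat_index: recover the level tuple from a flat position."""
    levels = []
    for size in reversed(index_shape):
        pos, level = divmod(pos, size)
        levels.append(level)
    return tuple(reversed(levels))


print(deflate_shape((5, 1, 5, 2, 100), n_batch=1, n_index=3))  # (5, 10, 100)
print(flat_index((1, 3, 0), (2, 5, 2)))  # 16
```

Under this assumed ordering, `unflat_index(16, (2, 5, 2))` gives back `(1, 3, 0)`, so the two mappings round-trip.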


And all is good in the world. Happy 2022!