# Joint vs Individual Linear Regression Loadings

First, let's establish some notation. Define $\texttt{linreg}$ such that, e.g., $\texttt{linreg}(y,\, \{1, x_1, x_2\})$ is the vector of MLE coefficients (equivalently, the OLS coefficients when the errors are i.i.d. Gaussian) from a linear regression of $y$ on a constant (AKA the "intercept"), $x_1$, and $x_2$. This vector has one entry per regressor, so in this case $(\beta_0,\, \beta_1,\, \beta_2)$ has dimension 3. Note that some authors label $\beta_0$ as $\alpha$.
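
To make the notation concrete, here is a minimal sketch of what $\texttt{linreg}$ could look like in code. The helper name `linreg`, its list-of-regressors signature, and the convention that the scalar `1` stands for the constant are assumptions made for illustration only; it uses plain least squares from NumPy rather than any particular fitting routine used elsewhere in this notebook.

```python
import numpy as np

def linreg(y, regressors):
    """Least-squares coefficients of y on the given regressors (hypothetical helper).

    `regressors` is a list of 1-D arrays; a scalar entry such as 1 is
    expanded into a constant column, i.e. the intercept.
    """
    y = np.asarray(y, dtype=float)
    columns = [
        np.full(y.shape, float(r)) if np.isscalar(r) else np.asarray(r, dtype=float)
        for r in regressors
    ]
    X = np.column_stack(columns)                   # design matrix, one column per regressor
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # minimizes ||y - X @ beta||^2
    return beta

# Toy data (made up for this sketch): y is an exact linear function of x1 and x2,
# so the regression recovers the coefficients up to floating-point error.
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
y = 1.0 + 2.0 * x1 - 3.0 * x2

beta = linreg(y, [1, x1, x2])   # one coefficient per regressor: (beta_0, beta_1, beta_2)
print(beta)
```

The returned vector has one entry per element of the regressor list, in the same order, which matches the dimension remark above.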


\# TODO: prove that $\texttt{linreg}(y,\, \{1, x_1\})$ …

In [3]:
import scipy.stats as stats
import pandas as pd
import numpy as np
import statsmodels.api as sm

## Construct data

In [44]:
n = 64

mean = pd.Series({"x1": 3.14, "x2": 3.14, "x3": 2.72, "white_noise": 0})

std = pd.Series({"x1": 3.14, "x2": 3.14, "x3": 2.72, "white_noise": 1})
# diagonal matrix with std's on the diagonal
std_ = pd.DataFrame(np.diag(std), index=std.index, columns=std.index)

corr = pd.DataFrame({
    "x1": {"x1": 1, "x2": -0.5, "x3": 0.5, "white_noise": 0},
    "x2": {"x1": -0.5, "x2": 1, "x3": 0, "white_noise": 0},
    "x3": {"x1": 0.5, "x2": 0, "x3": 1, "white_noise": 0},
    "white_noise": {"x1": 0, "x2": 0, "x3": 0, "white_noise": 1}
})

# covariance matrix: cov[i, j] = std[i] * corr[i, j] * std[j]
cov = std_ @ corr @ std_
cov

,x1,x2,x3,white_noise
x1,9.8596,-4.9298,4.2704,0.0
x2,-4.9298,9.8596,0.0,0.0
x3,4.2704,0.0,7.3984,0.0
white_noise,0.0,0.0,0.0,1.0
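
As a sanity check on this construction, the sketch below rebuilds the same covariance matrix with plain NumPy arrays and verifies a few properties that a valid covariance matrix should have. The variable names (`D`, `recovered_std`, and so on) are chosen for this illustration and are not part of the notebook; the numbers are copied from the `std` and `corr` definitions above.

```python
import numpy as np

# Same standard deviations and correlations as above, in the order x1, x2, x3, white_noise.
std = np.array([3.14, 3.14, 2.72, 1.0])
corr = np.array([
    [1.0, -0.5, 0.5, 0.0],
    [-0.5, 1.0, 0.0, 0.0],
    [0.5, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])

D = np.diag(std)        # diagonal matrix of standard deviations
cov = D @ corr @ D      # cov[i, j] = std[i] * corr[i, j] * std[j]

# The matrix product is the same as scaling corr elementwise by the outer product of std.
same_as_outer = np.allclose(cov, corr * np.outer(std, std))

# A usable covariance matrix is symmetric with strictly positive eigenvalues.
is_symmetric = np.allclose(cov, cov.T)
min_eigenvalue = np.linalg.eigvalsh(cov).min()

# The square roots of the diagonal recover the standard deviations.
recovered_std = np.sqrt(np.diag(cov))

print(same_as_outer, is_symmetric, min_eigenvalue > 0)
print(recovered_std)
```

A strictly positive smallest eigenvalue matters here because sampling from a multivariate normal requires a positive semidefinite covariance matrix, and a hand-written correlation matrix is not guaranteed to be one.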


In [46]:
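# draw n samples from the multivariate normal with the mean and covariance defined above
x = pd.DataFrame(stats.multivariate_normal.rvs(mean=mean, cov=cov, size=n), columns=mean.index)
# add the sum of x1 and x2 as an extra column
x.loc[:, "x1+x2"] = x["x1"] + x["x2"]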
x

,x1,x2,x3,white_noise,x1+x2
0,3.689891,0.211331,-0.316387,1.049992,3.901222
1,0.905872,3.390582,1.330246,-1.477451,4.296453
2,6.677553,1.623315,2.229324,1.062024,8.300867
3,-0.120413,5.179490,3.393259,0.848405,5.059077
4,2.939849,2.102987,3.184186,-0.733020,5.042836
...,...,...,...,...,...
59,0.127113,7.963481,2.895155,1.613224,8.090594
60,-2.215368,3.700655,1.423058,-0.391185,1.485287
61,4.624515,1.176619,1.609477,-1.797265,5.801134
62,-0.971553,4.970889,3.775684,1.953737,3.999336
