# TA Review Session - 2
## Linear Algebra, Regression and Data Analysis in Python
### FINM September Launch

Maneet Singh - TA

maneetsingh@uchicago.edu

## Recap

In the last TA session, we discussed the OLS model of the form:

$$
y = x'b + e
$$

where $b$ is given by:

$$
b = (x'x)^{-1}x'y
$$
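
As a quick refresher, here is a minimal sketch of the formula on synthetic data. The sizes, the coefficient values and the names `true_b`, `b_inverse` and `b_lstsq` are assumptions for illustration only and are not part of the session dataset. It compares the textbook inverse with a least-squares solver that never forms $(x'x)^{-1}$ explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): 200 observations, an intercept and 3 regressors
n_obs = 200
X = np.column_stack([np.ones(n_obs), rng.normal(size=(n_obs, 3))])
true_b = np.array([0.5, 1.0, -2.0, 0.3])
y = X @ true_b + 0.1 * rng.normal(size=n_obs)

# Textbook formula: b = (X'X)^{-1} X'y
b_inverse = np.linalg.inv(X.T @ X) @ X.T @ y

# Least-squares solver that never forms (X'X)^{-1} explicitly
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.round(b_inverse, 3))
print(np.round(b_lstsq, 3))
```

With well-conditioned data like this the two estimates agree; the rest of the session looks at what happens when they stop being reliable.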


**What were some of the issues with calculating $(x'x)^{-1}$? Think of a situation where a lot of factors are considered.**

Let's look at an example to see how this is a practical problem.

In [5]:
import pandas as pd
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_excel('sp500_returns.xlsx', sheet_name=2).set_index('date')
data.index = pd.to_datetime(data.index)
display(data.head())

spy = pd.read_excel('sp500_returns.xlsx', sheet_name=3).set_index('date')[['SPY']]
spy.index = pd.to_datetime(spy.index)
display(spy.head())

| date | A | AAP | AAPL | ABC | ABT | ACN | ADBE | ADI | ADM | ADP | ... | WY | WYNN | XEL | XOM | XRAY | XYL | YUM | ZBH | ZBRA | ZION |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2016-01-15 | -0.038351 | -0.008719 | 0.001752 | -0.05298 | 0.003094 | 0.010897 | 0.015026 | -0.003201 | -0.083746 | -0.002688 | ... | -0.074549 | 0.010911 | 0.008843 | 0.038693 | -0.008773 | -0.040441 | -0.008857 | -0.004629 | -0.04557 | -0.09187 |
| 2016-01-22 | 0.023444 | 0.000965 | 0.044174 | -0.01908 | -0.012581 | 0.029918 | 0.005159 | 0.031311 | 0.054588 | 0.04094 | ... | -0.015492 | 0.013704 | 0.003561 | -0.013019 | 0.018233 | 0.031839 | 0.028172 | -0.019094 | -0.000884 | -0.037156 |
| 2016-01-29 | -0.008689 | 0.052759 | -0.040244 | -0.015823 | -0.05446 | 0.032276 | -0.005578 | 0.048269 | 0.063797 | 0.024411 | ... | 0.007473 | 0.138077 | 0.043409 | 0.016717 | 0.023818 | 0.05642 | 0.049144 | 0.00111 | 0.069027 | 0.054398 |
| 2016-02-05 | -0.042761 | -0.053338 | -0.028864 | -0.054265 | -0.011623 | -0.06225 | -0.11141 | -0.075009 | -0.03621 | -0.021423 | ... | -0.063257 | -0.080635 | 0.034275 | 0.028645 | 0.001698 | 0.01975 | -0.036064 | -0.042818 | -0.05 | -0.046298 |
| 2016-02-12 | 0.004992 | -0.022021 | -0.000323 | 0.010744 | -0.007485 | -0.043144 | -0.031944 | -0.003614 | -0.038858 | 0.005289 | ... | -0.057941 | 0.116782 | -0.017961 | 0.021048 | -0.072216 | 0.010364 | -0.03469 | -0.033996 | 0.033635 | -0.029592 |


| date | SPY |
|---|---|
| 2016-01-15 | -0.02143 |
| 2016-01-22 | 0.014429 |
| 2016-01-29 | 0.0168 |
| 2016-02-05 | -0.029789 |
| 2016-02-12 | -0.007023 |


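Let's run the following regression of SPY returns on the individual stock returns. The equation sums over all 451 stocks, while the code below draws a random sample of 330 of them so that the regressors do not outnumber the 335 observations: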

$$
r^{SPY}_t = \alpha + \sum^{451}_{i=1}\beta^ir^i_t + \epsilon_t
$$

In [16]:
X = sm.add_constant(data.sample(n = 330, axis = 1))
y = spy

ols_model = sm.OLS(endog=y, exog=X).fit()
print(ols_model.summary())

                            OLS Regression Results                            
Dep. Variable:                    SPY   R-squared:                       1.000
Model:                            OLS   Adj. R-squared:                  0.989
Method:                 Least Squares   F-statistic:                     93.86
Date:                Fri, 09 Sep 2022   Prob (F-statistic):           0.000225
Time:                        08:29:10   Log-Likelihood:                 2273.0
No. Observations:                 335   AIC:                            -3884.
Df Residuals:                       4   BIC:                            -2622.
Df Model:                         330                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         -0.0003      0.001     -0.382      0.7

In [20]:
print('Condition Number of Design Matrix: {:,.2f}'.format(np.linalg.cond(X)))
print('Condition Number of Covariance Matrix: {:,.2f}'.format(np.linalg.cond(np.dot(X.transpose(), X))))

Condition Number of Design Matrix: 6,156.03
Condition Number of Covariance Matrix: 37,896,688.39


**What do you notice?**

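The high condition number contributes to the numerical instability of the regression model: a small change in the design matrix $X$ causes a large change in the coefficient vector $\beta$. Such a model can be expected to have very poor out-of-sample performance. Note also that with 330 regressors and 335 observations only 4 residual degrees of freedom remain, so the R-squared of 1.000 reflects overfitting rather than a good model.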
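
The two condition numbers printed above are related: in the 2-norm, the condition number of $X'X$ is the square of the condition number of $X$ (here $6{,}156.03^2$ is about 37.9 million). The sketch below illustrates this, and the sensitivity of the coefficients, on synthetic data. The three nearly collinear regressors and the perturbation size are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs = 300

# Three synthetic regressors that share one common factor (assumption)
common = rng.normal(size=n_obs)
X = np.column_stack([common + 0.01 * rng.normal(size=n_obs) for _ in range(3)])
y = X.sum(axis=1) + 0.1 * rng.normal(size=n_obs)

cond_X = np.linalg.cond(X)
cond_XtX = np.linalg.cond(X.T @ X)
print(f"cond(X)   = {cond_X:,.2f}")
print(f"cond(X'X) = {cond_XtX:,.2f}")
print(f"cond(X)^2 = {cond_X ** 2:,.2f}")

# Perturb X slightly and compare the fitted coefficients
X_perturbed = X + 0.001 * rng.normal(size=X.shape)
b = np.linalg.lstsq(X, y, rcond=None)[0]
b_perturbed = np.linalg.lstsq(X_perturbed, y, rcond=None)[0]

rel_change_X = np.linalg.norm(X_perturbed - X) / np.linalg.norm(X)
rel_change_b = np.linalg.norm(b_perturbed - b) / np.linalg.norm(b)
print(f"relative change in X: {rel_change_X:.4%}")
print(f"relative change in b: {rel_change_b:.4%}")
```

The relative change in the coefficients should come out much larger than the relative change in the inputs, which is the instability described above.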

Let's examine the problem using the variance inflation factor (VIF):

In [21]:
from statsmodels.stats.outliers_influence import variance_inflation_factor
# One VIF per column of the design matrix (including the constant)
vif = pd.DataFrame([variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
                   index=X.columns, columns=['VIF'])
display(vif)

| | VIF |
|---|---|
| const | 26.397941 |
| KMB | 114.253428 |
| ENPH | 109.954938 |
| VNO | 575.386751 |
| BR | 220.845190 |
| ... | ... |
| PPL | 276.363638 |
| TGT | 80.719437 |
| PG | 521.003249 |
| CHD | 180.397096 |


Clearly, multicollinearity is a big problem in this model.
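
To see where these numbers come from, recall that the VIF of regressor $j$ is $1/(1 - R_j^2)$, where $R_j^2$ is obtained by regressing regressor $j$ on all the other regressors. The sketch below implements that definition with NumPy on synthetic data. The helper `vif` and the variables `x1` to `x4` are hypothetical, and this illustrates the definition rather than the statsmodels implementation.

```python
import numpy as np

def vif(X: np.ndarray, j: int) -> float:
    """VIF of column j: 1 / (1 - R^2_j), where R^2_j comes from regressing
    column j on all other columns plus an intercept."""
    target = X[:, j]
    others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
    coef, *_ = np.linalg.lstsq(others, target, rcond=None)
    resid = target - others @ coef
    r_squared = 1.0 - resid.var() / target.var()
    return 1.0 / (1.0 - r_squared)

rng = np.random.default_rng(2)
n_obs = 500
x1 = rng.normal(size=n_obs)
x2 = rng.normal(size=n_obs)
x3 = x1 + x2 + 0.1 * rng.normal(size=n_obs)  # nearly a linear combination of x1 and x2
x4 = rng.normal(size=n_obs)                   # unrelated to the others
X = np.column_stack([x1, x2, x3, x4])

vifs = [vif(X, j) for j in range(X.shape[1])]
for name, value in zip(["x1", "x2", "x3", "x4"], vifs):
    print(f"{name}: {value:,.2f}")
```

The collinear columns get large VIFs, while the unrelated column stays close to 1. A common rule of thumb treats values above 10 as a warning sign, and every stock in the table above is far beyond that.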

**Agenda**

In today's session, we will discuss:

- What is principal component analysis?
- How can we use it to solve the problem of multicollinearity?
- What is the role of **eigendecomposition** in PCA?
- Steps to calculate PCA
- Practical applications of PCA

## Recap of Principal Component Analysis

###### Reference: data_analysis_2022_machinelearning.pdf

Principal Component Analysis (PCA) is a way to reduce the dimensionality of a dataset.

The procedure reproduces the data in a rotated vector space such that the projections along the orthonormal basis of the new space are uncorrelated.

It also ranks these orthonormal basis vectors in order of *importance* (variance captured), which allows us to keep only as many basis vectors as are needed to represent an acceptable level of variation.


PCA is widely used for:

1. Data Compression


2. Efficient Model Development


3. Better Visualizations


Let's take an example from the previous dataset:

![image-2.png](attachment:image-2.png)

The objective of PCA is to find the dimensions along which the projection variance is maximized.


Assume $w^{\rightarrow}$ to be a direction vector along which the projection variance is maximized. One such vector is shown below in green:

![image-2.png](attachment:image-2.png)

Remember, the projection of an observation $X_i$ onto a unit vector $w^{\rightarrow}$ is the dot product of the two, multiplied by $w^{\rightarrow}$:

$$
P = (X_i \cdot w^{\rightarrow})\, w^{\rightarrow}
$$


One way to maximize the variance is to calculate the variance of the projection and maximize it directly (please refer to slide 33 of the notes).

Another way is to minimize the **reconstruction error**, as shown below:

$$
\arg\min_{w^{\rightarrow}} \{error_{rc}\} = \arg\min_{w^{\rightarrow}} \left\{\sum_i \|X_i - (X_i \cdot w^{\rightarrow})\, w^{\rightarrow}\|^2\right\}
$$

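The reconstruction errors are shown in red in the image below. Notice how they differ from the residuals of a regression: a reconstruction error is measured perpendicular to the line, whereas a regression residual is measured vertically.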

![image.png](attachment:image.png)


So we keep rotating the line above until the reconstruction error is minimized. This gives us our **first principal component**.
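
The rotation idea can be sketched as a brute-force search in two dimensions. The data below are synthetic stand-ins for two correlated return series, and the grid of 1,800 candidate angles is an arbitrary assumption. For each candidate direction we compute both the squared reconstruction error and the variance of the projection.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two correlated, demeaned series (synthetic stand-ins for two stock returns)
z = rng.normal(size=(500, 2))
X = z @ np.array([[1.0, 0.8], [0.0, 0.6]])
X = X - X.mean(axis=0)

# Candidate directions: unit vectors at angles in [0, pi), 0.1 degree apart
angles = np.linspace(0.0, np.pi, 1800, endpoint=False)
recon_error = np.empty_like(angles)
proj_var = np.empty_like(angles)

for k, theta in enumerate(angles):
    w = np.array([np.cos(theta), np.sin(theta)])  # unit direction vector
    scores = X @ w                                 # X_i . w for every observation
    recon = np.outer(scores, w)                    # (X_i . w) w
    recon_error[k] = np.sum((X - recon) ** 2)      # squared reconstruction error
    proj_var[k] = scores.var()                     # variance of the projection

best_min_error = angles[np.argmin(recon_error)]
best_max_var = angles[np.argmax(proj_var)]
print(f"angle minimizing reconstruction error: {np.degrees(best_min_error):.1f} degrees")
print(f"angle maximizing projection variance:  {np.degrees(best_max_var):.1f} degrees")
```

Both criteria pick the same direction. By Pythagoras, the squared length of each demeaned observation splits into its squared projection plus its squared reconstruction error, so minimizing one is the same as maximizing the other.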

Other principal components $(v^{\rightarrow})$ are found by iteratively solving the above equation but with the following additional constraint:

$$
(v^{\rightarrow}).(w^{\rightarrow}) = 0
$$

The image below shows the data projected on the principal components $(v^{\rightarrow})$ and $(w^{\rightarrow})$ found in this manner. Notice how the correlation between the projected variables is low.

![image.png](attachment:image.png)

**This problem was simple in two dimensions, but what if you have 1,000 dimensions? The algorithm becomes very slow as the number of dimensions increases. Fortunately, there is a solution...**

Without full mathematical rigor, you can observe geometrically that if the matrix $X'X$ (with $X$ demeaned, this is proportional to the covariance matrix) rotates a candidate direction vector $w^{\rightarrow}$, then that direction does not capture the maximum variance (equivalently, it does not minimize the reconstruction error discussed above).

Hence, the algorithm selects $w^{\rightarrow}$ in such a way that the operation $X'X\,w^{\rightarrow}$ only rescales $w^{\rightarrow}$ by a scalar $\lambda$:

$$
X'X\,w^{\rightarrow} = \lambda\,w^{\rightarrow}
$$


This is the eigenvalue equation for $X'X$, which means that the principal directions are given by its **eigenvectors** and the amount by which each principal direction is stretched (the amount of variance captured) is given by the corresponding **eigenvalues**.

For a complete mathematical proof, refer to: https://www.stat.cmu.edu/~cshalizi/uADA/12/lectures/ch18.pdf
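
A minimal numerical check of the eigenvector property, assuming synthetic three-dimensional data and working with the sample covariance matrix of the demeaned data. The mixing matrix `mix` is hypothetical and only there to create correlation.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic correlated data (assumption), demeaned column by column
mix = np.array([[1.0, 0.5, 0.2],
                [0.0, 1.0, 0.3],
                [0.0, 0.0, 0.5]])
X = rng.normal(size=(400, 3)) @ mix
X = X - X.mean(axis=0)

# Sample covariance matrix and its eigendecomposition
cov = X.T @ X / (len(X) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]        # reorder: largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# The first principal direction is only rescaled by the covariance matrix
w = eigvecs[:, 0]
lhs, rhs = cov @ w, eigvals[0] * w
print(np.allclose(lhs, rhs))

# Projections on the eigenvectors are uncorrelated, with variances equal to the eigenvalues
scores = X @ eigvecs
score_cov = np.cov(scores, rowvar=False)
print(np.round(score_cov, 6))
print(np.round(eigvals, 6))
```

The covariance matrix of the projected data is diagonal, which is exactly the "uncorrelated projections" property that removes multicollinearity.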


To find the eigenvalues and eigenvectors of a high-dimensional matrix, we use **singular value decomposition** (SVD).

### Singular Value Decomposition:

We can decompose any matrix $X \in \mathbb{R}^{m \times n}$ as follows:

$$
X = U \Sigma V^T
$$

where $V$ is an $n \times n$ matrix whose columns are the vectors $v_i$:

$$
V = [v_1, v_2, \dots, v_n]
$$

The $v_i$ are orthogonal and normalized, so they form an orthonormal set.

$\Sigma$ is an $m \times n$ matrix with the singular values $\Sigma_{ii} = \sigma_i$ on the diagonal and 0 elsewhere. The eigenvalues of $X^TX$ are the squared singular values, $\lambda_i = \sigma_i^2$.

$U$ is an $m \times m$ matrix whose columns $\{u_1, u_2, \dots, u_m\}$ form an orthonormal basis.


Because SVD works on $X$ directly, it avoids explicitly forming the variance-covariance matrix needed for an eigendecomposition, and with it the numerical instability of that step.
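
The link between the two decompositions can be checked numerically. This is a sketch on synthetic data; the mixing matrix `mix` is an assumption used only to create correlated columns. It compares `np.linalg.svd` applied to $X$ with `np.linalg.eigh` applied to $X^TX$.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic correlated data (assumption), demeaned column by column
mix = np.array([[1.0, 0.6, 0.3, 0.1],
                [0.0, 1.0, 0.5, 0.2],
                [0.0, 0.0, 1.0, 0.4],
                [0.0, 0.0, 0.0, 0.5]])
X = rng.normal(size=(250, 4)) @ mix
X = X - X.mean(axis=0)

# SVD of the data matrix itself (singular values come back in descending order)
U, S, Vt = np.linalg.svd(X, full_matrices=False)

# Eigendecomposition of X'X, reordered so the largest eigenvalue comes first
eigvals, eigvecs = np.linalg.eigh(X.T @ X)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# Squared singular values are the eigenvalues of X'X
print(np.allclose(S ** 2, eigvals))

# Rows of Vt are the eigenvectors of X'X, up to sign
print(np.allclose(np.abs(Vt), np.abs(eigvecs.T)))

# The component scores can be computed as U * S or, equivalently, as X @ V
print(np.allclose(U * S, X @ Vt.T))
```

The last check is the reason the steps below can compute the principal components from $U$ and the singular values without ever building $X^TX$.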

## Steps of PCA

#### Step 1: Standardize the data

In [28]:
from sklearn.preprocessing import StandardScaler

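scaler = StandardScaler()
# Select from `data` rather than `X`: `X` holds a random sample of columns,
# so JPM and GS are not guaranteed to be in it.
pca_sample = data[['JPM', 'GS']]
# Demean each column and scale it to unit variance
pca_scaled_sample = scaler.fit_transform(pca_sample)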

#### Step 2: Perform Singular Value Decomposition (or estimate the eigenvalues and eigenvectors)

In [68]:
from numpy.linalg import svd 
U, S, Vt = svd(pca_scaled_sample, full_matrices=False)

#### Step 3: Calculate the principal components using $U\Sigma$

In [69]:
pca_factors = U @ np.diag(S)
pca_factors

array([[-1.51491135e+00,  2.39831925e-01],
       [-6.80498033e-03, -1.72746649e-01],
       [ 1.18813070e+00,  3.06479563e-01],
       [-1.15480522e+00, -3.01546793e-02],
       [-1.26928512e+00,  9.84121932e-01],
       [ 8.14418552e-02,  3.26112061e-03],
       [ 1.74099691e-01, -4.71605949e-01],
       [ 1.46815941e+00, -1.52598222e-02],
       [-6.24664064e-01,  7.50172064e-02],
       [ 6.25545530e-01, -5.42848249e-02],
       [-8.84519348e-01,  1.64677457e-01],
       [ 7.36561986e-01, -6.22368685e-01],
       [-1.59722391e+00,  4.47563785e-01],
       [ 2.07712096e+00,  3.83658851e-01],
       [ 1.35085032e+00, -2.47511727e-01],
       [-5.85214547e-01,  2.78655647e-02],
       [-1.08974591e+00,  5.27680347e-02],
       [-5.87296332e-01,  2.30103127e-01],
       [ 4.84993882e-01,  7.55703098e-01],
       [ 1.03579101e+00, -6.65077974e-02],
       [-7.22392010e-01,  1.63678547e-01],
       [-9.38885641e-01,  3.69241945e-01],
       [-1.01355562e+00,  8.35816319e-03],
       [-1.

## Another Method in Python Using sklearn

In [79]:
from sklearn.decomposition import PCA

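# Select from `data` rather than `X`, which only holds a random sample of columns
sample = data[['JPM', 'GS']]

# Standardize the two series before fitting the PCA
sample = pd.DataFrame(scaler.fit_transform(sample))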

pca = PCA(svd_solver='full')

pca.fit(sample.values)

pca_factors = pca.transform(sample.values)

pca_factors

array([[-1.51491135e+00, -2.39831925e-01],
       [-6.80498033e-03,  1.72746649e-01],
       [ 1.18813070e+00, -3.06479563e-01],
       [-1.15480522e+00,  3.01546793e-02],
       [-1.26928512e+00, -9.84121932e-01],
       [ 8.14418552e-02, -3.26112061e-03],
       [ 1.74099691e-01,  4.71605949e-01],
       [ 1.46815941e+00,  1.52598222e-02],
       [-6.24664064e-01, -7.50172064e-02],
       [ 6.25545530e-01,  5.42848249e-02],
       [-8.84519348e-01, -1.64677457e-01],
       [ 7.36561986e-01,  6.22368685e-01],
       [-1.59722391e+00, -4.47563785e-01],
       [ 2.07712096e+00, -3.83658851e-01],
       [ 1.35085032e+00,  2.47511727e-01],
       [-5.85214547e-01, -2.78655647e-02],
       [-1.08974591e+00, -5.27680347e-02],
       [-5.87296332e-01, -2.30103127e-01],
       [ 4.84993882e-01, -7.55703098e-01],
       [ 1.03579101e+00,  6.65077974e-02],
       [-7.22392010e-01, -1.63678547e-01],
       [-9.38885641e-01, -3.69241945e-01],
       [-1.01355562e+00, -8.35816319e-03],
       [-1.

##### Both methods give the same result up to sign. The sign of a principal component is arbitrary, so the opposite signs in the second column are irrelevant
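
If two sets of components need to be compared directly, one option is to apply the same sign convention to both. The sketch below uses a hypothetical helper, `align_signs`, that flips each component so that its largest-magnitude loading is positive. The convention itself is an arbitrary assumption, and the data are synthetic.

```python
import numpy as np

def align_signs(scores, loadings):
    """Flip each component so its largest-magnitude loading is positive.
    `loadings` holds one component per row, like Vt from an SVD."""
    idx = np.argmax(np.abs(loadings), axis=1)
    signs = np.sign(loadings[np.arange(loadings.shape[0]), idx])
    return scores * signs, loadings * signs[:, None]

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 2)) @ np.array([[1.0, 0.8], [0.0, 0.6]])
X = X - X.mean(axis=0)

U, S, Vt = np.linalg.svd(X, full_matrices=False)
scores_a, loadings_a = U * S, Vt

# A second, equally valid solution: flip the sign of the second component
flip = np.array([1.0, -1.0])
scores_b, loadings_b = scores_a * flip, loadings_a * flip[:, None]

# Both solutions reconstruct exactly the same data
print(np.allclose(scores_a @ loadings_a, scores_b @ loadings_b))

# After applying the same sign convention the scores coincide
aligned_a, _ = align_signs(scores_a, loadings_a)
aligned_b, _ = align_signs(scores_b, loadings_b)
print(np.allclose(aligned_a, aligned_b))
```

Flipping a component flips its scores and its loadings together, so the reconstructed data, the variance captured and any regression fit on the scores are unaffected.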

## Practical Application of PCA

#### Let's run PCA on the complete dataset for equities

In [82]:
scaler.fit(data)

sample = pd.DataFrame(scaler.transform(data))

pca = PCA(svd_solver='full')

pca.fit(sample.values)

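# Note: the PCA is fit on the standardized returns, while the raw returns in
# `data` are projected here, so each factor is a loading-weighted portfolio of
# raw returns. Pass `sample.values` to get scores of the standardized data.
pca_factors = pd.DataFrame(pca.transform(data.values), 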
                           columns=['PC {}'.format(i+1) for i in range(pca.n_components_)], 
                           index = pd.to_datetime(data.index))
display(pca_factors.head().style.format('{:,.4f}'))

| date | PC 1 | PC 2 | PC 3 | PC 4 | PC 5 | PC 6 | PC 7 | PC 8 | PC 9 | PC 10 | ... | PC 335 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2016-01-15 00:00:00 | 0.5536 | 0.1996 | 0.1756 | -0.0119 | 0.1602 | 0.0061 | 0.0919 | -0.0631 | -0.0479 | 0.0532 | ... | 0.0039 |
| 2016-01-22 00:00:00 | -0.2768 | 0.1104 | -0.0742 | -0.0005 | -0.1703 | -0.091 | -0.0845 | 0.0714 | 0.0234 | -0.1221 | ... | -0.0082 |
| 2016-01-29 00:00:00 | -0.4282 | 0.034 | 0.0934 | 0.1851 | 0.0729 | -0.0879 | -0.284 | -0.04 | -0.0751 | 0.0947 | ... | 0.0086 |
| 2016-02-05 00:00:00 | 0.5419 | 0.0213 | 0.2968 | 0.1669 | 0.3421 | 0.1535 | -0.1369 | 0.1273 | 0.1938 | 0.3975 | ... | 0.0084 |
| 2016-02-12 00:00:00 | 0.2734 | 0.1438 | -0.1272 | 0.1009 | 0.177 | -0.1064 | 0.009 | -0.1665 | -0.1276 | -0.0367 | ... | -0.0009 |

5 rows × 335 columns


### Explained Variance

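The explained variance of the $i$-th principal component is the share of the total variance it accounts for, $\frac{\lambda_i}{\sum_j \lambda_j}$, where $\lambda_i$ is the $i$-th eigenvalue of the covariance matrix of the factors.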
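As a quick check of this definition, the sketch below computes the ratio directly from the eigenvalues of a sample covariance matrix and compares it with `explained_variance_ratio_` from scikit-learn. The data are synthetic and the names `returns`, `manual_ratio` and `pca_demo` are hypothetical; they are not part of the session's dataset.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical stand-in for a returns matrix: 200 observations of 5 series
# that share one common driver plus idiosyncratic noise.
common = rng.normal(size=(200, 1))
returns = 0.8 * common + 0.5 * rng.normal(size=(200, 5))

# Eigenvalues of the sample covariance matrix, sorted largest first.
cov = np.cov(returns, rowvar=False)
eigenvalues = np.linalg.eigvalsh(cov)[::-1]

# Explained variance of component i: lambda_i / sum_j lambda_j.
manual_ratio = eigenvalues / eigenvalues.sum()

# scikit-learn reports the same quantity when all components are kept.
pca_demo = PCA().fit(returns)
print(np.allclose(manual_ratio, pca_demo.explained_variance_ratio_))
```

The two agree because `PCA` centers the data and, like `np.cov`, scales the variance by $n-1$.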

In [83]:
explained_var = pd.DataFrame(data=pca.explained_variance_ratio_,
                             index=pca_factors.columns,
                             columns=['Explained Variance'])
explained_var['Cumulative Explained Variance'] = explained_var['Explained Variance'].cumsum()

display(explained_var.style.format('{:,.2%}'))

Unnamed: 0,Explained Variance,Cumulative Explained Variance
PC 1,41.12%,41.12%
PC 2,5.97%,47.09%
PC 3,4.40%,51.49%
PC 4,2.65%,54.14%
PC 5,1.58%,55.71%
PC 6,1.37%,57.08%
PC 7,1.27%,58.35%
PC 8,1.17%,59.51%
PC 9,1.05%,60.56%
PC 10,1.01%,61.57%


The first 30 factors capture about 75% of the variation. Regressing SPY on these 30 PCA factors gives:

In [93]:
X = sm.add_constant(pca_factors.iloc[:, :30])
y = spy

ols_model = sm.OLS(endog=y, exog=X).fit()
print(ols_model.summary())

                            OLS Regression Results                            
Dep. Variable:                    SPY   R-squared:                       0.986
Model:                            OLS   Adj. R-squared:                  0.985
Method:                 Least Squares   F-statistic:                     720.1
Date:                Fri, 09 Sep 2022   Prob (F-statistic):          2.74e-263
Time:                        12:12:20   Log-Likelihood:                 1489.6
No. Observations:                 335   AIC:                            -2917.
Df Residuals:                     304   BIC:                            -2799.
Df Model:                          30                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         -0.0009      0.000     -5.398      0.0

## Exercise:

Split the data such that you train the model on the first 80% of the observations. Run two models: with PCA and without PCA. Test both models on the remaining 20% of the data by calculating the $R^2_{oos}$. What do you observe?
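One possible way to set up this exercise is sketched below on synthetic data, so it does not give away the result on the actual returns. Several things here are assumptions rather than the session's prescribed method: the definition $R^2_{oos} = 1 - \frac{\sum (y - \hat{y})^2}{\sum (y - \bar{y}_{train})^2}$ (other benchmarks are sometimes used), the choice of 5 components, and all variable names.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler


def r2_oos(y_true, y_pred, y_train_mean):
    """1 - SSE / SST, with SST measured against the training-sample mean."""
    sse = np.sum((y_true - y_pred) ** 2)
    sst = np.sum((y_true - y_train_mean) ** 2)
    return 1.0 - sse / sst


# Synthetic data (assumption): 100 noisy factors driven by 3 latent factors.
rng = np.random.default_rng(42)
n_obs, n_factors, n_latent = 300, 100, 3
latent = rng.normal(size=(n_obs, n_latent))
loadings = rng.normal(size=(n_latent, n_factors))
X = latent @ loadings + rng.normal(size=(n_obs, n_factors))
y = latent @ np.array([1.0, -0.5, 0.25]) + 0.5 * rng.normal(size=n_obs)

# Chronological split: first 80% for training, last 20% for testing.
split = int(0.8 * n_obs)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Model 1: OLS on all original factors (no PCA).
ols_full = LinearRegression().fit(X_train, y_train)
r2_full = r2_oos(y_test, ols_full.predict(X_test), y_train.mean())

# Model 2: standardize and run PCA using training data only, then OLS
# on the leading components. The test set is transformed, never refitted.
scaler = StandardScaler().fit(X_train)
pca_split = PCA(n_components=5).fit(scaler.transform(X_train))
Z_train = pca_split.transform(scaler.transform(X_train))
Z_test = pca_split.transform(scaler.transform(X_test))
ols_pca = LinearRegression().fit(Z_train, y_train)
r2_pca = r2_oos(y_test, ols_pca.predict(Z_test), y_train.mean())

print(f"R2_oos without PCA: {r2_full:.3f}")
print(f"R2_oos with PCA:    {r2_pca:.3f}")
```

Fitting the scaler and the PCA on the training window only avoids leaking information from the test window into the model.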

## Important Points to Consider About PCA:

#### 1. PCA requires the data to be standardized; otherwise, the algorithms discussed above do not give the correct result.

#### 2. PCA is not a data reduction technique; it is a data manipulation technique. Each principal component is still a linear combination of all the original factors.

#### 3. PCA is not suitable when the factors contain too much noise that is uncorrelated with the dependent variable, for example when analyzing the S&P 500 index against 400 equities and 200 commodities.

#### 4. If a factor selection technique is required, Lasso regression is a better approach: it can set coefficients exactly to zero, whereas Ridge regression only shrinks them.
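
To illustrate the last point, here is a minimal sketch on synthetic data comparing how Lasso and Ridge treat irrelevant factors. The data, the penalty values (`alpha=0.1` and `alpha=1.0`) and the variable names are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(1)

# Synthetic data (assumption): 10 factors, only the first two drive y.
X_demo = rng.normal(size=(200, 10))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + 0.1 * rng.normal(size=200)

lasso = Lasso(alpha=0.1).fit(X_demo, y_demo)
ridge = Ridge(alpha=1.0).fit(X_demo, y_demo)

# Count coefficients that are exactly zero under each penalty.
n_zero_lasso = int(np.sum(lasso.coef_ == 0))
n_zero_ridge = int(np.sum(ridge.coef_ == 0))

print("Lasso coefficients set to zero:", n_zero_lasso)
print("Ridge coefficients set to zero:", n_zero_ridge)
```

The L1 penalty in Lasso can drop factors entirely, while the L2 penalty in Ridge keeps every factor with a smaller coefficient.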