## Penalized Regression Methods Example Code

### Imports

In [535]:
import pandas as pd
import numpy as np 

import yfinance as yf

from sklearn.linear_model import Ridge, RidgeCV, Lasso, LassoCV, LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

### Data
- Use the data from the regression demo, but shorten the time frame so the data is noisier (a setting where penalized regression tends to help)

In [565]:
df = yf.download("AAPL F GM IVV MSFT GOOGL SPY BTC-USD", start="2020-06-30", end="2021-06-30")['Adj Close']

data = df.dropna().pct_change().dropna()

data.head()

[*********************100%***********************]  8 of 8 completed


| Date | AAPL | BTC-USD | F | GM | GOOGL | IVV | MSFT | SPY |
|---|---|---|---|---|---|---|---|---|
| 2020-07-01 | -0.001892 | 0.009885 | -0.016447 | -0.013439 | 0.016889 | 0.006846 | 0.005847 | 0.007005 |
| 2020-07-02 | 0.0 | -0.011369 | 0.011706 | 0.011218 | 0.019369 | 0.00526 | 0.007621 | 0.005507 |
| 2020-07-06 | 0.02675 | 0.027628 | 0.02314 | 0.019414 | 0.020219 | 0.015345 | 0.021526 | 0.015437 |
| 2020-07-07 | -0.003103 | -0.01314 | -0.011309 | -0.023319 | -0.006488 | -0.010306 | -0.011628 | -0.010314 |
| 2020-07-08 | 0.02329 | 0.019028 | -0.004902 | -0.007561 | 0.009182 | 0.007461 | 0.021993 | 0.007649 |


**Regressand and regressors set up:**
> Let's use securities to replicate BTC-USD

In [566]:
y = data['BTC-USD']

x = data[['AAPL','F','GM','GOOGL','IVV','MSFT','SPY']]

Let's split the data chronologically into train (first 75% of the data) and test (last 25%) sets; `shuffle = False` keeps the rows in date order:

In [567]:
x_train, x_test , y_train, y_test = train_test_split(x, y, test_size=0.25, shuffle = False)
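
The unshuffled split above amounts to cutting the sample at a single position. A minimal sketch of the same idea in plain Python follows; `chronological_split` and `daily_returns` are hypothetical names used only for illustration, and rounding the test size up is an assumption about how scikit-learn resolves fractional sizes.

```python
import math

def chronological_split(rows, test_size=0.25):
    """Split an ordered sequence into an early train part and a late test part."""
    # Assumption: the test size is rounded up, the train part takes the rest.
    n_test = math.ceil(len(rows) * test_size)
    n_train = len(rows) - n_test
    return rows[:n_train], rows[n_train:]

# Made-up daily returns, ordered by date.
daily_returns = [0.01, -0.02, 0.03, 0.00, 0.015, -0.005, 0.02, -0.01]
train_part, test_part = chronological_split(daily_returns)
print(len(train_part), len(test_part))
```

Because nothing is shuffled, every test observation comes after every train observation, which avoids fitting on data from the future.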

### Penalized Regression
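These methods deal with multicollinearity by adding a penalty on the size of the regression coefficients, which shrinks them toward zero and stabilizes the estimates when regressors are highly correlated.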
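
To see why multicollinearity is a problem, it can help to look at a synthetic example. The sketch below uses made-up data (`x1`, `x2`, `x3` are hypothetical regressors, with `x2` built as a near copy of `x1`) and checks the correlation matrix and the condition number of $X^{T}X$, with and without a small ridge-style term added to the diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 250

# Synthetic regressors: x2 is almost a copy of x1, x3 is independent.
x1 = rng.normal(size=n_obs)
x2 = x1 + 0.01 * rng.normal(size=n_obs)
x3 = rng.normal(size=n_obs)
X_demo = np.column_stack([x1, x2, x3])

# Pairwise correlations between the columns.
corr = np.corrcoef(X_demo, rowvar=False)

# A large condition number of X'X means OLS coefficients are unstable.
gram = X_demo.T @ X_demo
cond_plain = np.linalg.cond(gram)

# Adding a multiple of the identity (here 1.0) improves the conditioning.
cond_penalized = np.linalg.cond(gram + 1.0 * np.eye(3))

print(np.round(corr, 4))
print(cond_plain, cond_penalized)
```

The same `np.corrcoef` check could be applied to the real regressors to judge how much collinearity is present before reaching for a penalized method.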

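**Lasso:** Lasso adds a penalty equal to the sum of the absolute values of the coefficients (the $\ell_1$ norm), multiplied by a parameter $\lambda$. This penalty can set some coefficients exactly to zero.
> $\beta^{LASSO}$ minimizes $\|y-X\beta\|_2^{2} + \lambda\|\beta\|_1$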

**Example Code**

In [568]:
### Alpha plays the role of the lambda parameter above (scikit-learn scales the squared-error term by 1/(2n)); alpha = 0 corresponds to OLS
Lasso_model = Lasso(alpha = 0.00005)
Lasso_res = Lasso_model.fit(x_train, y_train)

### Lasso coefficients
Lasso_res.coef_

array([ 0.21858932, -0.00757204,  0.12843141,  0.04505072,  0.        ,
        0.        ,  0.        ])
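
The exact zeros in the coefficient array above are characteristic of the absolute-value penalty. A minimal sketch of why, assuming the objective is the squared error plus $\lambda$ times the sum of absolute coefficients and ignoring the intercept: each coordinate update applies a soft-threshold, which maps small values to exactly zero. The function names and the synthetic data are hypothetical, and this is not how scikit-learn's solver is implemented in detail.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly 0."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iter=500):
    """Minimize ||y - X b||^2 + lam * sum(|b|) by cyclic coordinate descent."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            # Residual with coordinate j's contribution added back in.
            r_j = y - X @ beta + X[:, j] * beta[j]
            beta[j] = soft_threshold(X[:, j] @ r_j, lam / 2.0) / col_sq[j]
    return beta

# Synthetic data: only the first and third regressors matter.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
true_beta = np.array([1.5, 0.0, -2.0, 0.0, 0.0])
y_demo = X_demo @ true_beta + 0.1 * rng.normal(size=200)

beta_small = lasso_coordinate_descent(X_demo, y_demo, lam=0.0)   # no penalty
beta_large = lasso_coordinate_descent(X_demo, y_demo, lam=50.0)  # strong penalty
print(np.round(beta_small, 3))
print(np.round(beta_large, 3))
```

With no penalty the result coincides with least squares; with a larger penalty the irrelevant coefficients are dropped and the relevant ones are shrunk.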

In [576]:
### We can optimize the alpha parameter by using cross-validation
### Use 10-fold cross-validation (cv = 10)
Lasso_cv = LassoCV(alphas = None, cv = 10, max_iter = 100000)
Lasso_cv_model = Lasso_cv.fit(x_train,y_train)
Lasso_cv_model.coef_

array([ 0.19908325, -0.        ,  0.10306768,  0.01140893,  0.        ,
        0.        ,  0.        ])

In [577]:
### Optimal parameter
Lasso_cv_model.alpha_

6.720575171941788e-05

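**Ridge:** Ridge adds a penalty equal to the sum of the squared coefficients (the squared $\ell_2$ norm), multiplied by a parameter $\lambda$. This penalty shrinks coefficients toward zero but generally does not set them exactly to zero.
> $\beta^{RIDGE}$ minimizes $\|y-X\beta\|_2^{2} + \lambda\|\beta\|_2^{2}$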

**Example Code:**

In [578]:
### Alpha is the lambda parameter above; alpha = 0 corresponds to OLS
Ridge_model = Ridge(alpha = 0.005)
Ridge_res = Ridge_model.fit(x_train, y_train)

### Ridge coefficients
Ridge_res.coef_

array([ 0.16406925, -0.32258409,  0.28691113,  0.04768535,  0.36066716,
       -0.08657664,  0.34537472])
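
Unlike the Lasso output, none of the Ridge coefficients above is exactly zero. For the squared-coefficient penalty the minimizer has a closed form, $\beta = (X^{T}X + \lambda I)^{-1}X^{T}y$, which the sketch below evaluates on synthetic data. It omits the intercept for brevity (an assumption; scikit-learn's `Ridge` fits an unpenalized intercept by default), and `ridge_closed_form` is a hypothetical helper name.

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """Solve (X'X + lam * I) b = X'y, the minimizer of ||y - X b||^2 + lam * ||b||^2."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 4))
y_demo = X_demo @ np.array([1.0, -1.0, 0.5, 0.0]) + 0.5 * rng.normal(size=100)

# Larger penalties give a smaller coefficient vector.
for lam in [0.0, 10.0, 1000.0]:
    beta = ridge_closed_form(X_demo, y_demo, lam)
    print(lam, np.round(beta, 3), round(float(np.linalg.norm(beta)), 3))
```

At `lam = 0.0` this reduces to ordinary least squares, and the norm of the coefficient vector falls as `lam` grows.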

In [579]:
### We can optimize the alpha parameter by inputting a set of possible alphas and a scoring selection metric
alphas = 10**np.linspace(10,-2,100)*0.5

Ridge_cv = RidgeCV(alphas = alphas, scoring = 'neg_mean_squared_error')
Ridge_cv_model = Ridge_cv.fit(x_train,y_train)
Ridge_cv_model.coef_

array([ 0.12416993, -0.07191499,  0.12953284,  0.06755291,  0.09039052,
        0.05541742,  0.08951268])

In [580]:
### Optimal parameter
Ridge_cv_model.alpha_

0.08148754173103201
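
The idea behind choosing the penalty by cross-validation can be sketched by hand: for each candidate value, fit on part of the training data, score on the held-out part, and keep the value with the lowest average error. The example below is a simplified K-fold version on synthetic data using the ridge closed form without an intercept. It is an illustration under those assumptions, not the procedure `LassoCV` or `RidgeCV` runs internally, and all names in it are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge coefficients without an intercept."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def kfold_cv_mse(X, y, lam, k=5):
    """Average validation MSE over k contiguous folds."""
    n = len(y)
    folds = np.array_split(np.arange(n), k)
    errors = []
    for val_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[val_idx] = False
        beta = ridge_fit(X[train_mask], y[train_mask], lam)
        resid = y[val_idx] - X[val_idx] @ beta
        errors.append(np.mean(resid ** 2))
    return float(np.mean(errors))

# Synthetic, noisy data for illustration only.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(120, 6))
y_demo = X_demo @ np.array([0.5, -0.3, 0.0, 0.0, 0.2, 0.0]) + rng.normal(size=120)

# Grid of candidate penalties on a log scale.
candidate_lams = 10.0 ** np.linspace(3, -2, 30)
cv_scores = [kfold_cv_mse(X_demo, y_demo, lam) for lam in candidate_lams]
best_lam = candidate_lams[int(np.argmin(cv_scores))]
print(best_lam, min(cv_scores))
```

The selected value depends on the grid, so it is worth checking that the optimum does not sit at either end of `candidate_lams`.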

**Out of Sample MSE Performance for Ridge vs. OLS:**

In [581]:
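### Ridge out of sample mean squared error
### Note: the normalize argument was removed in scikit-learn 1.2, so this line needs an older version to run as written
### Note: alpha_ was selected by RidgeCV above without normalization
ridge_oos = Ridge(alpha = Ridge_cv_model.alpha_, normalize = True)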
ridge_oos.fit(x_train, y_train)

mean_squared_error(y_test, ridge_oos.predict(x_test))

0.003300072449996496

In [582]:
OLS_oos = LinearRegression().fit(x_train, y_train)

mean_squared_error(y_test, OLS_oos.predict(x_test))

0.0033692932864966122
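
To put the two out-of-sample numbers above side by side, the relative difference can be computed directly. The values are copied from the cell outputs; the variable names are chosen here for illustration.

```python
# Out-of-sample MSE values reported in the two cells above.
ridge_mse = 0.003300072449996496
ols_mse = 0.0033692932864966122

# Fraction by which Ridge lowers the MSE relative to OLS.
relative_improvement = (ols_mse - ridge_mse) / ols_mse
print(f"Ridge lowers out-of-sample MSE by {relative_improvement:.2%} relative to OLS")
```

The gain is roughly 2%, which is small relative to the noise in daily returns.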

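Note that Ridge and Lasso are not very effective in this use case: there are only seven x variables and, apart from IVV and SPY (which both track the S&P 500), they are not highly collinear. Accordingly, Ridge's out-of-sample MSE is only slightly below that of OLS. This code mainly serves as a demo of how to implement Ridge and Lasso.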