## Workshop - Regularization

In this workshop, we are going to:

1. Tune an elastic-net regression 
2. Compare the following models:
    1. The null model
    2. The tuned elastic-net model
    3. The trimmed non-regularized model with standardized features
    4. The trimmed non-regularized model with non-standardized features
    
# Preliminaries

- Load any necessary packages and/or functions
- Load in and prepare the class data
- Create `x` and `y`, with `pct_d_rgdp` as the label
- Create `x_train`, `x_test`, `y_train`, `y_test` with
    * training size of two-thirds
    * random state of 490
- Standardize the features
- Add constants

In [2]:
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn import linear_model as lm

In [3]:
df = pd.read_csv('C:/Users/Devin/Documents/Classes/Current/ECON490/class_data.csv')
df.columns

Index(['fips', 'year', 'GeoName', 'pct_d_rgdp', 'urate_bin', 'pos_net_jobs',
       'emp_estabs', 'estabs_entry_rate', 'estabs_exit_rate', 'pop',
       'pop_pct_black', 'pop_pct_hisp', 'lfpr', 'density'],
      dtype='object')

In [4]:
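# One-hot encode the categorical columns, dropping the first level of each.
# dtype = float and the string prefix keep all columns numeric with string names,
# which recent pandas/statsmodels/scikit-learn versions expect.
df_prepped = df.drop(columns = ['urate_bin', 'year']).join([
    pd.get_dummies(df['urate_bin'], drop_first = True, dtype = float),
    pd.get_dummies(df['year'], prefix = 'year', drop_first = True, dtype = float)
])
# GeoName is a text identifier, not a feature
df_prepped.drop(columns = ['GeoName'], inplace = True)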

In [5]:
y = df_prepped['pct_d_rgdp']
x = df_prepped.drop(columns = 'pct_d_rgdp')

x_train, x_test, y_train, y_test = train_test_split(x, y, train_size = 2/3, random_state = 490)

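# Standardize both sets with the training means and standard deviations,
# so no information from the test set enters the scaling
train_mean, train_sd = x_train.mean(), x_train.std(ddof = 0)
x_train_std = (x_train - train_mean)/train_sd
x_test_std  = (x_test - train_mean)/train_sd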

x_train_std = sm.add_constant(x_train_std)
x_test_std  = sm.add_constant(x_test_std)
x_train     = sm.add_constant(x_train)
x_test      = sm.add_constant(x_test)

Take a look at `lm.ElasticNet?` and 
```
fit = sm.OLS(y_train, x_train)
fit.fit_regularized?
```
Determine which parameters are the same but named differently.
Specifically, the penalty strength $\alpha$ and the weight that splits the penalty between the two constraints (i.e. $||\beta||_2^2$ and $||\beta||_1$).
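
As a check on your answer, the two documented objectives can be written side by side. This is a sketch based on the library docstrings, so confirm it against your installed versions. Here $n$ is the number of observations, $\rho$ stands for `l1_ratio` in `lm.ElasticNet`, and $w$ stands for `L1_wt` in `fit_regularized` (the symbols $\rho$ and $w$ are our own shorthand).

$$
\text{scikit-learn:}\quad \min_\beta \; \frac{1}{2n}||y - X\beta||_2^2 + \alpha\,\rho\,||\beta||_1 + \frac{\alpha\,(1-\rho)}{2}||\beta||_2^2
$$

$$
\text{statsmodels:}\quad \min_\beta \; \frac{1}{2n}||y - X\beta||_2^2 + \alpha\left(\frac{1-w}{2}||\beta||_2^2 + w\,||\beta||_1\right)
$$

Expanding the second penalty term by term gives the first, with $\rho = w$.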

In [None]:
fit = sm.OLS(y_train, x_train)
fit_reg = fit.fit_regularized(method = 'elastic_net')
print(fit_reg.params)

Perform a 5-fold cross-validation grid search with a random state of 490. 
Identify the optimally tuned hyperparameters.
Use this grid:
```
param_grid = {'alpha': 10.**np.arange(-5, -1, 1), 
              'l1_ratio': np.arange(0, 1, 0.1)}
```
You will get a warning message about convergence.
We will discuss it after the workshop.
Think about why it is occurring.

In [23]:
param_grid = {'alpha': 10.**np.arange(-5, -1, 1), 
              'l1_ratio': np.arange(0, 1, 0.1)}

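# The constant is already a column of x_train_std, so no separate intercept is fit.
# The normalize argument (default False) was removed in scikit-learn 1.2, so it is omitted.
cv_elastic = lm.ElasticNet(fit_intercept = False, random_state = 490)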

grid_search = GridSearchCV(cv_elastic, param_grid, cv = 5, scoring = 'neg_root_mean_squared_error')

grid_search.fit(x_train_std, y_train)
best = grid_search.best_params_['alpha']
best


0.01

****
# Question

How many models did we just fit?
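
One way to reason about the count is to enumerate the grid. The sketch below assumes the grid and fold count used above, and assumes `GridSearchCV` was left at its default `refit = True`, which adds one final fit on the full training set. The names `n_candidates`, `n_cv_fits`, and `n_total_fits` are our own.

```python
import numpy as np

# Same grid as in the workshop
param_grid = {'alpha': 10.**np.arange(-5, -1, 1),
              'l1_ratio': np.arange(0, 1, 0.1)}
n_folds = 5

# Every combination of alpha and l1_ratio is one candidate model
n_candidates = len(param_grid['alpha']) * len(param_grid['l1_ratio'])

# Each candidate is fit once per fold
n_cv_fits = n_candidates * n_folds

# Assumption: refit = True (the default), so the best candidate is fit once more
n_total_fits = n_cv_fits + 1

print(n_candidates, n_cv_fits, n_total_fits)  # 40 200 201
```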

***
Using the tuned hyperparameters, fit your elastic-net model with `statsmodels`.

Using the selected features, refit:

- the non-regularized model with standardized features
- the non-regularized model with non-standardized features

For the elastic-net model and each OLS model, compute the percent improvement in RMSE relative to the null model, then compare.
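
The comparison can be sketched on toy numbers. The helper names `rmse` and `pct_improvement` and the `*_toy` arrays are hypothetical, and the sketch assumes the null model predicts the training-set mean of the label for every test observation.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length arrays."""
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(diff**2)))

def pct_improvement(rmse_null, rmse_model):
    """Percent reduction in RMSE relative to the null model."""
    return 100 * (rmse_null - rmse_model) / rmse_null

# Toy values standing in for y_train, y_test, and a model's test predictions
y_train_toy = np.array([1.0, 2.0, 3.0, 4.0])
y_test_toy = np.array([2.0, 4.0])
pred_model_toy = np.array([2.5, 3.5])

# Assumption: the null model always predicts the training mean (2.5 here)
pred_null_toy = np.full_like(y_test_toy, y_train_toy.mean())

rmse_null = rmse(y_test_toy, pred_null_toy)    # sqrt(1.25), about 1.118
rmse_model = rmse(y_test_toy, pred_model_toy)  # 0.5
print(round(pct_improvement(rmse_null, rmse_model), 2))  # 55.28
```

A positive value means the model beats the null model out of sample; a negative value means it does worse than simply predicting the mean.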