# Shrinkage Methods
* the subset selection methods described earlier use OLS to fit a linear model that contains a subset of the predictors

* an alternative is to fit a model containing all predictors using a technique that constrains, or 'regularizes,' the coefficient estimates, or equivalently, shrinks the coefficient estimates towards zero
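A minimal sketch of what "shrinking towards zero" means in practice. It uses ridge regression as one common regularizer; that choice is an assumption here, since the text above has not yet named a specific method. The data are synthetic, and `ridge_coefficients`, `penalties`, and `beta_true` are illustrative names, not part of the housing example.

```python
import numpy as np

def ridge_coefficients(X, y, lam):
    """Closed-form ridge estimate (X'X + lam*I)^{-1} X'y for centered data, no intercept."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Synthetic data (assumption: not the housing data used in this notebook)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X = X - X.mean(axis=0)                      # center predictors
beta_true = np.array([2.0, -1.0, 0.5, 0.0, 0.0])
y = X @ beta_true + rng.normal(scale=0.5, size=200)
y = y - y.mean()                            # center response

# lam = 0 reproduces OLS; larger penalties pull the coefficient vector towards zero
penalties = [0.0, 10.0, 100.0, 1000.0]
norms = [np.linalg.norm(ridge_coefficients(X, y, lam)) for lam in penalties]
for lam, norm in zip(penalties, norms):
    print(f"lambda={lam:>7.1f}  ||beta||={norm:.3f}")
```

The printed norms fall as the penalty grows, which is the constraint on the coefficient estimates that the bullet above describes.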

* previously, we wanted to find the most parsimonious model for the following dataset:

In [1]:
import pandas as pd
import numpy as np
import patsy
import statsmodels.api as sm

# download dataset and view first five observations
df = pd.read_stata('http://fmwww.bc.edu/ec-p/data/wooldridge/hprice2.dta')
df.head()

| | price | crime | nox | rooms | dist | radial | proptax | stratio | lowstat | lprice | lnox | lproptax |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 24000.0 | 0.006 | 5.38 | 6.57 | 4.09 | 1.0 | 29.6 | 15.3 | 4.98 | 10.08581 | 1.682688 | 5.69036 |
| 1 | 21599.0 | 0.027 | 4.69 | 6.42 | 4.97 | 2.0 | 24.200001 | 17.799999 | 9.14 | 9.980402 | 1.545433 | 5.488938 |
| 2 | 34700.0 | 0.027 | 4.69 | 7.18 | 4.97 | 2.0 | 24.200001 | 17.799999 | 4.03 | 10.4545 | 1.545433 | 5.488938 |
| 3 | 33400.0 | 0.032 | 4.58 | 7.0 | 6.06 | 3.0 | 22.200001 | 18.700001 | 2.94 | 10.41631 | 1.521699 | 5.402678 |
| 4 | 36199.0 | 0.069 | 4.58 | 7.15 | 6.06 | 3.0 | 22.200001 | 18.700001 | 5.33 | 10.49679 | 1.521699 | 5.402678 |


* in the best, forward, and backward subset searches, we found the best model to be the following:

$ lprice = \beta_0 + \beta_1 lnox + \beta_2 lproptax + \beta_3 crime + \beta_4 rooms + \beta_5 dist + \beta_6 radial + \beta_7 stratio + \beta_8 lowstat + e$

* this model, however, excludes the possibility of interaction terms among predictors

* consider the most complete potential model, with all possible pairwise cross-products among the regressors after they are recentered at their means

$ lprice = \beta_0 + \beta_1 lnox + \beta_2 lproptax + \beta_3 crime + \beta_4 rooms + \beta_5 dist + \beta_6 radial + \beta_7 stratio + \beta_8 lowstat + \beta_9 (lnox - u_{lnox})(lproptax - u_{lproptax}) + \beta_{10} (lnox - u_{lnox})(crime - u_{crime}) + \dots + \beta_{35} (radial - u_{radial})(lowstat - u_{lowstat}) + \beta_{36} (stratio - u_{stratio})(lowstat - u_{lowstat}) + e$
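The number of regressors in the interaction model can be checked by enumerating the unordered pairs of predictors. This is a small sketch using only the predictor names; `pairs`, `n_main`, and `n_cross` are illustrative names introduced here.

```python
from itertools import combinations

# The eight predictors of the model selected by subset search
variables = ['lnox', 'lproptax', 'crime', 'rooms', 'dist', 'radial', 'stratio', 'lowstat']

# Every unordered pair of distinct predictors yields one cross-product term
pairs = list(combinations(variables, 2))

n_main = len(variables)   # main-effect terms
n_cross = len(pairs)      # pairwise cross-product terms
print(n_main, n_cross, n_main + n_cross)
```

Squared terms are not counted, since the model above includes only products of distinct predictors.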

* this model has 36 regressors (8 main effects plus 28 pairwise cross-products), 4.5 times the original number of predictors

* $\beta_2$ := elasticity of home prices with respect to property tax when the other predictors are at their mean values
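
* to see why, differentiate the interaction model with respect to $lproptax$, treating the means as fixed constants; as a sketch, write $\gamma_k$ for the coefficient on the cross-product of demeaned $lproptax$ with demeaned predictor $x_k$ (this shorthand is introduced here and is not notation from the model above):

$ \frac{\partial\, lprice}{\partial\, lproptax} = \beta_2 + \sum_{k \neq lproptax} \gamma_k (x_k - u_{x_k}) $

* every term in the sum vanishes when each other predictor $x_k$ equals its mean $u_{x_k}$, leaving $\beta_2$; away from the means, the elasticity varies with the values of the other predictors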


In [2]:
# add a demeaned copy of each predictor to the original data frame (first step toward building all cross-products)
variables = ['lnox','lproptax','crime','rooms','dist','radial','stratio','lowstat']
for x in variables:
    df[x+'_dmean'] = df[x] - df[x].mean(skipna = True)
print(list(df))

['price', 'crime', 'nox', 'rooms', 'dist', 'radial', 'proptax', 'stratio', 'lowstat', 'lprice', 'lnox', 'lproptax', 'lnox_dmean', 'lproptax_dmean', 'crime_dmean', 'rooms_dmean', 'dist_dmean', 'radial_dmean', 'stratio_dmean', 'lowstat_dmean']
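
One way the cross-product columns could then be built from demeaned predictors is sketched below on a small synthetic frame. The helper `add_demeaned_cross_products` and the `_x_` column naming are hypothetical choices made for illustration, and the notebook's own construction may differ.

```python
from itertools import combinations
import numpy as np
import pandas as pd

def add_demeaned_cross_products(frame, columns):
    """Return a copy of `frame` with one product column per pair of demeaned predictors."""
    out = frame.copy()
    # Demean each predictor once, then multiply pairwise
    demeaned = {c: out[c] - out[c].mean() for c in columns}
    for left, right in combinations(columns, 2):
        out[f"{left}_x_{right}"] = demeaned[left] * demeaned[right]
    return out

# Toy data (assumption: stands in for the housing data frame)
rng = np.random.default_rng(1)
toy = pd.DataFrame(rng.normal(size=(50, 3)), columns=["a", "b", "c"])
expanded = add_demeaned_cross_products(toy, ["a", "b", "c"])
print(list(expanded))
```

Applied to the eight predictors in `variables`, the same loop would add 28 columns, one per unordered pair.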
