<center><h1> Case Study 1</h1></center>
<center><h3> Week 1 (out of 5)</h3></center>

**Author(s):**
1. Robin Fu (robin.fu@emory.edu)
 
**Data Source**: W.C. Hunter and M.B. Walker (1996), [“*The Cultural Affinity Hypothesis and Mortgage Lending Decisions*,”](https://link.springer.com/article/10.1007/BF00174551) Journal of Real Estate Finance and Economics 13, 57-70.
 
**Book**: [Introductory Econometrics: A Modern Approach](https://economics.ut.ac.ir/documents/3030266/14100645/Jeffrey_M._Wooldridge_Introductory_Econometrics_A_Modern_Approach__2012.pdf) by Jeffrey Wooldridge

**Data Description**: ```http://fmwww.bc.edu/ec-p/data/wooldridge/loanapp.dta```

```
  Obs:  1989

  1. occ                       occupancy0hb
  2. loanamt                   loan amt in thousands
  3. action                    type of action taken
  4. msa                       msa number of property
  5. suffolk                   =1 if property in Suffolk County
  6. race                      race of applicant
  7. gender                    gender of applicant
  8. appinc                    applicant income, $1000s
  9. typur                     type of purchaser of loan
 10. unit                      number of units in property
 11. married                   =1 if applicant married
 12. dep                       number of dependents
 13. emp                       years employed in line of work
 14. yjob                      years at this job
 15. self                      self-employment dummy
 16. atotinc                   total monthly income
 17. cototinc                  coapp total monthly income
 18. hexp                      propose housing expense
 19. price                     purchase price
 20. other                     other financing, $1000s
 21. liq                       liquid assets
 22. rep                       no. of credit reports
 23. gdlin                     credit history meets guidelines
 24. lines                     no. of credit lines on reports
 25. mortg                     credit history on mortgage paym
 26. cons                      credit history on consumer stuf
 27. pubrec                    =1 if filed bankruptcy
 28. hrat                      housing exp, % total inccome
 29. obrat                     other oblgs,  % total income
 30. fixadj                    fixed or adjustable rate?
 31. term                      term of loan in months
 32. apr                       appraised value
 33. prop                      type of property
 34. inss                      PMI sought
 35. inson                     PMI approved
 36. gift                      gift as down payment
 37. cosign                    is there a cosigner
 38. unver                     unverifiable info
 39. review                    number of times reviewed
 40. netw                      net worth
 41. unem                      unemployment rate by industry
 42. min30                     =1 if minority pop. > 30%
 43. bd                        =1 if boarded-up val > MSA med
 44. mi                        =1 if tract inc > MSA median
 45. old                       =1 if applic age > MSA median
 46. vr                        =1 if tract vac rte > MSA med
 47. sch                       =1 if > 12 years schooling
 48. black                     =1 if applicant black
 49. hispan                    =1 if applicant Hispanic
 50. male                      =1 if applicant male
 51. reject                    =1 if action == 3
 52. approve                   =1 if action == 1 or 2
 53. mortno                    no mortgage history
 54. mortperf                  no late mort. payments
 55. mortlat1                  one or two late payments
 56. mortlat2                  > 2 late payments
 57. chist                     =0 if accnts deliq. >= 60 days
 58. multi                     =1 if two or more units
 59. loanprc                   amt/price
 60. thick                     =1 if rep > 2
 61. white                     =1 if applicant white
 62. obwhte                    obrat*awhite
 ```

Original Research Variables to Include:
1. approve (=1 if action ==1 or 2)
2. hrat
3. obrat
4. mhist (=1 if 2 or fewer mortgage payments recorded as late, 0 otherwise); mortlat2 = 0 when mhist = 1
5. pubrec (public records history, 1 if public record default, 0 otherwise)
6. self (1 if self-employed, 0 otherwise)
7. chist (1 if no history of delinquent credits, 0 otherwise)
8. unem (estimate of probability of unemployment by industry), time 
9. multi (dummy)
10. cosign (dummy)
11. married (dummy)
12. loanprc 
13. dep (factor)
14. sch (dummy)
15. thick (dummy)
16. white (=1 if white, 0 otherwise)
17. gender (=1 if male, 0 otherwise)
18. vr (=1 if tract vacancy rate > MSA median, 0 otherwise)
19. school * white
20. school * gender
21. school * thick
22. school * vr
23. chist * white
24. chist * gender
25. chist * thick
26. chist * vr
27. obwhte
28. race (note: race = 5 is white, race = 4 is hispan, race = 3 is black), included per assignment 

In your new job as a data analyst, you have been given this data set and asked to _build_ a machine (in this case a **linear probability model**) to help your client automate their loan approval decisions. Using your knowledge of the Ridge, Lasso, and Elastic Net estimators, find an economically sound model with the smallest *mean squared error* when 20% of the observations in this data set are held out to validate the proposed model, using a seed equal to 42.

<h4>Things to consider:</h4>

    1. Read the original paper to understand how the variables were constructed and the features in their model.
    2. Read section 7.4 titled "Interactions Involving Dummy Variables" in 'Introductory Econometrics: A Modern Approach' by Jeffrey Wooldridge.
    3. Read section 7.5 titled "A Binary Dependent Variable: The Linear Probability Model" in 'Introductory Econometrics: A Modern Approach' by Jeffrey Wooldridge.
    4. You should try standardizing as well as normalizing your chosen features when trying different estimators.
    5. Always estimate an intercept.
    6. Your answers should include the final chosen specification and the reported mean squared error of your validation data set.
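
To make item 3 concrete, here is a minimal sketch of a linear probability model on made-up data (the arrays below are illustrative and are not taken from `loanapp.dta`). With a single dummy regressor, the OLS intercept is the approval rate of the base group and the slope is the difference in approval rates between the two groups.

```python
import numpy as np

# Hypothetical toy data: 1 = approved, 0 = rejected; group is a 0/1 dummy.
approve = np.array([1, 1, 0, 1, 1, 1, 0, 0, 1, 0], dtype=float)
group = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0], dtype=float)

# Design matrix with an explicit intercept column.
X = np.column_stack([np.ones_like(group), group])

# OLS via least squares: approve = b0 + b1 * group + error.
beta, *_ = np.linalg.lstsq(X, approve, rcond=None)
intercept, slope = beta

# Fitted values are the predicted approval probabilities.
fitted = X @ beta
mse = np.mean((approve - fitted) ** 2)

rate_base = approve[group == 0].mean()    # approval rate when group == 0
rate_other = approve[group == 1].mean()   # approval rate when group == 1
print(intercept, slope, mse)
```

With continuous regressors the fitted values of a linear probability model are not guaranteed to stay inside [0, 1], which is one of the caveats discussed in Wooldridge's section 7.5.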

# Data Preprocessing

In [1]:
import pandas as pd
import numpy as np
import patsy
from sklearn.metrics import mean_squared_error

#Loading data and creating variables in original study and race
initial = pd.read_stata('http://fmwww.bc.edu/ec-p/data/wooldridge/loanapp.dta')
df =  initial.loc[:,['approve','hrat','obrat','mortlat2','pubrec','self','chist',
      'unem','multi','cosign','married','loanprc','dep','sch','thick','white','male','vr','obwhte','race']]

temp = ['white','male','thick','vr']
new = []
for elem1 in ['sch','chist']:
    for elem2 in temp:
        df[elem1[:3] + elem2[:2]] = df[elem1] * df[elem2]
        new.append(elem1[:3] + elem2[:2])
df = df.drop(columns = 'white')

#Clearing observations that have NA
df = df.dropna()

#Creating 6 dummy variables by race and gender, setting white_male to be base category
dummies = pd.get_dummies(df.race.astype(str) + df.male.astype(str))
dummies.columns = ['black_female','black_male',
                   'hispan_female','hispan_male',
                   'white_female','white_male']

df[dummies.columns] = dummies
df = df.drop(columns = ['male','race','white_male'])

tmp = list(df.columns)[1:]
for x in tmp:
    df[x+'_dmean'] = df[x] - df[x].mean()
print(list(df))

['approve', 'hrat', 'obrat', 'mortlat2', 'pubrec', 'self', 'chist', 'unem', 'multi', 'cosign', 'married', 'loanprc', 'dep', 'sch', 'thick', 'vr', 'obwhte', 'schwh', 'schma', 'schth', 'schvr', 'chiwh', 'chima', 'chith', 'chivr', 'black_female', 'black_male', 'hispan_female', 'hispan_male', 'white_female', 'hrat_dmean', 'obrat_dmean', 'mortlat2_dmean', 'pubrec_dmean', 'self_dmean', 'chist_dmean', 'unem_dmean', 'multi_dmean', 'cosign_dmean', 'married_dmean', 'loanprc_dmean', 'dep_dmean', 'sch_dmean', 'thick_dmean', 'vr_dmean', 'obwhte_dmean', 'schwh_dmean', 'schma_dmean', 'schth_dmean', 'schvr_dmean', 'chiwh_dmean', 'chima_dmean', 'chith_dmean', 'chivr_dmean', 'black_female_dmean', 'black_male_dmean', 'hispan_female_dmean', 'hispan_male_dmean', 'white_female_dmean']


In [2]:
artregs = ['hrat','obrat','mortlat2','pubrec','self','chist',
      'unem','multi','cosign','married','loanprc','dep','sch','thick',
      'vr','obwhte'] + new

tmpstr = []
for reg1 in artregs:
    for reg2 in dummies.columns.difference(['white_male']):
        tmpstr.append('('+reg1+'_dmean:'+reg2+'_dmean'+')')
        
f = 'approve ~ -1 + ' + ''.join([x+' + ' for x in tmp]) + ''.join([x+' + ' for x in tmpstr])[:-2]

y, X = patsy.dmatrices(f, data=df, return_type='dataframe')
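
The formula above interacts demeaned regressors with demeaned group dummies, in the spirit of Wooldridge's section 7.4. A small sketch on simulated data (all names and numbers below are made up) suggests why this can be useful: centering before interacting leaves the fitted values unchanged, but the coefficient on the main effect becomes the slope evaluated at the sample mean of the dummy rather than the slope for the base group.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(loc=30.0, scale=5.0, size=n)    # continuous regressor (hypothetical)
d = (rng.random(n) < 0.3).astype(float)        # 0/1 group dummy (hypothetical)
y = 1.0 - 0.01 * x + 0.2 * d - 0.005 * x * d + rng.normal(scale=0.1, size=n)

def ols(columns, y):
    """Return OLS coefficients and fitted values for an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y))] + columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, X @ beta

# Raw interaction: the coefficient on x is the slope for the base group (d == 0).
beta_raw, fit_raw = ols([x, d, x * d], y)

# Centered interaction: the coefficient on x is the slope at the mean of d.
beta_ctr, fit_ctr = ols([x, d, (x - x.mean()) * (d - d.mean())], y)

same_fit = np.allclose(fit_raw, fit_ctr)
slope_at_mean_d = beta_raw[1] + beta_raw[3] * d.mean()
print(same_fit, beta_ctr[1], slope_at_mean_d)
```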

# Regressions using Standardization

In [3]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_stnd = pd.DataFrame(scaler.fit_transform(X), columns = X.columns)
y_stnd = pd.DataFrame(scaler.fit_transform(y), columns = y.columns)

Xs_train, Xs_test, ys_train, ys_test = train_test_split(X_stnd, y_stnd, test_size = 0.2, random_state = 42)

StMSE = pd.DataFrame(columns = ["MSE", "Number of Regs"]) #Create empty dataframe to store MSE values and Regressors

### Comment

You standardized the data beforehand, but in the options of the Ridge, Lasso, and Elastic Net regressions you also set ```normalize = True```. On the library's web page the authors say: if you wish to standardize, please use ```sklearn.preprocessing.StandardScaler``` before calling fit on an estimator with ```normalize=False```. You standardized the data yourself, but the advice still applies.
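
One way to follow the documentation's advice, sketched here on simulated data rather than the loan data, is to put `StandardScaler` and the estimator in a pipeline. The scaler is then fit on the training rows only, and the target stays in its original 0/1 units, so the validation MSE is directly interpretable. The variable names and the alpha grid below are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n, p = 500, 10
X_demo = rng.normal(size=(n, p)) * rng.uniform(1, 20, size=p)   # features on different scales
prob = np.clip(0.8 - 0.02 * X_demo[:, 0] + 0.01 * X_demo[:, 1], 0, 1)
y_demo = (rng.random(n) < prob).astype(float)                   # simulated binary outcome

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

# The scaler is fit inside the pipeline, so it never sees the held-out rows.
model = make_pipeline(
    StandardScaler(),
    LassoCV(alphas=np.logspace(-4, -1, 30), cv=5, max_iter=100000),
)
model.fit(X_tr, y_tr)

test_mse = mean_squared_error(y_te, model.predict(X_te))
scaler = model.named_steps["standardscaler"]
print("alpha:", model.named_steps["lassocv"].alpha_, "test MSE:", test_mse)
```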

### *Ridge Regression*

In [4]:
from sklearn.linear_model import Ridge, RidgeCV #RidgeCV is Ridge Cross Validation

alphas = np.linspace(0.5,1.5,50)

ridgecv = RidgeCV(alphas = alphas+0.01, scoring = 'neg_mean_squared_error', normalize = True) #RidgeCV cannot take alpha = 0
ridgecv.set_params(fit_intercept = True)
ridgecv.fit(Xs_train, ys_train) #With the default cv=None, RidgeCV uses efficient leave-one-out cross-validation
print('Ridge Alpha:',ridgecv.alpha_) #calculates everything and returns optimal alpha after fit by above

ridge = Ridge(alpha = ridgecv.alpha_, normalize = True)
ridge.set_params(fit_intercept = True)
ridge.fit(Xs_train, ys_train)
StMSE.loc['Ridge Regression','MSE'] = mean_squared_error(ys_test, ridge.predict(Xs_test))
print('Mean Squared Error:',mean_squared_error(ys_test, ridge.predict(Xs_test)))

ridgeregs = pd.DataFrame(ridge.coef_.ravel(), index = X.columns)
print(ridgeregs)
StMSE.loc['Ridge Regression','Number of Regs'] = len(ridgeregs)

Ridge Alpha: 1.1834693877551021
Mean Squared Error: 0.8934063378888983
                                        0
hrat                            -0.007069
obrat                           -0.058842
mortlat2                        -0.007476
pubrec                          -0.084088
self                            -0.025456
...                                   ...
chivr_dmean:black_female_dmean  -0.001438
chivr_dmean:black_male_dmean     0.010136
chivr_dmean:hispan_female_dmean  0.000434
chivr_dmean:hispan_male_dmean    0.004132
chivr_dmean:white_female_dmean  -0.004819

[149 rows x 1 columns]


### *Lasso Regression*

In [5]:
from sklearn.linear_model import Lasso, LassoCV

alphas = np.linspace(0.0005,0.002,50, endpoint = True)

lasso = Lasso(max_iter = 10000, normalize = True) #max_iter caps the coordinate-descent iterations used to minimize the non-differentiable lasso objective

lassocv = LassoCV(alphas = list(alphas), cv = 10, max_iter = 100000, normalize = True)
lassocv.set_params(fit_intercept=True)
lassocv.fit(Xs_train, ys_train.values.ravel())
print('Lasso Alpha:',lassocv.alpha_)

lasso.set_params(alpha=lassocv.alpha_,fit_intercept=True)
lasso.fit(Xs_train, ys_train.values.ravel())  # need .ravel() to avoid warning
StMSE.loc['Lasso Regression','MSE'] = mean_squared_error(ys_test, lasso.predict(Xs_test))
print('Mean Squared Error:',mean_squared_error(ys_test, lasso.predict(Xs_test)))

#Selecting non-zero coefficients
lasregs = pd.DataFrame(lasso.coef_, index = X.columns)[list(pd.DataFrame(lasso.coef_, index = X.columns).T.any())]
print(lasregs)
StMSE.loc['Lasso Regression','Number of Regs'] = len(lasregs) 

Lasso Alpha: 0.0010510204081632653
Mean Squared Error: 0.879160955394095
                                          0
obrat                             -0.130277
pubrec                            -0.159693
self                              -0.014746
chist                              0.111748
multi                             -0.031258
loanprc                           -0.037058
vr                                -0.023290
obwhte                             0.066721
chiwh                              0.027850
black_male                        -0.012509
hrat_dmean:black_female_dmean     -0.002633
obrat_dmean:black_female_dmean    -0.018934
obrat_dmean:hispan_female_dmean   -0.024761
mortlat2_dmean:black_female_dmean  0.014932
self_dmean:black_male_dmean        0.023469
chist_dmean:black_male_dmean       0.011572
chist_dmean:hispan_male_dmean      0.036474
unem_dmean:black_female_dmean      0.019863
unem_dmean:hispan_male_dmean      -0.011833
multi_dmean:black_female_dmean    -0.042469
loa

### *Elastic Net Regression*

In [6]:
from sklearn.linear_model import ElasticNet, ElasticNetCV
enetcv = ElasticNetCV(cv=5, random_state=42,fit_intercept=True,normalize = True)
enetcv.fit(Xs_train, ys_train.values.ravel())
print('Elastic Alpha:',enetcv.alpha_)
print('Elastic l1 Ratio:',enetcv.l1_ratio)
StMSE.loc['Elastic Net','MSE'] = mean_squared_error(ys_test, enetcv.predict(Xs_test))
print('Mean Squared Error:',mean_squared_error(ys_test, enetcv.predict(Xs_test)))

enet = ElasticNet(alpha=enetcv.alpha_, l1_ratio=enetcv.l1_ratio, fit_intercept = True, normalize = True)
enet.fit(Xs_train,ys_train)
enetregs = pd.DataFrame(enet.coef_, index = X.columns)[list(pd.DataFrame(enet.coef_, index = X.columns).T.any())]
print(enetregs)
StMSE.loc['Elastic Net','Number of Regs'] = len(enetregs)

Elastic Alpha: 0.001042554701391982
Elastic l1 Ratio: 0.5
Mean Squared Error: 0.8854208508532732
                                       0
obrat                          -0.066141
mortlat2                       -0.001812
pubrec                         -0.096377
self                           -0.018672
chist                           0.060588
...                                  ...
chima_dmean:hispan_male_dmean   0.013064
chima_dmean:white_female_dmean -0.016409
chith_dmean:black_female_dmean  0.003607
chith_dmean:black_male_dmean    0.006697
chith_dmean:hispan_male_dmean   0.013892

[81 rows x 1 columns]
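
The non-zero filter used in the Lasso and Elastic Net cells can also be written with a named `pandas.Series`, which some readers may find easier to follow. The sketch below uses made-up coefficients and a handful of column names for illustration, not the fitted models above.

```python
import numpy as np
import pandas as pd

# Hypothetical fitted coefficients and their column names.
coef = np.array([-0.13, 0.0, 0.11, 0.0, -0.03])
names = ["obrat", "hrat", "chist", "dep", "multi"]

coefs = pd.Series(coef, index=names)

# Keep only the regressors the penalty did not shrink exactly to zero.
selected = coefs[coefs != 0]

print(selected)
print("Number of regressors:", len(selected))
```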


## *Conclusions about Standardization*

To compare the shrinkage methods above, we can look at the validation MSE and the number of regressors each one retains.

In [7]:
StMSE

| | MSE | Number of Regs |
|---|---|---|
| Ridge Regression | 0.893406 | 149 |
| Lasso Regression | 0.879161 | 25 |
| Elastic Net | 0.885421 | 81 |


For most purposes, the Ridge Regression will not be used regardless of the MSE it produces. Ridge shrinks coefficients without setting any of them exactly to zero, so it keeps *all* regressors in the final model and does not help when the goal is to select a more parsimonious specification.
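
The contrast can be seen in a minimal sketch, assuming an orthonormal design so that both estimators have closed forms (the OLS coefficients and the penalty below are made up). Ridge rescales every coefficient toward zero without reaching it, whereas the lasso soft-thresholds and sets small coefficients exactly to zero.

```python
import numpy as np

# Hypothetical OLS coefficients under an orthonormal design (X'X = I).
b_ols = np.array([0.9, -0.4, 0.05, -0.02])
lam = 0.1  # penalty strength (assumed for illustration)

# Ridge, minimizing 0.5*||y - Xb||^2 + 0.5*lam*||b||^2: proportional shrinkage.
b_ridge = b_ols / (1.0 + lam)

# Lasso, minimizing 0.5*||y - Xb||^2 + lam*||b||_1: soft-thresholding.
b_lasso = np.sign(b_ols) * np.maximum(np.abs(b_ols) - lam, 0.0)

print("ridge nonzero:", np.count_nonzero(b_ridge))
print("lasso nonzero:", np.count_nonzero(b_lasso))
```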

Comparing the results above, we can see that, after standardization, the Lasso Regression produces the lowest MSE (0.879, versus 0.885 for the Elastic Net and 0.893 for Ridge), indicating the best out-of-sample predictive performance of the three. The Lasso Regression is also preferred to the Elastic Net because it achieves this lower MSE with fewer regressors (25 versus 81), giving a model that is both more parsimonious and more accurate. The regressors picked by the Lasso Regression are listed below:

In [8]:
lasregs.index

Index(['obrat', 'pubrec', 'self', 'chist', 'multi', 'loanprc', 'vr', 'obwhte',
       'chiwh', 'black_male', 'hrat_dmean:black_female_dmean',
       'obrat_dmean:black_female_dmean', 'obrat_dmean:hispan_female_dmean',
       'mortlat2_dmean:black_female_dmean', 'self_dmean:black_male_dmean',
       'chist_dmean:black_male_dmean', 'chist_dmean:hispan_male_dmean',
       'unem_dmean:black_female_dmean', 'unem_dmean:hispan_male_dmean',
       'multi_dmean:black_female_dmean', 'loanprc_dmean:hispan_female_dmean',
       'sch_dmean:black_female_dmean', 'schth_dmean:black_male_dmean',
       'schth_dmean:white_female_dmean', 'chith_dmean:hispan_male_dmean'],
      dtype='object')

In [9]:
#OLS of the model selected by lasso regression via standardization
import statsmodels.api as sm
print(sm.OLS(y_stnd,sm.add_constant(X_stnd[lasregs.index])).fit().summary())

                            OLS Regression Results                            
Dep. Variable:                approve   R-squared:                       0.193
Model:                            OLS   Adj. R-squared:                  0.183
Method:                 Least Squares   F-statistic:                     18.55
Date:                Sun, 29 Mar 2020   Prob (F-statistic):           6.08e-73
Time:                        17:58:57   Log-Likelihood:                -2571.9
No. Observations:                1961   AIC:                             5196.
Df Residuals:                    1935   BIC:                             5341.
Df Model:                          25                                         
Covariance Type:            nonrobust                                         
                                        coef    std err          t      P>|t|      [0.025      0.975]
-----------------------------------------------------------------------------------------------------
const 

# Regressions using Normalization

In [10]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 42)
NoMSE = pd.DataFrame(columns = ["MSE", "Number of Regs"]) #Create empty dataframe to store MSE values

### *Ridge Regression*

In [11]:
from sklearn.linear_model import Ridge, RidgeCV #RidgeCV is Ridge Cross Validation

alphas = np.linspace(0,1,50)

ridgecv = RidgeCV(alphas = alphas+0.01, scoring = 'neg_mean_squared_error', normalize = True) #RidgeCV cannot take alpha = 0
ridgecv.set_params(fit_intercept = True)
ridgecv.fit(X_train, y_train) #With the default cv=None, RidgeCV uses efficient leave-one-out cross-validation
print('Ridge Alpha:',ridgecv.alpha_) #calculates everything and returns optimal alpha after fit by above

ridge = Ridge(alpha = ridgecv.alpha_, normalize = True)
ridge.set_params(fit_intercept = True)
ridge.fit(X_train, y_train)
NoMSE.loc['Ridge Regression','MSE'] = mean_squared_error(y_test, ridge.predict(X_test))
print('Mean Squared Error:',mean_squared_error(y_test, ridge.predict(X_test)))

ridgeregs = pd.DataFrame(ridge.coef_.ravel(), index = X.columns)
print(ridgeregs)
NoMSE.loc['Ridge Regression','Number of Regs'] = len(ridgeregs) 

Ridge Alpha: 1.01
Mean Squared Error: 0.09658594399678525
                                        0
hrat                            -0.000276
obrat                           -0.002515
mortlat2                        -0.024839
pubrec                          -0.117294
self                            -0.027242
...                                   ...
chivr_dmean:black_female_dmean  -0.009922
chivr_dmean:black_male_dmean     0.028586
chivr_dmean:hispan_female_dmean  0.004159
chivr_dmean:hispan_male_dmean    0.016632
chivr_dmean:white_female_dmean  -0.011541

[149 rows x 1 columns]


### *Lasso Regression*

In [12]:
from sklearn.linear_model import Lasso, LassoCV

alphas = np.linspace(0.00015,0.0002310344827586207,50, endpoint = True) #0.0002310 used because it minimized MSE

lasso = Lasso(max_iter = 10000, normalize = True) #max_iter caps the coordinate-descent iterations used to minimize the non-differentiable lasso objective

lassocv = LassoCV(alphas = list(alphas), cv = 10, max_iter = 100000, normalize = True)
lassocv.set_params(fit_intercept=True)
lassocv.fit(X_train, y_train.values.ravel())
print('Lasso Alpha:',lassocv.alpha_)

lasso.set_params(alpha=lassocv.alpha_,fit_intercept=True)
lasso.fit(X_train, y_train.values.ravel())  # need .ravel() to avoid warning
NoMSE.loc['Lasso Regression','MSE'] = mean_squared_error(y_test, lasso.predict(X_test))
print('Mean Squared Error:',mean_squared_error(y_test, lasso.predict(X_test))) 

#Selecting non-zero coefficients
lasregs = pd.DataFrame(lasso.coef_, index = X.columns)[list(pd.DataFrame(lasso.coef_, index = X.columns).T.any())]
print(lasregs)
NoMSE.loc['Lasso Regression','Number of Regs'] = len(lasregs) 

Lasso Alpha: 0.0002310344827586207
Mean Squared Error: 0.09487900787003778
                                          0
obrat                             -0.006056
pubrec                            -0.215768
self                              -0.028861
chist                              0.118984
unem                              -0.000542
multi                             -0.039883
married                            0.007395
loanprc                           -0.079429
vr                                -0.022255
obwhte                             0.002160
black_male                        -0.017743
hrat_dmean:black_female_dmean     -0.005434
hrat_dmean:hispan_male_dmean      -0.002261
obrat_dmean:black_female_dmean    -0.002880
obrat_dmean:hispan_female_dmean   -0.013709
mortlat2_dmean:black_female_dmean  2.018365
mortlat2_dmean:black_male_dmean   -0.085127
mortlat2_dmean:hispan_male_dmean  -0.150520
pubrec_dmean:white_female_dmean   -0.003135
self_dmean:black_male_dmean        0.131985
c

### *Elastic Net Regression*

In [13]:
from sklearn.linear_model import ElasticNet, ElasticNetCV
enetcv = ElasticNetCV(cv=5, random_state=42,fit_intercept=True,normalize = True)
enetcv.fit(X_train, y_train.values.ravel())
print('Elastic Alpha:',enetcv.alpha_)
print('Elastic l1 Ratio:',enetcv.l1_ratio)
NoMSE.loc['Elastic Net','MSE'] = mean_squared_error(y_test, enetcv.predict(X_test))
print('Mean Squared Error:',mean_squared_error(y_test, enetcv.predict(X_test)))

enet = ElasticNet(alpha=enetcv.alpha_, l1_ratio=enetcv.l1_ratio, fit_intercept = True, normalize = True)
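enet.fit(X,y) #note: this refit uses the full sample (X, y), whereas the standardized section refits on the training split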
enetregs = pd.DataFrame(enet.coef_, index = X.columns)[list(pd.DataFrame(enet.coef_, index = X.columns).T.any())]
print(enetregs)
NoMSE.loc['Elastic Net','Number of Regs'] = len(enetregs) 

Elastic Alpha: 0.0005211767826562577
Elastic l1 Ratio: 0.5
Mean Squared Error: 0.09530639167634154
                                          0
obrat                             -0.002628
pubrec                            -0.153191
self                              -0.007615
chist                              0.052343
unem                              -0.001217
multi                             -0.035860
married                            0.008370
loanprc                           -0.082684
vr                                -0.007374
obwhte                             0.000185
schwh                              0.004587
schvr                             -0.003460
chiwh                              0.043300
chima                              0.003527
black_female                      -0.016851
black_male                        -0.022853
hispan_male                       -0.010191
hrat_dmean:hispan_male_dmean      -0.001714
obrat_dmean:black_female_dmean    -0.004040
obrat_dmean:hispan_fe

## *Conclusions about Normalization*

To compare the shrinkage methods above, we can again look at the validation MSE and the number of regressors each one retains.

In [14]:
NoMSE

| | MSE | Number of Regs |
|---|---|---|
| Ridge Regression | 0.0965859 | 149 |
| Lasso Regression | 0.094879 | 39 |
| Elastic Net | 0.0953064 | 50 |


As explained above, the Ridge Regression is not seriously considered as a method for model selection. It also has the highest MSE of the three methods.

Comparing the results above, we can see that, after normalization, the Lasso Regression again produces the lowest MSE (0.0949, versus 0.0953 for the Elastic Net and 0.0966 for Ridge), indicating the best out-of-sample predictive performance of the three. The Lasso Regression is also preferred to the Elastic Net because it achieves this lower MSE with fewer regressors (39 versus 50), giving a model that is both more parsimonious and more accurate. The regressors picked by the Lasso Regression are listed below:

In [15]:
lasregs.index

Index(['obrat', 'pubrec', 'self', 'chist', 'unem', 'multi', 'married',
       'loanprc', 'vr', 'obwhte', 'black_male',
       'hrat_dmean:black_female_dmean', 'hrat_dmean:hispan_male_dmean',
       'obrat_dmean:black_female_dmean', 'obrat_dmean:hispan_female_dmean',
       'mortlat2_dmean:black_female_dmean', 'mortlat2_dmean:black_male_dmean',
       'mortlat2_dmean:hispan_male_dmean', 'pubrec_dmean:white_female_dmean',
       'self_dmean:black_male_dmean', 'chist_dmean:black_male_dmean',
       'chist_dmean:hispan_male_dmean', 'unem_dmean:black_female_dmean',
       'unem_dmean:hispan_male_dmean', 'multi_dmean:black_female_dmean',
       'multi_dmean:black_male_dmean', 'married_dmean:black_male_dmean',
       'married_dmean:hispan_male_dmean', 'married_dmean:white_female_dmean',
       'loanprc_dmean:hispan_female_dmean', 'loanprc_dmean:white_female_dmean',
       'sch_dmean:black_female_dmean', 'sch_dmean:black_male_dmean',
       'vr_dmean:black_male_dmean', 'schth_dmean:black_male_

Note that the regressors selected via normalization differ from those selected via standardization above. Standardizing the data (as was done in the first section) lets each coefficient be read as the change in the outcome variable, in standard deviations, associated with a one-standard-deviation change in the regressor. Normalization does not carry this standard-deviation interpretation. This difference in scaling is what led the two approaches to select different models.
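
The relationship between the two scalings can be checked numerically for plain OLS. The sketch below, on simulated data with hypothetical names, fits OLS on raw and on z-scored variables and shows that each standardized slope equals the raw slope times sd(x)/sd(y), and that the MSE in standardized units equals the raw MSE divided by the variance of the outcome. The second identity suggests that the MSEs reported in the two sections are on different scales and should not be compared directly.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
X = rng.normal(size=(n, 2)) * np.array([10.0, 0.5])   # two regressors on different scales
y = 0.9 - 0.01 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(scale=0.3, size=n)

def ols_fit(X, y):
    """OLS with an intercept; returns the slopes (without intercept) and fitted values."""
    Z = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta[1:], Z @ beta

# z-score with population standard deviations, as StandardScaler does.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
ys = (y - y.mean()) / y.std()

slopes_raw, fit_raw = ols_fit(X, y)
slopes_std, fit_std = ols_fit(Xs, ys)

# Standardized slope = raw slope * sd(x) / sd(y).
converted = slopes_raw * X.std(axis=0) / y.std()

mse_raw = np.mean((y - fit_raw) ** 2)
mse_std = np.mean((ys - fit_std) ** 2)
print(np.allclose(slopes_std, converted), np.isclose(mse_std, mse_raw / y.var()))
```

These identities hold exactly for OLS; with a Ridge, Lasso, or Elastic Net penalty the selected regressors can also change with the scaling, so the conversion is only a guide there.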

In [16]:
#OLS of the model selected by lasso regression via normalization
import statsmodels.api as sm
print(sm.OLS(y,sm.add_constant(X[lasregs.index])).fit().summary())

                            OLS Regression Results                            
Dep. Variable:                approve   R-squared:                       0.204
Model:                            OLS   Adj. R-squared:                  0.188
Method:                 Least Squares   F-statistic:                     12.64
Date:                Sun, 29 Mar 2020   Prob (F-statistic):           6.19e-70
Time:                        17:58:59   Log-Likelihood:                -377.93
No. Observations:                1961   AIC:                             835.9
Df Residuals:                    1921   BIC:                             1059.
Df Model:                          39                                         
Covariance Type:            nonrobust                                         
                                        coef    std err          t      P>|t|      [0.025      0.975]
-----------------------------------------------------------------------------------------------------
const 