Diego - Last time I ate tacos like a month ago and I play videogames to pass time

## Workshop - Regression-Based Classification

Does the `statsmodels` marginal effect (`.get_margeff()`) evaluate the effect at the average of the covariates, or does it average the effects computed from each observation's predicted value?
- Use the class data.
- Show your work.

Load the necessary packages and data:

In [1]:
import numpy as np
import pandas as pd
from tqdm import tqdm # progress bar

import statsmodels.api as sm
from sklearn import linear_model as lm

from sklearn.model_selection import GridSearchCV, train_test_split, KFold
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay # plot_confusion_matrix() was removed in scikit-learn 1.2

import matplotlib.pyplot as plt
import seaborn as sns
sns.set(rc = {'axes.titlesize': 24,
             'axes.labelsize': 20,
             'xtick.labelsize': 12,
             'ytick.labelsize': 12,
             'figure.figsize': (8, 4.5)})
sns.set_style("white") # for plot_confusion_matrix()

In [3]:
df = pd.read_pickle("C:/Users/dp846/OneDrive/Desktop/ECON490ML/class data/class_data.pkl")
df_prepped = df.drop(columns = ['urate_bin', 'year']).join([
    pd.get_dummies(df['urate_bin'], drop_first = True),
    pd.get_dummies(df.year, drop_first = True)    
])
y = df_prepped['pos_net_jobs'].astype(float)
x = df_prepped.drop(columns = 'pos_net_jobs')

x_train, x_test, y_train, y_test = train_test_split(x, y, train_size = 2/3, random_state = 490)

# Standardize both splits with the training-set mean and (population) standard deviation
train_mean, train_std = x_train.mean(), x_train.std(ddof = 0)
x_train_std = (x_train - train_mean) / train_std
x_test_std  = (x_test - train_mean) / train_std

x_train_std = sm.add_constant(x_train_std)
x_test_std  = sm.add_constant(x_test_std)
x_train     = sm.add_constant(x_train)
x_test      = sm.add_constant(x_test)


Fit a logistic regression using either `sm.Logit()` or `smf.logit()`.

In [4]:
fit_logit = sm.Logit(y_train, x_train).fit()

Optimization terminated successfully.
         Current function value: 0.599666
         Iterations 6


Get the marginal effects (`.get_margeff()`). Print the summary (`.summary()`).

In [5]:
fit_logit.summary2()

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| Model: | Logit | Pseudo R-squared: | 0.126 |
| Dependent Variable: | pos_net_jobs | AIC: | 40700.1748 |
| Date: | 2021-03-02 14:48 | BIC: | 40936.2385 |
| No. Observations: | 33889 | Log-Likelihood: | -20322.0 |
| Df Model: | 27 | LL-Null: | -23242.0 |
| Df Residuals: | 33861 | LLR p-value: | 0.0 |
| Converged: | 1.0000 | Scale: | 1.0 |
| No. Iterations: | 6.0000 | | |

| | Coef. | Std.Err. | z | P>\|z\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | -1.9804 | 0.1454 | -13.6244 | 0.0000 | -2.2653 | -1.6955 |
| pct_d_rgdp | 0.0159 | 0.0014 | 11.0808 | 0.0000 | 0.0131 | 0.0187 |
| emp_estabs | 0.0383 | 0.0029 | 13.2672 | 0.0000 | 0.0327 | 0.0440 |
| estabs_entry_rate | 0.1938 | 0.0054 | 35.7595 | 0.0000 | 0.1832 | 0.2045 |
| estabs_exit_rate | -0.1625 | 0.0060 | -27.1367 | 0.0000 | -0.1743 | -0.1508 |
| pop | 0.0000 | 0.0000 | 2.7815 | 0.0054 | 0.0000 | 0.0000 |
| pop_pct_black | -0.0037 | 0.0009 | -3.9538 | 0.0001 | -0.0055 | -0.0019 |
| pop_pct_hisp | 0.0065 | 0.0010 | 6.6630 | 0.0000 | 0.0046 | 0.0084 |
| lfpr | 0.0014 | 0.0014 | 1.0370 | 0.2997 | -0.0012 | 0.0041 |


***
# Covariate Averages
$$
\frac{\partial p(x)}{\partial x_1}\bigg|_{x = \bar{x}} \approx \frac{e^{\hat{\beta}_0 + \bar{x}_1\hat{\beta}_1 + \bar{x}_2\hat{\beta}_2}}{\left(1 + e^{\hat{\beta}_0 + \bar{x}_1\hat{\beta}_1 + \bar{x}_2\hat{\beta}_2}\right)^2}\hat{\beta}_1
$$
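
For reference, the squared denominator comes from the chain rule applied to the logistic function. A short derivation, assuming a model with two covariates and writing $z$ for the linear predictor (a symbol introduced here for convenience):

$$
z = \hat{\beta}_0 + x_1\hat{\beta}_1 + x_2\hat{\beta}_2, \qquad p(x) = \frac{e^{z}}{1 + e^{z}}
$$

$$
\frac{\partial p(x)}{\partial x_1} = \frac{dp}{dz}\,\frac{\partial z}{\partial x_1} = \frac{e^{z}\left(1 + e^{z}\right) - e^{z}e^{z}}{\left(1 + e^{z}\right)^2}\,\hat{\beta}_1 = \frac{e^{z}}{\left(1 + e^{z}\right)^2}\,\hat{\beta}_1 = p(x)\left(1 - p(x)\right)\hat{\beta}_1
$$

The two approaches differ only in where this derivative is evaluated: once at $\bar{x}$, or at every observation before averaging.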

***
# Predicted Value Averages
$$
\frac{\partial p(x)}{\partial x_1} \approx \frac{1}{n} \sum_{i=1}^n \frac{e^{\hat{y}_i}}{\left(1 + e^{\hat{y}_i}\right)^2}\hat{\beta}_1, \qquad \hat{y}_i = \hat{\beta}_0 + x_{i1}\hat{\beta}_1 + x_{i2}\hat{\beta}_2
$$
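
The two definitions generally give different numbers, because the logistic density is nonlinear in the linear predictor. The following is a minimal NumPy sketch of both calculations; the data and the coefficient vector `beta` are made up for illustration and are not estimates from the class data.

```python
import numpy as np

rng = np.random.default_rng(490)

# Hypothetical data and coefficients, for illustration only
n = 1_000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # constant, x1, x2
beta = np.array([-0.5, 1.0, -2.0])                          # beta_0, beta_1, beta_2

def logistic_density(z):
    """Derivative of the logistic function: e^z / (1 + e^z)^2 = p * (1 - p)."""
    p = 1 / (1 + np.exp(-z))
    return p * (1 - p)

# Marginal effect at the covariate averages: evaluate the density once, at x-bar
me_at_mean = logistic_density(X.mean(axis=0) @ beta) * beta[1:]

# Average marginal effect: evaluate the density per observation, then average
me_average = logistic_density(X @ beta).mean() * beta[1:]

print("at covariate means:", me_at_mean)
print("averaged over obs.:", me_average)
```

In this setup the averaged effect is smaller in magnitude, since many observations sit in the flat tails of the logistic curve where the density is close to zero.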

In [None]:
fit_logit.predict(x_test).describe()

In [6]:
fit_logit.get_margeff().summary()

| Field | Value |
|---|---|
| Dep. Variable: | pos_net_jobs |
| Method: | dydx |
| At: | overall |

| | dy/dx | std err | z | P>\|z\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| pct_d_rgdp | 0.0033 | 0.0 | 11.15 | 0.0 | 0.003 | 0.004 |
| emp_estabs | 0.0079 | 0.001 | 13.394 | 0.0 | 0.007 | 0.009 |
| estabs_entry_rate | 0.0401 | 0.001 | 38.375 | 0.0 | 0.038 | 0.042 |
| estabs_exit_rate | -0.0336 | 0.001 | -28.236 | 0.0 | -0.036 | -0.031 |
| pop | 3.029e-08 | 1.09e-08 | 2.782 | 0.005 | 8.95e-09 | 5.16e-08 |
| pop_pct_black | -0.0008 | 0.0 | -3.957 | 0.0 | -0.001 | -0.0 |
| pop_pct_hisp | 0.0013 | 0.0 | 6.678 | 0.0 | 0.001 | 0.002 |
| lfpr | 0.0003 | 0.0 | 1.037 | 0.3 | -0.0 | 0.001 |
| density | 1.59e-06 | 1.77e-06 | 0.898 | 0.369 | -1.88e-06 | 5.06e-06 |
| lower | 0.0709 | 0.007 | 10.278 | 0.0 | 0.057 | 0.084 |


*** 
# Interpretation

Interpret the marginal effect of one feature.
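
As one possible reading, a dy/dx value can be treated as a first-order approximation: the change in the predicted probability is roughly dy/dx times the change in the feature. The sketch below applies this to the `estabs_entry_rate` value from the marginal effects table. It assumes the entry rate is measured in percentage points, which the data description here does not confirm.

```python
# Average marginal effect reported in the table above for estabs_entry_rate
dydx_entry_rate = 0.0401

# Assumption: the entry rate is in percentage points, so a one-unit change
# corresponds to one percentage point
delta_x = 1.0

# First-order approximation: change in probability ~ dy/dx * change in x
delta_prob = dydx_entry_rate * delta_x

print(f"Approximate change in P(pos_net_jobs = 1): {100 * delta_prob:.2f} percentage points")
```

Under that assumption, a one-point higher establishment entry rate is associated with a predicted probability of positive net job creation that is about 4 percentage points higher, on average across observations and holding the other features fixed.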