# Multinomial Logit: School and Work Decisions for Young Men

## Dataset and goal of the example:
* The data (a subset of those used in Keane and Wolpin, 1997, "The Career Decisions of Young Men", Journal of Political Economy, Vol. 105, No. 3, pp. 473-522) contain employment and schooling histories for a sample of men for the years 1981 to 1987.

* We use the data for 1987. The three possible outcomes are: enrolled in school (estado = 0), not in school and not working (estado = 1), and working (estado = 2).

* The explanatory variables are education, a quadratic in work experience, and a binary indicator for whether the individual is Black.
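
The model implied by this setup can be written out explicitly. The following is a sketch, assuming that `estado = 0` (enrolled in school) is the base outcome, which is how statsmodels' `MNLogit` treats the lowest category. The symbols $x_i$ and $\beta_j$ are notation introduced here, not taken from the original notebook.

```latex
x_i = (1,\ \text{educ}_i,\ \text{exper}_i,\ \text{exper}_i^2,\ \text{black}_i)

P(\text{estado}_i = j \mid x_i)
  = \frac{\exp(x_i \beta_j)}{1 + \sum_{h=1}^{2} \exp(x_i \beta_h)},
  \qquad j = 1, 2

P(\text{estado}_i = 0 \mid x_i)
  = \frac{1}{1 + \sum_{h=1}^{2} \exp(x_i \beta_h)}
```

Under this normalization each estimated coefficient vector $\beta_j$ describes outcome $j$ relative to being enrolled in school.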

## 1. Importing Libraries:

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.api as sm

# Suppress warnings
import warnings
warnings.filterwarnings('ignore')

### Variable descriptions:
* id: identifier
* numyrs: number of years in sample
* year: 81 to 87
* choice: sch=1,home=2,wc=3,bc=4,serv=5
* wage: annual wage, 1987 \$
* educ: years of schooling
* expwc: experience in white collar
* expbc: experience in blue collar
* expser: experience in services
* manuf: =1 if in manufacturing
* black: =1 if black
* lwage: log(wage)
* y81: =1 if year == 81
* ... y87
* enroll: =1 if choice == 1
* employ: =1 if choice == 3, 4, or 5
* attrit: =1 if attrit in next year
* exper: expwc + expbc + expser
* expersq: exper^2
* status: sch=1,home=2,work=3
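
Several of the variables listed above are derived from others. As a minimal sketch, the definitions can be reproduced on a small hypothetical frame (the rows below are made up for illustration and are not taken from `keane.dta`):

```python
import pandas as pd

# Hypothetical toy rows that mimic the documented columns (not the real data).
toy = pd.DataFrame({
    "choice": [1, 2, 3, 4, 5],
    "expwc": [0, 0, 2, 0, 1],
    "expbc": [0, 4, 1, 3, 0],
    "expser": [0, 0, 4, 2, 1],
})

# Rebuild the derived variables from their documented definitions.
toy["exper"] = toy["expwc"] + toy["expbc"] + toy["expser"]
toy["expersq"] = toy["exper"] ** 2
toy["enroll"] = (toy["choice"] == 1).astype(int)
toy["employ"] = toy["choice"].isin([3, 4, 5]).astype(int)

# status collapses the five choices into school (1), home (2), work (3),
# so every choice of 3 or above maps to 3.
toy["status"] = toy["choice"].clip(upper=3)

print(toy)
```

Running the same comparisons against the real columns would be one way to confirm that the file matches its documentation.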

In [2]:
# Load the data:
keane_df = pd.read_stata('keane.dta')
keane_df.head()

Unnamed: 0,id,numyrs,year,choice,wage,educ,expwc,expbc,expser,manuf,...,y84,y85,y86,y87,enroll,employ,attrit,exper,expersq,status
0,1,9,81,2.0,,10,0,0,0,0.0,...,0,0,0,0,0,0,0,0,0,2.0
1,1,9,82,2.0,,10,0,0,0,0.0,...,0,0,0,0,0,0,0,0,0,2.0
2,1,9,83,2.0,,10,0,0,0,0.0,...,0,0,0,0,0,0,0,0,0,2.0
3,1,9,84,1.0,,10,0,0,0,0.0,...,1,0,0,0,1,0,0,0,0,1.0
4,1,9,85,2.0,,11,0,0,0,0.0,...,0,1,0,0,0,0,0,0,0,2.0


In [3]:
# Select the 1987 observations (copy so that new columns can be added safely):
keane_df2 = keane_df.loc[keane_df.y87 == 1].copy()

In [4]:
# Recode status (1, 2, 3) into estado (0, 1, 2):
keane_df2["estado"] = 0
keane_df2.loc[keane_df2.status == 2.0, "estado"] = 1
keane_df2.loc[keane_df2.status == 3.0, "estado"] = 2
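
The same recoding can also be expressed as a single mapping. This is an alternative sketch on hypothetical values, not the notebook's own approach; the name `status_to_estado` is introduced here for illustration.

```python
import pandas as pd

# Hypothetical status values standing in for keane_df2["status"].
toy = pd.DataFrame({"status": [2.0, 3.0, 1.0, 3.0]})

# One-step recode: school -> 0, home -> 1, work -> 2.
status_to_estado = {1.0: 0, 2.0: 1, 3.0: 2}
toy["estado"] = toy["status"].map(status_to_estado)

print(toy)
```

One difference in behavior: with `.map`, any status value missing from the dictionary becomes `NaN`, whereas the cell above leaves such rows at the default of 0.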

In [5]:
# Inspect the 1987 subsample:
keane_df2

Unnamed: 0,id,numyrs,year,choice,wage,educ,expwc,expbc,expser,manuf,...,y85,y86,y87,enroll,employ,attrit,exper,expersq,status,estado
6,1,9,87,2.0,,11,0,0,0,0.0,...,0,0,1,0,0,0,0,0,2.0,1
13,2,9,87,4.0,15841.410156,12,0,3,2,0.0,...,0,0,1,0,1,0,5,25,3.0,2
21,4,11,87,5.0,6093.600098,9,0,0,0,0.0,...,0,0,1,0,1,0,0,0,3.0,2
28,5,9,87,3.0,11017.230469,9,2,1,4,0.0,...,0,0,1,0,1,0,7,49,3.0,2
35,6,9,87,2.0,,8,0,4,0,0.0,...,0,0,1,0,0,0,4,16,2.0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
12688,2226,8,87,1.0,,13,1,1,0,0.0,...,0,0,1,1,0,0,2,4,1.0,0
12701,2228,8,87,3.0,25762.279297,16,0,0,0,1.0,...,0,0,1,0,1,0,0,0,3.0,2
12708,2229,10,87,4.0,23576.240234,11,0,4,2,1.0,...,0,0,1,0,1,0,6,36,3.0,2
12715,2230,11,87,3.0,23886.910156,12,3,4,0,0.0,...,0,0,1,0,1,0,7,49,3.0,2


In [6]:
# Keep only the model variables:
keane_df2 = keane_df2[['estado', 'educ', 'exper', 'expersq', 'black']]
keane_df2.head()

Unnamed: 0,estado,educ,exper,expersq,black
6,1,11,0,0,1
13,2,12,5,25,1
21,2,9,0,0,1
28,2,9,7,49,1
35,1,8,4,16,1


In [7]:
pd.crosstab(keane_df2["estado"], keane_df2["educ"], margins = True)

estado / educ,7,8,9,10,11,12,13,14,15,16,17,18,19,All
0,0,1,3,2,5,13,6,9,23,21,22,10,5,120
1,12,23,34,61,53,84,17,12,10,20,4,2,0,332
2,5,35,72,78,136,473,96,78,52,210,36,13,2,1286
All,17,59,109,141,194,570,119,99,85,251,62,25,7,1738
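
The row totals in the crosstab give the unconditional distribution of the outcome. A short sketch, using only the counts from the `All` column above:

```python
# Row totals copied from the "All" column of the crosstab.
counts = {0: 120, 1: 332, 2: 1286}

n_obs = sum(counts.values())
shares = {estado: n / n_obs for estado, n in counts.items()}

for estado, share in shares.items():
    print(f"estado = {estado}: {share:.3f}")
```

About 74% of the 1987 sample is working, so the outcome is quite unbalanced, which is worth keeping in mind when reading the fit statistics later on.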


## 2. Estimation

In [8]:
# Define the variables:
y = keane_df2["estado"]

X = keane_df2[['educ', 'exper', 'expersq', 'black']]

X = sm.add_constant(X)

In [9]:
# Estimation:
mdl = sm.MNLogit(y, X)
mdl_fit = mdl.fit()

Optimization terminated successfully.
         Current function value: 0.560293
         Iterations 7
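
The "Current function value" printed by the optimizer can be related to the model's log-likelihood. The sketch below assumes, as the numbers suggest, that the reported value is the negative log-likelihood divided by the number of observations:

```python
n_obs = 1738                # number of 1987 observations (crosstab total)
avg_neg_loglik = 0.560293   # "Current function value" printed by the optimizer

# Undo the averaging and the sign flip to recover the log-likelihood.
loglik = -n_obs * avg_neg_loglik
print(round(loglik, 2))
```

The result agrees with the `Log-Likelihood` of -973.79 shown in the regression summary.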


In [10]:
# Print the results:
print(mdl_fit.summary())

                          MNLogit Regression Results                          
Dep. Variable:                 estado   No. Observations:                 1738
Model:                        MNLogit   Df Residuals:                     1728
Method:                           MLE   Df Model:                            8
Date:                Wed, 19 May 2021   Pseudo R-squ.:                  0.2257
Time:                        08:41:56   Log-Likelihood:                -973.79
converged:                       True   LL-Null:                       -1257.7
Covariance Type:            nonrobust   LLR p-value:                1.968e-117
  estado=1       coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          9.4428      0.950      9.937      0.000       7.580      11.305
educ          -0.6168      0.059    -10.400      0.000      -0.733      -0.501
exper         -0.1479      0.156     -0.947      0.3
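
A few quantities in the summary can be checked or interpreted by hand. The sketch below uses only numbers printed above; it assumes that the null model contains intercepts only (so its fitted probabilities equal the sample shares) and that `Pseudo R-squ.` is McFadden's measure.

```python
import math

# Values copied from the summary table and the crosstab.
llf, llnull = -973.79, -1257.7
n_obs = 1738
counts = [120, 332, 1286]   # row totals for estado = 0, 1, 2

# Intercept-only log-likelihood: each observation gets its outcome's sample share.
llnull_check = sum(n * math.log(n / n_obs) for n in counts)

# McFadden pseudo R-squared: 1 - LL(model) / LL(null).
pseudo_r2 = 1 - llf / llnull

# Relative risk ratio for educ in the estado = 1 equation (base: estado = 0).
rrr_educ = math.exp(-0.6168)

print(round(llnull_check, 1), round(pseudo_r2, 4), round(rrr_educ, 3))
```

Read this way, one more year of education multiplies the odds of being at home rather than in school by roughly 0.54, holding the other regressors fixed.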

In [11]:
# Marginal effects:
mdl_margeff = mdl_fit.get_margeff()
print(mdl_margeff.summary())

       MNLogit Marginal Effects      
Dep. Variable:                 estado
Method:                          dydx
At:                           overall
  estado=0      dy/dx    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
educ           0.0185      0.003      6.407      0.000       0.013       0.024
exper         -0.0351      0.007     -4.796      0.000      -0.050      -0.021
expersq        0.0033      0.001      3.006      0.003       0.001       0.006
black          0.0155      0.013      1.225      0.221      -0.009       0.040
------------------------------------------------------------------------------
  estado=1      dy/dx    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
educ          -0.0433      0.003    -14.521      0.000      -0.049      -0.037
exper         -0.0993      0.010    -10.317      0.000    
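
Because the three outcome probabilities sum to one for every individual, the marginal effects of any regressor must sum to zero across outcomes. The sketch below illustrates this with a finite-difference calculation on simulated data. The coefficient matrix `B`, the simulated `educ` values, and the helper `mnl_probs` are hypothetical and are not the fitted model. The last lines apply the same property to the two `educ` effects reported above.

```python
import numpy as np

def mnl_probs(X, B):
    """Multinomial logit probabilities with outcome 0 as the base.

    X: (n, k) regressors including a constant.
    B: (k, J-1) coefficients for outcomes 1..J-1.
    """
    eta = np.column_stack([np.zeros(X.shape[0]), X @ B])
    eta -= eta.max(axis=1, keepdims=True)   # guard against overflow
    expo = np.exp(eta)
    return expo / expo.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n = 500
educ = rng.integers(7, 20, size=n).astype(float)   # simulated years of schooling
X = np.column_stack([np.ones(n), educ])

# Hypothetical coefficients (rows: const, educ; columns: outcomes 1 and 2).
B = np.array([[5.0, 2.0],
              [-0.4, -0.1]])

# Average marginal effect of educ by central finite differences.
h = 1e-5
X_up, X_dn = X.copy(), X.copy()
X_up[:, 1] += h
X_dn[:, 1] -= h
ame = ((mnl_probs(X_up, B) - mnl_probs(X_dn, B)) / (2 * h)).mean(axis=0)
print(ame, ame.sum())   # the three effects cancel out

# Same property applied to the reported educ effects for estado = 0 and 1.
implied_work = -(0.0185 + (-0.0433))
print(round(implied_work, 4))
```

Up to rounding, the reported effects therefore imply an `educ` effect of about 0.0248 on the probability of working. Note also that `get_margeff()` treats `black` as continuous by default; statsmodels accepts `dummy=True` to compute discrete changes for binary regressors instead.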