# Estimating the Likelihood of Debt Issue

As a complement to the panel models in *Debt_on_TEL*, this notebook recodes debt issuance as a binary variable. If a debt issue occurs (`TOT_DEBT > 0`), the observation is scored as one, and zero otherwise.

In [6]:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
import statsmodels.api as sm
import statsmodels.formula.api as smf

## Data Input

We use the same data set as *Debt_on_TEL*.

In [3]:
#Read in data
data=pd.read_csv('../data/debt_mod.csv')

#Generate new binary indicator of debt issue
data['BIN_DEBT_ISSUE']=np.where(data['TOT_DEBT']>0,1,0)

data['BIN_DEBT_ISSUE'].describe()

count    21816.000000
mean         0.888797
std          0.314390
min          0.000000
25%          1.000000
50%          1.000000
75%          1.000000
max          1.000000
Name: BIN_DEBT_ISSUE, dtype: float64
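
Because the indicator is 0/1, its mean is the share of observations coded one: the 0.888797 reported above means roughly 88.9% of the 21,816 observations record a debt issue, so the outcome is fairly imbalanced. The sketch below illustrates the same coding rule on a handful of made-up values; the `toy` frame and its numbers are hypothetical and are not drawn from `debt_mod.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy values standing in for the TOT_DEBT column
toy = pd.DataFrame({'TOT_DEBT': [0.0, 125.0, 0.0, 40.5, 3.2]})

# Same rule as in the notebook: one if any debt was issued, zero otherwise
toy['BIN_DEBT_ISSUE'] = np.where(toy['TOT_DEBT'] > 0, 1, 0)

# The mean of a 0/1 column is the share of ones
share_issuing = toy['BIN_DEBT_ISSUE'].mean()

# Counts per class make any imbalance explicit
counts = toy['BIN_DEBT_ISSUE'].value_counts()

print(share_issuing)
print(counts)
```

Running `value_counts()` on the real column would be a quick way to see how many zero observations the models have to work with.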

## Logistic Model Design

We estimate three specifications, one for each of three sets of TEL measures. The control variables and the `C(FIPSST)` dummies are identical across the three.
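
For reference, a binomial model with a logit link (the link reported in the summaries further down) can be written as follows. The notation is introduced here for illustration and is an assumption, not taken from *Debt_on_TEL*: $y_i$ stands for `BIN_DEBT_ISSUE`, $x_i$ for the TEL measures and controls of observation $i$, and $\alpha_{s(i)}$ for the `C(FIPSST)` dummy of the group that observation $i$ belongs to.

$$
\Pr(y_i = 1 \mid x_i) = \frac{1}{1 + \exp\left(-\left(x_i'\beta + \alpha_{s(i)}\right)\right)}
\qquad\Longleftrightarrow\qquad
\log\frac{\Pr(y_i = 1 \mid x_i)}{1 - \Pr(y_i = 1 \mid x_i)} = x_i'\beta + \alpha_{s(i)}
$$

Under this form each coefficient is a change in the log-odds of a debt issue for a one-unit change in the corresponding regressor.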

In [4]:
#Define specifications
spc1='BIN_DEBT_ISSUE~TYPE1+TYPE2+TYPE2_Y+RESPOP+RESPOP2+DENSITY+POPGROWTH+HSLD_PERS+PRE1940+PYOUNG+'+\
     'PVT_SCH+POP65+PC_INC+POVERTY+PC_SSI+DIVERSITY+EMP_RES+MANU_RES+RETL_RES+SERV_RES+BIN_REC+'+\
     'OSRC_GAP+REAL_RATE+R_CTY_INT_DIFF+GEN_REV+TAX_EFFORT+IGR_ST+REAL_RATE_CHANGE+TOT_DEBT_OUTST+C(FIPSST)'
spc2='BIN_DEBT_ISSUE~LIMITS+BOTH+RESPOP+RESPOP2+DENSITY+POPGROWTH+HSLD_PERS+PRE1940+PYOUNG+'+\
     'PVT_SCH+POP65+PC_INC+POVERTY+PC_SSI+DIVERSITY+EMP_RES+MANU_RES+RETL_RES+SERV_RES+BIN_REC+'+\
     'OSRC_GAP+REAL_RATE+R_CTY_INT_DIFF+GEN_REV+TAX_EFFORT+IGR_ST+REAL_RATE_CHANGE+TOT_DEBT_OUTST+C(FIPSST)'
spc3='BIN_DEBT_ISSUE~RATE_L+ASMT_L+GP_LMT+SC_LMT+RESPOP+RESPOP2+DENSITY+POPGROWTH+HSLD_PERS+PRE1940+PYOUNG+'+\
     'PVT_SCH+POP65+PC_INC+POVERTY+PC_SSI+DIVERSITY+EMP_RES+MANU_RES+RETL_RES+SERV_RES+BIN_REC+'+\
     'OSRC_GAP+REAL_RATE+R_CTY_INT_DIFF+GEN_REV+TAX_EFFORT+IGR_ST+REAL_RATE_CHANGE+TOT_DEBT_OUTST+C(FIPSST)'
        
#Capture in dict
spec_dict={1:spc1,
           2:spc2,
           3:spc3}
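
Since the three formulas differ only in their leading TEL terms, one possible alternative is to assemble them from a shared list of controls, which avoids keeping three long strings in sync by hand. This is only a sketch: `controls`, `tel_sets`, `build_formula`, and `spec_dict_alt` are hypothetical names introduced here, and the variable lists are copied from the formulas above.

```python
# Controls shared by all three specifications (copied from the formulas above)
controls = ['RESPOP', 'RESPOP2', 'DENSITY', 'POPGROWTH', 'HSLD_PERS', 'PRE1940', 'PYOUNG',
            'PVT_SCH', 'POP65', 'PC_INC', 'POVERTY', 'PC_SSI', 'DIVERSITY', 'EMP_RES',
            'MANU_RES', 'RETL_RES', 'SERV_RES', 'BIN_REC', 'OSRC_GAP', 'REAL_RATE',
            'R_CTY_INT_DIFF', 'GEN_REV', 'TAX_EFFORT', 'IGR_ST', 'REAL_RATE_CHANGE',
            'TOT_DEBT_OUTST', 'C(FIPSST)']

# TEL measures that distinguish each specification
tel_sets = {1: ['TYPE1', 'TYPE2', 'TYPE2_Y'],
            2: ['LIMITS', 'BOTH'],
            3: ['RATE_L', 'ASMT_L', 'GP_LMT', 'SC_LMT']}

def build_formula(tel_vars, response='BIN_DEBT_ISSUE'):
    """Join the TEL terms and the shared controls into a patsy-style formula string."""
    return response + '~' + '+'.join(tel_vars + controls)

# Same keys as spec_dict, built programmatically
spec_dict_alt = {key: build_formula(tel) for key, tel in tel_sets.items()}

for key, formula in spec_dict_alt.items():
    print(key, formula[:60] + '...')
```

The resulting strings are meant to match `spc1`, `spc2`, and `spc3` character for character, so they could be passed to `smf.glm` in the same way.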

We fit each specification as a binomial GLM (logit link) and store the fitted results in a second dict, keyed the same way as `spec_dict`.

In [12]:
#Fit each specification as a binomial GLM and store the results by spec number
mod_dict={key:smf.glm(formula=spec,data=data,family=sm.families.Binomial()).fit()
          for key,spec in spec_dict.items()}

In [13]:
mod_dict[1].summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | BIN_DEBT_ISSUE | No. Observations: | 21816.0 |
| Model: | GLM | Df Residuals: | 21808.0 |
| Model Family: | Binomial | Df Model: | 7.0 |
| Link Function: | logit | Scale: | 1.0 |
| Method: | IRLS | Log-Likelihood: | -6944.5 |
| Date: | Sun, 15 Nov 2015 | Deviance: | 13889.0 |
| Time: | 18:32:11 | Pearson chi2: | 19300.0 |
| No. Iterations: | 100 | | |

| | coef | std err | z | P>\|z\| | [95.0% Conf. Int.] |
|---|---|---|---|---|---|
| Intercept | 6.3610 | 0.634 | 10.032 | 0.000 | 5.118 7.604 |
| C(FIPSST)[T.4] | -1.2077 | 0.348 | -3.472 | 0.001 | -1.889 -0.526 |
| C(FIPSST)[T.5] | 0.1317 | 0.236 | 0.557 | 0.577 | -0.332 0.595 |
| C(FIPSST)[T.6] | -0.5299 | 0.277 | -1.911 | 0.056 | -1.073 0.013 |
| C(FIPSST)[T.8] | -1.0690 | 0.308 | -3.467 | 0.001 | -1.673 -0.465 |
| C(FIPSST)[T.9] | 0.8698 | 0.341 | 2.547 | 0.011 | 0.201 1.539 |
| C(FIPSST)[T.10] | 0.1036 | 0.418 | 0.248 | 0.804 | -0.717 0.924 |
| C(FIPSST)[T.12] | -0.3190 | 0.221 | -1.441 | 0.150 | -0.753 0.115 |
| C(FIPSST)[T.13] | 0.4761 | 0.221 | 2.153 | 0.031 | 0.043 0.910 |


In [14]:
mod_dict[2].summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | BIN_DEBT_ISSUE | No. Observations: | 21816.0 |
| Model: | GLM | Df Residuals: | 21808.0 |
| Model Family: | Binomial | Df Model: | 7.0 |
| Link Function: | logit | Scale: | 1.0 |
| Method: | IRLS | Log-Likelihood: | -6968.6 |
| Date: | Sun, 15 Nov 2015 | Deviance: | 13937.0 |
| Time: | 18:32:20 | Pearson chi2: | 19400.0 |
| No. Iterations: | 100 | | |

| | coef | std err | z | P>\|z\| | [95.0% Conf. Int.] |
|---|---|---|---|---|---|
| Intercept | 6.7757 | 0.647 | 10.477 | 0.000 | 5.508 8.043 |
| C(FIPSST)[T.4] | 0.1093 | 0.286 | 0.382 | 0.703 | -0.452 0.670 |
| C(FIPSST)[T.5] | 0.3106 | 0.249 | 1.248 | 0.212 | -0.177 0.798 |
| C(FIPSST)[T.6] | 0.6799 | 0.209 | 3.252 | 0.001 | 0.270 1.090 |
| C(FIPSST)[T.8] | 0.1130 | 0.257 | 0.439 | 0.660 | -0.391 0.617 |
| C(FIPSST)[T.9] | 1.1641 | 0.359 | 3.246 | 0.001 | 0.461 1.867 |
| C(FIPSST)[T.10] | 0.3064 | 0.436 | 0.703 | 0.482 | -0.548 1.160 |
| C(FIPSST)[T.12] | 0.1207 | 0.209 | 0.577 | 0.564 | -0.289 0.530 |
| C(FIPSST)[T.13] | 0.6946 | 0.242 | 2.871 | 0.004 | 0.220 1.169 |
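
Coefficients from a logit model are on the log-odds scale, so exponentiating a coefficient gives an odds ratio relative to the omitted reference category. As a worked example, the sketch below converts the `C(FIPSST)[T.9]` row of the spec 2 summary; the numbers are copied from that row, and the variable names are introduced here for illustration.

```python
import math

# Values copied from the spec 2 summary row for C(FIPSST)[T.9]
coef = 1.1641
ci_low, ci_high = 0.461, 1.867

# Exponentiate the log-odds coefficient and its interval endpoints
odds_ratio = math.exp(coef)
or_low, or_high = math.exp(ci_low), math.exp(ci_high)

print(round(odds_ratio, 3), round(or_low, 3), round(or_high, 3))
```

An odds ratio above one means higher odds of a debt issue than in the reference category, holding the other regressors fixed. For a fitted result the same transformation could be applied to all coefficients at once with `np.exp(mod_dict[2].params)`.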


In [15]:
mod_dict[3].summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | BIN_DEBT_ISSUE | No. Observations: | 21816.0 |
| Model: | GLM | Df Residuals: | 21808.0 |
| Model Family: | Binomial | Df Model: | 7.0 |
| Link Function: | logit | Scale: | 1.0 |
| Method: | IRLS | Log-Likelihood: | -6550.8 |
| Date: | Sun, 15 Nov 2015 | Deviance: | 13102.0 |
| Time: | 18:32:28 | Pearson chi2: | 18500.0 |
| No. Iterations: | 100 | | |

| | coef | std err | z | P>\|z\| | [95.0% Conf. Int.] |
|---|---|---|---|---|---|
| Intercept | 6.5794 | 0.680 | 9.672 | 0.000 | 5.246 7.913 |
| C(FIPSST)[T.4] | -0.6879 | 0.354 | -1.942 | 0.052 | -1.382 0.006 |
| C(FIPSST)[T.5] | 0.4253 | 0.331 | 1.286 | 0.199 | -0.223 1.074 |
| C(FIPSST)[T.6] | -0.0636 | 0.298 | -0.214 | 0.831 | -0.647 0.520 |
| C(FIPSST)[T.8] | 0.4276 | 0.388 | 1.103 | 0.270 | -0.332 1.187 |
| C(FIPSST)[T.9] | 1.2915 | 0.399 | 3.236 | 0.001 | 0.509 2.074 |
| C(FIPSST)[T.10] | 0.3640 | 0.464 | 0.785 | 0.432 | -0.544 1.272 |
| C(FIPSST)[T.12] | -0.0097 | 0.381 | -0.025 | 0.980 | -0.755 0.736 |
| C(FIPSST)[T.13] | 0.7814 | 0.301 | 2.596 | 0.009 | 0.191 1.371 |
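
The three summaries can be compared on fit. For an ungrouped 0/1 outcome the saturated model has a log-likelihood of zero, so the deviance should equal minus twice the log-likelihood. The sketch below checks that identity against the printed figures and picks out the specification with the highest log-likelihood; the `reported` dict simply restates the numbers from the summaries above.

```python
# Log-likelihood and deviance as printed in the three summaries above
reported = {1: {'llf': -6944.5, 'deviance': 13889.0},
            2: {'llf': -6968.6, 'deviance': 13937.0},
            3: {'llf': -6550.8, 'deviance': 13102.0}}

# For binary data, deviance = -2 * log-likelihood (up to rounding in the printout)
for spec, vals in reported.items():
    implied = -2.0 * vals['llf']
    print(spec, implied, vals['deviance'])

# Specification with the highest (least negative) log-likelihood
best_spec = max(reported, key=lambda k: reported[k]['llf'])
print('highest log-likelihood: spec', best_spec)
```

Because the specifications use different numbers of TEL regressors, a raw log-likelihood ranking is only suggestive; the `aic` and `bic` attributes of each fitted result penalize the extra parameters and may be a fairer basis for comparison.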


In [16]:
dir(mod_dict[1])

['__class__',
 '__delattr__',
 '__dict__',
 '__doc__',
 '__format__',
 '__getattribute__',
 '__hash__',
 '__init__',
 '__module__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_cache',
 '_data_attr',
 '_data_weights',
 '_endog',
 '_get_robustcov_results',
 'aic',
 'bic',
 'bse',
 'conf_int',
 'cov_kwds',
 'cov_params',
 'cov_type',
 'data_in_cache',
 'deviance',
 'df_model',
 'df_resid',
 'f_test',
 'family',
 'fit_history',
 'fittedvalues',
 'initialize',
 'k_constant',
 'llf',
 'llnull',
 'load',
 'model',
 'mu',
 'nobs',
 'normalized_cov_params',
 'null',
 'null_deviance',
 'params',
 'pearson_chi2',
 'pinv_wexog',
 'predict',
 'pvalues',
 'remove_data',
 'resid_anscombe',
 'resid_deviance',
 'resid_pearson',
 'resid_response',
 'resid_working',
 'save',
 'scale',
 'summary',
 'summary2',
 't_test',
 'tvalues',
 'use_t',
 'wald_test']