## Analyzing the Impact of TELs on Debt Issues

This Notebook uses the data constructed in [sas2csv](https://github.com/choct155/TELs_debt/blob/master/code/sas2csv.ipynb) and [DebtDataSeries](https://github.com/choct155/TELs_debt/blob/master/code/DebtDataSeries.ipynb) to evaluate the impact of tax and expenditure limitations (TELs) on debt issues by county. It does the following:

1. Subset to the variables critical to our analysis (**Data Input**);
2. Build specifications that feature a set of debt-related dependent variables (**Model Design**);
3. Estimate the relationship between TELs and debt by way of pooled and fixed-effects models (**Estimation**).
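
To make the distinction in step 3 concrete, here is a minimal sketch of a pooled slope versus a fixed-effects (within) slope. Everything in it is an assumption for illustration: the data are simulated, the names `county`, `x`, `y`, `ols_slope`, and `demean_by_group` are hypothetical, and plain `numpy` is used instead of `statsmodels` only to keep the block self-contained. It is not the specification estimated later in this Notebook.

```python
import numpy as np

# Simulated panel (an assumption, not the debt data): 50 counties x 10 years
rng = np.random.RandomState(0)
n_counties, n_years = 50, 10
county = np.repeat(np.arange(n_counties), n_years)

# Unobserved county effect that is correlated with the regressor
alpha = rng.normal(size=n_counties)[county]
x = alpha + rng.normal(size=county.size)
y = 2.0 * x + 3.0 * alpha + 0.1 * rng.normal(size=county.size)

def ols_slope(x, y):
    """Slope from an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    return coef[1]

def demean_by_group(v, groups):
    """Subtract each group's mean from its members (the within transformation)."""
    sums = np.bincount(groups, weights=v)
    counts = np.bincount(groups)
    return v - (sums / counts)[groups]

# Pooled model ignores the county effect; the within model removes it
beta_pooled = ols_slope(x, y)
beta_fe = ols_slope(demean_by_group(x, county), demean_by_group(y, county))
```

In this simulation the true slope is 2.0. The within estimate should land near it, while the pooled estimate absorbs part of the county effect and drifts upward.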

In [1]:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
import seaborn as sb
import statsmodels.api as sm
import statsmodels.formula.api as smf

%pylab inline

Populating the interactive namespace from numpy and matplotlib


## Data Input

Our data is housed in ... the **`data/`** directory.  We are looking for `debt_out.csv`, which contains debt issue, institutional, socioeconomic, and spatial information aggregated to the county level.

In [3]:
!ls -l ../data/

total 152104
-rw-r--r-- 1 root root   165888 Nov 10 17:20 13slsstab1a.xls
-rw-r--r-- 1 root root    93112 Nov 10 17:20 2013_GFS_debt.xcf
-rw-r--r-- 1 root root 12226235 Nov 10 17:20 bonds.csv
-rwxr-xr-x 1 root root  1730054 Nov 11 10:44 cty_coverage.csv
-rw-r--r-- 1 root root        0 Nov 10 17:20 current_issue_geocode_list.csv
-rwxr-xr-x 1 root root 25615061 Nov 11 10:44 debt_out.csv
-rw-r--r-- 1 root root 47620501 Nov 10 17:20 debt_ts_pre_fips.csv
-rw-r--r-- 1 root root 49023971 Nov 10 17:20 debt_w_fips.csv
-rw-r--r-- 1 root root   104148 Nov 10 17:20 fips_st_co_02_07.csv
-rw-r--r-- 1 root root     7578 Nov 10 17:20 g_api_college.csv
-rw-r--r-- 1 root root   103193 Nov 10 17:20 g_api_rando.csv
-rw-r--r-- 1 root root  2874978 Nov 10 17:20 geocorr12.csv
-rw-r--r-- 1 root root    51068 Nov 11 10:43 state_coverage.csv
-rwxr-xr-x 1 root root 16116782 Nov 11 10:40 tel_data.csv


Let's go ahead and read in the data.

In [7]:
data_in=pd.read_csv('../data/debt_out.csv')

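#Print the sorted column names, then the frame summary
#(info() prints directly and returns None, so it is called on its own line)
print sorted(data_in.columns),'\n'
data_in.info()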

['ASMT_L', 'ASMT_L2', 'ASMT_L3', 'BOTH', 'CB_E', 'CB_E2', 'CB_E3', 'CB_E4', 'CB_G', 'CB_G2', 'CFDISC_L', 'CGEXP_L', 'CH_HS_UNT', 'CLEVY_L', 'CLEVY_L2', 'CLEVY_L3', 'CLEVY_L4', 'CRATE_L', 'CRATE_L2', 'CREVU_L', 'D_GEN_EXP', 'FFDISC_L', 'FIPS', 'FIPSCO', 'FIPSST', 'FIPST_N', 'GEN_REV', 'GEXP_L', 'GO', 'GO_City, Town Vlg', 'GO_Co-op Utility', 'GO_College or Univ', 'GO_County/Parish', 'GO_Development', 'GO_Direct Issuer', 'GO_District', 'GO_Education', 'GO_Electric Power', 'GO_Environmental Facilities', 'GO_General Purpose', 'GO_Healthcare', 'GO_Housing', 'GO_Indian Tribe', 'GO_Local Authority', 'GO_Public Facilities', 'GO_State Authority', 'GO_State/Province', 'GO_Transportation', 'GO_Utilities', 'GP_GEXP', 'GP_LEVY', 'GP_LMT', 'GP_RATE', 'GP_REVU', 'HOME_STEAD', 'HOME_STEAD2', 'HOME_STEAD3', 'HSG_UNITS', 'HSLD_PERS', 'IGR_ST', 'LANDAREA', 'LEVY_L', 'LIMITS', 'MDHOMEVAL', 'MED_INC', 'MFDISC_L', 'MFG_EMP', 'MGEXP_L', 'MGEXP_L2', 'MLEVY_L', 'MLEVY_L2', 'MLEVY_L3', 'MLEVY_L4', 'MRATE_L', 'MR

The variables in play appear in the tables below:

**DEPENDENT VARIABLES**

Concept|Input Variables
-------|---------------
Per capita GO debt issued|*Variables beginning with GO* & `RES_POP`
Per capita revenue debt issued|*Variables beginning with RV* & `RES_POP`
Ratio of GO to revenue debt issued|*Variables beginning with GO or RV*
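
As a rough illustration of how these three concepts could be built from the inputs, here is a minimal sketch on a toy frame. It assumes that `GO` and `RV` hold total debt issued and that `RES_POP` is resident population; the output names `GO_PC`, `RV_PC`, and `GO_RV_RATIO` are hypothetical, and the numbers are made up.

```python
import pandas as pd

# Toy values for illustration only (not drawn from debt_out.csv)
toy = pd.DataFrame({'GO': [1000.0, 0.0, 500.0],
                    'RV': [500.0, 200.0, 0.0],
                    'RES_POP': [100, 50, 250]})

# Per capita debt issued
toy['GO_PC'] = toy['GO'] / toy['RES_POP']
toy['RV_PC'] = toy['RV'] / toy['RES_POP']

# Ratio of GO to revenue debt; rows with no revenue debt become NaN
# rather than infinity
toy['GO_RV_RATIO'] = toy['GO'] / toy['RV'].where(toy['RV'] != 0)
```

Masking zero denominators is one design choice among several; whether counties with no revenue debt should be dropped, set to missing, or handled another way is a modeling decision this sketch does not settle.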

**INSTITUTIONAL VARIABLES**

Concept|Input Variables
-------|---------------
Any TEL|`LIMITS`
Non-binding TEL|`TYPE1`
Potentially binding TEL|`TYPE2`
Both `TYPE1` & `TYPE2`|`BOTH`
Years since `TYPE2` enacted|`TYPE2_y`
Overall property tax rate limit|`RATE_L`
Overall assessment limit|`ASMT_L`
Limit applied to general purpose gov|`GP_LMT`
Limit applied to school district|`SC_LMT`

*Note that all limits above can be interacted with primary county status (`PRIMARY`; see spatial table below), in which case we append an `i` to the variable name.*
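
The naming convention in the note can be sketched as follows. This assumes the limit variables and `PRIMARY` are 0/1 indicators, which the tables suggest but do not state outright; the toy values are made up.

```python
import pandas as pd

# Toy 0/1 indicators for illustration only
toy = pd.DataFrame({'LIMITS': [1, 1, 0, 0],
                    'TYPE2': [0, 1, 0, 1],
                    'PRIMARY': [1, 0, 1, 1]})

# Interact each limit with primary county status and append 'i' to the name
for var in ['LIMITS', 'TYPE2']:
    toy[var + 'i'] = toy[var] * toy['PRIMARY']
```

Under that assumption, `LIMITSi` equals 1 only for primary central counties that are subject to a TEL.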

**SCALE & SUPPLY MEASURES**

Concept|Input Variables
-------|---------------
Population|`RES_POP`
<span style="color:red">Population$^2$</span>|`RES_POP2`
Population density|`DENSITY`
Population growth rate|`POPGROW`
Household size|`PERS_HLD`
Pre-1940 housing stock|`PRE1940`

**DEMAND MEASURES**

Concept|Input Variables
-------|---------------
Population under 17|`PYOUNG`
Private school enrollment|`PVT_SCH`
Population over 65|`POP65`
Per capita income|`PCINC`
Poverty rate|`POVERTY`
Average monthly Social Security payments (to recipients)|`PC_SSI`
Per capita income weighted by poverty rate|`DIVERSITY`

**ECONOMIC ACTIVITY**

Concept|Input Variables
-------|---------------
Employment to population ratio|`EMP_RESI`
Manufacturing employment to population ratio|`MANU_RES`
Retail employment to population ratio|`RETL_RES`
Service employment to population ratio|`SERV_RES`

**SPATIAL CHARACTERISTICS**

Concept|Input Variables
-------|---------------
Primary central county in 1974|`PRIMARY`
Co-central county in 1974|`CO_PRIM`
Urban fringe county in 1974|`FRINGE`

Let's grab these in category lists to make them more accessible.

In [24]:
#Capture issuer suffixes
issuers=['City, Town Vlg','Co-op Utility','College or Univ','County/Parish','Direct Issuer','District',\
         'Indian Tribe','Local Authority','State Authority','State/Province']
purposes=['Development','Education','Electric Power','Environmental Facilities','General Purpose','Healthcare',\
          'Housing','Public Facilities','Transportation','Utilities']

#Capture dependent variables
debt_vars=['GO','RV']
go_vars={'iss':['GO_'+var for var in issuers],
         'pur':['GO_'+var for var in purposes]}
rv_vars={'iss':['RV_'+var for var in issuers],
         'pur':['RV_'+var for var in purposes]}

#Capture independent vars
tel_vars={'types':['TYPE1','TYPE2','TYPE2_y'],
          'either':['LIMITS','BOTH'],
          'hi_res':['RATE_L','ASMT_L','GP_LMT','SC_LMT']}
supply_vars=['RES_POP','DENSITY','POPGROW','PERS_HLD','PRE1940']
demand_vars=['PYOUNG','PVT_SCH','POP65','PCINC','POVERTY','PC_SSI','DIVERSITY']
economic_vars=['EMP_RESI','MANU_RES','RETL_RES','SERV_RES']
spatial_vars=['PRIMARY','CO_PRIM','FRINGE']

#Capture all modeling variables in a single list
mod_vars=debt_vars+go_vars['iss']+go_vars['pur']+rv_vars['iss']+rv_vars['pur']+tel_vars['types']+\
         tel_vars['either']+tel_vars['hi_res']+supply_vars+demand_vars+economic_vars+spatial_vars
    
#For each model variable...
for var in mod_vars:
    #...tell me if it's not in the set
    if var not in data_in.columns:
        print "Is "+var+" in the data set??  Maaaan, we ain't found shit!"

Is TYPE2_y in the data set??  Maaaan, we ain't found shit!
Is DENSITY in the data set??  Maaaan, we ain't found shit!
Is POPGROW in the data set??  Maaaan, we ain't found shit!
Is PERS_HLD in the data set??  Maaaan, we ain't found shit!
Is PRE1940 in the data set??  Maaaan, we ain't found shit!
Is PYOUNG in the data set??  Maaaan, we ain't found shit!
Is PVT_SCH in the data set??  Maaaan, we ain't found shit!
Is POP65 in the data set??  Maaaan, we ain't found shit!
Is PCINC in the data set??  Maaaan, we ain't found shit!
Is POVERTY in the data set??  Maaaan, we ain't found shit!
Is PC_SSI in the data set??  Maaaan, we ain't found shit!
Is DIVERSITY in the data set??  Maaaan, we ain't found shit!
Is EMP_RESI in the data set??  Maaaan, we ain't found shit!
Is MANU_RES in the data set??  Maaaan, we ain't found shit!
Is RETL_RES in the data set??  Maaaan, we ain't found shit!
Is SERV_RES in the data set??  Maaaan, we ain't found shit!
Is PRIMARY in the data set??  Maaaan, we ain't found sh
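
The same check can be written to return the missing names as a list, which is easier to reuse than printed messages. This is a minimal sketch; `missing_vars` and the toy lists are hypothetical stand-ins for `mod_vars` and `data_in.columns`.

```python
def missing_vars(wanted, available):
    """Return the wanted names that are absent from available, in order."""
    available = set(available)
    return [var for var in wanted if var not in available]

# Toy stand-ins for data_in.columns and mod_vars
toy_columns = ['GO', 'RV', 'LIMITS', 'RES_POP']
toy_wanted = ['GO', 'LIMITS', 'DENSITY', 'PRIMARY']

missing = missing_vars(toy_wanted, toy_columns)
```

With the real objects, the call would look like `missing_vars(mod_vars, data_in.columns)`.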

Huh?  Are there more variables in the Mikesell set?  (The `test` list below holds the column names from Mikesell's data as written in [sas2csv](https://github.com/choct155/TELs_debt/blob/master/code/sas2csv.ipynb)).

In [25]:
test=[u'ACQ_ValAss', u'ASMT_L', u'ASMT_L2', u'ASMT_L3', u'Area', u'BOTH', u'BURDEN05', u'BURDEN06', u'BURDEN99', \
      u'CB', u'CB_E', u'CB_E2', u'CB_E3', u'CB_E4', u'CB_G', u'CB_G2', u'CB_share', u'CFDISC_L', u'CGEXP_L', \
      u'CLEVY_L', u'CLEVY_L2', u'CLEVY_L3', u'CLEVY_L4', u'CRATE_L', u'CRATE_L2', u'CREVU_L', u'CV_99BURDEN', \
      u'County', u'Def', u'Density', u'Dillon_all', u'EAST', u'EDU_MAND', u'EmptoResPop', u'FFDISC_L', u'FIPSCO', \
      u'FIPSST', u'GEXP_L', u'GL', u'GP_GEXP', u'GP_LEVY', u'GP_LMT', u'GP_RATE', u'GP_REVU', u'GST', \
      u'HOME_STEAD', u'HOME_STEAD2', u'HOME_STEAD3', u'IGR', u'IIT', u'LEVY_L', u'LIMITS', u'LIT', u'LST', u'ME', \
      u'MFDISC_L', u'MGEXP_L', u'MGEXP_L2', u'MIDDLE', u'MLEVY_L', u'MLEVY_L2', u'MLEVY_L3', u'MLEVY_L4', \
      u'MRATE_L', u'MRATE_L2', u'MREVU_L', u'M_H_INC05', u'M_H_INC06', u'M_H_INC99', u'M_P_TAX05', u'M_P_TAX06', \
      u'M_P_TAX99', u'M_VOOH05', u'M_V_OOH06', u'M_V_OOH99', u'MfgEmpto_TotEmp', u'NAME', u'NE', u'N_COUNTIES', \
      u'P65__05', u'PC_SCHENROL', u'PL', u'Pop_05', u'Popgrow2000_05', u'RATE05', u'RATE06', u'RATE99', u'RATE_L', \
      u'RATE_L2', u'REVU_L', u'RM', u'Rate_ch_05_06', u'RetWhoEmptoTotEmp', u'SCHENROL_2000', u'SC_LMT', u'SE', \
      u'SFDISC_L', u'SGEXP_L', u'SGEXP_L2', u'SLEVY_L', u'SLEVY_L2', u'SLEVY_L3', u'SLEVY_L4', u'SOUTH', u'SOUTH_1', \
      u'SPC_RATE', u'SPT', u'SRATE_L', u'SRATE_L2', u'SREVU_L', u'SUMLEV', u'SW', u'State', u'TREND', u'TYPE1', \
      u'TYPE2', u'TYPE2_Y', u'TnT', u'Tot_Emp', u'UC_1M', u'UC_UH150', u'UH150_UH70', u'UH300_UH150', u'UH300_UH70', \
      u'UH_150K', u'UH_300K', u'UH_70K', u'UI_1M', u'UI_25M', u'UI_UH150', u'VAM', u'VAM_POP', u'VAmanuf_2002', \
      u'Y2006', u'Y20061', u'YEAR', u'_11Forest_Fish', u'_21Mining', u'_22Utilities', u'_23Construction', u'_31Manuf', \
      u'_42Wholesale', u'_44Retail', u'_48Trans_Wharehouse', u'_51Information', u'_52Fin_Insurance', u'_53RealEstate', \
      u'_54_Prof_Sci_Tec', u'_55Managment', u'_56Adm_Support_Waste', u'_61Education', u'_62Health_Social', \
      u'_71Arts_Rec', u'_72Hotel_Food', u'_81OtherServices', u'_99Unclassified']
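
Some of the modeling names differ from the names in `test` only by case (for example `TYPE2_y` versus `TYPE2_Y`, and `DENSITY` versus `Density`), so an exact membership test will flag them as missing. Here is a minimal sketch of a case-insensitive lookup; `match_ignoring_case` is a hypothetical helper, and the short lists are excerpts chosen for illustration.

```python
def match_ignoring_case(wanted, available):
    """Map each wanted name to the available name that matches it
    when case is ignored, or to None if there is no such name."""
    lookup = {}
    for name in available:
        # Keep the first spelling seen for each lower-cased key
        lookup.setdefault(name.lower(), name)
    return {var: lookup.get(var.lower()) for var in wanted}

# Excerpts from the modeling names and from the Mikesell column list
wanted = ['TYPE2_y', 'DENSITY', 'POPGROW']
available = ['TYPE2_Y', 'Density', 'Popgrow2000_05']

matches = match_ignoring_case(wanted, available)
```

Names such as `POPGROW` still come back unmatched, because `Popgrow2000_05` differs by more than case; those would need an explicit rename mapping.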

In [26]:
#For each model variable...
for var in mod_vars:
    #...tell me if it's not in the set
    if var not in test:
        print "Is "+var+" in the data set??  Maaaan, we ain't found shit!"

Is GO in the data set??  Maaaan, we ain't found shit!
Is RV in the data set??  Maaaan, we ain't found shit!
Is GO_City, Town Vlg in the data set??  Maaaan, we ain't found shit!
Is GO_Co-op Utility in the data set??  Maaaan, we ain't found shit!
Is GO_College or Univ in the data set??  Maaaan, we ain't found shit!
Is GO_County/Parish in the data set??  Maaaan, we ain't found shit!
Is GO_Direct Issuer in the data set??  Maaaan, we ain't found shit!
Is GO_District in the data set??  Maaaan, we ain't found shit!
Is GO_Indian Tribe in the data set??  Maaaan, we ain't found shit!
Is GO_Local Authority in the data set??  Maaaan, we ain't found shit!
Is GO_State Authority in the data set??  Maaaan, we ain't found shit!
Is GO_State/Province in the data set??  Maaaan, we ain't found shit!
Is GO_Development in the data set??  Maaaan, we ain't found shit!
Is GO_Education in the data set??  Maaaan, we ain't found shit!
Is GO_Electric Power in the data set??  Maaaan, we ain't found shit!
Is GO_Envir