# Debt Data Time Series

This script extracts the relevant variables from each year's file, so that the combined set across all years can be readily merged with the TEL/ACS data.

In [2]:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
import glob

## What are the common variables?

The first thing we will do is see if we can get what we need from just the common variables across all sets.  There are a few reasons why variables may not align every year:

1. Reuters doesn't offer the entire set of variables every year;
2. Variable names with long words may have been hyphenated in different ways across years;
3. If variables appeared more than once, the parsing routine appended the variable position to the name to create a unique variable.  If the position varies across years, so will the variable name.
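Reason 2 can be illustrated with a small sketch. The helper below (`dehyphenate`, a hypothetical name that is not part of the original parsing routine) rejoins word fragments that were split with a hyphen plus whitespace. It assumes a genuine hyphenated name never has whitespace after the hyphen, which may not hold for every variable.

```python
import re

def dehyphenate(name):
    """Rejoin fragments such as 'Aver- age' -> 'Average' (hypothetical helper)."""
    # Drop a hyphen that sits between a letter and whitespace followed by a letter
    return re.sub(r'(?<=[A-Za-z])-\s+(?=[A-Za-z])', '', name)

# Names taken from the kind of headers seen in these files
examples = ['Aver- age Life', 'Ant- ici- pa- tion Type', 'Bk En- try', 'Co-Managers']
cleaned = [dehyphenate(n) for n in examples]
print(cleaned)
```

Names such as `Co-Managers` are left alone because no whitespace follows the hyphen. Position suffixes (reason 3) are not handled here, since a trailing number may also be a legitimate part of a name.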

To grab the columns in lightweight fashion, we will read in only the first two rows of each set.

In [8]:
files[0][13:-4]

'1988to1989'
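The slice `[13:-4]` works only while the directory prefix is exactly 13 characters long and the extension is 4. A sketch of an alternative that does not depend on those lengths is shown below. The example path is inferred from the glob pattern and the output above, so treat it as illustrative.

```python
import os

# Path inferred from '../debt_data/*.csv' and the label shown above (illustrative)
path = '../debt_data/1988to1989.csv'

# basename drops the directory, splitext drops the extension
label = os.path.splitext(os.path.basename(path))[0]
print(label)
```

If the data directory is ever moved or renamed, this version should still yield the same label.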

In [11]:
#Grab list of files
files=glob.glob('../debt_data/*.csv')

#Create a dictionary to hold columns from each year
col_dict={}

#For each file...
for f in files:
    #...read in the first couple rows...
    tmp_df=pd.read_csv(f,nrows=2)
    #...capture the columns...
    col_dict.update({f[13:-4]:list(tmp_df.columns)})
    #...and dump the partial data set
    del tmp_df
    
#Create a container for the variable sets within each file
var_sets=[]

#For each file...
for f in col_dict.keys():
    #...add the variable set to var_sets
    var_sets.append(set(col_dict[f]))
    
#Capture the intersection of variables across all years
common_vars=sorted(list(set.intersection(*var_sets)))

print 'There are '+str(len(common_vars))+' variables common to all sets.'
print common_vars

There are 254 variables common to all sets.
['# of Mgrs', '$ Amount of Highest Cpn Maturity', '144A FLAG', '501c3', '8-Digit CUSIP25', '8-Digit CUSIP26', 'Accumulator Amt ($ Mil)', 'All Use of Proceeds (Code)', 'All Use of Proceeds (Desc)', 'All Use of Proceeds (Number)', 'Amount at Maturity ($ mils)', 'Amount of Final Maturity ($mils)', 'Amount of Issue ($ mils)', 'Amount of Maturity ($ mils)', 'Ant- ici- pa- tion Type', 'Asset Backed Indicator Flag (Y/N)', 'Auction Rate', 'Aver- age Life', 'Average Take Down', 'Bank Qual', 'Beginning Price/ Yield', 'Beginning Serial Coupon', 'Beginning Serial Maturity', 'Bid', 'Bk Elig', 'Bk En- try', 'Bnk Mgd', 'Bond Buyer ALL UOP', 'Bond Buyer GO Index', 'Bond Buyer Region118', 'Bond Buyer Region119', 'Bond Buyer Rev. Index', 'Bond Buyer UOP142', 'Bond Buyer UOP143', 'Bond Buyer UOP30', 'Bond Counsel Deal(Y/N)', 'CD-ROM Number', 'CUSIP of Insti- tutional Backer', 'Call Date', 'Call Issue', 'Call Price', 'Callable at Par', 'Co-Managers', 'Comb. Gros
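Besides the intersection, it can help to see which variables each file lacks relative to the union of all files. The sketch below uses small in-memory stand-ins with made-up headers rather than the real files. With real data, `open(f)` would take the place of `io.StringIO(text)`.

```python
import csv
import io

# Made-up, in-memory stand-ins for the yearly CSV files (hypothetical headers)
fake_files = {
    '1988to1989': 'County,State,Bid,Bk En- try\nA,MO,1,Y\n',
    '1990to1991': 'County,State,Bid,Bk Entry\nB,CO,2,N\n',
}

# Read only the header row of each file
headers = {}
for label, text in fake_files.items():
    headers[label] = set(next(csv.reader(io.StringIO(text))))

# Variables shared by every file, and variables seen in any file
common = set.intersection(*headers.values())
everything = set.union(*headers.values())

# For each file, list the variables it lacks relative to the union
missing = {label: sorted(everything - cols) for label, cols in headers.items()}

print(sorted(common))
for label in sorted(missing):
    print(label, missing[label])
```

A variable that is missing from only one file under a slightly different spelling is a likely hyphenation or position-suffix mismatch.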

Ok, we are looking for aggregations of debt by county.  In particular, we want to capture activity along the following concepts:

1. Type of Debt (General Obligation or Revenue; latter can be split by )
2. Issuer Type (General purpose gov, school district, special district, or private entity)
3. Purpose of the Issue
4. Volume of Issue

For the latter two, we also want variables that split out GO versus revenue bonds.  For example, we would want to know the volume of GO debt issued by general purpose jurisdictions, or the revenue debt issued in service of transportation infrastructure.  The following table maps concepts to variables.

Concept|Variable|Possible Values
-------|--------|---------------
Debt Type|`Security Type`| GO<br>RV
Issuer Type|`Issuer Type Description`|District<br>City, Town Vlg<br>Local Authority<br>State Authority<br>County/Parish<br>College or Univ<br>State/Province<br>Direct<br>Indian Tribe<br>Co-op Utility
Purpose|`Bond Buyer UOP30`|Development<br>Education<br>Electric Power<br>Environmental Facilities<br>General Purpose<br>Healthcare<br>Housing<br>Public Facilities<br>Transportation<br>Utilities
Volume|`Amount of Maturity ($ mils)`|Continuous
County|`County`|Any county in the US
State|`State`|Any state in the US
Issue Date|`Sale Date`|Continuous (we only need the year)

Fortunately, all of these variables appear in the common set.

In [18]:
#Define required variables
req_vars=['Security Type','Issuer Type Description','Bond Buyer UOP30',\
          'Amount of Maturity ($ mils)','County','State','Sale Date']
print 'All the requisite variables are in the common set:',np.array([var in common_vars for var in req_vars]).all()

All the requisite variables are in the common set: True


That makes things easier.  Let's just go ahead and read the data in from all years, keeping only the variables in `req_vars`.

In [24]:
#Create a container for DFs from all years
df_list=[]

#For each file...
for f in files:
    #...throw the subset into df_list
    df_list.append(pd.read_csv(f,usecols=req_vars))
    
#Concatenate all the years together
debt=pd.concat(df_list)

#Convert sale date to datetime (vectorized over the whole column)
debt['Sale Date']=pd.to_datetime(debt['Sale Date'])

#Generate a year variable
debt['Year']=debt['Sale Date'].dt.year

#Jettison Sale Date
debt.pop('Sale Date')

print debt.info()
debt.head()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 465391 entries, 0 to 36385
Data columns (total 7 columns):
Bond Buyer UOP30               465388 non-null object
Amount of Maturity ($ mils)    465391 non-null object
County                         461215 non-null object
Issuer Type Description        465357 non-null object
State                          465365 non-null object
Security Type                  465391 non-null object
Year                           465391 non-null int64
dtypes: int64(1), object(6)
memory usage: 28.4+ MB
None


Unnamed: 0,Bond Buyer UOP30,Amount of Maturity ($ mils),County,Issuer Type Description,State,Security Type,Year
0,Utilities,0.48,Callaway,District,MO,RV,1988
1,Utilities,0.05,Cass,"City, Town Vlg",MO,RV,1988
2,General Purpose,5.175,Gunnison,District,CO,GO,1988
3,Education,0.273,Clermont/Warren,District,OH,GO,1988
4,Transportation,0.22,Bartholomew,District,IN,GO,1988
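The `info()` output above reports `Amount of Maturity ($ mils)` as `object` rather than a numeric dtype, so it will need converting before any aggregation. The sketch below shows one way to do that on a made-up Series. The comma-formatted and `n/a` entries are assumptions about what might block parsing, not values observed in the data.

```python
import pandas as pd

# Made-up values standing in for the raw column (the last two are assumptions)
raw = pd.Series(['0.48', '5.175', '1,250.5', 'n/a'])

# Strip thousands separators, then coerce anything unparseable to NaN
amount = pd.to_numeric(raw.str.replace(',', '', regex=False), errors='coerce')

# Count the entries that failed to parse so they can be inspected
n_bad = int(amount.isnull().sum())
print(amount.tolist())
print(n_bad)
```

Inspecting the rows that end up as NaN before dropping or fixing them would show what actually prevents the column from parsing.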


We need to merge in FIPS codes, which the Census Bureau conveniently publishes online.

In [46]:
#Define names for fields
fips_names=['state','fips_st','fips_co','county','unknown']

#Capture dtypes of fips code variables (to keep the zeroes)
fips_dtypes={'fips_st':str,
             'fips_co':str}

#Read in fips
fips=pd.read_csv('http://www2.census.gov/geo/docs/reference/codes/files/national_county.txt',
                 names=fips_names,dtype=fips_dtypes)

#Remove 'County' and 'Parish' from the county names
fips['county']=fips['county'].str.replace(' County','')
fips['county']=fips['county'].str.replace(' Parish','')

#Create composite name and fips variables
fips['st_cou']=fips.apply(lambda row: row['state']+'_'+row['county'],axis=1)
fips['fips']=fips.apply(lambda row: row['fips_st']+row['fips_co'],axis=1)

#Capture dict to map composite names to composite fips codes
fips_dict=dict(zip(fips['st_cou'],fips['fips']))

#Generate composite name for the debt data
debt['st_cou']=debt.apply(lambda row: str(row['State'])+'_'+str(row['County']),axis=1)

#Map in fips codes
debt['FIPS']=debt['st_cou'].map(fips_dict)

debt

Unnamed: 0,Bond Buyer UOP30,Amount of Maturity ($ mils),County,Issuer Type Description,State,Security Type,Year,st_cou,fips,FIPS
0,Utilities,0.48,Callaway,District,MO,RV,1988,MO_Callaway,29027,29027
1,Utilities,0.05,Cass,"City, Town Vlg",MO,RV,1988,MO_Cass,29037,29037
2,General Purpose,5.175,Gunnison,District,CO,GO,1988,CO_Gunnison,08051,08051
3,Education,0.273,Clermont/Warren,District,OH,GO,1988,OH_Clermont/Warren,,
4,Transportation,0.22,Bartholomew,District,IN,GO,1988,IN_Bartholomew,18005,18005
5,Education,1.798,Lake,District,IN,GO,1988,IN_Lake,18089,18089
6,Healthcare,0.09,Carver,"City, Town Vlg",MN,RV,1988,MN_Carver,27019,27019
7,General Purpose,1.32,Platte,"City, Town Vlg",NE,GO,1988,NE_Platte,31141,31141
8,General Purpose,3.68,Grundy,County/Parish,IL,GO,1988,IL_Grundy,17063,17063
9,General Purpose,0.955,St Croix,"City, Town Vlg",WI,GO,1988,WI_St Croix,,
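Rows such as `WI_St Croix` fail to match, presumably because the two sources spell the county differently (for example with or without a period, or with a different suffix). A sketch of a normalizer that could be applied to both sides before building the key is shown below. The function name and the suffix list are assumptions and would need checking against the actual Census file.

```python
# Suffixes assumed to appear in Census county names; extend as needed
SUFFIXES = (' County', ' Parish', ' Borough', ' Municipality', ' Census Area')

def normalize_county(name):
    """Return a lowercase county name without suffix or periods (hypothetical helper)."""
    # Remove one trailing suffix, if present
    for suffix in SUFFIXES:
        if name.endswith(suffix):
            name = name[:-len(suffix)]
            break
    # Drop periods, collapse repeated whitespace, and lowercase
    name = name.replace('.', '')
    return ' '.join(name.split()).lower()

print(normalize_county('St. Croix County'))
print(normalize_county('St Croix'))
```

Multi-county entries such as `Clermont/Warren` would still not match and need a separate decision about how to allocate the issue.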


Ok, that got about three quarters of the records.  Let's try to get the rest.  Many of the remaining issues come from a `State Authority` or from the state outright.  If `State` appears anywhere in the issuer type description, no single county can be affiliated with the issue.  So, let's assign the state-level FIPS code (the two-digit state code followed by `000`) to all of them.

In [87]:
#Capture states
states=sorted(set(debt['State']))[1:]

#For each state...
for st in states:
    #...capture the keys associated with that state...
    st_keys=[item for item in fips_dict.items() if str(st)+'_' in item[0]]
    try:
        #...extract the state portion of the value associated with the first member of the list...
        st_key_part=st_keys[0][1][:2]
        #...and assign the state fips code
        debt.ix[(debt['State']==st) & ('State' in debt['Issuer Type Description']),'FIPS']=st_key_part+'000'
#         debt.ix[(debt['State']==st) & (debt['County']=='State'),'FIPS']=st_key_part+'000'
    except:
        print 'Problem state >>> ',st

debt[debt['County']=='State Authority']

Problem state >>>  MR
Problem state >>>  TT
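The codes `MR` and `TT` are flagged because no key in `fips_dict` starts with them. A sketch of building the state lookup once, by splitting each key at the first underscore instead of scanning for a substring, is shown below. The name `state_fips` is hypothetical, and the sample entries are copied from the output earlier in this notebook.

```python
# A few entries in the same shape as fips_dict (values taken from the output above)
fips_dict = {'MO_Callaway': '29027', 'MO_Cass': '29037', 'CO_Gunnison': '08051'}

# Map each state abbreviation to its state-level code: two digits plus '000'
state_fips = {}
for key, code in fips_dict.items():
    st = key.split('_', 1)[0]
    state_fips.setdefault(st, code[:2] + '000')

# States absent from the lookup come back as None instead of raising
for st in ['MO', 'CO', 'MR', 'TT']:
    print(st, state_fips.get(st))
```

Using `.get` makes the missing states explicit without a `try`/`except`, and matching on the key prefix avoids accidental hits where `'XX_'` appears in the middle of a key.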


Unnamed: 0,State,Bond Buyer UOP30,Amount of Maturity ($ mils),County,Issuer Type Description,Security Type,Year,st_cou,fips,FIPS
0,AK,Housing,43,State Authority,State Authority,RV,1988,AK_State Authority,,02000
2,AK,Housing,9.86,State Authority,State Authority,RV,1988,AK_State Authority,,02000
6,AK,Housing,43.385,State Authority,State Authority,RV,1988,AK_State Authority,,02000
7,AK,Education,58.850,State Authority,State Authority,RV,1988,AK_State Authority,,02000
12,AK,Development,1.175,State Authority,State Authority,RV,1988,AK_State Authority,,02000
13,AK,Development,1.81,State Authority,State Authority,GO,1988,AK_State Authority,,02000
14,AK,Housing,21,State Authority,State Authority,RV,1988,AK_State Authority,,02000
16,AK,Housing,38.25,State Authority,State Authority,RV,1988,AK_State Authority,,02000
18,AK,Housing,24.72,State Authority,State Authority,RV,1988,AK_State Authority,,02000
19,AK,Education,5,State Authority,State Authority,RV,1988,AK_State Authority,,02000


In [99]:
debt.ix[(debt['FIPS'].isnull()) & (debt['State']=='AK')]

Unnamed: 0,State,Bond Buyer UOP30,Amount of Maturity ($ mils),County,Issuer Type Description,Security Type,Year,st_cou,fips,FIPS
1,AK,Utilities,7.815,Fairbanks No Star,"City, Town Vlg",RV,1988,AK_Fairbanks No Star,,
3,AK,General Purpose,27.813,Fairbanks No Star,"City, Town Vlg",GO,1988,AK_Fairbanks No Star,,
4,AK,General Purpose,16.06,Fairbanks No Star,"City, Town Vlg",GO,1988,AK_Fairbanks No Star,,
5,AK,Education,23.000,Anchorage,"City, Town Vlg",GO,1988,AK_Anchorage,,
8,AK,General Purpose,14.500,Anchorage,"City, Town Vlg",GO,1988,AK_Anchorage,,
9,AK,Public Facilities,2.000,Matanuska-Susitna,"City, Town Vlg",RV,1988,AK_Matanuska-Susitna,,
10,AK,General Purpose,86,North Slope,"City, Town Vlg",GO,1988,AK_North Slope,,
11,AK,General Purpose,25,North Slope,"City, Town Vlg",GO,1988,AK_North Slope,,
15,AK,General Purpose,3.145,Juneau,State Authority,RV,1988,AK_Juneau,,
17,AK,Utilities,5,Fairbanks No Star,"City, Town Vlg",RV,1988,AK_Fairbanks No Star,,


In [None]:
debt.ix[(debt['FIPS'].isnull()) & ('State' in debt['Issuer Type Description'])]

In [98]:
print debt.ix[(debt['FIPS'].isnull())].ix[26]
'State' in debt.ix[(debt['FIPS'].isnull())].ix[26]['Issuer Type Description']

State                                       AK
Bond Buyer UOP30               General Purpose
Amount of Maturity ($ mils)               2.71
County                                  Juneau
Issuer Type Description        State Authority
Security Type                               GO
Year                                      1989
st_cou                               AK_Juneau
fips                                       NaN
FIPS                                       NaN
Name: 26, dtype: object


True
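The check above returns `True` because it runs against a single string, where `in` is a substring test. Applied to a whole Series, `in` tests the index labels instead, so an expression like `'State' in debt['Issuer Type Description']` evaluates to one scalar rather than a row-by-row mask. The sketch below shows the difference on a made-up Series, with `str.contains` as the element-wise alternative.

```python
import pandas as pd

# Made-up issuer descriptions, including a missing value
desc = pd.Series(['State Authority', 'District', 'State/Province', None])

# `in` on a Series tests the index labels (0, 1, 2, 3 here), not the values
label_check = 'State' in desc
index_check = 0 in desc
print(label_check, index_check)

# Element-wise substring test; na=False treats missing descriptions as non-matches
mask = desc.str.contains('State', na=False)
print(mask.tolist())
```

Combining a scalar `False` with another boolean Series through `&` yields a mask that is `False` everywhere, which would explain an assignment that silently selects no rows.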

In [97]:
states

['AK',
 'AL',
 'AR',
 'AS',
 'AZ',
 'CA',
 'CO',
 'CT',
 'DC',
 'DE',
 'FL',
 'GA',
 'GU',
 'HI',
 'IA',
 'ID',
 'IL',
 'IN',
 'KS',
 'KY',
 'LA',
 'MA',
 'MD',
 'ME',
 'MI',
 'MN',
 'MO',
 'MR',
 'MS',
 'MT',
 'N',
 'NC',
 'ND',
 'NE',
 'NH',
 'NJ',
 'NM',
 'NV',
 'NY',
 'OH',
 'OK',
 'OR',
 'PA',
 'PR',
 'RI',
 'SC',
 'SD',
 'TN',
 'TT',
 'TX',
 'UT',
 'VA',
 'VI',
 'VT',
 'WA',
 'WI',
 'WV',
 'WY']