## temp

For our project, we would like to collect data to see whether voter preferences correlate with how severely certain populations were affected by the Great Recession. We can measure the “recession effect” in a number of ways. For example, we can use statistics from the US Bureau of Labor Statistics for specific geographical areas and compare pre-2008 and post-2008 figures to see how the recession affected jobs in those regions. From there, we could look at polls or voter data to see how those regions voted in the 2010 midterm elections or the 2012 presidential election. Comparing these two sets of data, we would like to find out which parties or administrations people blamed more for the recession, whether voting habits changed in the areas most affected by it, and why.

This sounds good, but because political landscapes change constantly, it is difficult to say that certain changes were "caused" by the 2008 recession rather than by other factors that happen to correlate with the areas the recession hit (the classic correlation/causation issue). If you keep this in mind, I think the analysis can be interesting.

1. recession effect data
	- industries affected by the ’08 recession
	- recovery rate of affected industries (pre ’08 / post ’08 industry stats)
	- outsourcing in respective industries
	- [BLS API](http://www.bls.gov/developers/api_signature_v2.htm#multiple)
	- [BLS series](http://www.bls.gov/help/hlpforma.htm#OE)
	- [BLS Python API example](http://www.bls.gov/developers/api_python.htm#python2)
2. voter preference data
	-  pre / post ’08
3. find way to partition each data set geographically
	- west coast / midwest / east coast
	- rural / metropolitan


## Analyzing the Election

To analyze the election data, we'll be looking towards [Politico](http://www.politico.com/) and the [New York Times](http://www.nytimes.com) and their election coverage pages. Unfortunately, due to formatting changes over the years, it's a little more difficult than we'd like it to be to fetch county voter data from the 2008, 2012, and 2016 elections with one script. However, it's not too hard to write scripts for each of those elections individually.

To help us, we'll be using the [Selenium WebDriver API](http://selenium-python.readthedocs.io/) and [PhantomJS](http://phantomjs.org/) to GET some of the pages. You can install them like so (assuming you have a Mac and Homebrew):
```
pip install selenium
brew install phantomjs
```

In [41]:
import numpy as np
import pandas as pd
import requests
import sklearn.svm  # "import sklearn" alone does not load the svm submodule
import time
from bs4 import BeautifulSoup
from selenium import webdriver

# List of states and D.C.
states = [
    'Alabama','Alaska','Arizona','Arkansas','California','Colorado',
    'Connecticut','Delaware','District Of Columbia','Florida','Georgia','Hawaii','Idaho', 
    'Illinois','Indiana','Iowa','Kansas','Kentucky','Louisiana',
    'Maine','Maryland','Massachusetts','Michigan','Minnesota',
    'Mississippi', 'Missouri','Montana','Nebraska','Nevada',
    'New Hampshire','New Jersey','New Mexico','New York',
    'North Carolina','North Dakota','Ohio',    
    'Oklahoma','Oregon','Pennsylvania','Rhode Island',
    'South Carolina','South Dakota','Tennessee','Texas','Utah',
    'Vermont','Virginia','Washington','West Virginia',
    'Wisconsin','Wyoming'
]
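
Each scraper in the following cells turns a state name into a URL ending by lowercasing it and replacing spaces with hyphens. A minimal sketch of that step, using a three-state sample list (`state_slug` and `sample_states` are illustrative names, not part of the notebook):

```python
# Sketch only: `state_slug` and `sample_states` are illustrative names.
def state_slug(state):
    """Lowercase a state name and replace spaces with hyphens."""
    return state.lower().replace(' ', '-')

sample_states = ['New Hampshire', 'District Of Columbia', 'Ohio']
slugs = sorted(state_slug(s) for s in sample_states)
# slugs == ['district-of-columbia', 'new-hampshire', 'ohio']
```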

First, we'll collect 2008 data from the New York Times pages. For each state we want the year, the state, and the number of votes for the Democratic and Republican parties (we'll be ignoring third parties in this analysis), which we get by summing the county rows on each state page. Since Alaska and Washington D.C. don't have counties, we read their totals from the statewide results table instead.
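
As a rough illustration of the aggregation step, the sketch below sums per-county counts into one state row. The county figures are made up for the example, and `parse_count` is a hypothetical helper, not the notebook's parser:

```python
def parse_count(text):
    """Convert a string such as '1,264,879' to an int."""
    return int(text.strip().replace(',', ''))

# Hypothetical (dem, rep) count strings for three counties in one state
county_rows = [('12,345', '9,876'), ('1,000', '2,500'), ('430', '75')]

dem_total = sum(parse_count(dem) for dem, _ in county_rows)
rep_total = sum(parse_count(rep) for _, rep in county_rows)

# Same column order as the data frame: year, state, dem_votes, rep_votes
state_row = ['2008', 'example-state', dem_total, rep_total]
```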

In [116]:
def voter_html_to_data_2008(voter_html):
    '''
    Converts one row of the county results table into vote counts
        Input: a table row of voting data, in html
        Output: [dem_votes, rep_votes] as integers
    '''
    data = [datum.get_text().strip() for datum in voter_html.find_all('td')]
    # Drop the trailing 5 characters and the thousands separators before converting
    blue_votes = int(data[2][:-5].replace(',',''))
    red_votes = int(data[4][:-5].replace(',',''))
    return [blue_votes, red_votes]

def get_voter_data_2008():
    # Set up our data frame
    df = pd.DataFrame(columns=('election_year', 'state', 'dem_votes', 'rep_votes'))
    
    # Base url we'll be getting data from 
    base_url = 'http://elections.nytimes.com/2008/results/states/president/'
    
    # Get state names for url endings
    state_urls = sorted([state.lower().replace(' ', '-') for state in states])

    # Iterate through the states (except for Alaska and D.C.)
    num_states = 0
    for state in state_urls:
        if state != 'alaska' and state != 'district-of-columbia':
            # Get data from the site
            response = requests.get(base_url + state + '.html')
            election_soup = BeautifulSoup(response.text, 'html.parser')

            # Data for all states except for Alaska and D.C.
            data_rows = election_soup.find(id='winners-by-county-table').tbody.find_all('tr')
            
            # Data that every row will have (election year and state)
            header_data = ['2008', state]
            state_vote_lists = [voter_html_to_data_2008(row) for row in data_rows]
            state_vote_counts = [sum([vote[0] for vote in state_vote_lists]), sum([vote[1] for vote in state_vote_lists])]
            voter_data = header_data + state_vote_counts
            df.loc[num_states] = voter_data
            num_states += 1
    
    # Since Alaska and D.C. don't have counties, we process them slightly differently
    # First, Alaska
    alaska_html = requests.get('http://elections.nytimes.com/2008/results/states/alaska.html')
    alaska_election_soup = BeautifulSoup(alaska_html.text, 'html.parser')
    
    alaska_obama = alaska_election_soup.find(id='presidential-results-table').tbody.find_all('tr')[:2][1].find_all('td')
    alaska_mccain = alaska_election_soup.find(id='presidential-results-table').tbody.find_all('tr')[:2][0].find_all('td')
    alaska_blue_votes = int(alaska_obama[1].get_text().strip().replace(',',''))
    alaska_red_votes = int(alaska_mccain[2].get_text().strip().replace(',',''))
    alaska_data = ['2008', 'alaska', alaska_blue_votes, alaska_red_votes]
    df.loc[num_states] = alaska_data
    num_states += 1
    
    # Finally, D.C.
    dc_html = requests.get('http://elections.nytimes.com/2008/results/states/district-of-columbia.html')
    dc_election_soup = BeautifulSoup(dc_html.text, 'html.parser')

    dc_obama = dc_election_soup.find(id='presidential-results-table').tbody.find_all('tr')[:2][0].find_all('td')
    dc_mccain = dc_election_soup.find(id='presidential-results-table').tbody.find_all('tr')[:2][1].find_all('td')
    dc_blue_votes = int(dc_obama[2].get_text().strip().replace(',',''))
    dc_red_votes = int(dc_mccain[1].get_text().strip().replace(',',''))
    dc_data = ['2008', 'district-of-columbia', dc_blue_votes, dc_red_votes]
    df.loc[num_states] = dc_data
    num_states += 1
    
    return df

In [117]:
df_2008 = get_voter_data_2008()

                                    dem_votes  rep_votes
election_year state                                     
2008          alabama                811764.0  1264879.0
              alaska                 122485.0   192631.0
              arizona                948648.0  1132560.0
              arkansas               418049.0   632672.0
              california            7441458.0  4554643.0
              colorado              1216793.0  1020135.0
              connecticut            994320.0   627688.0
              delaware               255394.0   152356.0
              district-of-columbia   210403.0    14821.0
              florida               4143957.0  3939380.0
              georgia               1843452.0  2048244.0
              hawaii                 324918.0   120309.0
              idaho                  235219.0   400989.0
              illinois              3319237.0  1981158.0
              indiana               1367264.0  1341101.0
              iowa             

In [119]:
df_2008

Unnamed: 0,election_year,state,dem_votes,rep_votes
0,2008,alabama,811764.0,1264879.0
1,2008,arizona,948648.0,1132560.0
2,2008,arkansas,418049.0,632672.0
3,2008,california,7441458.0,4554643.0
4,2008,colorado,1216793.0,1020135.0
5,2008,connecticut,994320.0,627688.0
6,2008,delaware,255394.0,152356.0
7,2008,florida,4143957.0,3939380.0
8,2008,georgia,1843452.0,2048244.0
9,2008,hawaii,324918.0,120309.0


For 2012, it'll be easier to use the Politico pages to collect the data.

In [122]:
def get_voter_data_2012():
    # Set up our data frame
    df = pd.DataFrame(columns=('election_year', 'state', 'dem_votes', 'rep_votes'))
    
    # Base url we'll be getting data from
    base_url = 'http://www.politico.com/2012-election/results/president/'
    
    # Get state names for url endings
    state_urls = sorted([state.lower().replace(' ', '-') for state in states])
    
    # Here every state page, including Alaska and D.C., is handled the same way
    num_states = 0
    for state in state_urls:
        # Get data from site
        try:
            response = requests.get(base_url + state + '/')
        except requests.exceptions.ConnectionError:
            # Wait briefly and retry once if the connection drops
            time.sleep(2)
            response = requests.get(base_url + state + '/')
        election_soup = BeautifulSoup(response.text, 'html.parser')
        header_data = ['2012', state]
        data = election_soup.find('div', class_='state-results-macro').table.tbody
        blue_votes = int(data.find(class_='party-democrat').find(class_='results-popular').get_text().strip().replace(',',''))
        red_votes = int(data.find(class_='party-republican').find(class_='results-popular').get_text().strip().replace(',',''))
        voter_data = header_data + [blue_votes, red_votes]
        df.loc[num_states] = voter_data
        num_states += 1
        
    return df
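
The single retry above could be generalized into a small helper. This is only a sketch under assumptions: `with_retries` and `flaky_fetch` are hypothetical names, and the stand-in fetch function simulates failures instead of touching the network.

```python
import time

def with_retries(fetch, attempts=3, delay=0.0, errors=(IOError,)):
    """Call fetch() until it succeeds or `attempts` calls have failed."""
    for attempt in range(attempts):
        try:
            return fetch()
        except errors:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(delay)

# Stand-in for a network call: fails twice, then succeeds
calls = {'count': 0}

def flaky_fetch():
    calls['count'] += 1
    if calls['count'] < 3:
        raise IOError('simulated connection failure')
    return 'page html'

result = with_retries(flaky_fetch)
```

With `requests`, one would pass `errors=(requests.exceptions.ConnectionError,)` and a `fetch` that wraps `requests.get`.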

In [123]:
df_2012 = get_voter_data_2012()

In [124]:
df_2012

Unnamed: 0,election_year,state,dem_votes,rep_votes
0,2012,alabama,793620.0,1252453.0
1,2012,alaska,102138.0,136848.0
2,2012,arizona,930669.0,1143051.0
3,2012,arkansas,389699.0,638467.0
4,2012,california,6493924.0,4202127.0
5,2012,colorado,1238490.0,1125391.0
6,2012,connecticut,912531.0,631432.0
7,2012,delaware,242547.0,165476.0
8,2012,district-of-columbia,222332.0,17337.0
9,2012,florida,4235270.0,4162081.0


Finally, for 2016, we'll be using Politico again.

In [125]:
def voter_html_to_data_2016(voter_html):
    blue_votes = int(voter_html.find(class_='type-democrat').find(class_='results-popular').get_text().replace(',',''))
    red_votes = int(voter_html.find(class_='type-republican').find(class_='results-popular').get_text().replace(',',''))
    return [blue_votes, red_votes]

def get_voter_data_2016():
    # Set up our data frame
    df = pd.DataFrame(columns=('election_year', 'state', 'dem_votes', 'rep_votes'))
    
    # Base url we'll be getting data from
    base_url = 'http://www.politico.com/2016-election/results/map/president/'
    
    # Get state names for url endings
    state_urls = sorted([state.lower().replace(' ', '-') for state in states])
    
    # This time, Alaska and D.C. aren't different!
    num_states = 0
    for state in state_urls:
        # Get data from site
        response = requests.get(base_url + state + '/')
        election_soup = BeautifulSoup(response.text, 'html.parser')

        data = election_soup.select('section.content-group.election-intro')[0].find('div', class_='overall')
        header_data = ['2016', state]
        voter_data = header_data + voter_html_to_data_2016(data)
        df.loc[num_states] = voter_data
        num_states += 1
    
    return df

In [None]:
df_2016 = get_voter_data_2016()

In [151]:
df_2016

Unnamed: 0,election_year,state,dem_votes,rep_votes
0,2016,alabama,718084.0,1306925.0
1,2016,alaska,93007.0,130415.0
2,2016,arizona,936250.0,1021154.0
3,2016,arkansas,378729.0,677904.0
4,2016,california,7362490.0,3916209.0
5,2016,colorado,1208095.0,1136354.0
6,2016,connecticut,884432.0,668266.0
7,2016,delaware,235581.0,185103.0
8,2016,district-of-columbia,260223.0,11553.0
9,2016,florida,4485745.0,4605515.0


To aid in our analysis, we'll create a simple function to find the percent of the state that voted blue, given an election year.

In [147]:
def find_percent_blue(state, year):
    if year == 2008:
        df = df_2008
    elif year == 2012:
        df = df_2012
    else:
        df = df_2016
    df = df.set_index('state')
    dem_votes = df.at[state,'dem_votes']
    rep_votes = df.at[state, 'rep_votes']
    return dem_votes / (dem_votes + rep_votes)

In [150]:
print(find_percent_blue('california', 2012))
print(find_percent_blue('california', 2016))

0.60713285679
0.652778303597
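
The figure printed for 2012 can be reproduced by hand from the California row of the 2012 table above. A minimal sketch (`two_party_share` is an illustrative name; note the share is of the two-party vote, since third parties are ignored):

```python
def two_party_share(dem_votes, rep_votes):
    """Democratic share of the combined Democratic and Republican vote."""
    return dem_votes / float(dem_votes + rep_votes)

# California 2012 counts taken from the 2012 table
ca_2012 = two_party_share(6493924, 4202127)
```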


We accumulate the dataframes for construction, manufacturing, and transportation and trade into one dataframe describing the total change in employment in these 3 blue collar industries.

In [None]:
def accumulate_blue_collar_industries(construction_df, manufacturing_df, transport_and_trade_df):
    df = pd.DataFrame({"state": states})
    df[2004] = construction_df[2004] + manufacturing_df[2004] + transport_and_trade_df[2004]
    df[2008] = construction_df[2008] + manufacturing_df[2008] + transport_and_trade_df[2008]
    df[2012] = construction_df[2012] + manufacturing_df[2012] + transport_and_trade_df[2012]
    df[2016] = construction_df[2016] + manufacturing_df[2016] + transport_and_trade_df[2016]
    return df

blue_collar_df = accumulate_blue_collar_industries(construction_df, manufacturing_df, transport_and_trade_df)

We make the dataframes that will be input into the SVM.

In [None]:
def create_SVM_input(year, blue_collar_df, total_employment_df):
    df = pd.DataFrame({"state": states, 
                       "blue_collar_change_last_8_years": (blue_collar_df[year] - blue_collar_df[year - 8]) / blue_collar_df[year - 8],
                       "total_employment_change_last_8_years": (total_employment_df[year] - total_employment_df[year - 8]) / total_employment_df[year - 8]})
    # Vote share from the previous election; the scraped frames use lowercase, hyphenated state names
    df["proportion_blue_4_years_ago"] = df.apply(
        lambda row: find_percent_blue(row["state"].lower().replace(' ', '-'), year - 4), axis=1)
    return df

df_predict_2012 = create_SVM_input(2012, blue_collar_df, total_employment_df)
df_predict_2016 = create_SVM_input(2016, blue_collar_df, total_employment_df)
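
The two employment features are relative changes over the previous 8 years. A small worked example of that formula, with made-up employment numbers for a single state (`relative_change` is an illustrative name):

```python
def relative_change(current, baseline):
    """Fractional change from baseline to current, e.g. -0.1 for a 10% drop."""
    return (current - baseline) / float(baseline)

# Hypothetical blue collar employment (in thousands) for one state
employment = {2004: 500.0, 2012: 450.0}
year = 2012
change = relative_change(employment[year], employment[year - 8])
```

A negative value means the state had fewer blue collar jobs in the election year than 8 years earlier.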

Next we label each state with its result (1 for Democrat, 0 for Republican) in each of the last 3 presidential elections (2008, 2012, 2016).

In [None]:
results = {
    'Alabama': [0,0,0], 'Alaska': [0,0,0],
    'Arizona': [0,0,0], 'Arkansas': [0,0,0],
    'California': [1,1,1], 'Colorado': [1,1,1],
    'Connecticut': [1,1,1], 'Delaware': [1,1,1],
    'District Of Columbia': [1,1,1], 'Florida': [1,1,0],
    'Georgia': [0,0,0], 'Hawaii': [1,1,1],
    'Idaho': [0,0,0], 'Illinois': [1,1,1],
    'Indiana': [1,0,0], 'Iowa': [1,1,0],
    'Kansas': [0,0,0], 'Kentucky': [0,0,0],
    'Louisiana': [0,0,0], 'Maine': [1,1,1],
    'Maryland': [1,1,1],'Massachusetts': [1,1,1],
    'Michigan': [1,1,0],'Minnesota': [1,1,1],
    'Mississippi': [0,0,0], 'Missouri': [0,0,0],
    'Montana': [0,0,0], 'Nebraska': [0,0,0],
    'Nevada': [1,1,1], 'New Hampshire': [1,1,1],
    'New Jersey': [1,1,1],'New Mexico': [1,1,1],
    'New York': [1,1,1], 'North Carolina': [1,0,0],
    'North Dakota': [0,0,0],'Ohio': [1,1,0],    
    'Oklahoma': [0,0,0],'Oregon': [1,1,1],
    'Pennsylvania': [1,1,0],'Rhode Island': [1,1,1],
    'South Carolina': [0,0,0],'South Dakota': [0,0,0],
    'Tennessee': [0,0,0],'Texas': [0,0,0],
    'Utah': [0,0,0], 'Vermont': [1,1,1],
    'Virginia': [1,1,1],'Washington': [1,1,1],
    'West Virginia': [0,0,0], 'Wisconsin': [1,1,0],
    'Wyoming': [0,0,0]
}

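def provide_labels(year):
    # Follow the order of `states` so labels line up with the feature rows;
    # iterating over the dict directly does not guarantee alphabetical order
    return np.array([results[state][(year - 2008) // 4] for state in states])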
    
# df_predict_2012["alphabetical_state_results_2008"] = provide_labels(2008)
df_predict_2012["alphabetical_state_results_2012"] = provide_labels(2012)
df_predict_2016["alphabetical_state_results_2016"] = provide_labels(2016)
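
The labels only line up with the feature rows if both follow the same state order. A small sketch of the indexing, using three entries copied from `results` (`sample_states`, `sample_results`, and `labels_for` are illustrative names):

```python
sample_states = ['Alabama', 'California', 'Florida']
sample_results = {
    'Alabama': [0, 0, 0],
    'California': [1, 1, 1],
    'Florida': [1, 1, 0],
}

def labels_for(year):
    """Pick each state's label for `year`, in the order of sample_states."""
    slot = (year - 2008) // 4  # 2008 -> 0, 2012 -> 1, 2016 -> 2
    return [sample_results[state][slot] for state in sample_states]

labels_2012 = labels_for(2012)
labels_2016 = labels_for(2016)
```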

Now that we've extracted all the necessary training data, we train an SVM on the features and labels of our training set.

In [None]:
# classifier for training examples, which are 2012 states and labels
def learn_classifier(training_features, training_labels):
    svc = sklearn.svm.SVC()
    return svc.fit(training_features, training_labels)

# Selecting several columns requires a list inside the brackets
trained_svc = learn_classifier(df_predict_2012[["blue_collar_change_last_8_years", 
                                                "total_employment_change_last_8_years", 
                                                "proportion_blue_4_years_ago"]], 
                               df_predict_2012["alphabetical_state_results_2012"])

Finally, we run our classifier on our testing set, the states in the 2016 election.

In [None]:
def eval_classifier(svc, testing_features, testing_labels):
    predicted_labels = svc.predict(testing_features)
    return float(np.sum(predicted_labels == testing_labels)) / testing_features.shape[0]

accuracy = eval_classifier(trained_svc, df_predict_2016[["blue_collar_change_last_8_years", 
                                                         "total_employment_change_last_8_years", 
                                                         "proportion_blue_4_years_ago"]], 
                           df_predict_2016["alphabetical_state_results_2016"])
print(accuracy)
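
The accuracy computed here is simply the fraction of states whose predicted label matches the actual result. A plain-Python sketch of the same calculation on made-up labels (`fraction_correct` is an illustrative name):

```python
def fraction_correct(predicted, actual):
    """Share of positions where the predicted label equals the actual one."""
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / float(len(actual))

# Hypothetical predictions for four states
predicted = [1, 0, 1, 1]
actual = [1, 0, 0, 1]
score = fraction_correct(predicted, actual)
```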

---

## Scraping Labor Statistics

* getting labor statistics from every state, by employment industry, from 2004 to 2016
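
As used in the cells below, a series ID for the state and area employment database (`SM`) is a set of fixed-width fields joined in a fixed order. A sketch of that concatenation with one example value per field (the values are placeholders chosen to have the right widths):

```python
# Field order matters: the ID is positional, so we keep an ordered list
fields = [
    ('prefix', 'SM'),
    ('seasonal_adjustment', 'S'),
    ('state_code', '01'),
    ('area_code', '00000'),
    ('supersector', '00'),
    ('industry', '000000'),
    ('datatype', '01'),
]

series_id = ''.join(value for _, value in fields)
id_length = len(series_id)  # 2 + 1 + 2 + 5 + 2 + 6 + 2 characters
```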

In [None]:
import json, string
import requests, tqdm

In [None]:
def opand(x,y): 
    return x and y

def flatten(xss):
    return [x for xs in xss for x in xs]


import itertools

def nest(keys, d_keys, d):
    """Builds one dict (keyed by `keys`) per combination of the values in d[d_key]."""
    assert (len(keys) == len(d_keys))
    value_lists = [d[d_key] for d_key in d_keys]
    return [dict(zip(keys, combo)) for combo in itertools.product(*value_lists)]


# 500 requests per day
# 50 series per request
# 20 years per request
def request(series):
    """
    brief:
        - requests series in batches of 50, returns list of json_data, one for each request
    args:
        - series : list of bls series id's
    """
    registrationKeys = ["bc2b9775e9794f37a23c0f6b2a4659b1", "", ""]
    json_data = []
    for i in tqdm.tqdm(range(0,len(series),50)):
        headers = {'Content-type': 'application/json'}
        data = json.dumps({"seriesid": series[i:i+50],
                           "startyear":"2004", 
                           "endyear":"2016", 
                           "registrationKey":"bc2b9775e9794f37a23c0f6b2a4659b1"})
        response = requests.post('http://api.bls.gov/publicAPI/v2/timeseries/data/', data=data, headers=headers)

        json_data.append(json.loads(response.text))
    return json_data #TODO: recombine json_data 


def format(config):
    """
    brief:
        - formats a request for data from the bls state and area employment, hours, and earnings database
            if config['prefix'] == 'SM'
        - formats a request for data from the bls local area unemployment statistics database
            if config['prefix'] == 'LA'
    args:
        - config : {                                       # http://www.bls.gov/help/hlpforma.htm#SM
                    'prefix' : 'SM',
                    'seasonal_adjustment' : 'S' or 'U',
                    'state_code' : '01' to '50', 
                    'area_code' : '00000' to '99999',
                    'supersector' : '00' to '99', 
                    'industry' : '000000' to '999999',     
                    'datatype' : '00' to '99'              
                    }
                or {
                    'prefix' : 'LA',                       # http://www.bls.gov/help/hlpforma.htm#LA
                    'seasonal_adjustment' : 'S' or 'U',
                    'state_area_code' : 15 characters, e.g. 'ST' + state_code + '00000000000',
                    'measure': '03' to '06'
                    }
    """
    def valid(config):
        if 'prefix' in config:
            if 'seasonal_adjustment' in config and config['seasonal_adjustment'] in ['S', 'U']:
                if config['prefix'] == 'SM':
                    return ('state_code' in config and len(config['state_code']) == 2
                        and 'area_code' in config and len(config['area_code']) == 5
                        and 'supersector' in config and len(config['supersector']) == 2
                        and 'industry' in config and len(config['industry']) == 6
                        and 'datatype' in config and len(config['datatype']) == 2)

                elif config['prefix'] == 'LA':
                    return ('state_area_code' in config and len(config['state_area_code']) == 15
                        and 'measure' in config and config['measure'] in ['03','04','05','06'])
        return False
    if valid(config):
        # state and area employment, hours, and earnings
        if config['prefix'] == 'SM':
            # not using d.keys() here because keys must be in specific order
            keys = ['prefix','seasonal_adjustment','state_code','area_code','supersector','industry','datatype']
            series_id = "".join([config[key] for key in keys])
        # local area unemployment statistics
        elif config['prefix'] == 'LA':
            # not using d.keys() here because keys must be in specific order
            keys = ['prefix', 'seasonal_adjustment', 'state_area_code', 'measure']
            series_id = "".join([config[key] for key in keys])
        else:
            raise Exception("KeyError: unsupported prefix")
        return series_id
    else:
        raise Exception("KeyError: invalid parameters")


def series(config):
    """
    brief:
        - provided a dictionary of bls series id parameters,
          construct and request each possible series_id given parameters
    args:
        - config : {
                    'prefix' : ['SM'],
                    'seasonal_adjustments' : ['S' or 'U'],
                    'state_codes' : [string],
                    'area_codes' : [string],
                    'supersectors' : [string],
                    'industries' : [string],
                    'datatypes' : [string]
                    }
                or {
                    'prefix' : ['LA'],
                    'seasonal_adjustments' : ['S' or 'U'],
                    'state_area_codes' : [15-character strings, e.g. 'STxx00000000000' 
                                          or 'MTxxYYYYYYYYYYY'],
                    'measures': [string]
                   }
    """
    def valid(config):
        if 'prefix' in config and config['prefix'] in [['SM'], ['LA']]:
            if ('seasonal_adjustments' in config 
                and config['seasonal_adjustments'] in [['S'],['U'],['S','U'],['U','S']]):
                if config['prefix'] == ['SM']:
                    # all(...) checks every entry without needing reduce/opand
                    return ('state_codes' in config 
                            and all(0 < int(state_code) <= 50 and len(state_code) == 2 
                                    for state_code in config['state_codes'])
                        and 'area_codes' in config 
                            and all(len(area_code) == 5 for area_code in config['area_codes'])
                        and 'supersectors' in config 
                            and all(len(supersector) == 2 for supersector in config['supersectors'])
                        and 'industries' in config
                            and all(len(industry) == 6 for industry in config['industries'])
                        and 'datatypes' in config)
                elif config['prefix'] == ['LA']:
                    return ('state_area_codes' in config 
                            and all(len(state_area_code) == 15
                                    for state_area_code in config['state_area_codes'])
                        and 'measures' in config) 
        return False
    if valid(config):
        if config['prefix'] == ['SM']:
            keys = ['prefix','seasonal_adjustment','state_code','area_code','supersector','industry','datatype']
            param_keys = ['prefix','seasonal_adjustments','state_codes','area_codes',
                          'supersectors','industries','datatypes']
        elif config['prefix'] == ['LA']:
            keys = ['prefix','seasonal_adjustment','state_area_code','measure']
            param_keys = ['prefix','seasonal_adjustments','state_area_codes','measures']
        else:
            raise Exception("KeyError: unsupported prefix")
        return request([format(sid) for sid in nest(keys, param_keys, config)])
    else:
        raise Exception("KeyError: invalid parameters")
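
To see how many requests a configuration costs against the daily limit noted above, the sketch below expands the same parameter lists used for the state industry query and splits the resulting IDs into batches of 50. It only counts and makes no API calls; the variable names are illustrative.

```python
import itertools

seasonal_adjustments = ['S', 'U']
state_codes = [str(i).zfill(2) for i in range(1, 51)]
supersectors = ['00','05','06','07','08','10','15','20','30','31','32',
                '40','41','42','43','50','55','60','65','70','80','90']
datatypes = ['01', '02', '03', '04']

# One series ID per combination, with fields joined in their fixed order
combos = itertools.product(['SM'], seasonal_adjustments, state_codes,
                           ['00000'], supersectors, ['000000'], datatypes)
series_ids = [''.join(combo) for combo in combos]

# Split into batches of at most 50 IDs, one batch per request
batches = [series_ids[i:i + 50] for i in range(0, len(series_ids), 50)]
num_requests = len(batches)
```

This gives 2 × 50 × 22 × 4 = 8800 series, or 176 requests, so a single full run of this query fits within 500 requests per day, but three runs do not.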





In [None]:
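# NOTE: BLS state codes follow FIPS numbering, which skips some values (e.g. '03')
# and runs past '50', so this range may not line up exactly with the 50 states
state_codes = [str(i).zfill(2) for i in range(1,51)]

with open('metro_codes.txt', 'r') as f:
    metro_codes = [line.split('\t')[1] for line in f.read().split('\n')]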

state_area_codes = {'ST' : dict(zip(state_codes, ['ST' + state_code + '00000000000' for state_code in state_codes])),
                    'MT' : dict(zip(state_codes, metro_codes))
                    }

#print map(lambda mt : (len(mt), mt), metro_codes)

stateIndustry_config = {'prefix' : ['SM'],
                        'seasonal_adjustments' : ['S', 'U'],
                        'state_codes' : state_codes,
                        'area_codes' : ['00000'],
                        'supersectors' : ['00','05','06','07','08','10','15','20','30','31','32',
                                          '40','41','42','43','50','55','60','65','70','80','90'],
                        'industries' : ['000000'],
                        'datatypes' : ['01', '02', '03', '04'] # industry employees in thousands,
                        }                                      # average weekly hours of all employees,
                                                               # average hourly earnings of all employees,
                                                               # average overtime hours of all employees
state_config = {'prefix' : ['LA'],
                'seasonal_adjustments' : ['S', 'U'],
                'state_area_codes' : [state_area_codes['ST'][state_code] for state_code in state_codes],
                'measures' : ['03', '04', '05', '06'] # unemployment_rate, unemployment, employment, labor_force
                }

stateMetro_config = {'prefix' : ['LA'],
                     'seasonal_adjustments' : ['S', 'U'],
                     'state_area_codes' : [state_area_codes['MT'][state_code] for state_code in state_codes],
                     'measures' : ['03', '04', '05', '06'] # unemployment_rate, unemployment, employment, labor_force
                     }

stateIndustry_json = series(stateIndustry_config)
state_json = series(state_config)
stateMetro_json = series(stateMetro_config)

#print stateIndustry_json
# print state_json
# print stateMetro_json

In [None]:
import prettytable

def saveTextFile(json_data):
    for series in json_data['Results']['series']:
        x=prettytable.PrettyTable(["series id","year","period","value","footnotes"])
        seriesId = series['seriesID']
        for item in series['data']:
            year = item['year']
            period = item['period']
            value = item['value']
            footnotes=""
            for footnote in item['footnotes']:
                if footnote:
                    footnotes = footnotes + footnote['text'] + ','
            if 'M01' <= period <= 'M12':
                x.add_row([seriesId,year,period,value,footnotes[0:-1]])
        # Write one file per series; outside the loop only the last series would be saved
        with open('data/' + seriesId + '.txt','w') as output:
            output.write(x.get_string())

# saveTextFile(stateIndustry_json)
# saveTextFile(state_json)
# saveTextFile(stateMetro_json)
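
The check `'M01' <= period <= 'M12'` in `saveTextFile` relies on string comparison to keep only the twelve monthly periods and drop any other period code. A small sketch with made-up data items shaped like the ones read above (the values and the `M13` entry are hypothetical sample data):

```python
# Hypothetical items; only 'year', 'period' and 'value' are used here
sample_items = [
    {'year': '2008', 'period': 'M01', 'value': '100.0'},
    {'year': '2008', 'period': 'M12', 'value': '98.5'},
    {'year': '2008', 'period': 'M13', 'value': '99.1'},
]

# Zero-padded codes compare correctly as strings: 'M01' < 'M02' < ... < 'M12' < 'M13'
monthly = [item for item in sample_items if 'M01' <= item['period'] <= 'M12']
monthly_periods = [item['period'] for item in monthly]
```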

In [None]:
# TODO: pandas dataframes...

# [Jake]: I have a local implementation for converting to pandas dataframes, but it's buggy, 
#         and I can't test it as I exceeded BLS's 500 daily query limit

# can one of you add another registrationKey (in the request function, in the first BLS cell)?
# http://data.bls.gov/registrationEngine/