# Master's Thesis - James Quacinella


# Abstract

**Objectives:** This study will extend an established model for estimating the current living wage in 2015 to the past decade for the purpose of:

* an exploratory analysis of trends in the gap between the estimated living wage and the minimum wage
* evaluating any correlation between the living wage gap and other economic metrics, including public funds spent on social services

**Methods:** The original data set for this model is for 2015. This study will extend the data sources of this model into the past to enable trend analysis. Data for economic metrics from public data sources will supplement this data for correlation analysis.


# Methods

## Model

The original model estimates the living wage in terms of 9 variables:

***basic_needs_budget*** = *food_cost* + *child_care_cost* + ( *insurance_premiums* + *health_care_costs* ) + *housing_cost* + *transportation_cost* + *other_necessities_cost*

***living_wage*** = *basic_needs_budget* + ( *basic_needs_budget* \* *tax_rate* )
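
The two formulas above can be sketched directly in Python. This is a minimal illustration, not the original model's code: the function names are hypothetical, and the input values are placeholders chosen for easy arithmetic rather than figures from any of the data sources.

```python
def basic_needs_budget(food_cost, child_care_cost, insurance_premiums,
                       health_care_costs, housing_cost, transportation_cost,
                       other_necessities_cost):
    """Sum the cost components of the basic needs budget."""
    return (food_cost + child_care_cost
            + (insurance_premiums + health_care_costs)
            + housing_cost + transportation_cost + other_necessities_cost)


def living_wage(budget, tax_rate):
    """Add taxes, expressed as a fraction of the budget, on top of the budget."""
    return budget + (budget * tax_rate)


# Hypothetical placeholder inputs (annual dollars), NOT values from the data sources
example_budget = basic_needs_budget(
    food_cost=3000.0, child_care_cost=0.0, insurance_premiums=1500.0,
    health_care_costs=500.0, housing_cost=9000.0,
    transportation_cost=4000.0, other_necessities_cost=2000.0)
example_wage = living_wage(example_budget, tax_rate=0.15)
```

Keeping the budget and the tax step as separate functions mirrors the two-line structure of the model, so each cost component can later be swapped for a year-specific estimate.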

## Data Sources

The following data sources are used to find estimates of the model variables:

* The food cost is estimated from data from the USDA’s low-cost food plan national average in June 2014. [2]
* Child care is based on state-level estimates published by the National Association of Child Care Resource and Referral Agencies. [3]
* Insurance costs are based on the insurance component of the 2013 Medical Expenditure Panel Survey. [4]
* Housing costs are estimated from the HUD Fair Market Rents (FMR) estimates.
* Other variables are pulled from the 2014 Bureau of Labor Statistics Consumer Expenditure Survey. [5]

These data sets extend into the past, which allows the model to be calculated for earlier years. The data will also have to be adjusted for inflation [6].

## Analytic Approach

First, data will be gathered from the data sources of the original model but will be extended into the past. The methodology followed by the model will be replicated to come up with a data set representing estimates of the living wage across time. After the data set is prepared, the trend of the living wage as compared to minimum wage can be examined. Has the gap increased or decreased over time, and at what rate? Have certain areas seen larger than average increases or decreases in this gap? 
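
As a rough sketch of the trend questions above, the gap and its average yearly change could be computed as follows. All numbers here are hypothetical placeholders (assumed to be hourly dollars), not results of this study, and the variable names are illustrative only.

```python
import numpy as np

# Hypothetical placeholder values (hourly dollars), NOT results from this study
years = np.array([2010, 2011, 2012, 2013, 2014])
living_wage_by_year = np.array([10.00, 10.25, 10.50, 10.75, 11.00])
minimum_wage_by_year = np.array([7.25, 7.25, 7.25, 7.25, 7.25])

# Gap between the estimated living wage and the minimum wage for each year
wage_gap = living_wage_by_year - minimum_wage_by_year

# Fit a least-squares line to the gap; the slope is the average change in
# the gap per year. Years are shifted to start at 0 to keep the fit well scaled.
slope, intercept = np.polyfit(years - years[0], wage_gap, 1)
```

A positive slope would indicate a widening gap. Repeating the fit per county or region would show which areas changed faster than average.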

Once preliminary trend analysis is done, this data set will be analyzed in comparison to other economic trends to see if any interesting correlations can be found. Correlations to GDP growth rate and the national rate of unemployment can be made, but the primary investigation will be to see if the living wage gap correlates to national spending on SNAP (food stamps). In other words, we will see if there is any (potentially time-lagged) relationship between the living wage gap and how much the United States needs to spend to support those who cannot make ends meet. A relationship here can potentially indicate that shrinking this gap could lower public expenditures.
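
One simple way to look for a time-lagged relationship is to correlate the gap in year t with spending in year t + lag. The sketch below assumes a plain Pearson correlation; `lagged_correlation` is a hypothetical helper, and both series are made-up placeholders constructed so that spending follows the gap by exactly one year.

```python
import numpy as np


def lagged_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag], for lag >= 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if lag < 0 or lag >= len(x) - 1:
        raise ValueError("lag must be between 0 and len(x) - 2")
    if lag == 0:
        return np.corrcoef(x, y)[0, 1]
    # Drop the last `lag` points of x and the first `lag` points of y
    return np.corrcoef(x[:-lag], y[lag:])[0, 1]


# Hypothetical placeholder series, NOT real wage gap or SNAP data
gap = [2.0, 2.5, 2.2, 3.0, 3.4, 3.1, 3.8]
snap_spending = [9.0, 14.0, 15.0, 14.4, 16.0, 16.8, 16.2]

correlations_by_lag = {lag: lagged_correlation(gap, snap_spending, lag)
                       for lag in range(0, 3)}
```

Scanning a small range of lags and comparing the coefficients gives a first indication of whether spending tends to follow the gap, although a correlation alone does not establish causation.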


## Presentation Of Results

Results will be presented for both parts of the data analysis. For studying the living wage gap trends, this report will present graphs of time series, aggregated in different ways, of the living wage as well as the living wage gap. Some of these time series will be presented alongside data on public expenditures on SNAP to visually inspect for correlations.

## Background / Sources

1. Glasmeier AK, Nadeau CA, Schultheis E: LIVING WAGE CALCULATOR User’s Guide / Technical Notes 2014 Update
2. USDA low-cost food plan, June 2014
3. Child Care in America 2014 State fact sheets
4. 2013 Medical Expenditure Panel Survey
5. Consumer Expenditure Survey
6. Inflation Calculator

---------------------------

# Pre-Data Collection

Let's do all of our imports now:

In [29]:
import numpy as np
from prettytable import PrettyTable
from IPython.core.display import HTML
from collections import OrderedDict

Let's set up some inflation multipliers:

In [3]:
# Multiply a dollar value to get the equivalent 2014 dollars
inflation_multipliers = {
    2010: 1.092609, 
    2011: 1.059176,
    2012: 1.037701,
    2013: 1.022721
}
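
A minimal sketch of how these multipliers might be applied is shown below. The helper `to_2014_dollars` is hypothetical and not part of the original notebook; the dictionary is repeated only so the example runs on its own, and treating 2014 as the base year (multiplier of 1) is an assumption drawn from the comment above.

```python
# Repeated from the cell above so that this sketch is self-contained
inflation_multipliers = {
    2010: 1.092609,
    2011: 1.059176,
    2012: 1.037701,
    2013: 1.022721,
}


def to_2014_dollars(amount, year, multipliers=inflation_multipliers):
    """Convert a nominal dollar amount from `year` into 2014 dollars."""
    if year == 2014:
        return amount  # assumed base year: already in 2014 dollars
    if year not in multipliers:
        raise KeyError("no inflation multiplier defined for %d" % year)
    return amount * multipliers[year]


# Example: 100 dollars in 2010 expressed in 2014 dollars
adjusted = to_2014_dollars(100.0, 2010)
```

Raising an error for years without a multiplier avoids silently mixing adjusted and unadjusted values once the data is extended further into the past.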

Let's set up regional differences for the food data:

In [18]:
# Fractional regional adjustments: regional cost = base cost + (adjustment * base cost)
food_regional_multipliers = {
    'East': 0.08,
    'West': 0.11,
    'South': -0.07,
    'Midwest': -0.05,
}

------------------------

# Data Collection

The following sections outline how I gathered the data for the various model parameters, as well as the other data needed to calculate their values. The original model was built for 2014 data, so extending it into the past requires noting any changes in the underlying methodology of each data source.

## County Data

*TODO*

In [23]:
# Counties dict will map county ID to useful information, mostly region
counties = { }

## Food

### Change of Methodology?

In 2006, the USDA changed the age ranges used in its healthy meal cost calculations. The differences in the ranges are minimal and should not affect the overall estimates.

### Load Data

Data for the food calculations were downloaded in PDF form. The calculation follows this description, quoted from the PDF:

>Adult food consumption costs are estimated by averaging the low-cost plan food costs for males and females between 19 and 50

In [32]:
# The base national food cost (not regionally weighted); data pulled manually from PDFs
national_monthly_food_cost_per_year = {
    2015: {"base": np.average([240.90, 208.80])},
    2014: {"base": np.average([241.50, 209.80])},
    2013: {"base": np.average([234.60, 203.70])},
    2012: {"base": np.average([234.00, 203.00])},
    2011: {"base": np.average([226.80, 196.90])},
    2010: {"base": np.average([216.30, 187.70])},
    2009: {"base": np.average([216.50, 187.90])},
    2008: {"base": np.average([216.90, 189.60])},
    2007: {"base": np.average([200.20, 174.10])},
    2006: {"base": np.average([189.70, 164.80])},
    2005: {"base": np.average([186.20, 162.10])},
    2004: {"base": np.average([183.10, 159.50])},
    2003: {"base": np.average([174.20, 151.70])},
    2002: {"base": np.average([170.30, 148.60])},
    2001: {"base": np.average([166.80, 145.60])},
}

# Create ordered dict to make sure we process things in order
national_monthly_food_cost_per_year = OrderedDict(sorted(national_monthly_food_cost_per_year.items(), 
                                                        key=lambda t: t[0]))

# Regionally adjusted
for year, costs in national_monthly_food_cost_per_year.items():
    base = costs["base"]
    costs["regional"] = { }
    for region, multiplier in food_regional_multipliers.items():
        # Regional cost = base cost plus the fractional regional adjustment
        costs["regional"][region] = base + (multiplier * base)

# national_monthly_food_cost_per_year

# TODO: inflation adjusted

# Print it nicely
pt = PrettyTable()
pt.add_column("Year", list(national_monthly_food_cost_per_year.keys()))
pt.add_column("Food Cost (per month)", [x["base"] for x in national_monthly_food_cost_per_year.values()])
for region in food_regional_multipliers:
    pt.add_column("Food Cost (%s)" % region, [x["regional"][region] for x in national_monthly_food_cost_per_year.values()])

# Print as HTML
HTML(pt.get_html_string())

Year,Food Cost (per month),Food Cost (West),Food Cost (East),Food Cost (Midwest),Food Cost (South)
2001,156.2,173.382,168.696,148.39,145.266
2002,159.45,176.9895,172.206,151.4775,148.2885
2003,162.95,180.8745,175.986,154.8025,151.5435
2004,171.3,190.143,185.004,162.735,159.309
2005,174.15,193.3065,188.082,165.4425,161.9595
2006,177.25,196.7475,191.43,168.3875,164.8425
2007,187.15,207.7365,202.122,177.7925,174.0495
2008,203.25,225.6075,219.51,193.0875,189.0225
2009,202.2,224.442,218.376,192.09,188.046
2010,202.0,224.22,218.16,191.9,187.86
