# Unemployment Analysis

Studying whether the introduction of Uber has led to a reduction in unemployment rate.

Methodology:
1. Analysisng the unemployment rates 3,6, and 12 months after the introduction of Uber.
2. If the unemployment rate decreases by 2%, then we assume that there is a significant effect on the unemployment rate. 

## State-wise Dataframe creation

For each state we create a dataframe that has the unemployment rate at the following time periods:
1. Uber_intro_month: month of introduction
2. Uber_intro_year: year of introduction
3. Uber_unemployment: the unemployment for the month uber was introduced
5. 3_before: 3 months before uber was introduced
6. 6_before: 6 months before uber was introduced
7. 12_before: 12 months before uber was introduced
8. 3_after: 3 months after uber was introduced
9. 6_after: 6 months after uber was introduced
10. 12_after: 12 months after uber was introduced

**Note: NaN values**

We only have data till 2017, thus, for any non existent data, we just fill the unemployment rate with the average unemployment rate for that state

In [1]:
import pandas as pd
import numpy as np
import os.path

uber = pd.read_csv('data/Uber-Introduction-Date.csv')
uber.head()

Unnamed: 0,State,Intro_month,Intro_month_number,Intro_year
0,Alabama,June,6,2018
1,Alaska,June,6,2017
2,Arizona,November,11,2017
3,Arkansas,July,7,2017
4,California,May,5,2010


In [2]:
def find_next_month(month, year, interval):
    newmonth = month + interval
    newyear = year
    if newmonth>12:
        newmonth = newmonth%12
        newyear = newyear+1
    if newmonth==0:
        newmonth =12
    return newmonth,newyear

def find_prev_month(month, year, interval):
    newmonth = month - interval
    newyear = year
    if newmonth <=0:
        if newmonth==0:
            newmonth =12
        else: 
            newmonth = newmonth%12
            newyear = newyear-1
        if newmonth==0:
            newmonth =12
    return newmonth, newyear



In [3]:
unemployment = pd.read_csv('data/StateUnemploymentRates2010.csv')
unemployment.head()

Unnamed: 0,State,Jan.,Feb.,Mar.,April,May,June,July,Aug.,Sept.,Oct.,Nov.,Dec.
0,Alabama,11.1,11.1,11.0,11.0,10.7,10.3,9.7,9.2,8.9,8.9,9.0,9.1
1,Alaska,8.5,8.5,8.5,8.4,8.2,7.9,7.7,7.7,7.8,7.9,8.0,8.1
2,Arizona,9.2,9.5,9.6,9.5,9.6,9.6,9.6,9.7,9.7,9.5,9.4,9.4
3,Arkansas,7.6,7.7,7.8,7.8,7.7,7.5,7.4,7.5,7.7,7.8,7.9,7.9
4,California,12.5,12.5,12.6,12.5,12.4,12.3,12.3,12.4,12.4,12.4,12.4,12.5


In [4]:
def unemployment_stats(state,month,year):
    month_dict = {1:'Jan.', 2:'Feb.', 3: 'Mar.', 4: 'April', 5:'May', 6:'June', 7:'July', 8:'Aug.', 9:'Sept.', 10:'Oct.', 11:'Nov.', 12:'Dec.'}
    file_name = 'data/StateUnemploymentRates'+str(year)+'.csv'
    if not os.path.isfile(file_name):
        return float('NaN')
    else:
        df = pd.read_csv(file_name)
        df = df.loc[df['State']==state]
        return df.iloc[0][month_dict[month]]

In [31]:
#data = [[state,uber_intro_month, uber_intro_year, uber_unemployment, 3_before, 6_before, 12_before, 3_after, 6_after, 12_after]]
data = []
for index,row in uber.iterrows():
    state, uber_month, uber_year = row['State'], row['Intro_month_number'], row['Intro_year']
    if state == "District of Columbia":
        state = "D.C."
    d=[state, uber_month, uber_year]
    d.append(unemployment_stats(state,uber_month, uber_year))
    #base_month , base_year = find_prev_month(uber_month, uber_year, 1)
    #d.append(unemployment_stats(state,base_month,base_year))
    for interval in [3,6,12]:
        month, year = find_prev_month(uber_month, uber_year, interval)
        d.append(unemployment_stats(state,month,year))
    for interval in [3,6,12]:
        month, year = find_next_month(uber_month, uber_year, interval)
        d.append(unemployment_stats(state,month,year))
    data.append(d)

unemployment_data = pd.DataFrame(data, columns = ['State','uber_intro_month', 'uber_intro_year', 'uber_unemployment', '3_before', '6_before', '12_before', '3_after', '6_after', '12_after'])

In [32]:
unemployment_data.head()

Unnamed: 0,State,uber_intro_month,uber_intro_year,uber_unemployment,3_before,6_before,12_before,3_after,6_after,12_after
0,Alabama,6,2018,,,,4.3,,,
1,Alaska,6,2017,7.2,7.1,7.2,6.9,7.2,7.2,
2,Arizona,11,2017,4.7,4.7,4.9,5.2,,,
3,Arkansas,7,2017,3.7,3.6,3.7,3.9,3.7,,
4,California,5,2010,12.4,12.5,12.3,11.5,12.4,12.4,11.7


In [33]:
#fill NaN values with the average unemployment of that state
for index,row in unemployment_data.iterrows():
    unemployment_data.iloc[index] = unemployment_data.iloc[index].fillna(row[['3_before', '6_before', '12_before', '3_after', '6_after', '12_after']].mean())

In [34]:
unemployment_data

Unnamed: 0,State,uber_intro_month,uber_intro_year,uber_unemployment,3_before,6_before,12_before,3_after,6_after,12_after
0,Alabama,6,2018,4.3,4.3,4.3,4.3,4.3,4.3,4.3
1,Alaska,6,2017,7.2,7.1,7.2,6.9,7.2,7.2,7.12
2,Arizona,11,2017,4.7,4.7,4.9,5.2,4.933333,4.933333,4.933333
3,Arkansas,7,2017,3.7,3.6,3.7,3.9,3.7,3.725,3.725
4,California,5,2010,12.4,12.5,12.3,11.5,12.4,12.4,11.7
5,Colorado,5,2014,5.8,6.1,6.5,6.9,5.1,4.1,4.3
6,Connecticut,4,2014,6.9,7.2,7.9,8.0,6.6,6.4,6.3
7,Delaware,9,2014,6.5,6.1,5.9,7.0,5.4,4.6,4.9
8,D.C.,12,2012,8.5,8.7,9.1,8.5,8.5,8.5,8.1
9,Florida,6,2014,6.2,6.3,5.6,7.1,6.1,5.6,5.5


In [35]:
unemployment_data.to_csv("data/unemployment_data.csv", index = False)

## Unemployment Trends

We check whether the unemployment rate has decreased after the intro of uber, in periods of 3, 6, and 12 months. In order to see if Uber has actually caused the decrement in unemployment rates, we compare the trend of unemployment rates before Uber was introduced as well.


In [38]:
#find if unemployment has decreased 

#percentage decrease in data

#if negative, then there has been a decreasing trend before uber was introduced
unemployment_data['3_before_dec'] = (unemployment_data['uber_unemployment']- unemployment_data['3_before'])/ unemployment_data['uber_unemployment']
unemployment_data['6_before_dec'] = (unemployment_data['uber_unemployment']- unemployment_data['6_before'])/ unemployment_data['uber_unemployment']
unemployment_data['12_before_dec'] = (unemployment_data['uber_unemployment']- unemployment_data['12_before'])/ unemployment_data['uber_unemployment']

#if negetive, then unemployment has decreased after uber was introduced
unemployment_data['3_after_dec'] =  (unemployment_data['3_after'] - unemployment_data['uber_unemployment'])/ unemployment_data['uber_unemployment']
unemployment_data['6_after_dec'] =  (unemployment_data['6_after'] - unemployment_data['uber_unemployment'])/ unemployment_data['uber_unemployment']
unemployment_data['12_after_dec'] =  (unemployment_data['12_after'] - unemployment_data['uber_unemployment'])/ unemployment_data['uber_unemployment']

In [40]:
#flags if there has been a decrease in unemployment
#dec_x_after: the unemployment rate has decreased (threshold = 0) x months after the introduction of Uber
#dec_x_before: the unemployment rate of the month when uber was introduces is less that (threshold 0) x months befoe (there has been a declining trend even before uber was introduced)


unemployment_data['dec_3_before'] = np.where((unemployment_data['3_before_dec']<0),1,0)
unemployment_data['dec_6_before'] = np.where((unemployment_data['6_before_dec']<0),1,0)
unemployment_data['dec_12_before'] = np.where((unemployment_data['12_before_dec']<0),1,0)

unemployment_data['dec_3_after'] = np.where((unemployment_data['3_after_dec']<0),1,0)
unemployment_data['dec_6_after'] = np.where((unemployment_data['6_after_dec']<0),1,0)
unemployment_data['dec_12_after'] = np.where((unemployment_data['12_after_dec']<0),1,0)

In [45]:
print(unemployment_data['dec_3_before'].value_counts())
print(unemployment_data['dec_6_before'].value_counts())
print(unemployment_data['dec_12_before'].value_counts())

print(unemployment_data['dec_3_after'].value_counts())
print(unemployment_data['dec_6_after'].value_counts())
print(unemployment_data['dec_12_after'].value_counts())

0    27
1    24
Name: dec_3_before, dtype: int64
1    30
0    21
Name: dec_6_before, dtype: int64
1    35
0    16
Name: dec_12_before, dtype: int64
1    31
0    20
Name: dec_3_after, dtype: int64
1    36
0    15
Name: dec_6_after, dtype: int64
1    37
0    14
Name: dec_12_after, dtype: int64


In [46]:
#final adjusted decrease: has uber led to a decrease in places where there was NOT already a decreasing trend?
unemployment_data['final_reduction_3'] = np.where(unemployment_data['dec_3_after'] ==1, np.where(unemployment_data['dec_3_before'] == 0, 1,0 ),0)
unemployment_data['final_reduction_6'] = np.where(unemployment_data['dec_6_after'] ==1, np.where(unemployment_data['dec_6_before'] == 0, 1,0 ),0)
unemployment_data['final_reduction_12'] = np.where(unemployment_data['dec_12_after'] ==1, np.where(unemployment_data['dec_12_before'] == 0, 1,0 ),0)

print(unemployment_data['final_reduction_3'].value_counts())
print(unemployment_data['final_reduction_6'].value_counts())
print(unemployment_data['final_reduction_12'].value_counts())

0    34
1    17
Name: final_reduction_3, dtype: int64
0    35
1    16
Name: final_reduction_6, dtype: int64
0    40
1    11
Name: final_reduction_12, dtype: int64


In [66]:
#save to another dataframe

data = []


for i in [3,6,12]:
    name1 = 'dec_'+ str(i) + '_before'
    name2 = 'dec_'+ str(i) + '_after'
    name3 = 'final_reduction_' + str(i)
    for name in [name1,name2,name3]:
        x = unemployment_data[name].value_counts(sort=False)
        if name==name1:
            y = "Before Uber"
        elif name ==name2:
            y = "After Uber"
        else:
            y = "Final Calculation"
        d = [y, i, x[0], x[1]]
        data.append(d)

value_counts = pd.DataFrame(data, columns =['Metric', 'Number of months', 'Increase in UER', 'Decrease in UER'])

In [67]:
value_counts

#Before Uber: How many states had an Increase/Decrease in unemployment rate the month uber was introduced compared to "number of months" months before 
#After Uber: How many states had an Increase/Decrease in unemployment rate "number of months" months after Uber was introduced
#Final Calculation: Adjusted Count: How many states had a decrease in UER after Uber was introduced given that the UER was not decreasing before Uber was introduced

Unnamed: 0,Metric,Number of months,Increase in UER,Decrease in UER
0,Before Uber,3,27,24
1,After Uber,3,20,31
2,Final Calculation,3,34,17
3,Before Uber,6,21,30
4,After Uber,6,15,36
5,Final Calculation,6,35,16
6,Before Uber,12,16,35
7,After Uber,12,14,37
8,Final Calculation,12,40,11


In [68]:
value_counts.to_csv("data/unemployment_counts.csv", index = False)