# In case you skipped the README (not that anyone ever would), the goal of this project is to examine the impact of Covid on a variety of US economic data, such as unemployment, cost of goods, and wages, and to identify which demographics were impacted the most. Data is gathered from the U.S. Bureau of Labor Statistics.

## Key Questions:
<ol> 
    <li> Unemployment </li>
        <ul>
            <li>How much did unemployment increase at its peak?</li>
            <li>Did unemployment recover? If so, how long did it take?</li>
            <li>What age groups were most impacted by the pandemic?</li>
            <li>Which states were impacted most by the pandemic?</li>
        </ul>
    </br>
    <li>Cost of Goods vs. Wages</li>
        <ul>
            <li>Did goods increase in price over the pandemic? If so, by how much?</li>
            <li>How did consumer wages compare to the price changes from above?</li>
            <li>Can we learn anything about the cost of goods to producers? Compared with product price increases, did producers come out at a net gain or a net loss?</li>
        </ul>
     </br>
</ol>

In [184]:
import pandas as pd
import seaborn as sns
import plotly.express as px
from sklearn.linear_model import LinearRegression
import plotly.graph_objects as go

pd.options.display.float_format = '{:,.2f}'.format

## Unemployment Review

In [136]:
# data from bls.gov
civ_unemployment = pd.read_csv("civ_unemployment.csv")
# state_unemployment = pd.read_csv("state_unemployment.csv")
# employment_by_demographic = pd.read_csv("employment_by_demographic.csv")
unemployed_per_job_opening = pd.read_csv("unemployed_per_job_opening.csv")
reasons_for_unemployment = pd.read_csv("reasons_for_unemployment.csv")

In [137]:
# store constants
covid_start = pd.to_datetime('2020-02-01')

In [138]:
# remove NA and unused columns
civ_unemployment.drop(['Unnamed: 9', 'White', 'Black or African American', 'Asian', 'Hispanic or Latino'], axis=1, inplace=True)
unemployed_per_job_opening.drop('Unnamed: 2', axis=1, inplace=True)
reasons_for_unemployment.drop('Unnamed: 7', axis=1, inplace=True)
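The hard-coded `Unnamed: N` names above depend on how many columns each file has. A minimal sketch of a more generic cleanup follows, assuming the stray columns come from a trailing comma on each row of the CSV export (an assumption; the source files are not shown). The helper `drop_unnamed` and the inline sample data are hypothetical, and the helper does not remove the race columns that are dropped by name above.

```python
import io
import pandas as pd

# Hypothetical stand-in for an export with a trailing comma on every row,
# which pandas reads as an extra, empty "Unnamed: 2" column.
raw = io.StringIO(
    "Date,Total,\n"
    "2020-01-01,4.0,\n"
    "2020-02-01,4.0,\n"
)

def drop_unnamed(df):
    """Return a copy of df without columns whose name starts with 'Unnamed'."""
    keep = ~df.columns.str.startswith("Unnamed")
    return df.loc[:, keep].copy()

# parse_dates converts the Date column while reading, so no later to_datetime is needed
sample = drop_unnamed(pd.read_csv(raw, parse_dates=["Date"]))
print(sample.columns.tolist())
```

Returning a copy instead of using `inplace=True` keeps the raw frame available if a dropped column is needed later.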

In [139]:
# select the series to evaluate and add 12-month rolling averages as trend lines
# focus is on the 20-and-over groups, which likely make up most of the workforce; the 16-19 series is left off to avoid skewing the chart. Race is ignored for now
groups = ['Total', 'Men, 20 years and over', 'Women, 20 years and over']
civ_unemployment[['Total_rolling12', 'MenOver20_rolling12', 'WomenOver20_rolling12']] = civ_unemployment[groups].rolling(12).mean()
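A short sketch of how `rolling(12).mean()` behaves, using made-up values instead of the BLS data. The first 11 rows have no full 12-month window and come back as NaN, which is why the rolling columns have fewer non-null rows than the raw series. Each rolling value also averages the current month with the 11 before it, so for most of the first year after `covid_start` the trend line still includes pre-covid months and lags the raw series.

```python
import pandas as pd

# 14 made-up monthly values: 1.0, 2.0, ..., 14.0
s = pd.Series(range(1, 15), dtype=float)

rolled = s.rolling(12).mean()

# rows 0-10 lack a full window and are NaN
print(int(rolled.isna().sum()))
# row 11 is the mean of the first 12 values
print(rolled.iloc[11])
```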


In [140]:
# turn dates into datetime objects for future use
civ_unemployment['Date'] = pd.to_datetime(civ_unemployment.Date)
unemployed_per_job_opening['Date'] = pd.to_datetime(unemployed_per_job_opening.Date)
reasons_for_unemployment['Date'] = pd.to_datetime(reasons_for_unemployment.Date)

In [141]:
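# pre-covid summary statistics minus post-covid summary statistics
# (negative values mean the post-covid figure is higher; the covid_start month itself is in neither slice)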
civ_unemployment_deltas = civ_unemployment[civ_unemployment.Date < covid_start].describe() - civ_unemployment[civ_unemployment.Date > covid_start].describe()
civ_unemployment_deltas

Unnamed: 0,Total,"Men, 20 years and over","Women, 20 years and over",16 to 19 years old,Total_rolling12,MenOver20_rolling12,WomenOver20_rolling12
count,22.0,22.0,22.0,22.0,11.0,11.0,11.0
mean,-1.13,-1.16,-1.29,1.21,-1.16,-1.17,-1.31
std,-2.12,-1.84,-2.41,-3.33,-1.19,-1.08,-1.32
min,0.1,0.1,0.0,2.8,0.09,0.03,0.07
25%,0.2,0.2,0.2,2.35,0.05,-0.06,0.01
50%,-0.1,-0.2,-0.1,2.8,-1.04,-1.08,-1.05
75%,-1.35,-1.45,-1.45,2.0,-2.06,-1.98,-2.35
max,-9.0,-7.7,-10.4,-14.2,-3.42,-3.18,-3.87
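Because the table above is pre-covid minus post-covid, increases show up as negative numbers. As a sketch under assumptions, the same comparison can be written as post minus pre so that increases are positive, with the boundary month assigned to the post-covid slice. The values below are invented for illustration and are not BLS figures.

```python
import pandas as pd

# Made-up monthly rates, not BLS figures
toy = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01"]),
    "Total": [4.0, 4.0, 5.0, 15.0],
})
split = pd.Timestamp("2020-02-01")
stats = ["mean", "max"]

pre = toy.loc[toy.Date < split, "Total"].agg(stats)
post = toy.loc[toy.Date >= split, "Total"].agg(stats)  # >= keeps the boundary month

# positive values mean the statistic rose after the split date
change = post - pre
print(change)
```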


In [None]:
# find first instance that the rolling 12-month total UE returns to the pre-covid mean
diff = civ_unemployment[civ_unemployment.Date < covid_start].Total.mean() - civ_unemployment.Total_rolling12
diff_df = pd.DataFrame(data={'Date': civ_unemployment.Date, 'Difference': diff})
fpv = diff_df.loc[diff_df.Difference >= 0]
fpv[fpv.Date > covid_start]

In [168]:
# graph of UE over time for each key age group
civ_ue_line = px.line(civ_unemployment, 
        x='Date', 
        y=groups,)

# adding rolling 12 line
civ_ue_line.add_trace(go.Scatter(x= civ_unemployment.Date,
                                 y=civ_unemployment.Total_rolling12,
                                 mode='lines',
                                 name='Rolling 12 Total UE'))

# average pre-covid UE rate
civ_ue_line.add_hline(civ_unemployment[civ_unemployment.Date < covid_start].Total.mean(),
                     line_dash='dash'
                     )

civ_ue_line.update_layout(yaxis_title = '% Unemployment',
                          )

civ_ue_line.show()

In [143]:
# finding answer to max increase in UE
ue_increase_over_mean = civ_unemployment.Total.max() - civ_unemployment[civ_unemployment.Date < covid_start].Total.mean()
ue_increase_over_mean
# result = 10.298360655737703

10.298360655737703

In [183]:
# find first instance that total UE returns to pre-covid mean
diff = civ_unemployment[civ_unemployment.Date < covid_start].Total.mean() - civ_unemployment.Total
diff_df = pd.DataFrame(data={'Date': civ_unemployment.Date, 'Difference': diff})
fpv = diff_df.loc[diff_df.Difference >= 0]
fpv[fpv.Date > covid_start]
# First real intersection is 2021-11-01; the 2020-03-01 row can be ignored because it comes before the spike and is not indicative of recovery

Unnamed: 0,Date,Difference
62,2020-03-01,0.0
82,2021-11-01,0.2
83,2021-12-01,0.5
84,2022-01-01,0.4
85,2022-02-01,0.6
86,2022-03-01,0.8
87,2022-04-01,0.8
88,2022-05-01,0.8
89,2022-06-01,0.8
90,2022-07-01,0.9
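The manual step of discarding the early 2020 row can be avoided by searching only after the peak. The following is a minimal sketch, not the notebook's method: `first_recovery_date` is a hypothetical helper and the toy values are invented, not taken from the BLS data.

```python
import pandas as pd

def first_recovery_date(df, value_col, baseline, date_col="Date"):
    """Date of the first row after the peak where value_col is at or below baseline.

    Returns None if the series never gets back to the baseline.
    """
    peak_pos = df.index.get_loc(df[value_col].idxmax())
    after_peak = df.iloc[peak_pos + 1:]
    recovered = after_peak[after_peak[value_col] <= baseline]
    return None if recovered.empty else recovered[date_col].iloc[0]

# Made-up series: flat, spike, partial recovery, then below baseline
toy = pd.DataFrame({
    "Date": pd.date_range("2020-01-01", periods=6, freq="MS"),
    "Total": [4.0, 4.0, 4.0, 15.0, 8.0, 3.9],
})
recovery = first_recovery_date(toy, "Total", baseline=4.0)
print(recovery)
```

With the real data, the baseline would be the pre-covid mean computed in the cell above.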


In [179]:
# UE for each group relative to that group's own pre-covid mean (in percentage points)
civ_unemployment['additional_ue_men'] = civ_unemployment['Men, 20 years and over'] - civ_unemployment[civ_unemployment.Date < covid_start]['Men, 20 years and over'].mean()
civ_unemployment['additional_ue_women'] = civ_unemployment['Women, 20 years and over'] - civ_unemployment[civ_unemployment.Date < covid_start]['Women, 20 years and over'].mean()
civ_unemployment['additional_ue_children'] = civ_unemployment['16 to 19 years old'] - civ_unemployment[civ_unemployment.Date < covid_start]['16 to 19 years old'].mean()
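The three assignments above repeat the same pattern, so they could be folded into a loop. This is a sketch under assumptions: `add_excess_columns` is a hypothetical helper, and the toy frame uses invented values and shortened column names.

```python
import pandas as pd

def add_excess_columns(df, mapping, split, date_col="Date"):
    """Add one column per mapping entry: the source series minus its pre-split mean."""
    baseline = df.loc[df[date_col] < split, list(mapping)].mean()
    out = df.copy()
    for src, dest in mapping.items():
        out[dest] = out[src] - baseline[src]
    return out

# Made-up values; the first two rows fall before the split date
toy = pd.DataFrame({
    "Date": pd.date_range("2019-12-01", periods=4, freq="MS"),
    "Men": [3.0, 5.0, 9.0, 4.0],
    "Women": [2.0, 4.0, 12.0, 3.0],
})
result = add_excess_columns(
    toy,
    {"Men": "additional_ue_men", "Women": "additional_ue_women"},
    split=pd.Timestamp("2020-02-01"),
)
print(result[["additional_ue_men", "additional_ue_women"]])
```

With the notebook's data, the mapping keys would be the full BLS column names, such as `'Men, 20 years and over'`.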

In [180]:
print(civ_unemployment.additional_ue_men.describe())
print(civ_unemployment.additional_ue_women.describe())
print(civ_unemployment.additional_ue_children.describe())

count   101.00
mean      0.44
std       1.68
min      -0.96
25%      -0.66
50%      -0.16
75%       0.64
max       8.94
Name: additional_ue_men, dtype: float64
count   101.00
mean      0.49
std       1.99
min      -0.89
25%      -0.59
50%      -0.09
75%       0.61
max      11.51
Name: additional_ue_women, dtype: float64
count   101.00
mean     -0.50
std       3.51
min      -5.21
25%      -2.81
50%      -1.21
75%       1.39
max      18.29
Name: additional_ue_children, dtype: float64


In [185]:
# graph of UE increase over mean for each category
ue_increase_groups = ['additional_ue_men', 'additional_ue_women', 'additional_ue_children']
ue_increase_line = px.line(civ_unemployment, 
        x='Date', 
        y=ue_increase_groups,)

ue_increase_line.add_hline(0,
                     line_dash='dash'
                     )

ue_increase_line.update_layout(yaxis_title = '% Unemployment above mean',
                          )

ue_increase_line.show()

## General Unemployment Analysis
<ul>
        <li>How much did unemployment increase at its peak?</li>
          <p>  a) Peak unemployment was ~10.3 percentage points above the pre-covid average unemployment rate. That puts the peak at more than 3x the pre-covid average, a very sizeable shift in total UE. </p> </br>
        <li>Did unemployment recover? If so, how long did it take?</li>
    <p> a) Unemployment did ultimately recover. Using the pre-covid UE mean as the baseline, the data shows total UE first returned to that level in November 2021, about 1 year 9 months after the February 2020 start date used in this notebook. Note that 'recovery' can be measured in many ways, and lower UE is not always a sign of a healthy labor market. </p></br>
        <li>What age groups were most impacted by the pandemic?</li>
    <p> a) Men and women aged 20 and over were impacted similarly across the period, with the average increase in UE slightly larger for women (0.49 vs. 0.44 percentage points above their pre-covid means). For all groups, UE is now lower than pre-covid. It is notable that teens aged 16-19 had the largest peak increase in unemployment (~18.3 percentage points) but saw lower unemployment on average, as indicated by the negative mean (-0.50). One possible explanation for this oddity is that more teens sought employment during covid as parents faced high UE (leading to a larger percentage spike in youth UE), while businesses looked to save on wages by hiring less experienced workers (resulting in a faster decline in UE and more employment on average than pre-covid). </p></br>
</ul>
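Two pieces of arithmetic behind the answers above can be checked with a short sketch. The recovery duration depends on which month counts as the start: counting from the `covid_start` date of 2020-02-01 gives 21 months to 2021-11-01, while counting from 2020-03-01 gives 20. The multiple of the baseline follows from a baseline rate plus an increase in percentage points. Both helpers are hypothetical, and the rates passed to `peak_ratio` are round illustrative numbers, not the notebook's values.

```python
import pandas as pd

def months_between(start, end):
    """Whole calendar months from start to end (day of month ignored)."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    return (end.year - start.year) * 12 + (end.month - start.month)

def peak_ratio(baseline_rate, increase_pp):
    """Peak rate as a multiple of the baseline, given an increase in percentage points."""
    return (baseline_rate + increase_pp) / baseline_rate

print(months_between("2020-02-01", "2021-11-01"))  # 21
print(months_between("2020-03-01", "2021-11-01"))  # 20

# Illustrative only: a 4.0% baseline plus 10.0 points is 3.5x the baseline
print(peak_ratio(4.0, 10.0))  # 3.5
```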

In [186]:
reasons_for_unemployment

Unnamed: 0,Date,Job losers and persons who completed temporary jobs,Job losers not on temporary layoff,Job losers on temporary layoff,Job leavers,Reentrants,New entrants
0,2015-01-01,4219000,3308000,911000,868000,2747000,1009000
1,2015-02-01,4174000,3158000,1015000,900000,2630000,935000
2,2015-03-01,4150000,3134000,1015000,864000,2635000,812000
3,2015-04-01,4159000,3153000,1006000,841000,2676000,873000
4,2015-05-01,4403000,3274000,1129000,819000,2608000,978000
...,...,...,...,...,...,...,...
96,2023-01-01,2529000,1795000,734000,884000,1817000,531000
97,2023-02-01,2752000,1935000,816000,891000,1847000,515000
98,2023-03-01,2949000,2117000,833000,845000,1665000,492000
99,2023-04-01,2642000,1933000,709000,790000,1761000,531000
