
In [0]:
import pandas as pd
import numpy as np
import os
import matplotlib.pyplot as plt
import altair as alt
import seaborn as sns
sns.set()

In [0]:
# As the coronavirus keeps many of us in quarantine and isolation, jobs considered non-essential are shut down, preventing many people
# from going to work and getting paid. As this continues, the nation is seeing higher-than-usual unemployment claims. This project
# explores the unemployed population of each state, the maximum weekly benefit each state offers, and how the states compare
# to one another.

In [0]:
df = pd.read_csv("State Unemployment Data.csv")
df2 = pd.read_csv("Unemployment Max Pay.csv",encoding='latin1')

In [0]:
df.head()

In [0]:
df2[["State","Max.Weekly Benefits","Max. DA Allowance"]].head()

In [0]:
# Some Max. DA Allowance values are missing, so we fill them with zeros.
# fillna returns a new DataFrame rather than modifying df2 in place, so the result is assigned back.
df2 = df2.fillna(0)

In [14]:
# Estimate the number of unemployed individuals in each state's population
# and add it to the data frame as a new column that was not in the original CSV file.
df["Unemployed"] = df["Pop"] * df["unemploymentRate"]
df.head()

Unnamed: 0,State,unemploymentRateRank,unemploymentRate,Pop,Unemployed
0,South Carolina,1,0.023,5210095,119832.185
1,Utah,1,0.023,3282115,75488.645
2,Vermont,1,0.023,628061,14445.403
3,North Dakota,4,0.024,761723,18281.352
4,Colorado,5,0.025,5845526,146138.15
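
A note on the arithmetic above: the new column multiplies the unemployment rate by the total population. Assuming `unemploymentRate` is the usual labor-force-based rate (unemployed divided by labor force), the result is better read as an upper estimate. The sketch below rechecks the first row and shows how the figure shrinks under a purely hypothetical labor-force share; `labor_force_share` is an assumed value for illustration and does not come from the data.

```python
# Figures for South Carolina, copied from the first row of the output above.
pop = 5_210_095
unemployment_rate = 0.023

# What the notebook computes: the rate applied to the whole population.
unemployed_from_pop = pop * unemployment_rate

# Hypothetical: suppose only half the population is in the labor force.
labor_force_share = 0.5  # assumed value for illustration, not from the data
unemployed_from_labor_force = pop * labor_force_share * unemployment_rate

print(f"rate x population:  {unemployed_from_pop:,.0f}")
print(f"rate x labor force: {unemployed_from_labor_force:,.0f}")
```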


In [0]:
df.tail()

In [0]:
# Now we combine the two data frames side by side to get an easier look at the data.
# Note: pd.concat with axis=1 lines rows up by index position, not by state name.

join = pd.concat([df,df2[["Max.Weekly Benefits"]]], axis = 1)
join.head(50)
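
Because `pd.concat(..., axis=1)` pairs rows by index, the benefits only land next to the right state if both CSV files list the states in the same order. Whether they do is not verified here. A minimal sketch of a key-based alternative using `DataFrame.merge` on the `State` column follows; `rates` and `benefits` are toy stand-ins with made-up values, not the real `df` and `df2`.

```python
import pandas as pd

# Toy stand-ins for df and df2; the values are made up for illustration only.
rates = pd.DataFrame({
    "State": ["Utah", "Vermont", "Colorado"],
    "Unemployed": [100.0, 200.0, 300.0],
})
benefits = pd.DataFrame({
    "State": ["Colorado", "Utah", "Vermont"],  # different row order on purpose
    "Max.Weekly Benefits": ["$30", "$10", "$20"],
})

# concat(axis=1) pairs rows by index position, so states get mismatched here.
by_position = pd.concat([rates, benefits[["Max.Weekly Benefits"]]], axis=1)

# merge pairs rows by the shared "State" key regardless of row order.
# validate="one_to_one" raises if a state appears more than once on either side.
by_key = rates.merge(benefits, on="State", how="left", validate="one_to_one")

print(by_position)
print(by_key)
```

With `how="left"`, any state missing from the benefits table would show up as `NaN` instead of silently borrowing another state's value.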

In [0]:
# To find the states that spend the most money on their citizens, we first strip the "$" and "," characters
# from the Max.Weekly Benefits column and convert it to integers so that we can do arithmetic on it.
# regex=False makes "$" a literal character instead of a regular-expression end-of-string anchor.
join['Max.Weekly Benefits'] = join['Max.Weekly Benefits'].str.replace(',', '', regex=False).str.replace('$', '', regex=False).astype(int)

In [0]:
# With the data frames combined, we can estimate which state pays out the most in unemployment benefits per week,
# assuming every unemployed person receives the state's maximum weekly benefit.
join["Most Money"] = join["Unemployed"] * join["Max.Weekly Benefits"]
join[["State", "Unemployed", "Max.Weekly Benefits", "Most Money"]]

In [19]:
# From this data we can see that California, Texas, and New York pay out the most unemployment to their citizens,
# taking into consideration each state's population and the benefits it offers.
LuckyPeople = join.nlargest(5, "Most Money")
LuckyPeople[["State", "Unemployed", "Max.Weekly Benefits", "Most Money"]]

Unnamed: 0,State,Unemployed,Max.Weekly Benefits,Most Money
35,California,1557562.071,598,931422100.0
26,Texas,1031530.325,487,502355300.0
38,New York,777618.76,561,436244100.0
39,Ohio,493403.148,566,279266200.0
31,North Carolina,392638.894,696,273276700.0


In [0]:
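# LuckyPeople has only five rows, so nsmallest(10, ...) returns all of them sorted by "Most Money" in ascending order.
LuckyPeople1 = LuckyPeople.nsmallest(10, "Most Money")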

In [0]:
plt.figure(figsize=(10, 10))
ax = sns.scatterplot(x="State", y="Most Money", data=LuckyPeople1, s=100)

ax

In [0]:
join.describe()

In [0]:
# Because state-by-state unemployment demographics are unavailable, I will use national
# data to approximate each state's unemployment payout by demographic.

df3 = pd.read_csv("EPI Data Library - Unemployment.csv")


In [4]:
# Since February 2020 is the most recent month, its percentages will be used to estimate unemployment by demographic.
df3.head()

Unnamed: 0,Date,All,Black,Hispanic,White
0,Feb-20,3.60%,6.00%,4.30%,3.00%
1,Jan-20,3.60%,6.00%,4.30%,3.00%
2,Dec-19,3.70%,6.10%,4.30%,3.00%
3,Nov-19,3.70%,6.10%,4.30%,3.10%
4,Oct-19,3.70%,6.20%,4.30%,3.10%
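
The rates in this table are stored as strings such as `6.00%`, so they need converting before they can be used in arithmetic. The sketch below shows one way to do that; the `epi` frame is the first two rows of the output above typed in by hand, and the names `epi`, `rate_cols`, `rates`, and `feb` are illustrative rather than part of the notebook.

```python
import pandas as pd

# First two rows of the EPI table shown above, typed in by hand.
epi = pd.DataFrame({
    "Date": ["Feb-20", "Jan-20"],
    "All": ["3.60%", "3.60%"],
    "Black": ["6.00%", "6.00%"],
    "Hispanic": ["4.30%", "4.30%"],
    "White": ["3.00%", "3.00%"],
})

rate_cols = ["All", "Black", "Hispanic", "White"]

# Strip the trailing "%" and rescale to fractions (for example 6.00% becomes 0.06).
rates = epi[rate_cols].apply(lambda col: col.str.rstrip("%").astype(float) / 100)

# Pick out the February 2020 row.
feb = rates[epi["Date"] == "Feb-20"].iloc[0]
print(feb["Black"], feb["Hispanic"], feb["White"])
```

Reading the rates from the table this way avoids retyping them as constants later on.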


In [38]:
unemployed = join[["State", "Unemployed"]]
unemployed.head()

Unnamed: 0,State,Unemployed
0,South Carolina,119832.185
1,Utah,75488.645
2,Vermont,14445.403
3,North Dakota,18281.352
4,Colorado,146138.15


In [0]:
df4= pd.read_csv("State Demographics.csv", encoding='latin1')
df4.head()
# df4["State", "Hispanic (of any race)", "Non-Hispanic White", "Non-Hispanic Black"]

In [0]:
df4[["State", "Hispanic (of any race)", "Non-Hispanic White", "Non-Hispanic Black"]]

In [65]:
# Using the national unemployment rates published by the Economic Policy Institute for February 2020
# (6.0% Black, 4.3% Hispanic, 3.0% White), we apply each rate to every state's unemployed count
# to produce a rough figure for the Black, Hispanic, and White demographics.
# We will be measuring these figures for the highest unemployment-paying states (North Carolina, Ohio, California, Texas, New York).

unemployed["African American UE"] = unemployed["Unemployed"] * 0.06
unemployed["Hispanic UE"] = unemployed["Unemployed"] * 0.043
unemployed["White UE"] = unemployed["Unemployed"] * 0.03
unemployed

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


Unnamed: 0,State,Unemployed,African American UE,Hispanic UE,White UE
0,South Carolina,119832.185,7189.9311,5152.783955,3594.96555
1,Utah,75488.645,4529.3187,3246.011735,2264.65935
2,Vermont,14445.403,866.72418,621.152329,433.36209
3,North Dakota,18281.352,1096.88112,786.098136,548.44056
4,Colorado,146138.15,8768.289,6283.94045,4384.1445
5,Hawaii,36729.862,2203.79172,1579.384066,1101.89586
6,New Hampshire,35652.396,2139.14376,1533.053028,1069.57188
7,Virginia,224281.382,13456.88292,9644.099426,6728.44146
8,Alabama,127624.146,7657.44876,5487.838278,3828.72438
9,Iowa,82676.074,4960.56444,3555.071182,2480.28222
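
The `SettingWithCopyWarning` printed above arises because `unemployed` was created by selecting columns from `join`, and pandas cannot tell whether the new columns are meant for that selection or for `join` itself. One common way to avoid it is to take an explicit copy before adding columns. The sketch below uses a small hypothetical stand-in named `join_demo`, with values taken from the outputs above.

```python
import pandas as pd

# Hypothetical stand-in for `join`, using two rows from the outputs above.
join_demo = pd.DataFrame({
    "State": ["Utah", "Vermont"],
    "Pop": [3282115, 628061],
    "Unemployed": [75488.645, 14445.403],
})

# .copy() makes an independent frame, so adding columns to it is unambiguous.
unemployed_demo = join_demo[["State", "Unemployed"]].copy()
unemployed_demo["African American UE"] = unemployed_demo["Unemployed"] * 0.06

print(unemployed_demo)
print(list(join_demo.columns))  # the original frame is left untouched
```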
