## U.S. poverty data by age group and race

The data below describe poverty at the U.S. national level, broken down by age group and race. They come from the U.S. Census Bureau's Current Population Survey (CPS).

The Census Bureau's historical poverty thresholds, which roughly translate poverty status into income levels, are available [here](https://www.census.gov/data/tables/time-series/demo/income-poverty/historical-poverty-thresholds.html).

In [9]:
import pandas as pd

file = '../Data/hstpov3.xls'
xls = pd.ExcelFile(file)
df = xls.parse('Clean')

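# Rows with no value in "Under 18 total" are section headers whose "Year"
# cell holds the race label. Carry that label down to the data rows beneath
# it, then drop the header rows themselves.
race = ""
drop_indices = []
for index, row in df.iterrows():
    if pd.isnull(row["Under 18 total"]):
        race = row["Year"]
        drop_indices.append(index)
    df.loc[index, 'Race'] = race
df.drop(drop_indices, inplace=True)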

df['total'] = df['Under 18 total'] + df['18 to 64 total'] + df['65 and older total']
df['below poverty'] = df['Under 18 below poverty'] + df['18 to 64 below poverty'] + df['65 and older below poverty']

df['Under 18 below poverty percentage'] = df['Under 18 below poverty'] / df['Under 18 total'] * 100
df['18 to 64 below poverty percentage'] = df['18 to 64 below poverty'] / df['18 to 64 total'] * 100
df['65 and older below poverty percentage'] = df['65 and older below poverty'] / df['65 and older total'] * 100
df['below poverty percentage'] = df['below poverty'] / df['total'] * 100

df.to_csv("../Data/us_poverty.csv")
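
The row-by-row loop above can also be written with vectorized pandas operations. The sketch below is one possible alternative, not the notebook's own method. It runs on a small made-up frame (`toy`, with invented numbers) and assumes the same layout as the loop does: a header row carries the race label in `Year` and has no value in `Under 18 total`.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the 'Clean' sheet; all numbers are made up.
toy = pd.DataFrame({
    "Year": ["ALL RACES", 2000, 1999, "WHITE", 2000, 1999],
    "Under 18 total": [np.nan, 100.0, 110.0, np.nan, 70.0, 75.0],
})

# Header rows are the ones with no value in 'Under 18 total'.
is_header = toy["Under 18 total"].isna()

# Keep the label only on header rows, then carry it forward to data rows.
toy["Race"] = toy["Year"].where(is_header).ffill()

# Drop the header rows, as the loop does.
toy = toy[~is_header]
print(toy)
```

One difference from the loop: data rows that appear before the first header would get a missing `Race` here instead of an empty string.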

### Checking initial data set

In [3]:
import pandas as pd

In [4]:
deathrate = pd.read_csv('../Data/deathrate.csv')

In [21]:
year = 1999
df_indices = (df['Year'] == year) & (df["Race"] == "ALL RACES")
df.loc[df_indices, 'below poverty']

20    32791.0
Name: below poverty, dtype: float64

In [23]:
deathrate.head(10)

Unnamed: 0.1,Unnamed: 0,Year,County,FIPS,Deathrate,Population,Poverty
0,1,1999,"Abbeville County, SC",45001,1,25921,3257.0
1,2,1999,"Acadia Parish, LA",22001,7,58762,12461.0
2,3,1999,"Accomack County, VA",51001,5,37614,6107.0
3,4,1999,"Ada County, ID",16001,7,294292,24964.0
4,5,1999,"Adair County, IA",19001,1,8298,697.0
5,6,1999,"Adair County, KY",21001,5,17054,3656.0
6,7,1999,"Adair County, MO",29001,3,24961,3284.0
7,8,1999,"Adair County, OK",40001,3,20904,4385.0
8,9,1999,"Adams County, CO",8001,9,354146,32040.0
9,10,1999,"Adams County, IA",19003,1,4498,510.0
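
The `Unnamed: 0` and `Unnamed: 0.1` columns in the output above are what pandas typically produces when a frame is written to CSV with its index and then read back without `index_col`. Whether that is how `deathrate.csv` was produced is an assumption. The sketch below reproduces the effect with an in-memory buffer and a made-up frame (`frame` and its values are hypothetical), and shows that `index=False` avoids it.

```python
import io
import pandas as pd

# Made-up frame standing in for any table saved to CSV.
frame = pd.DataFrame({"Year": [1999, 2000], "Poverty": [10.0, 20.0]})

# The default to_csv call writes the index as an unnamed first column.
with_index = io.StringIO()
frame.to_csv(with_index)
with_index.seek(0)
reread = pd.read_csv(with_index)

# index=False leaves the index out, so the columns survive the round trip.
without_index = io.StringIO()
frame.to_csv(without_index, index=False)
without_index.seek(0)
clean = pd.read_csv(without_index)

print(list(reread.columns))
print(list(clean.columns))
```

Each further save-and-reload cycle with the default settings adds another such column, which would be consistent with seeing two of them here.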


In [31]:
deathrate.isna()

Unnamed: 0.1,Unnamed: 0,Year,County,FIPS,Deathrate,Population,Poverty
0,False,False,False,False,False,False,False
1,False,False,False,False,False,False,False
2,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False
5,False,False,False,False,False,False,False
6,False,False,False,False,False,False,False
7,False,False,False,False,False,False,False
8,False,False,False,False,False,False,False
9,False,False,False,False,False,False,False


In [27]:
deathrate_indices = deathrate["Year"] == year
sum(deathrate[deathrate_indices]["Poverty"])

nan
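
The `nan` result is consistent with at least one missing `Poverty` value among the rows for that year: Python's built-in `sum` propagates NaN, while `Series.sum()` skips missing values by default. The sketch below shows the difference on a made-up series (`poverty` is a hypothetical stand-in for `deathrate[deathrate_indices]["Poverty"]`) and uses `isna().sum()` to count the missing entries, which is easier to scan than a full boolean frame.

```python
import math
import numpy as np
import pandas as pd

# Made-up values standing in for one year of county poverty counts.
poverty = pd.Series([10.0, np.nan, 20.0])

builtin_total = sum(poverty)      # NaN propagates through plain addition
pandas_total = poverty.sum()      # skipna=True by default, so NaN is ignored
n_missing = poverty.isna().sum()  # number of missing entries

print(math.isnan(builtin_total), pandas_total, n_missing)
```

Skipping missing values gives a usable total, but it undercounts if the counties with missing data have nonzero poverty counts, so the number of missing entries is worth checking before comparing against the national figure.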