# Child Poverty in the United States: <br>Contributing Factors and Geographical Variations

#### Principal Investigator: Yushi Wu <br> Email: yw2122@nyu.edu

## Introduction

Child poverty refers to children under the age of 18 living in families whose income is below the poverty level. Although the United States is one of the richest countries in the world, it has one of the highest child poverty rates, which is deeply concerning.

This project investigates factors that might affect child poverty in the United States and its geographical variation at the county level. The main factors examined are race and family type (whether the child is raised by two parents or one). The geographical distribution of child poverty rates also seems to correlate with the demographics of different areas. This project will try to visualize that geographical variation and its correlation with one of the most salient demographic characteristics, racial concentration.

The key elements of this project are the Census API, which provides access to data on poverty status along different dimensions (age, family type, race, etc.) at the county level, and matplotlib and geopandas, which are used to plot comparison charts and national maps.

This project has three sections:

- Basic statistics about child poverty in the United States will be presented.

- A detailed examination of two factors, race and family type, for 2016 (the most recent American Community Survey data available from the Census API) will be presented.

- National maps showing the relationship between geographical variations and racial concentration distribution will be presented.

### Requisite Packages

In [None]:
from IPython.display import display, Image # Displays things nicely
import pandas as pd # Data Package 
import matplotlib.pyplot as plt # Graphics
from matplotlib.patches import Ellipse 
import numpy as np # Numerical operations
import os

from census import Census # for grabbing data from Census API
from us import states

import fiona # Needed for geopandas to run
import geopandas as gpd # this is the main geopandas 
from shapely.geometry import Point, Polygon 

## Part 1: Basic Statistics about US Child Poverty
---
In this part, I will access and read data from the Census API on the total population under 18 and the population under 18 in poverty. Then I will calculate the child poverty rate and the young child (under 6 years old) poverty rate from these data.
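To make the two rates concrete, here is a minimal sketch for a single county. The function name `child_poverty_rates`, the bracket labels, and the counts are all made up for illustration, and the sketch assumes the denominator is the sum of the below-poverty and above-poverty counts.

```python
def child_poverty_rates(below, above):
    """Return (child rate, young-child rate) from counts keyed by age bracket."""
    young = "under 6"
    total_below = sum(below.values())
    total_all = total_below + sum(above.values())
    young_all = below[young] + above[young]
    # Guard against empty counties, where the rate is undefined
    child_rate = total_below / total_all if total_all else float("nan")
    young_rate = below[young] / young_all if young_all else float("nan")
    return child_rate, young_rate

# Made-up counts for one hypothetical county
below = {"under 6": 300, "6-11": 250, "12-17": 200}
above = {"under 6": 900, "6-11": 1100, "12-17": 1250}

child_rate, young_rate = child_poverty_rates(below, above)
print(f"child: {child_rate:.1%}, young child: {young_rate:.1%}")
```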

In [None]:
print(avg_b,avg_b_y)

ASIAN ALONE

In [None]:
code = ("NAME",
        "B17020D_011E", # Above Poverty Level, Asian:Under 5 years
        "B17020D_012E", # Above Poverty Level:5
        "B17020D_013E", # Above Poverty Level:6-11
        "B17020D_014E", # Above Poverty Level:12-17
        "B17020D_003E", # Below Poverty Level, Asian:Under 5 years
        "B17020D_004E", # Below Poverty Level:5
        "B17020D_005E", # Below Poverty Level:6-11
        "B17020D_006E", # Below Poverty Level:12-17
        
       )

a = c.acs5.get(code, {'for': 'county:*' }, year=2016)

a = pd.DataFrame(a)

a = a.rename(columns = 
               {"B17020D_011E":"Above Poverty Level, Asian:Under 5 years",
                "B17020D_012E":"Above Poverty Level:5",
                "B17020D_013E":"Above Poverty Level:6-11",
                "B17020D_014E":"Above Poverty Level:12-17",
                "B17020D_003E":"Below Poverty Level, Asian:Under 5 years",
                "B17020D_004E":"Below Poverty Level:5",
                "B17020D_005E":"Below Poverty Level:6-11",
                "B17020D_006E":"Below Poverty Level:12-17",
                })

# Calculating total children (Asian) below poverty line 
a["Total Children in Poverty"]=a["Below Poverty Level, Asian:Under 5 years"]+a["Below Poverty Level:5"]
a["Total Children in Poverty"]=a["Total Children in Poverty"]+a["Below Poverty Level:6-11"]
a["Total Children in Poverty"]=a["Total Children in Poverty"]+a["Below Poverty Level:12-17"]

# Calculating total children (Asian) above poverty line
a["Total Children not in Poverty"]=a["Above Poverty Level, Asian:Under 5 years"]+a["Above Poverty Level:5"]
a["Total Children not in Poverty"]=a["Total Children not in Poverty"]+a["Above Poverty Level:6-11"]
a["Total Children not in Poverty"]=a["Total Children not in Poverty"]+a["Above Poverty Level:12-17"]

# Calculating total population of children (Asian)
a["Total Asian Chidren Population"]=a["Total Children in Poverty"]+a["Total Children not in Poverty"]

# Calculating poverty rate within race group (Asian)
a["Child Poverty Rate(Asian)"] = a["Total Children in Poverty"]/a["Total Asian Chidren Population"]

# Calculating total young children(<= 5 years) (Asian)
a["Total Young Children"] = (a["Below Poverty Level, Asian:Under 5 years"]+a["Below Poverty Level:5"]
                            +a["Above Poverty Level, Asian:Under 5 years"]+a["Above Poverty Level:5"])

# Calculating young children poverty rate (Asian)
a["Young Children Poverty Rate"] =((a["Below Poverty Level, Asian:Under 5 years"]+
                                   a["Below Poverty Level:5"])/a["Total Young Children"])


In [None]:
a.set_index("state", inplace = True)

a.drop(["02","15","72"],inplace = True) # dropping Alaska, Hawaii and Puerto Rico

a.head()

In [None]:
# Calculating average child poverty rate and young child poverty rate
avg_a = (a["Total Children in Poverty"].sum())/(a["Total Asian Chidren Population"].sum())
avg_a_y = ((a["Below Poverty Level, Asian:Under 5 years"].sum() + a["Below Poverty Level:5"].sum())
           /(a["Total Young Children"].sum()))

In [None]:
print(avg_a,avg_a_y)
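Note that the averages above are pooled rates (sum of poor children divided by sum of all children), not the mean of the county-level rates. The two can differ a lot when small counties have extreme rates. A small sketch with made-up numbers for three hypothetical counties (the column names here are illustrative, not Census fields):

```python
import pandas as pd

# Made-up counts for three hypothetical counties
toy = pd.DataFrame({
    "poor_children": [50, 10, 400],
    "all_children": [100, 100, 4000],
})
toy["rate"] = toy["poor_children"] / toy["all_children"]

# Pooled rate: every child counts equally (what the cells above compute)
pooled_rate = toy["poor_children"].sum() / toy["all_children"].sum()

# Unweighted mean: every county counts equally, regardless of its size
mean_of_rates = toy["rate"].mean()

print(pooled_rate, mean_of_rates)
```

The pooled rate is the one that answers "what share of children in this group are poor", which is why it is used for the national figures.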

HISPANIC OR LATINO

In [None]:
code = ("NAME",
        "B17020I_011E", # Above Poverty Level, Hispanic or Latino:Under 5 years
        "B17020I_012E", # Above Poverty Level:5
        "B17020I_013E", # Above Poverty Level:6-11
        "B17020I_014E", # Above Poverty Level:12-17
        "B17020I_003E", # Below Poverty Level, Hispanic or Latino:Under 5 years
        "B17020I_004E", # Below Poverty Level:5
        "B17020I_005E", # Below Poverty Level:6-11
        "B17020I_006E", # Below Poverty Level:12-17
        
       )

hl = c.acs5.get(code, {'for': 'county:*' }, year=2016)

hl = pd.DataFrame(hl)

hl = hl.rename(columns = 
               {"B17020I_011E":"Above Poverty Level, His./Latino:Under 5 years",
                "B17020I_012E":"Above Poverty Level:5",
                "B17020I_013E":"Above Poverty Level:6-11",
                "B17020I_014E":"Above Poverty Level:12-17",
                "B17020I_003E":"Below Poverty Level, His./Latino:Under 5 years",
                "B17020I_004E":"Below Poverty Level:5",
                "B17020I_005E":"Below Poverty Level:6-11",
                "B17020I_006E":"Below Poverty Level:12-17",
                })

# Calculating total children (Hispanic or Latino) below poverty line 
hl["Total Children in Poverty"]=hl["Below Poverty Level, His./Latino:Under 5 years"]+hl["Below Poverty Level:5"]
hl["Total Children in Poverty"]=hl["Total Children in Poverty"]+hl["Below Poverty Level:6-11"]
hl["Total Children in Poverty"]=hl["Total Children in Poverty"]+hl["Below Poverty Level:12-17"]

# Calculating total children (Hispanic or Latino) above poverty line
hl["Total Children not in Poverty"]=hl["Above Poverty Level, His./Latino:Under 5 years"]+hl["Above Poverty Level:5"]
hl["Total Children not in Poverty"]=hl["Total Children not in Poverty"]+hl["Above Poverty Level:6-11"]
hl["Total Children not in Poverty"]=hl["Total Children not in Poverty"]+hl["Above Poverty Level:12-17"]

# Calculating total population of children (Hispanic or Latino)
hl["Total His./Latino Chidren Population"]=hl["Total Children in Poverty"]+hl["Total Children not in Poverty"]

# Calculating poverty rate within race group (Hispanic or Latino)
hl["Child Poverty Rate(Hispanic/Latino)"]=hl["Total Children in Poverty"]/hl["Total His./Latino Chidren Population"]

# Calculating total young children(<= 5 years) (Hispanic or Latino)
hl["Total Young Children"] = (hl["Below Poverty Level, His./Latino:Under 5 years"]+hl["Below Poverty Level:5"]
                             +hl["Above Poverty Level, His./Latino:Under 5 years"]+hl["Above Poverty Level:5"])


# Calculating young children poverty rate (Hispanic or Latino)
hl["Young Children Poverty Rate"] = ((hl["Below Poverty Level, His./Latino:Under 5 years"]+hl["Below Poverty Level:5"])
                                     /hl["Total Young Children"])

In [None]:
hl.set_index("state", inplace = True)

hl.drop(["02","15","72"],inplace = True) # dropping Alaska, Hawaii and Puerto Rico

hl.head()

In [None]:
# Calculating average child poverty rate and young child poverty rate
avg_h = (hl["Total Children in Poverty"].sum())/(hl["Total His./Latino Chidren Population"].sum())
avg_h_y = ((hl["Below Poverty Level, His./Latino:Under 5 years"].sum()+hl["Below Poverty Level:5"].sum())
           /(hl["Total Young Children"].sum()))

In [None]:
print(avg_h,avg_h_y)
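The per-race cells above repeat the same steps: sum the below-poverty and above-poverty columns, drop Alaska, Hawaii and Puerto Rico, and compute pooled rates. As a sketch only, those steps could be folded into one helper. The function name `pooled_child_poverty`, its parameters, and the short column names in the toy data are hypothetical and do not appear elsewhere in this notebook.

```python
import pandas as pd

EXCLUDED_STATES = ["02", "15", "72"]  # FIPS codes: Alaska, Hawaii, Puerto Rico

def pooled_child_poverty(df, below_cols, above_cols, young_below, young_above):
    """Return (child rate, young-child rate) pooled over the retained counties."""
    kept = df[~df["state"].isin(EXCLUDED_STATES)]
    below = kept[below_cols].sum().sum()
    total = below + kept[above_cols].sum().sum()
    young_poor = kept[young_below].sum().sum()
    young_total = young_poor + kept[young_above].sum().sum()
    return below / total, young_poor / young_total

# Toy data with made-up counts; "b_" = below poverty, "a_" = above poverty
toy = pd.DataFrame({
    "state": ["01", "02", "06"],
    "b_u5": [10, 100, 20], "b_5": [2, 100, 3],
    "b_6_11": [8, 100, 12], "b_12_17": [5, 100, 15],
    "a_u5": [40, 0, 60], "a_5": [8, 0, 12],
    "a_6_11": [52, 0, 88], "a_12_17": [75, 0, 90],
})
below_cols = ["b_u5", "b_5", "b_6_11", "b_12_17"]
above_cols = ["a_u5", "a_5", "a_6_11", "a_12_17"]

avg, avg_young = pooled_child_poverty(
    toy, below_cols, above_cols,
    young_below=["b_u5", "b_5"], young_above=["a_u5", "a_5"],
)
print(avg, avg_young)
```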

Then I create a new DataFrame summarizing the average child poverty rates and young child poverty rates for different races and plot a bar chart to visually show the results.
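One possible shape for such a summary table is sketched below. The helper name `build_rate_summary` and the numbers are placeholders for illustration only, not results. In the notebook the values would come from variables such as `avg_a`, `avg_a_y`, `avg_h` and `avg_h_y`.

```python
import pandas as pd

def build_rate_summary(rates):
    """Turn {group: [child_rate, young_child_rate]} into a DataFrame indexed by group."""
    summary = pd.DataFrame.from_dict(
        rates,
        orient="index",
        columns=["Child Poverty Rate", "Young Child Poverty Rate"],
    )
    summary.index.name = "Race"
    return summary

# Placeholder values only; these are NOT the computed rates
placeholder = {
    "Asian": [0.10, 0.11],
    "Hispanic/Latino": [0.30, 0.33],
}
summary = build_rate_summary(placeholder)
print(summary)
```

A table in this shape can be passed straight to `summary.plot(kind="bar", ax=ax)` to get one group of bars per race.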

In [None]:
f_w.set_index("state", inplace = True)

f_w.drop(["02","15","72"],inplace = True) # dropping Alaska, Hawaii and Puerto Rico

In [None]:
f_w_1 = f_w[bp_1].sum()/f_w["Total Married Families with Children"].sum()
f_w_2 = f_w[bp_2].sum()/f_w["Total families with children with male householder only"].sum()
f_w_3 = f_w[bp_3].sum()/f_w["total families with children with female householder only"].sum()

In [None]:
# Black
code = ("NAME",
        "B17010B_004E",#below poverty line:Number of Married-Couple Families with children under18
        "B17010B_011E",#below poverty line:Male householder with children under 18
        "B17010B_017E",#below poverty line:Female householder with children under 18
        "B17010B_024E",#above poverty line:Number of Married-Couple Families with children under18
        "B17010B_031E",#above poverty line:Male householder with children under 18
        "B17010B_037E" #above poverty line:Female householder with children under 18
       )

f_b = c.acs5.get(code, {'for': 'county:*' }, year=2016)
f_b = pd.DataFrame(f_b)

f_b =f_b.rename(columns=
    {"B17010B_004E":"Below Poverty Line:Married-Couple families with children under 18",
    "B17010B_011E":"Below Poverty Line:Male householder with children under 18",
    "B17010B_017E":"Below Poverty Line:Female householder with children under 18",
    "B17010B_024E":"Above Poverty Line: Number of Married-Couple Families with children under 18",
    "B17010B_031E":"Above Poverty Line: Male householder with children under 18",
    "B17010B_037E":"Above Poverty Line: Female householder with children under 18"
    })

f_b["Total Married Families with Children"] = f_b[bp_1] + f_b[ap_1]

f_b["Total families with children with male householder only"] = f_b[bp_2] + f_b[ap_2]

f_b["total families with children with female householder only"] = f_b[bp_3]+ f_b[ap_3]

In [None]:
f_b.set_index("state", inplace = True)

f_b.drop(["02","15","72"],inplace = True) # dropping Alaska, Hawaii and Puerto Rico

In [None]:
f_b_1 = f_b[bp_1].sum()/f_b["Total Married Families with Children"].sum()
f_b_2 = f_b[bp_2].sum()/f_b["Total families with children with male householder only"].sum()
f_b_3 = f_b[bp_3].sum()/f_b["total families with children with female householder only"].sum()

In [None]:
# Asian
code = ("NAME",
        "B17010D_004E",#below poverty line:Number of Married-Couple Families with children under18
        "B17010D_011E",#below poverty line:Male householder with children under 18
        "B17010D_017E",#below poverty line:Female householder with children under 18
        "B17010D_024E",#above poverty line:Number of Married-Couple Families with children under18
        "B17010D_031E",#above poverty line:Male householder with children under 18
        "B17010D_037E" #above poverty line:Female householder with children under 18
       )

f_a = c.acs5.get(code, {'for': 'county:*' }, year=2016)
f_a = pd.DataFrame(f_a)

f_a = f_a.rename(columns = 
    {"B17010D_004E":"Below Poverty Line:Married-Couple families with children under 18",
    "B17010D_011E":"Below Poverty Line:Male householder with children under 18",
    "B17010D_017E":"Below Poverty Line:Female householder with children under 18",
    "B17010D_024E":"Above Poverty Line: Number of Married-Couple Families with children under 18",
    "B17010D_031E":"Above Poverty Line: Male householder with children under 18",
    "B17010D_037E":"Above Poverty Line: Female householder with children under 18"
    })

bp_1 = "Below Poverty Line:Married-Couple families with children under 18"
ap_1 = "Above Poverty Line: Number of Married-Couple Families with children under 18"

f_a["Total Married Families with Children"] = f_a[bp_1] + f_a[ap_1]

bp_2 = "Below Poverty Line:Male householder with children under 18"
ap_2 = "Above Poverty Line: Male householder with children under 18"

f_a["Total families with children with male householder only"] = f_a[bp_2] + f_a[ap_2]

bp_3 = "Below Poverty Line:Female householder with children under 18"
ap_3 = "Above Poverty Line: Female householder with children under 18"

f_a["total families with children with female householder only"] = f_a[bp_3]+ f_a[ap_3]

In [None]:
f_a.set_index("state", inplace = True)

f_a.drop(["02","15","72"],inplace = True) # dropping Alaska, Hawaii and Puerto Rico

In [None]:
f_a_1 = f_a[bp_1].sum()/f_a["Total Married Families with Children"].sum()
f_a_2 = f_a[bp_2].sum()/f_a["Total families with children with male householder only"].sum()
f_a_3 = f_a[bp_3].sum()/f_a["total families with children with female householder only"].sum()

In [None]:
# Hispanic or latino
code = ("NAME",
        "B17010I_004E",#below poverty line:Number of Married-Couple Families with children under18
        "B17010I_011E",#below poverty line:Male householder with children under 18
        "B17010I_017E",#below poverty line:Female householder with children under 18
        "B17010I_024E",#above poverty line:Number of Married-Couple Families with children under18
        "B17010I_031E",#above poverty line:Male householder with children under 18
        "B17010I_037E" #above poverty line:Female householder with children under 18
       )

f_h = c.acs5.get(code, {'for': 'county:*' }, year=2016)
f_h = pd.DataFrame(f_h)

f_h = f_h.rename(columns =
    {"B17010I_004E":"Below Poverty Line:Married-Couple families with children under 18",
    "B17010I_011E":"Below Poverty Line:Male householder with children under 18",
    "B17010I_017E":"Below Poverty Line:Female householder with children under 18",
    "B17010I_024E":"Above Poverty Line: Number of Married-Couple Families with children under 18",
    "B17010I_031E":"Above Poverty Line: Male householder with children under 18",
    "B17010I_037E":"Above Poverty Line: Female householder with children under 18"
    })

f_h["Total Married Families with Children"] = f_h[bp_1] +f_h[ap_1]

f_h["Total families with children with male householder only"] = f_h[bp_2] + f_h[ap_2]

f_h["total families with children with female householder only"] = f_h[bp_3]+ f_h[ap_3]

In [None]:
f_h.set_index("state", inplace = True)

f_h.drop(["02","15","72"],inplace = True) # dropping Alaska, Hawaii and Puerto Rico

In [None]:
f_h_1 = f_h[bp_1].sum()/f_h["Total Married Families with Children"].sum()
f_h_2 = f_h[bp_2].sum()/f_h["Total families with children with male householder only"].sum()
f_h_3 = f_h[bp_3].sum()/f_h["total families with children with female householder only"].sum()

Then I create a new DataFrame summarizing poverty rates of different family types and plot a bar chart to visually show the results.

In [None]:
d1 = {"Race":["All Races","White","Black","Asian","Hispanic/Latino"],
     "Married-Couple families with children Below Poverty Line":[f_1,f_w_1,f_b_1,f_a_1,f_h_1],
     "Families with Male householder only with children Below Poverty Line":[f_2,f_w_2,f_b_2,f_a_2,f_h_2],
     "Families with Female householder only with children Below Poverty Line":[f_3,f_w_3,f_b_3,f_a_3,f_h_3]
    }

family_type = pd.DataFrame(data = d1)


family_type.set_index("Race",inplace = True)                

family_type

In [None]:
fig, ax = plt.subplots(figsize = (12,6))

family_type.plot(ax=ax,kind = "bar", rot=1)

ax.set_title("Poverty Rate By Family Type By Race", fontsize = 17)

ax.spines["right"].set_visible(False) 
ax.spines["top"].set_visible(False) 

ax.set_xlabel("Races", fontsize = 14)
ax.set_ylabel("Average Poverty Rate",fontsize = 14)

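# Build legend labels from the plotted columns so they cannot fall out of order
legend_labels = {
    "Married-Couple families with children Below Poverty Line": "Married-Couple families",
    "Families with Male householder only with children Below Poverty Line": "Male householder only",
    "Families with Female householder only with children Below Poverty Line": "Female householder only",
}
ax.legend([legend_labels[col] for col in family_type.columns])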

plt.show()


**Summary**: The graph shows clearly that family type plays a role in child poverty, and the pattern persists across races. Families headed by a single mother have the highest poverty rates, and married-couple families have the lowest. In other words, children in single-parent families are more likely to be in poverty than children in married-couple families, and children with a single mother are more likely to be in poverty than children with a single father.

## Part 3: Geographic Variations and Racial Distribution
---
As the map at the end of Part 1 shows, child poverty varies geographically. Of the two contributing factors examined above, family type is more of a micro-level factor that is not closely tied to geography, whereas racial concentration is one of the most important demographic characteristics of a region and varies across regions. In this part, I will investigate the relationship between the geographic variation of child poverty and racial distribution.

Below, I grab the data and calculate the non-white population of each county. Then I plot a national map showing non-white concentration by county, with counties of high non-white concentration (>25%) outlined in red.
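The flagging step can be sketched on toy data as follows. The column names (`total_pop`, `white_pop`, and so on) and the counts are made up; the only value taken from the text is the 25% cutoff.

```python
import pandas as pd

THRESHOLD = 0.25  # the ">25%" cutoff for high non-white concentration

# Made-up populations for three hypothetical counties
counties = pd.DataFrame({
    "county": ["A", "B", "C"],
    "total_pop": [1000, 2000, 0],
    "white_pop": [900, 1200, 0],
})
counties["non_white_pop"] = counties["total_pop"] - counties["white_pop"]

# Mask zero-population counties to NaN so they are never flagged by accident
counties["non_white_share"] = (
    counties["non_white_pop"] / counties["total_pop"].where(counties["total_pop"] > 0)
)
counties["high_non_white"] = counties["non_white_share"] > THRESHOLD

print(counties[["county", "non_white_share", "high_non_white"]])
```

On a GeoDataFrame, the flagged subset could then be drawn on top of the base map with something like `subset.plot(ax=ax, facecolor="none", edgecolor="red")`.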

In [None]:
my_api_key = "6d08ff7a7a1f5f90fb8d1972aedd83d457ea17e3"

c = Census(my_api_key)
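If the notebook is shared, the key could be read from the environment rather than stored in a cell. This is only a sketch: the helper `load_census_key` and the variable name `CENSUS_API_KEY` are my own assumptions, not something the `census` package requires.

```python
import os

def load_census_key(env_var="CENSUS_API_KEY"):
    """Read the API key from an environment variable (the name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your Census API key")
    return key
```

With the variable set, the client would be created as `c = Census(load_census_key())`.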

In [None]:
code = ("NAME","B09001_001E",# grabbing the total population under 18,
        "B17020_003E",  # poverty population, under 6 years
        "B17020_004E",  # poverty population, 6-11 years
        "B17020_005E",  # poverty population, 12-17 years
        "B17020_011E"   # above poverty level, under 6 years
        ) 

c_pov = c.acs5.get(code, {'for': 'county:*' }, year=2016) # grabbing data for year 2016 at the county level

c_pov = pd.DataFrame(c_pov) # Convert into DataFrame

c_pov = c_pov.rename(columns = 
    {"B09001_001E":"Total Population under 18",
     "B17020_003E":"Poverty Population, under 6 years",
     "B17020_004E":"Poverty Population, 6-11 years",
     "B17020_005E":"Poverty Population, 12-17 years",
     "B17020_011E":"Above Poverty Level, under 6 years"
    })

# Calculating total child population in poverty
c_pov["Total Children in Poverty"] = c_pov["Poverty Population, under 6 years"]+c_pov["Poverty Population, 6-11 years"]
c_pov["Total Children in Poverty"] = c_pov["Total Children in Poverty"]+c_pov["Poverty Population, 12-17 years"]

# Calculating poverty rate
c_pov["Child Poverty Rate in 2016"] = c_pov["Total Children in Poverty"]/c_pov["Total Population under 18"]

# Calculating total young children
c_pov["Total Young Children"] = c_pov["Poverty Population, under 6 years"]+c_pov["Above Poverty Level, under 6 years"]


In [None]:
c_pov.set_index("state", inplace = True)
c_pov.drop(["02","15","72"],inplace = True)   # dropping Alaska, Hawaii and Puerto Rico
c_pov.head()

In [None]:
(c_pov["Total Children in Poverty"].sum())/(c_pov["Total Population under 18"].sum())

In [None]:
c_pov["Poverty Population, under 6 years"].sum()/c_pov["Total Young Children"].sum()

**Summary**: In aggregate, the child poverty rate in the US in 2016 is **20.88%**, which means that about one in five children in the US lives in poverty. The young child poverty rate is even higher, at **23.58%** in 2016.

Next, let's see how child poverty is distributed within the US. I will use geopandas to plot the child poverty distribution at the county level.