# Data Storytelling Summary

## Overview

The story of poverty in New York City is primarily about people and income. Within that, a wide variety of factors such as age, location, disability status, ability to speak English, the size and composition of an apartment or home -- even factors that the data only hint at -- play a key role.

We'll look quickly at the dataset and jump into the visualizations with comments.

### The Dataset

* Data from https://data.cityofnewyork.us/browse?q=poverty
* 12 annual data files, from 2005 to 2016 inclusive (e.g. NYCgov_Poverty_Measure_Data__2016_.csv)
* CSV files with ~80 columns and ~60,000 rows each
* Each file had essentially the same format and contained (mostly) the same information
* Data types included:
    * Classification types encoded as integers (e.g. 1 if in poverty, 2 if not in poverty)
    * Floats for financial data (e.g. wages for the calendar year)
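
As a rough sketch of how files in this shape could be combined, the snippet below stacks per-year tables and decodes the integer poverty flag. The column name `PovStat`, the helper names, and the tiny in-memory samples are assumptions for illustration; only the 1/2 encoding and the file-name pattern come from the list above. The rate it computes is a simple unweighted share of rows, which may not be how the figures in this summary were produced.

```python
import io
import pandas as pd

# Hypothetical column name; the real files may label the poverty flag differently.
POV_COL = "PovStat"

def read_year(source, year):
    """Read one annual CSV (path or file-like object) and tag it with its year."""
    df = pd.read_csv(source)
    df["Year"] = year
    return df

def combine_years(frames):
    """Stack the annual tables and decode the flag: 1 = in poverty, 2 = not in poverty."""
    combined = pd.concat(frames, ignore_index=True)
    combined["InPoverty"] = combined[POV_COL].eq(1)
    return combined

# Tiny made-up stand-ins for two annual files.
sample_2005 = io.StringIO("PovStat,WAGP_adj\n1,0.0\n2,41000.0\n")
sample_2016 = io.StringIO("PovStat,WAGP_adj\n2,52000.0\n2,38000.0\n1,9000.0\n")

data = combine_years([read_year(sample_2005, 2005), read_year(sample_2016, 2016)])

# Unweighted share of rows flagged as in poverty, per year.
rate_by_year = data.groupby("Year")["InPoverty"].mean()
print(rate_by_year)
```

With the real files, `read_year` would be called with a path such as `NYCgov_Poverty_Measure_Data__2016_.csv` in place of the in-memory sample.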

### Poverty Rate by Year

* The overall poverty rate was 19.5% in 2016.
    * From 20.3% in 2005, it increased to 20.6% in 2010 (after the recession), and decreased to 19.5% in 2016.
* The yearly decreases look small, but are important.
    * A 0.1 percentage-point decrease in poverty in NYC corresponds to about 8,500 people.
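
The arithmetic behind that figure can be sketched as follows. The population of about 8.5 million is not stated in the data notes; it is simply what the 8,500-per-0.1-point figure implies, and the function name is made up for this example.

```python
# Population implied by the figures above: 8,500 people per 0.1 percentage point.
PEOPLE_PER_TENTH_POINT = 8_500
implied_population = PEOPLE_PER_TENTH_POINT * 1000  # about 8.5 million (an inference)

def people_for_change(points, population=implied_population):
    """Convert a change in poverty rate (in percentage points) into a head count."""
    return round(population * points / 100)

# Change from the 2010 peak (20.6%) to 2016 (19.5%): 1.1 percentage points.
drop_2010_to_2016 = people_for_change(20.6 - 19.5)
print(drop_2010_to_2016)
```

Under that assumption, the 1.1-point drop from 2010 to 2016 works out to roughly 93,500 people.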

![NYCPovertyRate.png](attachment:NYCPovertyRate.png)
![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/NYCPovertyRate.png]

### Age and Poverty Status

* Age has one of the largest impacts on poverty status.  
    * Note the cluster below age 25.
    * Above age 25, age appears to correlate strongly with the poverty rate.
    * If you're in an age group with more than 22% poverty, you're either younger than 25 or over 60.
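
One way to check an observation like this is to bin ages at the two cut points mentioned above and compare poverty rates per bin. This is a minimal sketch on made-up rows: the column names `Age` and `InPoverty` are hypothetical, and treating "over 60" as "60 and over" is an assumption.

```python
import pandas as pd

# Hypothetical column names and made-up rows, for illustration only.
people = pd.DataFrame({
    "Age":       [4, 17, 23, 30, 45, 52, 61, 70, 78, 19],
    "InPoverty": [True, False, True, False, False, False, True, False, True, True],
})

# Split at the two ages called out above: 25 and 60.
# right=False makes the bins [0, 25), [25, 60), [60, 120).
bins = [0, 25, 60, 120]
labels = ["under 25", "25 to 59", "60 and over"]
people["AgeGroup"] = pd.cut(people["Age"], bins=bins, labels=labels, right=False)

# Unweighted share of rows in poverty within each age group.
rate_by_group = people.groupby("AgeGroup", observed=True)["InPoverty"].mean()
print(rate_by_group)
```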

![PovByAge.png](attachment:PovByAge.png)
![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByAge.png]

* The distribution of population by age is also somewhat surprising:
    * Similar clustering between age < 25 and 25+ as in the poverty chart above.
    * 2005 and 2016 look very similar; the population is not significantly aging.
    * Birth rates and/or in-migration are keeping the population by age group relatively stable.
    * This is not the case in lots of other areas around the US.
* We'll dig into this further in EDA.

![PopByAge.png](attachment:PopByAge.png)
![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PopByAge.png]

### Age Category

* Here's another look by age; the disparity in poverty rates by group seems to have diminished since 2005.

![PovByAgeCateg.png](attachment:PovByAgeCateg.png)
![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByAgeCateg.png]

### Borough

* New York City ("NYC") has five boroughs as shown below (courtesy Wikipedia: https://en.wikipedia.org/wiki/Boroughs_of_New_York_City)

![MapOfNYC.png](attachment:MapOfNYC.png)
![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/MapOfNYC.png]

* Poverty has decreased fairly dramatically in the Bronx and Brooklyn.
* It has also decreased in Manhattan.
* Poverty has increased in Queens and especially Staten Island.

![PovByBoro.png](attachment:PovByBoro.png)
![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByBoro.png]

* Income varies dramatically by borough.
    * I've focused in on the IQRs of the charts because outliers make the charts difficult to interpret.
    * Median pretax income in the Bronx is roughly half of that of Staten Island. 
    * Median incomes by borough have increased roughly 25-33% over the 11-year period, though unevenly.  
    * Manhattan seems to have the most outliers (not shown on this chart).
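
To illustrate why focusing on medians and IQRs helps when outliers are present, here is a small sketch that summarizes income by borough. The data are invented, and `Boro` and `PreTaxIncome` are hypothetical column names rather than the dataset's own.

```python
import pandas as pd

# Made-up incomes; "Boro" and "PreTaxIncome" are hypothetical column names.
households = pd.DataFrame({
    "Boro": ["Bronx"] * 5 + ["Staten Island"] * 5,
    "PreTaxIncome": [10_000, 20_000, 30_000, 40_000, 900_000,
                     40_000, 50_000, 60_000, 70_000, 80_000],
})

def income_summary(df):
    """Quartiles, median, and interquartile range of income for each borough."""
    q = df.groupby("Boro")["PreTaxIncome"].quantile([0.25, 0.5, 0.75]).unstack()
    q.columns = ["q1", "median", "q3"]
    q["iqr"] = q["q3"] - q["q1"]
    return q

summary = income_summary(households)
print(summary)
```

The single 900,000 value in the invented Bronx rows leaves the median and IQR untouched, which is the property the charts rely on. When plotting, matplotlib's `boxplot` accepts `showfliers=False` to hide outlier points in a similar spirit.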

![BoxByBoro1.png](attachment:BoxByBoro1.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/BoxByBoro1.png] ![BoxByBoro2.png](attachment:BoxByBoro2.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/BoxByBoro2.png] ![BoxByBoro3.png](attachment:BoxByBoro3.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/BoxByBoro3.png]

* Population shares by borough have remained fairly consistent.
* Brooklyn and the Bronx (the two boroughs with the highest poverty) represent roughly half of NYC.

![PopByBoro.png](attachment:PopByBoro.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PopByBoro.png]

### Total Work Hours by Poverty Unit ('TotalWorkHrs_PU')

* Households with at least the equivalent of one half-time worker have dramatically better poverty rates.

![PovByTotalWorkHrs_PU.png](attachment:PovByTotalWorkHrs_PU.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByTotalWorkHrs_PU.png]

* About 35% of households don't have at least the equivalent of one half-time worker, a figure that's remained relatively stable through the years.

![PopByTotalWorkHrs_PU.png](attachment:PopByTotalWorkHrs_PU.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PopByTotalWorkHrs_PU.png]

### Usual Hours Worked ('WKHP')

* While there appears to be less poverty among those who work more, there are lots of outliers. Will need to dig deeper on income versus work hours.

![PovByWKHP.png](attachment:PovByWKHP.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByWKHP.png]

### Weeks Worked Past 12 Months ('WKW')

* The chart by weeks worked in the past 12 months is surprising in that it looks almost exactly like you'd expect.

![PovByWKW.png](attachment:PovByWKW.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByWKW.png]

### Salary/Wages ('WAGP_adj')

* Salary/Wages has a fairly strong correlation with poverty level, but not as much as one might naively expect.
* Note the 'column' around 0; the relationship appears somewhat linear outside of that, but with lots of noise.

![PovByWAGP_adj.png](attachment:PovByWAGP_adj.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByWAGP_adj.png]

### Social Security ('SSP_adj')

* Note the 'column' from 0-2500, with no clear relationship between Social Security and poverty rate.
* Interestingly, above 2500 dollars/year, there appears to be somewhat of a correlation.

![PovBySSP_adj.png](attachment:PovBySSP_adj.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovBySSP_adj.png]

### Disability Status

* The poverty rate is much higher among people with disabilities in both 2010 and 2016.

![PovByDis.png](attachment:PovByDis.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByDis.png]

### Ethnicity

* Asian and Hispanic groups have higher poverty rates, and have the highest increases in raw number of people since 2005.

![PovByEthnicity.png](attachment:PovByEthnicity.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByEthnicity.png]

![PopByEthnicity.png](attachment:PopByEthnicity.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PopByEthnicity.png]

### Citizenship Status

* Non-citizens have higher poverty incidence than citizens by birth or naturalized citizens.

![PovByCit.png](attachment:PovByCit.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByCit.png]

* The share of naturalized citizens has increased since 2005.

![PopByCit.png](attachment:PopByCit.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PopByCit.png]

### Family Type ('FamType_PU')

* Single parents or unrelated cohabitants are most likely to be in poverty.

![PovByFamType_PU.png](attachment:PovByFamType_PU.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByFamType_PU.png]

### Ability to Speak English

* Ability to speak English well has a negative linear relationship with poverty status.

![PovByEng.png](attachment:PovByEng.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByEng.png]

* Nearly 10% of the population speaks English "Not at All" or "Not Well".

![PopByEng.png](attachment:PopByEng.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PopByEng.png]

### Employment Status Recode

* "Employment Status Recode" is a strange term but since we'll look at multiple measures of employment, we'll keep it.  This is what you might expect; "Unemployed" and "Not in Labor Force" groups have the highest levels of poverty. 

![PovByESR.png](attachment:PovByESR.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByESR.png]

* Note that the share of people who are unemployed is very low. Is there any way to see whether some of the "Not in Labor Force" (NILF) people could actually be in the labor force? (Many are presumably retirement-age, etc.)

![PopByESR.png](attachment:PopByESR.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PopByESR.png]

### Housing Status

* Different subsidy statuses have different incidences of poverty. Why does "Own - Free & Clear" show higher poverty incidence than "Own - Mortgage"?

![PovByHousingStatus.png](attachment:PovByHousingStatus.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByHousingStatus.png]

![PopByHousingStatus.png](attachment:PopByHousingStatus.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PopByHousingStatus.png]

### Housing Tenure ('TEN')

* Aside from the question previously noted about poverty rates among "Free & Clear" owners, poverty rates by housing tenure look similar to what you might expect.

![PovByTEN.png](attachment:PovByTEN.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByTEN.png]

### Rent Payments ('RNTP_adj')

* Note the cluster below 1,000 dollars/month, where there appears to be no discernible relationship between rent and poverty.
    * Recall the prior charts on housing status; subsidies make the relationship more complicated.
* Above 1,000 dollars/month, the data looks roughly linear.

![PovByRNTP_adj.png](attachment:PovByRNTP_adj.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByRNTP_adj.png]

### Number of People in Household

* The poverty incidence among 1-person households seems strangely high.
    * Hypothesis: many older people living alone?  Will need to investigate.
* Other than that, more people correlates roughly with higher poverty.

![PovByNP.png](attachment:PovByNP.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByNP.png]
![PopByNP.png](attachment:PopByNP.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PopByNP.png]

### Educational Attainment ('EducAttain')

* Educational attainment correlates with poverty status as you would expect.
* The differences between "less than high school", a diploma, and some college have diminished slightly since 2005.

![PovByEducAttain.png](attachment:PovByEducAttain.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovByEducAttain.png]

### Sex

* More women than men are in poverty.
    * Hypothesis/question: what percentage of the difference could be explained by differences in child care duties between men and women, and what percentage by income gaps?

![PovBySex.png](attachment:PovBySex.png)![https://github.com/c74p/Springboard/blob/master/Capstone%20Project%201%20-%20Poverty/PovBySex.png]

