In [93]:
# import pandas 
import pandas as pd 
# read the csv into a usable dataframe 
all_211 = pd.read_csv("pgh-211.csv", low_memory=False)
all_211_df = pd.DataFrame(all_211)
all_211_df["contact_date"] = pd.to_datetime(all_211_df["contact_date"], format='%Y-%m-%d')

# Where should the city of Pittsburgh focus volunteer outreach programs to help prevent hunger issues, utility shut-offs, and housing access issues?

## Overview

The goal of this analysis is to use the [WPRDC dataset on 211 calls](https://data.wprdc.org/dataset/211-requests) within Allegheny County, Pennsylvania, specifically to identify areas where volunteer outreach programs would make the most impact in these three categories:

1. Hunger / food scarcity 
2. Basic utility access 
3. Housing access (people without homes)

> Every day, thousands of people in our region are struggling to put food on the table, keep the lights on, and keep a roof over their head. 2-1-1 is a 24/7 telephone helpline (also available via text and chat) that helps prevent hunger, utility shut-offs and homelessness when people have nowhere else to turn. During each call, our Resource Navigators skillfully identify an individual’s immediate needs and connect them to services and resources that address the full spectrum of needs discovered through a thoughtful and compassionate conversation.

## First, we'll start with an overview of the available data

In [94]:
all_211_df.head(3)

Unnamed: 0,contact_date,gender,age_range,zip_code,county,state,needs_category,needs_code,level_1_classification,level_2_classification,needs_met
0,2020-01-01,,,15212,Allegheny County,Pennsylvania,Food Pantries,BD-1800.2000,Basic Needs,Food,t
1,2020-01-01,F,65 and over,15221,Allegheny County,Pennsylvania,Food Pantries,BD-1800.2000,Basic Needs,Food,t
2,2020-01-01,F,,15226,Allegheny County,Pennsylvania,Food Pantries,BD-1800.2000,Basic Needs,Food,t


In [95]:
# Breaking down columns to get an idea of the unique values in each 

needs_values = all_211_df.needs_category.unique()
level_1_values = all_211_df.level_1_classification.unique()
level_2_values = all_211_df.level_2_classification.unique()

num_needs = len(needs_values)
# prints the unique values in a given column
# for value in needs_values:
#     print(value)

# for value in level_1_values:
#     print(value)

num_level = len(level_2_values)
# for value in level_2_values:
#     print(value)


## Unique values in needs and classification columns 

The combination of needs and classifications is used by 211 operators to match callers with the outreach service that fits their specific problem. You can read more about the 211 request line at [this website](https://www.211.org/about-us/our-impact).

### Needs

There are 1,321 categories of needs. Here are some examples:

* Food Pantries
* Soup Kitchens
* Community Shelters
* Adult Protective Services
* Mortgage Payment Assistance
* Furniture
* Ex-Offender Reentry Programs
* Elder Law
* Smoke Alarms


### Classifications 

### Level 1 

* Basic Needs
* Consumer Services
* Criminal Justice and Legal Services
* Health Care
* Income Support and Employment
* Individual and Family Life
* Mental Health and Substance Use Disorder Services
* Organizational/Community/International Services
* Environment and Public Health/Safety
* Education

### Level 2 

There are 64 level 2 classifications. They go one level deeper and are meant to be paired with a level 1 classification to give more insight into the request. Here are some examples:

* Food
* Housing/Shelter
* Material Goods
* Transportation
* Utilities
* Consumer Assistance and Protection
* Legal Services
* Health Supportive Services
* Public Assistance Programs
* Counseling Settings


For this analysis I'll look at the above fields and slice them by gender, age range, request date, the zip code the request was made from, and whether the need was met, in order to answer the question: **Which areas need different outreach focus going forward?**
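
One way this kind of slicing could look, as a minimal sketch: group requests by zip code and compute how many there were and what share had the need met. The `sample` frame is made up for illustration, and treating `"f"` as the marker for an unmet need is an assumption (only `"t"` appears in the rows shown earlier).

```python
import pandas as pd

# Toy rows shaped like the 211 data; the values are illustrative, not real counts.
sample = pd.DataFrame({
    "zip_code": [15212, 15212, 15221, 15226],
    "level_2_classification": ["Food", "Utilities", "Food", "Food"],
    "needs_met": ["t", "f", "t", "t"],  # assumption: "f" marks an unmet need
})

# Turn the "t"/"f" flag into a boolean so it can be averaged.
sample["met"] = sample["needs_met"].eq("t")

# Requests per zip code and the share of those requests that were met.
by_zip = (
    sample.groupby("zip_code")
    .agg(requests=("met", "size"), share_met=("met", "mean"))
    .sort_values("requests", ascending=False)
)

print(by_zip)
```

The same pattern should work for `gender` or `age_range` by swapping the grouping column.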

## How many requests are we working with?

For this analysis I only want to count requests from Allegheny County, Pennsylvania. There are **156,804** total 211 requests in the WPRDC dataset and **92,300** were made from Allegheny County, Pennsylvania. 

Going forward I only want to analyze these requests, so I'm going to create a new dataframe containing only the rows where `county == "Allegheny County"`.

In [96]:
dates_locations = all_211_df[["contact_date","county", "state"]]

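# boolean mask: True where the request came from Allegheny County
is_allegheny = dates_locations["county"] == "Allegheny County"

# count requests inside and outside the county (missing counties count as outside)
allegheny_counts = int(is_allegheny.sum())
non_allegheny_counts = int((~is_allegheny).sum())

# print(allegheny_counts)
# print(non_allegheny_counts)

# keep only the Allegheny County rows for the rest of the analysis
allegheny_df = all_211_df[all_211_df["county"] == "Allegheny County"]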

## What are the top 10 requests (all time)?





In [97]:
# counts request categories 
category_counts = allegheny_df.groupby("needs_category")["contact_date"].count()

# sorts values and changes them in place 
category_counts.sort_values(inplace = True, ascending = False)

# gets top 10 rows after the series has been sorted in place 
top_10_requests = category_counts.head(10)

print(top_10_requests)

needs_category
Covid-19 Control                               14951
Rent Payment Assistance                         9791
COVID-19 Immunization Clinics                   4786
Tax Preparation Assistance                      4713
Electric Service Payment Assistance             4534
Gas Service Payment Assistance                  4159
Undesignated Temporary Financial Assistance     4050
Food Pantries                                   3259
Housing Search Assistance                       2578
Housing Related Coordinated Entry               2328
Name: contact_date, dtype: int64


The top request to date is **Covid-19 Control** with a total of **14,951 requests**. If you also include the third result, **COVID-19 Immunization Clinics**, the total number of COVID-related requests rises to **19,737**.

Of the 55,149 requests in the top 10 categories, COVID-related requests make up **~36%**. The earliest request in this dataset is dated January 1st, 2020, so it's hard to say what the request trends might look like in a non-pandemic world, but this is something that can be looked at in the future.
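
To make the arithmetic behind these figures explicit, here is a small sketch that recomputes them from the counts printed above. Matching categories by the substring "covid" is my own shortcut for this example, not something defined by the dataset.

```python
# Counts copied from the top 10 output above.
top_10 = {
    "Covid-19 Control": 14951,
    "Rent Payment Assistance": 9791,
    "COVID-19 Immunization Clinics": 4786,
    "Tax Preparation Assistance": 4713,
    "Electric Service Payment Assistance": 4534,
    "Gas Service Payment Assistance": 4159,
    "Undesignated Temporary Financial Assistance": 4050,
    "Food Pantries": 3259,
    "Housing Search Assistance": 2578,
    "Housing Related Coordinated Entry": 2328,
}

# Treat a category as COVID-related if its name mentions "covid" (case-insensitive).
covid_total = sum(n for name, n in top_10.items() if "covid" in name.lower())
top_10_total = sum(top_10.values())
covid_share = covid_total / top_10_total

print(f"{covid_total:,} of {top_10_total:,} top 10 requests = {covid_share:.1%}")
```

Note that this share is relative to the top 10 categories only, not to all 92,300 Allegheny County requests.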