# Final Project - Group 49
## Introduction
When deciding how to find the “best” neighborhood in Pittsburgh, we considered a couple of different ways to rank neighborhoods. One option was to look for the neighborhood with the fewest “unhealthy vendors,” but to make the analysis more fun and interesting, we ultimately took an ironic approach: a metric that picks the best neighborhood for an unhealthy lifestyle. For each dataset, we found the most frequently appearing zip codes and matched them to their corresponding neighborhoods. Zip codes do not correspond directly to neighborhoods (a zip code can span multiple neighborhoods, and a neighborhood can have multiple zip codes), so we only considered zip codes that are associated with a Pittsburgh neighborhood and that roughly match that neighborhood's area.
## The Metric
The metric we chose was: “bad for the body, good for the soul.” We looked for the neighborhoods with the most stores or restaurants selling fun, unhealthy goods. This led us to our three datasets: fast food locations, convenience store locations, and tobacco store locations.
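The text here does not spell out how the three datasets are combined into one ranking. As a rough sketch of one possibility, assuming each dataset has a `ZIP_Code` column and that the per-zipcode counts are simply added together, the combination could look like this. The three small DataFrames and the `count_by_zip` helper are made up for illustration and are not the real data or the method used in this notebook.

```python
import pandas as pd

# Toy stand-ins for the three datasets (made-up rows, not the real data).
fastfood_toy = pd.DataFrame({"ZIP_Code": [15213, 15213, 15222]})
convenience_toy = pd.DataFrame({"ZIP_Code": [15213, 15222, 15222]})
tobacco_toy = pd.DataFrame({"ZIP_Code": [15222, 15217]})


def count_by_zip(df, label):
    """Return the number of rows per zipcode as a Series named `label`."""
    return df["ZIP_Code"].value_counts().rename(label)


# Line the three counts up side by side; a zipcode missing from a dataset gets 0.
combined = pd.concat(
    [
        count_by_zip(fastfood_toy, "fast_food"),
        count_by_zip(convenience_toy, "convenience"),
        count_by_zip(tobacco_toy, "tobacco"),
    ],
    axis=1,
).fillna(0).astype(int)

# Assumption: the overall score is the plain sum of the three counts.
combined["total"] = combined.sum(axis=1)
combined = combined.sort_values("total", ascending=False)
print(combined)
```

A plain sum treats all three kinds of store as equally important; weighting them differently would only change the `total` line.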

In [13]:
import pandas as pd
import geopandas
%matplotlib inline

## The Best Neighborhood
Using zipcodes to determine neighborhoods created some issues, since some zipcodes span multiple neighborhoods and some are actually outside of Pittsburgh. Therefore, we used two websites to standardize our results.
<br>
https://www.unitedstateszipcodes.org/
<br>
We used this website to determine which neighborhood each zipcode represents. The site lists either the actual neighborhood name or an alternate acceptable name for the city, which we used as the neighborhood.
<br>
https://www.visitpittsburgh.com/neighborhoods/
<br>
This website was our single reference list for deciding which neighborhoods were eligible. If the neighborhood associated with a zipcode wasn't on this list, the zipcode was deemed invalid and wasn't considered for the best neighborhood in Pittsburgh.
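The two-step check described above (look up a name for the zipcode, then keep it only if the name is on the accepted list) could be sketched in pandas as follows. The names and counts are copied from the fast food results in this notebook, but the `accepted` set is only a hypothetical excerpt; in practice it would be transcribed by hand from the visitpittsburgh.com list.

```python
import pandas as pd

# Zipcode-to-name lookups copied from this notebook's fast food top 10.
zip_to_name = {
    15146: "Monroeville",
    15213: "Oakland",
    15102: "Bethel Park",
    15217: "Squirrel Hill",
}

# Hypothetical excerpt of the accepted-neighborhood list (an assumption,
# not the full list from the website).
accepted = {"Oakland", "Squirrel Hill"}

# Fast food location counts for the same four zipcodes.
counts = pd.Series({15146: 51, 15213: 38, 15102: 21, 15217: 20}, name="locations")

table = counts.to_frame()
table["neighborhood"] = table.index.map(zip_to_name)
table["valid"] = table["neighborhood"].isin(accepted)

# Keep only the zipcodes whose neighborhood is on the accepted list.
valid_only = table[table["valid"]]
print(valid_only)
```

Keeping the lookup and the accepted list as plain Python objects makes the manual judgment calls visible in one place, instead of hiding them in positional `drop` calls.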
### Fast Food Locations

In [2]:
fastfood = pd.read_csv("fastfoodalleghenycountyupdatexy2.csv")
fastfood.head(10)

Unnamed: 0,Name,Legal_Name,Start_Date,Street_Number,Street_Name,ZIP_Code,Lat,Lon,Category
0,Adrian's Pizza,,11/7/14,605,Thompson Run Rd,15237,40.539465,-79.990764,Take Out
1,Adrian's Pizza Express,Rock Enterprises Inc,4/22/04,7824,Perry Hwy,15237,40.551219,-80.037362,Take Out
2,Allegheny Sandwich Shop,,2/24/97,414,Grant St,15219,40.43811,-79.99686,NO Dollar Menu
3,Allegheny Sandwich Shoppe #3,Allegheny Sandwich Shoppe Inc,11/9/01,440,Ross St,15219,40.438514,-79.99533,NO Dollar Menu
4,Amili's Pizzeria,,2/26/99,1021,Brownsville Rd,15210,40.406082,-79.991863,Take Out
5,Angelia's Pizza,JNG Pizza LLC,5/11/04,202,Moon Clinton Rd,15108,40.513135,-80.223406,Take Out
6,Angelia's Pizza / Chill Frozen Dessserts,Eaton Pizza Inc,10/7/05,410,Penn Lincoln Dr,15126,40.442466,-80.235992,Take Out
7,Antney's Ice Cream,The Iceman Inc,4/11/02,1316,Poplar St,15205,40.42747,-80.052435,"Breakfast, Drink, Other"
8,Arby's,Kinco Inc,1/1/75,1617,Freeport Rd,15065,40.622125,-79.727516,Dollar Menu
9,Arby's #8,Linell Corporation,12/3/07,3974,Wm Penn Hwy,15146,40.437988,-79.772845,Dollar Menu


In [3]:
zipcode = fastfood.groupby('ZIP_Code').count()
zipcode.sort_values(by=['Name'], ascending = False)

ZIP_Code,Name,Legal_Name,Start_Date,Street_Number,Street_Name,Lat,Lon,Category
15146,51,50,51,49,51,51,51,51
15222,46,43,46,46,46,46,46,46
15237,44,40,44,44,44,44,44,44
15213,38,37,38,38,38,38,38,38
15205,36,30,36,36,36,36,36,36
...,...,...,...,...,...,...,...,...
15148,1,1,1,1,1,1,1,1
15207,1,1,1,1,1,1,1,1
15208,1,1,1,1,1,1,1,1
15282,1,1,1,1,1,1,1,1


In [4]:
zipcode = fastfood.groupby('ZIP_Code').count()
zipcode_sorted = zipcode.sort_values(by=['Name'], ascending = False)
zipcode_sorted.head(15)

ZIP_Code,Name,Legal_Name,Start_Date,Street_Number,Street_Name,Lat,Lon,Category
15146,51,50,51,49,51,51,51,51
15222,46,43,46,46,46,46,46,46
15237,44,40,44,44,44,44,44,44
15213,38,37,38,38,38,38,38,38
15205,36,30,36,36,36,36,36,36
15219,27,23,27,27,27,27,27,27
15236,25,22,25,24,25,25,25,25
15235,22,21,22,22,22,22,22,22
15102,21,20,21,21,21,21,21,21
15217,20,20,20,20,20,20,20,20
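The counts in the table above differ slightly from column to column (for example, 51 under `Name` but 50 under `Legal_Name` for 15146) because `groupby(...).count()` counts non-null values in each column separately. The small example below, built on made-up rows, contrasts this with `groupby(...).size()`, which counts rows regardless of missing values. Sorting by `Name` gives the number of locations only as long as `Name` is never missing.

```python
import pandas as pd

# Made-up rows that mimic the columns used above; None marks a missing value.
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Legal_Name": ["A Inc", None, "C LLC", None],
    "ZIP_Code": [15213, 15213, 15213, 15222],
})

per_column = toy.groupby("ZIP_Code").count()  # non-null values per column
per_row = toy.groupby("ZIP_Code").size()      # rows per zipcode, nulls included

print(per_column)
print(per_row.sort_values(ascending=False))
```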


TOP 10 ZIPCODES: 
<br>
15146 - Monroeville, PA <br>
15222 - Troy Hill <br>
15237 - McKnight <br>
15213 - Oakland <br>
15205 - Crafton <br>
15219 - Central Business District / Downtown <br>
15236 - Pleasant Hills / West Mifflin <br>
15235 - Penn Hills <br>
15102 - Bethel Park, PA <br>
15217 - Squirrel Hill <br>


In [12]:
# Drop both invalid zipcodes in a single call. Two separate drops on
# zipcode_sorted would overwrite each other, keeping only the second one.
# Position 0 is 15146 (Monroeville) and position 8 is 15102 (Bethel Park).
remove_invalid = zipcode_sorted.drop(zipcode_sorted.index[[0, 8]])
remove_invalid.head(15)

ZIP_Code,Name,Legal_Name,Start_Date,Street_Number,Street_Name,Lat,Lon,Category
15222,46,43,46,46,46,46,46,46
15237,44,40,44,44,44,44,44,44
15213,38,37,38,38,38,38,38,38
15205,36,30,36,36,36,36,36,36
15219,27,23,27,27,27,27,27,27
15236,25,22,25,24,25,25,25,25
15235,22,21,22,22,22,22,22,22
15217,20,20,20,20,20,20,20,20
15203,19,15,19,19,19,19,19,19
