# Final Report: The Best Neighborhood in Pittsburgh

## Introduction

When looking for a place to live, there are many things to take into account, such as **crime rates**, **housing cost**, **health conditions**, and more. Each person's reasons for picking a neighborhood will vary depending on their needs. In this report, we take some of the most common metrics and use them to determine the best neighborhood in Pittsburgh.

---

## The Metric

For our metric, we combine the three individual metrics we used in our notebooks: the **cost of housing**, **police incidents**, and **firearm seizures**.
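
One possible way to combine three metrics like these is to rank each one and average the ranks. The sketch below is only an illustration of that idea, not the procedure used in this report: the neighborhood names, the column names, and every value in the table are hypothetical, and it assumes that a lower value is better for all three metrics.

```python
import pandas as pd

# Hypothetical per-neighborhood values; these are NOT figures from the data sets.
metrics = pd.DataFrame(
    {
        "avg_price": [150000.0, 90000.0, 300000.0],
        "incidents": [1200, 400, 800],
        "firearm_seizures": [40, 10, 25],
    },
    index=["Neighborhood A", "Neighborhood B", "Neighborhood C"],
)

# Rank each metric so that 1 is the lowest (assumed best) value,
# then average the three ranks for each neighborhood.
ranks = metrics.rank(method="min", ascending=True)
combined = ranks.mean(axis=1).sort_values()

# The neighborhood with the smallest average rank comes first.
best = combined.index[0]
print(combined)
print("Best combined rank:", best)
```

Averaging ranks weights the three metrics equally. Someone who cares more about one metric could weight that column's rank more heavily before averaging.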

### Cost of Housing

This data set contains transactions for properties sold in and around Pittsburgh (the file covers Allegheny County). We compute the average sale price for each area and report the areas with the highest and lowest averages.

In [2]:
# Load pandas
import pandas as pd

# The data set triggers some warnings because of how it was created;
# this turns those warnings off.
import warnings
warnings.filterwarnings('ignore')

# Load Data
property_sales = pd.read_csv("Allegheny County Property Sale Transactions.csv",
                            index_col = "MUNIDESC")

# Some properties are recorded as sold for 0 or 1.
# These rows are removed so that the averages
# are more accurate.
property_sales = property_sales[~property_sales.PRICE.isin([0, 1])]
property_sales = property_sales.sort_values('MUNICODE')

# Group by municipality and compute the average
# sale price for each one
property_sales_groupby = property_sales.groupby('MUNIDESC')['PRICE'].mean()

print('MUNICIPALITY WITH HIGHEST HOUSING COST:', property_sales_groupby.idxmax(), 'at $' + str(round(property_sales_groupby.max(), 2)))
print('MUNICIPALITY WITH LOWEST HOUSING COST: ', property_sales_groupby.idxmin(), 'at $' + str(round(property_sales_groupby.min(), 2)))

MUNICIPALITY WITH HIGHEST HOUSING COST: 2nd Ward - PITTSBURGH at $1408401.77
MUNICIPALITY WITH LOWEST HOUSING COST:  3rd Ward  - DUQUESNE at $25142.02


Data limitations:
   * Prices are listed by municipality and ward, not by neighborhood
   * Municipal boundaries do not line up with neighborhood boundaries, which may skew comparisons
   * Housing cost also depends on what is in the surrounding area
   
Housing cost alone is not enough to determine the best neighborhood, but it can be combined with other metrics to find it.
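
To show why the sales recorded at 0 or 1 are dropped before averaging, here is a minimal sketch on a made-up table. Only the column names `MUNIDESC` and `PRICE` come from the data set; the ward labels and prices are hypothetical.

```python
import pandas as pd

# Toy rows standing in for the sales file; all values are made up.
sales = pd.DataFrame(
    {
        "MUNIDESC": ["Ward X", "Ward X", "Ward X", "Ward Y", "Ward Y"],
        "PRICE": [0, 1, 100000, 50000, 70000],
    }
)

# Drop sales recorded at 0 or 1 before averaging.
real_sales = sales[~sales["PRICE"].isin([0, 1])]

# Compare the per-area means with and without those rows.
mean_with_placeholders = sales.groupby("MUNIDESC")["PRICE"].mean()
mean_without = real_sales.groupby("MUNIDESC")["PRICE"].mean()

print(mean_with_placeholders)
print(mean_without)
```

In this toy table, two near-zero sales pull Ward X's average far below its one real sale, while Ward Y is unaffected.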

---

### Police Incidents

This data set records every police-reported incident in the Pittsburgh region. We list each neighborhood with its total number of incidents, identify the neighborhood with the fewest incidents, and count the burglaries in each neighborhood.

In [3]:
import pandas as pd

incidents = pd.read_csv("Incidents.csv")

# Count incidents per neighborhood. value_counts() ignores rows whose
# neighborhood is missing (NaN), so no manual cleaning is needed.
incident_counts = incidents["INCIDENTNEIGHBORHOOD"].value_counts()

print("-----------------Number of Incidents since 1/1/16-----------------\n")
for hood, num in incident_counts.items():
    print('\t\t', hood, "= ", num, "incidents")

# Look up the minimum by label rather than by list position, so the
# neighborhood name always matches its count.
print("\nNeighborhood with least amount of incidents = ", incident_counts.idxmin())
print("with", incident_counts.min(), "incidents since 1/1/16.")

# Keep only the incidents whose description mentions burglary
burg = incidents[["INCIDENTNEIGHBORHOOD", 'INCIDENTHIERARCHYDESC']]
contain_values = burg[burg['INCIDENTHIERARCHYDESC'].str.contains('BURGLARY', na = False)]
print("\nAmount of Burglaries per Neighborhood (Ascending)\n" , contain_values['INCIDENTNEIGHBORHOOD'].value_counts(ascending=True))




FileNotFoundError: [Errno 2] File Incidents.csv does not exist: 'Incidents.csv'

Data Limitations:
   * Some incidents may go unreported or unrecorded, so this data set may not be fully accurate.
   
Incident rates and the number of burglaries in a neighborhood are major factors in whether a neighborhood is considered safe.

NOTE: The data set contains incidents of varying severity, including theft, discharge of a firearm, public drunkenness, DUIs, etc.
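
The per-neighborhood counting can be tried on a small made-up table. In this sketch only the two column names come from the data set; the neighborhood labels and incident descriptions are hypothetical.

```python
import pandas as pd

# Made-up incident rows; None marks a missing value.
incidents = pd.DataFrame(
    {
        "INCIDENTNEIGHBORHOOD": ["A", "B", "A", None, "C", "A", "B"],
        "INCIDENTHIERARCHYDESC": [
            "THEFT", "BURGLARY", "BURGLARY/FORCE", "DUI", "THEFT", None, "BURGLARY",
        ],
    }
)

# Total incidents per neighborhood; rows with no neighborhood are skipped.
counts = incidents["INCIDENTNEIGHBORHOOD"].value_counts()
fewest = counts.idxmin()

# Burglaries per neighborhood; na=False treats a missing description as "no match".
is_burglary = incidents["INCIDENTHIERARCHYDESC"].str.contains("BURGLARY", na=False)
burglaries = incidents.loc[is_burglary, "INCIDENTNEIGHBORHOOD"].value_counts()

print(counts)
print("Fewest incidents:", fewest)
print(burglaries)
```

Neighborhoods with no burglaries do not appear in `burglaries` at all, so a missing entry there means zero rather than missing data.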

---

### Firearm Seizures

This data set contains records of police seizing firearms from individuals. We total the seizures per neighborhood and find which neighborhoods had the fewest and the most.

In [4]:
firearm_seizures = pd.read_csv("Police Firearm Seizures.csv", index_col = "neighborhood")
pd.set_option('display.max_rows', None)

# Some rows have a missing ('NaN') fire zone.
# These rows are removed to keep the data usable
# and the totals more accurate
firearm_seizures.dropna(subset = ["fire_zone"], inplace=True)

# Group the data by neighborhood and sum the
# firearm seizures for each one
firearm_seizures_groupby = firearm_seizures.groupby('neighborhood')['total_count'].sum()

print(firearm_seizures_groupby)

print('NEIGHBORHOOD WITH LOWEST FIREARM SEIZURES: ', firearm_seizures_groupby.idxmin() ,'at ' + str(firearm_seizures_groupby.min()))
print('NEIGHBORHOOD WITH HIGHEST FIREARM SEIZURES: ', firearm_seizures_groupby.idxmax() ,'at ' + str(firearm_seizures_groupby.max()))

neighborhood
Allegheny Center              13
Allegheny West                16
Allentown                     82
Arlington                     35
Arlington Heights             10
Banksville                     8
Bedford Dwellings             72
Beechview                     79
Beltzhoover                   43
Bloomfield                    47
Bluff                         49
Bon Air                       11
Brighton Heights             129
Brookline                     70
California-Kirkbride          23
Carrick                      131
Central Business District    110
Central Lawrenceville         23
Central Northside             76
Central Oakland               18
Chartiers City                 4
Chateau                       22
Crafton Heights               52
Crawford-Roberts              47
Duquesne Heights              17
East Allegheny                47
East Hills                   107
East Liberty                 138
Elliott                       52
Esplen                        

Data Limitations:
   * The data is only a count; it does not indicate the severity of each seizure
   
Whether weapons have been seized in a neighborhood is a useful statistic for judging its safety.

NOTE: These reports cover weapons that may have been considered illegal or were involved in some illegal event. They do not list legally owned weapons such as hunting rifles.
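
Dropping rows with a missing fire zone changes the totals, and it can change which neighborhood comes out highest. The sketch below illustrates this on made-up rows; the column names match the data set, but the neighborhood labels, fire zones, and counts are hypothetical.

```python
import pandas as pd

# Made-up seizure records; None marks a missing fire zone.
seizures = pd.DataFrame(
    {
        "neighborhood": ["A", "A", "B", "B", "C"],
        "fire_zone": ["1-1", None, "2-3", "2-3", "3-9"],
        "total_count": [2, 5, 1, 2, 4],
    }
)

# Totals using every row, including the one with no fire zone.
totals_all = seizures.groupby("neighborhood")["total_count"].sum()

# Totals after removing rows with a missing fire zone.
located = seizures.dropna(subset=["fire_zone"])
totals = located.groupby("neighborhood")["total_count"].sum()

print("Highest before cleaning:", totals_all.idxmax())
print("Highest after cleaning: ", totals.idxmax())
print("Lowest after cleaning:  ", totals.idxmin())
```

Because the cleaning step can reorder neighborhoods, it is worth checking how many rows it removes before relying on the ranking.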

---

## The Best Neighborhood

Based on metrics 2 and 3 (police incidents and firearm seizures), **Mt. Oliver Boro** ranks best among the listed neighborhoods so far. To complete our search for the best neighborhood, we check the average housing cost of Mt. Oliver Boro.

In [27]:
# Position 119 in the grouped series is taken to be the entry for Mt. Oliver
print('Mt. Oliver has an average housing cost of $' + str(round(property_sales_groupby.iloc[119],2)))
print('The median housing cost is $' + str(round(property_sales_groupby.median(), 2)))


Mt. Oliver has an average housing cost of $66470.16
The median housing cost is $145621.76
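
A positional lookup such as `iloc[119]` depends on the row order of the grouped series, so it can silently point at a different area if the data changes. Looking the value up by label is more robust. The sketch below uses a made-up series; the labels and prices are hypothetical, and the exact label for Mt. Oliver in the real data would need to be checked first.

```python
import pandas as pd

# Hypothetical grouped means; labels and prices are made up.
avg_price = pd.Series(
    [210000.0, 66000.0, 145000.0],
    index=pd.Index(["Borough P", "Borough Q", "Borough R"], name="MUNIDESC"),
)

# Positional lookup: correct only while "Borough Q" stays in position 1.
by_position = avg_price.iloc[1]

# Label lookup: independent of row order.
by_label = avg_price.loc["Borough Q"]

# When the exact label is unknown, a substring search lists the candidates.
matches = avg_price[avg_price.index.str.contains("Q", regex=False)]

print(by_position, by_label)
print(matches)
```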


Mt. Oliver Boro's average is well under the median housing cost across the Pittsburgh area. Note that prices were grouped by their mean, so the median shown here is the median of the per-area averages, not the median of individual sales. Because of its low crime rates, low illegal weapon activity, and reasonable housing cost, we believe the best neighborhood in Pittsburgh is **Mt. Oliver Boro**.
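
The difference between a median of averages and a median of individual sales can be seen on a small made-up table. The area labels and prices below are hypothetical.

```python
import pandas as pd

# Made-up sales: area X has three cheap sales, Y and Z have one expensive sale each.
sales = pd.DataFrame(
    {
        "MUNIDESC": ["X", "X", "X", "Y", "Z"],
        "PRICE": [50000, 60000, 70000, 200000, 400000],
    }
)

# Mean price per area, then the median of those means.
group_means = sales.groupby("MUNIDESC")["PRICE"].mean()
median_of_means = group_means.median()

# Median taken directly over every individual sale.
median_of_sales = sales["PRICE"].median()

print("Median of per-area means:", median_of_means)
print("Median of all sales:     ", median_of_sales)
```

The median of means counts each area once regardless of how many sales it had, so areas with few sales carry as much weight as areas with many.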

## Conclusion

To conclude, many factors need to be taken into account when choosing a neighborhood to live in, and they depend on what each person needs and relies on. For example, families will need schools close by, while elderly people will need easier access to their housing and car. Selecting one *perfect* neighborhood can therefore be difficult. This report can help someone who is looking for a well-rounded neighborhood that should cause them few problems. In our opinion, a good neighborhood should be one with little crime, a great price for housing, and little 