# Finding the Safest Neighborhood in Pittsburgh

In our project to find the best neighborhood in Pittsburgh, we focused on a factor that affects everyone's well-being: safety. Safety is a broad notion that encompasses elements such as crime rates, law enforcement efficacy, and community trust. By using safety as our primary metric, we aim to highlight places where residents may feel comfortable.

To quantify safety, we used two datasets: one on arrests in Pittsburgh and one on firearm seizures. Both indicate where law enforcement interventions and firearm-related incidents are concentrated. Examining them together gives a picture of how safety varies across Pittsburgh neighborhoods.

To measure the safety of each neighborhood, we first filtered both datasets to 2020 so that the comparison covers a single year. We then ranked the neighborhoods separately in each dataset, by total arrests and by total firearm seizures, in ascending order: the neighborhood with the fewest incidents receives rank 1, and higher incident counts correspond to higher ranks. For neighborhoods present in both datasets, we computed a combined rank by summing the two individual ranks. This gives a consolidated view of safety that reflects both law enforcement interventions and firearm-related incidents. Finally, we identified the top 3 "safest" neighborhoods as those with the lowest combined ranks, which indicate lower incident counts overall.
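
As a rough illustration of the rank-sum procedure described above, here is a minimal sketch on made-up totals. The neighborhood labels `A` through `E`, the counts, and the helper `rank_ascending` are all hypothetical and exist only for this example; the real analysis in the cells that follow uses the arrests and firearm seizure data.

```python
import pandas as pd

# Hypothetical 2020 totals for illustration only (not real figures).
toy_arrests = pd.Series({"A": 5, "B": 12, "C": 40, "D": 7})
toy_seizures = pd.Series({"A": 2, "B": 1, "C": 9, "E": 3})


def rank_ascending(totals):
    """Map each neighborhood to its position (1 = fewest incidents)."""
    ordered = totals.sort_values(kind="stable").index
    return {name: position for position, name in enumerate(ordered, start=1)}


arrest_rank = rank_ascending(toy_arrests)    # A=1, D=2, B=3, C=4
seizure_rank = rank_ascending(toy_seizures)  # B=1, A=2, E=3, C=4

# Only neighborhoods present in both datasets get a combined rank.
shared = set(arrest_rank) & set(seizure_rank)
combined = {name: arrest_rank[name] + seizure_rank[name] for name in shared}

# Lowest combined rank first; the name breaks ties so the order is reproducible.
safest_first = sorted(combined, key=lambda name: (combined[name], name))
print(safest_first)  # ['A', 'B', 'C']
```

In this toy data, `D` and `E` each appear in only one dataset, so they receive no combined rank, which mirrors how the analysis treats neighborhoods missing from either source.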

In [3]:
import pandas as pd

In [5]:
arrests = pd.read_csv("arrests.csv")
seizures = pd.read_csv("firearm_seizures.csv")

In [6]:
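# Filter for 2020 by parsing the arrest timestamp and comparing the year,
# rather than matching the substring '2020' anywhere in the string.
# Values that cannot be parsed become NaT and are excluded.
arrest_year = pd.to_datetime(arrests['ARRESTTIME'], errors='coerce').dt.year
arrests_2020 = arrests[arrest_year == 2020]
# Arrests per neighborhood in 2020
apn_2020 = arrests_2020['INCIDENTNEIGHBORHOOD'].value_counts()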

seizures_2020 = seizures[seizures['year'] == 2020]
# Select the count column before summing so non-numeric columns are not aggregated
total_seizures_2020 = seizures_2020.groupby("neighborhood")['total_count'].sum()

In [8]:
# Sort neighborhoods by total (ascending) and assign ranks: rank 1 = fewest incidents
sorted_arrests = apn_2020.sort_values().index.tolist()
arrests_ranks = {neighborhood: i+1 for i, neighborhood in enumerate(sorted_arrests)}

sorted_seizures = total_seizures_2020.sort_values().index.tolist()
seizures_ranks = {neighborhood: i+1 for i, neighborhood in enumerate(sorted_seizures)}
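
One side effect of ranking by position is that tied neighborhoods receive different ranks depending on sort order; in the arrests listing, the four neighborhoods with 3 arrests receive ranks 1 through 4. A possible alternative, which is not what this analysis uses, is pandas' `Series.rank(method="min")`, which gives tied values the same rank. The sketch below applies it to the first six arrest totals from the 2020 listing; the names `totals` and `tie_aware_ranks` are introduced here for illustration.

```python
import pandas as pd

# Arrest totals copied from the first six rows of the 2020 arrests listing.
totals = pd.Series({
    "St. Clair": 3,
    "Regent Square": 3,
    "Outside County": 3,
    "New Homestead": 3,
    "Mt. Oliver Boro": 4,
    "Chartiers City": 5,
})

# method="min" gives every tied neighborhood the lowest rank in its group,
# so the four neighborhoods with 3 arrests all share rank 1.
tie_aware_ranks = totals.rank(method="min").astype(int)
print(tie_aware_ranks.to_dict())
```

With this method the next distinct total picks up at rank 5, not rank 2. Switching to tie-aware ranks would change the combined ranks, so the results reported in this notebook would need to be recomputed.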

In [13]:
# Sum the two ranks for neighborhoods in both datasets; the lowest sums are the "safest"
combined_ranks = {}
for neighborhood in set(sorted_arrests).intersection(sorted_seizures):
    combined_ranks[neighborhood] = arrests_ranks[neighborhood] + seizures_ranks[neighborhood]

top_3_safest_neighborhoods = sorted(combined_ranks, key=combined_ranks.get)[:3]
top_3_ranks = [combined_ranks[neighborhood] for neighborhood in top_3_safest_neighborhoods]

print("Top 3 'Safest' Neighborhoods Overall in Pittsburgh for 2020:")
for i, neighborhood in enumerate(top_3_safest_neighborhoods, 1):
    print(f"{i}. {neighborhood}")

Top 3 'Safest' Neighborhoods Overall in Pittsburgh for 2020:
1. Chartiers City
2. Upper Lawrenceville
3. Lincoln Place


In [15]:
# Show the full rankings for each dataset and the final combined ranks
print("Original Lists with Total Counts, Ranks, and Combined Ranks:")
print("\nArrests (2020):")
for i, neighborhood in enumerate(sorted_arrests, 1):
    print(f"{i}. {neighborhood}: {apn_2020[neighborhood]} arrests, Rank: {arrests_ranks[neighborhood]}")

print("\nSeizures (2020):")
for i, neighborhood in enumerate(sorted_seizures, 1):
    print(f"{i}. {neighborhood}: {total_seizures_2020[neighborhood]} seizures, Rank: {seizures_ranks[neighborhood]}")

# Print combined list with combined ranks for neighborhoods present in both datasets
print("\nCombined List with Combined Ranks (for neighborhoods present in both datasets):")
for i, (neighborhood, rank) in enumerate(combined_ranks.items(), 1):
    print(f"{i}. {neighborhood}: Combined Rank: {rank}")

# Find the top 3 "safest" neighborhoods overall (lowest combined ranks)
top_3_safest_neighborhoods = sorted(combined_ranks, key=combined_ranks.get)[:3]
top_3_ranks = [combined_ranks[neighborhood] for neighborhood in top_3_safest_neighborhoods]

# Print final rankings for top 3 "safest" neighborhoods
print("\nTop 3 'Safest' Neighborhoods Overall in Pittsburgh for 2020:")
for i, neighborhood in enumerate(top_3_safest_neighborhoods, 1):
    print(f"{i}. {neighborhood}: Combined Rank: {top_3_ranks[i-1]}")

# Print neighborhoods that were only in one set
only_in_arrests = set(sorted_arrests) - set(sorted_seizures)
only_in_seizures = set(sorted_seizures) - set(sorted_arrests)

print("\nNeighborhoods Only in Arrests (2020):")
for neighborhood in only_in_arrests:
    print(f"- {neighborhood}")

print("\nNeighborhoods Only in Seizures (2020):")
for neighborhood in only_in_seizures:
    print(f"- {neighborhood}")

Original Lists with Total Counts, Ranks, and Combined Ranks:

Arrests (2020):
1. St. Clair: 3 arrests, Rank: 1
2. Regent Square: 3 arrests, Rank: 2
3. Outside County: 3 arrests, Rank: 3
4. New Homestead: 3 arrests, Rank: 4
5. Mt. Oliver Boro: 4 arrests, Rank: 5
6. Chartiers City: 5 arrests, Rank: 6
7. Oakwood: 7 arrests, Rank: 7
8. Arlington Heights: 7 arrests, Rank: 8
9. East Carnegie: 7 arrests, Rank: 9
10. Outside State: 7 arrests, Rank: 10
11. Swisshelm Park: 7 arrests, Rank: 11
12. Mount Oliver: 8 arrests, Rank: 12
13. Ridgemont: 10 arrests, Rank: 13
14. Allegheny West: 10 arrests, Rank: 14
15. Friendship: 12 arrests, Rank: 15
16. Summer Hill: 14 arrests, Rank: 16
17. Polish Hill: 14 arrests, Rank: 17
18. Fairywood: 14 arrests, Rank: 18
19. Hays: 14 arrests, Rank: 19
20. Windgap: 17 arrests, Rank: 20
21. Spring Garden: 21 arrests, Rank: 21
22. Lower Lawrenceville: 21 arrests, Rank: 22
23. Upper Lawrenceville: 21 arrests, Rank: 23
24. Esplen: 22 arrests, Rank: 24
25. Glen Hazel: 22