# Number of complaints per neighborhood

For this project, I used the Pittsburgh 311 dataset. The goal was to figure out which neighborhood in Pittsburgh is the “best” based on the number of complaints per neighborhood.
My approach was simple: I counted the 311 complaints each neighborhood received in the past year and divided that count by the neighborhood’s population, since larger neighborhoods naturally generate more complaints.
I also considered other ideas, like using total complaints (not fair to big neighborhoods) or weighting complaints by severity (too complex for this project). In the end, I decided that complaints per 1,000 residents was the clearest and most balanced metric.
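
To illustrate why raw totals can mislead, here is a minimal sketch with two made-up neighborhoods. The names and numbers are hypothetical and are not taken from the 311 dataset; they are chosen only to show how the ranking can flip once complaints are normalized by population.

```python
import pandas as pd

# Hypothetical numbers, chosen only to illustrate the effect of normalizing.
toy = pd.DataFrame({
    "neighborhood": ["Large (hypothetical)", "Small (hypothetical)"],
    "complaints": [1500, 400],
    "population": [10000, 1000],
})

# Complaints per 1,000 residents
toy["per_1000"] = toy["complaints"] * 1000 / toy["population"]

# Under each metric, the neighborhood with the lowest value ranks best
best_by_total = toy.sort_values("complaints").iloc[0]["neighborhood"]
best_by_rate = toy.sort_values("per_1000").iloc[0]["neighborhood"]

print("Best by total complaints:", best_by_total)
print("Best by complaints per 1,000 residents:", best_by_rate)
```

In this toy case the small neighborhood wins on total complaints, but the large one wins once population is taken into account.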


# Metric

The metric, complaints per 1,000 residents, measures how often people in each Pittsburgh neighborhood report issues through the city’s 311 system. It’s calculated by dividing the total number of 311 complaints by the neighborhood’s population and multiplying by 1,000. A lower value suggests fewer reported problems or stronger city services. Note that the code in the next section prints the unscaled complaints-per-capita ratio; multiplying by 1,000 gives the per-1,000 figure and leaves the ranking unchanged. The data comes from the 311 Service Requests dataset and City of Pittsburgh neighborhood population data (the 2010 figures in the `Pop__2010` column).
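
Written out as a formula, the calculation looks like this. The symbols $C_n$ and $P_n$ are introduced here only for illustration and do not appear in the notebook:

$$
\text{complaints per 1,000 residents}_n = \frac{C_n}{P_n} \times 1000
$$

Here $C_n$ is the number of 311 complaints recorded for neighborhood $n$ and $P_n$ is its population. For example, a per-capita ratio of 0.066 corresponds to $0.066 \times 1000 = 66$ complaints per 1,000 residents.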

# The Best Neighborhood

In [23]:
import pandas as pd

# Load 311 dataset
complaints = pd.read_csv("311.csv")
# Load demographics dataset (for population)
demographics = pd.read_csv("demographics.csv")
# Filter to only the relevant columns: neighborhood name and 2010 population
demographics = demographics[["SNAP_All_csv_Neighborhood","Pop__2010"]]

# Rename for easy merging
demographics.columns = ['neighborhood', 'population']
# Count complaints per neighborhood
complaint_counts = complaints.groupby('neighborhood').size().reset_index(name='Complaint_Count')

# Merge the population into the dataframe. The default is an inner join,
# so neighborhoods missing from either table are dropped.
merged = pd.merge(complaint_counts, demographics, on='neighborhood')

# Calculate complaints per capita
merged['Complaints_per_capita'] = (merged['Complaint_Count'] / merged['population'])

# Sort neighborhoods (lowest = best)
ranking = merged.sort_values('Complaints_per_capita', ascending=True)

# Display all neighborhoods
print("All Neighborhoods Ranked by Complaints per capita:\n")
# Number rows by their position in the sorted ranking, not by their original index label
for rank, (_, row) in enumerate(ranking.iterrows(), start=1):
    print(f"{rank}. {row['neighborhood']} — {row['Complaints_per_capita']:.3f} complaints per capita")

# Keep only the columns needed for saving
to_save = merged[['neighborhood','Complaints_per_capita']]

to_save.to_csv("311_pc.csv", index=False)

All Neighborhoods Ranked by Complaints per capita:

1. Arlington Heights — 0.066 complaints per capita
2. Northview Heights — 0.072 complaints per capita
3. Glen Hazel — 0.081 complaints per capita
4. Banksville — 0.100 complaints per capita
5. Squirrel Hill North — 0.122 complaints per capita
6. Swisshelm Park — 0.134 complaints per capita
7. Westwood — 0.142 complaints per capita
8. Central Oakland — 0.143 complaints per capita
9. Point Breeze — 0.152 complaints per capita
10. New Homestead — 0.160 complaints per capita
11. Friendship — 0.179 complaints per capita
12. Shadyside — 0.182 complaints per capita
13. Squirrel Hill South — 0.193 complaints per capita
14. Crafton Heights — 0.200 complaints per capita
15. Greenfield — 0.202 complaints per capita
16. Point Breeze North — 0.206 complaints per capita
17. Ridgemont — 0.207 complaints per capita
18. Chartiers City — 0.208 complaints per capita
19. Terrace Village — 0.208 complaints per capita
20. Oakwood — 0.211 complaints 