## Introduction

In this notebook, we identify the best neighborhood in Pittsburgh by combining three metrics:
- Incident Level
- Traffic Signs
- Aggregate Household Income


---

**1. Incident Level** 

Xiang's metric measures incident severity in each neighborhood. Because a higher hierarchy level indicates a more severe incident, the metric sums the hierarchy levels of all incidents in a neighborhood to get a total severity:

In [2]:
import pandas as pd
crime_with_neighborhood = pd.read_csv("crime_with_neighborhood.csv")


crime_weighted = crime_with_neighborhood.groupby("hood")["HIERARCHY"].sum().reset_index(name="CrimeSeverity")

crime_weighted = crime_weighted.sort_values("CrimeSeverity", ascending=False)

crime_weighted.head()

Unnamed: 0,hood,CrimeSeverity
16,Central Business District,178248
71,South Side Flats,133605
15,Carrick,98564
13,Brookline,78351
52,Mount Washington,73313


Then, using `MinMaxScaler`, the totals are scaled to values between 0 and 1. We call the result the `safety score`: the higher the score, the lower the CrimeSeverity.
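As a sketch of what this scaling could look like, assume the score is 1 minus the default `MinMaxScaler` output (range 0 to 1). The notebook does not show this step, so the inversion is an assumption. Here $x_i$ is the CrimeSeverity of neighborhood $i$:

```latex
\text{SafetyScore}_i = 1 - \frac{x_i - \min_j x_j}{\max_j x_j - \min_j x_j}
```

Under this form, the neighborhood with the lowest CrimeSeverity scores 1 and the one with the highest scores 0.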

In [3]:
crime_safety_score = pd.read_csv("crime_safety_scores.csv")

crime_safety_score = crime_safety_score.sort_values("SafetyScore", ascending=False)

crime_safety_score.head()

Unnamed: 0,hood,SafetyScore
28,East Carnegie,1.0
73,St. Clair,0.997511
64,Ridgemont,0.997348
37,New Homestead,0.995976
82,Swisshelm Park,0.992946


The safety score is easier to interpret and more convenient to combine with the other two metrics.

---

**2. Traffic Signs** 

Kim's analysis counts the total number of traffic signs in each neighborhood as a gauge of average congestion. The idea is that stop, yield, and pedestrian signs force drivers to stop on the road, which creates the potential for traffic jams. Although *all* signs were counted, the total should still indicate how many stop, yield, or pedestrian signs a neighborhood has.

Each neighborhood's total was then converted into a traffic score using `MinMaxScaler`. The higher the traffic score, the fewer traffic signs there are, which suggests that the neighborhood is less likely to get congested.

In [4]:
traffic_score = pd.read_csv("traffic_scores.csv")

traffic_score = traffic_score.sort_values("TrafficScore", ascending=False)
traffic_score.head()

Unnamed: 0,hood,SignAmount,TrafficScore
63,Arlington Heights,21,1.0
88,Glen Hazel,38,0.995344
84,Chartiers City,44,0.9937
85,Ridgemont,62,0.98877
50,St. Clair,64,0.988222


Based on this metric, Arlington Heights is the best neighborhood.
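The inverted scaling can be checked against the five rows shown above. The sketch below uses plain Python rather than `MinMaxScaler`, and it rests on two assumptions that the notebook does not state: that the score is 1 minus the min-max scaled sign count, and that the largest sign count in the city is 3672. That maximum is back-solved from the published scores, not read from the data.

```python
# SignAmount values copied from the traffic_score.head() output above.
sign_counts = {
    "Arlington Heights": 21,
    "Glen Hazel": 38,
    "Chartiers City": 44,
    "Ridgemont": 62,
    "St. Clair": 64,
}

MIN_SIGNS = 21    # smallest count shown; ASSUMED to be the city-wide minimum
MAX_SIGNS = 3672  # ASSUMPTION: back-solved from the scores, not in the notebook


def traffic_score(count, lo=MIN_SIGNS, hi=MAX_SIGNS):
    """Inverted min-max scaling: the fewest signs map to 1, the most to 0."""
    return 1.0 - (count - lo) / (hi - lo)


scores = {hood: round(traffic_score(n), 6) for hood, n in sign_counts.items()}
for hood, score in scores.items():
    print(f"{hood}: {score}")
```

If both assumptions hold, the printed values agree with the TrafficScore column to six decimal places.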

---

**3. Aggregate Household Income**

Hongyu's analysis uses the aggregate household income of each neighborhood over a 12-month range. One of the most fundamental components of a neighborhood's quality-of-life potential is economic vitality. In this project, we capture this dimension using aggregate household income: the total income earned by all households in a neighborhood over the past 12 months, adjusted to 2015 dollars to account for inflation.

In [5]:
income_score = pd.read_csv("income_scores.csv")

income_score.head()

Unnamed: 0,Neighborhood,IncomeEstimate,IncomeScore
0,Squirrel Hill South,659886300.0,1.0
1,Shadyside,605635700.0,0.917489
2,Squirrel Hill North,577467500.0,0.874648
3,Brookline,366945900.0,0.554461
4,Point Breeze,314958500.0,0.475392


---

## Final Rubric: Neighborhood Quality of Life Potential

### The relationship with Incident Levels:

Public safety is one of the most immediate and influential aspects of neighborhood livability. In this project, we capture this dimension through incident levels: the hierarchy levels of all reported incidents, including crimes or emergencies, summed within each neighborhood.

Incident levels serve as a direct indicator of neighborhood safety. A higher incident count may signal increased risks to residents, such as theft, violence, or disturbances, whereas a lower count often reflects a safer and more secure living environment. These numbers provide a data-driven basis to assess which neighborhoods may offer more peace of mind and security to their inhabitants.

Neighborhoods with lower incident rates tend to provide:

- Safer streets and public spaces for residents and children,

- Greater appeal for families, students, and long-term investment,

- Reduced stress and anxiety related to crime or emergencies,

- Higher trust in community and law enforcement presence.

Tracking incidents also helps city planners and policymakers identify areas that need improved public safety measures, policing strategies, and social services.

In summary, incident levels reflect the safety and stability of a neighborhood, making them a critical factor in evaluating quality of life. A lower frequency of incidents typically corresponds with stronger community resilience, higher property values, and a more desirable place to live.

### The relationship with Aggregate Household Income:
One of the most fundamental components of the Neighborhood Quality of Life Potential is economic vitality. In this project, we capture this dimension using aggregate household income — the total income earned by all households in a neighborhood over the past 12 months, adjusted to 2015 dollars to account for inflation.

Aggregate income offers a holistic view of a neighborhood’s financial landscape. Rather than focusing on individual wealth or a single household’s income, this metric reflects both the population size and economic strength of the area. A high total income implies a large number of working residents, well-paying jobs, and stronger local purchasing power, all of which contribute to more vibrant and resilient communities.

Economically strong neighborhoods are often equipped with:

- Higher quality schools, parks, and public facilities,

- Well-maintained streets and services,

- A robust local business environment with restaurants, shops, and job opportunities,

- Greater housing stability and long-term investment potential.

Compared to median income, which reflects only the income of the "middle" household, aggregate income gives us insight into the total economic capacity of the neighborhood: how much wealth flows through it and what that means for residents' daily lives.

In summary, aggregate household income is a critical sign of the resources a community has to support well-being, infrastructure, and opportunity, all of which are essential for a high quality of life.


### The relationship with Neighborhood Traffic Infrastructure: 

A key indicator of neighborhood safety, navigability, and pedestrian accessibility is the presence and density of traffic signs. In this project, we assess this dimension by counting the number of traffic signs — including stop signs, pedestrian crossings, yield signs, and more — within each neighborhood’s boundaries.

The total number of traffic signs offers a proxy for how well-regulated and safe a neighborhood’s transportation environment is. A higher number of traffic signs typically indicates more thoughtful urban planning and a greater emphasis on pedestrian and driver safety. It also suggests clearer traffic guidance, reduced accident risks, and a more walkable community.

Neighborhoods with well-developed traffic infrastructure often benefit from:

- Enhanced pedestrian safety through visible and enforced crosswalks,

- Reduced traffic-related accidents and congestion,

- Better mobility for residents, including those walking, biking, or driving,

- Stronger adherence to traffic laws and road etiquette.

Traffic signs also play a role in increasing accessibility for vulnerable populations such as children, the elderly, and people with disabilities. A neighborhood with robust signage helps support inclusive mobility and community flow.

In summary, the density of traffic signs reflects a neighborhood’s investment in public safety and urban infrastructure, making it a vital contributor to the overall quality of life and livability of the area.

---

**Combining Data Frames** 

Now we have a score for each dataset. A higher score stands for fewer incidents, fewer traffic signs, or a higher aggregate household income.

Next, we merge these datasets.

- Rename columns for clarity:

In [6]:
crime_safety_score = crime_safety_score.rename(columns={'hood': 'Neighborhood'})
traffic_score = traffic_score.rename(columns={'hood': 'Neighborhood'})
# income_score already uses 'Neighborhood' as its key column, so no rename is needed.


- Merge all three scores on neighborhood name:

In [7]:
combined = crime_safety_score.merge(traffic_score[['Neighborhood', 'TrafficScore']], on='Neighborhood', how='inner')
combined = combined.merge(income_score[['Neighborhood', 'IncomeScore']], on='Neighborhood', how='inner')

combined.head()

Unnamed: 0,Neighborhood,SafetyScore,TrafficScore,IncomeScore
0,East Carnegie,1.0,0.972336,0.019096
1,St. Clair,0.997511,0.988222,0.002248
2,Ridgemont,0.997348,0.98877,0.008372
3,New Homestead,0.995976,0.973432,0.037526
4,Swisshelm Park,0.992946,0.958368,0.061911
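Because both merges use `how='inner'`, a neighborhood whose name is missing or spelled differently in one file is dropped without warning. One way to check for this is an outer merge with `indicator=True`, sketched here on toy frames. The scores are copied from the outputs above, but the spelling "Saint Clair" is made up to force a mismatch and does not come from the project's CSV files.

```python
import pandas as pd

# Toy frames. "Saint Clair" is a HYPOTHETICAL spelling variant used only to
# demonstrate a mismatch; it is not taken from the project's data.
safety = pd.DataFrame({
    "Neighborhood": ["East Carnegie", "St. Clair", "Ridgemont"],
    "SafetyScore": [1.0, 0.997511, 0.997348],
})
traffic = pd.DataFrame({
    "Neighborhood": ["East Carnegie", "Saint Clair", "Ridgemont"],
    "TrafficScore": [0.972336, 0.988222, 0.98877],
})

# An outer merge keeps every row; the _merge column records where it came from.
audit = safety.merge(traffic, on="Neighborhood", how="outer", indicator=True)
unmatched = audit.loc[audit["_merge"] != "both", ["Neighborhood", "_merge"]]
print(unmatched)

# The inner merge keeps only names present in both frames.
inner = safety.merge(traffic, on="Neighborhood", how="inner")
print(len(inner), "of", len(safety), "neighborhoods survive the inner merge")
```

Running the same audit on the real frames before the inner merge would list any neighborhoods excluded from the final ranking.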


- Compute a combined score (equal weights):

In [8]:
combined['CombinedScore'] = (combined['SafetyScore'] + combined['TrafficScore'] + combined['IncomeScore']) / 3

- Sort by CombinedScore descending:

In [9]:
best_hoods = combined.sort_values('CombinedScore', ascending=False)

best_hoods.head()

Unnamed: 0,Neighborhood,SafetyScore,TrafficScore,IncomeScore,CombinedScore
61,Squirrel Hill North,0.789264,0.486716,0.874648,0.716876
40,Point Breeze,0.8933,0.654067,0.475392,0.674253
4,Swisshelm Park,0.992946,0.958368,0.061911,0.671075
39,Banksville,0.897578,0.907697,0.206381,0.670552
3,New Homestead,0.995976,0.973432,0.037526,0.668978
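The equal-weight average can be recomputed from the rows above, and the same helper shows how sensitive the order is to the choice of weights. This is a sketch: `combined_score` and the 2:1:1 "safety-heavy" weighting are hypothetical illustrations, not part of the original analysis. Only the five displayed rows are used, so the re-ranking says nothing about neighborhoods outside this table.

```python
import pandas as pd

# Rows copied from the best_hoods.head() output above.
top = pd.DataFrame({
    "Neighborhood": ["Squirrel Hill North", "Point Breeze", "Swisshelm Park",
                     "Banksville", "New Homestead"],
    "SafetyScore": [0.789264, 0.8933, 0.992946, 0.897578, 0.995976],
    "TrafficScore": [0.486716, 0.654067, 0.958368, 0.907697, 0.973432],
    "IncomeScore": [0.874648, 0.475392, 0.061911, 0.206381, 0.037526],
})
score_cols = ["SafetyScore", "TrafficScore", "IncomeScore"]


def combined_score(df, weights):
    """Weighted average of the three scores; weights are normalized to sum to 1."""
    w = pd.Series(weights, index=score_cols, dtype=float)
    return df[score_cols].mul(w / w.sum(), axis=1).sum(axis=1)


# Equal weights, as in the notebook.
top["Equal"] = combined_score(top, [1, 1, 1])
# HYPOTHETICAL alternative: safety counts twice as much as the other two.
top["SafetyHeavy"] = combined_score(top, [2, 1, 1])

ranked = top.sort_values("SafetyHeavy", ascending=False)
print(ranked[["Neighborhood", "Equal", "SafetyHeavy"]])
```

With equal weights the sketch matches the CombinedScore column. Under the hypothetical safety-heavy weights, Swisshelm Park moves ahead of Squirrel Hill North among these five rows, so the final ranking depends on the weighting chosen.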


## Conclusion: The Best Neighborhood in Pittsburgh

After analyzing three important aspects — **Crime Safety**, **Traffic Infrastructure**, and **Aggregate Household Income** — we computed a combined score for each Pittsburgh neighborhood using scaled values and equal weights for fairness and balance.

According to our final ranking, the top five neighborhoods are:

1. **Squirrel Hill North**  
2. **Point Breeze**  
3. **Swisshelm Park**  
4. **Banksville**  
5. **New Homestead**

Squirrel Hill North leads with the highest overall score (0.7169), showing a strong balance of safety, traffic infrastructure, and economic strength. While some neighborhoods excel in one area, they may fall short in others. This highlights the value of using a combined metric, which ensures no single factor dominates the outcome.

This approach offers a **data-driven way to identify well-rounded neighborhoods**, helping both residents and decision-makers prioritize areas that perform consistently across quality-of-life dimensions.
