# Project Introduction and Approach

This project analyzes neighborhood park data to identify which neighborhood offers the best park experience. Using the available data, I focused on two quantitative metrics: the number of parks and the total acreage of park space in each neighborhood.

# Metric Description and Dataset

The metric developed to evaluate the best neighborhood combines two key features from the parks dataset:

1. **Number of Parks:** The count of parks within each neighborhood, serving as an indicator of access to green space.

2. **Total Park Acreage:** The sum of park sizes in acres within each neighborhood, reflecting the amount of green space and potential quality of park experience.

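Together, these features capture both how many parks a neighborhood has and how much land they cover. A neighborhood with many small parks can score well on count but poorly on acreage, while one with a few large parks can do the reverse. Acreage is only a rough proxy for quality, since the metric does not use the facility or maintenance attributes.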

## Dataset

The dataset used includes detailed information about parks, such as:

- Park names
- Neighborhood subdivision (`divname`), which is the basis for grouping
- Acreage of each park (`acreage`)
- Additional attributes related to park type, maintenance, and facilities

The combined metric ranks neighborhoods separately by number of parks and by total acreage (rank 1 is best in each), then sums the two ranks. The neighborhood with the lowest rank sum is considered the best overall, since it does well on both the quantity and the size of its parks.
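
To make the rank-sum arithmetic concrete, here is a minimal sketch in plain Python. The neighborhood names and values are invented purely for illustration and do not come from the dataset, and `competition_rank` is a hypothetical helper written for this example. It mirrors the tie handling of `rank(method='min')`, which the analysis below uses.

```python
# Hypothetical neighborhoods and values, invented for illustration only.
parks = {
    "Alder":   {"num_parks": 12, "acres": 40.0},
    "Birch":   {"num_parks": 9,  "acres": 150.0},
    "Cedar":   {"num_parks": 9,  "acres": 90.0},
    "Dogwood": {"num_parks": 3,  "acres": 20.0},
}


def competition_rank(values):
    """Rank so the largest value gets 1; tied values share the lowest rank."""
    return {
        name: 1 + sum(other > value for other in values.values())
        for name, value in values.items()
    }


rank_parks = competition_rank({n: d["num_parks"] for n, d in parks.items()})
rank_acres = competition_rank({n: d["acres"] for n, d in parks.items()})

# Sum the two ranks; the lowest total is the best overall.
rank_sum = {n: rank_parks[n] + rank_acres[n] for n in parks}
best = min(rank_sum, key=rank_sum.get)

print(rank_sum)
print(best)
```

In this toy data, Alder has the most parks but only the third-largest acreage, so Birch (tied for second in parks, first in acreage) ends up with the lowest rank sum.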

In [7]:
import pandas as pd

# Load data
df = pd.read_csv('parks.csv')

# Set pandas option to display all rows
pd.set_option('display.max_rows', None)

# Proceed with neighborhood count logic
neighborhood_col = 'divname'

# Count number of parks in each neighborhood
park_counts = df[neighborhood_col].value_counts().reset_index()
park_counts.columns = ['Neighborhood', 'NumParks']

# Sort neighborhoods by number of parks, descending
park_counts = park_counts.sort_values('NumParks', ascending=False)

# Display all rows
print(park_counts)

# Display best neighborhood
best = park_counts.iloc[0]
print(f"Best neighborhood by number of parks: {best['Neighborhood']} with {best['NumParks']} parks")


  Neighborhood  NumParks
0    Riverview        40
1      Emerald        33
2     McKinley        32
3     Highland        24
4     Schenley        24
5        Frick        22
6    Northeast         1
Best neighborhood by number of parks: Riverview with 40 parks


In [8]:
import pandas as pd

# Load data
df = pd.read_csv('parks.csv')

neighborhood_col = 'divname'
acreage_col = 'acreage'

# Sum acreage by neighborhood
total_acreage = df.groupby(neighborhood_col)[acreage_col].sum().reset_index()
total_acreage = total_acreage.sort_values(by=acreage_col, ascending=False)

print(total_acreage)

best_by_acreage = total_acreage.iloc[0]
print(f"Best neighborhood by total park acreage: {best_by_acreage[neighborhood_col]} with {best_by_acreage[acreage_col]} total acreage")


     divname      acreage
3   McKinley  1114.112050
1      Frick   848.726508
6   Schenley   557.553008
2   Highland   462.161168
5  Riverview   445.134250
0    Emerald   413.046139
4  Northeast     5.413376
Best neighborhood by total park acreage: McKinley with 1114.1120498 total acreage


In [9]:
import pandas as pd

# Load data
df = pd.read_csv('parks.csv')

neighborhood_col = 'divname'
acreage_col = 'acreage'

# Calculate counts and acreage sums by neighborhood
park_counts = df[neighborhood_col].value_counts().reset_index()
park_counts.columns = ['Neighborhood', 'NumParks']

total_acreage = df.groupby(neighborhood_col)[acreage_col].sum().reset_index()

# Merge counts and acreage into one DataFrame
merged = pd.merge(park_counts, total_acreage, left_on='Neighborhood', right_on=neighborhood_col).drop(columns=[neighborhood_col])

# Rank neighborhoods by each criterion (rank 1 = highest value; ties share the lowest rank)
merged['RankParks'] = merged['NumParks'].rank(ascending=False, method='min')
merged['RankAcreage'] = merged[acreage_col].rank(ascending=False, method='min')

# Combine ranks into an overall score (unweighted sum; lower is better)
merged['OverallRank'] = merged['RankParks'] + merged['RankAcreage']

# Sort by overall rank
best_overall = merged.sort_values('OverallRank').reset_index(drop=True)

print(best_overall)

# Best overall neighborhood
top = best_overall.iloc[0]
print(f"Best overall neighborhood: {top['Neighborhood']} with rank score {top['OverallRank']}")


  Neighborhood  NumParks      acreage  RankParks  RankAcreage  OverallRank
0     McKinley        32  1114.112050        3.0          1.0          4.0
1    Riverview        40   445.134250        1.0          5.0          6.0
2     Schenley        24   557.553008        4.0          3.0          7.0
3      Emerald        33   413.046139        2.0          6.0          8.0
4     Highland        24   462.161168        4.0          4.0          8.0
5        Frick        22   848.726508        6.0          2.0          8.0
6    Northeast         1     5.413376        7.0          7.0         14.0
Best overall neighborhood: McKinley with rank score 4.0
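
Highland and Schenley tie at 24 parks, so the choice of `method` in `rank` affects their ranks and the rank of the neighborhood that follows them. The sketch below uses the park counts copied from the output above to compare three of the tie-handling options that pandas provides. It is a side check, not part of the original analysis.

```python
import pandas as pd

# Park counts copied from the printed output; Highland and Schenley tie at 24.
num_parks = pd.Series(
    [40, 33, 32, 24, 24, 22, 1],
    index=["Riverview", "Emerald", "McKinley", "Highland",
           "Schenley", "Frick", "Northeast"],
)

# Compare tie-handling strategies side by side (rank 1 = most parks).
ranks = pd.DataFrame({
    "min": num_parks.rank(ascending=False, method="min"),
    "dense": num_parks.rank(ascending=False, method="dense"),
    "average": num_parks.rank(ascending=False, method="average"),
})

print(ranks)
```

With `method='min'` the tied pair both get rank 4 and Frick gets rank 6. With `'dense'`, Frick moves up to rank 5. With `'average'`, the tied pair each get 4.5. McKinley sits above the tie in park count and acreage has no ties, so its rank sum of 4 is the same under all three methods.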


<h1>Why McKinley is the Best Neighborhood</h1>

<p><strong>McKinley</strong> ranks as the best neighborhood for parks because it has the lowest combined rank (4) across the two metrics:</p>

<ul>
  <li>It has the third-highest <strong>number of parks</strong> (32), behind only Riverview (40) and Emerald (33).</li>
  <li>It has the largest total <strong>acreage of parks</strong> (about 1,114 acres), well ahead of second-place Frick (about 849 acres).</li>
  <li>No other neighborhood places in the top three on both metrics, so McKinley offers the strongest combination of park count and park space for outdoor activities, community events, and enjoying nature.</li>
</ul>

<p>In conclusion, McKinley leads in <strong>total park acreage</strong> and places third in <strong>number of parks</strong>, which gives it the best combined rank and makes it the standout choice for residents seeking the best park experience.</p>
