## Nature
By Radley Lettich

Staying close to nature matters, but in a large city it's a struggle. Which neighborhoods have the most trees, and which have the most water features? Raw counts alone would favor the biggest neighborhoods, so we need something a little more representative: the number of trees and water features per square mile.

In [2]:
# Load the cool stuff
import pandas as pd
import numpy as np

In [3]:
# Create a dataframe for each of the three datasets (neighborhoods, water, and trees)
neighborhood = pd.read_csv("Neighborhoods_.csv")
w = pd.read_csv("WaterFeatures_.csv").dropna()
t = pd.read_csv("Trees_.csv").dropna()

# Make a reduced dataframe of the neighborhood dataset that only contains the square mileage for each community.
# We sort the values now, as we can't sort them while putting them into a list (or at least I don't know how to)
n = neighborhood[['hood','sqmiles']].sort_values('hood')

# Make a list of the neighborhoods, in alphabetical order. We'll use alphabetical order to organize everything.
nbhood = n['hood'].unique().tolist()
sqrmiles = n['sqmiles'].tolist()

# The dataset had a 0 for neighborhood 60 (Perry North), so I filled it in manually here because we cannot divide by zero.
sqrmiles[60] = 1.212

# Make two lists of zeros, one slot per neighborhood, to hold each community's counts.
treesum = [0] * 90
watersum = [0] * 90

# Here, we loop through the trees dataset and find the neighborhood for each tree (column 48).
# Once we know the neighborhood, we find its spot in the treesum list, which matches its position in the nbhood list.
# After that, we just tick that spot up by one. We just counted one tree! Do it about 45000 more times.
for r in range(len(t)):
    hood = t.iloc[r,48] # named hood so we don't overwrite the neighborhood dataframe
    i = nbhood.index(hood)
    treesum[i]+=1

# Aaaaaaaaaaand do it again, but with water features this time! (The neighborhood is in column 7 here.)
for r in range(len(w)):
    hood = w.iloc[r,7] # named hood so we don't overwrite the neighborhood dataframe
    i = nbhood.index(hood)
    watersum[i]+=1

# Now that we've got everything prepared, we're gonna build a 2D array with one row per neighborhood. Making sure we add...
num = []
for i in range(90):
    num.append([
        nbhood[i], # the neighborhood name,
        treesum[i] / sqrmiles[i], # the density of trees (no. of trees / sqrmiles),
        watersum[i] / sqrmiles[i], # the density of water features,
        (treesum[i] + (watersum[i] * 5)) / sqrmiles[i], # and a weighted total (each water feature counts as 5 trees), which will be used for comparison.
    ])

# Now, the pièce de résistance, let's make it into a dataframe!
stats = pd.DataFrame(num, columns=['Neighborhood', 'Tree Density', 'Water Density', 'Total'])

# I looked at it and it looked like a mathematician threw up on a calculator, so let's round it to clean it up.
cleanstats = stats.round(decimals=2)

# Let's sort this bad boy, and see what we got!
cleanstatsbutsorted = cleanstats.sort_values('Total', ascending=False)
cleanstatsbutsorted

| | Neighborhood | Tree Density | Water Density | Total |
|---|---|---|---|---|
| 33 | Friendship | 2867.92 | 0.00 | 2867.92 |
| 18 | Central Northside | 2335.91 | 3.86 | 2355.21 |
| 28 | East Liberty | 2127.37 | 3.44 | 2144.58 |
| 48 | Manchester | 2064.52 | 7.17 | 2100.36 |
| 49 | Marshall-Shadeland | 1813.85 | 8.66 | 1857.14 |
| ... | ... | ... | ... | ... |
| 26 | East Carnegie | 9.26 | 0.00 | 9.26 |
| 30 | Esplen | 8.70 | 0.00 | 8.70 |
| 70 | South Shore | 4.72 | 0.00 | 4.72 |
| 35 | Glen Hazel | 4.39 | 0.00 | 4.39 |
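
As a quick sanity check on the Total column: since (trees + 5 × water) / sqmiles is the same as tree density + 5 × water density, each Total in the output should match that sum, give or take a little rounding. Here is a small sketch that checks the five top rows copied from the output above. The names `rows`, `WATER_WEIGHT`, `weighted_total`, and `max_gap` are just made up for this check and are not part of the notebook.

```python
# Rows copied from the top of the output table:
# (neighborhood, tree density, water density, total)
rows = [
    ("Friendship", 2867.92, 0.00, 2867.92),
    ("Central Northside", 2335.91, 3.86, 2355.21),
    ("East Liberty", 2127.37, 3.44, 2144.58),
    ("Manchester", 2064.52, 7.17, 2100.36),
    ("Marshall-Shadeland", 1813.85, 8.66, 1857.14),
]

WATER_WEIGHT = 5  # the multiplier applied to water features in the cell above


def weighted_total(tree_density, water_density, weight=WATER_WEIGHT):
    """Rebuild the Total score from the two per-square-mile densities."""
    return tree_density + weight * water_density


# Largest difference between the rebuilt score and the printed Total.
# The table values are rounded to 2 decimals, so small gaps are expected.
max_gap = max(
    abs(weighted_total(tree, water) - total) for _, tree, water, total in rows
)

for name, tree, water, total in rows:
    print(f"{name}: rebuilt {weighted_total(tree, water):.2f}, table {total:.2f}")
```

Friendship has no water features at all, so its Total is just its tree density. It wins on trees alone.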


In [4]:
# Let's make it into a ranking! Start by taking the now sorted neighborhood column as a list, like we did at the start.
howdyneighborino = cleanstatsbutsorted['Neighborhood'].tolist()

# A similar 2D array, but we only need two values per row: the neighborhood and its rank, which is just i + 1.
ranking = [[hood, i+1] for i, hood in enumerate(howdyneighborino)]

# Let's take a look!
rankingdf = pd.DataFrame(ranking, columns=['Neighborhood', 'Ranking'])
rankingdf

| | Neighborhood | Ranking |
|---|---|---|
| 0 | Friendship | 1 |
| 1 | Central Northside | 2 |
| 2 | East Liberty | 3 |
| 3 | Manchester | 4 |
| 4 | Marshall-Shadeland | 5 |
| ... | ... | ... |
| 85 | East Carnegie | 86 |
| 86 | Esplen | 87 |
| 87 | South Shore | 88 |
| 88 | Glen Hazel | 89 |
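
For what it's worth, the whole count, divide, and rank routine from the two cells above can probably be done without any explicit loops by letting pandas do the counting. This is only a sketch on toy data, not the code that produced the results here: the three neighborhoods, their numbers, and the `neighborhood` column name are all made up (the real datasets are read by column position, 48 and 7, so the real column names are not known).

```python
import pandas as pd

# Toy stand-ins for the three CSVs; every name and number here is made up.
hoods = pd.DataFrame({"hood": ["Alpha", "Beta", "Gamma"],
                      "sqmiles": [0.5, 2.0, 1.0]})
trees = pd.DataFrame({"neighborhood": ["Alpha", "Alpha", "Beta", "Gamma", "Alpha"]})
water = pd.DataFrame({"neighborhood": ["Beta", "Gamma", "Gamma"]})

stats = hoods.set_index("hood")

# value_counts tallies rows per neighborhood; reindex lines the tallies up with
# the neighborhood table and puts a 0 wherever a neighborhood has no rows.
stats["trees"] = trees["neighborhood"].value_counts().reindex(stats.index, fill_value=0)
stats["water"] = water["neighborhood"].value_counts().reindex(stats.index, fill_value=0)

# Same formulas as the notebook: counts per square mile, water weighted by 5.
stats["Tree Density"] = stats["trees"] / stats["sqmiles"]
stats["Water Density"] = stats["water"] / stats["sqmiles"]
stats["Total"] = (stats["trees"] + 5 * stats["water"]) / stats["sqmiles"]

# Sort by the weighted total and number the rows 1, 2, 3, ... for the ranking.
stats = stats.sort_values("Total", ascending=False)
stats["Ranking"] = list(range(1, len(stats) + 1))

print(stats[["Tree Density", "Water Density", "Total", "Ranking"]])
```

One nice side effect of counting by name instead of by list position is that nothing depends on there being exactly 90 neighborhoods.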


# Conclusion

I, personally, like to be close to nature. I have a tree right outside my window. Using a little math, we determined the density of trees and water features in each Pittsburgh neighborhood and compared them. We counted the trees and water features in each neighborhood, weighted each water feature as five trees, and divided by the neighborhood's square mileage to get something more representative than a raw count. In the end, I concluded that the most natural community in Pittsburgh is **Friendship.**