# Final Analysis
When thinking about how to rank the neighborhoods of Pittsburgh, we took a multifaceted approach. The importance of any given metric can vary widely between individuals. To account for this variance in our final metric, we as a group chose three qualities of a neighborhood that are distinct enough to approximate a wide range of preferences: walkability, education, and employment opportunities.

## Walkability
As described in greater detail in `Notebook_Cassano.ipynb`, walkability in our analysis comes from a WPRDC dataset that compares the amount of sidewalk distance to the amount of road distance in a given neighborhood. From this, we compute a ratio that serves as a walkability measurement: areas with more sidewalk along their roads can be considered more walkable than areas that require cars.
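
To make the ratio concrete, here is a minimal sketch of the idea. It is not the code from `Notebook_Cassano.ipynb`: the function name, the neighborhood labels, and the lengths are all hypothetical, made up purely for illustration.

```python
def walkability_ratio(sidewalk_length: float, road_length: float) -> float:
    """Sidewalk distance divided by road distance (both in the same unit)."""
    if road_length <= 0:
        raise ValueError("road_length must be positive")
    return sidewalk_length / road_length


# Hypothetical (sidewalk, road) lengths; these are NOT values from the WPRDC dataset.
example_lengths = {
    "Neighborhood A": (12.0, 10.0),
    "Neighborhood B": (3.0, 10.0),
}

ratios = {
    name: walkability_ratio(sidewalk, road)
    for name, (sidewalk, road) in example_lengths.items()
}

# A higher ratio means more sidewalk per unit of road, so sort in descending order.
ranked = sorted(ratios, key=ratios.get, reverse=True)
print(ranked)
```

Under this reading, a neighborhood with more sidewalk per unit of road ends up nearer the top of the walkability ranking.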

After completing the statistical analysis, the following neighborhoods came out on top, with the highest walkability values. For the full results, visualizations, and detailed process, refer to `Notebook_Cassano.ipynb`.

| Rank | Neighborhood     |
|------|------------------|
| 1    | Terrace Village  |
| 2    | North Shore      |
| 3    | Allegheny Center |
| 4    | North Oakland    |
| 5    | Larimer          |

## Education

## Employment Opportunities


## Final Metric - RankRank
With our three individual ratings, our next task was to incorporate them into a final, composite ranking. Rather than make an arbitrary choice, we used a simple statistical method. We felt that each dataset was equally important to the overall quality of a neighborhood, so the final ranking should be a simple, unweighted combination of the three: an average. Our final metric is a ranking of the *average* ordinal rank of each neighborhood across the three datasets. Neighborhoods that do not appear in all three datasets are left out.

For example, if a neighborhood was ranked 1st in walkability, 3rd in education, and 4th in employment opportunities, then its value for our composite ranking is 2.67, the average of the three sub-metric ranks. These composite values are then ordered from lowest to highest to produce our final ranking, found below. Since it's a ranking of the ranks, let's call it RankRank!
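
As a quick check of the arithmetic, the sketch below reproduces the 1st/3rd/4th example and then orders three neighborhoods by their average rank. The variable names are illustrative only; the three rank lists are copied from the dictionary printed by the cell below.

```python
import statistics

# The worked example: ranked 1st, 3rd, and 4th in the three sub-metrics.
example_ranks = [1, 3, 4]
example_composite = statistics.mean(example_ranks)
print(round(example_composite, 2))

# Rank lists [walkability, education, employment] taken from the printed output.
sample_ranks = {
    "Highland Park": [22, 22, 11],
    "Bloomfield": [9, 32, 6],
    "Point Breeze": [16, 27, 10],
}

# Average each list, then order lowest (best) to highest.
composite = {name: statistics.mean(ranks) for name, ranks in sample_ranks.items()}
ordered = sorted(composite, key=composite.get)
print(ordered)
```

Because every sub-metric rank carries the same weight, a neighborhood does not need to lead any single category to place well; it only needs to avoid a very poor rank in all three.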

In [1]:
import pandas as pd
import statistics

# import the walkability, enrollment, and employment rankings
wlkRank = pd.read_csv("Datasets/walkability_named_clean.csv")
enrRank = pd.read_csv("Datasets/enrollment_clean.csv")
jobRank = pd.read_csv("Datasets/employment_clean.csv")

# set up dictionary
rankArrays = dict()

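# iterate through walk rank, starting each neighborhood's list of rankings
# (row.iloc[0] reads the first column by position; plain row[0] is deprecated on label-indexed rows)
for index, row in wlkRank.iterrows():
    if row.iloc[0] not in rankArrays:
        rankArrays[row.iloc[0]] = [index+1]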
        
# iterate through enr rank, adding to array of rankings only for neighborhoods that exist in walk rank        
for index, row in enrRank.iterrows():
    if row['neighborhood'] not in rankArrays:
        continue
    else:
        rankArrays[row['neighborhood']].append(index+1)
        
#iterate through job rank, adding to array of rankings only for neighborhoods that exist in walk rank        
for index, row in jobRank.iterrows():
    if row['Neighborhood'] not in rankArrays:
        continue
    else:
        rankArrays[row['Neighborhood']].append(index+1)
    
print(rankArrays)

# remove neighborhoods not found in all 3 datasets
for key in list(rankArrays):
    if len(rankArrays[key]) != 3:
        rankArrays.pop(key)
    
rankRank = dict()

# generate the average of the rankings in each neighborhood's list of rankings
for key in rankArrays:
    if key not in rankRank:
        rankRank[key] = statistics.mean(rankArrays[key])
        
# make a new dataframe from this composite ranking
rankRankDF = pd.DataFrame.from_dict(rankRank,orient='index',columns=['Average Rank'])
rankRankDF = rankRankDF.sort_values(by='Average Rank', ascending=True)
# show the top 10, organized from lowest to highest average rank
rankRankDF.head(10)


{'Terrace Village': [1, 23, 71], 'North Shore': [2, 27], 'Allegheny Center': [3, 59, 49], ' North Oakland': [4], 'Larimer': [5, 33, 78], 'Garfield': [6, 13, 53], 'Lawrenceville': [7], 'South Side Flats': [8, 54, 5], 'Bloomfield': [9, 32, 6], 'Shadyside': [10, 41, 14], 'Crawford-Roberts': [11, 38, 77], 'Squirrel Hill North': [12, 25, 26], 'East Liberty': [13, 15, 51], 'Lincoln': [14], 'Friendship': [15, 72, 32], 'Point Breeze': [16, 27, 10], 'Golden Triangle': [17, 85], 'Homewood North': [18, 8, 61], 'Arlington': [19, 42, 60], 'South Oakland': [20, 62, 57], 'Knoxville': [21, 6, 50], 'Highland Park': [22, 22, 11], 'Central Oakland': [23, 86, 45], 'West Oakland': [24, 71, 73], 'Lawrencecville': [25], 'South Shore': [26], 'Stanton Heights': [27, 26, 24], 'Greenfield': [28, 16, 13], 'Upper Hill': [29, 49, 70], 'Squirrel Hill South': [30, 5, 28], 'Morningside': [31, 45, 15], 'Mount Washington': [32, 21, 8], 'Allentown': [33, 29, 65], 'Brighton Heights': [34, 7, 22], 'Beltzhoover': [35, 34, 5

| Neighborhood        | Average Rank |
|---------------------|--------------|
| Bloomfield          | 15.666667    |
| Point Breeze        | 17.666667    |
| Highland Park       | 18.333333    |
| Greenfield          | 19.0         |
| Mount Washington    | 20.333333    |
| Brookline           | 20.666667    |
| Brighton Heights    | 21.0         |
| Squirrel Hill South | 21.0         |
| Squirrel Hill North | 21.0         |
| Shadyside           | 21.666667    |


By our final metric, RankRank, the best neighborhoods to live in are Bloomfield, Point Breeze, and Highland Park: they have the lowest average rank across our three metrics.
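
One detail worth noting in the top 10 is that Brighton Heights, Squirrel Hill South, and Squirrel Hill North all share an average rank of 21.0, so their relative order in the sorted output carries no meaning. The sketch below shows one possible way to assign shared places to tied neighborhoods (so-called competition ranking). This is an assumption about how ties could be presented, not something the analysis above does; the averages are copied from the results above.

```python
# Average ranks copied from the top-10 results.
averages = {
    "Bloomfield": 15.666667,
    "Point Breeze": 17.666667,
    "Highland Park": 18.333333,
    "Greenfield": 19.0,
    "Mount Washington": 20.333333,
    "Brookline": 20.666667,
    "Brighton Heights": 21.0,
    "Squirrel Hill South": 21.0,
    "Squirrel Hill North": 21.0,
    "Shadyside": 21.666667,
}

# Sort lowest (best) to highest, then give tied values the same place.
ordered = sorted(averages.items(), key=lambda item: item[1])
places = {}
for position, (name, value) in enumerate(ordered, start=1):
    if position > 1 and value == ordered[position - 2][1]:
        # Same average as the previous neighborhood: reuse its place.
        places[name] = places[ordered[position - 2][0]]
    else:
        places[name] = position

for name, _ in ordered:
    print(places[name], name)
```

With this scheme the three tied neighborhoods share 7th place and Shadyside follows in 10th.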

## Reflections

### Nick


### Bella

### Brit