## Introduction

The submetric reviewed here is traffic by neighborhood. Most people do not enjoy traffic, so in theory, the less traffic a neighborhood has, the higher its quality of life will be.

Data was obtained from http://www.wprdc.org/. The one data set used for this submetric is titled "Traffic Count Data."

By cleaning the data and finding the neighborhoods with the least amount of traffic, we can determine a quality of life ranking based on this submetric.

#### Creating DataFrame

In [70]:
import pandas as pd

# Traffic count CSV downloaded from
# https://data.wprdc.org/dataset/traffic-count-data-city-of-pittsburgh
traffic = pd.read_csv("6dfd4f8f-cbf5-4917-a5eb-fd07f4403167.csv")

traffic.head()

| | id | device_id | record_oid | count_start_date | count_end_date | average_daily_car_traffic | average_daily_bike_traffic | counter_number | counter_type | speed_limit | ... | longitude | latitude | neighborhood | council_district | ward | tract | public_works_division | pli_division | police_zone | fire_zone |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1011743669 | 85 | 1445865000.0 | 2019-04-18 | 2019-04-26 | 4949.0 | | 6.0 | StatTrak | 25.0 | ... | -79.967772 | 40.455733 | Polish Hill | 7.0 | 6.0 | 42003060500 | 6.0 | 6.0 | 2.0 | 2-6 |
| 1 | 1026101993 | 140 | 1121444000.0 | 2019-01-24 | | | | | Intersection Study | | ... | -79.952249 | 40.466157 | Central Lawrenceville | 7.0 | 9.0 | 42003090200 | 2.0 | 9.0 | 2.0 | 3-6 |
| 2 | 1032382575 | 11 | 1539893000.0 | 2018-08-28 | 2018-09-04 | | | | | 35.0 | ... | -80.076469 | 40.460717 | Windgap | 2.0 | 28.0 | 42003563000 | 5.0 | 28.0 | 6.0 | 1-16 |
| 3 | 103627606 | 9 | 734195100.0 | 2018-07-17 | 2018-08-01 | 2741.0 | | | StatTrak | 25.0 | ... | -79.914335 | 40.437379 | Squirrel Hill South | 5.0 | 14.0 | 42003140800 | 3.0 | 14.0 | 4.0 | 2-18 |
| 4 | 1039546167 | 144 | | | | | | | | | ... | -80.019211 | 40.490794 | Perry North | 1.0 | 26.0 | 42003260200 | 1.0 | 26.0 | 1.0 | 1-15 |
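
The preview shows several rows with no `average_daily_car_traffic` value. Before cleaning, it can help to count how many rows are affected. The following is a minimal sketch on a hypothetical five-row frame named `sample`, built from the two columns of the preview that matter here; it is not the full data set.

```python
import pandas as pd

# Hypothetical miniature of the traffic table: neighborhood names and
# car counts copied from the five preview rows; None marks a missing value.
sample = pd.DataFrame({
    "neighborhood": ["Polish Hill", "Central Lawrenceville", "Windgap",
                     "Squirrel Hill South", "Perry North"],
    "average_daily_car_traffic": [4949.0, None, None, 2741.0, None],
})

# Rows with no car-traffic value
missing = sample["average_daily_car_traffic"].isna().sum()

# Rows that would remain after dropping the missing ones
usable = sample["average_daily_car_traffic"].notna().sum()

print(f"missing: {missing}, usable: {usable}")
```

Running the same two counts on the real `traffic` frame would show how much of the data set the cleaning step keeps.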


#### Cleaning the data

In [71]:

# Drop rows that have no value in the column we are focusing on
traffic.dropna(subset=["average_daily_car_traffic"], inplace=True)

# Build a dictionary mapping each neighborhood to its summed car count.
# The original data set lists some neighborhoods more than once
# (counts from different years), so the counts are combined to leave
# one entry per neighborhood.

trafficDict = dict()
i = 0
while i < len(traffic):
    try:
        neighborhood = traffic.loc[i, "neighborhood"]
        if neighborhood in trafficDict:
            trafficDict[neighborhood] += traffic.at[i, "average_daily_car_traffic"]
        else:
            trafficDict[neighborhood] = traffic.at[i, "average_daily_car_traffic"]
    except KeyError:
        # Index label i was removed by dropna above, so skip it.
        # Note: i only runs up to len(traffic) - 1, so surviving rows
        # with larger index labels are not visited by this loop.
        pass
    i += 1
print(trafficDict)

# Turn the dictionary into a pandas DataFrame
cleaned = pd.DataFrame(list(trafficDict.items()), columns=["neighborhood", "average_daily_car_traffic"])
cleaned.head()

{'Polish Hill': 15894.0, 'Squirrel Hill South': 45484.0, 'Central Northside': 1946.0, 'Bluff': 5365.0, 'Crafton Heights': 11500.0, 'Shadyside': 18801.0, 'Highland Park': 23471.0, 'North Shore': 10350.0, 'East Liberty': 51247.0, 'Mount Washington': 43715.0, 'Brookline': 14718.0, 'Squirrel Hill North': 17649.0, 'Bloomfield': 41920.0, 'Larimer': 31987.0, 'Friendship': 4887.0, 'Point Breeze': 24125.0, 'Regent Square': 16729.0, 'Central Lawrenceville': 9145.0, 'Knoxville': 527.0, 'Central Oakland': 4158.0, 'Strip District': 29075.0, 'Greenfield': 976.0, 'Windgap': 3062.0, 'Beechview': 4255.0, 'Upper Hill': 2860.0, 'Stanton Heights': 16272.0, 'Manchester': 11344.0, 'South Side Slopes': 9114.0, 'Perry North': 8987.0, 'North Oakland': 23821.0, 'East Hills': 13788.0, 'Duquesne Heights': 5831.0, 'Sheraden': 7180.0, 'Morningside': 5508.0, 'Central Business District': 2305.0, 'St. Clair': 2436.0, 'Perry South': 6943.0, 'Elliott': 3765.0, 'Carrick': 3457.0, 'Westwood': 15400.0, 'Allegheny Center': 

| | neighborhood | average_daily_car_traffic |
|---|---|---|
| 0 | Polish Hill | 15894.0 |
| 1 | Squirrel Hill South | 45484.0 |
| 2 | Central Northside | 1946.0 |
| 3 | Bluff | 5365.0 |
| 4 | Crafton Heights | 11500.0 |


The average of the separate values from each neighborhood could also be used, but the sum works just as well for sorting the data.
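
Both the sum and the average can be computed in one step with `groupby`, which groups rows by the value in `neighborhood` rather than by index label, so it includes every remaining row even when the index has gaps after `dropna`. The sketch below uses a hypothetical frame named `sample` with made-up counts and a deliberately gapped index; on the real data, the totals could differ from those printed by the loop above.

```python
import pandas as pd

# Hypothetical rows with made-up counts; the index has gaps,
# as it does after dropna() removes rows.
sample = pd.DataFrame(
    {
        "neighborhood": ["Polish Hill", "Squirrel Hill South",
                         "Polish Hill", "Knoxville"],
        "average_daily_car_traffic": [4949.0, 2741.0, 1000.0, 527.0],
    },
    index=[0, 3, 7, 12],
)

# One row per neighborhood, with both the summed and the mean count
per_neighborhood = (
    sample.groupby("neighborhood")["average_daily_car_traffic"]
    .agg(["sum", "mean"])
    .reset_index()
)

print(per_neighborhood)
```

Sorting `per_neighborhood` by either the `sum` or the `mean` column then gives a ranking; the two orderings can differ when neighborhoods have different numbers of count records.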

#### Sorting Data

In [72]:
# sorts by car traffic from least to greatest
cleaned.sort_values("average_daily_car_traffic", ascending=True, inplace=True)
cleaned.head(10)

| | neighborhood | average_daily_car_traffic |
|---|---|---|
| 43 | Allegheny West | 477.0 |
| 18 | Knoxville | 527.0 |
| 42 | Overbrook | 777.0 |
| 21 | Greenfield | 976.0 |
| 53 | Spring Garden | 1226.0 |
| 50 | Beltzhoover | 1577.0 |
| 41 | East Allegheny | 1924.0 |
| 2 | Central Northside | 1946.0 |
| 34 | Central Business District | 2305.0 |
| 40 | Allegheny Center | 2386.0 |
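
To turn the sorted values into an explicit ranking, a rank column can be added. This is a minimal sketch using the first three rows of the output above; the frame name `top` and the column name `traffic_rank` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# First three rows of the sorted output, re-entered by hand
top = pd.DataFrame({
    "neighborhood": ["Allegheny West", "Knoxville", "Overbrook"],
    "average_daily_car_traffic": [477.0, 527.0, 777.0],
})

# Rank 1 = least traffic; method="min" gives tied neighborhoods the same rank
top["traffic_rank"] = (
    top["average_daily_car_traffic"].rank(method="min").astype(int)
)

# nsmallest returns the n lowest rows without sorting the frame in place
lowest_two = top.nsmallest(2, "average_daily_car_traffic")

print(top)
print(lowest_two)
```

A rank column like this makes it easier to combine the traffic submetric with other submetrics that use different units.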


## Conclusion


After cleaning and sorting the data, a ranking of the neighborhoods with the least traffic has been produced. Based on this ranking, Allegheny West has the lowest total traffic and would therefore have the highest quality of life by this submetric.