# Traffic In Pittsburgh

### By: Anna Cavanaugh and Liam Sullivan

One of the metrics we considered for the most aesthetically pleasing neighborhood is traffic: the less traffic a neighborhood has, the better it scores. This is because traffic is loud, annoying, and just plain ugly.

In [38]:
import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt

First, let's read in our data set.

In [39]:
traffic = pd.read_csv("traffic.csv")

Now, let's find the total daily car traffic per neighborhood.

In [40]:
# Select the traffic column before summing so only that column is aggregated
temp = traffic.groupby('neighborhood')['average_daily_car_traffic'].sum()
temp = temp.sort_values(ascending=True)
temp

neighborhood
Middle Hill                0.0
Crawford-Roberts           0.0
South Oakland              0.0
Banksville                 0.0
Upper Lawrenceville        0.0
                        ...   
Bloomfield             43101.0
Mount Washington       47336.0
East Liberty           51247.0
Shadyside              66132.0
Squirrel Hill South    79302.0
Name: average_daily_car_traffic, Length: 74, dtype: float64

This is a good start. However, some of the neighborhoods in the file are missing data, so let's remove all neighborhoods with no data from consideration.

In [46]:
# Keep only the counters that recorded a positive daily car count
query = traffic['average_daily_car_traffic'] > 0
traffic = traffic[query]
temp = traffic.groupby('neighborhood')['average_daily_car_traffic'].sum()
temp = temp.sort_values(ascending=True)
temp

neighborhood
South Side Flats              115.0
Lincoln Place                 126.0
Lincoln-Lemington-Belmar      195.0
Fineview                      419.0
Allegheny West                477.0
                             ...   
Bloomfield                  43101.0
Mount Washington            47336.0
East Liberty                51247.0
Shadyside                   66132.0
Squirrel Hill South         79302.0
Name: average_daily_car_traffic, Length: 64, dtype: float64
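
One plausible reason the first table showed totals of 0.0 is that pandas skips missing values when summing, so a neighborhood whose counters are all blank adds up to zero. The sketch below illustrates that behavior and the effect of the `> 0` filter on a small made-up table; the neighborhood names `A`, `B`, `C` and their values are hypothetical and are not taken from `traffic.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for traffic.csv
toy = pd.DataFrame({
    "neighborhood": ["A", "A", "B", "C"],
    "average_daily_car_traffic": [100.0, 300.0, np.nan, 50.0],
})

# sum() skips NaN, so a neighborhood with only missing counts totals 0.0
totals = toy.groupby("neighborhood")["average_daily_car_traffic"].sum()

# NaN > 0 evaluates to False, so rows with missing counts are dropped too
counted = toy[toy["average_daily_car_traffic"] > 0]
filtered_totals = counted.groupby("neighborhood")["average_daily_car_traffic"].sum()

print(totals)
print(filtered_totals)
```

Note that this filter would also drop a counter that genuinely recorded zero cars, so it treats "zero" and "missing" the same way.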

Now that we've removed the neighborhoods that there is no data for, we need to divide each neighborhood's total number of cars by the number of traffic counters in that neighborhood to get an average per counter.

In [80]:
# Sort the data alphabetically by neighborhood
traffic = traffic.sort_values('neighborhood')
temp = traffic.groupby('neighborhood')['average_daily_car_traffic'].sum()
# Count the traffic counters (rows) in each neighborhood
temp2 = traffic['neighborhood'].value_counts()
temp2 = temp2.sort_index()
# Divide the number of cars by the number of counters
temp = (temp / temp2).round()
temp

neighborhood
Allegheny Center     2386.0
Allegheny West        477.0
Arlington            4571.0
Beechview            4255.0
Beltzhoover          1577.0
                     ...   
Summer Hill           937.0
Upper Hill           2860.0
West Oakland         2585.0
Westwood            15400.0
Windgap              3062.0
Name: average_daily_car_traffic, Length: 64, dtype: float64
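
Dividing each neighborhood's total by its row count is the same as taking a per-group mean, as long as no missing values remain in the column. As a minimal sketch on made-up data (the names `A`, `B`, `C` and their values are hypothetical), both routes give the same result:

```python
import pandas as pd

# Hypothetical toy data: one row per traffic counter
toy = pd.DataFrame({
    "neighborhood": ["A", "A", "B", "B", "B", "C"],
    "average_daily_car_traffic": [100.0, 300.0, 10.0, 20.0, 60.0, 50.0],
})

col = toy.groupby("neighborhood")["average_daily_car_traffic"]

# Manual route: total cars divided by number of counters (rows)
manual = (col.sum() / toy["neighborhood"].value_counts()).round()

# Direct route: per-group mean (equivalent here because no values are NaN)
direct = col.mean().round()

print(manual.sort_index())
print(direct.sort_index())
```

The division aligns the two Series on the neighborhood index, so sorting `temp2` beforehand is not strictly required for the arithmetic to line up.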

In [84]:
temp.sort_values()

neighborhood
South Side Flats              115.0
Lincoln Place                 126.0
Lincoln-Lemington-Belmar      195.0
Fineview                      419.0
Allegheny West                477.0
                             ...   
East Hills                   8126.0
Strip District               9692.0
North Shore                 10350.0
Crafton Heights             11500.0
Regent Square               16729.0
Length: 64, dtype: float64

This is much better. Now we have data that represents the average number of cars per counter in each neighborhood, excluding those neighborhoods with no data.
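
Since the metric favors the least traffic, the quietest neighborhoods can be pulled out directly instead of reading them off the sorted output. The following is only a sketch: `per_counter` is a hypothetical stand-in for the `temp` Series, with made-up names and values.

```python
import pandas as pd

# Hypothetical per-counter averages standing in for `temp`
per_counter = pd.Series(
    {"A": 200.0, "B": 30.0, "C": 50.0, "D": 1200.0},
    name="cars_per_counter",
)

# nsmallest returns the n lowest values, ordered from lowest upward
quietest = per_counter.nsmallest(2)
print(quietest)
```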