## Speed Cameras and Red Light Cameras: Preprocessing and Initial Analysis with Comparisons

This notebook will explore:

   1. Managing and comparing the two dataframes
   2. Features that can be extracted for further interest
   3. Basic comparisons and inferences with other datasets

## Basic Data Exploration

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
fixed = pd.read_csv("Fixed_Speed_Cameras.csv")
fixed

Unnamed: 0,address,direction,street,crossStreet,intersection,Location 1
0,S CATON AVE & BENSON AVE,N/B,Caton Ave,Benson Ave,Caton Ave & Benson Ave,"(39.2693779962, -76.6688185297)"
1,S CATON AVE & BENSON AVE,S/B,Caton Ave,Benson Ave,Caton Ave & Benson Ave,"(39.2693157898, -76.6689698176)"
2,WILKENS AVE & PINE HEIGHTS AVE,E/B,Wilkens Ave,Pine Heights,Wilkens Ave & Pine Heights,"(39.2720252302, -76.676960806)"
3,THE ALAMEDA & E 33RD ST,S/B,The Alameda,33rd St,The Alameda & 33rd St,"(39.3285013141, -76.5953545714)"
4,E 33RD ST & THE ALAMEDA,E/B,E 33rd,The Alameda,E 33rd & The Alameda,"(39.3283410623, -76.5953594625)"
5,ERDMAN AVE & N MACON ST,E/B,Erdman,Macon St,Erdman & Macon St,"(39.3068045671, -76.5593167803)"
6,ERDMAN AVE & N MACON ST,W/B,Erdman,Macon St,Erdman & Macon St,"(39.306966535, -76.5593122365)"
7,N CHARLES ST & E LAKE AVE,S/B,Charles,Lake Ave,Charles & Lake Ave,"(39.3690535299, -76.625826716)"
8,E MADISON ST & N CAROLINE ST,W/B,Madison,Caroline St,Madison & Caroline St,"(39.2993257666, -76.5976760827)"
9,ORLEANS ST & N LINWOOD AVE,E/B,Orleans,Linwood Ave,Orleans & Linwood Ave,"(39.2958661981, -76.5764270078)"


In [3]:
red = pd.read_csv("Red_Light_Cameras.csv")
red

Unnamed: 0,address,enforcement,installationDate,direction,street,block,crossStreet,intersection,Location 1
0,HARFORD RD & CHRISTOPHER AVE,Caton Avenue,11/04/2009,NB,Harford Road\n,6100,Christopher Ave,Harford Road\n & Christopher Ave,"(39.358332711, -76.5562473336)"
1,E 33RD ST & THE ALAMEDA,Caton Avenue,10/01/2009,EB,33rd \n,1300,The Alameda,33 rd & The Alameda,"(39.3283169575, -76.596564824)"
2,THE ALAMEDA & E 33RD ST,Charles Street,10/01/2009,SB,Alameda \n,3300,33rd St,Almeda & 33rd St,"(39.3285013787, -76.5953547798)"
3,S CATON AVE & BENSON AVE,Cold Spring Lane,10/30/2009,NB,Caton Ave \n,1100,Benson Ave,Caton Ave & Benson Ave,"(39.2693780607, -76.6688187378)"
4,S CATON AVE & BENSON AVE,Cold Spring Lane,02/03/2010,SB,Caton Ave \n,1000,Benson Ave,Caton Ave & Benson Ave,"(39.2693158543, -76.6689700257)"
5,N CHARLES ST & E LAKE AVE,Cold Spring Lane,05/20/2010,NB,Charles St. \n,5800,Lake Ave,Charles St. & Lake Ave,"(39.3690535944, -76.6258269246)"
6,E COLD SPRING LN & HILLEN RD,Cold Spring Lane,03/22/2010,WB,Cold Spring Lane \n,1700,Hillen Road,Cold Spring Lane & Hillen Road,"(39.3459075364, -76.5859275989)"
7,W COLD SPRING LN & TAMARIND RD,Eastern Avenue,11/02/2009,EB,Cold Spring Lane \n,2200,Tamarind,Cold Spring Lane & Tamarind,"(39.3438993775, -76.6519410001)"
8,W COLD SPRING LN & ROLAND AVE,Edmondson Avenue,11/04/2009,EB,Cold Spring Lane\n,500,Roland Ave,Cold Spring Lane & Roland Ave,"(39.3439061246, -76.6354264269)"
9,E COLD SPRING LN & LOCH RAVEN BLVD,Edmondson Avenue,03/04/2010,WB,Cold Spring Lane\n,1500,Loch Raven Blvd,Cold Spring Lane & Loch Raven Blvd,"(39.3461802891, -76.5919910533)"


The two CSV files list the 80 speed cameras and 62 red light cameras, along with their respective streets and orientations. In this raw form, however, the data says little about the overall traffic picture in each area.
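
Two of the columns shown above would need cleaning before they can serve as features: `Location 1` holds the coordinates as a single string, and the `street` values in the red light table carry a trailing newline. The following is a minimal sketch of how both could be tidied; the helper names `parse_location` and `clean_street` are hypothetical and not part of the original notebook.

```python
def parse_location(raw):
    """Turn a string like "(39.26, -76.66)" into a (lat, lon) tuple of floats."""
    lat, lon = raw.strip().strip("()").split(",")
    return float(lat), float(lon)


def clean_street(raw):
    """Drop leading/trailing whitespace, including a trailing newline."""
    return raw.strip()


# Values copied from the first rows of the two tables above.
print(parse_location("(39.2693779962, -76.6688185297)"))
print(repr(clean_street("Harford Road\n")))
```

Applied to the dataframes, this would likely look like `fixed["Location 1"].map(parse_location)` and `red["street"].map(clean_street)`.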

In [5]:
# Count speed cameras per address; reset_index() already returns a DataFrame
fixed_count = fixed.address.value_counts().reset_index()
fixed_count.columns = ['Address', 'Speed Cameras']
fixed_count

Unnamed: 0,Address,Speed Cameras
0,W NORTHERN PKWY & GREENSPRING AVE,2
1,ERDMAN AVE & N MACON ST,2
2,PARK HEIGHTS AVE & HAYWARD AVE,2
3,EDMONDSON AVE & N ATHOL AVE,2
4,S CATON AVE & BENSON AVE,2
5,RUSSELL ST & W HAMBURG ST,2
6,LIBERTY HTS & HILLSDALE RD,2
7,GWYNNS FLS & GARRISON BLVD,2
8,SINCLAIR LN & SHANNON DR,2
9,YORK RD & GITTINGS AVE,1


In [4]:
# Count red light cameras per address; reset_index() already returns a DataFrame
red_count = red.address.value_counts().reset_index()
red_count.columns = ['Address', 'Red Cameras']
red_count

Unnamed: 0,Address,Red Cameras
0,W NORTHERN PKWY & GREENSPRING AVE,2
1,WABASH AVE & W COLD SPRING LN,2
2,PARK HEIGHTS AVE & HAYWARD AVE,2
3,SINCLAIR LN & SHANNON DR,2
4,WALTHER AVE & GLENMORE AVE,2
5,S CATON AVE & BENSON AVE,2
6,ERDMAN AVE & N MACON ST,2
7,E COLD SPRING LN & LOCH RAVEN BLVD,2
8,KELLY AVE & BONNIE VIEW DR,2
9,HARFORD RD & CHRISTOPHER AVE,1


## Data Comparison
Counting how many cameras of each type sit at a given intersection gives a much better idea of general vehicle congestion, since traffic varies by intersection. Following this reasoning, merging the two count tables on the address shows the number of both camera types at each intersection that appears in both tables.

In [8]:
merged_df = pd.merge(fixed_count, red_count, on="Address")
merged_df

Unnamed: 0,Address,Speed Cameras,Red Cameras
0,W NORTHERN PKWY & GREENSPRING AVE,2,2
1,ERDMAN AVE & N MACON ST,2,2
2,PARK HEIGHTS AVE & HAYWARD AVE,2,2
3,EDMONDSON AVE & N ATHOL AVE,2,1
4,S CATON AVE & BENSON AVE,2,2
5,SINCLAIR LN & SHANNON DR,2,2
6,SINCLAIR LN & MORAVIA RD,1,1
7,THE ALAMEDA & E 33RD ST,1,1
8,W FRANKLIN ST & CATHEDRAL ST,1,1
9,E NORTHERN PKWY & SPRINGLAKE WAY,1,1
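
Note that `pd.merge` performs an inner join by default, so intersections with only one type of camera are dropped from `merged_df`. The sketch below shows how an outer join with `indicator=True` would keep them. The frames are a small subset of rows copied from the count tables above and are for illustration only, so the `_merge` labels describe this toy input, not the full dataset.

```python
import pandas as pd

# A few rows copied from the count tables above (illustrative subset).
fixed_count = pd.DataFrame({
    "Address": ["S CATON AVE & BENSON AVE",
                "RUSSELL ST & W HAMBURG ST",
                "YORK RD & GITTINGS AVE"],
    "Speed Cameras": [2, 2, 1],
})
red_count = pd.DataFrame({
    "Address": ["S CATON AVE & BENSON AVE",
                "WABASH AVE & W COLD SPRING LN",
                "HARFORD RD & CHRISTOPHER AVE"],
    "Red Cameras": [2, 2, 1],
})

# Default (inner) join: only addresses present in both frames survive.
inner = pd.merge(fixed_count, red_count, on="Address")

# Outer join: keep every address and record which frame it came from.
outer = pd.merge(fixed_count, red_count, on="Address",
                 how="outer", indicator=True)

# A missing count means no camera of that type in this input, so use 0.
count_cols = ["Speed Cameras", "Red Cameras"]
outer[count_cols] = outer[count_cols].fillna(0).astype(int)

print(inner)
print(outer)
```

Whether the outer join is preferable depends on the question: the inner join suits "which intersections have both camera types", while the outer join avoids silently discarding the rest.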


In [9]:
merged_df.loc[(merged_df['Speed Cameras'] == 2) & (merged_df['Red Cameras'] == 2)]

Unnamed: 0,Address,Speed Cameras,Red Cameras
0,W NORTHERN PKWY & GREENSPRING AVE,2,2
1,ERDMAN AVE & N MACON ST,2,2
2,PARK HEIGHTS AVE & HAYWARD AVE,2,2
4,S CATON AVE & BENSON AVE,2,2
5,SINCLAIR LN & SHANNON DR,2,2


Filtering for the intersections that have two cameras of each type suggests that these five intersections may see the heaviest vehicle traffic, as they are the most heavily monitored. With this knowledge, comparisons with other datasets can begin.

In [23]:
citation = pd.read_csv("Parking_Citations.csv")
citation.Address.value_counts()

1200 BLK SOUTH CATON AVE NB       22638
2700 BLK GWYNNS FALLS PKY WB      17216
2600 BLOCK OF GWYNNS FALLS PKW    11234
U/B CROSS ST                      10688
5600 BLK MORAVIA RD NB             9529
4400 BLK EDMONDSON AVE WB          7869
4400 BLK EDMONDSON AVE EB          7709
U/B CALVERT ST                     7406
1000 CHARLES ST                    7003
5600 BLK MORAVIA RD SB             6902
1300 BLK WEST COLD SPRING LN E     6761
2300 NEWKIRK ST                    6346
800 36TH ST                        6245
1700 THAMES ST                     6099
5900 BLK WALTHER AVE NB            6048
U/B PACA ST                        5937
1200 CHARLES ST                    5929
4300 BLK ERDMAN AVE NB             5846
4300 BLK ERDMAN AVE SB             5821
800 BROADWAY                       5768
3900 BLK THE ALAMEDA WB            5564
U/B COMMERCE ST                    5517
U/B SOUTH ST                       5486
1101 CATON AVE NB                  5320
U/B GAY ST                         5310


The parking citations dataset is a large compilation of tickets, counted here by the address at which each was issued. The address with the most tickets is on South Caton Avenue (22,638 at the northbound 1200 block), which is one of the streets with the most camera surveillance. Erdman Ave is another street that meets the same criterion, with its northbound and southbound entries each listed at about 5,800 citations.
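
Because the citation addresses are written by block and direction while the camera addresses are written by intersection, the comparison above relies on matching street names by eye. A rough sketch of the same comparison in code is shown here, summing citation counts whose address contains a camera street name. The counts are copied from the output above; the `camera_streets` keyword list is an assumption chosen for illustration.

```python
import pandas as pd

# Counts copied from the value_counts() output above (subset only).
citation_counts = pd.Series({
    "1200 BLK SOUTH CATON AVE NB": 22638,
    "U/B CROSS ST": 10688,
    "4300 BLK ERDMAN AVE NB": 5846,
    "4300 BLK ERDMAN AVE SB": 5821,
    "1101 CATON AVE NB": 5320,
})

# Hypothetical keyword list: street names taken from the camera addresses.
camera_streets = ["CATON AVE", "ERDMAN AVE"]

# Sum the citations for every address that mentions each street name.
totals = {
    street: int(
        citation_counts[
            citation_counts.index.str.contains(street, regex=False)
        ].sum()
    )
    for street in camera_streets
}
print(totals)
```

Plain substring matching is crude: it would miss spelling variants and could match unrelated streets that share a name fragment, so the totals should be read as a first approximation.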

In [21]:
towing = pd.read_csv("DOT_Towing.csv")
towing.towedFromLocation.value_counts()

UNIT S GREENE STREET          235
0                             219
700 W LOMBARD ST              197
200 E LEXINGTON ST            172
200 E 33RD ST                 164
100 S CHARLES ST              160
800 W LOMBARD ST              155
S CALVERT ST                  151
700 W LOMBARD STREET          150
600 ST PAUL STREET            141
100 E MADISON STREET          136
UB S CALVERT ST               124
300 N CALVERT ST              119
100 E BIDDLE ST               118
U/B SOUTH CALVERT STREET      107
1300 N CHARLES ST             105
100 PARK AVE                   94
100 N LIBERTY STREET           93
4500 HILLEN RD                 92
800 N CHARLES ST               91
600 N CAROLINE STREET          88
800 N EUTAW ST                 87
SOUTH ST                       87
2700 E FAYETTE STREET          84
UB SOUTH ST                    83
200 EAST LEXINGTON STREET      82
500  Fallsway                  79
500 W FAYETTE STREET           78
700 WEST LOMBARD STREET        78
100 E FAYETTE 

Initially, no comparisons could be drawn from the towing dataset, since none of the heavily congested streets appear among the most frequent towing locations on the list. However, one could argue that towing a vehicle itself obstructs traffic, since traffic must be diverted while workers operate the tow truck. Towing on congested streets would therefore not be optimal.
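
The towing output also lists the same location under several spellings (for example `700 W LOMBARD ST`, `700 W LOMBARD STREET` and `700 WEST LOMBARD STREET`), which splits its count across rows. Below is a minimal sketch of a normalizer that could merge such variants before counting. The abbreviation tables and the rule for treating `U/B` and `UB` as the same prefix are assumptions for illustration, not rules from the dataset's documentation.

```python
from collections import Counter

# Assumed abbreviation tables (illustrative, not exhaustive).
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD"}
DIRECTIONS = {"NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W"}


def normalize_location(raw):
    """Uppercase, collapse whitespace, and abbreviate directions and suffixes."""
    tokens = raw.upper().replace("U/B", "UB").split()
    last = len(tokens) - 1
    out = []
    for i, tok in enumerate(tokens):
        if i == last and tok in SUFFIXES:
            # Only the final token is treated as a street-type suffix.
            tok = SUFFIXES[tok]
        elif tok in DIRECTIONS and i < last - 1:
            # Abbreviate a direction only when a street name and a suffix
            # follow it, so that "SOUTH ST" keeps its street name.
            tok = DIRECTIONS[tok]
        out.append(tok)
    return " ".join(out)


# Counts copied from the towing output above (subset only).
raw_counts = {
    "700 W LOMBARD ST": 197,
    "700 W LOMBARD STREET": 150,
    "700 WEST LOMBARD STREET": 78,
    "200 E LEXINGTON ST": 172,
    "200 EAST LEXINGTON STREET": 82,
    "SOUTH ST": 87,
}

merged = Counter()
for location, count in raw_counts.items():
    merged[normalize_location(location)] += count

print(merged.most_common())
```

With the variants combined, the ranking of towing locations may change, so the comparison with the camera streets would be worth repeating on normalized values.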