# Final Project
Fall 2018

Xavier Hung

Weiyi Chen

### Problem: Do red light cameras result in more speeding in Chicago?

Red light cameras are intended to improve public safety by discouraging drivers from running red lights, but are they really helpful?

A quick search on Google turned up several articles. Most of them claim that red light cameras lead to more accidents and that their only purpose is to generate millions of dollars in revenue for the government. However, according to the Insurance Institute for Highway Safety (IIHS), fatal accidents caused by speeding can be reduced by up to 19% after the installation of red light cameras.

To verify this statement, we focus on the association between red light cameras and speeding. We compare the number of speeding violations recorded at intersections before and after red light cameras were put into use.

#### Data extraction

We extracted the data from the City of Chicago's open data portal. There are four CSV files: red light camera locations, red light camera violations, speed camera locations, and speed camera violations (details of these datasets can be found at https://data.cityofchicago.org/browse?q=red+light+camera&sortBy=relevance).





In [2]:
import os
import pandas as pd
import numpy as np
import math
from ast import literal_eval
from bokeh.plotting import figure, show, output_notebook
from bokeh.models.annotations import Title
from bokeh.tile_providers import CARTODBPOSITRON
import warnings
warnings.filterwarnings('ignore')

redlight_loc = pd.read_csv('red-light-camera-locations.csv')
redlight_violations = pd.read_csv('red-light-camera-violations.csv')

speed_loc = pd.read_csv('speed-camera-locations.csv')
speed_violations = pd.read_csv('speed-camera-violations.csv')

##  missing data?

`speed_loc` and `redlight_loc` both have `LATITUDE` and `LONGITUDE` columns. We use the Mercator projection to convert latitude and longitude into map coordinates, and plot the locations of the red light cameras and speed cameras.
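
As a cross-check on the projection, here is a minimal vectorized sketch of the same Web Mercator formula. The function name `to_web_mercator` and the sample coordinates for downtown Chicago are our own assumptions for illustration; they are not part of the dataset.

```python
import numpy as np

R_MAJOR = 6378137.0  # WGS84 semi-major axis in meters


def to_web_mercator(lat_deg, lon_deg):
    """Project latitude/longitude in degrees to Web Mercator x/y in meters."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    x = R_MAJOR * lon
    y = R_MAJOR * np.log(np.tan(np.pi / 4.0 + lat / 2.0))
    return x, y


# Approximate coordinates of downtown Chicago (illustrative values only)
x, y = to_web_mercator(41.8781, -87.6298)
print(x, y)
```

The resulting point should fall inside the `x_range` and `y_range` used for the map, which is a quick way to confirm that latitude and longitude were not swapped.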

In [3]:
# plot red light cameras and speed cameras locations
speed_loc["COORDINATES"] = '('+ speed_loc["LATITUDE"].astype(str) +','+ speed_loc["LONGITUDE"].astype(str) + ')'
redlight_loc["COORDINATES"] = '('+ redlight_loc["LATITUDE"].astype(str) +','+ redlight_loc["LONGITUDE"].astype(str) + ')'

def merc(Coords):
    """
    Convert a "(lat,lon)" string in degrees to Web Mercator (x, y) in meters.
    """
    Coordinates = literal_eval(Coords)
    lat = Coordinates[0]
    lon = Coordinates[1]
    
    r_major = 6378137.000
    x = r_major * math.radians(lon)
    scale = x/lon
    y = 180.0/math.pi * math.log(math.tan(math.pi/4.0 + 
        lat * (math.pi/180.0)/2.0)) * scale
    return (x, y)

speed_loc['coords_x'] = speed_loc['COORDINATES'].apply(lambda x: merc(x)[0])
speed_loc['coords_y'] = speed_loc['COORDINATES'].apply(lambda x: merc(x)[1])

redlight_loc['coords_x'] = redlight_loc['COORDINATES'].apply(lambda x: merc(x)[0])
redlight_loc['coords_y'] = redlight_loc['COORDINATES'].apply(lambda x: merc(x)[1])

p = figure(x_range=(-9790000, -9735000), y_range=(5105000, 5165000),
           x_axis_type="mercator", y_axis_type="mercator")
p.add_tile(CARTODBPOSITRON)

p.circle(x = redlight_loc['coords_x'], y = redlight_loc['coords_y'], legend = "Red Light Camera", fill_color="#FF0000")
p.circle(x = speed_loc['coords_x'], y = speed_loc['coords_y'], legend = "Speed Camera")

t = Title()
t.text = 'Locations of All Speed Cameras and Red Light Cameras'
p.title = t

p.legend.location = "top_right"
p.legend.click_policy="hide"

output_notebook()
show(p)

In Chicago, 0.0001 degrees of longitude is about 8 meters when the latitude is held fixed, and 0.0001 degrees of latitude is about 11 meters when the longitude is held fixed.
link: https://www.movable-type.co.uk/scripts/latlong.html
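
These distances can be reproduced with the haversine formula. The sketch below assumes a spherical Earth with a mean radius of 6,371 km and uses approximate Chicago coordinates; both are assumptions on our part, and `haversine_m` is a hypothetical helper name.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius (assumed spherical model)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


lat, lon = 41.88, -87.63  # approximate Chicago coordinates
step = 0.0001             # the tolerance used for matching cameras
d_lon = haversine_m(lat, lon, lat, lon + step)  # move east, same latitude
d_lat = haversine_m(lat, lon, lat + step, lon)  # move north, same longitude
print(round(d_lon, 1), round(d_lat, 1))
```

The east-west distance is shorter because a degree of longitude shrinks by the cosine of the latitude.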

We extract the red light cameras and speed cameras that lie within 0.0001 degrees of each other in both latitude and longitude.

In [4]:
# redlight_loc_copy is a subset of redlight_loc and contains the locations of red light cameras near speed cameras
# speed_loc_copy is a subset of speed_loc and contains the locations of speed cameras near red light cameras
redlight_loc_copy = pd.DataFrame(columns = redlight_loc.columns.values)
speed_loc_copy = pd.DataFrame(columns = speed_loc.columns.values)

# find red light cameras near speed cameras
for i in range(0,speed_loc.shape[0]):
    for j in range(0,redlight_loc.shape[0]):
        # both the latitude and the longitude difference must be below the tolerance
        if abs(speed_loc.iloc[i,4] - redlight_loc.iloc[j,5]) < 0.0001 and abs(speed_loc.iloc[i,5] - redlight_loc.iloc[j,6]) < 0.0001:
            redlight_loc_copy = redlight_loc_copy.append(redlight_loc.iloc[j], ignore_index = True)
            speed_loc_copy = speed_loc_copy.append(speed_loc.iloc[i], ignore_index = True)
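
The same proximity test can be sketched without nested loops by using NumPy broadcasting. This version assumes the coordinate columns are named `LATITUDE` and `LONGITUDE` and selects them by name rather than by position; `nearby_pairs` and the two demo frames are hypothetical and contain made-up values.

```python
import numpy as np
import pandas as pd


def nearby_pairs(speed_df, red_df, tol=0.0001):
    """Return positions (speed row, red light row) within tol degrees on both axes."""
    s_lat = speed_df["LATITUDE"].to_numpy(dtype=float)[:, None]
    s_lon = speed_df["LONGITUDE"].to_numpy(dtype=float)[:, None]
    r_lat = red_df["LATITUDE"].to_numpy(dtype=float)[None, :]
    r_lon = red_df["LONGITUDE"].to_numpy(dtype=float)[None, :]
    close = (np.abs(s_lat - r_lat) < tol) & (np.abs(s_lon - r_lon) < tol)
    return np.nonzero(close)


# Tiny made-up frames, for illustration only
speed_demo = pd.DataFrame({"ADDRESS": ["A", "B"],
                           "LATITUDE": [41.80000, 41.90000],
                           "LONGITUDE": [-87.70000, -87.60000]})
red_demo = pd.DataFrame({"INTERSECTION": ["X", "Y"],
                         "LATITUDE": [41.80005, 41.95000],
                         "LONGITUDE": [-87.70004, -87.60000]})

s_idx, r_idx = nearby_pairs(speed_demo, red_demo)
speed_near = speed_demo.iloc[s_idx].reset_index(drop=True)
red_near = red_demo.iloc[r_idx].reset_index(drop=True)
print(speed_near, red_near, sep="\n")
```

Selecting rows once with `iloc` also avoids growing a DataFrame one row at a time, which is slow for larger inputs.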

In [5]:
# change format of addresses in speed_loc_copy to match format of addresses in speed_violations
remove = ["(Speed", "Camera)","Ave","ST","Rd","St","Blvd"]


for i in range(0,speed_loc_copy.shape[0]):
    speed_loc_copy.iloc[i,0] = " ".join([word for word in speed_loc_copy.iloc[i,0].split() if word not in remove])

speed_violations_copy = pd.DataFrame(columns = speed_violations.columns.values)    
    
# subset speed violations where speed cameras are near red light cameras
for i in range(0, speed_loc_copy.shape[0]):
    address = speed_violations[speed_violations['ADDRESS'].str.contains(speed_loc_copy.iloc[i,0].upper())]
    speed_violations_copy = speed_violations_copy.append(address, ignore_index = True)
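
The address matching above can also be sketched as two small helpers. The names `normalize_address` and `violations_at`, and the sample addresses, are hypothetical; the real address formats in the two files may differ. Unlike the loop, this version returns each violation row at most once, and it escapes the addresses so that they are matched literally.

```python
import re
import pandas as pd

# Same tokens as the `remove` list in the cell above
REMOVE = {"(Speed", "Camera)", "Ave", "ST", "Rd", "St", "Blvd"}


def normalize_address(address):
    """Drop the listed tokens and upper-case what is left."""
    return " ".join(w for w in address.split() if w not in REMOVE).upper()


def violations_at(violations, addresses):
    """Keep rows whose ADDRESS contains any of the normalized camera addresses."""
    keys = [re.escape(normalize_address(a)) for a in addresses]
    if not keys:
        return violations.iloc[0:0]  # nothing to match, return an empty frame
    pattern = "|".join(keys)
    return violations[violations["ADDRESS"].str.contains(pattern, regex=True, na=False)]


# Made-up rows, for illustration only
demo_violations = pd.DataFrame({"ADDRESS": ["4909 N CICERO AVE", "1 E 79TH STREET"],
                                "VIOLATIONS": [12, 7]})
subset = violations_at(demo_violations, ["4909 N Cicero Ave (Speed Camera)"])
print(subset)
```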

In [7]:
redlight_violations_copy = pd.DataFrame(columns = redlight_violations.columns.values)

# subset red light violations where speed cameras are near red light cameras
# this will take a while
for i in range(0, redlight_violations.shape[0]):
    for j in range(0, redlight_loc_copy.shape[0]):
        intersection1 = redlight_violations.iloc[i,0].replace(' AND ', '/').replace(' and ', '/').split('/')
        intersection2 = redlight_loc_copy.iloc[j,0].split('-')
        if len(intersection1) > 1 and len(intersection2) > 1:
            if intersection1[0] == intersection2[0].upper() or intersection1[0] == intersection2[1].upper():
                if intersection1[1] == intersection2[0].upper() or intersection1[1] == intersection2[1].upper():
                    redlight_violations_copy = redlight_violations_copy.append(redlight_violations.iloc[i], ignore_index = True)
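
Because the nested loop compares every violation row with every camera, it is slow. A faster variant can be sketched by turning each intersection name into an unordered set of street names and testing membership. The helper names and demo rows below are hypothetical, and exact set equality is slightly stricter than the loop above when a name has more than two parts.

```python
import pandas as pd


def violation_key(name):
    """'CICERO AND I55' or 'CICERO/I55' -> frozenset({'CICERO', 'I55'})."""
    parts = name.upper().replace(" AND ", "/").split("/")
    return frozenset(p.strip() for p in parts)


def location_key(name):
    """'Cicero-I55' -> frozenset({'CICERO', 'I55'})."""
    return frozenset(p.strip().upper() for p in name.split("-"))


# Made-up rows, for illustration only
demo_locations = pd.Series(["Cicero-I55", "State-79th"])
demo_violations = pd.DataFrame({
    "INTERSECTION": ["CICERO AND I55", "79TH/STATE", "LAKE AND UPPER WACKER"],
    "VIOLATIONS": [186, 93, 143],
})

# Only keep location names that actually describe two streets
wanted = {location_key(n) for n in demo_locations if len(location_key(n)) > 1}
mask = demo_violations["INTERSECTION"].map(lambda n: violation_key(n) in wanted)
matched = demo_violations[mask]
print(matched)
```

Each violation row is checked once against a hash set, so the cost grows with the number of violations rather than with violations times cameras.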

In [6]:
speed_loc_copy['coords_x'] = speed_loc_copy['COORDINATES'].apply(lambda x: merc(x)[0])
speed_loc_copy['coords_y'] = speed_loc_copy['COORDINATES'].apply(lambda x: merc(x)[1])

redlight_loc_copy['coords_x'] = redlight_loc_copy['COORDINATES'].apply(lambda x: merc(x)[0])
redlight_loc_copy['coords_y'] = redlight_loc_copy['COORDINATES'].apply(lambda x: merc(x)[1])

p = figure(x_range=(-9780000, -9745000), y_range=(5120000, 5160000),
           x_axis_type="mercator", y_axis_type="mercator")
p.add_tile(CARTODBPOSITRON)

p.circle(x = redlight_loc_copy['coords_x'], y = redlight_loc_copy['coords_y'], legend = "Red Light Camera", fill_color="#FF0000")
p.circle(x = speed_loc_copy['coords_x'], y = speed_loc_copy['coords_y'], legend = "Speed Camera")

t = Title()
t.text = 'Speed Cameras in Close Proximity to Red Light Cameras'
p.title = t

p.legend.location = "top_right"
p.legend.click_policy="hide"

output_notebook()
show(p)

In [7]:
# find the intersections with speed cameras nearby that have the most red light violations on a single day
redlight_violations_copy = redlight_violations_copy.sort_values('VIOLATIONS', ascending = False)
redlight_violations_copy.drop_duplicates(subset = 'INTERSECTION', keep = "first").reset_index(drop = True).head()

| Unnamed: 0 | INTERSECTION | CAMERA ID | ADDRESS | VIOLATION DATE | VIOLATIONS | X COORDINATE | Y COORDINATE | LATITUDE | LONGITUDE | LOCATION |
|---|---|---|---|---|---|---|---|---|---|---|
| 200291 | CICERO AND I55 | 2251.0 | 4200 S CICERO AVENUE | 2016-10-09T00:00:00 | 186 | 1.145024e+06 | 1.876358e+06 | 41.816729 | -87.743537 | {'needs_recoding': False, 'longitude': '-87.74... |
| 395509 | LAKE AND UPPER WACKER | 3052.0 | 340 W UPPER WACKER DR | 2018-05-27T00:00:00 | 143 | 1.173895e+06 | 1.901848e+06 | 41.886082 | -87.636870 | {'needs_recoding': False, 'longitude': '-87.63... |
| 189548 | LAKE SHORE DR AND BELMONT | 1413.0 | 400 W BELMONT AVE | 2016-08-27T00:00:00 | 102 | 1.172982e+06 | 1.921577e+06 | 41.940241 | -87.639639 | {'needs_recoding': False, 'longitude': '-87.63... |
| 180725 | STATE AND 79TH | 2654.0 | 1 E 79TH STREET | 2016-07-23T00:00:00 | 93 | 1.177680e+06 | 1.852598e+06 | 41.750852 | -87.624464 | {'needs_recoding': False, 'longitude': '-87.62... |
| 320890 | VAN BUREN AND WESTERN | 2052.0 | 400 S WESTERN AVENUE | 2017-10-08T00:00:00 | 90 | 1.160432e+06 | 1.898092e+06 | 41.876065 | -87.686416 | {'needs_recoding': False, 'longitude': '-87.68... |
| 167405 | CALIFORNIA AND DIVERSEY | 1514.0 | 2800 W DIVERSEY | 2016-05-22T00:00:00 | 83 | 1.157213e+06 | 1.918527e+06 | 41.932205 | -87.697679 | {'needs_recoding': False, 'longitude': '-87.69... |
| 181259 | ARCHER AND CICERO | 2081.0 | 5200 S CICERO AVE | 2016-07-16T00:00:00 | 72 | 1.145196e+06 | 1.869738e+06 | 41.798558 | -87.743072 | {'needs_recoding': False, 'longitude': '-87.74... |
| 170697 | LAFAYETTE AND 87TH | 2503.0 | 30 W 87TH STREET | 2016-06-11T00:00:00 | 72 | 1.177408e+06 | 1.847321e+06 | 41.736376 | -87.625621 | {'needs_recoding': False, 'longitude': '-87.62... |
| 327662 | 63RD AND STATE | 2714.0 | 1 E 63RD ST | 2017-10-29T00:00:00 | 69 | 1.177368e+06 | 1.863199e+06 | 41.779949 | -87.625290 | {'needs_recoding': False, 'longitude': '-87.62... |
| 171124 | WENTWORTH AND GARFIELD | 2261.0 | 5500 S WENTWORTH AVEN | 2016-06-11T00:00:00 | 68 | | | | | |
