### CSDA1050 Capstone Project, Sprint 2
### The Effectiveness of Red Light Cameras in Toronto 
### by Sima Sarvate

### Objectives of Sprint 2

1. Refine the project based on the findings of the EDA in Sprint 1.
2. Carry out the main analysis and modelling for the project.


### Prerequisites

The following packages need to be installed for this sprint:

OSMnx

folium

To do this, use the following command at the Anaconda prompt (notice that folium is installed as part of the OSMnx installation):

    conda install -c conda-forge osmnx
    
**Note:** During this process, the following packages will also be installed:
    
    altair
    
    branca
    
    folium
    
    geographiclib
    
    geopy
    
    vincent


### 1. Refinements to the Project

After reviewing Sprint 1, in particular the map of the red light camera geopoints plotted alongside the accident geopoints and the plot of the accident geopoints on their own, the general research question posed in the proposal has been narrowed to a more specific one:

**Do the 77 red light cameras installed around the City of Toronto reduce the number of red light running accidents in the areas surrounding those intersections?**

### 2. Project Analysis and Modelling

### Methodology

1. Interactive map visualisation of camera geopoints alongside accident geopoints using the folium package in Python. This will allow us to drill down to the intersection level and examine how the accidents are situated around each red light camera.


2. Calculation of nearest accidents to each camera. We will calculate the Euclidean distance between each red light camera and each accident. The returned distance is in the units of the points' projection (degrees in WGS84, meters in UTM). We will define a catchment area, which would ideally be decided by our client (in this case the City of Toronto). For each red light camera, accidents whose distance falls within this catchment area will be flagged, and the details of these cameras and accidents printed for further review. **Presently, the catchment area is set to 1km.**
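
The distance step in the methodology can be illustrated with a small sketch. Scaling a Euclidean distance in WGS84 degrees by one constant treats a degree of longitude like a degree of latitude, although at Toronto's latitude a degree of longitude spans only about 72% of that length. The sketch below compares that planar approximation with a great-circle (haversine) distance for two camera locations from the Cameras dataset. The function names and the Earth radius constant are assumptions made for illustration and are not part of the project code.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (assumed constant, not from the notebook)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def planar_degree_km(lat1, lon1, lat2, lon2, km_per_degree=111.32):
    """Euclidean distance in degrees scaled by a single km-per-degree constant."""
    return math.hypot(lat2 - lat1, lon2 - lon1) * km_per_degree


# Two camera locations (latitude, longitude) from the Cameras dataset preview
dufferin_glencairn = (43.706983, -79.45318)
dupont_lansdowne = (43.666726, -79.446461)

d_great_circle = haversine_km(*dufferin_glencairn, *dupont_lansdowne)
d_planar = planar_degree_km(*dufferin_glencairn, *dupont_lansdowne)

print("Great-circle distance (km):", round(d_great_circle, 3))
print("Planar degree distance (km):", round(d_planar, 3))
```

The two results differ most for pairs separated mainly east-west, so which accidents fall inside a 1km catchment area can depend on the distance method chosen.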

In [1]:
# import packages
import osmnx as ox
import folium
#
import geopandas as gpd
import numpy as np
import pandas as pd
import scipy as sp
from shapely.geometry import Point

import missingno as msn

import seaborn as sns
import matplotlib.pyplot as plt

%matplotlib inline

### Read Cameras and KSI datasets and prepare for analysis.

In [2]:
# read red light camera data
cameras = gpd.read_file("Cameras.geojson")
cameras.head()

Unnamed: 0,_id,INTERSECTION_ID,LINEAR_NAME_FULL_1,LINEAR_NAME_FULL_2,ID,X,Y,LONGITUDE,LATITUDE,OBJECTID,geometry
0,1233,13457150,Dufferin St,Glencairn Ave,19,,,-79.45318,43.706983,1,POINT (-79.45318010588022 43.70698274215368)
1,1234,13464191,Dupont St,Lansdowne Ave,20,,,-79.446461,43.666726,2,POINT (-79.4464614030731 43.6667255135304)
2,1235,13465569,Dundas St E,Jarvis St,21,,,-79.374567,43.657067,3,POINT (-79.37456691819334 43.65706708759534)
3,1236,13464080,Coxwell Ave,Eastern Ave,22,,,-79.316215,43.665494,4,POINT (-79.3162154693632 43.66549388674469)
4,1237,13453221,Birchmount Rd,Eglinton Ave E,23,,,-79.27786,43.729962,5,POINT (-79.27785952913113 43.72996210644608)


In [3]:
# read in red light traffic data
red_light = pd.read_csv("Red_Light.csv")
red_light.head()

Unnamed: 0,X,Y,Index_,ACCNUM,YEAR,DATE,TIME,Hour,STREET1,STREET2,...,AG_DRIV,REDLIGHT,ALCOHOL,DISABILITY,Division,Ward_Name,Ward_ID,Hood_ID,Hood_Name,ObjectId
0,-79.561664,43.645896,80489774,5001661734,2015,2015-09-26T04:00:00.000Z,1,0,BURNHAMTHORPE RD,427 C N BURNHAMTHORPE RAMP,...,Yes,Yes,Yes,,22,,,14,,1
1,-79.561664,43.645896,80489775,5001661734,2015,2015-09-26T04:00:00.000Z,1,0,BURNHAMTHORPE RD,427 C N BURNHAMTHORPE RAMP,...,Yes,Yes,Yes,,22,,,14,,2
2,-79.33899,43.658345,5345498,1031943,2008,2008-03-13T04:00:00.000Z,10,0,CARLAW AVE,EASTERN AVE,...,Yes,Yes,,,55,,,70,,3
3,-79.33899,43.658345,5345499,1031943,2008,2008-03-13T04:00:00.000Z,10,0,CARLAW AVE,EASTERN AVE,...,Yes,Yes,,,55,,,70,,4
4,-79.33899,43.658345,5345500,1031943,2008,2008-03-13T04:00:00.000Z,10,0,CARLAW AVE,EASTERN AVE,...,Yes,Yes,,,55,,,70,,5


In [4]:
# get unique accidents from KSI dataset
red_light_unique = red_light.drop_duplicates('ACCNUM')
red_light_unique.head()

Unnamed: 0,X,Y,Index_,ACCNUM,YEAR,DATE,TIME,Hour,STREET1,STREET2,...,AG_DRIV,REDLIGHT,ALCOHOL,DISABILITY,Division,Ward_Name,Ward_ID,Hood_ID,Hood_Name,ObjectId
0,-79.561664,43.645896,80489774,5001661734,2015,2015-09-26T04:00:00.000Z,1,0,BURNHAMTHORPE RD,427 C N BURNHAMTHORPE RAMP,...,Yes,Yes,Yes,,22,,,14,,1
2,-79.33899,43.658345,5345498,1031943,2008,2008-03-13T04:00:00.000Z,10,0,CARLAW AVE,EASTERN AVE,...,Yes,Yes,,,55,,,70,,3
7,-79.215801,43.761779,80783074,7000557720,2017,2017-03-30T04:00:00.000Z,15,0,SCARBOROUGH GOLF CLUB RD,LAWRENCE AVE E,...,Yes,Yes,,,43,,,137,,8
10,-79.470042,43.787211,80931233,7000036465,2017,2017-01-07T05:00:00.000Z,16,0,DUFFERIN ST,STEELES AVE W,...,Yes,Yes,Yes,,32,,,34,,11
13,-79.300746,43.761183,80563132,6001066822,2016,2016-06-19T04:00:00.000Z,20,0,ELLESMERE RD,WARDEN AVE,...,Yes,Yes,,,41,,,119,,14


In [5]:
# Confirm the number of unique accidents
print ("Number of unique accidents: ", red_light_unique.shape[0])

Number of unique accidents:  263


In [6]:
type(cameras.geometry[0])

shapely.geometry.point.Point

In [7]:
# Convert the red light accident DataFrame into a GeoDataFrame, creating one geopoint
# per KSI accident from the longitude and latitude of its location.
geo_red_light = gpd.GeoDataFrame(
    red_light_unique, geometry=gpd.points_from_xy(red_light_unique.LONGITUDE, red_light_unique.LATITUDE))
geo_red_light.head()

Unnamed: 0,X,Y,Index_,ACCNUM,YEAR,DATE,TIME,Hour,STREET1,STREET2,...,REDLIGHT,ALCOHOL,DISABILITY,Division,Ward_Name,Ward_ID,Hood_ID,Hood_Name,ObjectId,geometry
0,-79.561664,43.645896,80489774,5001661734,2015,2015-09-26T04:00:00.000Z,1,0,BURNHAMTHORPE RD,427 C N BURNHAMTHORPE RAMP,...,Yes,Yes,,22,,,14,,1,POINT (-79.56166400000001 43.645896)
2,-79.33899,43.658345,5345498,1031943,2008,2008-03-13T04:00:00.000Z,10,0,CARLAW AVE,EASTERN AVE,...,Yes,,,55,,,70,,3,POINT (-79.33899 43.658345)
7,-79.215801,43.761779,80783074,7000557720,2017,2017-03-30T04:00:00.000Z,15,0,SCARBOROUGH GOLF CLUB RD,LAWRENCE AVE E,...,Yes,,,43,,,137,,8,POINT (-79.215801 43.761779)
10,-79.470042,43.787211,80931233,7000036465,2017,2017-01-07T05:00:00.000Z,16,0,DUFFERIN ST,STEELES AVE W,...,Yes,Yes,,32,,,34,,11,POINT (-79.47004200000001 43.787211)
13,-79.300746,43.761183,80563132,6001066822,2016,2016-06-19T04:00:00.000Z,20,0,ELLESMERE RD,WARDEN AVE,...,Yes,,,41,,,119,,14,POINT (-79.30074599999999 43.761183)


In [8]:
print ("Number of rows in geo accidents: ", geo_red_light.shape[0])

Number of rows in geo accidents:  263


In [9]:
# type of accident geometry
type(red_light_unique.geometry)

pandas.core.series.Series

## Interactive map visualisation of camera geopoints alongside accident geopoints.

In [10]:
# create map object of Toronto using folium and display
coords_TO = [43.6532,-79.3832]
map_TO = folium.Map(location=coords_TO, zoom_start=12)
map_TO

In [11]:
# add red light camera markers to map of Toronto with a popup of camera id number for each marker
for i in range(0,len(cameras)): 
    folium.Marker([cameras.iloc[i]['LATITUDE'], cameras.iloc[i]['LONGITUDE']], 
                  popup=cameras.iloc[i]['_id'],
                  icon=folium.Icon(color='red')).add_to(map_TO) 
map_TO

In [12]:
# add accident markers to map of Toronto with a popup of accident number for each marker
for i in range(0,len(geo_red_light)): 
    folium.Marker([geo_red_light.iloc[i]['LATITUDE'], geo_red_light.iloc[i]['LONGITUDE']], 
                  popup=geo_red_light.iloc[i]['ACCNUM'],
                  icon=folium.Icon(color='blue')).add_to(map_TO) 
map_TO

## Calculation of nearest accidents to each camera.

In [14]:
# calculate the Euclidean distance between each camera and each accident
# flag all accidents within a decided upon catchment area (currently set to 1km)
#
# set a constant for the approximate number of kilometers in a decimal degree
kms_per_decimal_degree = 111.32
# define the catchment area
catchment_area = 1
# calculate the Euclidean distance between each camera and each accident, flag the accidents that are within the 
# catchment area for each camera 
#
# initialize count of close accidents (across all intersections), count of cameras with zero accidents, count of cameras 
# with 1 or more accidents
count_close_accidents = 0
count_zero_accident_cameras = 0
count_nonzero_accident_cameras = 0
#
# distance calculations
for x in range(0,len(cameras)):
    print ("CAMERA ID: ", cameras.iloc[x]['_id'])
    print ("Red light camera intersection: ", cameras.iloc[x]['LINEAR_NAME_FULL_1'], cameras.iloc[x]['LINEAR_NAME_FULL_2'])
    print (" ")
    # initialize counter for accidents at intersection level
    count_close_accidents_camera = 0
    point1 = cameras.iloc[x]['geometry']
    for y in range(0,len(geo_red_light)):
        point2 = geo_red_light.iloc[y]['geometry']
        # distance is in decimal degrees; convert it to kilometers
        distance = point1.distance(point2)
        distance_km = distance * kms_per_decimal_degree
        if distance_km <= catchment_area:
            count_close_accidents = count_close_accidents + 1
            count_close_accidents_camera = count_close_accidents_camera + 1
            print ("Close accident #:  ", count_close_accidents)            
            print ("Accident intersection: ", geo_red_light.iloc[y]['STREET1'], geo_red_light.iloc[y]['STREET2'])
            print ("Distance between the points in kilometers is ", distance_km)
            print (" ") 
    if count_close_accidents_camera > 0:
        count_nonzero_accident_cameras = count_nonzero_accident_cameras + 1
    else:
        count_zero_accident_cameras = count_zero_accident_cameras + 1
    print ("Number of red light accidents in catchment area of 1km: ", count_close_accidents_camera)
    print ("Percentage of red light accidents in catchment area of 1km: ", round(((count_close_accidents_camera/geo_red_light.shape[0]) * 100), 2), "%")   
    print (" ")
    print (" ")
print ("Number of cameras with accidents in catchment area ", count_nonzero_accident_cameras) 
print ("Number of cameras with zero accidents in catchment area ", count_zero_accident_cameras) 
print ("Total number of red light accidents in catchment area of 1km: ", count_close_accidents)
print ("Percentage of total number of red light accidents in catchment area of 1km: ",round((count_close_accidents/geo_red_light.shape[0]) * 100, 2), "%")

CAMERA ID:  1233
Red light camera intersection:  Dufferin St Glencairn Ave
 
Close accident #:   1
Accident intersection:  DUFFERIN ST BRIAR HILL AVE
Distance between the points in kilometers is  0.5376610600545126
 
Close accident #:   2
Accident intersection:  DUFFERIN ST WINGOLD AVE
Distance between the points in kilometers is  0.35764176014148935
 
Number of red light accidents in catchment area of 1km:  2
Percentage of red light accidents in catchment area of 1km:  0.76 %
 
 
CAMERA ID:  1234
Red light camera intersection:  Dupont St Lansdowne Ave
 
Close accident #:   3
Accident intersection:  DUFFERIN ST DUPONT ST
Distance between the points in kilometers is  0.8520218364133292
 
Close accident #:   4
Accident intersection:  DAVENPORT RD LANSDOWNE AVE
Distance between the points in kilometers is  0.5739678492967292
 
Close accident #:   5
Accident intersection:  SYMINGTON AVE DUPONT ST
Distance between the points in kilometers is  0.5375849654392421
 
Close accident #:   6
Accid

Close accident #:   40
Accident intersection:  KEELE ST LAWRENCE AVE W
Distance between the points in kilometers is  0.002025401429295463
 
Close accident #:   41
Accident intersection:  LAWRENCE AVE W BENTON RD
Distance between the points in kilometers is  0.7116030014204687
 
Number of red light accidents in catchment area of 1km:  2
Percentage of red light accidents in catchment area of 1km:  0.76 %
 
 
CAMERA ID:  1251
Red light camera intersection:  Wilson Ave Keele St
 
Number of red light accidents in catchment area of 1km:  0
Percentage of red light accidents in catchment area of 1km:  0.0 %
 
 
CAMERA ID:  1252
Red light camera intersection:  Keele St Rogers Rd
 
Number of red light accidents in catchment area of 1km:  0
Percentage of red light accidents in catchment area of 1km:  0.0 %
 
 
CAMERA ID:  1253
Red light camera intersection:  Kingston Rd Port Union Rd
 
Close accident #:   42
Accident intersection:  KINGSTON RD SHEPPARD AVE E
Distance between the points in kilomet

Close accident #:   69
Accident intersection:  ST CLAIR AVE E O CONNOR DR
Distance between the points in kilometers is  0.9898791586214608
 
Number of red light accidents in catchment area of 1km:  1
Percentage of red light accidents in catchment area of 1km:  0.38 %
 
 
CAMERA ID:  1271
Red light camera intersection:  Queen St W Jameson Ave
 
Number of red light accidents in catchment area of 1km:  0
Percentage of red light accidents in catchment area of 1km:  0.0 %
 
 
CAMERA ID:  1272
Red light camera intersection:  Woodbine Ave Queen St E
 
Close accident #:   70
Accident intersection:  WOODBINE Aven KINGSTON Road
Distance between the points in kilometers is  0.6236551191962003
 
Number of red light accidents in catchment area of 1km:  1
Percentage of red light accidents in catchment area of 1km:  0.38 %
 
 
CAMERA ID:  1273
Red light camera intersection:  Richmond St E Parliament St
 
Close accident #:   71
Accident intersection:  PARLIAMENT St FRONT St E
Distance between the poin

Number of red light accidents in catchment area of 1km:  2
Percentage of red light accidents in catchment area of 1km:  0.76 %
 
 
CAMERA ID:  1296
Red light camera intersection:  Bayview Ave Cummer Ave
 
Number of red light accidents in catchment area of 1km:  0
Percentage of red light accidents in catchment area of 1km:  0.0 %
 
 
CAMERA ID:  1297
Red light camera intersection:  Bayview Ave Fifeshire Rd
 
Close accident #:   96
Accident intersection:  BAYVIEW AVE 401 C W BAYVIEW RAMP
Distance between the points in kilometers is  0.4462917094580085
 
Number of red light accidents in catchment area of 1km:  1
Percentage of red light accidents in catchment area of 1km:  0.38 %
 
 
CAMERA ID:  1298
Red light camera intersection:  Birchmount Rd Huntingwood Dr
 
Number of red light accidents in catchment area of 1km:  0
Percentage of red light accidents in catchment area of 1km:  0.0 %
 
 
CAMERA ID:  1299
Red light camera intersection:  Bloor St W Dundas St W
 
Close accident #:   97
Acci
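
The loop above increments its running total once per camera and accident pair, so an accident that lies within the catchment area of two nearby cameras is counted twice. The sketch below, which runs on hypothetical toy coordinates rather than the project data, shows one way to report both the number of pairs and the number of distinct accidents. The helper `close_pairs` and the toy IDs are illustrative assumptions; the distance uses the same degrees-times-111.32 approximation as the cell above.

```python
import math

KM_PER_DEGREE = 111.32  # same constant as the notebook cell
CATCHMENT_KM = 1.0


def close_pairs(cameras, accidents, radius_km=CATCHMENT_KM):
    """Return (camera_id, accident_id, distance_km) for every pair within radius_km.

    cameras and accidents are lists of (id, longitude, latitude) tuples.
    """
    pairs = []
    for cam_id, cam_lon, cam_lat in cameras:
        for acc_id, acc_lon, acc_lat in accidents:
            d_km = math.hypot(cam_lon - acc_lon, cam_lat - acc_lat) * KM_PER_DEGREE
            if d_km <= radius_km:
                pairs.append((cam_id, acc_id, d_km))
    return pairs


# Hypothetical toy data: two cameras about 0.5km apart and three accidents
toy_cameras = [("C1", -79.400, 43.650), ("C2", -79.395, 43.650)]
toy_accidents = [
    ("A1", -79.398, 43.650),  # within 1km of both cameras
    ("A2", -79.406, 43.651),  # within 1km of C1 only
    ("A3", -79.300, 43.700),  # far from both cameras
]

pairs = close_pairs(toy_cameras, toy_accidents)
pair_count = len(pairs)                                # what the loop's running total measures
unique_accidents = {acc_id for _, acc_id, _ in pairs}  # distinct accidents near any camera

print("Camera-accident pairs in catchment:", pair_count)
print("Distinct accidents in catchment:", len(unique_accidents))
```

When cameras sit close together, as several do in the downtown core, reporting the distinct count next to the pair count makes clear which of the two a percentage of all accidents refers to.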

## Findings and Insights from Interactive Map Coupled with Distance Calculations

Of the 263 red light running accidents, 98 are within 1km of a red light camera intersection; that is, 37% of all red light running accidents occur within 1km of an intersection with a red light camera. Of the 77 red light cameras currently installed, 37 had 0 accidents reported in the catchment area. The remaining 40 cameras had between 1 and 10 accidents each.

Looking at the interactive map of the red light camera markers alongside the accident markers, we can clearly see that the downtown core of Toronto is a hotspot for accidents occurring in close vicinity of red light cameras. Consistent with the distance calculations, 3 important intersections in the downtown core stand out:

1. **Camera ID 1249 (King & Jarvis)** - 8 accidents within the catchment area.
2. **Camera ID 1255 (Lower Jarvis & Esplanade)** - 10 accidents within the catchment area.
3. **Camera ID 1259 (York & Lake Shore Blvd W)** - 6 accidents within the catchment area.

These 3 intersections account for almost 25% of the accidents within the defined catchment area.
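
The shares quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the figures reported in this section; the variable names are illustrative.

```python
# Figures reported in the findings above
total_accidents = 263
accidents_in_catchment = 98
hotspot_counts = {1249: 8, 1255: 10, 1259: 6}  # camera ID -> accidents in catchment

share_in_catchment = accidents_in_catchment / total_accidents * 100
hotspot_total = sum(hotspot_counts.values())
hotspot_share = hotspot_total / accidents_in_catchment * 100

print(f"Accidents within the catchment area: {share_in_catchment:.1f}%")    # 37.3%
print(f"Hotspot share of catchment accidents: {hotspot_share:.1f}%")        # 24.5%
```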


By reviewing the distance calculations we find more red light camera IDs where 3 or more accidents are flagged within the catchment area:

1. **Camera ID 1234 (Dupont St Lansdowne Ave)** - 4 accidents within the catchment area.
2. **Camera ID 1235 (Dundas St E Jarvis St)** - 4 accidents within the catchment area.
3. **Camera ID 1240 (Eglinton Ave W Spadina Rd)** - 3 accidents within the catchment area. 
4. **Camera ID 1241 (Eglinton Ave E Victoria Park Ave)** - 3 accidents within the catchment area.
5. **Camera ID 1242 (Ellesmere Rd Kennedy Rd)** - 5 accidents within the catchment area.
6. **Camera ID 1246 (Islington Ave The Westway)** - 3 accidents within the catchment area.
7. **Camera ID 1255 (Eglinton Ave W Spadina Rd)** - 3 accidents within the catchment area.
8. **Camera ID 1273 (Richmond & Parliament)** - 4 accidents within the catchment area.
9. **Camera ID 1291 (Adelaide & Spadina)** - 3 accidents within the catchment area.

If we add one more intersection to the above list of hotspot intersections, Camera ID 1242 at Ellesmere Rd & Kennedy Rd with 5 accidents in the catchment area, these 4 intersections account for almost 30% of the accidents within the defined catchment area.