# Mapping Top 20 Locations

In this notebook we map the top 20 locations and measure the distance from each one to the nearest parking lot and the nearest TTC stop.

We will examine the following:

* Distance to the closest parking lot for the top 20 locations
* Distance to the closest TTC stop for the top 20 locations
* Geographic distribution of the top 20 locations (by infraction count)
* Geographic distribution by ward for the top 20 locations (by infraction count)
* Geographic distribution by ward for the top 20 locations (by revenue)

In [3]:
import numpy as np
import pandas as pd
import os
import matplotlib.pyplot as plt
from itertools import groupby
from operator import itemgetter
import seaborn as sns
import csv
import sys
from sklearn.preprocessing import LabelEncoder
import re
%matplotlib inline

We have already saved the top 20 locations by infraction count and by revenue, so for the analysis above we simply load those files.

In [4]:
top_20_freq = pd.read_csv('../input/parking-tickets-top-20/top_20_count_total.csv')
top_20_rev = pd.read_csv('../input/parking-tickets-top-20/top_20_revenue_total.csv')

In [5]:
top_20_rev.head(5)

Unnamed: 0,location2,count
0,410 College Street,3124470.0
1,40 Orchard View Boulevard,2665708.0
2,1 Brimley Road S,2584975.0
3,2075 Bayview Avenue,2413296.0
4,18 Grenville Street,2131720.0


In [6]:
top_20_freq.head(5)

Unnamed: 0,location2,count
0,2075 Bayview Avenue,111733
1,20 Edward Street,66580
2,1750 Finch Avenue E,52971
3,James Street,36384
4,1265 Military Trl,20638


In [7]:
# we also saved the geocodes for these locations
top_loc = pd.read_csv('../input/parking-tickets-top-20/top_locations.csv', 
                      usecols = ['location2', 'address', 'Latitude', 'Longitude'])

In [8]:
# add the coordinates for top 20s
top_20_freq = top_20_freq.merge(top_loc, on='location2', how='left')
top_20_rev = top_20_rev.merge(top_loc, on='location2', how='left')

In [9]:
top_20_freq.head(5)

Unnamed: 0,location2,count,address,Latitude,Longitude
0,2075 Bayview Avenue,111733,"2075 Bayview Ave, North York, Toronto, Ontario...",43.72149,-79.37881
1,20 Edward Street,66580,"20 Edward St, Toronto, Ontario, M5G 1C9",43.65706,-79.38219
2,1750 Finch Avenue E,52971,"1750 Finch Ave E, North York, Toronto, Ontario...",43.793941,-79.349402
3,James Street,36384,"James St, Toronto, Ontario, M5G",43.653072,-79.381099
4,1265 Military Trl,20638,"1265 Military Trl, Scarborough, Toronto, Ontar...",43.785471,-79.186076


## Mapping Top 20 Tickets

In [10]:
import folium
from folium import plugins

# Mapbox tiles (for satellite imagery)
token = "pk.eyJ1IjoiZWxpdmlnbiIsImEiOiJja2dweDV0bXowN2ZiMnhudnVvcmJwZXd0In0.n76l8oXiAyuokFz7QQ7a_w"
tileurl = 'https://api.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}@2x.png?access_token=' + str(token)

# initialize the map centered on downtown Toronto
m = folium.Map(location=[43.6532, -79.3832], zoom_start=10)

# add in the wards
folium.GeoJson('../input/toronto-wards/City Wards Data.geojson', name="geojson").add_to(m)

# add the satellite imagery as a toggleable tile layer
folium.TileLayer(tiles=tileurl, attr='Mapbox').add_to(m)

# draw circle markers for the top locations by count
for index, row in top_20_freq.iterrows():
    
    # html formatting for the tables that are displayed in the popups
    popuptext = row.to_frame().to_html(classes='table table-striped', header=False)
    
    # set the scroll bar for the popup
    html_str0 = '<div style="overflow-y: scroll; height: 100px;">\n'

    # add the circle markers
    folium.CircleMarker((row['Latitude'], row['Longitude']),
                         radius=row['count']/10000,    # scale the radius by the count
                         popup=folium.Popup(html_str0 + popuptext + '</div>', sticky=True),  # close the scroll div
                         color="red",
                         fill_color="red").add_to(m)
    

# add in a heat map
heatdf = [[row['Latitude'], row['Longitude']] for index, row in top_20_freq.iterrows()]
folium.plugins.HeatMap(heatdf).add_to(m)

# layer control
folium.LayerControl().add_to(m)

# add the toolbar on the left
from folium.plugins import Draw
draw = Draw()
draw.add_to(m)

m

Above we have mapped the top 20 locations with the highest ticket counts. The layer control can be used to toggle the high-resolution satellite imagery, the ward boundaries, and the heat map.

In [11]:
# initialize the map centered on downtown Toronto
m2 = folium.Map(location=[43.6532, -79.3832], zoom_start=10)

# add in the wards
folium.GeoJson('../input/toronto-wards/City Wards Data.geojson', name="geojson").add_to(m2)

folium.TileLayer(tiles=tileurl, attr='Mapbox').add_to(m2)

# draw circle markers for the top 20
for index, row in top_20_rev.iterrows():
    
    # html formatting for the tables that are displayed in the popups
    popuptext = row.to_frame().to_html(classes='table table-striped', header=False)
    
    # set the scroll bar for the popup
    html_str0 = '<div style="overflow-y: scroll; height: 100px;">\n'

    # add the circle markers
    folium.CircleMarker((row['Latitude'], row['Longitude']),
                         radius=4,
                         #radius=row['count']/1000000,    # scale the radius by the count
                         popup=folium.Popup(html_str0 + popuptext + '</div>', sticky=True),  # close the scroll div
                         color="red",
                         fill_color="red").add_to(m2)
    

# add in a heat map
heatdf = [[row['Latitude'], row['Longitude']] for index, row in top_20_rev.iterrows()]
folium.plugins.HeatMap(heatdf).add_to(m2)

# layer control
folium.LayerControl().add_to(m2)

# add the toolbar on the left
from folium.plugins import Draw
draw = Draw()
draw.add_to(m2)

m2

Above we have mapped the top 20 locations with the highest ticket revenue. The layer control can be used to toggle the high-resolution satellite imagery, the ward boundaries, and the heat map.

## Green P Parking and TTC stop locations

Let's take a look at the Green P Parking dataset.

In [13]:
import requests
import json

url = "https://ckan0.cf.opendata.inter.prod-toronto.ca/dataset/b66466c3-69c8-4825-9c8b-04b270069193/resource/059cde7d-21bc-4f24-a533-6c2c3fc33ef1/download/green-p-parking-2019.json"
params = { "id": "b66466c3-69c8-4825-9c8b-04b270069193"}
package = requests.get(url, params = params).json()
green_p = pd.json_normalize(package['carparks'])
green_p.head(3)

Unnamed: 0,id,slug,address,lat,lng,rate,carpark_type,carpark_type_str,is_ttc,is_under_construction,...,map_marker_logo,alert_box,enable_streetview,streetview_lat,streetview_long,streetview_yaw,streetview_pitch,streetview_zoom,rate_details.periods,rate_details.addenda
0,1,https://parking.greenp.com/carpark/1_20-charle...,20 Charles Street East,43.669282202140174,-79.3852894625656,$2.50 / Half Hour,garage,Garage,False,False,...,greenp_only,Monthly Permits are no longer available at thi...,yes,43.669282202140174,-79.3852894625656,321.21,-12.45,0,"[{'title': 'Monday - Sunday & Holidays', 'rate...",[]
1,3,https://parking.greenp.com/carpark/3_13-isabel...,13 Isabella Street,43.667577,-79.384707,$3.00 / Half Hour,surface,Surface,False,False,...,greenp_only,,yes,43.667735,-79.384966,115.84,7.51,0,"[{'title': 'Monday - Sunday & Holidays', 'rate...",[]
2,5,https://parking.greenp.com/carpark/5_15-welles...,15 Wellesley Street East,43.664837,-79.383591,$3.00 / Half Hour,surface,Surface,False,False,...,greenp_bikeshare,,yes,43.665083,-79.383807,138.09,-4.68,0,"[{'title': 'Monday - Sunday & Holidays', 'rate...",[]


In [14]:
# only need a few of the columns; copy so later column assignments do not act on a view
green_p = green_p[['address', 'lat', 'lng']].copy()

## Add Green P Parking locations to the map

In [15]:
# draw a circle marker for every Green P lot
for index, row in green_p.iterrows():
    
    # html formatting for the tables that are displayed in the popups
    #popuptext = row.to_frame().to_html(classes='table table-striped', header=False)
    
    # set the scroll bar for the popup
    #html_str0 = '<div style="overflow-y: scroll; height: 100px;">\n'
    folium.CircleMarker((row['lat'], row['lng']),
                         radius=3,
                         color="blue",
                         fill_color="blue").add_to(m2)
    
    
    #folium.Marker((row['lat'], row['lng']),
    #              icon=folium.Icon(color="blue", icon="car", prefix='fa', size=1),
    #             ).add_to(m)

m2

The map above is getting too cluttered, so instead we will map only the closest parking lot and the closest TTC stop for each location.

In [16]:
# read in the ttc stops
ttc_stops = pd.read_csv('../input/ttc-green-p/opendata_ttc_schedules/stops.txt', delimiter = ",")

In [17]:
ttc_stops.head()

Unnamed: 0,stop_id,stop_code,stop_name,stop_desc,stop_lat,stop_lon,zone_id,stop_url,location_type,parent_station,stop_timezone,wheelchair_boarding
0,262,662,DANFORTH RD AT KENNEDY RD,,43.714379,-79.260939,,,,,,2
1,263,929,DAVENPORT RD AT BEDFORD RD,,43.674448,-79.399659,,,,,,1
2,264,940,DAVENPORT RD AT DUPONT ST,,43.675511,-79.401938,,,,,,2
3,265,1871,DAVISVILLE AVE AT CLEVELAND ST,,43.702088,-79.378112,,,,,,1
4,266,11700,DISCO RD AT ATTWELL DR,,43.701362,-79.594843,,,,,,1


As mentioned above, we want to avoid cluttering the map, so we will first find the closest locations.
To do this we will fit one ball tree to the latitude/longitude coordinates of the Green P dataset and another to those of the TTC dataset, and then
query each tree for the location closest to every top 20 entry.
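As a rough illustration of what such a query computes, the sketch below runs the same nearest-neighbour search by brute force with the haversine formula, using only the standard library. It is not the ball tree itself (the tree only speeds up the search). The helper names `haversine_m` and `nearest` are hypothetical, and the candidate list contains only the three lots shown in the `green_p.head(3)` output above, so the resulting distance will not match the full-dataset results.

```python
import math

# Mean Earth radius in metres (the same 6371 km figure this notebook uses)
EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest(point, candidates):
    """Return (index, distance in metres) of the candidate closest to point."""
    dists = [haversine_m(point[0], point[1], lat, lon) for lat, lon in candidates]
    idx = min(range(len(dists)), key=dists.__getitem__)
    return idx, dists[idx]


# Three Green P lots from the head() output (illustrative subset only)
lots = [(43.669282, -79.385289), (43.667577, -79.384707), (43.664837, -79.383591)]
edward = (43.65706, -79.38219)  # 20 Edward Street

idx, dist_m = nearest(edward, lots)
print(idx, round(dist_m))
```

A brute-force scan like this is fine for a handful of points; the tree-based index becomes worthwhile when every query would otherwise have to scan thousands of stops.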

In [18]:
# need to convert these to numeric
green_p[['lat', 'lng']] = green_p[['lat', 'lng']].astype(float)

In [19]:
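# Build a nearest-neighbour index backed by a ball tree, using the haversine
# metric, which expects (lat, lon) in radians and returns distances in radians.
# Note: .abs() flips the sign of Toronto's negative longitudes; the query points
# below get the same transform, so the resulting distances are unaffected.

# fit the index on the Green P lots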

from sklearn.neighbors import NearestNeighbors
nbrs = NearestNeighbors(algorithm='ball_tree',
                        metric='haversine',
                        leaf_size=2,
                        n_jobs=-1,   # number of parallel jobs to run, -1 means all processors
                        n_neighbors=1
                        ).fit(np.radians(green_p[['lat', 'lng']].abs()))

In [20]:
# train another ball tree on the ttc stops

ttc_stops[['stop_lat', 'stop_lon']] = ttc_stops[['stop_lat', 'stop_lon']].astype(float)

nbrs2 = NearestNeighbors(algorithm='ball_tree',
                        metric='haversine',
                        leaf_size=2,
                        n_jobs=-1,   # number of parallel jobs to run, -1 means all processors
                        n_neighbors=1
                        ).fit(np.radians(ttc_stops[['stop_lat', 'stop_lon']].abs()))

In [44]:
# convert to rads
top_20_freq[['Latitude Rads', 'Longitude Rads']] = np.radians(top_20_freq[['Latitude', 'Longitude']].abs())

# query the Green P tree for the nearest parking lot
distances, indices = nbrs.kneighbors(top_20_freq[['Latitude Rads', 'Longitude Rads']])

# query the TTC tree for the nearest stop
distances2, indices2 = nbrs2.kneighbors(top_20_freq[['Latitude Rads', 'Longitude Rads']])


# distances are in radians; multiply by the Earth's radius (6371 km) and by 1000 to get metres
top_20_freq['Closest Parking Lot (m)'] = distances.flatten() * 6371 * 1000

# add a ttc column
top_20_freq['Closest TTC Stop (m)'] = distances2.flatten() * 6371 * 1000

In [25]:
# clean up
top_20_freq.rename({'location2':'location'}, inplace=True,  axis=1)
parking_ttc_distances = top_20_freq[['location', 'count', 'address', 'Closest Parking Lot (m)', 'Closest TTC Stop (m)']]

### Distance to closest parking lots and TTC stops for top 20 by count

In [26]:
parking_ttc_distances

Unnamed: 0,location,count,address,Closest Parking Lot (m),Closest TTC Stop (m)
0,2075 Bayview Avenue,111733,"2075 Bayview Ave, North York, Toronto, Ontario...",272.157948,75.264717
1,20 Edward Street,66580,"20 Edward St, Toronto, Ontario, M5G 1C9",216.232156,81.213559
2,1750 Finch Avenue E,52971,"1750 Finch Ave E, North York, Toronto, Ontario...",2027.092793,14.549883
3,James Street,36384,"James St, Toronto, Ontario, M5G",259.158127,113.834339
4,1265 Military Trl,20638,"1265 Military Trl, Scarborough, Toronto, Ontar...",4185.366835,75.424508
5,25 The West Mall,19861,"25 The West Mall, Etobicoke, Toronto, Ontario,...",3454.255944,15.949581
6,25 St Mary Street,19538,"25 St Mary St, Toronto, Ontario, M4Y 1R2",195.22456,149.173159
7,941 Progress Avenue,19469,"941 Progress Ave, Scarborough, Toronto, Ontari...",768.807386,77.689957
8,1 Brimley Road S,19162,"1 Brimley Rd S, Scarborough, Toronto, Ontario,...",63.674867,1055.153412
9,40 Orchard View Boulevard,18938,"40 Orchard View Blvd, Toronto, Ontario, M4R 1B9",179.56679,98.125594


In [45]:
# convert to rads
top_20_rev[['Latitude Rads', 'Longitude Rads']] = np.radians(top_20_rev[['Latitude', 'Longitude']].abs())

# query the Green P tree for the nearest parking lot
distances3, indices3 = nbrs.kneighbors(top_20_rev[['Latitude Rads', 'Longitude Rads']])

# query the TTC tree for the nearest stop
distances4, indices4 = nbrs2.kneighbors(top_20_rev[['Latitude Rads', 'Longitude Rads']])


# distances are in radians; multiply by the Earth's radius (6371 km) and by 1000 to get metres
top_20_rev['Closest Parking Lot (m)'] = distances3.flatten() * 6371 * 1000

# add a ttc column
top_20_rev['Closest TTC Stop (m)'] = distances4.flatten() * 6371 * 1000

In [27]:
top_20_rev.rename({'count':'revenue', 'location2':'location'}, inplace=True,  axis=1)
parking_ttc_distances_rev = top_20_rev[['location', 'revenue', 'address', 'Closest Parking Lot (m)', 'Closest TTC Stop (m)']]

### Distance to closest parking lots and TTC stops for top 20 by revenue

In [28]:
parking_ttc_distances_rev

Unnamed: 0,location,revenue,address,Closest Parking Lot (m),Closest TTC Stop (m)
0,410 College Street,3124470.0,"410 College St, Toronto, Ontario, M5T 1S8",406.996532,88.035642
1,40 Orchard View Boulevard,2665708.0,"40 Orchard View Blvd, Toronto, Ontario, M4R 1B9",179.56679,98.125594
2,1 Brimley Road S,2584975.0,"1 Brimley Rd S, Scarborough, Toronto, Ontario,...",63.674867,1055.153412
3,2075 Bayview Avenue,2413296.0,"2075 Bayview Ave, North York, Toronto, Ontario...",272.157948,75.264717
4,18 Grenville Street,2131720.0,"18 Grenville St, Toronto, Ontario, M4Y 3B3",320.572707,112.96952
5,20 Edward Street,1963682.0,"20 Edward St, Toronto, Ontario, M5G 1C9",216.232156,81.213559
6,1090 Don Mills Road,1676830.0,"1090 Don Mills Rd, North York, Toronto, Ontari...",3167.637421,59.539041
7,James Street,1590447.0,"James St, Toronto, Ontario, M5G",259.158127,113.834339
8,150 Dan Leckie Way,1405990.0,"150 Dan Leckie Way, Toronto, Ontario, M5V 0C9",226.244279,37.028931
9,21 Hillcrest Avenue,1390470.0,"21 Hillcrest Ave, North York, Toronto, Ontario...",250.078682,103.982667


### Adding the closest parking and TTC stops to the maps

When we queried the ball trees we saved the indices of the matching rows in the Green P and TTC datasets,
so we can use these indices to plot the nearest locations.
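The lookup described above can be sketched as follows. This is a minimal example under stated assumptions: the `lots` frame is a hypothetical three-row stand-in for the Green P table (values copied from the `head(3)` output), and the `indices` array is made up to mimic the shape that `kneighbors` returns when `n_neighbors=1`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Green P table
lots = pd.DataFrame({
    "address": ["20 Charles Street East", "13 Isabella Street", "15 Wellesley Street East"],
    "lat": [43.669282, 43.667577, 43.664837],
    "lng": [-79.385289, -79.384707, -79.383591],
})

# Made-up query result with shape (n_queries, 1): one neighbour per query point
indices = np.array([[2], [0], [2]])

# The indices are row positions, so use iloc (not loc); one row per query, repeats kept
nearest_rows = lots.iloc[indices.flatten()]

# Optional: drop repeats so a lot shared by several locations is drawn only once
unique_rows = lots.iloc[np.unique(indices)]

print(nearest_rows["address"].tolist())
print(len(unique_rows))
```

Because the positions refer to the frame the tree was fitted on, the lookup is only valid as long as that frame has not been filtered or re-sorted since fitting.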

In [46]:
# add a marker for the Green P lot closest to each top location (by count)
for index, row in green_p.iloc[indices.flatten()].iterrows():
    
    # html formatting for the tables that are displayed in the popups
    #popuptext = row.to_frame().to_html(classes='table table-striped', header=False)
    
    # set the scroll bar for the popup
    #html_str0 = '<div style="overflow-y: scroll; height: 100px;">\n'
    
    folium.Marker((row['lat'], row['lng']),
                  icon=folium.Icon(color="green", icon="car", prefix='fa', size=1),
                 ).add_to(m)
    
for index, row in ttc_stops.iloc[indices2.flatten()].iterrows():    
    folium.Marker((row['stop_lat'], row['stop_lon']),
                      icon=folium.Icon(color="blue", icon="subway", prefix='fa', size=1),
                 ).add_to(m)    
    

m

In [47]:
# add a marker for the Green P lot closest to each top location (by revenue)
for index, row in green_p.iloc[indices3.flatten()].iterrows():
    
    # html formatting for the tables that are displayed in the popups
    #popuptext = row.to_frame().to_html(classes='table table-striped', header=False)
    
    # set the scroll bar for the popup
    #html_str0 = '<div style="overflow-y: scroll; height: 100px;">\n'
    
    folium.Marker((row['lat'], row['lng']),
                  icon=folium.Icon(color="green", icon="car", prefix='fa', size=1),
                 ).add_to(m2)
    
for index, row in ttc_stops.iloc[indices4.flatten()].iterrows():    
    folium.Marker((row['stop_lat'], row['stop_lon']),
                      icon=folium.Icon(color="blue", icon="subway", prefix='fa', size=1),
                 ).add_to(m2)    
    

m2