As a daily commuter on Toronto's subway system (TTC), I have been stuck on a train many times.
The automated announcement warning of congestion still rings clearly in my head!

It got me wondering whether I could put numbers to this frustration and show my friends which places to avoid during their commute.

So I collected data about the subway system from the City of Toronto's open data catalog (https://www.toronto.ca/city-government/data-research-maps/open-data/).

This is the metadata of the dataset. 

In [32]:
import pandas as pd

metadata = pd.read_excel("./data/SubwayDelaysMetadata.xlsx")
print(metadata)

Unnamed: 0,Field Name,Description,Example
0,Date,Date (YYYY/MM/DD),2016-12-31 00:00:00
1,Time,Time (24h clock),01:59:00
2,Day,Name of the day of the week,Saturday
3,Station,TTC subway station name,Rosedale Station
4,Code,TTC delay code,MUIS
5,Min Delay,Delay (in minutes) to subway service,5
6,Min Gap,Time length (in minutes) between trains,9
7,Bound,Direction of train dependant on the line,N
8,Line,"TTC subway line i.e. YU, BD, SHP, and SRT",YU
9,Vehicle,TTC train number,5961


The data is published as monthly files. I combined them and grouped the data by station for a single year.
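The aggregation step can be sketched roughly as follows. This is a minimal illustration, not the exact code used for this post: the two small frames are hypothetical stand-ins for the monthly files, the helper name `total_delay_by_station` is made up, and summing `Min Delay` per station is an assumption about how the per-station figure is computed.

```python
import pandas as pd

# Hypothetical stand-ins for two monthly files. The real frames would be
# loaded from the monthly spreadsheets, for example with pd.read_excel.
january = pd.DataFrame({
    "Station": ["Kennedy Station", "Eglinton Station", "Kennedy Station"],
    "Min Delay": [5, 3, 7],
})
february = pd.DataFrame({
    "Station": ["Eglinton Station", "Kennedy Station"],
    "Min Delay": [4, 2],
})


def total_delay_by_station(monthly_frames):
    """Stack the monthly frames and sum delay minutes per station, largest first."""
    year = pd.concat(monthly_frames, ignore_index=True)
    return (
        year.groupby("Station")["Min Delay"]
        .sum()
        .sort_values(ascending=False)
    )


yearly_delays = total_delay_by_station([january, february])
print(yearly_delays)
```

Counting delay incidents instead of summing minutes would only require swapping `.sum()` for `.count()`.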

Then, to plot the delays on the city map, I used OpenStreetMap (https://www.openstreetmap.org/#map=4/48.14/-103.72) and a wonderful Python mapping library
called Folium (https://python-visualization.github.io/folium/).

In [33]:
import folium
import json

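# Each record is [station name, congestion value, {'lat': ..., 'lng': ...}]
with open('./data/subway_delays_map_data.txt', "r") as read_file:
    subway_file = json.load(read_file)

# Centre the map on downtown Toronto (named subway_map to avoid shadowing the built-in map)
subway_map = folium.Map(location=[43.6452299, -79.38060999999999], zoom_start=12)

for row in subway_file:
    name = row[0]
    congestion = row[1]
    coordinates = row[2]
    latitude = coordinates['lat']
    longitude = coordinates['lng']

    # One circle per station, with a radius proportional to its congestion value
    folium.Circle(
        location=[latitude, longitude],
        popup=name,
        radius=congestion / 3,
        color='crimson',
        fill=True,
        fill_color='crimson',
        fill_opacity=1,
    ).add_to(subway_map)

# Display the map as the cell output
subway_map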

This is a dynamic, zoomable, pannable map. Feel free to zoom in and out and play around with it!

Click on a circle to see the name of the station.

To visualize this more clearly, let's look at the top 10 stations with the most delays.

In [34]:
# Take the first 10 records; this assumes subway_file is already ordered
# from most to least congested
top_10_delays = subway_file[0:10]
top_10_map = folium.Map(location=[43.6452299, -79.38060999999999], zoom_start=11.4)

for row in top_10_delays:
    name = row[0]
    congestion = row[1]
    coordinates = row[2]
    latitude = coordinates['lat']
    longitude = coordinates['lng']

    # Larger radius divisor than before so the 10 circles stand out
    folium.Circle(
        location=[latitude, longitude],
        popup=name,
        radius=congestion / 2,
        color='crimson',
        fill=True,
        fill_color='crimson',
        fill_opacity=1,
    ).add_to(top_10_map)

# Display the map as the cell output
top_10_map
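Slicing the start of the list only yields the worst stations if the records are already ordered by congestion. If that ordering is not guaranteed, a small helper can sort first and then slice. This is a sketch under assumptions: the sample records below are invented placeholders in the same `[name, congestion, coordinates]` shape as the data file, and `top_n_by_congestion` is a hypothetical helper name.

```python
from operator import itemgetter

# Hypothetical records in the same [name, congestion, coordinates] shape
# as subway_delays_map_data.txt; names, values and coordinates are placeholders.
sample_records = [
    ["Station A", 120, {"lat": 43.65, "lng": -79.38}],
    ["Station B", 450, {"lat": 43.66, "lng": -79.39}],
    ["Station C", 300, {"lat": 43.67, "lng": -79.40}],
]


def top_n_by_congestion(records, n=10):
    """Return the n records with the highest congestion value, highest first."""
    # itemgetter(1) picks the congestion value, the second item in each record
    return sorted(records, key=itemgetter(1), reverse=True)[:n]


top_2 = top_n_by_congestion(sample_records, n=2)
print([record[0] for record in top_2])
```

With the real data, `top_n_by_congestion(subway_file)` could replace the slice, since `n` defaults to 10.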


So the major culprits are Sheppard West, Kennedy, Eglinton, etc. This matches my experience!