# Analyzing Patterns and Causes of TTC Streetcar Delays: January-March 2025

## Abstract

This study examines the Toronto Transit Commission (TTC) streetcar delay data from January to March 2025. By analyzing over 4,000 delay incidents, we identify patterns in delay frequency, location, time, and cause. Our findings reveal that miscellaneous operational issues account for the majority of delays (around 60%), followed by transportation, security, and equipment-related problems. Delays occur consistently throughout the week, with slight variations by day. Time analysis shows peak delay periods in the afternoon (2-3 PM) and evening (8-9 PM). Spatially, Broadview Station experiences the highest number of delays. These insights can inform TTC's resource allocation and preventive maintenance strategies to improve service reliability and passenger experience.

## Introduction

Public transit systems are essential infrastructure for urban mobility, and their reliability significantly impacts quality of life and economic productivity in cities. In Toronto, the streetcar network is a vital transportation artery serving thousands of commuters daily. However, like any transit system, it experiences service disruptions that affect its reliability (Diab & El-Geneidy, 2013).

Delay analysis is crucial for transit authorities to understand where, when, and why disruptions occur. Previous research has demonstrated that identifying patterns in transit delays can lead to more effective resource allocation and preventive maintenance strategies (Barron et al., 2013). For example, a study by Cats et al. (2016) found that analyzing delay patterns helped Stockholm's public transit authority reduce disruptions by 15% through targeted interventions.

For the Toronto Transit Commission (TTC), which operates one of North America's largest streetcar networks, understanding delay patterns can lead to significant improvements in service reliability. According to the TTC's own service standards, maintaining consistent headways (time between vehicles) is a priority for passenger satisfaction (TTC, 2022).

This paper analyzes TTC streetcar delay data from January to March 2025, seeking to:
1. Identify spatial patterns in delay occurrences
2. Analyze temporal trends, including time-of-day and day-of-week variations
3. Categorize and quantify the causes of delays
4. Provide data-driven insights for potential service improvements

By systematically examining these aspects, this study aims to contribute to the understanding of transit reliability challenges and inform strategies for enhancing service quality.

## Data

### TTC Streetcar Delay Data

The data used in this analysis come from the Toronto Open Data Portal's "TTC Streetcar Delay Data" dataset, which provides detailed records of delay incidents across the TTC streetcar network. The dataset is published by the Toronto Transit Commission as part of its open data initiative.

The primary dataset includes the following fields:
- **_id**: Unique identifier for each delay record
- **Date**: The date when the delay occurred (YYYY-MM-DD format)
- **Line**: The streetcar route number and name (e.g., "504 KING")
- **Time**: The time when the delay occurred (24-hour format)
- **Day**: Day of the week
- **Station**: Location description where the delay occurred
- **Code**: TTC delay code (categorized alphanumeric code)
- **Min Delay**: Delay in minutes to the schedule for the following vehicle
- **Min Gap**: Time in minutes between the vehicle ahead and the following vehicle
- **Bound**: Direction of the streetcar route (E, W, N, S)
- **Vehicle**: Vehicle number

Here are example records from the dataset:

| _id | Date       | Line        | Time  | Day       | Station               | Code  | Min Delay | Min Gap | Bound | Vehicle |
|-----|------------|-------------|-------|-----------|-----------------------|-------|-----------|---------|-------|---------|
| 1   | 2025-01-01 | 504 KING    | 02:10 | Wednesday | KING AND PARLIAMENT   | MTSAN | 10        | 20      | W     | 4569    |
| 2   | 2025-01-01 | 506 CARLTON | 02:50 | Wednesday | COLLEGE AND HURON     | MTAFR | 34        | 49      | E     | 4480    |
| 3   | 2025-01-01 | 504 KING    | 03:11 | Wednesday | KING AND QUEEN EAST   | MTIE  | 15        | 30      | E     | 4629    |
| 4   | 2025-01-01 | 501 QUEEN   | 05:14 | Wednesday | QUEEN AND DUFFERIN    | MTAFR | 36        | 55      | E     | 4455    |
| 5   | 2025-01-01 | 504 KING    | 06:14 | Wednesday | LESLIE CARHOUSE       | MTVIS | 10        | 20      | E     | 4483    |

The first record indicates a 10-minute delay on the 504 King streetcar line at King and Parliament on Wednesday, January 1, 2025, at 2:10 AM, with a delay code of MTSAN (Unsanitary Vehicle).

### Delay Codes and Their Descriptions

A supplementary dataset provides descriptions for the delay codes, with fields:
- **_id**: Unique identifier for each code
- **CODE**: The alphanumeric delay code
- **DESCRIPTION**: Human-readable description of the delay cause

For example:

| _id | CODE  | DESCRIPTION               |
|-----|-------|----------------------------|
| 1   | ETAC  | HVAC                      |
| 2   | ETAR  | ARTICULATION (ALRV)       |
| 3   | ETAX  | AUXILARY POWER SUPPLY     |
| 4   | ETBO  | BODY                      |
| 5   | ETCA  | COMPRESSED AIR            |

The delay codes are categorized with prefixes that indicate the general category of the delay:
- **ET**: Equipment-related delays
- **MT**: Miscellaneous operations-related delays
- **PT**: Plant-related delays (infrastructure)
- **ST**: Security-related delays
- **TT**: Transportation-related delays

The code reference dataset does not describe every code that appears in the delay data. Specifically, codes that start with 'EF', 'MF', 'SF', 'TF', or 'PF' have no entry in the reference dataset, so some delay causes in our analysis remain without detailed descriptions.
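The prefix scheme above can be sketched as a small lookup. This is a minimal illustration, not the original analysis script: the function name `categorize`, the label "Undocumented", and the code `MFXX` are hypothetical, and how the original scripts treated undocumented prefixes is not stated.

```python
from collections import Counter

# Prefix-to-category mapping taken from the list above.
CATEGORY_BY_PREFIX = {
    "ET": "Equipment",
    "MT": "Miscellaneous operations",
    "PT": "Plant (infrastructure)",
    "ST": "Security",
    "TT": "Transportation",
}


def categorize(code: str) -> str:
    """Return the category for a delay code, or 'Undocumented' for an unknown prefix."""
    prefix = code.strip().upper()[:2]
    return CATEGORY_BY_PREFIX.get(prefix, "Undocumented")


# Codes from the sample tables above, plus one hypothetical undocumented code ("MFXX").
sample_codes = ["MTSAN", "MTAFR", "MTIE", "MTAFR", "MTVIS", "ETAC", "MFXX"]
category_counts = Counter(categorize(code) for code in sample_codes)
print(category_counts)
```

Keeping unknown prefixes in their own bucket, rather than dropping them, makes the size of the documentation gap visible in any later aggregation.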

### Data Processing and Analysis

The data processing pipeline for this analysis involved several steps:

1. **Data Retrieval**: Obtaining the raw data from Toronto's Open Data Portal using their API.
2. **Data Cleaning**: Merging the delay incidents with their corresponding code descriptions, handling malformed characters, and standardizing formats.
3. **Feature Extraction**: Deriving new features including hour of day and categorical grouping of delay codes.
4. **Aggregation and Analysis**: Performing various aggregations to identify patterns across time, space, and delay categories.
5. **Visualization**: Creating informative visualizations to represent the findings.

Python was used for all data processing and analysis, with pandas for data manipulation, matplotlib and seaborn for visualization, and specialized libraries like squarify for treemap visualizations. The analysis scripts were structured to follow a consistent pattern of loading, processing, analyzing, and visualizing the data.
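The cleaning and feature-extraction steps can be sketched with pandas as follows. This is an illustrative sketch under stated assumptions rather than the original scripts: the rows are the sample records shown earlier, the lookup table contains only the two descriptions quoted in this paper, and the derived column names `Hour` and `Category` are hypothetical.

```python
import pandas as pd

# Sample delay records copied from the example table in the Data section.
delays = pd.DataFrame(
    {
        "Date": ["2025-01-01"] * 5,
        "Line": ["504 KING", "506 CARLTON", "504 KING", "501 QUEEN", "504 KING"],
        "Time": ["02:10", "02:50", "03:11", "05:14", "06:14"],
        "Code": ["MTSAN", "MTAFR", "MTIE", "MTAFR", "MTVIS"],
        "Min Delay": [10, 34, 15, 36, 10],
    }
)

# Code descriptions; only the two descriptions quoted in this paper are included.
codes = pd.DataFrame(
    {
        "CODE": ["MTSAN", "MTAFR"],
        "DESCRIPTION": ["Unsanitary Vehicle", "Auto Foul Rail"],
    }
)

# Cleaning: a left join keeps every delay record, even when its code has no description.
merged = delays.merge(codes, how="left", left_on="Code", right_on="CODE").drop(columns="CODE")

# Feature extraction: hour of day and the two-letter category prefix.
merged["Hour"] = pd.to_datetime(merged["Time"], format="%H:%M").dt.hour
merged["Category"] = merged["Code"].str[:2]

print(merged[["Line", "Code", "DESCRIPTION", "Hour", "Category"]])
```

A left join is used here so that records with undocumented codes are retained with a missing description instead of being silently dropped.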

## Results

### Spatial Distribution of Delays

The spatial analysis reveals key problem areas in the TTC streetcar network (Figure 1). Three major stations account for 25% of all location-specific delays: Broadview Station (103 delays), Spadina Station (86 delays), and Dundas West Station (75 delays). These are all major transfer points where multiple routes connect.

![Stations with the highest number of delays](repo_root/outputs/06-delay_stations.png)

*Figure 1: Top 15 stations with the highest number of delays.*

Terminal loops also show high delay counts, with Sunnyside Loop (48 delays), Humber Loop (46 delays), and Kingston Loop (42 delays) all appearing in the top 10. This suggests problems with vehicle turnarounds and schedule management at endpoints.

The east side of the network appears to have more delay hotspots than the west side, with several major delay locations (Broadview Station, Kingston Loop) in the eastern portion of the network. This may indicate differences in infrastructure quality or operational practices between these areas.

### Temporal Patterns

#### Time of Day Analysis

The hourly delay distribution challenges typical assumptions about when transit is most reliable (Figure 2). Delays peak at 2-3 PM (216 delays) and 8-9 PM (204 delays), not during traditional rush hours.

![Frequency of Delays by Hour of Day](repo_root/outputs/05-delay_time_barchart.png)

*Figure 2: Distribution of delay incidents by hour of day.*

The 2-3 PM peak occurs before the afternoon rush, suggesting that preparing for peak service may be more problematic than handling peak crowds. The 8-9 PM peak happens when supervisor coverage is typically reduced, which may explain the higher delay rates.

Early morning hours (4-6 AM) show consistent delays despite low ridership and traffic. This points to internal issues like vehicle deployment and staff availability at the start of service.

The relatively small variation between hours suggests that delay factors are present throughout the day rather than tied to specific peak periods.
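An hourly distribution of this kind can be computed along the following lines. This is a minimal sketch that uses only the five sample times from the Data section, so its output does not reproduce the figures reported above; the variable names are hypothetical.

```python
import pandas as pd

# Times from the sample records; a real run would use the full "Time" column.
times = pd.Series(["02:10", "02:50", "03:11", "05:14", "06:14"])

hours = pd.to_datetime(times, format="%H:%M").dt.hour

# Count delays per hour, then reindex so hours with no delays appear as 0.
hourly_counts = hours.value_counts().reindex(range(24), fill_value=0)

peak_hour = hourly_counts.idxmax()
print(f"Peak hour: {peak_hour}:00-{peak_hour + 1}:00 ({hourly_counts.max()} delays)")
```

Reindexing to all 24 hours keeps empty hours on the chart axis, which matters when comparing quiet overnight periods with daytime service.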

#### Day of Week Distribution

Delays occur at similar rates across all days of the week (Figure 3), with Saturday showing only slightly more delays (16.1%) than the lowest day, Wednesday (12.8%).

![Delays by Day of Week](repo_root/outputs/03-weekday_delays.png)

*Figure 3: Distribution of delays by day of the week.*

This pattern challenges the common practice of reducing resources on weekends. With weekend days accounting for nearly one-third of all delays, and fewer staff typically scheduled on weekends, each delay may have greater impact due to reduced response capacity.
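Day-of-week shares, including the combined weekend share, can be derived from the `Date` field as sketched below. The dates here are hypothetical placeholders chosen for illustration (January 1, 2025 is a Wednesday, as in the sample records), so the percentages printed are not the study's results.

```python
import pandas as pd

DAY_ORDER = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical dates for illustration; a real run would use the full "Date" column.
dates = pd.to_datetime(pd.Series(["2025-01-01", "2025-01-04", "2025-01-04", "2025-01-05"]))

# Share of delays per day of week, as a percentage, in calendar order.
day_names = dates.dt.day_name()
shares = day_names.value_counts(normalize=True).reindex(DAY_ORDER, fill_value=0.0) * 100

weekend_share = shares.loc[["Saturday", "Sunday"]].sum()
print(shares.round(1))
print(f"Weekend share: {weekend_share:.1f}%")
```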

#### Monthly Trends

February shows nearly twice as many delays (1,900) as January (1,000) or March (1,100), as seen in Figure 4.

![Monthly Delay Distribution](repo_root/outputs/02-monthly_delays.png)

*Figure 4: Distribution of delays by month in 2025.*

This February spike likely reflects the accumulated effects of winter conditions. While January brings the initial winter weather, February combines sustained cold, built-up snow and ice, and ongoing salt damage to infrastructure. The decrease in March suggests some recovery as conditions improve, though delay levels remain higher than January.
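Because February is the shortest month, a per-day rate gives a fairer comparison than raw monthly totals. The sketch below uses the rounded totals quoted above, so the results are approximate; the variable names are hypothetical.

```python
import calendar

# Approximate monthly totals as reported above (rounded figures from the text).
monthly_delays = {1: 1000, 2: 1900, 3: 1100}
YEAR = 2025

# Divide each month's total by its number of calendar days.
delays_per_day = {
    month: total / calendar.monthrange(YEAR, month)[1]
    for month, total in monthly_delays.items()
}

for month, rate in delays_per_day.items():
    print(f"{calendar.month_name[month]}: {rate:.1f} delays per day")
```

With these rounded totals, February works out to roughly 68 delays per day against about 32 in January and 35 in March, so the gap widens slightly on a per-day basis.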

### Delay Causes

#### Categories of Delay Incidents

The analysis of delay types shows that operational issues, not equipment failures, are the main problem (Figure 5).

![Delay Incident Categories](repo_root/outputs/04-delay_incidents_treemap.png)

*Figure 5: Treemap showing the distribution of delay incidents by category.*

Miscellaneous Operations [MT] account for 60% of all delays (2,442 incidents), far outweighing equipment problems. Security issues [ST] cause 14% of delays (571 incidents), showing how passenger behavior significantly impacts service.
Equipment failures [ET] (194 incidents, 5%) and infrastructure issues [PT] (67 incidents, 2%) are relatively minor factors, suggesting that vehicle and track reliability are not the main problems.

#### Specific Delay Codes

Looking at specific delay codes reveals more detail about what causes service disruptions (Figure 6).

![Top 15 Delay Incident Codes](repo_root/outputs/04-top_delay_codes.png)

*Figure 6: Bar chart showing the top 15 delay incident codes by frequency.*

Just four types of incidents account for over 40% of all delays:

1. **Auto Foul Rail (MTAFR)**: Over 650 incidents where vehicles encounter track obstructions
2. **On Diversion (MTDV)**: About 400 incidents where vehicles must detour from normal routes
3. **Disorderly Patron (STDP)**: Over 300 incidents involving passenger behavior issues
4. **Unsanitary Vehicle (MTSAN)**: Over 300 incidents requiring vehicle cleaning

The high rate of Auto Foul Rail incidents points to consistent problems with track-vehicle interfaces. The frequency of Disorderly Patron delays highlights how social factors affect transit reliability.
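The "over 40%" figure can be checked with rough arithmetic. The counts below are the approximate values from the list above, and the total is an assumption: it is backed out from the reported 2,442 Miscellaneous Operations incidents being 60% of all delays, which is itself a rounded percentage.

```python
# Approximate counts read from the list above ("over 650", "about 400", "over 300").
top_codes = {"MTAFR": 650, "MTDV": 400, "STDP": 300, "MTSAN": 300}

# Assumption: total estimated from 2,442 MT incidents being 60% of all delays.
estimated_total = 2442 / 0.60

top_four_share = sum(top_codes.values()) / estimated_total
print(f"Top four codes: {top_four_share:.1%} of about {estimated_total:.0f} delays")
```

Since three of the four counts are lower bounds, the true share is likely somewhat above this estimate.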

## Discussion

The comprehensive analysis of TTC streetcar delays from January to March 2025 reveals several important patterns and insights that can inform operational improvements.

### Spatial Concentration of Delays

The concentration of delays at major stations such as Broadview, Spadina, and Dundas West suggests that these interchange points may benefit from targeted interventions. These stations serve as connection points between multiple routes and transportation modes, creating complex operational environments. The TTC could consider:

1. Implementing specialized staff training for operators and supervisors at these high-delay locations
2. Reviewing scheduling practices at interchange points to build in appropriate recovery time
3. Enhancing passenger flow management at these stations during peak periods

The high incidence of delays at loop terminals (Humber, Kingston, Sunnyside) also suggests that turn-around operations may need review. Delays at these points can cascade throughout the system, affecting multiple routes.

### Temporal Insights

The time-of-day analysis reveals several critical periods for delay management:

1. **Afternoon Peak (2:00-3:00 PM)**: This period, with the highest number of delays, aligns with the transition between midday and PM rush hour service. The TTC might consider adding supervisory resources during this transition period.

2. **Evening Operations (8:00-9:00 PM)**: The secondary peak in the evening suggests that resource allocation might need adjustment during what is traditionally considered "off-peak" time.

3. **Early Morning Operations (4:00-6:00 AM)**: The consistent pattern of delays during service start-up suggests potential issues with vehicle preparation or operator availability that could be addressed through procedural improvements.

The relatively even distribution of delays across days of the week contradicts the common assumption that weekdays, with their rush-hour demand, would show substantially higher delay rates than weekends. This finding suggests that the TTC's current day-of-week resource allocation model may need reconsideration to ensure consistent service quality throughout the week.

The monthly spike in February aligns with typical winter conditions in Toronto and highlights the need for seasonal preparedness strategies. The TTC might consider:

1. Enhanced winter operational protocols specific to February conditions
2. Preventive maintenance focused on cold-weather vulnerabilities
3. Staffing adjustments to account for the historically higher February delay rates

### Delay Causes and Potential Interventions

The dominance of Miscellaneous Operations (MT) delays, particularly "Auto Foul Rail" (MTAFR) and "On Diversion" (MTDV), points to operational issues as the primary contributor to service disruptions. These operational delays often involve interactions between vehicles and the rail environment, suggesting that:

1. Track maintenance schedules may need review to reduce instances of fouled rails
2. Diversion management procedures could be streamlined to reduce delay impacts

The high incidence of security-related delays (ST category), particularly "Disorderly Patron" (STDP), indicates that passenger behavior significantly impacts service reliability. If funding permits, the TTC might consider:

1. Expanding customer service training for operators to better manage disruptive situations
2. Increasing security presence at high-incident locations identified in the spatial analysis
3. Implementing public education campaigns about respectful transit use

Equipment-related delays (ET category), while less frequent, still represent significant service disruptions. A preventive maintenance program targeting the most common equipment failures could help reduce these incidents.

### Limitations and Future Research

This analysis has several limitations that should be acknowledged:

1. The three-month time frame (January-March 2025) may not capture seasonal variations throughout the entire year.
2. The lack of explanation for certain delay codes (those beginning with 'EF', 'MF', 'SF', 'TF', or 'PF') limits our understanding of some delay categories.
3. The dataset does not include information on the number of passengers affected by each delay, which would help quantify the service impact.

Future research could address these limitations by:
1. Extending the analysis to a full year of data to capture seasonal patterns
2. Incorporating passenger volume data to weigh delays by their impact
3. Conducting comparative analysis with previous years to identify trends over time
4. Investigating the relationship between weather conditions and specific delay types

## References

Barron, A., Melo, P. C., Cohen, J., & Anderson, R. J. (2013). Passenger-focused management approach to the measurement of train delays. Transportation Research Record, 2351(1), 46-53.

Cats, O., West, J., & Eliasson, J. (2016). A dynamic stochastic model for evaluating congestion and crowding effects in transit systems. Transportation Research Part B: Methodological, 89, 43-57.

Diab, E. I., & El-Geneidy, A. M. (2013). Variation in public transit quality of service by time of day and day of week: A case study in Montreal, Canada. Transportation Research Record, 2351(1), 18-26.

Toronto Transit Commission (TTC). (2022). Service Standards and Decision Rules for Planning Transit Service. Retrieved from https://www.ttc.ca/transparency-and-accountability/transit-planning

Werner, M., Schön, C., & König, A. (2018). Systematic literature review on machine learning approaches for public transportation delay prediction. arXiv preprint arXiv:1812.07170.