Exploring the reliability of the NYC Subway system
A project from the Microsoft Research Data Science Summer School class of 2018:
Akbar Mirza, Brian Hernandez, Amanda Rodriguez, Renzhentaxi Baerde, Phoebe Nguyen, Peter Farquharson, Ayliana Teitelbaum, Sasha Paulovich
The New York City subway is the largest rapid transit system in the world by number of stations, serving approximately 5.5 million riders each day. Concern over the state of the system has grown recently as its equipment ages, a decline reflected in system-wide metrics such as “on-time percentage”, which measures how often trains run according to schedule. While these metrics provide some insight into the performance of the subway, they fail to capture how riders experience it. In this project, we use recently released countdown clock data, which logs where each train is reported to be at each minute of the day, to gain a better understanding of how riders experience the subway. We examine rider wait times and trip times, considering not just the average but also the worst-case performance of the system. We also compare the subway to above-ground travel, investigate how changes to the system affect rider options, and look at how commutes vary across demographic groups. We find that the subway is typically quite reliable, but that averages can be misleading: variance in subway performance can account for up to a 50% difference between average and worst-case travel times. We also find a correlation between income and commute times, and that small changes to the system (e.g., adding or removing stops or lines) can have large effects on riders’ options.
Watch the talk for more details.