In [1]:
import pandas as pd
from IPython.display import display


In [2]:
wego = pd.read_csv("../data/Route 50 Timepoint and Headway Data, 1-1-2023 through 5-12-2025.csv")

# WeGo Public Transit
[WeGo Public Transit](https://www.wegotransit.com/) is a public transit system serving the Greater Nashville and Davidson County area. WeGo provides local and regional bus routes, the WeGo Star train service connecting Lebanon to downtown Nashville, along with several other transit services.

The data for this project can be downloaded from [here](https://drive.google.com/drive/folders/1L8d3xEaPD13BMz_k-3G8XRRLvPIbNRq9?usp=sharing).

Since 2019, WeGo has been using [**Transit Signal Priority (TSP)**](https://www.wegotransit.com/projects/transit-signal-priority/), a technology that helps to manage traffic flow more efficiently. For buses, it reduces wait times at traffic signals by holding green lights longer, shortening red lights, or, in some cases, allowing buses to bypass traffic.

The data that you have been provided was collected for trips on Route 50, Charlotte Pike. TSP has been used on portions of this route, with different periods of being on or off, either conditionally or unconditionally. For these timespans, TSP was used between White Bridge and MCC, including all intervening timepoints, in both directions.
The important dates are as follows:

* February 3rd @ 12 noon: TSP Turned On (Unconditional)

* February 10th @ 12 noon: TSP Schedule-Conditional Priority Begins (Only buses more than 2 minutes late receive priority)

* April 28th @ 12 noon: TSP Turned Off

* May 5th @ 12 noon: TSP Turned On (Unconditional)

* May 12th @ 12 noon: TSP Headway-Conditional Priority Begins (Only gapped buses with actual leading headway more than 120% of scheduled headway receive priority)
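
One way to use these dates is to tag every observation with the TSP phase in effect at that moment. The sketch below makes two assumptions that the text does not state: that the dates fall in 2025 (inferred from the `CALENDAR_ID` values filtered elsewhere in this notebook), and that a combined date-and-time timestamp is available for each row. The phase labels and the `label_tsp_phase` helper are hypothetical names chosen for illustration.

```python
import numpy as np
import pandas as pd

# Assumption: all phase changes happen in 2025 (the year is not given in the text).
TSP_PHASE_STARTS = pd.to_datetime([
    "2025-02-03 12:00",  # TSP turned on (unconditional)
    "2025-02-10 12:00",  # schedule-conditional priority begins
    "2025-04-28 12:00",  # TSP turned off
    "2025-05-05 12:00",  # TSP turned on (unconditional)
    "2025-05-12 12:00",  # headway-conditional priority begins
])

# Hypothetical labels: one for the time before the first listed date,
# then one per phase start above.
TSP_PHASE_LABELS = [
    "before listed dates",
    "unconditional (first period)",
    "schedule-conditional",
    "off",
    "unconditional (second period)",
    "headway-conditional",
]


def label_tsp_phase(timestamps: pd.Series) -> pd.Series:
    """Return the TSP phase label in effect at each timestamp."""
    # side="right" counts how many phase starts are <= each timestamp,
    # so a timestamp exactly at noon belongs to the phase that starts then.
    positions = np.searchsorted(
        TSP_PHASE_STARTS.to_numpy(), timestamps.to_numpy(), side="right"
    )
    return pd.Series(
        [TSP_PHASE_LABELS[i] for i in positions], index=timestamps.index
    )


example = pd.Series(pd.to_datetime([
    "2025-01-15 08:00",
    "2025-02-03 12:00",
    "2025-02-10 11:59",
    "2025-05-12 13:00",
]))
phases = label_tsp_phase(example)
```

Because the changes happen at noon, labeling by date alone would misclassify half of each changeover day, which is why the sketch works on full timestamps.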


In [72]:
wego.head()

Unnamed: 0,CALENDAR_ID,SERVICE_ABBR,ADHERENCE_ID,DATE,ROUTE_ABBR,BLOCK_ABBR,OPERATOR,TRIP_ID,OVERLOAD_ID,ROUTE_DIRECTION_NAME,...,ACTUAL_HDWY,HDWY_DEV,ADJUSTED_EARLY_COUNT,ADJUSTED_LATE_COUNT,ADJUSTED_ONTIME_COUNT,STOP_CANCELLED,PREV_SCHED_STOP_CANCELLED,IS_RELIEF,BLOCK_STOP_ORDER,DWELL_IN_MINS
0,120230101,3,93549161,2023-01-01,50,5000,2355,332422,0,TO DOWNTOWN,...,,,0,0,1,0,0.0,0,2,8.133333
1,120230101,3,93549162,2023-01-01,50,5000,2355,332422,0,TO DOWNTOWN,...,,,0,0,1,0,0.0,0,5,0.0
2,120230101,3,93549163,2023-01-01,50,5000,2355,332422,0,TO DOWNTOWN,...,,,0,0,1,0,0.0,0,11,0.0
3,120230101,3,93549164,2023-01-01,50,5000,2355,332422,0,TO DOWNTOWN,...,,,0,0,1,0,0.0,0,13,0.0
4,120230101,3,93549165,2023-01-01,50,5000,2355,332422,0,TO DOWNTOWN,...,,,0,0,1,0,0.0,0,18,2.15


In [71]:
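# Select a single trip on a single day; combine both conditions in one mask
# rather than chaining .loc calls with masks built on the full frame.
(
    wego
    .loc[
        (wego['CALENDAR_ID'] == 120240203)
        & (wego['TRIP_ID'] == 371878),
        [
            'DATE', 'CALENDAR_ID', 'TRIP_ID', 'ROUTE_ABBR',
            'TIME_POINT_ABBR', 'TRIP_EDGE',
            'SCHEDULED_TIME', 'ACTUAL_DEPARTURE_TIME', 'ADHERENCE',
            'ADJUSTED_EARLY_COUNT', 'ADJUSTED_LATE_COUNT', 'ADJUSTED_ONTIME_COUNT'
        ]
    ]
)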

Unnamed: 0,DATE,CALENDAR_ID,TRIP_ID,ROUTE_ABBR,TIME_POINT_ABBR,TRIP_EDGE,SCHEDULED_TIME,ACTUAL_DEPARTURE_TIME,ADHERENCE,ADJUSTED_EARLY_COUNT,ADJUSTED_LATE_COUNT,ADJUSTED_ONTIME_COUNT
282333,2024-02-03,120240203,371878,50,WALM,1,05:34:00,05:34:45,-0.75,0,0,1
282334,2024-02-03,120240203,371878,50,HLWD,0,05:40:00,05:40:04,-0.066666,0,0,1
282335,2024-02-03,120240203,371878,50,WHBG,0,05:47:00,05:45:43,1.283333,1,0,0
282336,2024-02-03,120240203,371878,50,CH46,0,05:50:00,05:50:40,-0.666666,0,0,1
282337,2024-02-03,120240203,371878,50,28&CHARL,0,05:54:00,05:54:52,-0.866666,0,0,1
282338,2024-02-03,120240203,371878,50,MCC5_1,2,06:05:00,06:04:18,0.7,0,0,1


The first main variable you will be studying in this project is **adherence**, which compares the actual departure time to the scheduled time, in minutes, and is included in the ADHERENCE column. A negative adherence value means that a bus left a time point late, and a positive value means that it left early. Buses with adherence values below -6 are generally considered late, and those above 1 are considered early. However, staff apply waivers that allow some early departures. For example, express buses that have already picked up everyone at a park-and-ride lot and are only dropping off passengers may be allowed to leave early. Early departures are also permitted at the end of a trip (when TRIP_EDGE = 2), since they do not affect upstream passengers. **Note:** When determining whether a bus is early or late, use the 'ADJUSTED_EARLY_COUNT', 'ADJUSTED_LATE_COUNT', and 'ADJUSTED_ONTIME_COUNT' columns, which account for these adjustments.
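
To make the definition concrete, here is a minimal sketch that recomputes adherence as scheduled time minus actual departure time, in minutes, and applies the raw -6 and +1 thresholds. The first two rows mirror the trip shown above; the third row is a hypothetical end-of-trip departure, and the `ADHERENCE_MIN`, `RAW_STATUS`, and `raw_status` names are invented for this example. It assumes the times are plain `HH:MM:SS` strings, which may not match the raw file.

```python
import pandas as pd

# Rows 1-2 mirror the sample trip above; row 3 is hypothetical (end of trip).
sample = pd.DataFrame({
    "SCHEDULED_TIME": ["05:34:00", "05:47:00", "06:05:00"],
    "ACTUAL_DEPARTURE_TIME": ["05:34:45", "05:45:43", "06:03:00"],
    "TRIP_EDGE": [1, 0, 2],
})

scheduled = pd.to_timedelta(sample["SCHEDULED_TIME"])
actual = pd.to_timedelta(sample["ACTUAL_DEPARTURE_TIME"])

# Scheduled minus actual: negative means late, positive means early.
sample["ADHERENCE_MIN"] = (scheduled - actual).dt.total_seconds() / 60


def raw_status(adherence_min: float) -> str:
    """Classify using only the raw thresholds, ignoring any waivers."""
    if adherence_min < -6:
        return "late"
    if adherence_min > 1:
        return "early"
    return "on time"


sample["RAW_STATUS"] = sample["ADHERENCE_MIN"].apply(raw_status)
```

The third row is flagged as early by the raw thresholds even though it is at the end of a trip (`TRIP_EDGE` of 2), where early departures are permitted. Cases like this are why the adjusted count columns are the safer basis for early, late, and on-time tallies.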

In [90]:
adjusted_counts = wego[[
    'DATE', 'CALENDAR_ID', 'OPERATOR', 'TRIP_ID', 'ROUTE_ABBR',
    'TIME_POINT_ABBR', 'TRIP_EDGE', 'SCHEDULED_TIME', 'ACTUAL_ARRIVAL_TIME',
    'ADHERENCE', 'DWELL_IN_MINS', 'SCHEDULED_HDWY', 'ACTUAL_HDWY', 'HDWY_DEV',
    'ROUTE_DIRECTION_NAME',
    'ADJUSTED_EARLY_COUNT', 'ADJUSTED_LATE_COUNT', 'ADJUSTED_ONTIME_COUNT'
]]

In [98]:
#adjusted_counts

In [101]:
# adjusted_counts.loc[(adjusted_counts['CALENDAR_ID'] == 120250203) & (adjusted_counts['ADHERENCE'] < -6)]

In [102]:
# adjusted_counts.loc[(adjusted_counts['CALENDAR_ID'] == 120250210) & (adjusted_counts['ADHERENCE'] < -6)]

In [103]:
# adjusted_counts.loc[(adjusted_counts['CALENDAR_ID'] == 120250428) & (adjusted_counts['ADHERENCE'] < -6)]

In [104]:
# adjusted_counts.loc[(adjusted_counts['CALENDAR_ID'] == 120250505) & (adjusted_counts['ADHERENCE'] < -6)]

In [110]:
adjusted_counts.loc[
    (adjusted_counts['CALENDAR_ID'] == 120250512) & 
    (adjusted_counts['ADHERENCE'] < -6) & 
    (adjusted_counts['TIME_POINT_ABBR'] == 'WHBG')]

Unnamed: 0,DATE,CALENDAR_ID,OPERATOR,TRIP_ID,ROUTE_ABBR,TIME_POINT_ABBR,TRIP_EDGE,SCHEDULED_TIME,ACTUAL_ARRIVAL_TIME,ADHERENCE,DWELL_IN_MINS,SCHEDULED_HDWY,ACTUAL_HDWY,HDWY_DEV,ROUTE_DIRECTION_NAME,ADJUSTED_EARLY_COUNT,ADJUSTED_LATE_COUNT,ADJUSTED_ONTIME_COUNT
618450,2025-05-12,120250512,2109,429504,50,WHBG,0,10:16:00,10:25:23,-9.383333,0.0,15.0,22.933333,7.933333,TO DOWNTOWN,0,1,0
618457,2025-05-12,120250512,2109,429505,50,WHBG,0,11:03:00,11:09:50,-6.833333,0.0,15.0,20.633333,5.633333,FROM DOWNTOWN,0,1,0
618774,2025-05-12,120250512,1587,429628,50,WHBG,0,12:46:00,12:50:21,-6.533333,2.183333,15.0,20.216666,5.216666,TO DOWNTOWN,0,1,0
618793,2025-05-12,120250512,1764,429631,50,WHBG,0,15:07:00,15:13:20,-6.333333,0.0,15.0,20.2,5.2,FROM DOWNTOWN,0,1,0
618810,2025-05-12,120250512,2362,429634,50,WHBG,0,17:46:00,17:56:24,-12.983333,2.583333,17.0,26.3,9.3,TO DOWNTOWN,0,1,0
618817,2025-05-12,120250512,2362,429635,50,WHBG,0,18:33:00,18:39:07,-6.116666,0.0,15.0,19.55,4.55,FROM DOWNTOWN,0,1,0
618889,2025-05-12,120250512,3328,429681,50,WHBG,0,09:18:00,09:25:51,-7.85,0.0,15.0,22.183333,7.183333,FROM DOWNTOWN,0,1,0
618937,2025-05-12,120250512,2706,429689,50,WHBG,0,15:22:00,15:30:01,-8.016666,0.0,15.0,16.683333,1.683333,FROM DOWNTOWN,0,1,0
618942,2025-05-12,120250512,2706,429690,50,WHBG,0,16:10:00,16:14:17,-9.3,5.016666,15.0,22.85,7.85,TO DOWNTOWN,0,1,0


In [80]:
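# Each ADJUSTED_*_COUNT column holds 0/1 flags, so its mean is the share of rows in that category.
early_mean = wego['ADJUSTED_EARLY_COUNT'].mean()
late_mean = wego['ADJUSTED_LATE_COUNT'].mean()
ontime_mean = wego['ADJUSTED_ONTIME_COUNT'].mean()

# Dividing by the total renormalizes in case the three shares do not sum to exactly 1.
total = early_mean + late_mean + ontime_mean

early_percent = (early_mean / total) * 100
late_percent = (late_mean / total) * 100
ontime_percent = (ontime_mean / total) * 100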

print(early_percent)
print(late_percent)
print(ontime_percent)

3.0172442491427813
12.185825094044409
84.7969306568128


The second main variable you'll be looking at is **headway**. This is the amount of time between a bus and the prior bus at the same stop. In the dataset, the scheduled headway is contained in the SCHEDULED_HDWY column and indicates the difference between the scheduled time for a particular stop and the scheduled time for the previous bus at that same stop.
This dataset also contains a column HDWY_DEV, which shows the deviation of the actual headway from the scheduled headway. **Bunching** occurs when there is shorter headway than scheduled, which appears as a negative HDWY_DEV value. **Gapping** is when there is more headway than scheduled and appears as a positive value in the HDWY_DEV column. Note that you can calculate headway deviation percentage as HDWY_DEV/SCHEDULED_HDWY. The generally accepted range for actual headway is 50% to 150% of the scheduled headway, so if the scheduled headway is 10 minutes, a headway deviation of up to 5 minutes in either direction would be acceptable (but not ideal).
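
The headway calculations above can be sketched as follows. The first two rows reuse headway values from the White Bridge output shown earlier; the last two rows are made up to show bunching and a missing headway. The `HDWY_DEV_PCT` and `HDWY_STATUS` columns and the `headway_status` helper are hypothetical names, and treating a missing headway as "unknown" is a choice made for this example, not something the dataset prescribes.

```python
import numpy as np
import pandas as pd

headways = pd.DataFrame({
    "SCHEDULED_HDWY": [15.0, 15.0, 10.0, 10.0],
    "ACTUAL_HDWY": [22.933333, 16.683333, 4.0, np.nan],
})

# Deviation is actual minus scheduled: negative = bunching, positive = gapping.
headways["HDWY_DEV"] = headways["ACTUAL_HDWY"] - headways["SCHEDULED_HDWY"]

# Deviation as a percentage of the scheduled headway.
headways["HDWY_DEV_PCT"] = (
    headways["HDWY_DEV"] / headways["SCHEDULED_HDWY"] * 100
)


def headway_status(dev_pct: float) -> str:
    """Flag deviations beyond 50% of the scheduled headway in either direction."""
    if pd.isna(dev_pct):
        return "unknown"
    if dev_pct < -50:
        return "excess bunching"
    if dev_pct > 50:
        return "excess gapping"
    return "acceptable"


headways["HDWY_STATUS"] = headways["HDWY_DEV_PCT"].apply(headway_status)
```

Many rows in the real data have no headway values at all, as the `wego.head()` output shows, so any real analysis needs an explicit rule for missing values like the one used here.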


How has TSP affected these two metrics? Keep in mind that there are many other factors that could also be contributing, so be sure to take into account things like day of the week, time of day, time of year (school in session or not), or other factors that may also be affecting adherence or headway deviation.
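
As a starting point for controlling for such factors, the sketch below compares on-time rates across TSP phases separately for weekdays and weekends. The values are toy data invented for illustration, and `TSP_PHASE` and `IS_WEEKEND` are hypothetical columns that are not part of the dataset; they would need to be derived first.

```python
import pandas as pd

# Toy data for illustration only; these are not values from the dataset.
toy = pd.DataFrame({
    "DATE": pd.to_datetime([
        "2025-01-27", "2025-01-27",  # Monday
        "2025-02-04", "2025-02-04",  # Tuesday
        "2025-02-08", "2025-02-08",  # Saturday
    ]),
    "TSP_PHASE": [
        "before", "before",
        "unconditional", "unconditional",
        "unconditional", "unconditional",
    ],
    "ADJUSTED_ONTIME_COUNT": [1, 0, 1, 1, 1, 0],
})

# dayofweek runs from 0 (Monday) to 6 (Sunday), so 5 and 6 are the weekend.
toy["IS_WEEKEND"] = toy["DATE"].dt.dayofweek >= 5

# On-time percentage within each (weekend, phase) group, so that phases are
# only compared against observations from the same kind of day.
ontime_rate = (
    toy
    .groupby(["IS_WEEKEND", "TSP_PHASE"])["ADJUSTED_ONTIME_COUNT"]
    .mean()
    .mul(100)
)
```

The same pattern extends to other grouping keys, such as hour of day or whether school is in session, at the cost of fewer observations per group.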