# 77 Travel Times before and after bus lane
In November 2021, bus lanes were added to Mass Ave between Dudley Street and Alewife Brook Parkway (see https://www.cambridgema.gov/CDD/Transportation/regionalplanning/masstransit/buspriority). This notebook compares bus travel times before and after their implementation. Unfortunately, all of this takes place against the backdrop of Covid, which has undoubtedly had a large effect on bus ridership and road traffic. The notebook makes some plots and includes some attempts at controls. However, take care not to draw causal conclusions from its contents: without more rigorous statistical analysis, we can't claim anything firm.

This whole endeavour was inspired by https://twitter.com/PetruSofio/status/1508555535360180230

In [None]:
import matplotlib.pyplot as plt
import pandas as pd

from mbta_analysis import (
    load_months,
    plot_travel_times_by_chunked_departure,
    travel_time,
)
from mbta_analysis._util import to_min

In [None]:
fname = "data/input/2022/2022_01_SBM.csv"
df = load_months(fname)

In [None]:
# basic usage of travel_time, for a single route
tt = travel_time(df, 1, ("hynes", "cntsq"))
tt

In [None]:
# making a super basic plot
# This is the median time with shading for the 25th and 75th quantiles
plot_travel_times_by_chunked_departure(tt.loc["01", "Outbound"])
plot_travel_times_by_chunked_departure(tt.loc["01", "Inbound"])
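
For readers without `mbta_analysis` at hand, the summary behind a plot like this can be sketched in plain pandas. This is not the package's implementation: the synthetic data, the half-hour bin width, and the helper name `summarize_by_chunk` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def summarize_by_chunk(departure_hour, travel_min, chunk_hours=0.5):
    """Median and 25th/75th percentiles of travel time per departure-time bin.

    Hypothetical helper, not part of mbta_analysis.
    """
    s = pd.Series(np.asarray(travel_min, dtype=float))
    # Label each trip with the start of its bin, e.g. 7.75 -> 7.5 for half-hour bins.
    chunk = np.floor(np.asarray(departure_hour, dtype=float) / chunk_hours) * chunk_hours
    # One row per bin, one column per requested quantile.
    out = s.groupby(chunk).quantile([0.25, 0.5, 0.75]).unstack()
    out.columns = ["q25", "median", "q75"]
    out.index.name = "chunk_start_hour"
    return out


# Synthetic trips (made up, not MBTA data).
rng = np.random.default_rng(0)
hours = rng.uniform(6, 10, size=500)
minutes = 8 + 2 * np.sin(hours) + rng.normal(0, 1, size=500)
summary = summarize_by_chunk(hours, minutes)
```

Plotting `summary["median"]` against the index and shading between `summary["q25"]` and `summary["q75"]` with matplotlib's `fill_between` would give the same kind of picture: a median line with an interquartile band.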

## Adding more months

Of course, we can't really draw any conclusions from that. There are way too many other variables for a time-point-to-time-point comparison to be meaningful. At the very least, we need to compare against another bus route in the same time period that didn't get a bus lane. Unfortunately, we can't simply compare the same route across different years either, because it's unclear how Covid would affect that comparison.

And the same caveat as in the intro applies here - no drawing big conclusions! We still haven't done any proper statistics.
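
One way to phrase the control-route idea is as a difference-in-differences: take the before-to-after change on the route that got the bus lane and subtract the change on the route that didn't. The sketch below uses made-up numbers, and the column names `route`, `period`, and `travel_min` are assumptions for illustration, not the columns returned by `travel_time`.

```python
import pandas as pd

# Made-up travel times in minutes; not MBTA data.
trips = pd.DataFrame(
    {
        "route": ["77"] * 4 + ["01"] * 4,
        "period": ["before", "before", "after", "after"] * 2,
        "travel_min": [10.0, 12.0, 8.0, 9.0, 9.0, 11.0, 9.5, 9.5],
    }
)

# Mean travel time for each (route, period) cell, as a route x period table.
means = trips.groupby(["route", "period"])["travel_min"].mean().unstack("period")

# Before-to-after change on each route (negative means faster).
change = means["after"] - means["before"]

# Difference-in-differences: change on the treated route (77) minus the
# change on the control route (01).
did = change["77"] - change["01"]
print(f"77 change: {change['77']:+.2f} min")
print(f"01 change: {change['01']:+.2f} min")
print(f"difference-in-differences: {did:+.2f} min")
```

This only isolates the bus-lane effect if both routes would otherwise have changed by the same amount, which is itself an assumption that needs checking.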

In [None]:
# loading more months so we can show the difference
files = [
    "data/input/2022/2022_01_SBM.csv",
    "data/input/2022/2022_02_SBM.csv",
    "data/input/2021/MBTA-Bus-Arrival-Departure-Times_2021-09.csv",
    "data/input/2021/MBTA-Bus-Arrival-Departure-Times_2021-10.csv",
]
df = load_months(files)

In [None]:
df

In [None]:
%%time
# compute each of the diffs we are interested in
tt = travel_time(df, [1, 77], [("hynes", "cntsq"), ("portr", "alwpk")])

In [None]:
tt

In [None]:
ylims = [4, 14]
label_fs = 14
title_fs = 16
fig, axd = plt.subplot_mosaic(
    """
    AAC
    BBC
    """,
    figsize=(10, 6),
    layout="constrained",
)
ax = axd["A"]
plt.sca(ax)

ax.set_ylim(ylims)

ax.set_title("77 Bus - Porter to Alewife - Priority Lane added", fontsize=title_fs)
plot_travel_times_by_chunked_departure(
    tt.loc["77", "Outbound", :"2022-01-01"], label="Before Bus Lanes"
)
plot_travel_times_by_chunked_departure(
    tt.loc["77", "Outbound", "2022-01-01":], label="After Bus Lanes"
)
ax.legend()

ax = axd["B"]
ax.set_ylim(ylims)
plt.sca(ax)
plot_travel_times_by_chunked_departure(
    tt.loc["01", "Outbound", :"2022-01-01"], label="Sept - Oct 2021"
)
plot_travel_times_by_chunked_departure(
    tt.loc["01", "Outbound", "2022-01-01":], label="2022"
)
ax.legend()
ax.set_title("1 Bus - Hynes to Central - No bus priority Lane", fontsize=title_fs)
ax.set_xlabel("Hour of day", fontsize=label_fs)

ax = axd["C"]
ax.axhline(0, color="k", alpha=0.65)
ax.plot(
    to_min(
        pd.Series(
            tt.loc["01", "Outbound"]["2021-12-31":].groupby("scheduled-chunked").mean()
            - tt.loc["01", "Outbound"][:"2021-12-31"]
            .groupby("scheduled-chunked")
            .mean()
        )
    ),
    color="C4",
    label="1 Bus",
)
ax.plot(
    to_min(
        pd.Series(
            tt.loc["77", "Outbound"]["2021-12-31":].groupby("scheduled-chunked").mean()
            - tt.loc["77", "Outbound"][:"2021-12-31"]
            .groupby("scheduled-chunked")
            .mean()
        )
    ),
    color="C3",
    label="77 Bus",
)
ax.legend()
ax.set_title("Difference", fontsize=title_fs)
ax.set_xlabel("Hour of day", fontsize=label_fs)
ax.yaxis.tick_right()
ax.yaxis.set_label_position("right")
ax.set_ylabel("Δt (min)", fontsize=label_fs)
fig.supylabel("Avg Travel Time (minutes)", fontsize=label_fs)
plt.savefig("77-and-1-sept-v-jan.png", facecolor="white")

## Takeaways

Again, we must do real statistics to make firm claims, but that certainly looks good. Naively, there was some baseline improvement due to other factors, and the bus lanes had a large effect on the morning and early evening commutes.

### TODOs

Still several improvements to make.

1. Most important: propagate the noise to the difference. Without that, it's not super meaningful.
2. Explain the 25.5 hour thing
    - On the first day of the month, trips after midnight get counted for the previous day, so they show up as leaving at 25 hours
3. Stats???
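
For the first TODO, a minimal sketch of propagating the noise: if the before and after samples are treated as independent, the variance of the difference of their means is the sum of the two variances of the means. The numbers below are made up, and the helper name `diff_of_means` is hypothetical.

```python
import math
import statistics


def diff_of_means(before, after):
    """Return (after mean - before mean, standard error of that difference).

    Assumes the two samples are independent, so their variances add.
    """
    diff = statistics.fmean(after) - statistics.fmean(before)
    se = math.sqrt(
        statistics.variance(before) / len(before)
        + statistics.variance(after) / len(after)
    )
    return diff, se


# Made-up travel times in minutes for one departure-time chunk; not MBTA data.
before = [10.0, 12.0, 11.0, 13.0]
after = [8.0, 9.0, 10.0, 9.0]
diff, se = diff_of_means(before, after)
print(f"change: {diff:+.2f} ± {se:.2f} min")
```

Applied per departure-time chunk, this would give an error band to draw around each line in the "Difference" panel. In pandas, the same quantities could come from `groupby(...).agg(["mean", "var", "count"])` on each period. Trips on the same day are probably not fully independent, so this likely understates the true uncertainty.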

