In [1]:
from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import HTML, Image, display
import warnings
# ** import our own package here **

warnings.filterwarnings("ignore")

In [2]:
!jupyter trust Mapping_Ridership.ipynb

Signing notebook: Mapping_Ridership.ipynb


# The Effects of COVID-19 on BART Ridership

# Exploring the Effects of COVID-19 on BART Commuter Ridership in Downtown San Francisco and Berkeley

Authors: Noor Wahle, Jiachen Li, Shih-hung Chiu, Jingya Zhao

## Introduction

Add background information on BART ridership pre/post COVID-19

## Data Cleaning

Data Source: https://www.bart.gov/about/reports/ridership

Background information on the data cleaning includes:
- only using weekday data so that we can observe trends in commute
- how we divided data into pre/post COVID-19 looking at the last 5 years
- used the entry data, not the exit data
- what else?
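
The choices listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual cleaning code: the column names (`date`, `entry_station`, `exit_station`, `trips`), the sample rows, and the `COVID_START` cutoff of March 2020 are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical origin-destination records; the real BART files may use
# different column names and layouts.
raw = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2019-03-01", "2019-03-02", "2019-03-04", "2020-06-05", "2020-06-06"]
        ),
        "entry_station": ["EMBR", "EMBR", "DBRK", "EMBR", "DBRK"],
        "exit_station": ["DBRK", "DBRK", "EMBR", "DBRK", "EMBR"],
        "trips": [120, 40, 95, 15, 5],
    }
)

# Assumed cutoff between the pre- and post-COVID periods (not from the source).
COVID_START = pd.Timestamp("2020-03-01")


def clean_ridership(df, covid_start=COVID_START):
    """Keep weekday rows, label the period, and total trips per entry station."""
    # Monday=0 ... Friday=4, so this drops Saturdays and Sundays.
    weekdays = df[df["date"].dt.dayofweek < 5].copy()
    weekdays["period"] = "pre"
    weekdays.loc[weekdays["date"] >= covid_start, "period"] = "post"
    # Aggregate on the entry station only; the exit station is ignored.
    return weekdays.groupby(["period", "entry_station"], as_index=False)["trips"].sum()


cleaned = clean_ridership(raw)
print(cleaned)
```

Grouping on `entry_station` alone mirrors the decision to use entry data rather than exit data.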

In [3]:
# insert basic cleaning code here

## EDA

### Mapping

To begin understanding trends in BART ridership during COVID-19, we narrowed our focus to two locations: Downtown San Francisco and Berkeley. We chose these locations to better understand how BART ridership changed from before to after COVID-19, specifically for people commuting to and from their workplace. Below, we visualize the percent decline in total ridership for the stations in each location.

_**Note:** hover over the circles to view the percent decline in ridership per station._
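
For reference, a percent decline per station can be computed as in the sketch below. The exact formula used for the maps is not stated here, so this assumes the common definition `(pre - post) / pre * 100`; the station names and totals are made-up values for illustration only.

```python
import pandas as pd


def percent_decline(pre, post):
    """Percent drop from the pre-COVID total to the post-COVID total."""
    return (pre - post) * 100.0 / pre


# Hypothetical totals, not actual BART figures.
stations = pd.DataFrame(
    {
        "station": ["Station A", "Station B"],
        "pre_covid": [1000.0, 800.0],
        "post_covid": [400.0, 200.0],
    }
)
stations["pct_decline"] = percent_decline(stations["pre_covid"], stations["post_covid"])
print(stations)
```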

#### Downtown San Francisco

In [4]:
HTML("figures/SF_interactive_map.html")

#### Berkeley

In [5]:
HTML("figures/berkeley_interactive_map.html")

## Jingya's plots

## Analysis of ridership of stations with different land use

This section examines ridership at several major BART stations with different surrounding land uses over the past five years. Ridership dropped sharply during the pandemic, beginning in February 2020, regardless of land use. Although people began returning to their normal routines in 2022, ridership at most stations remains at only about 40% of pre-COVID levels, which indicates that the pandemic has had a lasting impact on how people travel. The aim of this study is to compare ridership patterns before, during, and after the pandemic, focusing on four locations with different land uses: 1) San Francisco CBD (commercial land use), 2) Berkeley (academic land use), 3) Oakland (mixed land use), and 4) international airports (transport land use).

### 1. San Francisco CBD (Commercial Land Use)

The pandemic led many companies to adopt "work from home" policies, allowing employees to work from anywhere instead of commuting to the office. These policies proved successful for some companies, and many have kept them in place. As a result, this study assumes that the number of people traveling to the San Francisco CBD for work decreased significantly during and after the pandemic compared to pre-COVID times. The plots below show that before the pandemic, the selected BART stations (Embarcadero, Montgomery Street, Powell Street, Civic Center) had 20,000 to 50,000 daily trips. Although ridership has been increasing since 2020, late 2022 data shows only around 10,000 monthly trips. This indicates that even if people feel that things have returned to normal, the pandemic has altered their commuting patterns.

In [6]:
display(Image(filename='figures/SF CBD Arrival Trip Count.jpg'))

The plot above shows the travel volumes at each station, but it does not make clear how large the difference is before and after the pandemic. To address this, we calculate the arrival trip ratio: each period's arrival trips divided by the peak arrival trip volume of the past 5 years.

To begin, let's look at the stations in San Francisco CBD. It is evident that the number of trips is gradually increasing, but as of late 2022, less than 40% of the peak arrival trip volumes of the past 5 years are observed.
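
The ratio described above can be sketched as follows: each station's arrival counts are divided by that station's own maximum over the window. The numbers are hypothetical and chosen only to make the arithmetic easy to follow, and `ratio_to_peak` is a helper name introduced for this example.

```python
import pandas as pd

# Hypothetical arrival counts for two stations, not actual BART data.
arrivals = pd.DataFrame(
    {
        "Embarcadero": [50000, 40000, 5000, 20000],
        "Powell Street": [30000, 24000, 3000, 9000],
    },
    index=pd.to_datetime(["2019-01-01", "2020-02-01", "2020-04-01", "2022-12-01"]),
)


def ratio_to_peak(df):
    """Divide every column by its own peak so each series tops out at 1.0."""
    return df / df.max()


ratios = ratio_to_peak(arrivals)
print(ratios)
```

Normalizing by each station's own peak puts busy and quiet stations on the same 0 to 1 scale, which makes their recovery directly comparable.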

In [12]:
display(Image(filename='figures/SF CBD Arrival Trip Ratio.jpg'))

### 2. Berkeley (Academic Land Use)

It is assumed that the majority of BART passengers in Berkeley, particularly at North Berkeley and Downtown Berkeley stations, are students studying at UC Berkeley. Due to the campus lockdown from 2020-2021, ridership during this period is expected to have been very low. With the reopening of the campus, it is expected that the number of trips would gradually increase back to pre-COVID levels. However, the graph below shows that although a growth trend is observed after the summer of 2021, which is when the campus reopened, the total number of trips is still far below the pre-COVID level.

In [13]:
display(Image(filename='figures/Berkeley Arrival Trip Count.jpg'))

Next, looking at the arrival trip ratios for the Berkeley stations, the pattern is similar to the one observed for the San Francisco CBD. We expected ridership to recover faster here than in the commercial area because students need to return to campus. However, ridership in Berkeley has not grown faster than in the commercial area. One possible reason is lecture recording: since the pandemic, many classes have started to provide recordings, so students may not need to travel to campus and can complete their coursework from anywhere with an internet connection. As a result, ridership remains lower than in the pre-pandemic period.

In [15]:
display(Image(filename='figures/Berkeley Arrival Trip Ratio.jpg'))

### 3. Oakland (Mixed Land Use)


Oakland is a major city in the East Bay with a more complex land use pattern than the San Francisco CBD and Berkeley. Unlike the other two locations, Oakland's land use is a mix of business and residential areas. This study investigates whether this mixed-use pattern leads to different travel patterns. The graph below shows that, despite the different land use, the trends are similar to those in the plots above, suggesting that the general travel pattern remains the same regardless of land use.

In [16]:
display(Image(filename='figures/Oakland Arrival Trip Count.jpg'))

Similar to the patterns shown above, ridership began to grow again after the initial pandemic drop. However, the number of trips is still far below the pre-COVID level (about 40% of it).

In [17]:
display(Image(filename='figures/Oakland Arrival Trip Ratio.jpg'))

### 4. Airports (Transport Land Use)

Lastly, this study discusses trips to the international airports (Oakland International Airport and San Francisco International Airport). Patterns similar to those at the previous locations are observed, but with a faster growth rate. This highlights the importance of public transit for international airports, as the majority of passengers are likely international visitors who rely heavily on public transit to reach their destinations.

In [18]:
display(Image(filename='figures/Airports Arrival Trip Count.jpg'))

The airport stations show faster growth than the rest of the stations, which points to the importance of the connection between BART and air travel.

In [20]:
display(Image(filename='figures/Airport Arrival Trip Ratio.jpg'))

## Conclusion

The analysis reveals that ridership at the majority of stations, regardless of land use, has recovered to less than 40% of its pre-COVID level. The notable exception is the airport stations, where a higher share (around 60%) of ridership has returned. This study highlights the significant changes in people's travel patterns, likely driven by factors such as work-from-home policies and online lectures. It also underscores the ongoing demand for public transit connections, particularly at international airports, and the importance of efficient transportation links in serving travelers' needs.

## Predicting Future Ridership Patterns After COVID-19 Using Pre-COVID-19 Data

This section analyzes ridership from Downtown Berkeley station (entry station) to Embarcadero station (exit station) before and after the pandemic to infer the impact of COVID-19 on people's choice of transportation mode. These two stations were selected for two reasons. First, students make up the majority of Downtown Berkeley station riders, and because they typically don't own cars, the impact could be much more significant. Second, Embarcadero is one of the busiest stations according to BART's website (bart.gov). With the growing popularity of remote work and study during the pandemic, it is highly likely that ridership between these two stations has changed drastically. We hypothesize that it may take a long time for ridership to recover to pre-pandemic levels.

_**Note:** pre-pandemic means before 2020; post-pandemic means after 2020._

### Pre-Pandemic Ridership

In [12]:
display(Image(filename="figures/Berkeley_Embarcadero_2010_2019_Ridership.png"))

In [11]:
display(Image(filename="figures/Berkeley_Embarcadero_2010_2019_Log_Ridership.png"))

Based on the plots above, there appears to be annual seasonality, with a peak around summer and a trough around winter. Ridership generally increases from 2010 to 2015 and gradually declines afterwards. The variance of ridership has been fairly steady over time, although it is larger in 2014-2016 than in other years. To stabilize the variance, a log transformation is applied.
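
To illustrate why a log transformation helps stabilize variance, the sketch below builds a synthetic series whose seasonal swing grows with its level and compares the within-year range before and after taking logs. The series is invented for the example and is not the Downtown Berkeley to Embarcadero data; `yearly_range` is a helper name introduced here.

```python
import numpy as np

# Synthetic monthly series: a growing level multiplied by a seasonal factor.
t = np.arange(48)
level = 1000.0 * 1.02 ** t
seasonal = 1.0 + 0.2 * np.sin(2 * np.pi * t / 12)
ridership = level * seasonal
log_ridership = np.log(ridership)


def yearly_range(x):
    """Max minus min within each consecutive 12-month block."""
    blocks = x.reshape(-1, 12)
    return blocks.max(axis=1) - blocks.min(axis=1)


raw_ranges = yearly_range(ridership)      # grows as the level grows
log_ranges = yearly_range(log_ridership)  # the same in every year
print(raw_ranges)
print(log_ranges)
```

On the raw scale the seasonal swing widens year after year, while on the log scale it stays the same size, which is the behavior the transformation is meant to produce.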

After taking the first-order difference, the data is still non-stationary, with strong seasonality at lags that are multiples of 12. To reduce the effect of seasonality, an additional difference at lag 12 is taken, and the resulting series looks much closer to stationary. Moreover, the p-value of the augmented Dickey-Fuller test (`adfuller`) is below 0.05, allowing us to reject the null hypothesis of a unit root and conclude that the differenced data is stationary. The data set is then split into training and testing sets, and a parameter search is performed on the training set to determine the best ARIMA model.

In [13]:
display(Image(filename="figures/Berkeley_Embarcadero_2010_2019_Modeling.png"))

### Post-Pandemic Ridership

The extent of the influence is measured as the absolute difference between the ridership predicted by the pre-pandemic model and the actual ridership observed in 2021-2022. Since the pre-pandemic model outputs log ridership, its predictions must be exponentiated before they can be compared with the observed counts.
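
A minimal sketch of that comparison is shown below, assuming the model was fit on natural-log ridership so that `np.exp` is the correct inverse. The predicted and observed values are hypothetical, and `absolute_gap` is a helper name introduced for this example.

```python
import numpy as np


def absolute_gap(log_predicted, observed):
    """Back-transform log-scale predictions and return |predicted - observed|."""
    predicted = np.exp(np.asarray(log_predicted, dtype=float))
    return np.abs(predicted - np.asarray(observed, dtype=float))


# Hypothetical values: predictions on the log scale, observations as raw counts.
log_pred = np.log([2000.0, 2100.0])
observed = [800.0, 1500.0]
gap = absolute_gap(log_pred, observed)
print(gap)
```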

In [15]:
display(Image(filename="figures/Berkeley_Embarcadero_2021_2022_Prediction_Observation_Comparison.png"))

### Conclusion

The difference between the predicted and observed ridership is gradually decreasing, with only a 500-rider average gap expected at the end of 2022. Although the two series may eventually converge, the current difference implies that COVID-19 has had a significant impact on people's choice of transportation mode. However, it is important to note that the model's training data shows a declining trend after 2015, which means its forecasts could keep extrapolating that decline to unrealistically low ridership in the future. The underlying factor that caused the drop remains unknown, but it presumably affected the model's accuracy.

In [16]:
display(Image(filename="figures/Berkeley_Embarcadero_2021_2022_Prediction_Observation_Difference.png"))