WeGo Public Transit is the public transit system serving Greater Nashville and Davidson County. WeGo provides local and regional bus routes, the WeGo Star train service connecting Lebanon to downtown Nashville, and several other transit services.

In this project, you'll analyze bus spacing to look for patterns and to identify correlations with controllable or external factors. Specifically, you'll use a dataset containing information on headway, the amount of time between vehicle arrivals at a stop. The dataset contains a column, HDWY_DEV, which gives the headway deviation. This value is negative when bunching has occurred (shorter headway than scheduled) and positive when gapping has occurred (longer headway than scheduled). Note that you can calculate the headway deviation as a fraction of the schedule with HDWY_DEV/SCHEDULED_HDWY, and multiply by 100 to express it as a percentage.
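As a quick illustration of the sign convention, here is a minimal sketch on made-up numbers (not taken from the WeGo data). It assumes HDWY_DEV is the actual headway minus the scheduled headway, which matches the negative-for-bunching description above. The `example` frame, the `HDWY_DEV_PCT` column, and the `label_spacing` helper are hypothetical names used only for this sketch.

```python
import pandas as pd

# Made-up headways in minutes, for illustration only (not from the WeGo data).
example = pd.DataFrame({
    "SCHEDULED_HDWY": [10.0, 10.0, 15.0],
    "ACTUAL_HDWY": [4.0, 10.0, 21.0],
})

# Assumption: deviation = actual - scheduled, so a short gap gives a negative value.
example["HDWY_DEV"] = example["ACTUAL_HDWY"] - example["SCHEDULED_HDWY"]

# Deviation as a percentage of the scheduled headway.
example["HDWY_DEV_PCT"] = example["HDWY_DEV"] / example["SCHEDULED_HDWY"] * 100


def label_spacing(dev):
    """Hypothetical helper: name the spacing problem from the sign of the deviation."""
    if dev < 0:
        return "bunching"
    if dev > 0:
        return "gapping"
    return "on schedule"


example["spacing"] = example["HDWY_DEV"].apply(label_spacing)
print(example)
```

The first row arrives 6 minutes early relative to a 10-minute headway (bunching), the second is on schedule, and the third arrives 6 minutes late relative to a 15-minute headway (gapping).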

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

In [None]:
#reading in the three CSV files (airport data, headway data, weather data)
bna_2022 = pd.read_csv("../data/bna_2022.csv")
bna_2022.head()

In [None]:
headway_data = pd.read_csv("../data/Headway_Data.csv.txt")
headway_data.head()

In [None]:
bna_weather = pd.read_csv("../data/bna_weather.csv")
bna_weather.head()

In [None]:
#keeping only the columns we want; .copy() makes independent DataFrames so later
#column assignments do not raise SettingWithCopyWarning
weather_df = bna_weather[['Date', 'temp', 'wx_phrase']].copy()
headway_df = headway_data[['ADHERENCE_ID', 'DATE', 'ROUTE_ABBR', 'BLOCK_ABBR', 'OPERATOR', 'TRIP_ID', 'ROUTE_DIRECTION_NAME', 'TIME_POINT_ABBR', 'ROUTE_STOP_SEQUENCE', 'LATITUDE', 'LONGITUDE', 'SCHEDULED_TIME', 'ACTUAL_ARRIVAL_TIME', 'ACTUAL_DEPARTURE_TIME', 'ADHERENCE', 'SCHEDULED_HDWY', 'ACTUAL_HDWY', 'HDWY_DEV']].copy()

In [None]:
headway_df

In [None]:
#changing the column names
headway_df.columns = ['adh_id', 'date', 'rte_abbr', 'blk_abbr', 'opr', 'trip_id', 'rte_dir_name', 'time_pt_abbr', 'rte_stop_seq', 'lat', 'log', 'schd_time', 'act_arrvl_time', 'act_depart', 'adh', 'schd_hdwy', 'act_hdwy', 'hdwy_dev']

In [None]:
headway_df

In [None]:
#adding new column to calculate the headway deviation percentage
headway_df["hdwy_dev_%"] = ((headway_df["hdwy_dev"] / headway_df["schd_hdwy"])*100)
headway_df
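One thing to watch with this ratio: if any row has a scheduled headway of 0, the division produces `inf` (or `NaN` when the deviation is also 0), which can distort later averages and plots. Whether such rows exist in the headway data is an assumption here, not something checked above. A minimal sketch of one way to guard against it, using a small hypothetical `sample` frame with the same column names:

```python
import numpy as np
import pandas as pd

# Hypothetical rows; the second has a scheduled headway of 0 to show the problem.
sample = pd.DataFrame({
    "hdwy_dev": [-3.0, 2.0, 5.0],
    "schd_hdwy": [10.0, 0.0, 20.0],
})

# Swap 0 for NaN before dividing, so the percentage is NaN instead of inf.
safe_schd = sample["schd_hdwy"].replace(0, np.nan)
sample["hdwy_dev_%"] = sample["hdwy_dev"] / safe_schd * 100

print(sample)
```

Rows with a missing percentage can then be dropped or inspected separately, depending on what the analysis needs.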


In [None]:
weather_df

In [None]:
#separating Date into Dates and Time (parse the timestamps once, then split)
weather_dt = pd.to_datetime(weather_df['Date'])
weather_df = weather_df.assign(Dates=weather_dt.dt.date, Time=weather_dt.dt.time)
weather_df

In [None]:
#changed column order
weather_df1 = weather_df[['Dates', 'Time', 'temp', 'wx_phrase']]
weather_df1

In [None]:
#renaming columns in weather
weather_df = weather_df1.rename(columns={'Dates': 'date', 'Time': 'time'})
weather_df