# MetroTransit Passenger Trends

## Key questions:

1. What are the most popular MetroTransit stations, as defined by volume of "Ons" & "Offs"?
2. What part of the week sees the highest level of MetroTransit traffic?
3. What is the relationship between station "Ons" & "Offs"?
4. How has Minnesota's lockdown impacted MetroTransit?

In [1]:
# JOSH
# 1. What are the most popular metro transit stations? Defined by volume of "ons" & "offs"
    # Img 1.1 is heatmap routes
    # Img 1.2 is heatmap w/ top 5 pins dropped on the map

# KARIM
# 2. What part of the week sees the highest level of metro transit traffic?
    # Img 2.1 is bar chart
    # Img 2.2 heatmap showing the highest level day (Saturday "ons")

# ZOEY
# 3. What is the relationship between station "ons" & "offs"
    # Img 3.1 scatterplot of single station "ons" vs. "offs"
    # Img 3.2 statistical analysis

# OPTIONAL -- IF WE GET TO IT    
# 4. Does the Vikings' success correlate to Lightrail traffic ("ons")?
    # 

In [2]:
%matplotlib notebook

# Dependencies
import gmaps
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import random

# Save file path to variable
metrotransit_csv = "TransitStopsBoardingsAndAlightings2019.csv"

# Read with Pandas
metrotransit_df = pd.read_csv(metrotransit_csv)
metrotransit_df["Route Classification"].unique()

array(['Core Local', 'Supporting Local', 'Commuter Express', 'Special',
       'Suburban Local', 'BRT', 'LRT', 'Commuter Rail', nan], dtype=object)

In [3]:
core_local_df = metrotransit_df.loc[metrotransit_df["Route Classification"]=="Core Local"]
core_local_df.head()

| | Provider | Route Type | Route Classification | Route | Dir | Site_id | Geo_Node_Name | Group ID | latitude | longitude | ... | City | Trips | Obs Trips | Schedule | Ons | Offs | Seq | %Sampled | Downtown | Comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Metro Transit | Urban Local | Core Local | 2 | East | 51581.0 | HENNEPIN AVE & FRANKLIN AVE / 22ND ST | | 44.961886 | -93.292079 | ... | MINNEAPOLIS | 92.0 | 92.0 | Weekday | 75 | 8.0 | 1.0 | 100% | N | |
| 1 | Metro Transit | Urban Local | Core Local | 2 | East | 1099.0 | HENNEPIN AVE S & 22ND ST W | | 44.96103 | -93.292777 | ... | MINNEAPOLIS | 92.0 | 92.0 | Weekday | 33 | 2.0 | 2.0 | 100% | N | |
| 2 | Metro Transit | Urban Local | Core Local | 2 | East | 13340.0 | DUPONT AVE & FRANKLIN AVE W | | 44.962534 | -93.293016 | ... | MINNEAPOLIS | 92.0 | 92.0 | Weekday | 17 | 1.0 | 3.0 | 100% | N | |
| 3 | Metro Transit | Urban Local | Core Local | 2 | East | 13337.0 | FRANKLIN AVE W & HENNEPIN AVE S | | 44.96263 | -93.291123 | ... | MINNEAPOLIS | 92.0 | 92.0 | Weekday | 177 | 2.0 | 4.0 | 100% | N | |
| 4 | Metro Transit | Urban Local | Core Local | 2 | East | 56705.0 | FRANKLIN AVE W & LYNDALE AVE S | | 44.962642 | -93.287697 | ... | MINNEAPOLIS | 92.0 | 92.0 | Weekday | 143 | 4.0 | 5.0 | 100% | N | |


In [4]:
# Store latitude and longitude in locations
locations = core_local_df[["latitude", "longitude"]].astype(float)

# Use boardings ("Ons") as the heatmap weights, converted to float (NaN values are not filled here)
volume = core_local_df["Ons"].astype(float)

# Plot Heatmap
fig = gmaps.figure()

# Create heat layer
heat_layer = gmaps.heatmap_layer(locations, weights=volume, 
                                 dissipating=False, max_intensity=10,
                                 point_radius=.0012)

# Add layer
fig.add_layer(heat_layer)

# Display figure
fig


Figure(layout=FigureLayout(height='420px'))
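The heatmap cell converts `Ons` directly to float, but a later cell strips commas from the same column, which suggests some values are stored as text with thousands separators. The sketch below shows one way to clean the weights and drop rows with missing coordinates before building the heat layer. The helper name `prepare_heatmap_inputs` and the sample rows are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd


def prepare_heatmap_inputs(df, weight_col="Ons"):
    """Return (locations, weights) with aligned rows and no missing values.

    Hypothetical helper: assumes df has "latitude" and "longitude" columns.
    """
    cleaned = df[["latitude", "longitude", weight_col]].copy()
    # Strip thousands separators (for example "1,061") before parsing numbers.
    cleaned[weight_col] = pd.to_numeric(
        cleaned[weight_col].astype(str).str.replace(",", "", regex=False),
        errors="coerce",
    )
    # Drop rows where a coordinate or the weight is missing or unparseable.
    cleaned = cleaned.dropna()
    locations = cleaned[["latitude", "longitude"]].astype(float)
    weights = cleaned[weight_col].astype(float)
    return locations, weights


# Tiny made-up sample, not taken from the real CSV.
sample = pd.DataFrame({
    "latitude": [44.96, 44.97, None],
    "longitude": [-93.29, -93.27, -93.25],
    "Ons": ["75", "1,061", "33"],
})
locations, weights = prepare_heatmap_inputs(sample)
```

The two return values stay row-aligned, so they could be passed as the `locations` and `weights` arguments of `gmaps.heatmap_layer` in place of the variables built above.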

In [5]:
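# Identify the top five stops by boardings ("Ons")
# Note: the frame is rebuilt from the full dataset here, so despite its name
# core_local_df now covers every route classification, not only Core Local.
core_local_df = pd.DataFrame(metrotransit_df,columns=["Geo_Node_Name","Ons","Offs","latitude", "longitude"])
core_local_df = core_local_df.dropna()
# "Ons" is read as text; strip thousands separators before converting to int
core_local_df["Ons"] = core_local_df["Ons"].str.replace(',', '')
core_local_df["Ons"] = core_local_df["Ons"].astype(int)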

# core_local_on_df = core_local_df.sort_values(by="Ons",ascending=False)
# core_local_on_df.head()

core_local_top5ons = core_local_df.nlargest(5,["Ons"])
core_local_top5ons

| | Geo_Node_Name | Ons | Offs | latitude | longitude |
|---|---|---|---|---|---|
| 22223 | 7TH & NICOLLET STATION | 1061 | 124.0 | 44.977311 | -93.273066 |
| 83 | PLEASANT ST & JONES HALL | 938 | 108.0 | 44.978407 | -93.235333 |
| 3167 | NICOLLET MALL & 7TH ST S | 898 | 107.0 | 44.97712 | -93.272189 |
| 6189 | 5TH ST & MINNESOTA ST | 750 | 34.0 | 44.946749 | -93.092106 |
| 31543 | 7TH & NICOLLET STATION | 731 | 73.0 | 44.977311 | -93.273066 |
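7TH & NICOLLET STATION appears twice in the top five because `nlargest` ranks individual rows of the raw table, not stops. If the goal is to rank stops, one option is to sum the metric per stop first. This is a minimal sketch under that assumption. The function name `top_stops` and the sample data are hypothetical, and grouping by `Geo_Node_Name` is only one possible key (`Site_id` may be more precise, since the same name can appear with different coordinates). Summing also merges rows across routes, directions, and schedules, which may or may not suit the question being asked.

```python
import pandas as pd


def top_stops(df, metric="Ons", n=5):
    """Sum a metric per stop name, then return the n largest totals."""
    totals = df.groupby("Geo_Node_Name", as_index=False)[metric].sum()
    return totals.nlargest(n, metric).reset_index(drop=True)


# Made-up rows: the same stop appears more than once, as in the raw table.
sample = pd.DataFrame({
    "Geo_Node_Name": ["STOP A", "STOP B", "STOP A", "STOP C"],
    "Ons": [100, 150, 90, 20],
})
ranked = top_stops(sample, n=2)
```

With the notebook's data, the equivalent call would be something like `top_stops(core_local_df, "Ons")`, run after the `Ons` column has been converted to integers.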


In [6]:
# Identify the top five stops by alightings ("Offs")
core_local_top5offs = core_local_df.nlargest(5,["Offs"])
core_local_top5offs

| | Geo_Node_Name | Ons | Offs | latitude | longitude |
|---|---|---|---|---|---|
| 3157 | NICOLLET MALL & 7TH ST S | 82 | 930.0 | 44.976718 | -93.272273 |
| 22257 | 6TH ST S & NICOLLET MALL | 66 | 930.0 | 44.977888 | -93.271671 |
| 929 | CHICAGO LAKE TRANSIT CENTER & GATE A | 357 | 850.0 | 44.949243 | -93.262056 |
| 20170 | MAPLE GROVE P&R & TRANSIT STATION | 12 | 812.0 | 45.091409 | -93.436758 |
| 21546 | FOLEY P&R & EVERGREEN BLVD | 15 | 800.0 | 45.142338 | -93.285538 |


In [7]:
# Info boxes for map pins marking the stops with the most "Ons"
info_box_template = """
<dl>
<dt>Name</dt><dd>{Geo_Node_Name}</dd>
<dt>Ons</dt><dd>{Ons}</dd>
</dl>
"""
# Build one info box (HTML string) per row of the top-five "Ons" table,
# then keep the matching pin coordinates
top5_ons_df = [info_box_template.format(**row) for index, row in core_local_top5ons.iterrows()]
top5_on_pins = core_local_top5ons[["latitude", "longitude"]]


In [8]:
# Add top 5 ON marker layer on top of the heat map
markers = gmaps.marker_layer(top5_on_pins, info_box_content=top5_ons_df)
fig.add_layer(markers)

# Display figure
fig

Figure(layout=FigureLayout(height='420px'))

In [9]:
# Info boxes for map pins marking the stops with the most "Offs"

info_box_template = """
<dl>
<dt>Name</dt><dd>{Geo_Node_Name}</dd>
<dt>Offs</dt><dd>{Offs}</dd>
</dl>
"""
# Build one info box (HTML string) per row of the top-five "Offs" table,
# then keep the matching pin coordinates
top5_offs_df = [info_box_template.format(**row) for index, row in core_local_top5offs.iterrows()]
top5_offs_pins = core_local_top5offs[["latitude", "longitude"]]
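The "Ons" and "Offs" info-box cells differ only in the metric they display, so the shared logic could live in one helper. The sketch below is an illustration, not the notebook's method. `build_info_boxes` and `INFO_BOX_TEMPLATE` are hypothetical names, and the HTML escaping is an added precaution because stop names such as "MAPLE GROVE P&R & TRANSIT STATION" contain `&`.

```python
from html import escape

import pandas as pd

# Same layout as the notebook's template, with the metric label parameterized.
INFO_BOX_TEMPLATE = """
<dl>
<dt>Name</dt><dd>{name}</dd>
<dt>{label}</dt><dd>{value}</dd>
</dl>
"""


def build_info_boxes(df, metric):
    """Return one HTML snippet per row for the given metric column."""
    return [
        INFO_BOX_TEMPLATE.format(
            name=escape(str(row["Geo_Node_Name"])),
            label=escape(metric),
            value=row[metric],
        )
        for row in df.to_dict("records")
    ]


# One row copied from the top-five "Offs" output above.
sample = pd.DataFrame({
    "Geo_Node_Name": ["MAPLE GROVE P&R & TRANSIT STATION"],
    "Ons": [12],
    "Offs": [812.0],
})
boxes = build_info_boxes(sample, "Offs")
```

The resulting list has the same shape as `top5_ons_df` and `top5_offs_df`, so it could be passed as `info_box_content` to `gmaps.marker_layer`.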


In [10]:
# Add top 5 OFF marker layer on top of the heat map
# (the ON markers added earlier stay on this same figure)
markers = gmaps.marker_layer(top5_offs_pins, info_box_content=top5_offs_df)
fig.add_layer(markers)

# Display figure
fig

Figure(layout=FigureLayout(height='420px'))