Data below from here: https://geohub.lacity.org/datasets/9b1bc4861f1e4277b6bd6e51f48e0f4d/explore

In [18]:
import pandas as pd
from plotly import express as px
df = pd.read_csv("Metro_Stations.csv")
#The data is from the link above.

This data was much nicer and smaller than the New York Subway's hourly ridership data, but it is also considerably less informative. We could only find ridership data by line, not by station, which made creating nice graphics and visualizations much more difficult. (Why, LA Metro, why???)

In [19]:
df

Unnamed: 0,X,Y,OBJECTID,source,ext_id,cat1,cat2,cat3,org_name,Name,...,description,zip,link,use_type,latitude,longitude,date_updated,dis_status,POINT_X,POINT_Y
0,-118.192933,33.768076,72713,Metropolitan Transportation Authority (MTA),,Transportation,Metro Stations,,,Downtown Long Beach Station,...,Blue Line,,,publish,33.768076,-118.192933,2023/04/04 16:19:54+00,,33.768076,-118.192933
1,-118.193712,33.772263,72714,Metropolitan Transportation Authority (MTA),,Transportation,Metro Stations,,,Pacific Ave Station,...,Blue Line,,,publish,33.772263,-118.193712,2023/04/04 16:19:54+00,,33.772263,-118.193712
2,-118.189396,33.781835,72715,Metropolitan Transportation Authority (MTA),,Transportation,Metro Stations,,,Anaheim Street Station,...,Blue Line,,,publish,33.781835,-118.189396,2023/04/04 16:19:54+00,,33.781835,-118.189396
3,-118.189394,33.789095,72716,Metropolitan Transportation Authority (MTA),,Transportation,Metro Stations,,,Pacific Coast Hwy Station,...,Blue Line,,,publish,33.789095,-118.189394,2023/04/04 16:19:54+00,,33.789095,-118.189394
4,-118.189846,33.807084,72717,Metropolitan Transportation Authority (MTA),,Transportation,Metro Stations,,,Willow Street Station,...,Blue Line,,,publish,33.807084,-118.189846,2023/04/04 16:19:54+00,,33.807084,-118.189846
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
125,-118.378703,33.945678,72838,Metropolitan Transportation Authority (MTA),,Transportation,Metro Stations,,,Aviation/Century Station,...,K Line,,,publish,33.945678,-118.378703,2023/04/04 16:19:54+00,,33.945678,-118.378703
126,-118.377271,33.929635,72839,Metropolitan Transportation Authority (MTA),,Transportation,Metro Stations,,,Aviation/LAX Station,...,K Line,,,publish,33.929635,-118.377271,2023/04/04 16:19:54+00,,33.929635,-118.377271
127,-118.251208,34.054751,72840,Metropolitan Transportation Authority (MTA),,Transportation,Metro Stations,,,Grand Av Arts/Bunker Hill,...,Regional Connector,,,publish,34.054751,-118.251208,2023/04/04 16:19:54+00,,34.054751,-118.251208
128,-118.246166,34.052039,72841,Metropolitan Transportation Authority (MTA),,Transportation,Metro Stations,,,Historic Broadway,...,Regional Connector,,,publish,34.052039,-118.246166,2023/04/04 16:19:54+00,,34.052039,-118.246166


We can get rid of a lot of the extra detail columns that we don't need.

In [20]:
cols = ['OBJECTID', 'post_id', 'latitude', 'longitude', 'Name', 'description']
#.copy() gives an independent DataFrame, so the assignments below don't raise SettingWithCopyWarning
df = df[cols].copy()

In [21]:
#keep only the first word of the description, e.g. "Blue Line" -> "Blue"
df['description'] = df['description'].str.split().str.get(0)

#clean up the line names ("Regional" is what is left of "Regional Connector")
df['description'] = df['description'].replace({"Regional": "Blue/Expo",
                                               "Blue/EXPO": "Blue/Expo",
                                               "EXPO": "Expo"})

#clean up the lines column: "Line" is the first line listed, "Lines" is the list of all of them
df["Line"] = df["description"].str.split('/').str.get(0)
df["Lines"] = df["description"].str.split('/')

#we dont need description anymore
cols = ['OBJECTID', 'post_id', 'latitude', 'longitude', 'Name', 'Lines', 'Line']
df = df[cols]

In [22]:
df.to_csv('Clean_Metro_Stations.csv')
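
One thing to watch for when saving: the Lines column holds Python lists, and to_csv writes each list out as its string form, so it comes back from read_csv as plain text rather than as a list. Here's a minimal sketch of that round trip. The two station rows are made up and the in-memory buffer stands in for the real file, so treat it as an illustration only. ast.literal_eval turns the strings back into lists:

```python
import ast
import io

import pandas as pd

# Toy stand-in for the cleaned station table (made-up rows, not the real data)
toy = pd.DataFrame({
    "Name": ["Station A", "Station B"],
    "Lines": [["Blue", "Expo"], ["Red"]],
})

# Write to an in-memory buffer instead of a file, then read it back
buffer = io.StringIO()
toy.to_csv(buffer, index=False)
buffer.seek(0)
round_trip = pd.read_csv(buffer)

# After the round trip each entry is a string such as "['Blue', 'Expo']"
lines_are_strings = all(isinstance(v, str) for v in round_trip["Lines"])

# literal_eval safely parses the string form back into a Python list
round_trip["Lines"] = round_trip["Lines"].apply(ast.literal_eval)
first_lines = round_trip["Lines"].iloc[0]
```

The same apply(ast.literal_eval) step should work on the real file, assuming its Lines column survived the manual edit unchanged.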

From here, the station IDs are jumbled and unordered, so I went through the file manually and put the stations on each line in consecutive order.

In [23]:
df = pd.read_csv("Clean_Metro_Stations_Manual_Clean.csv")
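
Since the ordering was done by hand, a quick sanity check can help: if the stations on a line really are in consecutive order, no hop between neighbouring rows should be unusually long. Below is a rough sketch of that idea using the haversine (great-circle) formula. The coordinates are the five Blue Line stations from the preview table above. The 3 km threshold and the function names are my own choices for illustration, and a real line may well have legitimate gaps longer than that.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def long_hops(stations, max_km):
    """Return (name_a, name_b, km) for consecutive stations farther apart than max_km."""
    flagged = []
    for (name_a, lat_a, lon_a), (name_b, lat_b, lon_b) in zip(stations, stations[1:]):
        km = haversine_km(lat_a, lon_a, lat_b, lon_b)
        if km > max_km:
            flagged.append((name_a, name_b, km))
    return flagged


# (name, latitude, longitude), in the order they appear in the preview table
in_order = [
    ("Downtown Long Beach Station", 33.768076, -118.192933),
    ("Pacific Ave Station", 33.772263, -118.193712),
    ("Anaheim Street Station", 33.781835, -118.189396),
    ("Pacific Coast Hwy Station", 33.789095, -118.189394),
    ("Willow Street Station", 33.807084, -118.189846),
]

# Same stations with Willow Street moved out of place
jumbled = [in_order[0], in_order[4], in_order[1], in_order[2], in_order[3]]

hops_in_order = long_hops(in_order, max_km=3.0)
hops_jumbled = long_hops(jumbled, max_km=3.0)
```

With the stations in order nothing is flagged, while the jumbled list flags the two hops into and out of the misplaced station.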

So here's a rough system map showing the locations of all the stations and what line they are on:

In [24]:
#station map, lines not connected, also hover_data is kinda broken. The G-Line (BRT, not Rail) is shown as well
fig = px.scatter_mapbox(df, 
                        lat = "Latitude",
                        lon = "Longitude",
                        color = "Line",
                        hover_name = "Name",
                        #hover_data = "Lines",
                        zoom = 8.8,
                        height = 600,
                        width = 800,
                        title = "Graph Representation of LA Metro Station Locations by Line",
                        mapbox_style = "carto-positron")

fig.show()

Here's one with the lines connected:

In [25]:
#line map, stations not shown
fig = px.line_mapbox(df, 
                        lat = "Latitude",
                        lon = "Longitude",
                        color = "Line",
                        hover_name = "Name",
                        #hover_data = "Lines",
                        line_group='Line',
                        zoom = 8.8,
                        height = 600,
                        width = 800,
                        title = "Approximate Graph Representation of LA Metro Lines",
                        mapbox_style = "carto-positron")

fig.show()
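
Part of why this map is only approximate: in the cleaning step, Line keeps just the first line a station is listed under, so a station shared by two lines ends up drawn on only one of them (unless its row was duplicated during the manual clean). One possible way around that, sketched here on made-up rows, is to give each station one row per line with DataFrame.explode. This assumes Lines holds actual lists rather than their string form.

```python
import pandas as pd

# Made-up stations; "Station B" is shared by two lines
toy = pd.DataFrame({
    "Name": ["Station A", "Station B", "Station C"],
    "Lines": [["Blue"], ["Blue", "Expo"], ["Expo"]],
})

# explode() repeats the other columns once per element of each list
per_line = (
    toy.explode("Lines")
       .rename(columns={"Lines": "Line"})
       .reset_index(drop=True)
)

# Each line now lists every station it serves, in row order
stations_by_line = per_line.groupby("Line")["Name"].apply(list).to_dict()
```

The rows for each line would still need to be in travel order for the lines to connect sensibly.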

And here's one showing roughly how close each area is to a station. The radius here is set in screen pixels rather than real-world distance, and I haven't quite figured out how to construct the contours based on real-world distance yet, so that's a bit unfortunate for now, but it's a work in progress!

In [26]:
#density mapbox, shades the area around each station. Higher values mean a station is closer in a straight line, or that several stations are nearby
fig = px.density_mapbox(df,
                        lat='Latitude',
                        lon='Longitude',
                        radius=15,
                        opacity=0.3,
                        hover_name = "Name",
                        zoom = 8.8,
                        height = 600,
                        width = 800,
                        title = "Areas Within Approximately 1 Mile of a LA Metro Station",
                        mapbox_style = "carto-positron")
fig.show()
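
For the real-world distance version, one possible starting point (a sketch, not a finished solution) is to convert the 1 mile radius into degrees of latitude and longitude and trace a ring of points around each station. The function name and the number of points are my own choices, and the flat-Earth approximation it uses is only reasonable for radii of a few miles.

```python
from math import cos, degrees, pi, radians, sin

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def circle_around(lat, lon, radius_miles=1.0, n_points=36):
    """Approximate a circle of the given radius as a closed ring of (lat, lon) points.

    Uses a flat-Earth (equirectangular) approximation, which is fine for
    radii of a few miles but not for large distances or near the poles.
    """
    # Angular size of the radius, in degrees of latitude
    dlat = degrees(radius_miles / EARTH_RADIUS_MILES)
    # A degree of longitude covers less ground by a factor of cos(latitude)
    dlon = dlat / cos(radians(lat))
    ring = []
    for i in range(n_points + 1):  # the extra point repeats the start to close the ring
        theta = 2 * pi * i / n_points
        ring.append((lat + dlat * sin(theta), lon + dlon * cos(theta)))
    return ring


# Downtown Long Beach Station, coordinates taken from the table above
ring = circle_around(33.768076, -118.192933)
```

Putting the ring points into a DataFrame, with one group per station, should let px.line_mapbox draw them as outlines on the same kind of map, though I haven't tested how that looks with all the stations at once.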