# Muni BRT stops
Muni sent over a list of 136 new stops to add as BRT.

### Issues
**1. Check that the `stop_ids` all match what we have in January.**
<br>None of the stops are missing.

**2. Which routes do these stops belong to, and how many routes would we be adding compared to the 1 we had?**

It might be much more concise to tag them by route, similar to other operators' BRT.

Adding 28 new routes. All are `route_type == 3` (bus), as expected.

**3. What categories do these stops fall into now? Do they already show up in our HQ stops dataset (not as BRT, but as 2+ bus intersections)?**

All the stops Muni wants to add already appear as `hq_corridor_bus` (and they are high frequency too: all are `corridor_frequent_stop`).

A large chunk, 90 of 136, also appear as `major_stop_bus` because they show up at the intersection of 2+ bus routes.

But yes, we only tag 1 route as BRT, and the main purpose of their email was to recategorize these stops as BRT. Eric's criteria for BRT are more stringent, so we'd have to check these stops against them: if we make the criteria more lenient for Muni, we will have to do the same for all the operators in CA.

**muni stops parquet: {GCS_FILE_PATH}operator_input/check_muni_stops.parquet**

In [1]:
import geopandas as gpd
import pandas as pd

from segment_speed_utils import helpers
from update_vars import analysis_date
from utilities import GCS_FILE_PATH

analysis_date

'2024-01-17'

In [2]:
GCS_FILE_PATH

'gs://calitp-analytics-data/data-analyses/high_quality_transit_areas/'

In [3]:
muni = helpers.import_scheduled_trips(
    analysis_date,
    filters=[[("name", "==", "Bay Area 511 Muni Schedule")]],
    columns=["feed_key", "route_type", "route_id", "trip_id", "route_short_name"],
)

muni_feed_key = muni.feed_key.unique()[0]

In [4]:
stops = helpers.import_scheduled_stops(
    analysis_date,
    filters=[[("feed_key", "==", muni_feed_key)]],
)

FILE = "SFMTA_muni_high_quality_transit_stops_2024-02-01.csv"

muni_stops = (
    pd.read_csv(f"{GCS_FILE_PATH}operator_input/{FILE}", 
                dtype={"bs_id": "str"})
    .drop(columns=["latitude", "longitude"])
    .rename(columns={"bs_id": "stop_id"})
)

stops2 = pd.merge(
    stops, 
    muni_stops, 
    on="stop_id", 
    how="outer", 
    indicator=True
)

stops2._merge.value_counts()

left_only     3128
both           136
right_only       0
Name: _merge, dtype: int64
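
The `_merge` counts above answer issue 1: `right_only == 0` means every stop Muni sent matches a stop in our feed. A minimal sketch of the same check on toy data is below; the `stop_id` values and the names `gtfs_stops` and `operator_stops` are made up for illustration and are not Muni's.

```python
import pandas as pd

# Toy stand-ins for our GTFS stops and the operator's list (hypothetical ids)
gtfs_stops = pd.DataFrame({"stop_id": ["100", "200", "300", "400"]})
operator_stops = pd.DataFrame({"stop_id": ["200", "400"]})

merged = pd.merge(
    gtfs_stops, operator_stops, on="stop_id", how="outer", indicator=True
)

# right_only rows would be stops the operator sent that are not in our feed
missing = merged.loc[merged["_merge"] == "right_only", "stop_id"].tolist()

# both rows are operator stops we successfully matched
matched = int((merged["_merge"] == "both").sum())
```

An empty `missing` list corresponds to the `right_only 0` row in the output above.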

In [5]:
muni_stop_times = helpers.import_scheduled_stop_times(
    analysis_date,
    with_direction=False,
    columns=["feed_key", "trip_id", "stop_id"],
    filters=[
        [("feed_key", "==", muni_feed_key),
         ("stop_id", "in", muni_stops.stop_id.unique()),
        ]],
    get_pandas=True,
)

In [6]:
# Merge stop_times in, but just keep unique stops-routes (don't need trips)
muni_stops_with_route = pd.merge(
    muni, 
    muni_stop_times, 
    on=["feed_key", "trip_id"],
).drop(columns = "trip_id").drop_duplicates().reset_index(drop=True)

# Save this out so it's easier for others to work off of
muni_stops_with_route.to_parquet(
    f"{GCS_FILE_PATH}operator_input/check_muni_stops.parquet"
)

In [7]:
muni_stops_with_route.route_id.nunique()

28

In [8]:
muni_stops_with_route.route_type.unique()

array(['3'], dtype=object)

In [9]:
# this matches the number they sent over
muni_stops_with_route.stop_id.nunique()

136

In [10]:
# Filter our HQ stops dataset
# to Muni (agency_primary) and to the stop_ids and route_ids we
# found in muni_stops_with_route
hq_stops = gpd.read_parquet(
    f"{GCS_FILE_PATH}export/{analysis_date}/"
    "ca_hq_transit_stops.parquet",
    filters = [[
        ("stop_id", "in", muni_stops.stop_id.unique()), 
        ("agency_primary", "==", "City and County of San Francisco"), 
        ("route_id", "in", muni_stops_with_route.route_id.unique())
    ]]
)

In [11]:
hq_stops.hqta_type.value_counts()

hq_corridor_bus    136
major_stop_bus      90
major_stop_brt       9
Name: hqta_type, dtype: int64

In [12]:
# Each stop can be tagged with several categories,
# so let's keep stop-hqta_type, but throw away route
hq_stops2 = hq_stops[
    ["stop_id", "hqta_type", "hqta_details"]
].drop_duplicates().reset_index(drop=True)

In [13]:
pd.merge(
    muni_stops_with_route,
    hq_stops,
    on = ["route_id", "stop_id"],
    how = "outer",
    indicator = True
)._merge.value_counts()

both          235
left_only     134
right_only      0
Name: _merge, dtype: int64

Test out a merge with route vs. without route.

The same stop can have multiple HQ categories, depending on whether it's generated from an intersection of 2+ bus routes, a major transit stop, etc. We do care about route when it comes to the detail, but for now let's just check whether all the stops Muni wants are represented in HQ stops.

**Finding: all the stops they asked for do show up, but not in the category they want**

In [14]:
muni_hq = pd.merge(
    muni_stops_with_route,
    hq_stops2,
    on = "stop_id",
    how = "outer",
    indicator = True
)

muni_hq._merge.value_counts()

both          494
left_only       0
right_only      0
Name: _merge, dtype: int64
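
The two merges give different pictures: merging on `route_id` and `stop_id` leaves `left_only` rows, while merging on `stop_id` alone matches everything. The toy example below reproduces that pattern under one assumption that has not been verified against the HQ pipeline: that each HQ row carries a single `route_id`, so a stop served by several routes only matches on one of them. All ids are made up.

```python
import pandas as pd

# Stop-route pairs from stop_times (hypothetical): stop "1" is served by routes A and B
stop_routes = pd.DataFrame({
    "route_id": ["A", "B", "A"],
    "stop_id": ["1", "1", "2"],
})

# Assumed shape of the HQ table: each stop-category row records one route_id
hq = pd.DataFrame({
    "route_id": ["A", "A"],
    "stop_id": ["1", "2"],
    "hqta_type": ["hq_corridor_bus", "hq_corridor_bus"],
})

# Merge on route and stop: the (B, "1") pair has no HQ row, so it is left_only
with_route = pd.merge(
    stop_routes, hq, on=["route_id", "stop_id"], how="outer", indicator=True
)

# Merge on stop only, after dropping route from the HQ side
without_route = pd.merge(
    stop_routes,
    hq[["stop_id", "hqta_type"]].drop_duplicates(),
    on="stop_id",
    how="outer",
    indicator=True,
)

left_only_with_route = int((with_route["_merge"] == "left_only").sum())
left_only_without_route = int((without_route["_merge"] == "left_only").sum())
```

In this sketch the route-level merge has one unmatched pair and the stop-level merge has none, which is why the stop-level merge is the better test of whether a stop is represented at all.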

In [15]:
muni_hq.hqta_type.value_counts()

hq_corridor_bus    270
major_stop_bus     206
major_stop_brt      18
Name: hqta_type, dtype: int64

This tells me that all the stops Muni wants to add already appear as `hq_corridor_bus` (and they are high frequency too: all are `corridor_frequent_stop`).

A large chunk, 90 of 136, also appear as `major_stop_bus` because they show up at the intersection of 2+ bus routes.

But yes, we only tag 1 route as BRT, and the main purpose of their email was to recategorize these stops as BRT. BRT would require manual checks though.
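
To get a list of the stops that would need a manual BRT check, one option is to pivot the stop-category rows into a True/False table and keep the stops with no `major_stop_brt` tag. This is a sketch on toy rows shaped like `muni_hq`; the values, and the names `toy_hq`, `stop_by_type` and `needs_review`, are made up for illustration.

```python
import pandas as pd

# Toy rows shaped like muni_hq: one row per stop-route-category (hypothetical values)
toy_hq = pd.DataFrame({
    "stop_id": ["1", "1", "2", "2", "3"],
    "route_id": ["A", "A", "A", "B", "B"],
    "hqta_type": [
        "hq_corridor_bus", "major_stop_brt",
        "hq_corridor_bus", "major_stop_bus",
        "hq_corridor_bus",
    ],
})

# Count rows per stop and category, then convert the counts to True/False
stop_by_type = pd.crosstab(toy_hq["stop_id"], toy_hq["hqta_type"]) > 0

# Stops without a BRT tag are the ones that would need a manual check
needs_review = stop_by_type.index[~stop_by_type["major_stop_brt"]].tolist()
```

Running the same steps on `muni_hq` would give the stops that are already in the HQ dataset but not yet categorized as BRT.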

In [16]:
muni_hq.columns

Index(['feed_key', 'route_type', 'route_id', 'route_short_name', 'stop_id',
       'hqta_type', 'hqta_details', '_merge'],
      dtype='object')

In [17]:
for t in muni_hq.hqta_type.unique():
    print(t)
    subset = muni_hq[muni_hq.hqta_type == t]
    
    print(f"nunique stop_ids: {subset.stop_id.nunique()}")   
    print(subset.hqta_details.value_counts())
    print("------")

hq_corridor_bus
nunique stop_ids: 136
corridor_frequent_stop    270
Name: hqta_details, dtype: int64
------
major_stop_bus
nunique stop_ids: 90
intersection_2_bus_routes_same_operator          124
intersection_2_bus_routes_different_operators     82
Name: hqta_details, dtype: int64
------
major_stop_brt
nunique stop_ids: 9
major_stop_brt_single_operator    18
Name: hqta_details, dtype: int64
------


In [18]:
muni_hq.route_id.nunique()

28

In [19]:
muni_hq.stop_id.nunique()

136