Closed
Labels
gtfs-rt: Work related to GTFS-Realtime
research request: Issues that serve as a request for research (summary and handoff)
Description
Complete the sections below when receiving a research request, and continue to add to this issue as you receive additional details and produce deliverables. Be sure to also add the appropriate project-level label to this issue (e.g., gtfs-rt, DLA).
Research Question
Single sentence description: The draft GTFS digest is up; now decide which metrics are worth charts (time-series) and which are more easily captured by tables (perhaps mostly static, with few changes).
Tables
Grain
- route-direction
- in digest, do not use `schedule_gtfs_dataset_key` or `name` (schedule), but use `organization_source_record_id` and `organization_name`.
  - organize by district
- meaningful route name displayed, and charts use this standardized `route_name`, not `route_id`:
  - put through script to standardize / parse route_id over time so we can identify the same route over time (route_id2).
  - further clean up by operator so that combo `route_short_name` and `route_long_name` does not have redundancy (`1 Route 1` -> `Route 1`)
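The redundancy cleanup above could be sketched as follows. This is a minimal illustration, not the digest's actual script: the function name `clean_route_name` is hypothetical, and it assumes the redundancy always takes the form of the short name already appearing as a whole word (or phrase) inside the long name.

```python
import re


def clean_route_name(route_short_name: str, route_long_name: str) -> str:
    """Combine short and long route names without repeating the short name.

    Hypothetical helper: drops the short name when the long name already
    contains it as a whole word or phrase (case-insensitive).
    """
    short = (route_short_name or "").strip()
    long_name = (route_long_name or "").strip()
    if not short:
        return long_name
    if not long_name:
        return short
    # Word-boundary style match so "1" matches "Route 1" but not "Route 10".
    pattern = rf"(?<!\w){re.escape(short)}(?!\w)"
    if re.search(pattern, long_name, flags=re.IGNORECASE):
        return long_name
    return f"{short} {long_name}"


print(clean_route_name("1", "Route 1"))
print(clean_route_name("10", "Main St"))
```

Operator-specific quirks would likely need extra rules on top of this; the sketch only covers the `1 Route 1` -> `Route 1` case named above.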
Operator Stats ("digest/operator_profiles" and "digest/operator_routes")
- monthly scheduled service hours by day of week (`day_type`, `time_of_day`)
- number of routes, service area (length in miles, only count each route-direction once)
- number of stops served, total stop arrivals, arrivals per stop
- route typology breakdown
Schedule ("digest/operator_schedule_rt_category"?)
- avg scheduled service minutes
- avg stop distance (meters) -> change to per mile?
- all_day / peak / offpeak scheduled n_trips
- all_day / peak / offpeak scheduled frequency
- monthly scheduled service hours by day of week (`day_type`) and `time_of_day`
Speeds ("digest/segment_speeds")
- summary speeds for all_day / peak / offpeak
RT vs Schedule
- trips breakdown by `sched_vp_category` (10 trips schedule only, 100 trips schedule_and_vp, 5 trips vp only). `drop_duplicates()` but keep some dates where this distribution changes
- ~~route breakdown by `sched_vp_category` ("digest/schedule_vp_metrics")~~ don't do this, because routes can have different numbers of `n_scheduled_trips`, `n_vp_trips`, and we can assign incorrectly.
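One way to read "drop duplicates but keep dates where the distribution changes" is to keep a date only when its trip counts differ from the previous date's. The sketch below uses plain Python rather than pandas so the logic is visible; the function name `keep_change_dates`, the dates, and the category labels `schedule_only` / `vp_only` are assumptions for illustration.

```python
from typing import Dict, List, Tuple

Distribution = Dict[str, int]


def keep_change_dates(by_date: Dict[str, Distribution]) -> List[Tuple[str, Distribution]]:
    """Keep a date only when its distribution differs from the previous date's."""
    kept: List[Tuple[str, Distribution]] = []
    previous = None
    for service_date in sorted(by_date):  # ISO date strings sort chronologically
        current = by_date[service_date]
        if current != previous:
            kept.append((service_date, current))
        previous = current
    return kept


# Made-up counts for one route-direction across four months.
by_date = {
    "2024-01-17": {"schedule_only": 10, "schedule_and_vp": 100, "vp_only": 5},
    "2024-02-14": {"schedule_only": 10, "schedule_and_vp": 100, "vp_only": 5},
    "2024-03-13": {"schedule_only": 8, "schedule_and_vp": 102, "vp_only": 5},
    "2024-04-17": {"schedule_only": 10, "schedule_and_vp": 100, "vp_only": 5},
}
kept = keep_change_dates(by_date)
print([d for d, _ in kept])
```

Comparing against the previous date, rather than against every earlier row, keeps the April row here even though it matches January; deduplicating on the counts alone would drop it.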
Charts
Schedule
- monthly scheduled service hours by day of week (`day_type`) and `time_of_day`
- number of trips by `sched_vp_category`?
Speeds
- summary speeds for all_day / peak / offpeak
RT vs Schedule
- filter to `schedule_vp_category = "schedule_and_vp"`
- vp per minute (goal line = 2)
- % vp in scheduled shape (goal line = 100%) - use `all_day`
- % RT journey with 1+/2+ vp (goal line = 100%) - use `all_day`, one chart shared for 1+ and 2+
- % schedule journey with 1+/2+ vp (goal line = 100%) - use `all_day`, one chart shared for 1+ and 2+
- comparison of scheduled to RT trip time (aggregated to route-direction)
- breakdown n_trips early (5+ min early) / on-time (within 5 min early or late) / late (5+ min late)
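The early / on-time / late breakdown can be sketched as a small classifier. This assumes the comparison is the difference between RT and scheduled trip time in minutes, and that exactly 5 minutes counts as early or late ("5+"); the function name, labels, and sample trips are illustrative only.

```python
from collections import Counter


def timeliness_category(scheduled_minutes: float, rt_minutes: float,
                        threshold: float = 5.0) -> str:
    """Classify a trip by how its RT trip time compares to the schedule."""
    diff = rt_minutes - scheduled_minutes
    if diff <= -threshold:
        return "early"    # 5+ min shorter than scheduled
    if diff >= threshold:
        return "late"     # 5+ min longer than scheduled
    return "on_time"      # within 5 min either way


# Made-up (scheduled, RT) trip times in minutes for one route-direction.
trips = [(30, 24), (30, 25), (30, 33), (30, 35), (45, 44)]
counts = Counter(timeliness_category(s, r) for s, r in trips)
print(dict(counts))
```

Keeping the threshold as a parameter makes it easy to test other cutoffs before settling on what the chart shows.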
Maps
Deliverables
- Parameterized notebook with supporting scripts (charting, report styling, or table aggregations).
- Plug and play approach. How the metrics are displayed and organized potentially can get reshuffled, so set up functions for making each chart and then wrap them up for display.
- If we want to have dropdown by route, would altair's extract data help with displaying tables? Right now, tables are not filtered interactively, but that is desired.
- Try out great_tables, potentially can help with displaying tables. Their nanoplots look cool, but only take `polars` dfs, not `pandas`, and df might have to be wide?
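The plug-and-play idea could look roughly like the sketch below: one function per chart, plus a single ordered list that controls how they are grouped for display. The chart functions here return plain dicts as stand-ins for real chart objects (for example altair charts), and every function name and column name (`vp_per_minute`, `pct_in_shape`) is a hypothetical placeholder.

```python
from typing import Callable, Dict, List, Tuple

Rows = List[Dict]
ChartFn = Callable[[Rows], Dict]


def vp_per_minute_chart(rows: Rows) -> Dict:
    """Stand-in chart function; a real one would return a chart object."""
    return {"title": "vp per minute", "y": "vp_per_minute", "goal_line": 2, "data": rows}


def pct_in_shape_chart(rows: Rows) -> Dict:
    """Stand-in chart function for % vp in scheduled shape."""
    return {"title": "% vp in scheduled shape", "y": "pct_in_shape", "goal_line": 100, "data": rows}


# Display order lives in one place; reshuffling the report means editing this list.
REPORT_SECTIONS: List[Tuple[str, List[ChartFn]]] = [
    ("RT vs Schedule", [vp_per_minute_chart, pct_in_shape_chart]),
]


def build_report(rows: Rows, sections=REPORT_SECTIONS) -> List[Tuple[str, List[Dict]]]:
    """Call each chart function and group the results under section headings."""
    return [(heading, [make_chart(rows) for make_chart in chart_fns])
            for heading, chart_fns in sections]


rows = [{"month": "2024-01", "vp_per_minute": 1.8, "pct_in_shape": 96.0}]
report = build_report(rows)
for heading, charts in report:
    print(heading, [chart["title"] for chart in charts])
```

In a parameterized notebook, the final loop would display each chart instead of printing its title.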
Data Sources