# Getting all arrivals to all stops of a given line in a given day

## Install dependencies

This notebook requires two dependencies, which can be installed with `pip install pandas open-bus-stride-client`.

You can also launch it online at [this URL](https://mybinder.org/v2/gh/hasadna/open-bus-stride-client/HEAD?labpath=notebooks%2Fgetting%20all%20arrivals%20to%20all%20stops%20of%20a%20given%20line%20in%20a%20given%20day.ipynb). When launched online, the dependencies are already installed.

## Import dependencies

In [1]:
import datetime

from ipywidgets import DatePicker
from IPython.display import display

# we use pandas to visualize the results we get
import pandas as pd

# The stride client library, used to make the calls to the stride api
import stride

# For local development, set the following to connect to local api:
# stride.config.STRIDE_API_BASE_URL = "http://localhost:8000"

## Pick a date

First, we pick the day we want to analyze. GTFS data queries should be limited to a specific date.

Use the date picker to choose a date; its value is used in the following code blocks.

In [10]:
date_widget = DatePicker(description="Date:", value=datetime.date(2025, 2, 2))
display(date_widget)

DatePicker(value=datetime.date(2025, 2, 2), description='Date:', step=1)

## Get list of GTFS Routes

For this example we look up routes by `route_short_name`, which is usually the route number in the GTFS data.

We look for all routes on the given day matching `route_short_name` `360`.

In [11]:
gtfs_routes = stride.get(
    "/gtfs_routes/list",
    {
        "date_from": date_widget.value,
        "date_to": date_widget.value,
        "route_short_name": 360,
        # 'route_mkt': 23056
    },
    pre_requests_callback="print",
)
pd.DataFrame(gtfs_routes)

https://open-bus-stride-api.hasadna.org.il/gtfs_routes/list?date_from=2025-02-02&date_to=2025-02-02&route_short_name=360


|  | id | date | line_ref | operator_ref | route_short_name | route_long_name | route_mkt | route_direction | route_alternative | agency_name | route_type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6353452 | 2025-02-02 | 24985 | 14 | 360 | ת. מרכזית צפת/רציפים-צפת<->ת. מרכזית המפרץ/הור... | 13360 | 1 | # | נתיב אקספרס | 3 |
| 1 | 6353453 | 2025-02-02 | 24986 | 14 | 360 | ת. מרכזית המפרץ/רציפים בינעירוני-חיפה<->ת. מרכ... | 13360 | 2 | # | נתיב אקספרס | 3 |
| 2 | 6353981 | 2025-02-02 | 28774 | 4 | 360 | היכל המשפט/אבא אבן-ירושלים<->סובת התאנה-אפרת-1# | 14360 | 1 | # | אלקטרה אפיקים תחבורה | 3 |
| 3 | 6353982 | 2025-02-02 | 28775 | 4 | 360 | סובת התאנה-אפרת<->שדרות שז''ר/בנייני האומה-ירו... | 14360 | 2 | # | אלקטרה אפיקים תחבורה | 3 |
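
The result above mixes two unrelated lines that share the short name `360` (operators 14 and 4). A minimal sketch of narrowing the list to a single operator before querying rides could look like the following. The sample rows are copied from the output above and reduced to two fields so the block runs without calling the API; `routes_for_operator` is a hypothetical helper, not part of the stride client.

```python
# Sample rows copied from the output above, reduced to the fields used here.
sample_routes = [
    {"line_ref": 24985, "operator_ref": 14},
    {"line_ref": 24986, "operator_ref": 14},
    {"line_ref": 28774, "operator_ref": 4},
    {"line_ref": 28775, "operator_ref": 4},
]


def routes_for_operator(routes, operator_ref):
    """Keep only the routes whose operator_ref matches (hypothetical helper)."""
    return [route for route in routes if route["operator_ref"] == operator_ref]


selected_routes = routes_for_operator(sample_routes, 4)
print([route["line_ref"] for route in selected_routes])
```

In the notebook, the same helper could be applied to `gtfs_routes` directly, assuming each item is a dict with an `operator_ref` key as the output suggests.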


## Get list of SIRI rides

Using the `line_ref` and `operator_ref` values from this list of GTFS routes, we can get the list of SIRI rides which occurred on these routes.

We use the `scheduled_start_time` field, which is populated from the SIRI data, to limit the rides to the specific date we want to check.

This data contains the `duration_minutes` field, which is the duration of the ride according to the SIRI data.

In [12]:
siri_rides = stride.get(
    "/siri_rides/list",
    {
        "scheduled_start_time_from": datetime.datetime.combine(
            date_widget.value, datetime.time(), datetime.timezone.utc
        ),
        "scheduled_start_time_to": datetime.datetime.combine(
            date_widget.value, datetime.time(23, 59), datetime.timezone.utc
        ),
        "siri_route__line_refs": ",".join(
            [str(gtfs_route["line_ref"]) for gtfs_route in gtfs_routes]
        ),
        "siri_route__operator_refs": ",".join(
            [str(gtfs_route["operator_ref"]) for gtfs_route in gtfs_routes]
        ),
        "order_by": "scheduled_start_time asc",
    },
    pre_requests_callback="print",
)
pd.DataFrame(siri_rides)

https://open-bus-stride-api.hasadna.org.il/siri_rides/list?scheduled_start_time_from=2025-02-02T00%3A00%3A00.000000%2B0000&scheduled_start_time_to=2025-02-02T23%3A59%3A00.000000%2B0000&siri_route__line_refs=24985%2C24986%2C28774%2C28775&siri_route__operator_refs=14%2C14%2C4%2C4&order_by=scheduled_start_time+asc


|  | id | siri_route_id | journey_ref | scheduled_start_time | vehicle_ref | updated_first_last_vehicle_locations | first_vehicle_location_id | last_vehicle_location_id | updated_duration_minutes | duration_minutes | ... | gtfs_route__date | gtfs_route__line_ref | gtfs_route__operator_ref | gtfs_route__route_short_name | gtfs_route__route_long_name | gtfs_route__route_mkt | gtfs_route__route_direction | gtfs_route__route_alternative | gtfs_route__agency_name | gtfs_route__route_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 94566583 | 8104 | 2025-02-02-584858609 | 2025-02-02 03:00:00+00:00 | 61565202 |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| 1 | 94567426 | 2093 | 2025-02-02-0 | 2025-02-02 03:20:00+00:00 | 27724703 |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| 2 | 94567159 | 2090 | 2025-02-02-585191043 | 2025-02-02 03:20:00+00:00 | 7670169 |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| 3 | 94570883 | 8104 | 2025-02-02-584858610 | 2025-02-02 04:00:00+00:00 | 13687902 |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| 4 | 94570628 | 2093 | 2025-02-02-57859219 | 2025-02-02 04:00:00+00:00 | 61558002 |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 95 | 94660255 | 8105 | 2025-02-02-584858707 | 2025-02-02 16:25:00+00:00 | 79195401 |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| 96 | 94661269 | 2093 | 2025-02-02-0 | 2025-02-02 16:30:00+00:00 | 4284234 |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| 97 | 94662286 | 2093 | 2025-02-02-585160364 | 2025-02-02 16:40:00+00:00 | 79797002 |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| 98 | 94661725 | 2090 | 2025-02-02-585160380 | 2025-02-02 16:40:00+00:00 | 79180101 |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
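
The request URL above repeats operator refs (`14,14,4,4`) because one value is emitted per route. A minimal sketch of building the comma-separated parameters with repeats removed is shown below. `join_unique` is a hypothetical helper, and it is an assumption that the API treats these parameters as plain membership filters, so that dropping duplicates does not change the result.

```python
def join_unique(values):
    """Join values with commas, dropping repeats but keeping first-seen order."""
    # dict.fromkeys preserves insertion order and discards duplicate keys.
    return ",".join(str(value) for value in dict.fromkeys(values))


# Sample rows copied from the GTFS routes output, reduced to the fields used here.
sample_routes = [
    {"line_ref": 24985, "operator_ref": 14},
    {"line_ref": 24986, "operator_ref": 14},
    {"line_ref": 28774, "operator_ref": 4},
    {"line_ref": 28775, "operator_ref": 4},
]

line_refs = join_unique(route["line_ref"] for route in sample_routes)
operator_refs = join_unique(route["operator_ref"] for route in sample_routes)
print(line_refs)
print(operator_refs)
```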


## Choose a ride

There are usually a lot of rides for a given route. The following code gets the first ride scheduled at or after 10:00 (UTC).

(The rides are already sorted by `scheduled_start_time` in ascending order, so the first one at or after 10:00 is the right one.)

In [14]:
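# Rides are sorted by scheduled_start_time, so stop at the first one
# scheduled at or after 10:00 UTC.
# Note: if no ride matches, siri_ride stays bound to the last ride in the list.
for siri_ride in siri_rides:
    if siri_ride["scheduled_start_time"].hour >= 10:
        break
siri_ride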

{'id': 94614201,
 'siri_route_id': 2093,
 'journey_ref': '2025-02-02-57859225',
 'scheduled_start_time': datetime.datetime(2025, 2, 2, 10, 0, tzinfo=datetime.timezone.utc),
 'vehicle_ref': '7670369',
 'updated_first_last_vehicle_locations': None,
 'first_vehicle_location_id': None,
 'last_vehicle_location_id': None,
 'updated_duration_minutes': None,
 'duration_minutes': None,
 'journey_gtfs_ride_id': None,
 'route_gtfs_ride_id': None,
 'gtfs_ride_id': None,
 'siri_route__line_ref': 28774,
 'siri_route__operator_ref': 4,
 'gtfs_ride__gtfs_route_id': None,
 'gtfs_ride__journey_ref': None,
 'gtfs_ride__start_time': None,
 'gtfs_ride__end_time': None,
 'gtfs_route__date': None,
 'gtfs_route__line_ref': None,
 'gtfs_route__operator_ref': None,
 'gtfs_route__route_short_name': None,
 'gtfs_route__route_long_name': None,
 'gtfs_route__route_mkt': None,
 'gtfs_route__route_direction': None,
 'gtfs_route__route_alternative': None,
 'gtfs_route__agency_name': None,
 'gtfs_route__route_type': None}
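
The `for`/`break` loop above leaves `siri_ride` pointing at the last ride when nothing matches. A minimal sketch of an alternative that returns `None` in that case, using the built-in `next` with a default, is shown below. The helper name `first_ride_at_or_after` and the sample ride ids are hypothetical; only the `scheduled_start_time` key is taken from the real data.

```python
import datetime


def first_ride_at_or_after(rides, hour):
    """Return the first ride whose scheduled start hour is >= hour, or None."""
    return next(
        (ride for ride in rides if ride["scheduled_start_time"].hour >= hour),
        None,
    )


utc = datetime.timezone.utc
# Hypothetical sample rides, already sorted by scheduled_start_time.
sample_rides = [
    {"id": 1, "scheduled_start_time": datetime.datetime(2025, 2, 2, 3, 0, tzinfo=utc)},
    {"id": 2, "scheduled_start_time": datetime.datetime(2025, 2, 2, 10, 0, tzinfo=utc)},
    {"id": 3, "scheduled_start_time": datetime.datetime(2025, 2, 2, 16, 40, tzinfo=utc)},
]

chosen = first_ride_at_or_after(sample_rides, 10)
missing = first_ride_at_or_after(sample_rides, 20)
print(chosen["id"], missing)
```

Checking the result for `None` before continuing makes the "no ride found" case explicit instead of silently using an unrelated ride.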

## Get the ride-stops for this ride

SIRI ride-stops list all the stops on this ride; the vehicle locations are then related to the ride stops.

The response contains data from related tables:

* `gtfs_stop__city` / `gtfs_stop__name`: the stop city/name from the GTFS data
* `gtfs_ride_stop__departure_time`: the planned departure time based on the GTFS data
* `nearest_siri_vehicle_location__recorded_at_time`: the date/time from the SIRI data of the relevant bus which was on this route/ride and nearest to this GTFS stop (based on lat/lon)

In [17]:
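siri_ride_stops = stride.get(
    "/siri_ride_stops/list",
    # 94614201 is the id of the ride chosen in the previous step (siri_ride["id"])
    {"siri_ride_ids": 94614201, "order_by": "order asc", "expand_related_data": False},
    pre_requests_callback="print",
)
df = pd.DataFrame(siri_ride_stops)
# Only the last expression in a cell is displayed, so this filtered view is not shown
df[df["gtfs_stop__city"].notna()].head()
df.head()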

https://open-bus-stride-api.hasadna.org.il/siri_ride_stops/list?siri_ride_ids=94614201&order_by=order+asc&expand_related_data=False


|  | id | siri_stop_id | siri_ride_id | order | gtfs_stop_id | nearest_siri_vehicle_location_id | siri_stop__code | siri_ride__siri_route_id | siri_ride__journey_ref | siri_ride__scheduled_start_time | ... | gtfs_route__date | gtfs_route__line_ref | gtfs_route__operator_ref | gtfs_route__route_short_name | gtfs_route__route_long_name | gtfs_route__route_mkt | gtfs_route__route_direction | gtfs_route__route_alternative | gtfs_route__agency_name | gtfs_route__route_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2395943394 | 1689 | 94614201 | 1 |  |  | 1106 | 2093 | 2025-02-02-57859225 | 2025-02-02 10:00:00+00:00 | ... |  |  |  |  |  |  |  |  |  |  |
| 1 | 2395949282 | 2332 | 94614201 | 2 |  |  | 5200 | 2093 | 2025-02-02-57859225 | 2025-02-02 10:00:00+00:00 | ... |  |  |  |  |  |  |  |  |  |  |
| 2 | 2395960622 | 26056 | 94614201 | 3 |  |  | 6229 | 2093 | 2025-02-02-57859225 | 2025-02-02 10:00:00+00:00 | ... |  |  |  |  |  |  |  |  |  |  |
| 3 | 2395980682 | 344 | 94614201 | 4 |  |  | 222 | 2093 | 2025-02-02-57859225 | 2025-02-02 10:00:00+00:00 | ... |  |  |  |  |  |  |  |  |  |  |
| 4 | 2396000311 | 2597 | 94614201 | 5 |  |  | 61362 | 2093 | 2025-02-02-57859225 | 2025-02-02 10:00:00+00:00 | ... |  |  |  |  |  |  |  |  |  |  |
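
Once the related fields are populated, the planned departure time and the nearest recorded vehicle location time can be compared per stop. The following is a minimal sketch under stated assumptions: both fields are taken to be timezone-aware `datetime` values (or `None` when missing, as the GTFS-related columns are in the rows above), and treating the nearest recorded location time as an approximation of the arrival time is an assumption. The helper `delay_minutes` and the sample values are hypothetical.

```python
import datetime


def delay_minutes(ride_stop):
    """Minutes between planned departure and nearest recorded location, or None."""
    planned = ride_stop.get("gtfs_ride_stop__departure_time")
    recorded = ride_stop.get("nearest_siri_vehicle_location__recorded_at_time")
    if planned is None or recorded is None:
        return None  # related data is missing for this stop
    return (recorded - planned).total_seconds() / 60


utc = datetime.timezone.utc
# Hypothetical sample ride stops; the second one has no related data.
sample_ride_stops = [
    {
        "order": 1,
        "gtfs_ride_stop__departure_time": datetime.datetime(2025, 2, 2, 10, 0, tzinfo=utc),
        "nearest_siri_vehicle_location__recorded_at_time": datetime.datetime(2025, 2, 2, 10, 3, tzinfo=utc),
    },
    {
        "order": 2,
        "gtfs_ride_stop__departure_time": None,
        "nearest_siri_vehicle_location__recorded_at_time": None,
    },
]

delays = [delay_minutes(stop) for stop in sample_ride_stops]
print(delays)
```

A positive value means the bus was recorded near the stop after the planned departure time; stops without related data yield `None` rather than raising an error.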
