# Getting all arrivals to all stops of a given line in a given day

## Install dependencies

This notebook requires two dependencies, which can be installed with the following command: `pip install pandas open-bus-stride-client`.

You can also launch it online at [this URL](https://mybinder.org/v2/gh/hasadna/open-bus-stride-client/HEAD?labpath=notebooks%2Fgetting%20all%20arrivals%20to%20all%20stops%20of%20a%20given%20line%20in%20a%20given%20day.ipynb). When launched online, the dependencies are already installed.

## Import dependencies

In [1]:
import datetime

from ipywidgets import DatePicker
from IPython.display import display

# we use pandas to visualize the results we get
import pandas as pd

# The stride client library, used to make the calls to the stride api
import stride

# For local development, set the following to connect to local api:
# stride.config.STRIDE_API_BASE_URL = "http://localhost:8000"

## Pick a date

First, we pick the day we want to analyze. GTFS data queries should be limited to a specific date.

You can use the date picker to choose a date, which will be used in the next code blocks.

In [2]:
date_widget = DatePicker(description='Date:', value=datetime.date(2022,6,2))
display(date_widget)

DatePicker(value=datetime.date(2022, 6, 2), description='Date:')
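
If the widget cannot be rendered (for example, when the code runs as a plain script), a fixed `datetime.date` can stand in for it, since the later cells only read `date_widget.value`. A minimal sketch, where `selected_date` is a hypothetical name that is not used elsewhere in this notebook:

```python
import datetime

# Hypothetical stand-in for date_widget.value when no widget is available.
selected_date = datetime.date.fromisoformat("2022-06-02")

# The request URLs printed in this notebook show dates in ISO format.
print(selected_date.isoformat())
```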

## Get list of GTFS Routes

For this example we get routes based on `route_short_name`, which is usually the route number in the GTFS data.

We look for all routes on the given day matching `route_short_name` `360`.

In [3]:
gtfs_routes = stride.get('/gtfs_routes/list', {
    'date_from': date_widget.value, 'date_to': date_widget.value,
    'route_short_name': 360,
}, pre_requests_callback='print')
pd.DataFrame(gtfs_routes)

https://open-bus-stride-api.hasadna.org.il/gtfs_routes/list?date_from=2022-06-02&date_to=2022-06-02&route_short_name=360


| | id | date | line_ref | operator_ref | route_short_name | route_long_name | route_mkt | route_direction | route_alternative | agency_name | route_type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 617862 | 2022-06-02 | 28774 | 4 | 360 | היכל המשפט/אבא אבן-ירושלים<->סובת התאנה-אפרת-1# | 14360 | 1 | # | אגד תעבורה | 3 |
| 1 | 617863 | 2022-06-02 | 28775 | 4 | 360 | סובת התאנה-אפרת<->בנייני האומה/שדרות שז''ר-ירו... | 14360 | 2 | # | אגד תעבורה | 3 |
| 2 | 618031 | 2022-06-02 | 29729 | 4 | 360 | סובת התאנה-אפרת<->בנייני האומה/שדרות שז''ר-ירו... | 14360 | 2 | 4 | אגד תעבורה | 3 |
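
The printed URL shows how the request parameters end up in the query string. As a rough sketch using only the standard library, the same URL can be reproduced as follows. This is not the stride client's actual implementation; `build_url` and `BASE_URL` are hypothetical names, and the sketch only handles plain dates and simple values.

```python
import datetime
from urllib.parse import urlencode

# Base URL as it appears in the printed request.
BASE_URL = "https://open-bus-stride-api.hasadna.org.il"


def build_url(path, params):
    """Return the full request URL for a path and a dict of query parameters."""
    # Dates are rendered in ISO format; urlencode converts other values with str().
    encoded = {
        key: value.isoformat() if isinstance(value, datetime.date) else value
        for key, value in params.items()
    }
    return f"{BASE_URL}{path}?{urlencode(encoded)}"


url = build_url("/gtfs_routes/list", {
    "date_from": datetime.date(2022, 6, 2),
    "date_to": datetime.date(2022, 6, 2),
    "route_short_name": 360,
})
print(url)
```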


## Get list of SIRI rides

Based on the `line_ref` and `operator_ref` values of the GTFS routes found in the previous step, we can get the list of SIRI rides which occurred on these routes.
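
Both filter values are comma-separated strings built from the routes found in the previous step. The sketch below shows that step in isolation, using sample rows copied from the routes table; `join_refs` and `sample_routes` are hypothetical names introduced for illustration.

```python
# Sample rows copied from the routes table (only the two fields needed here).
sample_routes = [
    {"line_ref": 28774, "operator_ref": 4},
    {"line_ref": 28775, "operator_ref": 4},
    {"line_ref": 29729, "operator_ref": 4},
]


def join_refs(routes, field):
    """Join one field of every route into a comma-separated string."""
    return ",".join(str(route[field]) for route in routes)


line_refs = join_refs(sample_routes, "line_ref")
operator_refs = join_refs(sample_routes, "operator_ref")
print(line_refs)      # 28774,28775,29729
print(operator_refs)  # 4,4,4
```

The operator reference is repeated once per route, which matches the `4,4,4` value visible in the request URL for this step.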

We use the `scheduled_start_time` field, which is populated from the SIRI data, to limit the rides to the specific date we want to check.
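
The date limits are a pair of timezone-aware datetimes. The sketch below builds the same pair as the request in this section, wrapped in a hypothetical helper named `day_bounds_utc`. Note that the bounds are expressed in UTC, so the window is the UTC calendar day, and that the upper bound is 23:59:00.

```python
import datetime


def day_bounds_utc(day):
    """Return (start, end) datetimes covering a date in UTC, at minute resolution."""
    start = datetime.datetime.combine(day, datetime.time(0, 0), datetime.timezone.utc)
    end = datetime.datetime.combine(day, datetime.time(23, 59), datetime.timezone.utc)
    return start, end


start, end = day_bounds_utc(datetime.date(2022, 6, 2))
print(start.isoformat())  # 2022-06-02T00:00:00+00:00
print(end.isoformat())    # 2022-06-02T23:59:00+00:00
```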

This data contains the `duration_minutes` field, which is the duration of the ride in minutes according to the SIRI data.

In [4]:
siri_rides = stride.get('/siri_rides/list', {
    'scheduled_start_time_from': datetime.datetime.combine(date_widget.value, datetime.time(), datetime.timezone.utc),
    'scheduled_start_time_to': datetime.datetime.combine(date_widget.value, datetime.time(23,59), datetime.timezone.utc),
    'siri_route__line_refs': ','.join([str(gtfs_route['line_ref']) for gtfs_route in gtfs_routes]),
    'siri_route__operator_refs': ','.join([str(gtfs_route['operator_ref']) for gtfs_route in gtfs_routes]),
    'order_by': 'scheduled_start_time asc'
}, pre_requests_callback='print')
pd.DataFrame(siri_rides)

https://open-bus-stride-api.hasadna.org.il/siri_rides/list?scheduled_start_time_from=2022-06-02T00%3A00%3A00.000000%2B0000&scheduled_start_time_to=2022-06-02T23%3A59%3A00.000000%2B0000&siri_route__line_refs=28774%2C28775%2C29729&siri_route__operator_refs=4%2C4%2C4&order_by=scheduled_start_time+asc


| | id | siri_route_id | journey_ref | scheduled_start_time | vehicle_ref | updated_first_last_vehicle_locations | first_vehicle_location_id | last_vehicle_location_id | updated_duration_minutes | duration_minutes | journey_gtfs_ride_id | route_gtfs_ride_id | gtfs_ride_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8469657 | 2093 | 2022-06-02-57859312 | 2022-06-02 04:00:00+00:00 | 79180201 | 2022-06-02 05:00:34.940312+00:00 | 445811553 | 445939337 | 2022-06-02 11:01:56.278296+00:00 | 22 | | 9180299 | 9180299 |
| 1 | 8469656 | 2090 | 2022-06-02-57859462 | 2022-06-02 04:00:00+00:00 | 7670169 | 2022-06-02 05:00:34.936698+00:00 | 445811552 | 446032568 | 2022-06-02 11:02:08.341860+00:00 | 38 | | 9180334 | 9180334 |
| 2 | 8473785 | 2090 | 2022-06-02-57859463 | 2022-06-02 04:20:00+00:00 | 8212185 | 2022-06-02 06:00:33.237922+00:00 | 445819131 | 446147667 | 2022-06-02 12:02:13.058656+00:00 | 55 | | 9180323 | 9180323 |
| 3 | 8477041 | 2090 | 2022-06-02-57859464 | 2022-06-02 04:40:00+00:00 | 7609669 | 2022-06-02 06:00:44.954916+00:00 | 445951810 | 446300675 | 2022-06-02 12:02:34.024869+00:00 | 55 | | 9180324 | 9180324 |
| 4 | 8479317 | 2093 | 2022-06-02-57859313 | 2022-06-02 05:00:00+00:00 | 7670469 | 2022-06-02 06:00:53.337587+00:00 | 446063583 | 446306855 | 2022-06-02 12:02:38.824602+00:00 | 39 | | 9180300 | 9180300 |
| 5 | 8479023 | 2090 | 2022-06-02-57859465 | 2022-06-02 05:00:00+00:00 | 79180201 | 2022-06-02 06:00:52.383861+00:00 | 446057113 | 446405372 | 2022-06-02 12:02:38.248213+00:00 | 56 | | 9180335 | 9180335 |
| 6 | 8482771 | 2090 | 2022-06-02-57859466 | 2022-06-02 05:30:00+00:00 | 7669969 | 2022-06-02 07:00:31.677957+00:00 | 446269232 | 446542313 | 2022-06-02 13:02:44.079571+00:00 | 46 | | 9180336 | 9180336 |
| 7 | 8485476 | 2093 | 2022-06-02-57859314 | 2022-06-02 06:00:00+00:00 | 8212185 | 2022-06-02 07:00:40.449262+00:00 | 446428865 | 446660788 | 2022-06-02 13:02:49.859990+00:00 | 41 | | 9180301 | 9180301 |
| 8 | 8485475 | 2090 | 2022-06-02-57859467 | 2022-06-02 06:00:00+00:00 | 7666769 | 2022-06-02 07:00:40.445490+00:00 | 446428863 | 446699156 | 2022-06-02 13:02:49.857138+00:00 | 48 | | 9180337 | 9180337 |
| 9 | 8491389 | 2093 | 2022-06-02-57859315 | 2022-06-02 07:00:00+00:00 | 79180201 | 2022-06-02 08:01:56.036039+00:00 | 446766229 | 446975488 | 2022-06-02 14:00:24.566688+00:00 | 42 | | 9180302 | 9180302 |
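
As an illustration of what can be done with `duration_minutes`, the sketch below groups the ten rides shown in this output by `siri_route_id` and prints the ride count and mean duration for each. The values are copied from the output; `sample_rides` and `durations_by_route` are hypothetical names, and with the full result the same loop could read the fields from each item of `siri_rides` instead.

```python
from collections import defaultdict
from statistics import mean

# (siri_route_id, duration_minutes) pairs copied from the ten rows in the output.
sample_rides = [
    (2093, 22), (2090, 38), (2090, 55), (2090, 55), (2093, 39),
    (2090, 56), (2090, 46), (2093, 41), (2090, 48), (2093, 42),
]

# Collect the durations of each route into a list.
durations_by_route = defaultdict(list)
for siri_route_id, duration_minutes in sample_rides:
    durations_by_route[siri_route_id].append(duration_minutes)

# Print route id, number of rides, and mean duration in minutes.
for siri_route_id, durations in sorted(durations_by_route.items()):
    print(siri_route_id, len(durations), mean(durations))
```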


## Choose a ride

There are usually a lot of rides for a given route. The following code gets the first ride which occurred at or after 7:00 in the morning (the start times are in UTC).

(The rides are already sorted in ascending order by scheduled start time, so the first one at or after 7:00 is the one we want.)

In [6]:
for siri_ride in siri_rides:
    if siri_ride['scheduled_start_time'].hour >= 7:
        break
siri_ride

{'id': 8491389,
 'siri_route_id': 2093,
 'journey_ref': '2022-06-02-57859315',
 'scheduled_start_time': datetime.datetime(2022, 6, 2, 7, 0, tzinfo=datetime.timezone.utc),
 'vehicle_ref': '79180201',
 'updated_first_last_vehicle_locations': datetime.datetime(2022, 6, 2, 8, 1, 56, 36039, tzinfo=datetime.timezone.utc),
 'first_vehicle_location_id': 446766229,
 'last_vehicle_location_id': 446975488,
 'updated_duration_minutes': datetime.datetime(2022, 6, 2, 14, 0, 24, 566688, tzinfo=datetime.timezone.utc),
 'duration_minutes': 42,
 'journey_gtfs_ride_id': None,
 'route_gtfs_ride_id': 9180302,
 'gtfs_ride_id': 9180302}
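
Note that when no ride starts at or after the chosen hour, the loop finishes without a `break` and `siri_ride` is left pointing at the last ride in the list. A possible alternative, sketched here with a hypothetical helper named `first_ride_at_or_after`, uses `next()` with a default so that the "no match" case returns `None` instead:

```python
import datetime


def first_ride_at_or_after(rides, hour):
    """Return the first ride whose scheduled start hour is >= hour, or None."""
    return next(
        (ride for ride in rides if ride["scheduled_start_time"].hour >= hour),
        None,
    )


utc = datetime.timezone.utc
# Three rides with ids and start times copied from the rides list output.
sample_rides = [
    {"id": 8482771, "scheduled_start_time": datetime.datetime(2022, 6, 2, 5, 30, tzinfo=utc)},
    {"id": 8485476, "scheduled_start_time": datetime.datetime(2022, 6, 2, 6, 0, tzinfo=utc)},
    {"id": 8491389, "scheduled_start_time": datetime.datetime(2022, 6, 2, 7, 0, tzinfo=utc)},
]

print(first_ride_at_or_after(sample_rides, 7)["id"])  # 8491389
print(first_ride_at_or_after(sample_rides, 9))        # None
```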

## Get the ride-stops for this ride

SIRI ride-stops contain the list of all stops on this ride. The vehicle locations are then related to the ride stops.

The result contains data from related tables:

* `gtfs_stop__city` / `gtfs_stop__name`: the stop city/name from the GTFS data
* `gtfs_ride_stop__departure_time`: the planned departure time based on the GTFS data
* `nearest_siri_vehicle_location__recorded_at_time`: the date/time from the SIRI data at which the relevant bus on this route/ride was nearest to this GTFS stop (based on lat/lon)

In [7]:
siri_ride_stops = stride.get('/siri_ride_stops/list', {
    'siri_ride_ids': str(siri_ride['id']),
    'order_by': 'order asc',
    'expand_related_data': True
}, pre_requests_callback='print')
df = pd.DataFrame(siri_ride_stops)
df.loc[:, [
    'order', 'gtfs_stop__city', 'gtfs_stop__name', 'gtfs_ride_stop__departure_time', 
    'nearest_siri_vehicle_location__recorded_at_time'
]]

https://open-bus-stride-api.hasadna.org.il/siri_ride_stops/list?siri_ride_ids=8491389&order_by=order+asc&expand_related_data=True


| | order | gtfs_stop__city | gtfs_stop__name | gtfs_ride_stop__departure_time | nearest_siri_vehicle_location__recorded_at_time |
|---|---|---|---|---|---|
| 0 | 1 | ירושלים | היכל המשפט/אבא אבן | | |
| 1 | 2 | ירושלים | בנייני האומה | | |
| 2 | 3 | ירושלים | גשר המיתרים/שד' הרצל | | |
| 3 | 4 | ירושלים | אצטדיון טדי/א''ס ביתר | | |
| 4 | 5 | גוש עציון | מחסום המנהרות | | |
| 5 | 6 | אפרת | אפרת שער צפוני/כניסה | | |
| 6 | 7 | אפרת | דוד המלך/הדבש | | |
| 7 | 9 | אפרת | שדרות רחל אמנו/שדרות דוד המלך | | |
| 8 | 10 | אפרת | הדקל/עזרא | | |
| 9 | 11 | אפרת | שדרות דוד המלך/השיירות | | |
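
In this output both time columns are empty for the chosen ride. When they are populated, the difference between the recorded time and the planned departure time gives an estimate of how late or early the bus was at each stop. The sketch below assumes both fields are timezone-aware `datetime` values. The timestamps are hypothetical and are not taken from the API, and `delay_minutes` is a hypothetical helper.

```python
import datetime


def delay_minutes(planned, recorded):
    """Return recorded - planned in minutes, or None if either value is missing."""
    if planned is None or recorded is None:
        return None
    return (recorded - planned).total_seconds() / 60


utc = datetime.timezone.utc
# Hypothetical timestamps for illustration only; they are not taken from the API.
planned = datetime.datetime(2022, 6, 2, 7, 0, tzinfo=utc)
recorded = datetime.datetime(2022, 6, 2, 7, 3, 30, tzinfo=utc)

print(delay_minutes(planned, recorded))  # 3.5
print(delay_minutes(planned, None))      # None
```

A positive result means the bus was recorded near the stop after the planned departure time, and a negative result means it was recorded before it.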
