### Install dependencies

This notebook requires the `open-bus-stride-client` package, which can be installed with `pip install open-bus-stride-client`.

You can also launch it online at [this URL](https://mybinder.org/v2/gh/hasadna/open-bus-stride-client/main?labpath=notebooks%2Fload%20siri%20vehicle%20locations%20to%20pandas%20dataframe.ipynb). When launched online, the dependencies are already installed.


In [1]:
# !pip install open-bus-stride-client

In [2]:
import pandas as pd
import datetime
from dateutil import tz
import folium

pd.options.display.max_columns = 1000
pd.options.display.max_colwidth = 1000

import stride

### Find a route to investigate

Because SIRI data doesn't include `route_short_name` (the bus line number), we will use the GTFS data to find a route.

Let's look for line number `56` (Metropoline), whose `route_mkt` is `23056`.

In [11]:
routes_df = pd.DataFrame(stride.get('/gtfs_routes/list', {
    'route_mkt': '23056',
    'date_from': '2025-02-19',
    'date_to': '2025-02-19',
}))

routes_df

| | id | date | line_ref | operator_ref | route_short_name | route_long_name | route_mkt | route_direction | route_alternative | agency_name | route_type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6452370 | 2025-02-19 | 979 | 15 | 56 | מסוף רדינג/רציפים-תל אביב יפו<->מסוף יהוד/הורדה-יהוד מונוסון-10 | 23056 | 1 | 0 | מטרופולין | 3 |
| 1 | 6452371 | 2025-02-19 | 980 | 15 | 56 | העצמאות/ הרצל-יהוד מונוסון<->מסוף רדינג/הורדה-תל אביב יפו-20 | 23056 | 2 | 0 | מטרופולין | 3 |


In [12]:
routes_df = routes_df[routes_df['route_long_name'].apply(lambda s: "רדינג" in s)]
routes_df = routes_df[routes_df['route_direction'] == '1']
routes_df

| | id | date | line_ref | operator_ref | route_short_name | route_long_name | route_mkt | route_direction | route_alternative | agency_name | route_type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6452370 | 2025-02-19 | 979 | 15 | 56 | מסוף רדינג/רציפים-תל אביב יפו<->מסוף יהוד/הורדה-יהוד מונוסון-10 | 23056 | 1 | 0 | מטרופולין | 3 |


In [13]:
line_ref = routes_df['line_ref'].iloc[0]

### Get rides data

We use the stride `iterate` method to efficiently iterate over a possibly long list of results.

Behind the scenes it uses the offset/limit parameters to page through the results, so you don't have to manage pagination yourself.

We pass the iterator directly to Pandas to create a DataFrame.
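
The offset/limit paging described above can be sketched roughly as follows. This is not the stride client's actual implementation: `iterate_pages` and `fake_fetch` are hypothetical names, and an in-memory list stands in for the API so the sketch runs without network access.

```python
from typing import Callable, Dict, Iterator, List


def iterate_pages(fetch_page: Callable[[int, int], List[Dict]],
                  page_size: int = 100,
                  limit: int = 1000) -> Iterator[Dict]:
    """Yield up to `limit` items, requesting at most `page_size` items per call."""
    offset = 0
    yielded = 0
    while yielded < limit:
        # Never ask for more items than are still allowed by `limit`
        page = fetch_page(offset, min(page_size, limit - yielded))
        if not page:
            break  # the source has no more results
        for item in page:
            yield item
            yielded += 1
        offset += len(page)


# Hypothetical stand-in for the API: 250 records held in memory
records = [{'id': i} for i in range(250)]


def fake_fetch(offset: int, limit: int) -> List[Dict]:
    """Return one page of records, like an endpoint that accepts offset/limit."""
    return records[offset:offset + limit]


items = list(iterate_pages(fake_fetch, page_size=100, limit=1000))
print(len(items))  # 250
```

Because the function is a generator, a consumer such as `pd.DataFrame(...)` pulls pages only as it needs them.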

In [17]:
siri_vehicle_locations_480 = pd.DataFrame(stride.iterate('/siri_vehicle_locations/list', {
    'siri_routes__line_ref': line_ref,
    'siri_rides__schedualed_start_time_from': datetime.datetime(2025, 2, 18, tzinfo=tz.gettz('Israel')),
    'siri_rides__schedualed_start_time_to': datetime.datetime(2025, 2, 19, tzinfo=tz.gettz('Israel')) + datetime.timedelta(days=1),
    'order_by': 'recorded_at_time desc'
}, limit=1000))

siri_vehicle_locations_480.shape

(100, 22)

In [7]:
siri_vehicle_locations_480[[
    'recorded_at_time', 'siri_route__line_ref',
    'siri_route__operator_ref', 'siri_ride__scheduled_start_time',
    'lon', 'lat', 'siri_ride__vehicle_ref',
]].head()

| | recorded_at_time | siri_route__line_ref | siri_route__operator_ref | siri_ride__scheduled_start_time | lon | lat | siri_ride__vehicle_ref |
|---|---|---|---|---|---|---|---|
| 0 | 2025-02-19 14:46:20+00:00 | 979 | 15 | 2025-02-19 13:30:00+00:00 | 34.866886 | 32.045742 | 51430903 |
| 1 | 2025-02-19 14:46:07+00:00 | 979 | 15 | 2025-02-19 14:00:00+00:00 | 34.809917 | 32.058898 | 51430103 |
| 2 | 2025-02-19 14:46:05+00:00 | 979 | 15 | 2025-02-19 14:30:00+00:00 | 34.781261 | 32.081954 | 51432803 |
| 3 | 2025-02-19 14:46:01+00:00 | 979 | 15 | 2025-02-19 13:15:00+00:00 | 34.892562 | 32.031631 | 53130103 |
| 4 | 2025-02-19 14:45:59+00:00 | 979 | 15 | 2025-02-19 14:15:00+00:00 | 34.790687 | 32.073394 | 36750901 |


The date columns are in UTC, so let's convert them to the Israel timezone.

In [8]:
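def localize_dates(data, dt_columns=None):
    """Return a copy of `data` with the given datetime columns converted to Israel time."""
    if dt_columns is None:
        dt_columns = []

    # Work on a copy so the caller's DataFrame is left unchanged
    data = data.copy()

    for c in dt_columns:
        data[c] = pd.to_datetime(data[c]).dt.tz_convert('Israel')

    return data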

In [9]:
dt_columns = ['recorded_at_time','siri_ride__scheduled_start_time']

siri_vehicle_locations_480 = localize_dates(siri_vehicle_locations_480, dt_columns)
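
As a quick check of what this conversion does, the sketch below converts a single value, the first `recorded_at_time` from the output above. It uses the IANA zone name `Asia/Jerusalem` rather than the `Israel` alias used in the notebook; that choice is an assumption made for the example, and both names should resolve to the same zone.

```python
import pandas as pd

# One UTC timestamp taken from the sample output above
utc_times = pd.Series(pd.to_datetime(['2025-02-19 14:46:20+00:00']))

# tz_convert keeps the same instant and only changes how it is displayed
local_times = utc_times.dt.tz_convert('Asia/Jerusalem')

print(local_times.iloc[0])  # 2025-02-19 16:46:20+02:00
```

In February Israel is at UTC+2, so the wall-clock time moves forward by two hours while the underlying instant stays the same.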

In [10]:
# Create an enhanced map visualization
def create_enhanced_bus_locations_map(locations_df):
    # Calculate the center of the map (mean of coordinates)
    center_lat = locations_df['lat'].mean()
    center_lon = locations_df['lon'].mean()
    
    # Create a map centered on the mean position
    m = folium.Map(location=[center_lat, center_lon], 
                  zoom_start=13,
                  tiles='cartodbpositron')  # Using a cleaner map style
    
    # Add a timestamp to show data freshness
    latest_time = locations_df['recorded_at_time'].max()
    earliest_time = locations_df['recorded_at_time'].min()
    
    title_html = f'''
        <div style="position: fixed; 
                    top: 10px; left: 50px; width: 300px; height: 60px; 
                    z-index:9999; font-size:14px; background-color: white;
                    padding: 10px; border-radius: 5px;">
            <b>Bus Locations Data</b><br>
            Time Range: {earliest_time.strftime('%H:%M:%S')} - {latest_time.strftime('%H:%M:%S')}
        </div>
    '''
    m.get_root().html.add_child(folium.Element(title_html))
    
    # Create a feature group for bus markers
    bus_locations = folium.FeatureGroup(name="Bus Locations")
    
    # Add markers for each bus location with enhanced information
    for idx, row in locations_df.iterrows():
        # Create detailed popup text
        popup_text = f"""
        <b>Bus Details:</b><br>
        Time: {row['recorded_at_time'].strftime('%H:%M:%S')}<br>
        Speed: {row['velocity']:.1f} km/h<br>
        Bearing: {row['bearing']}°<br>
        Distance from start: {row['distance_from_journey_start']:.1f}m
        """
        
        # Create a circle marker at the bus position
        folium.CircleMarker(
            location=[row['lat'], row['lon']],
            radius=8,
            popup=folium.Popup(popup_text, max_width=200),
            tooltip="Click for details",
            color='blue',
            fill=True,
            fill_color='blue',
            fill_opacity=0.7,
            weight=2
        ).add_to(bus_locations)
        
        # Add a small triangle rotated to indicate direction (bearing)
        if pd.notna(row['bearing']):
            folium.RegularPolygonMarker(
                location=[row['lat'], row['lon']],
                number_of_sides=3,
                radius=4,
                rotation=row['bearing'],
                color='red',
                fill=True,
                fill_color='red'
            ).add_to(bus_locations)
    
    bus_locations.add_to(m)
    
    # Add layer control
    folium.LayerControl().add_to(m)
    
    return m

# Create and display the enhanced map
bus_map = create_enhanced_bus_locations_map(siri_vehicle_locations_480)
display(bus_map)

It looks great! (*note 18/03/2022 is Friday*)

Now we can use Pandas to get some information from this data.
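
As one possible starting point, the sketch below works on the five sample rows shown earlier rather than the full DataFrame. It computes how long after its scheduled start each ride reported a location, and summarizes the records per vehicle. The names `sample`, `minutes_since_start`, `n_records` and `last_seen` are made up for this example.

```python
import pandas as pd

# The five sample rows from the output above (only the columns used here)
sample = pd.DataFrame({
    'recorded_at_time': pd.to_datetime([
        '2025-02-19 14:46:20+00:00', '2025-02-19 14:46:07+00:00',
        '2025-02-19 14:46:05+00:00', '2025-02-19 14:46:01+00:00',
        '2025-02-19 14:45:59+00:00']),
    'siri_ride__scheduled_start_time': pd.to_datetime([
        '2025-02-19 13:30:00+00:00', '2025-02-19 14:00:00+00:00',
        '2025-02-19 14:30:00+00:00', '2025-02-19 13:15:00+00:00',
        '2025-02-19 14:15:00+00:00']),
    'siri_ride__vehicle_ref': [51430903, 51430103, 51432803, 53130103, 36750901],
})

# Minutes elapsed between the scheduled start and the recorded location
elapsed = sample['recorded_at_time'] - sample['siri_ride__scheduled_start_time']
sample['minutes_since_start'] = elapsed.dt.total_seconds() / 60

# Number of records and most recent record per vehicle
summary = (sample.groupby('siri_ride__vehicle_ref')['recorded_at_time']
           .agg(n_records='count', last_seen='max'))

print(sample[['siri_ride__vehicle_ref', 'minutes_since_start']])
print(summary)
```

The same expressions should apply to `siri_vehicle_locations_480`, since it contains these columns.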

### Notes and Resources

`siri_rides/list`:
- `siri_route_ids`: can be a comma-separated string containing a list of route ids.
- All date/time parameters must include a timezone (for example: `datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=1)`).
- `order_by`: any field can be specified, followed by an `asc` or `desc` specifier; separate multiple values with commas.
- `limit`: any number can be specified, since pagination is handled behind the scenes; the default is 10,000.
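
The notes above could be applied when preparing request parameters, roughly as in this sketch. The helpers `join_ids` and `require_timezone` are hypothetical, the route ids are made up, and the field names in the `order_by` value are assumptions; check the API documentation for the fields that actually exist.

```python
import datetime


def join_ids(ids):
    """Turn a list of ids into a comma-separated string."""
    return ','.join(str(i) for i in ids)


def require_timezone(dt):
    """Return `dt` unchanged, or raise if it is a naive datetime."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError(f'datetime {dt!r} has no timezone')
    return dt


params = {
    'siri_route_ids': join_ids([101, 102, 103]),    # made-up ids
    'order_by': 'scheduled_start_time desc,id asc',  # assumed field names
}

# A timezone-aware "one day ago" value, as in the note above
yesterday = require_timezone(
    datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=1))

print(params['siri_route_ids'])  # 101,102,103
```

Validating datetimes before sending a request surfaces a missing timezone as a clear local error instead of a failed API call.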

Documentation: https://open-bus-stride-api.hasadna.org.il/docs#/