# [LADOT](https://www.ladotbus.com/) GTFS Realtime Vehicle Positions

GTFS Realtime is a feed specification that lets public transportation agencies provide realtime updates about their fleet to application developers. It is used by transit agencies around the world and is a widely adopted open data standard in the public transportation industry.

We will be exploring Vehicle Positions from [LADOT (Los Angeles Department of Transportation)](https://www.ladotbus.com/).

## Imports

In [1]:
from google.transit import gtfs_realtime_pb2
from protobuf_to_dict import protobuf_to_dict
import requests
from collections import OrderedDict
import pandas as pd
import folium

## URLs

Vehicle Positions: real-time information about the vehicles, including location and congestion level.

In [2]:
# gtfs_schedule_url = "https://ladotbus.com/gtfs"
ladot_vehicle_positions_url = "https://ladotbus.com/gtfs-rt/vehiclepositions"
# ladot_service_alerts_url = "https://ladotbus.com/gtfs-rt/alerts"
# ladot_trip_updates_url = "https://ladotbus.com/gtfs-rt/tripupdates"

## Request

We will download the GTFS Realtime feed from our URL, parse it as a [FeedMessage (the root type of the GTFS-realtime schema)](https://developers.google.com/transit/gtfs-realtime/reference), and inspect the results. The code snippet is adapted from the [Google Transit APIs](https://developers.google.com/transit/gtfs-realtime/examples/python-sample) Python sample.

In [3]:
# Get FeedMessage from url
feed = gtfs_realtime_pb2.FeedMessage()
response = requests.get(ladot_vehicle_positions_url, timeout=30)
response.raise_for_status()  # stop on an HTTP error instead of parsing an error page
feed.ParseFromString(response.content)

# Let's look at one entity
feed.entity[0]

id: "vehicle_6771"
vehicle {
  trip {
    trip_id: "30-neVETEJmxy1"
    start_time: "21:00:00"
    start_date: "20220605"
    direction_id: 1
  }
  position {
    latitude: 34.10550308227539
    longitude: -118.29170989990234
    bearing: 6.942473888397217
    speed: 5.811520099639893
  }
  timestamp: 1654491529
  vehicle {
    id: "6771"
    label: "15344"
  }
  occupancy_status: EMPTY
}

## Data

Each entity contains the following information:

- `id`
- `vehicle`
    - `trip`
        - `trip_id`
        - `start_time`
        - `start_date`
        - `direction_id`: integer; direction of travel for the trip, matching `direction_id` in the GTFS `trips.txt` file; e.g. `0` or `1`
    - `position`
        - `latitude`
        - `longitude`
        - `bearing`
        - `speed`
    - `timestamp`
    - `vehicle`
        - `id`
        - `label`
    - `occupancy_status`
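
A few of these fields are easier to read once decoded. The sketch below uses values from the sample entity shown earlier, so it runs without contacting the feed. It assumes what the GTFS Realtime reference specifies: `timestamp` is POSIX time in seconds, `speed` is in meters per second, and `occupancy_status` is an enum whose first values are listed in `OCCUPANCY_NAMES`. Both `OCCUPANCY_NAMES` and `describe_vehicle` are illustrative names, not part of any library.

```python
from datetime import datetime, timezone

# First values of the GTFS Realtime OccupancyStatus enum (assumed from the spec).
OCCUPANCY_NAMES = {
    0: "EMPTY",
    1: "MANY_SEATS_AVAILABLE",
    2: "FEW_SEATS_AVAILABLE",
    3: "STANDING_ROOM_ONLY",
    4: "CRUSHED_STANDING_ROOM_ONLY",
    5: "FULL",
    6: "NOT_ACCEPTING_PASSENGERS",
}

METERS_PER_SECOND_PER_MPH = 0.44704  # 1 mph is exactly 0.44704 m/s


def describe_vehicle(timestamp, speed, occupancy_status):
    """Decode raw feed values into more readable ones."""
    return {
        "observed_at_utc": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "speed_mph": speed / METERS_PER_SECOND_PER_MPH,
        "occupancy": OCCUPANCY_NAMES.get(occupancy_status, "UNKNOWN"),
    }


# Values copied from the sample entity.
sample = describe_vehicle(
    timestamp=1654491529,
    speed=5.811520099639893,
    occupancy_status=0,
)
print(sample["observed_at_utc"].isoformat())
print(round(sample["speed_mph"], 2))
print(sample["occupancy"])
```

The decoded time is in UTC. LADOT operates in Pacific time, so the local service date in `start_date` can differ from the UTC date.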

## Convert feed to dataframe

We want to convert the GTFS Realtime protobuf message into a more usable tabular format. We first convert the message to a nested dictionary, then use a `for` loop to iterate over its entities, collect one row per entity, and build a pandas dataframe from the collected rows.

In [4]:
dict_obj = protobuf_to_dict(feed)

In [5]:
collector = []

for block in dict_obj['entity']:
    row = OrderedDict()
    # vehicle blocks (these fields are optional, so fall back to empty dicts)
    vehicle_position = block['vehicle']
    trip = vehicle_position.get('trip', {})
    position = vehicle_position.get('position', {})
    vehicle = vehicle_position.get('vehicle', {})

    # id: the vehicle id (e.g. "6771"), not the entity id (e.g. "vehicle_6771")
    row['id'] = vehicle.get('id', '')
    # trip
    row['trip_id'] = trip.get('trip_id', '')
    row['start_time'] = trip.get('start_time', '')
    row['start_date'] = trip.get('start_date', '')
    row['direction_id'] = trip.get('direction_id', '')
    # position
    row['latitude'] = position.get('latitude', '')
    row['longitude'] = position.get('longitude', '')
    row['bearing'] = position.get('bearing', '')
    row['speed'] = position.get('speed', '')
    # timestamp
    row['timestamp'] = vehicle_position.get('timestamp', '')
    # vehicle label
    row['label'] = vehicle.get('label', '')
    # occupancy_status
    row['occupancy_status'] = vehicle_position.get('occupancy_status', '')

    collector.append(row)

df = pd.DataFrame(collector)
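
The loop above names every column by hand. As an alternative, a nested dictionary can be flattened generically. The sketch below is a minimal illustration using only the standard library. `flatten` is a hypothetical helper, and the sample dictionary is hand-written to mirror the shape of the entity shown earlier rather than taken from a live feed.

```python
def flatten(nested, parent_key="", sep="_"):
    """Flatten a nested dict into a single-level dict with joined keys."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, carrying the key prefix along.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hand-written sample in the shape of one feed entity.
entity = {
    "id": "vehicle_6771",
    "vehicle": {
        "trip": {"trip_id": "30-neVETEJmxy1", "direction_id": 1},
        "position": {"latitude": 34.105503, "longitude": -118.29171},
        "timestamp": 1654491529,
        "vehicle": {"id": "6771", "label": "15344"},
    },
}

flat_row = flatten(entity)
for key, value in flat_row.items():
    print(key, value)
```

The prefixed keys keep the entity `id` and the vehicle `id` in separate columns. A list of such flat dictionaries can be passed to `pd.DataFrame` in the same way as `collector`.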

In [6]:
df

| | id | trip_id | start_time | start_date | direction_id | latitude | longitude | bearing | speed | timestamp | label | occupancy_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6771 | 30-neVETEJmxy1 | 21:00:00 | 20220605 | 1 | 34.105503 | -118.29171 | 6.942474 | 5.81152 | 1654491529 | 15344 | 0 |
| 1 | 6454 | 30-tY5TJca356u | 21:17:00 | 20220605 | 1 | 34.042389 | -118.183632 | 270.732178 | 6.7056 | 1654491526 | 9314 | 0 |
| 2 | 6449 | 30-AcWWM07NUEc | 21:07:00 | 20220605 | 1 | 34.033974 | -118.271355 | 119.686852 | 8.04672 | 1654491528 | 9322 | 0 |
| 3 | 3805 | 30-fZ4AtJ0uD_x | 18:50:00 | 20220605 | 0 | 34.018772 | -118.237259 | 217.0 | 0.0 | 1654486958 | 17305 | 0 |
| 4 | 709 | 30-i056GJllk0c | 21:47:00 | 20220605 | 0 | 34.043827 | -118.277367 | 29.014027 | 8.04672 | 1654491526 | 13325 | 0 |
| 5 | 1631 | 30-A-jTbKPPYy0 | 21:37:00 | 20220605 | 1 | 34.063576 | -118.272362 | 207.491013 | 11.176 | 1654491528 | 15343 | 0 |
| 6 | 708 | 30-aehmDrhJlQE | 21:02:00 | 20220605 | 0 | 34.081421 | -118.254539 | 8.71995 | 6.25856 | 1654491528 | 12335 | 0 |
| 7 | 1641 | 30-9S7mh9SWI0N | 21:50:00 | 20220605 | 0 | 34.114105 | -118.290352 | 209.584473 | 4.02336 | 1654491526 | 15346 | 1 |
| 8 | 6242 | 183-WjuweietML | 21:00:00 | 20220605 | 0 | 34.052723 | -118.235657 | 98.0 | 3.57632 | 1654491527 | 20325 | 0 |
| 9 | 6227 | 183-PZKM-tf2We | 20:45:00 | 20220605 | 0 | 34.054436 | -118.246849 | 112.0 | 0.89408 | 1654491526 | 20309 | 0 |
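
Not every position in the feed is fresh. In the output above, the vehicle in row 3 reports a timestamp more than an hour older than the others and a speed of 0. The sketch below shows one way to flag such rows. It uses plain dictionaries with values copied from three rows of the output so that it runs without the feed. The function name `split_by_age` and the 10-minute threshold are assumptions chosen for illustration.

```python
MAX_AGE_SECONDS = 10 * 60  # assumed threshold; tune for your use case


def split_by_age(rows, reference_time, max_age=MAX_AGE_SECONDS):
    """Split rows into (fresh, stale) lists by age relative to reference_time."""
    fresh, stale = [], []
    for row in rows:
        age = reference_time - row["timestamp"]
        if age > max_age:
            stale.append(row)
        else:
            fresh.append(row)
    return fresh, stale


# Values copied from rows 0, 3 and 4 of the dataframe output.
rows = [
    {"id": "6771", "timestamp": 1654491529, "speed": 5.81152},
    {"id": "3805", "timestamp": 1654486958, "speed": 0.0},
    {"id": "709", "timestamp": 1654491526, "speed": 8.04672},
]

# The newest timestamp in the sample serves as the reference time here;
# with a live feed, the feed header timestamp or the current time could be used.
reference_time = max(row["timestamp"] for row in rows)
fresh, stale = split_by_age(rows, reference_time)
print([row["id"] for row in fresh])
print([row["id"] for row in stale])
```

The same comparison applies to the `timestamp` column of `df`.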


## Visualization using folium

In [7]:
this_map = folium.Map(prefer_canvas=True)

def plotDot(point):
    '''input: a row (Series) with numeric latitude and longitude values and an id
    this function creates a CircleMarker and adds it to this_map'''
    folium.CircleMarker(location=[point.latitude, point.longitude],
                        popup=point.id,
                        radius=5,
                        weight=5).add_to(this_map)

# use df.apply(..., axis=1) to "iterate" through every row in the dataframe
df.apply(plotDot, axis=1)


# Zoom the map so that all of the plotted markers fit in view
this_map.fit_bounds(this_map.get_bounds())

# Save the map to an HTML file (the html_map_output directory must already exist)
this_map.save('html_map_output/simple_dot_plot.html')

this_map
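
`fit_bounds` takes the south-west and north-east corners of the area to show. As a rough illustration of what `get_bounds()` provides for a map holding only point markers, the sketch below computes the same kind of bounding box from a few coordinates copied from the dataframe. `bounding_box` is a hypothetical helper, not a folium function.

```python
def bounding_box(points):
    """Return [[south, west], [north, east]] for a list of (lat, lon) pairs."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return [[min(lats), min(lons)], [max(lats), max(lons)]]


# (latitude, longitude) pairs copied from the first three rows of the dataframe.
points = [
    (34.105503, -118.29171),
    (34.042389, -118.183632),
    (34.033974, -118.271355),
]

bounds = bounding_box(points)
print(bounds)
```

The result has the shape that `this_map.fit_bounds(bounds)` accepts.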