# Parsing GTFS format transit data in real time

In [7]:
from google.transit import gtfs_realtime_pb2
import requests

## Building a barebones data feed

1. Initialize the FeedMessage parser from Google
2. Get the response from the API
3. Pass the response to the parser
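
The three steps above can be sketched as a single function. This is a minimal sketch, not the notebook's own code: `load_feed`, `http_get`, `message_factory`, and the 10-second `timeout` are hypothetical names and values chosen for illustration. The HTTP getter and the message class are passed in as arguments so the function can be tried offline with stand-ins. In this notebook they would be `requests.get` and `gtfs_realtime_pb2.FeedMessage`.

```python
from typing import Any, Callable


def load_feed(url: str,
              http_get: Callable[..., Any],
              message_factory: Callable[[], Any],
              timeout: float = 10.0) -> Any:
    """Fetch a GTFS-realtime feed and parse it into a fresh message.

    `http_get` and `message_factory` are injected (hypothetical parameter
    names); in the notebook they would be `requests.get` and
    `gtfs_realtime_pb2.FeedMessage`.
    """
    # Step 1: initialize an empty message to parse into.
    feed = message_factory()

    # Step 2: get the response, failing early on HTTP errors such as 404.
    response = http_get(url, allow_redirects=True, timeout=timeout)
    response.raise_for_status()

    # Step 3: parse the raw bytes (not the decoded text) into the message.
    feed.ParseFromString(response.content)
    return feed
```

With the real libraries imported, a call would look like `load_feed(url, requests.get, gtfs_realtime_pb2.FeedMessage)`. Creating a fresh message inside the function also avoids accidentally reading from an old or empty `feed` object when cells are re-run out of order.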

## Initialize an instance of FeedMessage
Google's library defines a FeedMessage class. We’ll fill an instance with data later, but right now we just need to create an empty one.

In [11]:
feed = gtfs_realtime_pb2.FeedMessage()

## Get the response from the API

In [12]:
response = requests.get('http://files.transport.act.gov.au/feeds/lightrail.pb', allow_redirects=True)

## Pass the response to the parser
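The FeedMessage class has a ParseFromString() method to read in the data. It expects raw bytes, so pass `response.content` rather than `response.text`. The number printed below the cell is the method's return value, the count of bytes parsed.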

In [9]:
feed.ParseFromString(response.content)

505

The parsed data is now available in the entity attribute:

In [10]:
feed.entity[0]

id: "12065010"
vehicle {
  trip {
    trip_id: "686"
  }
  position {
    latitude: -35.185970306396484
    longitude: 149.1375274658203
    odometer: 10299584.0
    speed: 5.308333396911621
  }
  current_stop_sequence: 2
  timestamp: 1558174560
  congestion_level: RUNNING_SMOOTHLY
  vehicle {
    id: "13"
    label: "LRV13"
    license_plate: "LRV13"
  }
}
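
The raw fields above are easier to read after a couple of conversions. In GTFS-realtime, `timestamp` is POSIX time in seconds and `position.speed` is in metres per second. The sketch below assumes an entity shaped like the one printed above. `describe_vehicle` is a hypothetical helper, not part of any library, and it only reads attributes, so it works on any object with the same layout.

```python
from datetime import datetime, timezone


def describe_vehicle(entity) -> dict:
    """Return a few human-friendly values from a vehicle-position entity.

    Hypothetical helper for illustration; assumes `entity.vehicle` is set.
    """
    vehicle = entity.vehicle
    return {
        "vehicle_label": vehicle.vehicle.label,
        "trip_id": vehicle.trip.trip_id,
        # POSIX seconds -> timezone-aware UTC datetime.
        "observed_at_utc": datetime.fromtimestamp(vehicle.timestamp,
                                                  tz=timezone.utc),
        # Metres per second -> kilometres per hour.
        "speed_kmh": vehicle.position.speed * 3.6,
    }
```

Calling `describe_vehicle(feed.entity[0])` on a parsed feed would return a plain dictionary that is convenient to log or load into a table.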

## Use `trip_update`
Not every entity in the feed carries a real-time trip update. An entity can instead hold a vehicle position, like the one shown above, or a service alert.

If you want to focus only on data that updates a currently ongoing revenue trip, filter for `trip_update` using each entity's `HasField` method.

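For example, in my data today, only 218 of my 373 entities were actually trip updates. (The counts in the cells below show 0 because, judging by the cell numbers, they were run after `feed` was re-initialized and before it was parsed again, so the feed was empty.)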

In [13]:
len(feed.entity)

0

In [14]:
sum([1 for ent in feed.entity if ent.HasField('trip_update')])

0

In [15]:
sum([1 for ent in feed.entity if not ent.HasField('trip_update')])

0
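
Instead of counting one field at a time, the same `HasField` check can tally every payload type in one pass. This is a minimal sketch: `count_payloads` and `PAYLOAD_FIELDS` are hypothetical names, and the tuple lists only the three core payload fields of a GTFS-realtime entity (`trip_update`, `vehicle`, `alert`).

```python
from collections import Counter

# The three core payload fields a GTFS-realtime entity can carry.
PAYLOAD_FIELDS = ("trip_update", "vehicle", "alert")


def count_payloads(entities) -> Counter:
    """Count how many entities carry each payload field.

    Hypothetical helper; works on any iterable of objects that
    expose a protobuf-style HasField(name) method.
    """
    counts = Counter()
    for entity in entities:
        for field in PAYLOAD_FIELDS:
            if entity.HasField(field):
                counts[field] += 1
    return counts
```

Calling `count_payloads(feed.entity)` on a parsed feed gives a quick overview of what the feed contains. An empty result is a hint that the feed object was never parsed.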