Usage: 2.3. Reading Data: GTFS

Kasia Kozlowska edited this page Jul 14, 2022 · 5 revisions

Reading GTFS data

This page covers methods for reading GTFS (General Transit Feed Specification) data. The schema of GTFS data is described on the GTFS reference page. This page is available as a Jupyter notebook or as a wiki page.

A small sample of this data can be found in tests/test_data/gtfs.

GeNet ingests zipped or unzipped GTFS feeds. The following files are required in the unzipped folder, or inside the zip file:

  • calendar.txt
  • stop_times.txt
  • stops.txt
  • trips.txt
  • routes.txt
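Before handing a feed to GeNet, it can help to confirm that the required files are present. The following is a minimal sketch using only the Python standard library; `missing_gtfs_files` and `REQUIRED_GTFS_FILES` are hypothetical names introduced for illustration and are not part of GeNet. It assumes that, in a zipped feed, the files sit at the top level of the archive.

```python
import zipfile
from pathlib import Path

# The five files listed above as required (hypothetical constant, not a GeNet name).
REQUIRED_GTFS_FILES = {
    "calendar.txt",
    "stop_times.txt",
    "stops.txt",
    "trips.txt",
    "routes.txt",
}


def missing_gtfs_files(feed_path):
    """Return a sorted list of required GTFS files absent from a folder or zip."""
    path = Path(feed_path)
    if path.is_dir():
        present = {p.name for p in path.iterdir() if p.is_file()}
    elif zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            # Assumption: files are stored at the top level of the archive.
            present = set(zf.namelist())
    else:
        raise ValueError(f"{feed_path} is neither a folder nor a zip file")
    return sorted(REQUIRED_GTFS_FILES - present)
```

An empty list means all required files were found; anything else names the files to add before reading the feed.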

When reading a GTFS feed, GeNet expects a date in YYYYMMDD format. It will raise an error if the selected date yields no services.
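To illustrate what "a date that yields no services" means, here is a rough sketch that selects the services in calendar.txt running on a given YYYYMMDD date. It follows the GTFS specification for calendar.txt (a date range plus one 0/1 column per weekday) and is not GeNet's own implementation; `active_service_ids` is a hypothetical helper, and calendar_dates.txt exceptions are ignored.

```python
import csv
import io
from datetime import datetime

# Weekday column names in calendar.txt, ordered to match datetime.weekday().
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def active_service_ids(calendar_csv_text, day):
    """Return service_ids from calendar.txt content that run on `day` (YYYYMMDD)."""
    # strptime tolerates single-digit months and days, so check the shape first.
    if len(day) != 8 or not day.isdigit():
        raise ValueError(f"Date {day!r} is not in YYYYMMDD format")
    parsed = datetime.strptime(day, "%Y%m%d")  # rejects impossible dates
    weekday_column = WEEKDAYS[parsed.weekday()]

    active = []
    for row in csv.DictReader(io.StringIO(calendar_csv_text)):
        # Eight-digit YYYYMMDD strings order correctly as plain strings.
        in_range = row["start_date"] <= day <= row["end_date"]
        if in_range and row[weekday_column] == "1":
            active.append(row["service_id"])

    if not active:
        raise ValueError(f"No services run on {day}")
    return active
```

Running a check like this on the contents of calendar.txt before reading the feed shows whether the chosen date falls inside the validity period of the feed.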

GeNet supports extracting services, routes and stops from the output genet.Schedule object based on a geographical area (methods: services_on_spatial_condition, routes_on_spatial_condition, stops_on_spatial_condition; see the notebook on using the genet Network for more information). Alternatively, you may prefer to subset the feed with gtfs-lib before ingesting it into GeNet.

The user assumes responsibility for the quality of their input GTFS feed. Various validation tools can be run on a GTFS feed before it is used with GeNet; see this page for a summary of validation tools.

from genet import read_gtfs

We read the GTFS feed into a Schedule for the selected date:

s = read_gtfs('../example_data/example_gtfs', '20190603')
2022-07-14 15:27:50,737 - Reading GTFS from ../example_data/example_gtfs
2022-07-14 15:27:50,740 - Reading the calendar for GTFS
2022-07-14 15:27:50,743 - Reading GTFS data into usable format
2022-07-14 15:27:50,745 - Reading stop times
2022-07-14 15:27:50,784 - Reading trips
2022-07-14 15:27:50,810 - Reading stops
2022-07-14 15:27:50,823 - Reading routes

GTFS is assumed to be in epsg:4326; you need to reproject the Schedule to the projection you require.

s.reproject('epsg:27700')
s.print()
Schedule:
Number of services: 2
Number of routes: 2
Number of stops: 4
# s.plot()