# Quick Start: Downloading and Processing GTFS Feeds

### 🚀 Install dependencies (if you're running this on Google Colab)

In [5]:
!pip install geopandas
!pip install git+https://github.com/MiguelUrenaPliego/pyGTFSHandler.git
!pip install folium matplotlib mapclassify

## Import packages

In [2]:
from pyGTFSHandler import GTFS
from pyGTFSHandler.downloaders.spain.NAP import APIClient
import os

## Create the download path for your gtfs files

In [4]:
gtfs_download_path = './gtfs_files'
if os.path.isdir(gtfs_download_path) == False:
    os.makedirs(gtfs_download_path)

## Option 1: Access the Spanish NAP Official API to download GTFS Feeds  anywhere in Spain

To use this option, you need to obtain an API key from the official portal: [https://nap.transportes.gob.es/](https://nap.transportes.gob.es/)


In [6]:
api_key = 'b753bf33-9300-4bee-8c3d-8e35009de69c'
api = APIClient(api_key)

In [7]:
files = api.find_files(
    region='Vitoria-Gasteiz', # Name of the region you want to explore
    region_type="municipality", # Type of region 'municipality', 'urbanarea', 'province' or 'state'
    transport_type='bus', # Type of transportation system 'bus', 'rail', 'boat' or 'plane'
    file_description='urbano', # Text that the gtfs feed has to include in its name or description
    start_date='today', # The gtfs feed has to include the range of dates specified here
    end_date='today' # Use today for today's date
)

for file in files:
    print(file['nombre']) # print the names of the feeds that were found

Alavabus
Autobús interurbano de Navarra
Transportes Urbanos de Vitoria (TUVISA)


In [8]:
file_paths = api.download_file( # Download the feeds found before
    file_ids=files,
    output_path=gtfs_download_path
)

File gtfs_files/alavabus downloaded successfully.
File gtfs_files/autobus_interurbano_de_navarra downloaded successfully.
File gtfs_files/transportes_urbanos_de_vitoria_tuvisa downloaded successfully.


## Option 2: Execute this command to download an example feed or upload your own feed to the `gtfs_download_path folder`

Download an example gtfs feed with this cell

In [11]:
url = "https://github.com/MiguelUrenaPliego/pyGTFSHandler/raw/refs/heads/main/examples/transportes_urbanos_de_vitoria_tuvisa.zip"
!curl -L -o {os.path.join(gtfs_download_path, 'transportes_urbanos_de_vitoria_tuvisa.zip')} {url}

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 9718k  100 9718k    0     0  6999k      0  0:00:01  0:00:01 --:--:-- 6999k


Extract all .zip files in the `gtfs_download_path`

In [14]:
import zipfile

file_paths = [] # list of gtfs feeds path you want to process

# Loop through files in the directory
for file in os.listdir(gtfs_download_path):
    # Check if the file has a .zip extension
    if file.endswith('.zip'):
        zip_file_path = os.path.join(gtfs_download_path, file)
        # Define the extraction path by removing the '.zip' extension
        extract_path = os.path.join(gtfs_download_path, file.replace('.zip', ''))
        file_paths.append(extract_path)
        # Extract the contents of the zip file
        with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
            zip_ref.extractall(extract_path)

file_paths

['./gtfs_files/transportes_urbanos_de_vitoria_tuvisa']

## GTFS object creation

In [15]:
gtfs = GTFS(
    file_paths,
    service_date='max', # choose the day with the maximum number of services (trips * stops) or write a date in %d-%m-%Y format.
    start_time='06:00:00', # time bounds to select trips from
    end_time='20:00:00',
    stop_group_distance=200, # stops will be clustered with this distance
    trip_group_distance=500, # lines are considered to be overlapping if theur stops are less than this distance apart
    trip_group_overlap = 0.5, # lines will be considered branches of the same line if they overlap more that this percentage of the trip
)

Service date found 2025-03-14 friday with 35202 trips x stops.


## Stop service quality

Now we can use the gtfs object to perform analysis. In this example we want to get 8 categories ranging from 1 (stops served by very frequent lines) to 8 (stops served by very infrecuent lines).

In [20]:
stop_quality = gtfs.stop_service_quality(
    frequencies=[5,10,20,40,60,90,120,240], # maximum line frequency we want for every stop service quality class
    agg='max', # choose for every stop the line with the most frequency or write sum to add the frequencies of all lines.
    start_time='06:00:00', # time bounds to select trips from
    end_time='20:00:00',
)

## Let's plot a map showing the results

In [27]:
stop_quality.explore(column='service_quality',cmap='RdYlGn_r',vmin=1,vmax=8, k=8, marker_kwds={'radius':5})