# Utah Transit Agency Example
In this example, we'll predict the energy consumption for some trips operated by the Utah Transit Authority (UTA) in Salt Lake City. This requires specifying the GTFS data we are analyzing, processing it to produce RouteE-Powertrain inputs, and running a RouteE-Powertrain model to produce energy estimates. 

In [10]:
import logging
import multiprocessing as mp
import os

from nrel.routee.transit import (
    build_routee_features_with_osm,
    predict_for_all_trips,
    repo_root,
)

# Set up logging: Clear any existing handlers
logging.getLogger().handlers.clear()

# Configure basic logging
logging.basicConfig(
    level=logging.INFO, format="%(asctime)s [%(levelname)s] %(name)s - %(message)s"
)

# Suppress GDAL/PROJ warnings, which flood the output when we run gradeit
os.environ["PROJ_DEBUG"] = "0"

In [11]:
# Set inputs
n_proc = mp.cpu_count()
input_directory = repo_root() / "sample-inputs/saltlake"
output_directory = repo_root() / "reports/saltlake"
# Create the output directory (and any missing parents) if it does not exist yet
output_directory.mkdir(parents=True, exist_ok=True)

## Process GTFS Data into RouteE Inputs
`build_routee_features_with_osm()` analyzes a GTFS feed to prepare input features for energy prediction with RouteE-Powertrain. It performs the following steps:
- Upsamples all shapes so they are suitable for map matching.
- Uses NREL's `mappymatch` package to match each shape to a set of OpenStreetMap road links.
- Uses NREL's `gradeit` package to add estimated average grade to each road link. USGS elevation tiles are downloaded and cached if needed.
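
To make the first step concrete, here is a minimal sketch of what upsampling a shape can look like: extra points are interpolated along each segment so that no gap exceeds a chosen spacing. This is an illustration only, not the implementation used inside `build_routee_features_with_osm()`. The function name `upsample_polyline`, the `max_spacing` parameter, and the sample coordinates are all hypothetical.

```python
import math


def upsample_polyline(points, max_spacing):
    """Linearly interpolate extra points so no gap exceeds max_spacing.

    Hypothetical helper for illustration; points are (x, y) tuples in a
    projected coordinate system, with units matching max_spacing.
    """
    if max_spacing <= 0:
        raise ValueError("max_spacing must be positive")
    dense = [points[0]]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        # Number of equal pieces needed so each piece is at most max_spacing long
        n_pieces = max(1, math.ceil(seg_len / max_spacing))
        for k in range(1, n_pieces + 1):
            t = k / n_pieces
            dense.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return dense


# Made-up shape with three points (meters), not taken from the UTA feed
shape = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0)]
dense = upsample_polyline(shape, max_spacing=25.0)
print(f"{len(shape)} points -> {len(dense)} points")
```

A denser point sequence gives the map matcher more evidence about which road the bus actually follows between the original shape points.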

In [12]:
routee_input_df = build_routee_features_with_osm(
    input_directory=input_directory,
    n_trips=30,  # make predictions for 30 randomly sampled trips
    add_road_grade=True,
    n_processes=n_proc,
)


2025-08-08 14:47:08,069 [INFO] gtfs_feature_processing - Feed contains 12037 trips and 89590 shapes
2025-08-08 14:47:08,072 [INFO] gtfs_feature_processing - Restricted feed to 30 trips and 27 shapes
2025-08-08 14:47:11,776 [INFO] gtfs_feature_processing - Finished upsampling
2025-08-08 14:47:11,777 [INFO] gtfs_feature_processing - Original shapes length: 9431
2025-08-08 14:47:11,779 [INFO] gtfs_feature_processing - Upsampled shapes length: 56830
2025-08-08 14:47:31,768 [INFO] gtfs_feature_processing - Finished map matching
2025-08-08 14:47:35,470 [INFO] gtfs_feature_processing - Finished attaching timestamps
2025-08-08 14:47:35,500 [INFO] /Users/dmccabe/repos/public/routee-transit/nrel/routee/transit/prediction/grade/add_grade.py - Running gradeit on 30 trips with 12 processes.
2025-08-08 14:47:35,569 [INFO] nrel.routee.transit.prediction.grade.download - Downloading 4 USGS tiles at ONE_THIRD_ARC_SECOND resolution.
2025-08-08 14:47:35,569 [INFO] nrel.routee.transit.prediction.grade.dow

The output of `build_routee_features_with_osm()` is a DataFrame where each row represents the traversal of a particular road network edge during a particular bus trip. It includes the features needed to make energy predictions with RouteE, such as the travel time reported by OpenStreetMap (`travel_time_osm`), the distance (`distances_ft`), and the estimated road grade as a decimal value (`grade_dec_unfiltered`/`grade_dec_filtered`, depending on whether filtering is used in `gradeit`). 
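
As a reminder of what a decimal grade means, the sketch below uses the conventional definition of grade as elevation change divided by horizontal distance. It only illustrates the unit convention and is not a description of how `gradeit` computes or filters its estimates. The function name `grade_decimal` and the elevation values are hypothetical.

```python
def grade_decimal(elev_start_ft, elev_end_ft, run_ft):
    """Return road grade as a decimal (rise over run); 0.02 means a 2% climb."""
    if run_ft <= 0:
        raise ValueError("run_ft must be positive")
    return (elev_end_ft - elev_start_ft) / run_ft


# Made-up link: climbs 10 ft over 500 ft of horizontal distance
uphill = grade_decimal(4300.0, 4310.0, 500.0)
# The same link traversed in the opposite direction has a negative grade
downhill = grade_decimal(4310.0, 4300.0, 500.0)
print(f"uphill: {uphill:.3f} ({uphill * 100:.1f}%), downhill: {downhill:.3f}")
```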

In [13]:
routee_input_df.head()

Unnamed: 0,trip_id,shape_id,road_id,start_lat,start_lon,end_lat,end_lon,geom,start_timestamp,end_timestamp,kilometers,travel_time_osm,elevation_ft,distances_ft,grade_dec_unfiltered,elevation_ft_filtered,grade_dec_filtered
0,5168740,226323,"(83541843, 1585109177, 0)",40.76717,-111.87962,40.76717,-111.87682,LINESTRING (-12454386.670005087 4978059.947421...,0 days 14:59:33,0 days 15:00:04,0.241364,17.997233,4426.336956,25732.087438,0.0038,4355.188997,0.0004
1,5168740,226323,"(83542296, 83559828, 0)",40.70112,-111.85118,40.69992,-111.85074,LINESTRING (-12451217.827116268 4968359.615441...,0 days 15:24:50,0 days 15:25:08,0.143486,9.170597,4330.995954,8905.84018,-0.0107,4363.653044,0.001
2,5168740,226323,"(83542655, 83542668, 0)",40.72217,-111.865379,40.721866,-111.865379,LINESTRING (-12452797.684461555 4971452.205850...,0 days 15:17:48,0 days 15:17:52,0.042699,3.183834,4330.770876,138.61549,-0.0016,4369.709279,0.0437
3,5168740,226323,"(83542668, 83548429, 0)",40.72179,-111.865379,40.719935,-111.86539,LINESTRING (-12452797.7401213 4971395.80316137...,0 days 15:17:53,0 days 15:18:20,0.21367,15.932243,4370.07928,23873.425961,0.0016,4373.512217,0.0002
4,5168740,226323,"(83543083, 83629137, 0)",40.66143,-111.83634,40.660323,-111.835233,LINESTRING (-12449567.537929153 4962536.570911...,0 days 15:34:53,0 days 15:35:13,0.165096,9.232749,4337.885637,29568.209608,-0.0011,4375.21637,0.0001


## Predict Energy Consumption with RouteE-Powertrain
We can now make energy predictions with the data in `routee_input_df` and any trained RouteE-Powertrain model. We'll use `"Transit_Bus_Battery_Electric"`, included in `nrel.routee.powertrain` 1.3.2, which is trained on real-world energy data from an electric bus in Salt Lake City.

`predict_for_all_trips()` is a convenience wrapper that makes energy consumption predictions from a RouteE model and the input features it requires:

In [14]:
routee_vehicle_model = "Transit_Bus_Battery_Electric"
routee_results = predict_for_all_trips(
    routee_input_df=routee_input_df,
    routee_vehicle_model=routee_vehicle_model,
    n_processes=n_proc,
)


`routee_results` contains link-level energy predictions for each trip.

In [15]:
routee_results.head()

Unnamed: 0,trip_id,shape_id,road_id,geom,kilometers,travel_time_osm,grade_dec_unfiltered,kWhs
0,5168740,226323,"(83541843, 1585109177, 0)",LINESTRING (-12454386.670005087 4978059.947421...,0.241364,17.997233,0.0038,0.180449
1,5168740,226323,"(83542296, 83559828, 0)",LINESTRING (-12451217.827116268 4968359.615441...,0.143486,9.170597,-0.0107,-0.037439
2,5168740,226323,"(83542655, 83542668, 0)",LINESTRING (-12452797.684461555 4971452.205850...,0.042699,3.183834,-0.0016,0.022878
3,5168740,226323,"(83542668, 83548429, 0)",LINESTRING (-12452797.7401213 4971395.80316137...,0.21367,15.932243,0.0016,0.09938
4,5168740,226323,"(83543083, 83629137, 0)",LINESTRING (-12449567.537929153 4962536.570911...,0.165096,9.232749,-0.0011,0.088458


We can aggregate over trip IDs to get the total energy estimated per trip.

In [16]:
energy_by_trip = routee_results.groupby("trip_id").agg(
    {"kilometers": "sum", "kWhs": "sum"}
)

In [17]:
energy_by_trip["miles"] = 0.6213712 * energy_by_trip["kilometers"]
energy_by_trip["kwh_per_mi"] = energy_by_trip["kWhs"] / energy_by_trip["miles"]
energy_by_trip.head(10)

trip_id,kilometers,kWhs,miles,kwh_per_mi
5168740,29.216242,20.059818,18.154132,1.104973
5171401,12.255023,8.948623,7.614918,1.175144
5171504,12.255023,8.948623,7.614918,1.175144
5171928,20.08298,9.536594,12.478985,0.764212
5172181,20.496304,11.095408,12.735813,0.871197
5173198,12.812999,7.011186,7.961629,0.880622
5173843,24.387409,12.795452,15.153633,0.844382
5173915,30.576677,15.152437,18.999466,0.797519
5174100,14.894245,10.277691,9.254855,1.110519
5175543,20.765415,17.011624,12.903031,1.318421


In [18]:
energy_by_trip["kwh_per_mi"].describe()

count    30.000000
mean      1.190660
std       0.459268
min       0.764212
25%       0.873554
50%       1.090047
75%       1.311554
max       2.572348
Name: kwh_per_mi, dtype: float64
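
The `mean` reported by `describe()` is an unweighted average of per-trip efficiencies, so a short trip counts as much as a long one. If the goal is an overall efficiency for the sampled trips, dividing total energy by total distance may be more appropriate. The sketch below shows the difference using two made-up trips. The values and variable names are hypothetical and are not taken from the results above.

```python
# Made-up per-trip totals (not from the UTA results)
trips = {
    "trip_a": {"kWhs": 10.0, "miles": 10.0},  # 1.0 kWh/mi
    "trip_b": {"kWhs": 30.0, "miles": 15.0},  # 2.0 kWh/mi
}

per_trip = [t["kWhs"] / t["miles"] for t in trips.values()]

# Unweighted mean of per-trip ratios (what describe() reports as "mean")
mean_of_ratios = sum(per_trip) / len(per_trip)

# Distance-weighted efficiency: total energy over total distance
total_kwh = sum(t["kWhs"] for t in trips.values())
total_miles = sum(t["miles"] for t in trips.values())
ratio_of_sums = total_kwh / total_miles

print(f"mean of ratios: {mean_of_ratios:.2f} kWh/mi")
print(f"ratio of sums:  {ratio_of_sums:.2f} kWh/mi")
```

In this example the two measures differ (1.5 versus 1.6 kWh/mi) because the longer trip is also the less efficient one.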

Note that the predicted energy consumption values are relatively low because the current RouteE Transit pipeline does not account for HVAC loads, which are a major contributor to battery electric bus (BEB) energy usage.
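
One simple way to see how much an unmodeled auxiliary load could matter is to add a constant power draw over the trip duration. The sketch below is a rough illustration and not part of the RouteE Transit pipeline. The function name `add_constant_auxiliary_load` is hypothetical, and the 5 kW load and the trip values are placeholders rather than measured figures.

```python
def add_constant_auxiliary_load(traction_kwh, duration_s, aux_power_kw):
    """Add energy from a constant auxiliary load (e.g. HVAC) to a trip total.

    Hypothetical helper: energy (kWh) = power (kW) * time (h).
    """
    if duration_s < 0 or aux_power_kw < 0:
        raise ValueError("duration_s and aux_power_kw must be non-negative")
    return traction_kwh + aux_power_kw * duration_s / 3600.0


# Placeholder values: a one-hour trip using 20 kWh for traction,
# plus an assumed (not measured) constant 5 kW auxiliary load
total_kwh = add_constant_auxiliary_load(
    traction_kwh=20.0, duration_s=3600.0, aux_power_kw=5.0
)
print(f"total energy with auxiliary load: {total_kwh:.1f} kWh")
```

A real HVAC load varies with weather, passenger load, and door openings, so a constant value is only a first-order adjustment.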