# Simple Guide: Route-Tankerkoenig Integration

**Version v2 - Dec 2025**

This notebook shows how to:
1. Get fuel station data for any route
2. Prepare data for the prediction model
3. Access and manipulate the results

## Key Features:
- Fast: 3-12 seconds per route (first request ~12 s, cached ~3 s)
- One function call gets everything
- Returns all data needed for predictions

## Requirements:
- Google Maps API key in `.env` file
- Supabase credentials in `.env` file
- Tankerkoenig API key in `.env` file
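
A quick pre-flight check can catch a missing key before the first request. The sketch below is an assumption, not part of the integration module: the variable names `GOOGLE_MAPS_API_KEY`, `SUPABASE_URL`, `SUPABASE_KEY`, and `TANKERKOENIG_API_KEY` are hypothetical, so substitute the names your `.env` file actually uses. It only inspects variables that are already loaded into the process (for example by python-dotenv's `load_dotenv()`).

```python
import os

# Hypothetical variable names; replace with the ones in your .env file.
REQUIRED_KEYS = [
    "GOOGLE_MAPS_API_KEY",
    "SUPABASE_URL",
    "SUPABASE_KEY",
    "TANKERKOENIG_API_KEY",
]

def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in `env`."""
    env = os.environ if env is None else env
    return [key for key in required if not env.get(key)]

# Example with an explicit mapping instead of the real environment
print(missing_keys({"GOOGLE_MAPS_API_KEY": "abc", "SUPABASE_URL": ""}))
```

Calling `missing_keys()` with no arguments checks the real environment; an empty list means all keys are set.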

## Setup

In [1]:
import pandas as pd
from route_tankerkoenig_integration import get_fuel_prices_for_route

## Get Data for a Route

Simply provide start and end locations:

In [2]:
# Get fuel station data for Tübingen → Reutlingen
model_input = get_fuel_prices_for_route(
    start_locality="Tübingen",
    end_locality="Reutlingen",
    start_address="Wilhelmstraße 7",      # Optional: specific address
    end_address="Charlottenstraße 45",    # Optional: specific address
    use_realtime=False  # False = use historical prices
)

print(f"✓ Found {len(model_input)} stations along the route")


COMPLETE FUEL PRICE PIPELINE
Route: Tübingen → Reutlingen
Mode: HISTORICAL (demo)

Step 1: Geocoding addresses...
  Start: Wilhelmstraße 7, 72074 Tübingen, Germany (48.52451, 9.05951)
  End: Charlottenstraße 45, 72764 Reutlingen, Germany (48.49494, 9.21958)

Step 2: Calculating route...
  Distance: 18.1 km
  Duration: 23 min

Step 3: Finding fuel stations along route...

 Google Places Search Along Route status code: 200 Reason: OK

 Google Places Search Along Route status code: 200 Reason: OK

 Found 33 stations (all pages).
  Found 33 stations (with ETAs)

Step 4: Matching to Tankerkoenig database and fetching prices...

ROUTE-TANKERKOENIG INTEGRATION
Mode: HISTORICAL (yesterday = current)
Input stations: 33
Loading all stations from Supabase...
  Loaded 5,000 stations...
  Loaded 10,000 stations...
  Loaded 15,000 stations...
  Filtered out 1 stations with invalid coordinates
Loaded 17,689 valid stations from Supabase

Matching stations to Tankerkoenig database...
Matched: 29, Unma

## What You Get Back

A list of dictionaries, one per station:

In [3]:
# Look at the first station
first_station = model_input[0]

print("First Station Data:")
print(f"  Name: {first_station['tk_name']}")
print(f"  Brand: {first_station['brand']}")
print(f"  City: {first_station['city']}")
print(f"  E5 Price (yesterday): €{first_station['price_lag_1d_e5']}")
print(f"  E5 Price (7 days ago): €{first_station['price_lag_7d_e5']}")

First Station Data:
  Name: Aral Tankstelle
  Brand: ARAL
  City: Tübingen
  E5 Price (yesterday): €1.779
  E5 Price (7 days ago): €1.779


## Convert to DataFrame

For easier analysis and manipulation:

In [4]:
# Convert to pandas DataFrame
df = pd.DataFrame(model_input)

print(f"Shape: {df.shape}")
print(f"Columns: {len(df.columns)}")

# Show key columns
df[['tk_name', 'brand', 'city', 'price_lag_1d_e5', 'price_lag_7d_e5']].head()

Shape: (26, 32)
Columns: 32


                        tk_name  brand        city  price_lag_1d_e5  price_lag_7d_e5
0               Aral Tankstelle   ARAL    Tübingen            1.779            1.779
1               Esso Tankstelle   ESSO   TUEBINGEN            1.729            1.739
2               Esso Tankstelle   ESSO   TUEBINGEN            1.729            1.739
3  Shell Reutlingen Karlstr. 66  Shell  Reutlingen            1.729            1.719
4               Esso Tankstelle   ESSO  REUTLINGEN            1.779            1.799
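
The `city` values above mix spellings and casing (`Tübingen`, `TUEBINGEN`), which would split one city into several groups in a `groupby`. A minimal sketch of one way to normalize them follows; `normalize_city` is a hypothetical helper, not part of the integration module, and the transliteration is one-way, so the normalized names lose their umlauts.

```python
import pandas as pd

# German umlauts and ß mapped to their common ASCII transliterations
_TRANSLIT = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})

def normalize_city(name):
    """Lower-case, transliterate umlauts, then title-case a city name."""
    return name.strip().lower().translate(_TRANSLIT).title()

cities = pd.Series(["Tübingen", "TUEBINGEN", "Reutlingen", "REUTLINGEN"])
print(cities.map(normalize_city).unique())
```

On the real data this could be applied as `df['city_norm'] = df['city'].map(normalize_city)`, keeping the original column for display.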


## Available Data Fields

### Station Information:
- `station_uuid` - Unique station ID
- `tk_name` - Station name
- `brand` - Brand as stored in the database (ARAL, Shell, ESSO, etc.)
- `city` - City name
- `lat`, `lon` - Coordinates

### Route Information:
- `eta` - Estimated time of arrival
- `time_cell` - Time cell (0-47, used by model)
- `detour_distance_km` - Detour distance from route
- `detour_duration_min` - Detour time in minutes
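
The 0-47 range of `time_cell` suggests 48 half-hour slots per day, although the source does not confirm how the integration derives it. Under that assumption, the mapping from an arrival time to a cell could look like the sketch below; `time_cell_from_eta` is a hypothetical name used only for illustration.

```python
from datetime import datetime

def time_cell_from_eta(eta):
    """Map a datetime to a half-hour slot index in the range 0-47.

    Assumes 48 equal slots per day; the real integration may differ.
    """
    return eta.hour * 2 + eta.minute // 30

print(time_cell_from_eta(datetime(2025, 12, 1, 13, 45)))  # 13:45 falls in slot 27
```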

### Price Data (for each fuel type: e5, e10, diesel):
- `price_lag_1d_{fuel}` - Price from 1 day ago
- `price_lag_2d_{fuel}` - Price from 2 days ago
- `price_lag_3d_{fuel}` - Price from 3 days ago
- `price_lag_7d_{fuel}` - Price from 7 days ago
- `price_current_{fuel}` - Current/predicted price
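
Because the lag columns share one naming pattern, simple trend features can be built by formatting the fuel type into the column name. The sketch below computes the change between the 7-day and 1-day lags; `weekly_change` is a hypothetical helper, and the sample values are illustrative.

```python
import pandas as pd

def weekly_change(df, fuel_type="e5"):
    """Price change from 7 days ago to 1 day ago, in euros."""
    recent = df[f"price_lag_1d_{fuel_type}"]
    week_ago = df[f"price_lag_7d_{fuel_type}"]
    return (recent - week_ago).round(3)

sample = pd.DataFrame({
    "price_lag_1d_e5": [1.779, 1.729, 1.729],
    "price_lag_7d_e5": [1.779, 1.739, 1.719],
})
print(weekly_change(sample).tolist())
```

A negative value means the price fell over the week; a positive value means it rose.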

## Prepare Data for Model

Extract the 4 features needed for prediction:

In [5]:
def prepare_model_input(stations_list, fuel_type="e5"):
    """
    Extract model features from integration output.
    
    Args:
        stations_list: Output from get_fuel_prices_for_route()
        fuel_type: "e5", "e10", or "diesel"
    
    Returns:
        DataFrame with 4 lag features ready for model.predict()
    """
    df = pd.DataFrame(stations_list)
    
    # Extract the 4 lag columns
    features = df[[
        f'price_lag_1d_{fuel_type}',
        f'price_lag_2d_{fuel_type}',
        f'price_lag_3d_{fuel_type}',
        f'price_lag_7d_{fuel_type}'
    ]].copy()
    
    # Rename to match model training
    features.columns = ['price_lag_1d', 'price_lag_2d', 'price_lag_3d', 'price_lag_7d']
    
    return features

# Example usage
X = prepare_model_input(model_input, fuel_type="e5")
print("Model input ready:")
print(X.head())

Model input ready:
   price_lag_1d  price_lag_2d  price_lag_3d  price_lag_7d
0         1.779         1.789         1.779         1.779
1         1.729         1.689         1.729         1.739
2         1.729         1.689         1.729         1.739
3         1.729         1.729         1.729         1.719
4         1.779         1.789         1.829         1.799
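
The source does not say whether lag prices can be missing or whether the model tolerates `NaN`. If either is a concern, one cautious option is to drop incomplete rows before predicting and keep track of which stations were removed. The helper name `drop_incomplete_rows` and the demo values are assumptions for illustration.

```python
import pandas as pd

def drop_incomplete_rows(features):
    """Return the complete rows and the index labels of the rows dropped."""
    complete = features.dropna()
    dropped = features.index.difference(complete.index)
    return complete, list(dropped)

X_demo = pd.DataFrame({
    "price_lag_1d": [1.779, None, 1.729],
    "price_lag_2d": [1.789, 1.689, 1.729],
    "price_lag_3d": [1.779, 1.729, 1.729],
    "price_lag_7d": [1.779, 1.739, None],
})
X_clean, dropped = drop_incomplete_rows(X_demo)
print(len(X_clean), dropped)
```

The returned index labels line up with the original DataFrame, so the dropped stations can be reported or handled separately.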


## Example: Find Cheapest Station

In [6]:
# Find cheapest station based on yesterday's E5 price
cheapest = df.loc[df['price_lag_1d_e5'].idxmin()]

print("Cheapest Station:")
print(f"  Name: {cheapest['tk_name']}")
print(f"  Brand: {cheapest['brand']}")
print(f"  City: {cheapest['city']}")
print(f"  Price: €{cheapest['price_lag_1d_e5']}")
print(f"  Detour: {cheapest['detour_distance_km']:.1f} km")

Cheapest Station:
  Name: Supermarkt-Tankstelle KIRCHENTELLINSFURT WANNWEILER STR. 77
  Brand: Supermarkt-Tankstelle
  City: KIRCHENTELLINSFURT
  Price: €1.659
  Detour: 0.9 km
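
The lowest pump price is not always the cheapest stop once the detour is counted. A rough sketch of a detour-adjusted comparison follows. The fill volume (40 L) and consumption (7 L/100 km) are assumed defaults, `total_cost` is a hypothetical helper, and `detour_distance_km` is treated as the total extra distance driven.

```python
import pandas as pd

def total_cost(df, price_col="price_lag_1d_e5",
               fill_liters=40.0, consumption_l_per_100km=7.0):
    """Cost of the fill-up plus the fuel burned on the detour, in euros."""
    detour_liters = df["detour_distance_km"] * consumption_l_per_100km / 100.0
    return (fill_liters + detour_liters) * df[price_col]

# Illustrative values: station A is cheaper per liter but needs a long detour
demo = pd.DataFrame({
    "tk_name": ["A", "B"],
    "price_lag_1d_e5": [1.659, 1.669],
    "detour_distance_km": [10.0, 0.0],
})
demo["total_cost"] = total_cost(demo)
print(demo.sort_values("total_cost")[["tk_name", "total_cost"]])
```

In this example station B comes out ahead despite its higher price per liter, because the 10 km detour to A costs more than the saving. Time cost (`detour_duration_min`) is ignored here.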


## Example: Compare Brands

In [7]:
# Average price by brand
brand_comparison = df.groupby('brand')['price_lag_1d_e5'].agg(['mean', 'min', 'max', 'count'])
brand_comparison.columns = ['Average', 'Min', 'Max', 'Stations']
brand_comparison = brand_comparison.sort_values('Average')

print("\nPrice Comparison by Brand (E5):")
print(brand_comparison.round(3))


Price Comparison by Brand (E5):
                       Average    Min    Max  Stations
brand                                                 
Supermarkt-Tankstelle    1.659  1.659  1.659         6
JET                      1.672  1.669  1.679         3
AVIA                     1.679  1.679  1.679         4
Shell                    1.734  1.729  1.739         4
ESSO                     1.749  1.729  1.779         7
ARAL                     1.779  1.779  1.779         2
