# HERE Traffic API Data Extraction

This notebook demonstrates how to fetch traffic data from the HERE Traffic API v7 and prepare it for merging with map data.

## Step 1: Install Required Libraries

In [None]:
#!pip install requests pandas geopandas matplotlib folium

Collecting folium
  Downloading folium-0.20.0-py2.py3-none-any.whl.metadata (4.2 kB)
Collecting branca>=0.6.0 (from folium)
  Downloading branca-0.8.2-py3-none-any.whl.metadata (1.7 kB)
Collecting xyzservices (from folium)
  Downloading xyzservices-2025.4.0-py3-none-any.whl.metadata (4.3 kB)
Downloading folium-0.20.0-py2.py3-none-any.whl (113 kB)
Downloading branca-0.8.2-py3-none-any.whl (26 kB)
Downloading xyzservices-2025.4.0-py3-none-any.whl (90 kB)
Installing collected packages: xyzservices, branca, folium

   ---------------------------------------- 0/3 [xyzservices]
   ------------- -------------------------- 1/3 [branca]
   -------------------------- ------------- 2/3 [folium]
   -------------------------- ------------- 2/3 [folium]
   -------------------------- ------------- 2/3 [folium]
   -------------------------- ------------- 2/3 [folium]
   -------------------------- ------------- 2/3 [folium]
   -------------------------- ------------- 2/3 [folium]
   -------------------


[notice] A new release of pip is available: 25.1.1 -> 25.2
[notice] To update, run: C:\Users\maxyj\AppData\Local\Microsoft\WindowsApps\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\python.exe -m pip install --upgrade pip


## Step 2: Import Libraries

In [2]:
import requests
import pandas as pd
import geopandas as gpd
import json
import matplotlib.pyplot as plt
import folium
from datetime import datetime
import os
import hashlib

# Create cache directory if it doesn't exist
if not os.path.exists('cache'):
    os.makedirs('cache')

print("Libraries imported successfully!")

Libraries imported successfully!


## Step 3: Set Up HERE API Credentials

You need to:
1. Sign up at [HERE Developer Portal](https://developer.here.com/)
2. Create a project and generate an API key
3. Replace `YOUR_API_KEY` below with your actual API key

In [3]:
# HERE API Configuration
HERE_API_KEY = "JOoCtqGrQSmEuAH-BcGqb3a8fJy0Qm89fpfMUVh1qmY"  # Replace with your actual API key

# HERE Traffic API v7 Endpoints
TRAFFIC_FLOW_URL = "https://data.traffic.hereapi.com/v7/flow"
TRAFFIC_INCIDENTS_URL = "https://data.traffic.hereapi.com/v7/incidents"

print("API configuration set!")

API configuration set!


## Step 4: Helper Functions for API Requests

In [4]:
def get_cache_filename(url, params):
    """Generate a unique cache filename based on request parameters"""
    cache_key = f"{url}_{json.dumps(params, sort_keys=True)}"
    hash_key = hashlib.sha1(cache_key.encode()).hexdigest()
    return f"cache/{hash_key}.json"

def fetch_traffic_flow(bbox, api_key, use_cache=True):
    """
    Fetch traffic flow data for a bounding box
    
    Parameters:
    - bbox: tuple (west, south, east, north) in WGS84 coordinates
    - api_key: HERE API key
    - use_cache: whether to use cached data
    
    Returns:
    - JSON response with traffic flow data
    """
    params = {
        'apiKey': api_key,
        'in': f'bbox:{bbox[0]},{bbox[1]},{bbox[2]},{bbox[3]}',
        'locationReferencing': 'shape'  # Include geometry
    }
    
    cache_file = get_cache_filename(TRAFFIC_FLOW_URL, params)
    
    # Check cache first
    if use_cache and os.path.exists(cache_file):
        print(f"Loading from cache: {cache_file}")
        with open(cache_file, 'r') as f:
            return json.load(f)
    
    # Make API request
    print(f"Fetching traffic flow data from API...")
    response = requests.get(TRAFFIC_FLOW_URL, params=params)
    
    if response.status_code == 200:
        data = response.json()
        
        # Cache the response
        with open(cache_file, 'w') as f:
            json.dump(data, f)
        print(f"Data cached to: {cache_file}")
        
        return data
    else:
        print(f"Error: {response.status_code}")
        print(response.text)
        return None

def fetch_traffic_incidents(bbox, api_key, use_cache=True):
    """
    Fetch traffic incidents for a bounding box
    
    Parameters:
    - bbox: tuple (west, south, east, north) in WGS84 coordinates
    - api_key: HERE API key
    - use_cache: whether to use cached data
    
    Returns:
    - JSON response with traffic incident data
    """
    params = {
        'apiKey': api_key,
        'in': f'bbox:{bbox[0]},{bbox[1]},{bbox[2]},{bbox[3]}',
        'locationReferencing': 'shape'
    }
    
    cache_file = get_cache_filename(TRAFFIC_INCIDENTS_URL, params)
    
    # Check cache first
    if use_cache and os.path.exists(cache_file):
        print(f"Loading from cache: {cache_file}")
        with open(cache_file, 'r') as f:
            return json.load(f)
    
    # Make API request
    print(f"Fetching traffic incidents from API...")
    response = requests.get(TRAFFIC_INCIDENTS_URL, params=params)
    
    if response.status_code == 200:
        data = response.json()
        
        # Cache the response
        with open(cache_file, 'w') as f:
            json.dump(data, f)
        print(f"Data cached to: {cache_file}")
        
        return data
    else:
        print(f"Error: {response.status_code}")
        print(response.text)
        return None

print("Helper functions defined!")

Helper functions defined!


## Step 5: Define Area of Interest

Define the bounding box for the area you want to fetch traffic data. Example coordinates for central Kuala Lumpur.

In [5]:
# Define bounding box: (west, south, east, north)
# Example: Central Kuala Lumpur
bbox_kl = (101.67, 3.13, 101.70, 3.16)

# Example: Singapore
bbox_singapore = (103.80, 1.28, 103.86, 1.32)

# Choose which area to query
selected_bbox = bbox_kl
location_name = "Kuala Lumpur"

print(f"Selected location: {location_name}")
print(f"Bounding box: West={selected_bbox[0]}, South={selected_bbox[1]}, East={selected_bbox[2]}, North={selected_bbox[3]}")

Selected location: Kuala Lumpur
Bounding box: West=101.67, South=3.13, East=101.7, North=3.16


## Step 6: Fetch Traffic Flow Data

In [6]:
# Fetch traffic flow data
traffic_flow_data = fetch_traffic_flow(selected_bbox, HERE_API_KEY, use_cache=True)

if traffic_flow_data:
    print(f"\nTraffic Flow Data Retrieved!")
    print(f"Number of results: {len(traffic_flow_data.get('results', []))}")
    
    # Display first result as example
    if traffic_flow_data.get('results'):
        print("\nExample result structure:")
        print(json.dumps(traffic_flow_data['results'][0], indent=2)[:500])
else:
    print("Failed to fetch traffic flow data. Check your API key and internet connection.")

Fetching traffic flow data from API...
Data cached to: cache/c490e0e07c8c89cd20defd442bde397c878fbe6c.json

Traffic Flow Data Retrieved!
Number of results: 335

Example result structure:
{
  "location": {
    "description": "Jalan Sultan Sulaiman/Jalan Sultan Sulaiman",
    "length": 286.0,
    "shape": {
      "links": [
        {
          "points": [
            {
              "lat": 3.14012,
              "lng": 101.69544
            },
            {
              "lat": 3.14009,
              "lng": 101.69537
            }
          ],
          "length": 8.0,
          "functionalClass": 2
        },
        {
          "points": [
            {
              "lat": 3.140
Data cached to: cache/c490e0e07c8c89cd20defd442bde397c878fbe6c.json

Traffic Flow Data Retrieved!
Number of results: 335

Example result structure:
{
  "location": {
    "description": "Jalan Sultan Sulaiman/Jalan Sultan Sulaiman",
    "length": 286.0,
    "shape": {
      "links": [
        {
          "points

## Step 7: Parse Traffic Flow Data into DataFrame

In [7]:
def parse_traffic_flow_to_dataframe(traffic_data):
    """Convert traffic flow JSON to pandas DataFrame"""
    
    if not traffic_data or 'results' not in traffic_data:
        return None
    
    records = []
    for result in traffic_data['results']:
        location = result.get('location', {})
        current_flow = result.get('currentFlow', {})
        
        record = {
            'location_description': location.get('description', ''),
            'speed': current_flow.get('speed', None),
            'speed_limit': current_flow.get('speedLimit', None),
            'jam_factor': current_flow.get('jamFactor', None),
            'confidence': current_flow.get('confidence', None),
            'free_flow_speed': current_flow.get('freeFlowSpeed', None),
            'traversability': current_flow.get('traversability', ''),
        }
        
        # Extract geometry if available
        if 'shape' in location:
            shape = location['shape']
            if 'links' in shape:
                # Extract coordinates from links
                coords = []
                for link in shape['links']:
                    if 'points' in link:
                        for point in link['points']:
                            coords.append((point.get('lng'), point.get('lat')))
                record['geometry'] = coords
        
        records.append(record)
    
    df = pd.DataFrame(records)
    return df

# Parse the data
if traffic_flow_data:
    flow_df = parse_traffic_flow_to_dataframe(traffic_flow_data)
    
    if flow_df is not None:
        print(f"Traffic Flow DataFrame created with {len(flow_df)} records")
        print("\nDataFrame Info:")
        print(flow_df.info())
        print("\nFirst few records:")
        print(flow_df.head())
        
        # Display statistics
        print("\nTraffic Statistics:")
        print(f"Average Speed: {flow_df['speed'].mean():.2f} km/h")
        print(f"Average Jam Factor: {flow_df['jam_factor'].mean():.2f}")
        print(f"Average Free Flow Speed: {flow_df['free_flow_speed'].mean():.2f} km/h")

Traffic Flow DataFrame created with 335 records

DataFrame Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 335 entries, 0 to 334
Data columns (total 8 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   location_description  335 non-null    object 
 1   speed                 333 non-null    float64
 2   speed_limit           0 non-null      object 
 3   jam_factor            335 non-null    float64
 4   confidence            333 non-null    float64
 5   free_flow_speed       0 non-null      object 
 6   traversability        335 non-null    object 
 7   geometry              335 non-null    object 
dtypes: float64(3), object(5)
memory usage: 21.1+ KB
None

First few records:
                          location_description      speed speed_limit  \
0  Jalan Sultan Sulaiman/Jalan Sultan Sulaiman  12.500000        None   
1                                 Leboh Ampang   3.055556        None   
2                 Jalan 

## Step 8: Fetch Traffic Incidents

In [8]:
# Fetch traffic incidents
traffic_incidents_data = fetch_traffic_incidents(selected_bbox, HERE_API_KEY, use_cache=True)

if traffic_incidents_data:
    print(f"\nTraffic Incidents Data Retrieved!")
    print(f"Number of incidents: {len(traffic_incidents_data.get('results', []))}")
    
    # Display first incident as example
    if traffic_incidents_data.get('results'):
        print("\nExample incident structure:")
        print(json.dumps(traffic_incidents_data['results'][0], indent=2)[:500])
else:
    print("No incidents found or failed to fetch data.")

Fetching traffic incidents from API...
Data cached to: cache/ba32628a8125a8b66479bd721678955754eae4b0.json

Traffic Incidents Data Retrieved!
Number of incidents: 3

Example incident structure:
{
  "location": {
    "length": 159.0,
    "shape": {
      "links": [
        {
          "points": [
            {
              "lat": 3.15528,
              "lng": 101.69305
            },
            {
              "lat": 3.15527,
              "lng": 101.69312
            },
            {
              "lat": 3.15521,
              "lng": 101.69322
            },
            {
              "lat": 3.15489,
              "lng": 101.69337
            }
          ],
          "length": 60.0,
Data cached to: cache/ba32628a8125a8b66479bd721678955754eae4b0.json

Traffic Incidents Data Retrieved!
Number of incidents: 3

Example incident structure:
{
  "location": {
    "length": 159.0,
    "shape": {
      "links": [
        {
          "points": [
            {
              "lat": 3.15528,
   

## Step 9: Parse Traffic Incidents into DataFrame

In [9]:
def parse_incidents_to_dataframe(incidents_data):
    """Convert traffic incidents JSON to pandas DataFrame"""
    
    if not incidents_data or 'results' not in incidents_data:
        return None
    
    records = []
    for incident in incidents_data['results']:
        location = incident.get('location', {})
        incident_details = incident.get('incidentDetails', {})
        
        record = {
            'incident_id': incident.get('incidentId', ''),
            'original_id': incident.get('originalId', ''),
            'type': incident_details.get('type', ''),
            'description': incident_details.get('description', {}).get('value', ''),
            'criticality': incident_details.get('criticality', ''),
            'start_time': incident_details.get('startTime', ''),
            'end_time': incident_details.get('endTime', ''),
            'entry_time': incident_details.get('entryTime', ''),
        }
        
        # Extract location coordinates
        if 'shape' in location:
            shape = location['shape']
            if 'links' in shape:
                coords = []
                for link in shape['links']:
                    if 'points' in link:
                        for point in link['points']:
                            coords.append((point.get('lng'), point.get('lat')))
                record['geometry'] = coords
        
        records.append(record)
    
    df = pd.DataFrame(records)
    return df

# Parse incidents data
if traffic_incidents_data:
    incidents_df = parse_incidents_to_dataframe(traffic_incidents_data)
    
    if incidents_df is not None and len(incidents_df) > 0:
        print(f"Traffic Incidents DataFrame created with {len(incidents_df)} records")
        print("\nDataFrame Info:")
        print(incidents_df.info())
        print("\nFirst few incidents:")
        print(incidents_df.head())
        
        # Display incident type counts
        print("\nIncident Types:")
        print(incidents_df['type'].value_counts())
    else:
        print("No incidents to display.")

Traffic Incidents DataFrame created with 3 records

DataFrame Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   incident_id  3 non-null      object
 1   original_id  3 non-null      object
 2   type         3 non-null      object
 3   description  3 non-null      object
 4   criticality  3 non-null      object
 5   start_time   3 non-null      object
 6   end_time     3 non-null      object
 7   entry_time   3 non-null      object
 8   geometry     3 non-null      object
dtypes: object(9)
memory usage: 348.0+ bytes
None

First few incidents:
  incident_id original_id         type  \
0                           roadHazard   
1                          roadClosure   
2                          roadClosure   

                                      description criticality  \
0                 Road surface in poor condition.       minor   
1         

## Step 10: Visualize Traffic Data on a Map

In [11]:
def visualize_traffic_on_map(flow_df, incidents_df, bbox, location_name):
    """Create an interactive map with traffic flow and incidents"""
    
    # Calculate center of bounding box
    center_lat = (bbox[1] + bbox[3]) / 2
    center_lon = (bbox[0] + bbox[2]) / 2
    
    # Create map
    m = folium.Map(location=[center_lat, center_lon], zoom_start=13)
    
    # Add traffic flow lines
    if flow_df is not None and 'geometry' in flow_df.columns:
        for idx, row in flow_df.iterrows():
            if row['geometry'] and len(row['geometry']) > 0:
                # Determine color based on jam factor
                jam_factor = row['jam_factor'] if pd.notna(row['jam_factor']) else 0
                if jam_factor < 2:
                    color = 'green'  # Free flow
                elif jam_factor < 4:
                    color = 'yellow'  # Moderate
                elif jam_factor < 8:
                    color = 'orange'  # Slow
                else:
                    color = 'red'  # Congested
                
                # Convert geometry to lat/lon format for folium
                coords = [(lat, lon) for lon, lat in row['geometry']]
                
                # Create popup with traffic info (handle None values)
                speed_text = f"{row['speed']:.1f}" if pd.notna(row['speed']) else "N/A"
                speed_limit_text = f"{row['speed_limit']}" if pd.notna(row['speed_limit']) else "N/A"
                jam_factor_text = f"{row['jam_factor']:.1f}" if pd.notna(row['jam_factor']) else "N/A"
                free_flow_text = f"{row['free_flow_speed']:.1f}" if pd.notna(row['free_flow_speed']) else "N/A"
                
                popup_text = f"""
                <b>Location:</b> {row['location_description']}<br>
                <b>Speed:</b> {speed_text} km/h<br>
                <b>Speed Limit:</b> {speed_limit_text} km/h<br>
                <b>Jam Factor:</b> {jam_factor_text}<br>
                <b>Free Flow Speed:</b> {free_flow_text} km/h
                """
                
                folium.PolyLine(
                    coords,
                    color=color,
                    weight=3,
                    opacity=0.7,
                    popup=folium.Popup(popup_text, max_width=300)
                ).add_to(m)
    
    # Add traffic incidents
    if incidents_df is not None and len(incidents_df) > 0 and 'geometry' in incidents_df.columns:
        for idx, row in incidents_df.iterrows():
            if row['geometry'] and len(row['geometry']) > 0:
                # Use first coordinate as marker location
                lat, lon = row['geometry'][0][1], row['geometry'][0][0]
                
                # Create popup with incident info
                popup_text = f"""
                <b>Type:</b> {row['type']}<br>
                <b>Description:</b> {row['description']}<br>
                <b>Criticality:</b> {row['criticality']}<br>
                <b>Start Time:</b> {row['start_time']}<br>
                """
                
                folium.Marker(
                    [lat, lon],
                    popup=folium.Popup(popup_text, max_width=300),
                    icon=folium.Icon(color='red', icon='exclamation-triangle', prefix='fa')
                ).add_to(m)
    
    # Add legend
    legend_html = '''
    <div style="position: fixed; 
                bottom: 50px; right: 50px; width: 180px; height: 150px; 
                background-color: white; border:2px solid grey; z-index:9999; 
                font-size:14px; padding: 10px">
    <p><strong>Traffic Flow</strong></p>
    <p><span style="color:green;">&#9632;</span> Free Flow (JF < 2)</p>
    <p><span style="color:yellow;">&#9632;</span> Moderate (JF 2-4)</p>
    <p><span style="color:orange;">&#9632;</span> Slow (JF 4-8)</p>
    <p><span style="color:red;">&#9632;</span> Congested (JF > 8)</p>
    </div>
    '''
    m.get_root().html.add_child(folium.Element(legend_html))
    
    return m

# Create the map
if 'flow_df' in locals():
    traffic_map = visualize_traffic_on_map(
        flow_df if 'flow_df' in locals() else None,
        incidents_df if 'incidents_df' in locals() else None,
        selected_bbox,
        location_name
    )
    
    # Save map
    map_filename = f"traffic_map_{location_name.replace(' ', '_')}.html"
    traffic_map.save(map_filename)
    print(f"Map saved to: {map_filename}")
    
    # Display map
    traffic_map
else:
    print("No traffic data available to visualize.")

Map saved to: traffic_map_Kuala_Lumpur.html


## Step 11: Save Data for Merging with Map Data

In [None]:
# Save traffic flow data
if 'flow_df' in locals() and flow_df is not None:
    flow_filename = f"traffic_flow_{location_name.replace(' ', '_')}.csv"
    flow_df.to_csv(flow_filename, index=False)
    print(f"Traffic flow data saved to: {flow_filename}")

# Save traffic incidents data
if 'incidents_df' in locals() and incidents_df is not None and len(incidents_df) > 0:
    incidents_filename = f"traffic_incidents_{location_name.replace(' ', '_')}.csv"
    incidents_df.to_csv(incidents_filename, index=False)
    print(f"Traffic incidents data saved to: {incidents_filename}")

# Also save as GeoJSON for easier merging with map data
if 'flow_df' in locals() and flow_df is not None and 'geometry' in flow_df.columns:
    # Convert to GeoDataFrame
    from shapely.geometry import LineString
    
    geometries = []
    valid_indices = []
    
    for idx, row in flow_df.iterrows():
        if row['geometry'] and len(row['geometry']) > 1:
            try:
                line = LineString(row['geometry'])
                geometries.append(line)
                valid_indices.append(idx)
            except:
                pass
    
    if geometries:
        gdf = gpd.GeoDataFrame(
            flow_df.loc[valid_indices].drop(columns=['geometry']),
            geometry=geometries,
            crs='EPSG:4326'
        )
        
        geojson_filename = f"traffic_flow_{location_name.replace(' ', '_')}.geojson"
        gdf.to_file(geojson_filename, driver='GeoJSON')
        print(f"Traffic flow GeoJSON saved to: {geojson_filename}")

print("\nData saved and ready to merge with map data!")

## Step 12: Summary Statistics and Visualization

In [None]:
# Create visualizations
if 'flow_df' in locals() and flow_df is not None:
    fig, axes = plt.subplots(2, 2, figsize=(15, 10))
    
    # Speed distribution
    axes[0, 0].hist(flow_df['speed'].dropna(), bins=30, color='blue', alpha=0.7, edgecolor='black')
    axes[0, 0].set_xlabel('Speed (km/h)')
    axes[0, 0].set_ylabel('Frequency')
    axes[0, 0].set_title('Distribution of Current Speeds')
    axes[0, 0].axvline(flow_df['speed'].mean(), color='red', linestyle='--', label=f'Mean: {flow_df["speed"].mean():.1f}')
    axes[0, 0].legend()
    
    # Jam factor distribution
    axes[0, 1].hist(flow_df['jam_factor'].dropna(), bins=20, color='orange', alpha=0.7, edgecolor='black')
    axes[0, 1].set_xlabel('Jam Factor')
    axes[0, 1].set_ylabel('Frequency')
    axes[0, 1].set_title('Distribution of Jam Factors')
    axes[0, 1].axvline(flow_df['jam_factor'].mean(), color='red', linestyle='--', label=f'Mean: {flow_df["jam_factor"].mean():.2f}')
    axes[0, 1].legend()
    
    # Speed vs Speed Limit
    axes[1, 0].scatter(flow_df['speed_limit'], flow_df['speed'], alpha=0.5)
    axes[1, 0].plot([0, flow_df['speed_limit'].max()], [0, flow_df['speed_limit'].max()], 'r--', label='Equal line')
    axes[1, 0].set_xlabel('Speed Limit (km/h)')
    axes[1, 0].set_ylabel('Current Speed (km/h)')
    axes[1, 0].set_title('Current Speed vs Speed Limit')
    axes[1, 0].legend()
    
    # Traffic congestion categories
    flow_df['congestion'] = pd.cut(flow_df['jam_factor'], 
                                    bins=[0, 2, 4, 8, 10], 
                                    labels=['Free Flow', 'Moderate', 'Slow', 'Congested'])
    congestion_counts = flow_df['congestion'].value_counts()
    axes[1, 1].bar(congestion_counts.index.astype(str), congestion_counts.values, 
                   color=['green', 'yellow', 'orange', 'red'])
    axes[1, 1].set_xlabel('Traffic Condition')
    axes[1, 1].set_ylabel('Number of Road Segments')
    axes[1, 1].set_title('Traffic Congestion Categories')
    axes[1, 1].tick_params(axis='x', rotation=45)
    
    plt.tight_layout()
    plt.savefig(f'traffic_analysis_{location_name.replace(" ", "_")}.png', dpi=300, bbox_inches='tight')
    plt.show()
    
    print(f"\nVisualization saved to: traffic_analysis_{location_name.replace(' ', '_')}.png")

## Next Steps: Merging with Map Data

To merge this traffic data with your OSM map data:

1. **Spatial Join**: Use GeoPandas to perform a spatial join between the traffic flow GeoJSON and your road network from OSM
2. **Match by Coordinates**: The traffic flow geometry (LineStrings) can be matched with OSM road segments
3. **Attribute Transfer**: Transfer traffic attributes (speed, jam_factor) to the corresponding road segments

Example code for merging:
```python
import osmnx as ox
import geopandas as gpd

# Load your OSM road network
G = ox.graph_from_place("Kuala Lumpur, Malaysia", network_type='drive')
edges = ox.graph_to_gdfs(G, nodes=False)

# Load traffic data
traffic_gdf = gpd.read_file("traffic_flow_Kuala_Lumpur.geojson")

# Perform spatial join (nearest neighbor)
merged = gpd.sjoin_nearest(edges, traffic_gdf, how='left', max_distance=0.001)

# Now you have road segments with traffic data!
```