# 01 — Kigali Road Network (OSM → SUMO)

## Objective
Build a Kigali road network suitable for simulation:
1) Download and clean a drivable road network from OpenStreetMap (OSM)
2) Export it in a format that can be converted to a SUMO network
3) Convert using SUMO `netconvert` to produce `kigali.net.xml`

This notebook is designed to be **resumable** (cache outputs and reload when available).

## 1.0 Scope & Expected Outputs

### In scope
- Download Kigali drivable street network using `osmnx`
- Basic cleaning and sanity checks
- Export to OSM XML for SUMO conversion
- Convert to SUMO network using `netconvert`
- Quick validation (file existence + basic stats)

### Outputs (saved to repo)
- Raw OSM export: `data/raw/osm/kigali.osm.xml`
- Cached graph: `data/processed/network/kigali.graphml`
- SUMO network: `sim/net/kigali.net.xml`
- Optional: quick plots saved to `reports/figures/`

In [1]:
import os
from pathlib import Path
import random

# Reproducibility
SEED = 42
random.seed(SEED)

# Paths
PROJECT_ROOT = Path.cwd().parent if Path.cwd().name == "notebooks" else Path.cwd()
DATA_RAW_OSM = PROJECT_ROOT / "data" / "raw" / "osm"
DATA_NET = PROJECT_ROOT / "data" / "processed" / "network"
SIM_NET = PROJECT_ROOT / "sim" / "net"
FIGURES = PROJECT_ROOT / "reports" / "figures"

for p in [DATA_RAW_OSM, DATA_NET, SIM_NET, FIGURES]:
    p.mkdir(parents=True, exist_ok=True)

# Location config
KIGALI_PLACE_QUERY = "Kigali, Rwanda"
NETWORK_TYPE = "drive"

# Cache files
GRAPHML_PATH = DATA_NET / "kigali.graphml"
OSM_XML_PATH = DATA_RAW_OSM / "kigali.osm.xml"
SUMO_NET_PATH = SIM_NET / "kigali.net.xml"

# Behavior
USE_CACHE = True

print("## Config")
print("PROJECT_ROOT:", PROJECT_ROOT)
print("KIGALI_PLACE_QUERY:", KIGALI_PLACE_QUERY)
print("NETWORK_TYPE:", NETWORK_TYPE)
print("USE_CACHE:", USE_CACHE)
print("GRAPHML_PATH:", GRAPHML_PATH)
print("OSM_XML_PATH:", OSM_XML_PATH)
print("SUMO_NET_PATH:", SUMO_NET_PATH)

## Config
PROJECT_ROOT: /Users/testsolutions/Documents/Academics/mission-capstone/marl-in-ems
KIGALI_PLACE_QUERY: Kigali, Rwanda
NETWORK_TYPE: drive
USE_CACHE: True
GRAPHML_PATH: /Users/testsolutions/Documents/Academics/mission-capstone/marl-in-ems/data/processed/network/kigali.graphml
OSM_XML_PATH: /Users/testsolutions/Documents/Academics/mission-capstone/marl-in-ems/data/raw/osm/kigali.osm.xml
SUMO_NET_PATH: /Users/testsolutions/Documents/Academics/mission-capstone/marl-in-ems/sim/net/kigali.net.xml


## 2.0 Load or Download the Kigali Road Network (OSM)

We use a **cache-first** pattern:
- If a cached graph exists and `USE_CACHE=True`, we load it from disk.
- Otherwise, we download the network from OSM using `osmnx`.

After loading/downloading, we:
- keep the largest connected component (to avoid disconnected fragments)
- save the graph to `GraphML` for fast reload
- export to `OSM XML` for SUMO conversion

In [2]:
import osmnx as ox

print("## Load or Download Network")

G = None

if USE_CACHE and GRAPHML_PATH.exists():
    print("[CACHE] Loading GraphML:", GRAPHML_PATH)
    G = ox.load_graphml(GRAPHML_PATH)
else:
    print("[DOWNLOAD] Fetching from OSM:", KIGALI_PLACE_QUERY)
    G = ox.graph_from_place(KIGALI_PLACE_QUERY, network_type=NETWORK_TYPE, simplify=True)

print("Loaded graph.")
print("Nodes:", len(G.nodes))
print("Edges:", len(G.edges))

## Load or Download Network
[DOWNLOAD] Fetching from OSM: Kigali, Rwanda
Loaded graph.
Nodes: 18941
Edges: 50228


## 2.1 Clean and Export (GraphML + OSM XML)

We apply minimal cleaning to improve conversion stability:
- Keep the **largest connected component** (common for city extracts)
- Save a GraphML cache for fast reload
- Export the graph to OSM XML (`.osm.xml`) which will be used by SUMO `netconvert`

In [4]:
from pathlib import Path

print("## Cleaning: Largest Connected Component")

nodes_before = len(G.nodes)
edges_before = len(G.edges)

# Keep the largest weakly connected component (works well for directed graphs)
G = ox.truncate.largest_component(G, strongly=False)

nodes_after = len(G.nodes)
edges_after = len(G.edges)

print("Before - nodes:", nodes_before, "edges:", edges_before)
print("After  - nodes:", nodes_after, "edges:", edges_after)

print("\n## Saving cache")
ox.save_graphml(G, GRAPHML_PATH)
print("[OK] Saved GraphML:", GRAPHML_PATH)

print("\n## Exporting to OSM XML for SUMO netconvert (requires unsimplified graph)")

# Download an unsimplified graph for OSM XML export (OSMnx requirement)
if USE_CACHE and OSM_XML_PATH.exists():
    print("[CACHE] OSM XML already exists:", OSM_XML_PATH)
else:
    print("[DOWNLOAD] Fetching UNSIMPLIFIED graph for export:", KIGALI_PLACE_QUERY)

    # Helps produce consistent OSM-way directionality for export
    ox.settings.all_oneway = True

    G_raw = ox.graph_from_place(
        KIGALI_PLACE_QUERY,
        network_type=NETWORK_TYPE,
        simplify=False,
    )

    print("Unsimplified graph loaded.")
    print("Nodes:", len(G_raw.nodes))
    print("Edges:", len(G_raw.edges))

    ox.io.save_graph_xml(G_raw, filepath=OSM_XML_PATH)
    print("[OK] Saved OSM XML:", OSM_XML_PATH)

## Cleaning: Largest Connected Component
Before - nodes: 18941 edges: 50228
After  - nodes: 18941 edges: 50228

## Saving cache
[OK] Saved GraphML: /Users/testsolutions/Documents/Academics/mission-capstone/marl-in-ems/data/processed/network/kigali.graphml

## Exporting to OSM XML for SUMO netconvert (requires unsimplified graph)
[DOWNLOAD] Fetching UNSIMPLIFIED graph for export: Kigali, Rwanda
Unsimplified graph loaded.
Nodes: 157727
Edges: 164449
[OK] Saved OSM XML: /Users/testsolutions/Documents/Academics/mission-capstone/marl-in-ems/data/raw/osm/kigali.osm.xml


## 3.0 Convert OSM XML to a SUMO Network

We convert the exported `kigali.osm.xml` into a SUMO network file (`.net.xml`) using `netconvert`.

Output:
- `sim/net/kigali.net.xml`

We run `netconvert` from Python to keep the process reproducible and easier to debug.

In [6]:
import shutil
import subprocess
from pathlib import Path

print("## netconvert: OSM XML → SUMO .net.xml")

# Ensure input exists
if not OSM_XML_PATH.exists():
    raise FileNotFoundError(f"Missing input OSM XML: {OSM_XML_PATH}")

netconvert_bin = shutil.which("netconvert")
if not netconvert_bin:
    raise RuntimeError("netconvert not found on PATH. SUMO may not be configured correctly.")

cmd = [
    netconvert_bin,
    "--osm-files", str(OSM_XML_PATH),
    "--output-file", str(SUMO_NET_PATH),

    # Common useful options for urban road networks
    "--geometry.remove",         # simplify geometry a bit
    "--ramps.guess",             # guess ramps where applicable
    "--junctions.join",          # join close junctions to reduce noise
    "--tls.guess",               # guess traffic lights when OSM data is missing
    "--tls.default-type", "static",

    # Keep connections reasonable
    "--roundabouts.guess",
]

print("Command:")
print(" ".join(cmd))

result = subprocess.run(cmd, text=True, capture_output=True)

print("\n## netconvert stdout (last 50 lines)")
stdout_lines = result.stdout.splitlines()
print("\n".join(stdout_lines[-50:]))

print("\n## netconvert stderr (last 50 lines)")
stderr_lines = result.stderr.splitlines()
print("\n".join(stderr_lines[-50:]))

if result.returncode != 0:
    raise RuntimeError(f"netconvert failed with code {result.returncode}")

print("\n## Output check")
print("SUMO_NET_PATH:", SUMO_NET_PATH)
print("Exists:", SUMO_NET_PATH.exists())

if SUMO_NET_PATH.exists():
    size_mb = SUMO_NET_PATH.stat().st_size / (1024 * 1024)
    print(f"Size: {size_mb:.2f} MB")

    # Show first few lines to confirm it's a SUMO net
    print("\n## Preview (first 10 lines)")
    with open(SUMO_NET_PATH, "r", encoding="utf-8") as f:
        for i in range(10):
            print(f.readline().rstrip())

## netconvert: OSM XML → SUMO .net.xml
Command:
/Library/Frameworks/EclipseSUMO.framework/Versions/1.26.0/EclipseSUMO/bin/netconvert --osm-files /Users/testsolutions/Documents/Academics/mission-capstone/marl-in-ems/data/raw/osm/kigali.osm.xml --output-file /Users/testsolutions/Documents/Academics/mission-capstone/marl-in-ems/sim/net/kigali.net.xml --geometry.remove --ramps.guess --junctions.join --tls.guess --tls.default-type static --roundabouts.guess

## netconvert stdout (last 50 lines)
Success.

## netconvert stderr (last 50 lines)
pj_obj_create: Cannot find proj.db

## Output check
SUMO_NET_PATH: /Users/testsolutions/Documents/Academics/mission-capstone/marl-in-ems/sim/net/kigali.net.xml
Exists: True
Size: 154.61 MB

## Preview (first 10 lines)
<?xml version="1.0" encoding="UTF-8"?>

<!-- generated on 2026-02-13T13:09:17.489198+02:00 by Eclipse SUMO netconvert 1.26.0
<netconvertConfiguration xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http:

## 3.1 Fix `proj.db` Warning (Coordinate Reference Data)

During `netconvert` we saw:

`pj_obj_create: Cannot find proj.db`

This happens when the PROJ library cannot locate its coordinate reference database.
SUMO’s macOS framework includes PROJ data, but the environment variable `PROJ_LIB` may not be set.

In this section we:
- locate `proj.db` (prefer the SUMO framework location)
- set `PROJ_LIB` for the current notebook process (and thus any subprocess calls from the notebook)

In [7]:
import os
from pathlib import Path
import shutil

print("## PROJ_LIB check")

def find_proj_db() -> Path | None:
    """Try to locate proj.db from common locations (prefer SUMO framework)."""
    candidates: list[Path] = []

    # 1) If PROJ_LIB already set, check it first
    proj_lib = os.environ.get("PROJ_LIB")
    if proj_lib:
        candidates.append(Path(proj_lib) / "proj.db")

    # 2) From SUMO binary location (Framework install)
    sumo_bin = shutil.which("sumo")
    if sumo_bin:
        # .../EclipseSUMO/bin/sumo -> prefix is .../EclipseSUMO
        prefix = Path(sumo_bin).resolve().parent.parent
        candidates.append(prefix / "share" / "proj" / "proj.db")

    # 3) From pyproj (often installed via geopandas deps)
    try:
        from pyproj.datadir import get_data_dir
        data_dir = Path(get_data_dir())
        candidates.append(data_dir / "proj.db")
    except Exception:
        pass

    # 4) Common Homebrew path (if proj installed)
    candidates.append(Path("/opt/homebrew/share/proj/proj.db"))

    for p in candidates:
        if p.exists():
            return p

    return None


proj_db = find_proj_db()
print("Current PROJ_LIB:", os.environ.get("PROJ_LIB", "(not set)"))

if proj_db is None:
    print("[WARN] Could not locate proj.db automatically.")
    print("       If you keep seeing the warning, we will set PROJ_LIB manually.")
else:
    proj_dir = proj_db.parent
    os.environ["PROJ_LIB"] = str(proj_dir)
    print("[OK] Found proj.db:", proj_db)
    print("[OK] Set PROJ_LIB to:", os.environ["PROJ_LIB"])

## PROJ_LIB check
Current PROJ_LIB: (not set)
[OK] Found proj.db: /Users/testsolutions/Documents/Academics/mission-capstone/marl-in-ems/.venv/lib/python3.12/site-packages/pyproj/proj_dir/share/proj/proj.db
[OK] Set PROJ_LIB to: /Users/testsolutions/Documents/Academics/mission-capstone/marl-in-ems/.venv/lib/python3.12/site-packages/pyproj/proj_dir/share/proj


## 3.2 SUMO Network Sanity Stats

We load the generated `kigali.net.xml` using `sumolib` and print:
- number of nodes
- number of edges
- number of traffic lights (TLS IDs)

This helps confirm the network is readable and gives us a baseline size.

In [8]:
import sumolib

print("## SUMO network stats")

if not SUMO_NET_PATH.exists():
    raise FileNotFoundError(f"SUMO network not found: {SUMO_NET_PATH}")

net = sumolib.net.readNet(str(SUMO_NET_PATH))

num_nodes = len(net.getNodes())
num_edges = len(net.getEdges())

# TLS IDs helper (sumolib usually exposes getTLSIDs)
tls_ids = []
if hasattr(net, "getTLSIDs"):
    tls_ids = net.getTLSIDs()

print("Network file:", SUMO_NET_PATH)
print("Nodes:", num_nodes)
print("Edges:", num_edges)
print("Traffic lights (count):", len(tls_ids))
if tls_ids:
    print("Traffic lights (sample 10):", tls_ids[:10])

## SUMO network stats
Network file: /Users/testsolutions/Documents/Academics/mission-capstone/marl-in-ems/sim/net/kigali.net.xml
Nodes: 17716
Edges: 47795
Traffic lights (count): 0
