# Valhalla Cluster Setup - Example Usage

This notebook demonstrates how to set up and manage a Valhalla routing cluster using the `valhalla_cluster` module.

## Architecture Overview

The Valhalla cluster consists of:

- **Multiple Valhalla Workers**: Independent routing engines that can process requests in parallel
- **Shared Tile Storage**: Pre-built routing tiles shared by all workers (read-only)
- **Docker Compose**: Orchestrates the container cluster

This architecture enables high-throughput parallel routing, which makes it well suited to batch processing millions of routes.
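
As a minimal sketch of how a client might spread a batch across the workers, the snippet below pairs jobs with worker URLs in round-robin order. The helper names (`worker_urls`, `assign_round_robin`) and the `localhost` host are illustrative assumptions and are not part of `valhalla_cluster`.

```python
from itertools import cycle


def worker_urls(base_port: int, num_workers: int, host: str = "localhost") -> list[str]:
    """Build one base URL per worker, assuming consecutive ports from base_port."""
    return [f"http://{host}:{base_port + i}" for i in range(num_workers)]


def assign_round_robin(jobs: list, urls: list[str]) -> list[tuple[str, object]]:
    """Pair each job with a worker URL, cycling through the workers in order."""
    return list(zip(cycle(urls), jobs))


urls = worker_urls(base_port=8010, num_workers=3)
plan = assign_round_robin(["job-a", "job-b", "job-c", "job-d"], urls)
for url, job in plan:
    print(url, job)
```

With three workers and four jobs, the fourth job wraps around to the first worker. Round-robin is only one possible policy; a work queue that hands the next job to whichever worker is idle may balance uneven requests better.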

## Prerequisites

1. **Docker** must be installed and running
2. **Python dependencies**: Install via `pip install -r requirements.txt`
3. **Disk space**: ~15-25GB for UK tiles, ~100GB+ for larger regions
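
The first and third prerequisites can be checked from Python before starting a long build. This is a rough sketch using only the standard library; `check_prerequisites` is a hypothetical helper, the 25 GB threshold is taken from the upper UK estimate above, and finding `docker` on the `PATH` does not prove that the daemon is running.

```python
import shutil


def check_prerequisites(path: str = ".", min_free_gb: float = 25.0) -> dict:
    """Report whether the docker CLI is on PATH and whether enough disk is free."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return {
        "docker_on_path": shutil.which("docker") is not None,
        "free_gb": round(free_gb, 1),
        "enough_disk": free_gb >= min_free_gb,
    }


print(check_prerequisites())
```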

## Installation

```bash
pip install -r requirements.txt
```

In [2]:
%pip install -r requirements.txt

Collecting geopandas>=0.14.0 (from -r requirements.txt (line 9))
  Downloading geopandas-1.1.2-py3-none-any.whl.metadata (2.3 kB)
Collecting scipy>=1.10.0 (from -r requirements.txt (line 12))
  Downloading scipy-1.16.3-cp312-cp312-macosx_14_0_arm64.whl.metadata (62 kB)
Collecting pyarrow>=14.0.0 (from -r requirements.txt (line 13))
  Downloading pyarrow-22.0.0-cp312-cp312-macosx_12_0_arm64.whl.metadata (3.2 kB)
Collecting tqdm>=4.65.0 (from -r requirements.txt (line 14))
  Using cached tqdm-4.67.1-py3-none-any.whl.metadata (57 kB)
Collecting pyogrio>=0.7.2 (from geopandas>=0.14.0->-r requirements.txt (line 9))
  Downloading pyogrio-0.12.1-cp312-cp312-macosx_12_0_arm64.whl.metadata (5.9 kB)
Collecting pyproj>=3.5.0 (from geopandas>=0.14.0->-r requirements.txt (line 9))
  Using cached pyproj-3.7.2-cp312-cp312-macosx_14_0_arm64.whl.metadata (31 kB)
Collecting shapely>=2.0.0 (from geopandas>=0.14.0->-r requirements.txt (line 9))
  Using cached shapely-2.1.2-cp312-cp312-macosx_11_0_arm64.wh

In [5]:
# Add the code directory to the path
import sys
sys.path.insert(0, './code')

from valhalla_cluster import ValhallaCluster

---


## Parameter Reference

The `ValhallaCluster` class accepts several configuration parameters. Here's a detailed explanation of each:

### Core Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `num_workers` | int | 8 | Number of parallel Valhalla instances |
| `base_worker_port` | int | 8010 | Starting port (workers use 8010, 8011, ...) |
| `tiles_dir` | str/Path | "./valhalla_tiles" | Directory for tile storage |
| `project_dir` | str/Path | cwd | Base directory for docker-compose.yml |
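
Because each worker takes the next port after `base_worker_port`, it can be worth confirming that the whole range is unused before generating the compose file. The sketch below is one way to do that with the standard library; `port_is_free` and `busy_ports` are hypothetical helpers, not part of the module.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when a listener accepts the connection
        return sock.connect_ex((host, port)) != 0


def busy_ports(base_port: int, num_workers: int) -> list[int]:
    """List the ports in the worker range that are already in use."""
    ports = range(base_port, base_port + num_workers)
    return [p for p in ports if not port_is_free(p)]


print(busy_ports(base_port=8010, num_workers=8))
```

An empty list means the default range 8010 to 8017 is available on this machine.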

### Docker Settings

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `valhalla_image` | str | "ghcr.io/gis-ops/docker-valhalla/valhalla:latest" | Docker image to use |

### Tile Building Settings

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `tile_url` | str | UK PBF URL | URL of the OpenStreetMap `.osm.pbf` extract to download |
| `build_elevation` | bool | False | Include elevation data |
| `build_admins` | bool | True | Build administrative boundaries |
| `build_time_zones` | bool | True | Include timezone data |

### Example: Custom Configuration

In [7]:
# Example: Custom cluster configuration
cluster = ValhallaCluster(
    # Number of workers - adjust based on your CPU cores
    # Recommendation: 1 worker per 2 CPU cores
    num_workers=7,
    
    # Port range - with 7 workers, ports 8020 through 8026 are used
    base_worker_port=8020,
    
    # Tile storage location
    tiles_dir="./valhalla_tiles",
    
    # OSM data source - find extracts at https://download.geofabrik.de/
    tile_url="https://download.geofabrik.de/europe/united-kingdom-latest.osm.pbf",
    
    # Tile building options
    build_elevation=False,   # Skip elevation (faster build, smaller tiles)
    build_admins=True,       # Include admin boundaries
    build_time_zones=True,   # Include timezone data
)

print("Cluster configuration:")
print(f"  Workers: {cluster.num_workers}")
print(f"  Ports: {cluster.worker_ports}")
print(f"  Tiles directory: {cluster.tiles_dir}")

Cluster configuration:
  Workers: 7
  Ports: [8020, 8021, 8022, 8023, 8024, 8025, 8026]
  Tiles directory: /Users/alex/github/valhalla_cluster/valhalla_tiles


---

## Step-by-Step Setup

### Step 1: Build Tiles (One-Time)

This downloads OSM data and builds the routing graph. **This only needs to be done once** - all workers share the same tiles.

⏱️ **Expected time**: 10-30+ minutes depending on region size and hardware

In [5]:
# Build tiles (skip if already built)
# Set force=True to rebuild existing tiles
cluster.build_tiles(force=False)

Building tiles with ghcr.io/gis-ops/docker-valhalla/valhalla:latest...
Downloading from: https://download.geofabrik.de/europe/united-kingdom-latest.osm.pbf
This will take 10-30+ minutes depending on your hardware.

Pulling Docker image...
Starting builder container...
Builder container started (id=81cca381a78c)

Monitoring tile build progress (this will take a while):
Ctrl+C to stop monitoring (build continues in background)

INFO: Running container with user valhalla UID 59999 and GID 59999.
find: ‘/custom_files/transit_tiles’: No such file or directory

  📥 Downloading OSM data...

Downloading  https://download.geofabrik.de/europe/united-kingdom-latest.osm.pbf
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Dload  Upload   Total   Spent    Left  Speed
= Building admin db =
2025/12/24 20:25:40.147023 [INFO] Parsing files: /custom_files/united-kingdom-latest.osm.pbf
2025/12/24 20:25:40.147201 [INFO] Parsing relations...

  🔄

True

### Step 2: Generate Docker Compose

Creates a `docker-compose.yml` file that defines the worker containers.

In [3]:
# Generate docker-compose.yml
compose_path = cluster.generate_compose()
print(f"\nGenerated: {compose_path}")

✅ Created: /Users/alex/github/valhalla_cluster/docker-compose.yml
   Workers: 7
   Ports: 8020-8026
   Volume: ./valhalla_tiles:/custom_files:ro

Generated: /Users/alex/github/valhalla_cluster/docker-compose.yml
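
To make the shape of the generated file concrete, here is a rough sketch that renders a compose document with one service per worker. It is not the module's actual generator: `compose_yaml` is hypothetical, the service layout is guessed from the summary printed above, and the container port `8002` is an assumption (it is Valhalla's usual service port).

```python
def compose_yaml(num_workers: int, base_port: int, image: str,
                 tiles_dir: str = "./valhalla_tiles",
                 container_port: int = 8002) -> str:
    """Render a docker-compose document with one service per worker."""
    lines = ["services:"]
    for i in range(num_workers):
        name = f"valhalla_worker_{i}"
        lines += [
            f"  {name}:",
            f"    image: {image}",
            f"    container_name: {name}",
            "    ports:",
            # host port increments per worker; container port is assumed
            f'      - "{base_port + i}:{container_port}"',
            "    volumes:",
            # every worker mounts the same tiles read-only
            f"      - {tiles_dir}:/custom_files:ro",
        ]
    return "\n".join(lines) + "\n"


print(compose_yaml(2, 8020, "ghcr.io/gis-ops/docker-valhalla/valhalla:latest"))
```

Each worker differs only in its name and host port, which is why all of them can share the same read-only tile volume.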


### Step 3: Apply Performance Optimisations (Optional)

For batch routing workloads (like AHAH), apply optimisations that prioritise throughput over flexibility.

In [None]:
# Apply optimisations with defaults (tuned for batch routing)
# cluster.apply_optimisations()

### Optimisation Parameters

You can customise the optimisations:

```python
cluster.apply_optimisations(
    # LOKI - Point snapping settings
    search_cutoff=1000,          # Max search radius (metres) for snapping to roads
    node_snap_tolerance=100,     # Tolerance for snapping to road nodes
    street_side_tolerance=100,   # Tolerance for snapping to street sides
    
    # SERVICE LIMITS - Request capacity
    max_locations=1000,                  # Max locations per request
    max_matrix_distance=2000000,       # Max total distance for matrix (metres)
    max_matrix_location_pairs=5000000, # Max O-D pairs in matrix
    
    # HTTPD - Server settings
    timeout_seconds=300,         # Request timeout
    
    # THOR - Routing engine settings
    costmatrix_allow_second_pass=True,          # Retry failed routes
    costmatrix_check_reverse_connection=False,  # Skip one-way checks at dest
    source_to_target_algorithm="costmatrix",    # Fast matrix algorithm
    max_reserved_locations_costmatrix=1000,     # Pre-allocated memory
    
    # MJOLNIR - Tile caching
    use_simple_mem_cache=True,        # Enable in-memory caching
    max_cache_size=5000000000,     # Cache size (5GB)
)
```
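
These settings end up in `valhalla.json`, which is a nested JSON document. The sketch below shows one way such overrides could be written into a nested config using dotted key paths. `set_path` is a hypothetical helper, the key paths are assumptions based on the section names in the comments above, and the starting `config` is a stand-in rather than a real file, so check the paths against the `valhalla.json` your image generates.

```python
import json


def set_path(config: dict, dotted_key: str, value) -> None:
    """Set config['a']['b']['c'] = value for dotted_key 'a.b.c', creating dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


# Assumed key paths; verify them against your own valhalla.json.
overrides = {
    "loki.service_defaults.search_cutoff": 1000,
    "httpd.service.timeout_seconds": 300,
    "thor.source_to_target_algorithm": "costmatrix",
    "mjolnir.use_simple_mem_cache": True,
    "mjolnir.max_cache_size": 5_000_000_000,
}

config = {"mjolnir": {"tile_dir": "/custom_files/valhalla_tiles"}}  # stand-in for a loaded file
for key, value in overrides.items():
    set_path(config, key, value)

print(json.dumps(config, indent=2))
```

Existing keys such as the stand-in `tile_dir` are left untouched; only the listed paths are added or overwritten.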

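### Step 4: Start the Cluster

Apply the final optimisation settings first, then start the workers (for example with `python code/valhalla_cluster.py start`). This run widens the snapping limits relative to the example above (`search_cutoff=20000`, `node_snap_tolerance=500`):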

In [17]:
cluster.apply_optimisations(
    # LOKI - Point snapping settings
    search_cutoff=20000,          # Max search radius (metres) for snapping to roads
    node_snap_tolerance=500,     # Tolerance for snapping to road nodes
    street_side_tolerance=100,   # Tolerance for snapping to street sides
    
    # SERVICE LIMITS - Request capacity
    max_locations=1000,                  # Max locations per request
    max_matrix_distance=2000000,       # Max total distance for matrix (metres)
    max_matrix_location_pairs=5000000, # Max O-D pairs in matrix
    
    # HTTPD - Server settings
    timeout_seconds=300,         # Request timeout
    
    # THOR - Routing engine settings
    costmatrix_allow_second_pass=True,          # Retry failed routes
    costmatrix_check_reverse_connection=False,  # Skip one-way checks at dest
    source_to_target_algorithm="costmatrix",    # Fast matrix algorithm
    max_reserved_locations_costmatrix=1000,     # Pre-allocated memory
    
    # MJOLNIR - Tile caching
    use_simple_mem_cache=True,        # Enable in-memory caching
    max_cache_size=5000000000,     # Cache size (5GB)
)

✅ Saved optimised config: /Users/alex/github/valhalla_cluster/valhalla_tiles/valhalla.json

Optimisations applied:
  LOKI - search_cutoff: 20000m
  LOKI - snap tolerances: 500m
  SERVICE LIMITS - max_locations: 1000
  SERVICE LIMITS - max_matrix_pairs: 5,000,000
  THOR - algorithm: costmatrix
  THOR - second_pass: True
  MJOLNIR - cache: True, 5.0GB
  HTTPD - timeout: 300s

⚠️  Restart the cluster for changes to take effect:
   docker-compose restart
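
After a restart the workers need some time before they accept requests. A small polling helper can block until they are all ready. This is a sketch: `wait_until_healthy` is hypothetical and takes any callable that returns a `{port: bool}` mapping, demonstrated here with a fake check.

```python
import time
from typing import Callable


def wait_until_healthy(check: Callable[[], dict], timeout: float = 120.0,
                       interval: float = 2.0) -> bool:
    """Poll check() until every worker reports healthy or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        results = check()
        if results and all(results.values()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Demo with a fake check that becomes healthy on the third poll.
calls = {"n": 0}


def fake_check() -> dict:
    calls["n"] += 1
    return {8020: calls["n"] >= 3, 8021: True}


print(wait_until_healthy(fake_check, timeout=5, interval=0.01))
```

In the notebook, the fake check could be swapped for `lambda: cluster.health_check(timeout=5)`, since that method returns a mapping of port to health status.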


---

## Cluster Management

In [15]:
# Check container status
statuses = cluster.status()
for worker, status in statuses.items():
    print(f"{worker}: {status}")

valhalla_worker_0: running
valhalla_worker_1: running
valhalla_worker_2: running
valhalla_worker_3: running
valhalla_worker_4: running
valhalla_worker_5: running
valhalla_worker_6: running


In [16]:
# Health check - verify workers are responding
health = cluster.health_check(timeout=5)
for port, healthy in health.items():
    status = "✅ healthy" if healthy else "❌ unhealthy"
    print(f"Port {port}: {status}")

Port 8020: ✅ healthy
Port 8021: ✅ healthy
Port 8022: ✅ healthy
Port 8023: ✅ healthy
Port 8024: ✅ healthy
Port 8025: ✅ healthy
Port 8026: ✅ healthy
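
For reference, a per-worker health probe can be written with the standard library alone. How `health_check` is implemented is not shown in this notebook; the sketch below assumes the workers expose Valhalla's `/status` endpoint over HTTP, and `worker_is_healthy` is a hypothetical helper.

```python
import http.client
import urllib.error
import urllib.request


def worker_is_healthy(port: int, host: str = "localhost", timeout: float = 5.0) -> bool:
    """Return True if GET /status on the worker answers with HTTP 200."""
    url = f"http://{host}:{port}/status"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # refused connection, timeout, or malformed reply all count as unhealthy
        return False


health = {port: worker_is_healthy(port, timeout=1.0) for port in range(8020, 8023)}
print(health)
```

A worker that is still loading tiles may refuse connections or time out, so a single failed probe shortly after start-up is not necessarily an error.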


In [None]:
# Restart the cluster (e.g., after config changes)
# cluster.restart()

In [None]:
# Stop the cluster
# cluster.stop()

---

## Command Line Usage

The module can also be run from the command line:

```bash
# Full setup (build tiles, generate compose, apply optimisations)
python code/valhalla_cluster.py setup --workers 8

# Just generate docker-compose (tiles already built)
python code/valhalla_cluster.py compose --workers 4

# Apply optimisations to existing config
python code/valhalla_cluster.py optimise

# Start/stop/restart
python code/valhalla_cluster.py start
python code/valhalla_cluster.py stop
python code/valhalla_cluster.py restart

# Check status
python code/valhalla_cluster.py status
```
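
A command set like this maps naturally onto `argparse` subcommands. The sketch below shows one way such a CLI could be wired up. It is not the module's actual parser: `build_parser` is hypothetical, and the `--workers` default of 8 is assumed from the `num_workers` default in the parameter table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per cluster action."""
    parser = argparse.ArgumentParser(prog="valhalla_cluster.py")
    sub = parser.add_subparsers(dest="command", required=True)
    # Commands that accept a worker count
    for name in ("setup", "compose"):
        cmd = sub.add_parser(name)
        cmd.add_argument("--workers", type=int, default=8)
    # Commands that take no extra options in this sketch
    for name in ("optimise", "start", "stop", "restart", "status"):
        sub.add_parser(name)
    return parser


args = build_parser().parse_args(["setup", "--workers", "8"])
print(args.command, args.workers)
```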