# Maritime Graph Weighting and Pathfinding Pipeline

This notebook walks through an end-to-end workflow for weighting a maritime navigation graph and calculating optimal routes using file-based storage (GeoPackage or SpatiaLite).

## Workflow Overview

This is **Step 3** in the four-step maritime routing pipeline:

1. **Base Graph Creation** (`graph_PostGIS_v2.ipynb` or `graph_GeoPackage_v2.ipynb`): Creates a coarse-resolution graph (0.3 NM spacing)
2. **Fine/H3 Graph Creation** (`graph_fine_PostGIS_v2.ipynb` or `graph_fine_GeoPackage_v2.ipynb`): Creates a high-resolution graph (0.02-0.3 NM grid or hexagonal H3 cells)
3. **Graph Weighting & Pathfinding** (THIS NOTEBOOK): Applies intelligent weights and calculates routes
4. **Configuration & Orchestration**: Use workflow scripts for full automation

## What This Notebook Does

This notebook runs four stages, built around a **three-tier weighting system**:

1. **Conversion**: Converting an undirected graph to a directed one to support traffic-flow constraints.
2. **Enrichment**: Adding S-57 feature data (depth, clearance, orientation) to graph edges for smart routing.
3. **Weighting**: Applying three tiers of weights:
   - **Static weights**: Distance-based penalties/bonuses from navigational features (land, fairways, TSS lanes)
   - **Directional weights**: Traffic flow alignment penalties/rewards (follow one-way lanes, align with fairways)
   - **Dynamic weights**: Vessel-specific constraints (draft, height, environmental conditions)
4. **Pathfinding**: Calculating optimal routes on the fully weighted graph using the A* algorithm.

The entire pipeline is optimized for large graphs by performing all intensive operations directly within the file-based database (SpatiaLite). The graph is only loaded into memory at the final pathfinding step.

## Key Features

**File-Based Design**: All data is stored in portable GeoPackage/SpatiaLite files:
- No database server required
- Easy to share and archive
- Works on any system with GDAL/GeoPandas installed
- Perfect for offline or distributed use

**Database-Side Operations**: All intensive computations happen within the file database:
- Spatial joins with R-tree index optimization
- Window functions for feature prioritization
- Trigonometric calculations for bearing analysis
- Edge data is not loaded into Python memory during weight calculations

## Prerequisites

This notebook requires:
1. **Graph from Step 2**: A pre-computed fine/H3 graph. If only the undirected version exists, this notebook creates the directed one
2. **ENC Data**: S-57 charts converted to GeoPackage/SpatiaLite format
3. **Configuration Files**:
   - `graph_config.yml`: Graph layer definitions and weight settings
4. **Data Files**: GeoPackage files in the `output` directory

**Setup Instructions:** See `docs/SETUP.md` for converting S-57 data to the GeoPackage/SpatiaLite backend.

**Troubleshooting:** See `docs/TROUBLESHOOTING.md` for common issues and solutions.

## 1. Notebook Configuration

Adjust the parameters in this section to control the notebook's behavior. You can run the entire weighting pipeline or toggle individual steps to skip completed operations.

### Key Configuration Notes:
- **Graph Names**: Must match output from fine/H3 graph creation step (Step 2). Test names include `"h3_graph_gpkg_6_11"` for H3 graphs or `"fine_graph_10"` for fine grids
- **Workflow Steps**: Set to `False` to skip already-completed operations (useful for re-running portions without reprocessing)
- **Vessel Parameters**: Adjust draft/height to match your specific vessel (affects navigable areas and route costs):
  - `draft`: Distance from waterline to bottom of vessel (determines minimum water depth)
  - `height`: Distance from waterline to highest point (determines maximum bridge clearance)
  - `safety_margin`: Extra clearance to add for safety (under-keel clearance buffer)
- **Environmental Conditions**: Factors that increase routing penalties:
  - `weather_factor`: 1.0 = good conditions, >1.0 = poor (e.g., storms increase costs)
  - `visibility_factor`: 1.0 = good conditions, >1.0 = poor (reduces speed confidence)
  - `time_of_day`: 'day' or 'night' (night navigation has higher penalties)
- **Usage Bands**: Controls which ENC chart scales contribute to weights (Bands 3-5 recommended):
  - Band 3: Coastal (1:90K scale) - Regional planning
  - Band 4: Approach (1:22K-45K) - Port approaches and harbor entrances
  - Band 5: Harbour (1:4K-12K) - Detailed harbor operations

See `graph_config.yml` for production-ready configuration with full parameter documentation.
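
A quick sanity check on these parameters can catch mistakes before the long-running steps start. The sketch below is not part of the toolkit: `validate_config` is a hypothetical helper, and the rule that environmental factors must be at least 1.0 is an assumption drawn from the descriptions above.

```python
def validate_config(vessel_params: dict, env_conditions: dict) -> list:
    """Return a list of human-readable problems (empty if none are found)."""
    problems = []

    # Physical dimensions must be positive; the safety margin may be zero.
    for key in ("draft", "height"):
        if vessel_params.get(key, 0) <= 0:
            problems.append(f"vessel_params['{key}'] must be > 0")
    if vessel_params.get("safety_margin", 0) < 0:
        problems.append("vessel_params['safety_margin'] must be >= 0")

    # Assumption: 1.0 is the neutral value, so factors below 1.0 are flagged.
    for key in ("weather_factor", "visibility_factor"):
        if env_conditions.get(key, 1.0) < 1.0:
            problems.append(f"env_conditions['{key}'] must be >= 1.0")

    if env_conditions.get("time_of_day") not in ("day", "night"):
        problems.append("env_conditions['time_of_day'] must be 'day' or 'night'")

    return problems


issues = validate_config(
    {"draft": 7.5, "height": 30.0, "safety_margin": 2.0},
    {"weather_factor": 1.2, "visibility_factor": 1.1, "time_of_day": "day"},
)
print(issues or "configuration looks consistent")
```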

In [None]:
# --- Graph Naming ---
graph_name_undirected = "fine_graph_20"  # Source undirected graph test_names: "h3_graph_gpkg_6_11", "fine_graph_10"
graph_name_directed = "fine_graph_directed_gpkg_20"  # Target directed graph

# --- Workflow Steps ---
# Enable/disable individual pipeline steps
workflow_steps = {
    "run_conversion_to_directed": True,  # Create directed graph with bidirectional edges
    "run_enrichment": True,              # Extract S-57 feature data into edge attributes
    "run_static_weights": True,          # Apply distance-based weights from maritime features
    "run_directional_weights": True,     # Apply traffic flow alignment weights
    "run_dynamic_weights": True,         # Apply vessel-specific and environmental weights
    "run_pathfinding": True              # Calculate route on weighted graph
}

# --- Vessel Parameters ---
# Define vessel characteristics for dynamic weight calculations
vessel_params = {
    'draft': 7.5,           # Vessel draft in meters (depth below waterline)
    'height': 30.0,         # Height above waterline in meters (for bridge clearance)
    'safety_margin': 2.0,   # Under-keel clearance safety margin in meters
    'vessel_type': 'cargo'  # Vessel type (affects routing preferences)
}

# --- Environmental Conditions ---
# Define conditions that affect routing penalties
env_conditions = {
    'weather_factor': 1.2,      # 1.0 = good weather, >1.0 = poor weather (increases penalties)
    'visibility_factor': 1.1,   # 1.0 = good visibility, >1.0 = poor (increases penalties)
    'time_of_day': 'day'        # 'day' or 'night' (night adds penalties)
}

# --- Route Endpoints ---
# Define departure and arrival locations
departure_port_name = "SF Pilot"
arrival_port_name = "San Francisco Arrival"

departure_coords = {"lon": -122.780, "lat": 37.006}
arrival_coords = {"lon": -122.400, "lat": 37.805}

## 2. Setup and Initialization

This section imports all necessary libraries and initializes the core classes for the weighting pipeline:

- **ENCDataFactory**: Provides unified interface for accessing S-57 ENC data from GeoPackage/SpatiaLite
  - Handles file connections and layer queries
  - Used by all other classes for data access
  
- **H3Graph & FineGraph**: Manage graph operations (conversion, loading, saving, export)
  - Convert undirected graphs to directed
  - Load/save graphs from/to GeoPackage files
  - This notebook uses generic graph operations that work with **both fine grids and H3 graphs**
  - The specific graph type doesn't affect weighting - only the input filename matters
  
- **Weights**: Implements the three-tier weighting system (static, directional, dynamic)
  - Static weights: Distance-based penalties/bonuses from features
  - Directional weights: Traffic flow alignment penalties/rewards
  - Dynamic weights: Vessel-specific constraints (draft, height) combined with environmental conditions
  - Combines all tiers into final `adjusted_weight`
  
- **Route**: Calculates optimal routes on weighted graphs using the A* algorithm
  
- **PortData**: Manages port definitions (World Port Index + custom ports)

The output directory is created here for saving routes and benchmarks. Database files are loaded from the `output` directory (portable file-based storage, no server required).

In [None]:
import os
import sys
from pathlib import Path
from dotenv import load_dotenv
import time
import geopandas as gpd
import pandas as pd
import plotly.express as px
from shapely.geometry import Point

# --- Add Project to Python Path ---
project_root = Path.cwd().parent.parent
if str(project_root) not in sys.path:
    sys.path.append(str(project_root))

# --- Import Maritime Module Components ---
# H3Graph: Used here for generic graph operations (works with fine grids, H3, etc.)
from src.nautical_graph_toolkit.core.graph import H3Graph, Weights
from src.nautical_graph_toolkit.core.s57_data import ENCDataFactory
from src.nautical_graph_toolkit.core.pathfinding_lite import Route
from src.nautical_graph_toolkit.utils.port_utils import PortData

# Load environment variables
load_dotenv(project_root / ".env")

# --- Define Paths ---
output_dir = Path.cwd() / 'output'
output_dir.mkdir(exist_ok=True)

# Define database files
enc_data_file = output_dir / "enc_west.gpkg"
graph_file_undirected = output_dir / f"{graph_name_undirected}.gpkg"
graph_file_directed = output_dir / f"{graph_name_directed}.gpkg"

# --- Initialize Core Classes ---
# ENCDataFactory: Interface for S-57 ENC data access
# graph: Generic graph management (works with fine grids, H3, and other graph types)
# Weights: Three-tier weighting system implementation
# PortData: Port location management (World Port Index + custom ports)
factory = ENCDataFactory(source=enc_data_file)
graph = H3Graph(data_factory=factory)  # Generic graph operations (not H3-specific in this workflow)
weights_manager = Weights(data_factory=factory)
port_manager = PortData()

# --- Performance Tracking ---
performance_metrics = {}

print("✓ Setup complete. Core classes initialized.")
print(f"✓ Using ENC data from: {enc_data_file}")
print(f"✓ Graph files: {graph_file_undirected}")

### Determine Relevant ENCs

Identify which Electronic Navigational Charts (ENCs) overlap with the graph area. This list is used throughout the workflow for feature enrichment and weight calculations.

In [None]:
print("Analyzing graph boundary to determine relevant ENCs...")
try:
    # Read nodes from the undirected graph
    nodes_gdf = gpd.read_file(graph_file_undirected, layer='nodes')
    
    # Create a convex hull around all nodes to define the graph area
    graph_boundary = nodes_gdf.geometry.union_all().convex_hull
    
    # Query ENC database for charts that intersect this boundary
    enc_list = factory.get_encs_by_boundary(graph_boundary)
    
    print(f"✓ Found {len(enc_list)} ENCs overlapping the graph area")
    if not enc_list:
        print("⚠ Warning: No ENCs found. Subsequent steps may fail.")
except Exception as e:
    print(f"✗ Error determining ENC list: {e}")
    enc_list = []

## 3. Convert to Directed Graph

**Why this step is needed:** The undirected fine/H3 graph treats edges bidirectionally. However, maritime traffic often has directional rules:
- One-way traffic lanes (Traffic Separation Schemes)
- Current/wind patterns that favor certain directions
- Fairway orientation preferences

Converting to a directed graph allows us to:
1. Assign different weights to forward and reverse directions
2. Model one-way traffic lanes and channels
3. Apply directional bonuses/penalties based on traffic flow
4. Prepare for sophisticated routing that considers real-world constraints

**How it works:** Each undirected edge (A-B) becomes two directed edges (A→B and B→A). Feature data is propagated to both directions during enrichment.

**Database Operation**: The conversion runs entirely database-side using SpatiaLite SQL, creating new edge and node layers in the GeoPackage file.
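
The edge-doubling idea can be sketched with plain SQLite. This is only an illustration of the technique, not the SQL that `convert_to_directed_gpkg` runs; the table and column names (`edges`, `edges_directed`, `source`, `target`, `weight`) are assumptions, and geometry handling is omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE edges (source INTEGER, target INTEGER, weight REAL);
    INSERT INTO edges VALUES (1, 2, 0.3), (2, 3, 0.5);

    -- Each undirected edge A-B yields two rows: A->B and B->A.
    CREATE TABLE edges_directed AS
        SELECT source, target, weight FROM edges
        UNION ALL
        SELECT target AS source, source AS target, weight FROM edges;
""")

directed = conn.execute(
    "SELECT source, target, weight FROM edges_directed ORDER BY source, target"
).fetchall()
conn.close()
print(directed)
```

Both directions start with the same weight; the later weighting steps are what make them diverge.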

In [None]:
if workflow_steps["run_conversion_to_directed"]:
    start_time = time.perf_counter()
    print(f"Converting '{graph_name_undirected}' to directed graph...")
    
    graph.convert_to_directed_gpkg(
        source_path=str(graph_file_undirected),
        target_path=str(graph_file_directed)
    )
    
    end_time = time.perf_counter()
    performance_metrics['Conversion to Directed'] = end_time - start_time
    print(f"✓ Conversion complete in {performance_metrics['Conversion to Directed']:.2f}s")
    print(f"✓ Directed graph saved to: {graph_file_directed}")
else:
    print("⊘ Skipping conversion to directed graph")

## 4. Enrich Edges with S-57 Features

**Critical prerequisite for all weighting steps.** This process extracts navigational data from S-57 Electronic Navigational Charts and adds it to graph edges as feature attributes.

### What This Step Does

1. **Spatial Joins**: Uses SpatiaLite R-tree indexes to efficiently join edges with nearby S-57 features
2. **Feature Extraction**: Adds `ft_*` columns to the edge table for each relevant attribute:
   - `ft_depth`: Water depth from soundings/depth areas (for draft clearance)
   - `ft_orient`: Traffic flow orientation in degrees (for directional weights)
   - `ft_trafic`: Traffic direction code (1=one-way, 2=two-way)
   - `ft_verclr`: Vertical clearance from bridges/cables (for height checks)
   - `ft_valsou`: Minimum sounding value (for shallow water detection)
   - Plus feature flags like `ft_lndare`, `ft_fairwy`, `ft_tsslpt`, etc.

3. **Usage Band Prioritization**: When multiple ENCs cover the same area, prioritizes higher-detail charts (band 6 > 5 > 4 > 3 > 2 > 1)
4. **Reverse Edge Propagation**: Copies feature data to reverse edges in directed graphs
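
Usage band prioritization maps naturally onto a SQL window function. The sketch below uses plain SQLite (window functions need SQLite 3.25 or newer) with a hypothetical `edge_depths` table standing in for the result of a spatial join; the toolkit's actual query may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical join result: one row per (edge, overlapping chart) pair.
    CREATE TABLE edge_depths (edge_id INTEGER, usage_band INTEGER, depth REAL);
    INSERT INTO edge_depths VALUES
        (1, 3, 12.0),
        (1, 5, 10.4),
        (2, 4, 18.0);
""")

# ROW_NUMBER() ranks the candidates for each edge, most detailed band first,
# and the outer query keeps only the top-ranked row per edge.
rows = conn.execute("""
    SELECT edge_id, usage_band, depth
    FROM (
        SELECT *, ROW_NUMBER() OVER (
            PARTITION BY edge_id ORDER BY usage_band DESC
        ) AS rn
        FROM edge_depths
    )
    WHERE rn = 1
    ORDER BY edge_id
""").fetchall()
conn.close()
print(rows)
```

Edge 1 is covered by a band 3 and a band 5 chart, so the band 5 depth is the one that is kept.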

### Why This Is Required

- **Static weights** need feature layers to calculate distance-based penalties/bonuses
- **Directional weights** need traffic orientation data to align routes with traffic flow
- **Dynamic weights** need depth and clearance data for vessel-specific constraints

### Performance

Database-side spatial joins with R-tree indexes provide 10-20x faster performance than loading data into Python memory. Processing 100,000 edges typically takes 10-30 seconds.

In [None]:
if workflow_steps["run_enrichment"]:
    start_time = time.perf_counter()

    try:
        # Get feature layers from the S-57 classifier configuration
        feature_layers_to_enrich = weights_manager.get_feature_layers_from_classifier()
        
        print(f"Enriching edges with {len(feature_layers_to_enrich)} S-57 feature layers...")
        print(f"Processing {len(enc_list)} ENCs: {', '.join(enc_list[:5])}{'...' if len(enc_list) > 5 else ''}")
        
        # Run database-side enrichment
        summary = weights_manager.enrich_edges_with_features_gpkg_v3(
            graph_gpkg_path=str(graph_file_directed),
            enc_data_path=str(enc_data_file),
            enc_names=enc_list,
            feature_layers=feature_layers_to_enrich,
            is_directed=True,  # Propagate features to reverse edges
            include_sources=False,  # Don't store ENC source names (saves space)
            soundg_buffer_meters=30,  # Buffer radius for sounding point queries

            ram_cache_mb=8192, # Increase SQLite RAM cache
            skip_layers_without_rtree=True # Avoid slow queries on unindexed layers
        )
        
        end_time = time.perf_counter()
        performance_metrics['Edge Enrichment'] = end_time - start_time
        
        print(f"\n✓ Enrichment complete in {performance_metrics['Edge Enrichment']:.2f}s")
        print("\nFeatures extracted:")
        for feature, count in summary.items():
            print(f"  • {feature}: {count:,} edges")
            
    except Exception as e:
        print(f"✗ Enrichment failed: {e}")
        print("Ensure the directed graph file exists.")
else:
    print("⊘ Skipping edge enrichment")

## 5. Apply Static Weights

Apply **distance-based penalties and bonuses** from static maritime features using a three-tier weight system.

### Three-Tier Weight Categories

#### Tier 1: Blocking Weights (`wt_static_blocking`)
Absolute avoidance zones that make edges effectively impassable:
- **Land areas** (`lndare`): weight factor = 100
- **Underwater rocks** (`uwtroc`): weight factor = 100
- **Shoreline constructions** (`slcons`): weight factor = 90

#### Tier 2: Penalty Weights (`wt_static_penalty`)
Areas to avoid when possible (currently unused in this configuration, but can be configured for anchorages, restricted zones, etc.)

#### Tier 3: Bonus Weights (`wt_static_bonus`)
Preferred routing areas with reduced costs:
- **Fairways** (`fairwy`): weight factor = 0.5 (50% cost reduction)
- **Traffic Separation Schemes** (`tsslpt`): weight factor = 0.7
- **Dredged areas** (`drgare`): weight factor = 0.9
- **Precautionary areas** (`prcare`): weight factor = 0.9
- **Recommended tracks** (`rectrc`, `dwrtcl`): weight factor = 0.5

### Distance-Based Degradation

Weights are applied according to distance from features:
- Each feature has an influence **buffer zone** (e.g., 500m for rocks, 200m for fairways)
- **Inside buffer**: Full weight factor applied
- **Outside buffer**: Neutral weight (1.0)
- The buffer extends a feature's influence beyond its charted boundary, so edges near a hazard or preferred area are weighted even when they do not intersect it
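
The inside/outside rule can be written as a one-line function. This is a minimal sketch of the rule as described here, not the database-side implementation; `buffer_weight` is a hypothetical name, and treating the buffer edge itself as "inside" is an assumption.

```python
def buffer_weight(distance_m: float, buffer_m: float, factor: float) -> float:
    """Return the feature's weight factor inside its buffer, else neutral 1.0."""
    return factor if distance_m <= buffer_m else 1.0


# An edge 120 m from a fairway with a 200 m buffer receives the fairway bonus.
fairway = buffer_weight(120.0, buffer_m=200.0, factor=0.5)
# An edge 650 m from a rock with a 500 m buffer is unaffected.
rock = buffer_weight(650.0, buffer_m=500.0, factor=100.0)
print(fairway, rock)
```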

### Database Operation

Uses SpatiaLite `ST_DWithin()` with R-tree spatial indexes for efficient buffer queries. All weight calculations happen database-side without loading data into Python memory.

In [None]:
if workflow_steps["run_static_weights"]:
    start_time = time.perf_counter()
    print("Applying static weights...")
    
    # Load weight configuration from graph_config.yml
    config = weights_manager._load_config()
    
    # Apply static weights using database-side operations
    # summary = weights_manager.apply_static_weights_gpkg(
    #     graph_gpkg_path=str(graph_file_directed),
    #     enc_data_path=str(enc_data_file),
    #     enc_names=enc_list,
    #     static_layers=config["weight_settings"]["static_layers"],
    #     usage_bands=[3, 4, 5],  # Focus on detailed charts: Coastal (3), Approach (4), Harbour (5)
    #     land_area_layer="land_grid", # Use pre-computed land geometry for LNDARE optimization
    # )

    summary = weights_manager.apply_static_weights(
        gpkg_path=str(graph_file_directed),
        enc_names=enc_list,
        static_layers=config["weight_settings"]["static_layers"],
        usage_bands=[3, 4, 5],  # Focus on detailed charts: Coastal (3), Approach (4), Harbour (5)
        land_area_layer="land_grid", # Use pre-computed land geometry for LNDARE optimization
    )
    
    end_time = time.perf_counter()
    performance_metrics['Static Weights'] = end_time - start_time
    
    print(f"\n✓ Static weights applied in {performance_metrics['Static Weights']:.2f}s")
    print(f"\nProcessed {summary['layers_processed']} layers:")
    print(f"  • Applied weights: {summary['layers_applied']} layers")
    print(f"  • Skipped (no data): {summary['layers_processed'] - summary['layers_applied']} layers")
else:
    print("⊘ Skipping static weight application")

## 6. Apply Directional Weights

Calculate **traffic flow alignment penalties and rewards** based on how well edges align with intended maritime traffic directions.

### How Directional Weights Work

1. **Uses enriched orientation data** from the previous enrichment step:
   - `ft_orient`: Intended traffic direction (0-360 degrees) from TSS lanes, fairways, and recommended tracks
   - `ft_trafic`: Traffic direction code (1=one-way, 2=two-way)

2. **Calculates edge bearing** using SpatiaLite trigonometry:
   - `ST_Azimuth()` computes the bearing from start node to end node
   - Converted to 0-360 degree format for comparison

3. **Computes angular difference** between edge bearing and traffic orientation:
   - Handles 0-360 wrap-around (e.g., difference between 5° and 355° is 10°, not 350°)
   - Determines alignment quality

4. **Applies graduated penalties/rewards**:
   - **Well-aligned** (0-30°): weight = 0.8 (20% lower cost)
   - **Misaligned** (30-150°): weight = 1.5 (50% higher cost)
   - **Opposite direction** (150-180°): weight = 3.0 (strongly discouraged)
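
The steps above can be sketched in pure Python. The function names are illustrative, the bearing uses a planar approximation (the same idea as an azimuth between two points), and assigning the exact 30° and 150° boundaries to the aligned and opposite classes is an assumption.

```python
import math


def bearing_deg(x1, y1, x2, y2):
    """Planar bearing from point 1 to point 2, clockwise from north, in [0, 360)."""
    return math.degrees(math.atan2(x2 - x1, y2 - y1)) % 360.0


def angular_difference(a, b):
    """Smallest angle between two bearings, in [0, 180]."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def directional_weight(edge_bearing, traffic_orient,
                       alignment_bonus=0.8,
                       misalignment_penalty=1.5,
                       opposite_penalty=3.0):
    """Map the angular difference onto the three graduated weights."""
    diff = angular_difference(edge_bearing, traffic_orient)
    if diff <= 30.0:
        return alignment_bonus
    if diff >= 150.0:
        return opposite_penalty
    return misalignment_penalty


print(angular_difference(5.0, 355.0))  # wrap-around case: 10.0, not 350.0
print(directional_weight(bearing_deg(0, 0, 1, 0), traffic_orient=270.0))
```

Because the difference is folded into the 0-180° range, an edge heading east against a westbound lane lands in the opposite-direction class.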

### Two-Way Traffic Handling

For features marked as two-way traffic (`ft_trafic=2`):
- System checks if the reverse edge is well-aligned
- If reverse edge has good alignment, allows travel in both directions
- Prevents penalizing legitimate bidirectional routes
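
One way to read the two-way rule is sketched below. It is an interpretation of the description above, not the toolkit's SQL: `is_aligned` is a hypothetical helper, the 30° tolerance is assumed to match the well-aligned band, and `ft_trafic == 2` follows the two-way code used in this notebook.

```python
def angular_difference(a: float, b: float) -> float:
    """Smallest angle between two bearings, in [0, 180]."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def is_aligned(edge_bearing, traffic_orient, ft_trafic, tolerance=30.0):
    """True if the edge follows the flow, or opposes it on a two-way feature."""
    if angular_difference(edge_bearing, traffic_orient) <= tolerance:
        return True
    if ft_trafic == 2:
        # On two-way features, check whether the reverse edge would be aligned.
        reverse_bearing = (edge_bearing + 180.0) % 360.0
        return angular_difference(reverse_bearing, traffic_orient) <= tolerance
    return False


print(is_aligned(270.0, 90.0, ft_trafic=2))  # against the orientation, two-way
print(is_aligned(270.0, 90.0, ft_trafic=1))  # against the orientation, one-way
```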

### Affected Features

Directional weights are applied to:
- **Traffic Separation Schemes** (`tsslpt`): One-way shipping lanes
- **Fairways** (`fairwy`): Main navigation channels
- **Recommended tracks** (`rectrc`, `dwrtcl`): Preferred routes

This ensures routes follow established maritime traffic patterns and avoid wrong-way travel.

In [None]:
if workflow_steps["run_directional_weights"]:
    start_time = time.perf_counter()
    print("Applying directional weights...")
    
    # Apply directional weights using database-side trigonometry
    summary = weights_manager.calculate_directional_weights_gpkg(
        graph_gpkg_path=str(graph_file_directed),
        alignment_bonus=0.8,        # 20% faster when aligned with traffic
        misalignment_penalty=1.5,   # 50% slower when moderately misaligned
        opposite_penalty=3.0        # 3x slower when traveling opposite to traffic
    )
    
    end_time = time.perf_counter()
    performance_metrics['Directional Weights'] = end_time - start_time
    
    print(f"\n✓ Directional weights applied in {performance_metrics['Directional Weights']:.2f}s")
    print(f"  • Edges updated: {summary['edges_updated']:,}")
else:
    print("⊘ Skipping directional weight application")

## 7. Apply Dynamic (Vessel-Specific) Weights

**Final weighting step** that combines all previous weights with vessel-specific constraints and environmental conditions to produce the `adjusted_weight` used for pathfinding.

### Three-Tier Integration

Dynamic weights integrate all three tiers from previous steps:

#### Tier 1: Blocking Factor
Combines static blocking weights with vessel physical constraints:
- **Under-Keel Clearance (UKC)**: Checks if water depth (`ft_depth`) is sufficient for vessel draft + safety margin
- **Vertical Clearance**: Checks if bridge clearance (`ft_verclr`) exceeds vessel height
- **Result**: Shallow water or low bridges get extremely high weights (effectively blocked)
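
The two physical checks reduce to simple comparisons. The sketch below is a hedged illustration: `is_blocked` is a hypothetical function, missing values are assumed to mean "no known constraint", and whether the toolkit also applies a margin to vertical clearance is not stated here, so none is applied.

```python
from typing import Optional


def is_blocked(ft_depth: Optional[float], ft_verclr: Optional[float],
               draft: float, height: float, safety_margin: float) -> bool:
    """True if the edge fails the under-keel or vertical clearance check."""
    # Assumption: None means the edge carries no depth/clearance information.
    too_shallow = ft_depth is not None and ft_depth < draft + safety_margin
    too_low = ft_verclr is not None and ft_verclr <= height
    return too_shallow or too_low


# With draft 7.5 m and a 2.0 m margin, at least 9.5 m of water is required.
print(is_blocked(9.0, None, draft=7.5, height=30.0, safety_margin=2.0))
print(is_blocked(12.0, 45.0, draft=7.5, height=30.0, safety_margin=2.0))
```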

#### Tier 2: Penalty Factor
Combines static penalties with environmental conditions:
- Weather degradation (e.g., 1.2x penalty for poor weather)
- Visibility reduction (e.g., 1.1x penalty for poor visibility)
- Time-of-day adjustments (e.g., night navigation penalties)
- **Result**: Moderate weight increases for less favorable conditions

#### Tier 3: Bonus Factor
Uses static bonus weights with vessel type preferences:
- Fairways and TSS lanes (from static weights)
- Deep water channels (preferred by large vessels)
- **Result**: Weight reductions for preferred routes

### Final Weight Formula

```
adjusted_weight = base_distance × blocking_factor × penalty_factor × bonus_factor × directional_weight
```

Where:
- `base_distance`: Original edge length in nautical miles
- `blocking_factor`: From `wt_static_blocking` + UKC/clearance checks
- `penalty_factor`: From `wt_static_penalty` + environmental conditions
- `bonus_factor`: From `wt_static_bonus` + vessel preferences
- `directional_weight`: From `wt_dir` (traffic flow alignment)
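
A short worked example shows how the factors interact. The numbers reuse values from this notebook (fairway bonus 0.5, alignment bonus 0.8, land blocking 100); multiplying `weather_factor` by `visibility_factor` to form the penalty is an assumption about how the environmental conditions combine.

```python
def adjusted_weight(base_distance, blocking_factor=1.0, penalty_factor=1.0,
                    bonus_factor=1.0, directional_weight=1.0):
    """Multiply the edge length by every weighting factor."""
    return (base_distance * blocking_factor * penalty_factor
            * bonus_factor * directional_weight)


# A 0.1 NM edge inside a fairway, aligned with traffic, in poor conditions.
w_fairway = adjusted_weight(0.1, penalty_factor=1.2 * 1.1,
                            bonus_factor=0.5, directional_weight=0.8)
# The same edge length next to land, with no bonus and neutral conditions.
w_land = adjusted_weight(0.1, blocking_factor=100.0)

print(f"fairway edge: {w_fairway:.4f}, near-land edge: {w_land:.1f}")
```

The near-land edge costs roughly 190 times as much as the fairway edge of the same length, which is why the pathfinder routes around it.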

### Result

Edges that are **safe, aligned with traffic, in preferred areas, and suitable for the vessel** get low weights (preferred routes). Edges that are **dangerous, misaligned, or unsuitable** get high weights (avoided routes).

In [None]:
if workflow_steps["run_dynamic_weights"]:
    start_time = time.perf_counter()
    print("Calculating final dynamic weights...")
    
    # Combine all weight tiers using database-side calculations
    summary = weights_manager.calculate_dynamic_weights_gpkg(
        graph_gpkg_path=str(graph_file_directed),
        vessel_parameters=vessel_params,
        environmental_conditions=env_conditions
    )
    
    end_time = time.perf_counter()
    performance_metrics['Dynamic Weights'] = end_time - start_time
    
    print(f"\n✓ Dynamic weights calculated in {performance_metrics['Dynamic Weights']:.2f}s")
    print(f"  • Edges updated: {summary['edges_updated']:,}")
    print("\n⚠ IMPORTANT: Use weight_key='adjusted_weight' for pathfinding")
else:
    print("⊘ Skipping dynamic weight calculation")

## 8. Pathfinding and Route Export

With the graph fully weighted, calculate an optimal route between departure and arrival points using the `adjusted_weight` that incorporates all weight tiers.

### 8.1. Load Weighted Graph into Memory

Load the final, fully weighted graph from the GPKG file into a NetworkX graph object. This is the only step that loads the full graph into Python memory.

In [None]:
if workflow_steps["run_pathfinding"]:
    start_time = time.perf_counter()
    print(f"Loading weighted graph from {graph_file_directed.name}...")
    
    try:
        G = graph.load_graph_from_gpkg(
            gpkg_path=str(graph_file_directed),
            directed=True
        )
        end_time = time.perf_counter()
        performance_metrics['Graph Loading'] = end_time - start_time
        
        print(f"✓ Graph loaded in {performance_metrics['Graph Loading']:.2f}s")
        print(f"  • Nodes: {G.number_of_nodes():,}")
        print(f"  • Edges: {G.number_of_edges():,}")
    except Exception as e:
        print(f"✗ Failed to load graph: {e}")
        G = None
else:
    print("⊘ Skipping pathfinding")
    G = None

### 8.2. Calculate and Export Route

Calculate the optimal route using the final `adjusted_weight` and export to GeoJSON for visualization in GIS applications.
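
For readers unfamiliar with A*, the sketch below shows the general technique on a toy graph. It is not the `Route` class implementation; the data layout and the `min_factor` parameter are assumptions. Note that the straight-line heuristic stays admissible only if it is scaled by the smallest multiplier an edge can carry, so bonus factors below 1.0 would call for `min_factor < 1.0`.

```python
import heapq
import math


def a_star(nodes, edges, start, goal, weight_key="adjusted_weight", min_factor=1.0):
    """nodes: {id: (x, y)}; edges: {id: [(neighbor, {weight_key: cost}), ...]}."""
    def heuristic(n):
        # Straight-line distance scaled so it never exceeds the true remaining cost.
        return math.dist(nodes[n], nodes[goal]) * min_factor

    best = {start: 0.0}          # cheapest known cost from start to each node
    came_from = {}               # predecessor on the cheapest known path
    frontier = [(heuristic(start), start)]
    while frontier:
        _, current = heapq.heappop(frontier)
        if current == goal:
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1], best[goal]
        for neighbor, attrs in edges.get(current, []):
            cost = best[current] + attrs[weight_key]
            if cost < best.get(neighbor, math.inf):
                best[neighbor] = cost
                came_from[neighbor] = current
                heapq.heappush(frontier, (cost + heuristic(neighbor), neighbor))
    return None, math.inf


nodes = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (1, 1)}
edges = {
    "A": [("B", {"adjusted_weight": 1.0}), ("D", {"adjusted_weight": 1.5})],
    "B": [("C", {"adjusted_weight": 100.0})],  # heavily penalized, e.g. near land
    "D": [("C", {"adjusted_weight": 1.5})],
}
path, cost = a_star(nodes, edges, "A", "C")
print(path, cost)
```

The direct route through B is geometrically shorter, but the heavily weighted edge makes the detour through D cheaper.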

In [None]:
if workflow_steps["run_pathfinding"] and G is not None:
    start_time = time.perf_counter()
    print("\nCalculating optimal route...")
    
    # Create/update custom port locations
    port_manager.create_custom_port(
        port_name=departure_port_name,
        lon=departure_coords['lon'],
        lat=departure_coords['lat'],
        if_exists='update'
    )
    port_manager.create_custom_port(
        port_name=arrival_port_name,
        lon=arrival_coords['lon'],
        lat=arrival_coords['lat'],
        if_exists='update'
    )
    
    # Get port geometries
    departure_port = port_manager.get_port_by_name(departure_port_name)
    arrival_port = port_manager.get_port_by_name(arrival_port_name)
    
    if departure_port.empty or arrival_port.empty:
        print("✗ Error: Could not find departure or arrival port.")
    else:
        # Initialize routing engine
        route_finder = Route(graph=G, data_manager=factory.manager)
        
        # Calculate detailed route
        print(f"Route: {departure_port_name} → {arrival_port_name}")
        route_detail = route_finder.detailed_route(
            departure_point=departure_port.geometry,
            arrival_point=arrival_port.geometry,
            weight_key='adjusted_weight'  # Use final weighted edges
        )
        
        # Export route to GeoJSON
        output_path = output_dir / f"route_{vessel_params['draft']}m_draft.geojson"
        route_finder.save_detailed_route_to_file(route_detail, output_path=str(output_path))
        
        end_time = time.perf_counter()
        performance_metrics['Route Calculation'] = end_time - start_time
        
        print(f"\n✓ Route calculated in {performance_metrics['Route Calculation']:.2f}s")
        print(f"✓ Route exported to: {output_path}")

elif workflow_steps["run_pathfinding"] and G is None:
    print("⊘ Skipping route calculation (graph failed to load)")
else:
    print("⊘ Skipping route calculation")

## 9. Workflow Summary and Next Steps

Congratulations! You've completed the weighting and pathfinding pipeline. Here's what was accomplished:

### What You've Created

1. **Directed Graph**: Undirected graph converted to support directional routing, enabling traffic flow alignment
2. **Enriched Edges**: All edges now have S-57 feature attributes (depth, orientation, clearance, etc.)
3. **Three-Tier Weights**: Edges have static, directional, and dynamic weights combined into `adjusted_weight`
4. **Optimal Route**: A route computed using the final weighted graph that balances:
   - Safe passage (avoiding land, shallow areas, overhead hazards)
   - Traffic compliance (following fairways, TSS lanes, recommended tracks)
   - Vessel constraints (draft, height, type-specific preferences)
   - Environmental factors (weather, visibility, time of day)

### Understanding the Weights

The final `adjusted_weight` on each edge combines:
```
adjusted_weight = base_distance × blocking_factor × penalty_factor × bonus_factor × directional_weight
```

Where:
- **base_distance**: Original edge length (nautical miles)
- **blocking_factor**: Absolute obstacles (land, shallow water) - high = impassable
- **penalty_factor**: Areas to avoid (environmental conditions, hazards)
- **bonus_factor**: Preferred areas (fairways, TSS lanes, dredged channels) - <1.0 = encouraged
- **directional_weight**: Traffic flow alignment (follow one-way lanes, align with fairways)

### Key Differences: File-Based vs PostGIS

This GeoPackage workflow provides:
- **Portability**: No database server required - all data in single .gpkg file
- **Performance**: SpatiaLite database-side operations (10-20x faster than processing the data in Python memory)
- **Simplicity**: Works on any system with GDAL/GeoPandas, no PostgreSQL setup needed
- **Scalability**: Can handle graphs with millions of nodes/edges efficiently

### Next Steps

**For Further Analysis:**
- Examine route segments in QGIS to understand routing decisions
- Compare routes with different vessel parameters (draft, height)
- Analyze weight distributions to identify bottleneck areas
- Validate against real-world maritime practices

**For Production Use:**
- Store final weighted graphs in version control (small .gpkg files)
- Create multiple graphs for different regions/scales
- Build batch routing workflows for multiple vessel types
- Integrate routes into navigation systems, ETA calculators, or fuel estimation tools
- Update weighting factors based on operational experience and feedback

**For Performance Optimization:**
- Review benchmark metrics (`benchmark_graph_weighted_directed_gpkg.csv`) to identify slow steps
- Consider which features are most impactful for your area of operations
- Experiment with different vessel parameters to understand weight sensitivities
- Monitor edge enrichment performance (usually the longest step)
- For very large graphs (>1M nodes), consider splitting by region

### GeoPackage-Specific Tips

**File Management:**
- Keep backup copies of enriched graphs (expensive to recreate)
- Compress .gpkg files with 7zip for archival (typically 70-80% reduction)
- Use `layer='edges'` or `layer='nodes'` when reading from multi-layer GeoPackage

**Performance Tuning:**
- Increase `ram_cache_mb` in enrichment step for faster processing on systems with ample RAM
- Ensure R-tree spatial indexes exist on the ENC layers; with `skip_layers_without_rtree=True`, unindexed layers are skipped rather than queried slowly
- SpatiaLite SQL typically runs 10-20x faster than Python operations for spatial queries

**Interoperability:**
- GeoPackage files open in QGIS, ArcGIS, and most GIS software
- Use `export_postgis_to_gpkg()` to convert a PostGIS graph to GeoPackage when moving between server-based and file-based workflows
- Layer names and column names are preserved for compatibility

### Performance Expectations

Typical timing for different graph sizes (fine grid at 0.02-0.1 NM spacing):
- **Small graphs** (10K-50K nodes): 5-15 min total pipeline
- **Medium graphs** (50K-200K nodes): 15-45 min total pipeline  
- **Large graphs** (200K-1M nodes): 1-3 hours total pipeline
- **Edge enrichment** typically accounts for about 70% of total time; directional/dynamic weights are fast

See the performance summary and benchmarks below for your specific graph's timing.

In [None]:
if performance_metrics:
    # Create performance DataFrame
    perf_df = pd.DataFrame(
        list(performance_metrics.items()),
        columns=['Step', 'Time (seconds)']
    )
    perf_df = perf_df.sort_values(by='Time (seconds)', ascending=False)
    
    # Create interactive bar chart
    fig = px.bar(
        perf_df,
        x='Step',
        y='Time (seconds)',
        title='GeoPackage Maritime Graph Pipeline Performance',
        text_auto='.2f',
        labels={'Step': 'Pipeline Step', 'Time (seconds)': 'Time (seconds)'}
    )
    fig.update_traces(textposition='outside')
    fig.show()
    
    # Print summary
    print("\n" + "="*60)
    print("PERFORMANCE SUMMARY")
    print("="*60)
    total_time = sum(performance_metrics.values())
    print(f"Total Pipeline Time: {total_time:.2f}s ({total_time/60:.2f} minutes)")
    print("\nStep-by-step breakdown:")
    for step, time_val in sorted(performance_metrics.items(), key=lambda x: x[1], reverse=True):
        # Guard against division by zero when every recorded time is 0
        percentage = (time_val / total_time) * 100 if total_time > 0 else 0.0
        print(f"  • {step:.<30} {time_val:>6.2f}s ({percentage:>5.1f}%)")
    print("="*60)
else:
    print("No performance metrics recorded. Run workflow steps to generate summary.")

### Export Benchmark Data

Save detailed performance metrics to CSV for long-term tracking and comparison across different configurations.

In [None]:
if performance_metrics:
    from datetime import datetime
    
    # Get graph statistics
    if G is not None:
        node_count = G.number_of_nodes()
        edge_count = G.number_of_edges()
    else:
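        try:
            # Fall back to counting rows in the file when the graph is not in memory
            edges_gdf = gpd.read_file(graph_file_directed, layer='edges', read_geometry=False)
            nodes_gdf = gpd.read_file(graph_file_directed, layer='nodes', read_geometry=False)
            node_count = len(nodes_gdf)
            edge_count = len(edges_gdf)
        except Exception as exc:
            # Avoid a bare except so KeyboardInterrupt/SystemExit still propagate
            print(f"Could not read graph statistics from file: {exc}")
            node_count = 0
            edge_count = 0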
    
    # Calculate normalized metrics (per 100K nodes)
    time_per_100k_nodes = {}
    if node_count > 0:
        for step, time_val in performance_metrics.items():
            time_per_100k_nodes[step] = (time_val / node_count) * 100000
    
    # Build benchmark record
    benchmark_record = {
        'timestamp': datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
        'workflow': 'graph_weighted_directed_gpkg_v2',
        'graph_name': graph_name_directed,
        'node_count': node_count,
        'edge_count': edge_count,
        'vessel_draft_m': vessel_params['draft'],
        'vessel_height_m': vessel_params['height'],
        'vessel_type': vessel_params['vessel_type'],
        'weather_factor': env_conditions['weather_factor'],
        'enc_count': len(enc_list),
        # Timing metrics (seconds)
        'conversion_to_directed_sec': performance_metrics.get('Conversion to Directed', 0),
        'edge_enrichment_sec': performance_metrics.get('Edge Enrichment', 0),
        'static_weights_sec': performance_metrics.get('Static Weights', 0),
        'directional_weights_sec': performance_metrics.get('Directional Weights', 0),
        'dynamic_weights_sec': performance_metrics.get('Dynamic Weights', 0),
        'graph_loading_sec': performance_metrics.get('Graph Loading', 0),
        'route_calculation_sec': performance_metrics.get('Route Calculation', 0),
        # Normalized metrics (per 100K nodes)
        'conversion_per_100k_nodes': time_per_100k_nodes.get('Conversion to Directed', 0),
        'enrichment_per_100k_nodes': time_per_100k_nodes.get('Edge Enrichment', 0),
        'static_weights_per_100k_nodes': time_per_100k_nodes.get('Static Weights', 0),
        'directional_weights_per_100k_nodes': time_per_100k_nodes.get('Directional Weights', 0),
        'dynamic_weights_per_100k_nodes': time_per_100k_nodes.get('Dynamic Weights', 0),
        'total_pipeline_sec': sum(performance_metrics.values()),
    }
    
    # Convert to DataFrame
    benchmark_df = pd.DataFrame([benchmark_record])
    
    # Define CSV path
    benchmark_csv = output_dir / 'benchmark_graph_weighted_directed_gpkg.csv'
    
    # Append or create CSV
    if benchmark_csv.exists():
        existing_df = pd.read_csv(benchmark_csv)
        combined_df = pd.concat([existing_df, benchmark_df], ignore_index=True)
        combined_df.to_csv(benchmark_csv, index=False)
        print(f"\n✓ Appended benchmark to: {benchmark_csv}")
        print(f"  Total records: {len(combined_df)}")
    else:
        benchmark_df.to_csv(benchmark_csv, index=False)
        print(f"\n✓ Created benchmark file: {benchmark_csv}")
    
    # Display current benchmark
    print("\n" + "="*60)
    print("BENCHMARK RECORD")
    print("="*60)
    print(f"Timestamp:     {benchmark_record['timestamp']}")
    print(f"Workflow:      {benchmark_record['workflow']}")
    print("Data Source:   GeoPackage")
    print(f"Graph:         {benchmark_record['graph_name']}")
    print(f"Nodes:         {benchmark_record['node_count']:,}")
    print(f"Edges:         {benchmark_record['edge_count']:,}")
    print(f"Vessel:        {benchmark_record['vessel_type']} "
          f"(draft={benchmark_record['vessel_draft_m']}m, "
          f"height={benchmark_record['vessel_height_m']}m)")
    print(f"ENCs:          {benchmark_record['enc_count']}")
    print(f"Total Time:    {benchmark_record['total_pipeline_sec']:.2f}s")
    print("="*60)
else:
    print("No performance metrics to export.")
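
Once several runs have been appended, the CSV can be read back to compare configurations. The sketch below uses only the standard library so it also runs outside the notebook; it relies on the `timestamp`, `node_count`, and `total_pipeline_sec` columns written above, and the helper name `summarize_benchmarks` is hypothetical.

```python
import csv


def summarize_benchmarks(csv_path):
    """Return (timestamp, node_count, total_sec, sec_per_100k_nodes) per run."""
    summary = []
    with open(csv_path, newline="") as handle:
        for record in csv.DictReader(handle):
            nodes = int(float(record["node_count"]))
            total = float(record["total_pipeline_sec"])
            # Normalize to 100K nodes so runs on different graphs are comparable;
            # runs with an unknown node count (0) get None.
            per_100k = total * 100_000 / nodes if nodes > 0 else None
            summary.append((record["timestamp"], nodes, total, per_100k))
    return summary
```

Sorting the returned list by the last field gives a rough ranking of configurations by normalized cost, though graphs with very different edge densities may not be directly comparable.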