# Fine-Resolution Graph Creation from GeoPackage/SpatiaLite

This notebook demonstrates advanced maritime navigation graph creation using high-resolution grids and H3 hexagonal tessellation with file-based backends (GeoPackage or SpatiaLite).

The workflow covers:
1. **Base Route Loading**: Import an existing base route to focus graph creation on relevant areas.
2. **Route Buffering**: Create a buffer zone around the route to define the area of interest.
3. **Area Slicing**: Optionally reduce the AOI to a specific geographic region for testing/optimization.
4. **Fine Grid Generation**: Create high-density navigable grids (0.1-0.3 NM spacing) or H3 hexagonal grids.
5. **Graph Construction**: Build NetworkX graphs with connectivity analysis and component filtering.
6. **Efficient Persistence**: Save graphs to GeoPackage format.
7. **Route Calculation**: Compute optimal paths on the fine-resolution graph.

This notebook builds upon the base graph workflow by adding:
- **Higher resolution grids** for precise routing near coastlines, channels, and harbors
- **H3 hexagonal tessellation** as an alternative to regular square grids
- **Performance optimization** techniques for handling large graphs (millions of nodes/edges)
- **File-based backends** (GeoPackage/SpatiaLite) for portability and offline use

---

## Required Data

This notebook requires:
1. **Base Route**: Pre-computed route from base graph workflow (stored in GeoPackage)
2. **ENC Data**: S-57 charts converted to GeoPackage or SpatiaLite format
3. **Graph Configuration**: YAML config file defining layers and H3 settings

**Setup Instructions:**
See `docs/SETUP.md` for converting S-57 charts to GeoPackage/SpatiaLite.

**Troubleshooting:**
If you encounter issues, see `docs/TROUBLESHOOTING.md` for common problems and solutions.

In [1]:
# =================================================================
# Notebook Settings
# =================================================================
# This cell contains all configurable parameters for the notebook.
# Modify these settings to customize the graph creation workflow.

# --- General Workflow Flags ---
# Toggle major workflow steps on/off for testing or partial runs
buffer_sliced = True           # If True, slice buffer to reduce area (for testing/optimization)
graph_mode = "fine"            # Graph type: "fine" (regular grid) or "h3" (hexagonal tessellation)
save_gpkg = True              # Save graph to GeoPackage file
calc_route = True             # Calculate and visualize a route on the created graph

# --- Data Sources & Paths ---
base_route_name = "base_route"
base_route_table = "base_routes"
config_file_path = "src/maritime_module/data/graph_config.yml"

# --- Geographic & Route Parameters ---
route_buffer_size_nm = 24.0           # Buffer distance around route in nautical miles
slice_south_degree = 37.0             # Southern latitude limit for buffer slicing
departure_port_name = "Pilot"         # Starting point name
arrival_port_name = "San Francisco"   # Ending point name

# --- Fine Graph Settings ---
# Parameters for regular grid graph creation
fine_grid_spacing_nm = 0.2            # Node spacing in nautical miles
fine_grid_max_points = 1000000        # Safety limit to prevent excessive memory usage
fine_graph_max_edge_factor = 2.0      # Max edge length multiplier (also used for bridging)
fine_graph_bridge_components = True  # Bridge disconnected components (useful for spacing <0.1 NM)

# --- H3 Graph Settings ---
keep_largest_component = True         # Keep only the largest connected component

def _to_str(value_to_format: float) -> str:
    """
    Helper function to format a numeric value for file naming.

    It converts a float into a string representation of the value multiplied
    by 100, formatted to at least two digits with leading zeros. This is
    useful for creating consistent file suffixes from grid spacing values.

    Examples:
    - 0.1   -> "10"
    - 0.05  -> "05"
    - 1.0   -> "100"
    - 0.25  -> "25"

    Args:
        value_to_format (float): The numeric value to format.

    Returns:
        str: A string representation of the value * 100, zero-padded to at least 2 digits.
    """
    value = value_to_format * 100
    int_value = int(round(value))
    return f"{int_value:02d}"

# --- Output & Saving ---
gpkg_h3_name_suffix = "gpkg_6_11"
fine_grid_name_suffix = _to_str(fine_grid_spacing_nm)


## 2. Environment Setup and Imports

This section sets up the Python environment and imports all necessary libraries for graph creation, geospatial processing, and visualization.

In [2]:
import os
import sys
from pathlib import Path
import time
import pandas as pd
import plotly.express as px

import plotly.graph_objects as go
import plotly.io as pio
from dotenv import load_dotenv

# Add the project root to the Python path so that `src.` imports resolve
project_root = Path.cwd().parent.parent
if str(project_root) not in sys.path:
    sys.path.append(str(project_root))

from src.maritime_module.core.graph import (FineGraph, GraphConfigManager,
                                            H3Graph)
from src.maritime_module.core.pathfinding_lite import Route
from src.maritime_module.core.s57_data import (ENCDataFactory,
                                                S57AdvancedConfig)
from src.maritime_module.utils.geometry_utils import Buffer, Slicer
from src.maritime_module.utils.plot_utils import PlotlyChart
from src.maritime_module.utils.port_utils import Boundaries, PortData

# Load environment variables from .env file at the project root
load_dotenv(project_root / ".env")
pio.renderers.default = "notebook_connected"

# Define paths for data and output
output_dir = Path.cwd() / 'output'
output_dir.mkdir(exist_ok=True)

# Define the ENC database file (stored in the output directory)
data_file = output_dir / "us_enc_all.gpkg"


print(f"Output directory: {output_dir}")
print(f"Data file: {data_file}")

# --- Performance Tracking ---
performance_metrics = {}


Output directory: /home/vikont_tux/python_projects_wsl2/1_MaritimeModule_V1/docs/notebooks/output
Data file: /home/vikont_tux/python_projects_wsl2/1_MaritimeModule_V1/docs/notebooks/output/us_enc_all.gpkg


## 3. Initialize GeoPackage/SpatiaLite Data Factory

In [3]:
gpkg_factory = ENCDataFactory(source=data_file)

2025-10-29 11:58:57,327 - src.maritime_module.core.s57_data - INFO - Source is a .gpkg file, initializing GPKGManager.
2025-10-29 11:58:57,328 - src.maritime_module.core.s57_data - INFO - Routes will be managed in: /home/vikont_tux/python_projects_wsl2/1_MaritimeModule_V1/docs/notebooks/output/maritime_routes.gpkg
2025-10-29 11:58:57,329 - src.maritime_module.core.s57_data - INFO - Successfully connected to GeoPackage 'us_enc_all.gpkg'


## 4. Initialize Plotly Visualization

In [4]:
# --- Create Base Plotly Map ---
# Initialize the plotting utility and create an interactive base map
# The Mapbox token is loaded from .env for security (not hardcoded)
ply = PlotlyChart()
ply_fig = ply.create_base_map(mapbox_token=os.getenv('MAPBOX_TOKEN'))
ply.plotly_base_config(ply_fig)

{'displayModeBar': 'hover',
 'responsive': True,
 'scrollZoom': True,
 'displaylogo': False,
 'modeBarButtonsToRemove': ['zoomIn',
  'zoomOut',
  'pan',
  'select',
  'lassoSelect'],
 'modeBarButtonsToAdd': ['autoScale', 'hoverCompareCartesian']}

## 5. Base Route Loading and Area of Interest Definition

In [5]:
start_time = time.perf_counter()
base_route = gpkg_factory.load_route(route_name=base_route_name,
                                     table_name=base_route_table)
end_time = time.perf_counter()
performance_metrics['Load Base Route'] = end_time - start_time
print(f"Loading base route took: {end_time - start_time:.2f}s")

# This performs a deep copy, ensuring that changes to the copy do not affect the original.
ply_route = go.Figure(ply_fig)

ply.add_route_trace(figure=ply_route,
                    line=base_route,
                    name="Base Route")
ply_route.show()

Loading base route took: 0.05s


### 5.1 Create Buffer Around Route

In [6]:
start_time = time.perf_counter()
route_buffer = Buffer.create_buffer(base_route, route_buffer_size_nm)
end_time = time.perf_counter()
performance_metrics['Create Buffer'] = end_time - start_time
print(f"Creating buffer took: {end_time - start_time:.2f}s")

2025-10-29 11:58:57,599 - src.maritime_module.utils.geometry_utils - INFO - Creating buffer of 24.0 nm (0.400000 degrees) for LineString.
Creating buffer took: 0.03s
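
The log line above reports the 24 NM buffer as 0.4 degrees, which matches the convention that one nautical mile is about one arc-minute of latitude. The sketch below reproduces that arithmetic. The helper names `nm_to_degrees` and `nm_to_degrees_lon` are hypothetical and are not part of `Buffer`; how the library handles the longitude direction is not shown in this notebook.

```python
import math

NM_PER_DEGREE_LAT = 60.0  # 1 degree of latitude is about 60 nautical miles


def nm_to_degrees(distance_nm: float) -> float:
    """Convert nautical miles to degrees of latitude (hypothetical helper)."""
    return distance_nm / NM_PER_DEGREE_LAT


def nm_to_degrees_lon(distance_nm: float, lat_deg: float) -> float:
    """Degrees of longitude covered by distance_nm at latitude lat_deg."""
    return distance_nm / (NM_PER_DEGREE_LAT * math.cos(math.radians(lat_deg)))


buffer_deg = nm_to_degrees(24.0)
print(f"{buffer_deg:.6f}")  # 0.400000, as in the log line above

# Meridians converge toward the poles, so the same distance spans more
# degrees of longitude than latitude away from the equator.
print(f"{nm_to_degrees_lon(24.0, 37.5):.6f}")
```

A buffer applied as a constant number of degrees is therefore somewhat narrower east-west than north-south at this latitude, which is worth keeping in mind when choosing `route_buffer_size_nm`.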


In [7]:
ply.add_polygon_trace(fig=ply_route,
                      polygon=route_buffer,
                      name="Base Route Buffer")
ply_route.show()

### 5.2 Slice Buffer (Optional Area Reduction)

In [8]:
start_time = time.perf_counter()
sliced_buffer = Slicer.slice_by_bbox(route_buffer, south=slice_south_degree)
end_time = time.perf_counter()
performance_metrics['Slice Buffer'] = end_time - start_time
print(f"Slicing buffer took: {end_time - start_time:.2f}s")
ply.add_polygon_trace(fig=ply_route,
                      polygon=sliced_buffer,
                      name="Sliced Buffer")
ply_route.show()

2025-10-29 11:58:57,764 - src.maritime_module.utils.geometry_utils - INFO - Slicing Polygon with bounding box: N=38.2167, E=-117.8370, S=37.0000, W=-122.9220
Slicing buffer took: 0.00s
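
Slicing with `south=37.0` keeps only the part of the buffer at or north of 37° N. As a rough illustration of the geometry involved, the sketch below clips a polygon ring against a single horizontal line using one pass of Sutherland-Hodgman clipping. The function `clip_south` and the rectangle are hypothetical; `Slicer.slice_by_bbox` may work differently (for example, by intersecting with a bounding box polygon), and this single-edge version is only reliable for simple shapes.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (lon, lat)


def clip_south(ring: List[Point], south: float) -> List[Point]:
    """Keep the part of an open polygon ring with lat >= south."""
    clipped: List[Point] = []
    n = len(ring)
    for i in range(n):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
        inside1, inside2 = y1 >= south, y2 >= south
        if inside1:
            clipped.append((x1, y1))
        if inside1 != inside2:
            # The edge crosses the cut line: add the intersection point.
            t = (south - y1) / (y2 - y1)
            clipped.append((x1 + t * (x2 - x1), south))
    return clipped


# Hypothetical rectangular "buffer" straddling 37° N
ring = [(-123.0, 36.5), (-122.0, 36.5), (-122.0, 38.0), (-123.0, 38.0)]
clipped = clip_south(ring, south=37.0)
print(clipped)
# [(-122.0, 37.0), (-122.0, 38.0), (-123.0, 38.0), (-123.0, 37.0)]
```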


## 6. Graph Creation Prerequisites

In [9]:
# --- Define Geographic Points for Routing ---
# Initialize port data utility which merges World Port Index with custom ports.
# This allows using standard ports (e.g., "San Francisco") or defining custom
# test points (e.g., "Pilot") for route calculations.
port = PortData()

# Create or update a custom port entry.
# The 'if_exists' parameter controls behavior when a port already exists:
#   - 'update': Replace coordinates with new values
#   - 'skip': Keep existing port unchanged
#   - 'raise': Throw an error if port exists
port.create_custom_port(port_name=departure_port_name,
                        lon=-122.27,
                        lat=37.0,
                        if_exists='update')

# Get departure and arrival points by name.
# These will be used as endpoints for route calculation.
dep_point = port.get_port_by_name(departure_port_name)
arr_point = port.get_port_by_name(arrival_port_name)

# Create detailed DataFrames for inspection (optional)
port1_df = port.get_port_details_df(dep_point)
port2_df = port.get_port_details_df(arr_point)

# Print formatted port information
print(port.format_port_string(dep_point))
print(port.format_port_string(arr_point))

# --- Visualize Ports on Map ---
# Add departure port (blue) and arrival port (red) to the map
ply.add_single_port_trace(ply_route, dep_point, name=dep_point['PORT_NAME'], color='blue')
ply.add_single_port_trace(ply_route, arr_point, name=arr_point['PORT_NAME'], color='red')
ply_route.show()

2025-10-29 11:58:58,241 - src.maritime_module.utils.port_utils - INFO - Loaded 7 custom ports. Merging with standard ports.
2025-10-29 11:58:58,255 - src.maritime_module.utils.port_utils - INFO - Port 'Pilot' exists. Updating.
2025-10-29 11:58:58,272 - src.maritime_module.utils.port_utils - INFO - Updated attributes for custom port 'Pilot'.
2025-10-29 11:58:58,277 - src.maritime_module.utils.port_utils - INFO - Loaded 7 custom ports. Merging with standard ports.
PILOT, CUSTOM (LAT: 37° 0.0' N  LONG: 122° 16.19999999999976' W)
SAN FRANCISCO, US (LAT: 37° 49.0' N  LONG: 122° 25.0' W)
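
The departure port's longitude prints as `16.19999999999976'` because the minutes are derived from a binary floating-point value without rounding. A minimal sketch of a degrees and decimal-minutes formatter that rounds before splitting is shown below. `format_dm` is a hypothetical helper, not the implementation behind `format_port_string`.

```python
def format_dm(value: float, is_lat: bool, decimals: int = 1) -> str:
    """Format decimal degrees as degrees and decimal minutes with a hemisphere letter."""
    if is_lat:
        hemisphere = "N" if value >= 0 else "S"
    else:
        hemisphere = "E" if value >= 0 else "W"
    # Round the total minutes first so that 59.96' carries into the next degree
    # instead of printing as 60.0'.
    total_minutes = round(abs(value) * 60, decimals)
    degrees, minutes = divmod(total_minutes, 60)
    return f"{int(degrees)}° {minutes:.{decimals}f}' {hemisphere}"


print(format_dm(37.0, is_lat=True))      # 37° 0.0' N
print(format_dm(-122.27, is_lat=False))  # 122° 16.2' W
```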


### 6.2 Load Graph Configuration and Filter ENCs

In [10]:
# --- Select Active Buffer (Full or Sliced) ---
# Choose which buffer polygon to use based on the buffer_sliced setting.
# Slicing is useful for:
#   - Testing workflows on smaller areas (faster iteration)
#   - Reducing memory requirements for experimentation
#   - Focusing on specific geographic regions of interest
if buffer_sliced:
    active_buffer = sliced_buffer
else:
    active_buffer = route_buffer

# --- Load Graph Configuration from YAML ---
# The config file (graph_config.yml) defines:
# - Navigable layers: Define safe water areas for routing
#   * seaare: Main sea areas from ENC charts
#   * fairwy: Designated fairways and channels
#   * drgare: Dredged areas (maintained depth)
#   * tsslpt: Traffic separation scheme lanes
#   * prcare: Precautionary areas
# - Obstacle layers: Define hazards to subtract from navigable areas
#   * lndare: Land areas (hard obstacles)
#   * slcons: Shoreline constructions (piers, breakwaters)
#   * obstrn: Obstructions (underwater hazards)
# - H3 hexagon settings: Resolution ranges and connectivity rules
#   * Multi-resolution hierarchy (typically 6-11 for coastal navigation)
#   * Bridge connectivity parameters for seamless multi-scale routing
config_path = project_root / config_file_path
config_manager = GraphConfigManager(config_path)

# --- Filter ENCs by Boundary ---
# Query GeoPackage to find all ENCs that intersect our area of interest.
# This spatial filtering is CRITICAL for performance:
#   - Reduces data processing to relevant charts only
#   - Prevents loading unnecessary ENC data (thousands of charts)
#   - Typical workflow: 6000+ total ENCs → 20-50 relevant ENCs
# The method queries the enc_summary layer which contains bounding boxes
# for all available charts, using spatial filtering to identify intersections.
start_time = time.perf_counter()
enc_list = gpkg_factory.get_encs_by_boundary(active_buffer)
end_time = time.perf_counter()
performance_metrics['ENC Filtering'] = end_time - start_time
print(f"ENC filtering took: {end_time - start_time:.2f}s")

2025-10-29 11:58:58,350 - src.maritime_module.core.s57_data - INFO - Factory: Filtering ENCs by boundary...
2025-10-29 11:58:58,351 - src.maritime_module.core.s57_data - INFO - Factory: Getting ENC summary...
2025-10-29 11:58:58,430 - src.maritime_module.core.s57_data - INFO - Factory: Getting bounding boxes for 6369 ENCs...
ENC filtering took: 1.02s
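
The filtering step narrows thousands of charts down to those whose extents touch the area of interest. The sketch below shows the core idea as a bounding-box overlap test. The chart names, extents, and helper functions are hypothetical, and `get_encs_by_boundary` may apply a more precise polygon intersection after (or instead of) this kind of prefilter.

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (west, south, east, north)


def bboxes_intersect(a: BBox, b: BBox) -> bool:
    """True if two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def filter_encs(enc_bboxes: Dict[str, BBox], aoi: BBox) -> List[str]:
    """Return the names of charts whose bounding box overlaps the AOI box."""
    return sorted(name for name, box in enc_bboxes.items()
                  if bboxes_intersect(box, aoi))


# Hypothetical chart names and extents
enc_bboxes = {
    "CHART_A": (-123.5, 36.5, -121.5, 38.5),  # overlaps the AOI
    "CHART_B": (-122.6, 37.7, -122.3, 37.9),  # fully inside the AOI
    "CHART_C": (-75.0, 38.0, -74.0, 39.0),    # far from the AOI
}
aoi = (-122.922, 37.0, -122.0, 38.2167)

print(filter_encs(enc_bboxes, aoi))  # ['CHART_A', 'CHART_B']
```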


## 7. Fine Graph Creation (Regular Grid)

In [11]:
# --- Extract Layer Configuration from YAML ---
# Get the layer settings from the config file
layers_config = config_manager.get_value("layers")

# Extract the specific lists for navigable and obstacle layers.
# These layer definitions follow IHO S-57 standard object classes:
#
# Navigable layers define safe water areas (UNION operation):
#   - seaare: General sea areas (primary coverage)
#   - fairwy: Designated fairways with maintained depth
#   - drgare: Dredged areas with known depth
#   - tsslpt: Traffic separation lanes (routing corridors)
#   - prcare: Precautionary areas (navigable but requiring caution)
#
# Obstacle layers define hazards (SUBTRACTION operation):
#   - lndare: Land areas (absolute obstacles)
#   - slcons: Shoreline constructions (piers, jetties, breakwaters)
#   - obstrn: Underwater obstructions (wrecks, rocks, etc.)
#
# The final navigable grid = (UNION of navigable) - (UNION of obstacles)
navigable_layers_config = layers_config.get('navigable', [])
obstacle_layers_config = layers_config.get('obstacles', [])

# --- Create Fine Grid ---
# The fine grid is built in both modes: "fine" uses it for graph construction,
# while "h3" uses its refined land and sea grids (from all usage bands) for
# static land filtering and for the grid layers saved alongside the graph.
start_time = time.perf_counter()

if graph_mode in ("fine", "h3"):
    # Initialize FineGraph class with GeoPackage data factory
    fg = FineGraph(data_factory=gpkg_factory,
                   route_schema_name="routes",
                   graph_schema_name="graph")

    # Create the navigable grid by processing S-57 layers iteratively.
    # The method uses a prioritized approach based on ENC usage bands:
    #   Band 1: Overview (1:1,500,000+) - Coarse ocean coverage
    #   Band 2: General (1:350,000-1:1,499,999) - Coastal approaches
    #   Band 3: Coastal (1:90,000-1:349,999) - Near-shore navigation
    #   Band 4: Approach (1:30,000-1:89,999) - Harbor approaches
    #   Band 5: Harbour (1:12,000-1:29,999) - Port areas
    #   Band 6: Berthing (<1:11,999) - Detailed harbor/dock areas
    #
    # Processing order: 1→6 (coarse to fine) to progressively refine coverage.
    # Each band's features are merged with existing grid, with higher priority
    # bands overwriting lower priority areas where they overlap.
    fg_grid = fg.create_fine_grid(route_buffer=active_buffer,
                                  enc_names=enc_list,
                                  navigable_layers=navigable_layers_config,
                                  obstacle_layers=obstacle_layers_config,
                                  return_geometries=True  # Ensure Shapely geometries are returned
                                  )
    end_time = time.perf_counter()
    performance_metrics['Fine Grid Creation'] = end_time - start_time
    print(f"Fine grid creation took: {end_time - start_time:.2f}s")

2025-10-29 11:58:59,442 - src.maritime_module.core.graph - INFO - Starting iterative grid creation for 6 usage bands
2025-10-29 11:58:59,442 - src.maritime_module.core.graph - INFO - Processing usage band 1 (Overview)...
2025-10-29 11:58:59,443 - src.maritime_module.core.s57_data - INFO - Factory: Getting layer 'seaare'...
2025-10-29 11:59:00,570 - src.maritime_module.core.graph - INFO - Added sea area from band 1 (Overview) to main grid
2025-10-29 11:59:00,571 - src.maritime_module.core.s57_data - INFO - Factory: Getting layer 'lndare'...
2025-10-29 11:59:01,695 - src.maritime_module.core.graph - INFO - Processing usage band 2 (General)...
2025-10-29 11:59:01,696 - src.maritime_module.core.s57_data - INFO - Factory: Getting layer 'seaare'...
2025-10-29 11:59:02,208 - src.maritime_module.core.graph - INFO - Added sea area from band 2 (General) to main grid
2025-10-29 11:59:02,208 - src.maritime_module.core.s57_data - INFO - Factory: Getting layer 'lndare'...
2025-10-29 11:59:02,834 - s
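
The rule stated in the cell above, navigable grid = (union of navigable layers) minus (union of obstacle layers), can be illustrated with plain Python sets. This is only a sketch: grid cells stand in for the polygon geometries that the real pipeline processes per usage band, and `compose_navigable` is a hypothetical name.

```python
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]  # (column, row) index of a grid cell


def compose_navigable(navigable: Dict[str, Set[Cell]],
                      obstacles: Dict[str, Set[Cell]]) -> Set[Cell]:
    """Union all navigable layers, then subtract the union of all obstacle layers."""
    sea = set().union(*navigable.values())
    hazards = set().union(*obstacles.values())
    return sea - hazards


# Hypothetical coverage per S-57 layer
navigable = {
    "seaare": {(0, 0), (0, 1), (1, 0), (1, 1)},
    "fairwy": {(1, 1), (2, 1)},
}
obstacles = {
    "lndare": {(0, 0)},
    "slcons": {(2, 1)},
}

result = compose_navigable(navigable, obstacles)
print(sorted(result))  # [(0, 1), (1, 0), (1, 1)]
```

Note that the obstacle subtraction wins over every navigable layer: cell `(2, 1)` is covered by a fairway but is still removed because a shoreline construction occupies it.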

### 7.2 Visualize Fine Grid Components

In [12]:
ply_fine_grid = go.Figure(ply_fig)
if graph_mode == "fine":
    ply.add_grid_trace(ply_fine_grid, name="Main Grid", grid_geojson=fg_grid["main_grid"], color="blue")
    ply.add_grid_trace(ply_fine_grid, name="Combined Grid", grid_geojson=fg_grid["combined_grid"], color="red")
    ply.add_grid_trace(ply_fine_grid, name="Land Grid", grid_geojson=fg_grid["land_grid"], color="yellow")
    ply_fine_grid.show()

### 7.3 Fine Graph Construction

In [13]:
# --- Create Fine-Resolution Graph from Grid ---
# This is the most computationally intensive step.
#
# Performance benchmarks (Buffer=24NM):
# - 0.3 NM spacing: ~1m 12s (158,803 nodes, 627,475 edges)
# - 0.2 NM spacing: ~2m 43s
# - 0.1 NM spacing: ~5m 59s
#
# Parameters:
# - spacing_nm: Distance between adjacent nodes in nautical miles
# - max_points: Safety limit to prevent excessive memory usage
# - max_edge_factor: Max edge length = spacing * max_edge_factor (also used for bridging)
# - bridge_components: If True, bridges disconnected components (recommended for spacing <0.1 NM)
# - keep_largest_component: Remove isolated node clusters for routing reliability
start_time = time.perf_counter()
if graph_mode == "fine":
    G_fine = fg.create_base_graph(grid_data=fg_grid["combined_grid"],
                         spacing_nm=fine_grid_spacing_nm,
                         max_points=fine_grid_max_points,
                         max_edge_factor=fine_graph_max_edge_factor,
                         bridge_components=fine_graph_bridge_components,
                         keep_largest_component=keep_largest_component)
    end_time = time.perf_counter()
    performance_metrics['Fine Graph Creation'] = end_time - start_time
    print(f"Fine graph creation took: {end_time - start_time:.2f}s")

2025-10-29 11:59:12,725 - src.maritime_module.core.graph - INFO - Grid data parsing completed in 0.281s
2025-10-29 11:59:12,726 - src.maritime_module.core.graph - INFO - Starting subgraph creation for MultiPolygon with area 1.095963 deg²
2025-10-29 11:59:12,726 - src.maritime_module.core.graph - INFO - Creating grid: 272x366 = 99,552 potential points
2025-10-29 11:59:12,727 - src.maritime_module.core.graph - INFO - Using database-side graph creation for improved performance
2025-10-29 11:59:12,727 - src.maritime_module.core.s57_data - INFO - Generating mesh grid for polygon...
2025-10-29 11:59:12,729 - src.maritime_module.core.s57_data - INFO - Filtering 99,552 potential points against polygon...
2025-10-29 11:59:12,761 - src.maritime_module.core.s57_data - INFO - Found 46,124 valid nodes.
2025-10-29 11:59:12,804 - src.maritime_module.core.s57_data - INFO - Generating edges...
2025-10-29 11:59:13,429 - src.maritime_module.core.graph - INFO - Database grid subgraph creation completed in
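
The log reports a 272x366 grid (99,552 candidate points) for 0.2 NM spacing. The sketch below shows how such a count follows from a bounding box and a spacing, assuming the same step of `spacing_nm / 60` degrees in both directions (the node coordinates in the routing log are consistent with a 1/300° step, but this is an inference, not a documented behavior). The latitude bounds come from the slicing log; the longitude bounds are illustrative values chosen to span about 0.9°, and `grid_shape` is a hypothetical helper.

```python
import math
from typing import Tuple


def grid_shape(west: float, south: float, east: float, north: float,
               spacing_nm: float) -> Tuple[int, int]:
    """Number of grid columns and rows for a regular lon/lat grid."""
    step_deg = spacing_nm / 60.0  # 1 NM is about 1 arc-minute
    n_cols = math.floor((east - west) / step_deg) + 1
    n_rows = math.floor((north - south) / step_deg) + 1
    return n_cols, n_rows


cols, rows = grid_shape(west=-123.005, south=37.0,
                        east=-122.1, north=38.2167,
                        spacing_nm=0.2)
print(cols, rows, cols * rows)  # 272 366 99552
```

Because the point count grows with the inverse square of the spacing, halving `fine_grid_spacing_nm` roughly quadruples the candidate points, which is why `fine_grid_max_points` exists as a safety limit.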

## 8. H3 Graph Creation (Hexagonal Tessellation)

In [14]:
# --- Create H3 Hexagonal Graph ---
# This section only runs if graph_mode is set to "h3"
#
# H3 Graph Advantages:
# -------------------------
# 1. Uniform neighbor distances: All adjacent hexagons are equidistant
#    (eliminates diagonal distance artifacts from square grids)
# 2. Uniform angular coverage: 6 equidistant neighbors at 60° steps, versus
#    8 neighbors at two different distances in square grids
# 3. Hierarchical resolution: Natural multi-scale structure
#    - Parent hexagons subdivide into 7 child hexagons
#    - Seamless bridging between resolution levels
# 4. Efficient connectivity: Natural connectivity across resolution scales
#
# Performance benchmarks (Buffer=24NM, Resolutions 6-11, GeoPackage):
# - H3 generation: ~1m 21s (947,961 hexagons)
# - Graph construction: ~1m 24s (2,832,202 edges)
# - Component selection: ~45s (945,918 final nodes)
# - Total: ~3m 30s
#
# The H3 graph creation process:
# 1. Generates hexagons at multiple resolutions within navigable areas
#    - Each resolution level provides different detail (coarse to fine)
#    - Resolution 6: ~36 km² per hexagon (ocean navigation)
#    - Resolution 11: ~2,150 m² per hexagon (harbor precision)
# 2. Subtracts obstacles from hexagon coverage (land, constructions, obstructions)
# 3. Creates graph edges between adjacent hexagons (same resolution)
# 4. Bridges between resolution levels for seamless navigation
#    - Connects coarse and fine areas without artificial boundaries
#    - Enables smooth transitions from ocean to harbor routing
# 5. Optionally keeps only the largest connected component
#    - Removes isolated nodes/islands for routing reliability

start_time = time.perf_counter()
if graph_mode == "h3":
    # Initialize H3Graph class with GeoPackage data factory
    h3 = H3Graph(data_factory=gpkg_factory,
                route_schema_name="routes",
                graph_schema_name="graph")

    # Load H3 settings from configuration YAML.
    # Includes resolution ranges, bridge settings, and connectivity rules:
    #   - resolution_ranges: Dict mapping usage bands to H3 resolution levels
    #   - connectivity.bridge_between_resolutions: Enable multi-scale bridging
    #   - connectivity.min_same_res_neighbors: Minimum same-res neighbors before bridging
    #   - connectivity.target_total_neighbors: Target total neighbor count with bridges
    #   - connectivity.max_bridge_distance_nm: Max distance for bridge connections
    h3_settings = config_manager.get_value("h3_settings")
    connectivity_config = h3_settings.get('connectivity', {})

    # Create H3 graph and grid in one operation.
    # Returns both the NetworkX graph and the hexagon GeoDataFrame for visualization.
    # The graph includes:
    #   - Node attributes: lon, lat, h3_index, resolution, geometry
    #   - Edge attributes: length (nautical miles), bridge (boolean flag)
    G_h3, h3_grid = h3.create_h3_graph(route_buffer=active_buffer,
                                       enc_names=enc_list,
                                       navigable_layers=navigable_layers_config,
                                       obstacle_layers=obstacle_layers_config,
                                       connectivity_config=connectivity_config,
                                       keep_largest_component=keep_largest_component)
    end_time = time.perf_counter()
    performance_metrics['H3 Graph Creation'] = end_time - start_time
    print(f"H3 graph creation took: {end_time - start_time:.2f}s")
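
To get a feel for the resolution range, average H3 cell sizes can be estimated without the `h3` library. H3 has 122 base cells and subdivides with aperture 7, which gives `2 + 120 * 7**res` cells at resolution `res`. Dividing the Earth's surface area by that count yields an average cell area. The sketch below assumes a spherical Earth and averages over all cells, including the 12 pentagons, so the figures are approximate. The helper names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)


def h3_cell_count(resolution: int) -> int:
    """Total number of H3 cells at a resolution (includes the 12 pentagons)."""
    return 2 + 120 * 7 ** resolution


def avg_cell_area_km2(resolution: int) -> float:
    """Average cell area: Earth's surface area divided by the cell count."""
    earth_area_km2 = 4.0 * math.pi * EARTH_RADIUS_KM ** 2
    return earth_area_km2 / h3_cell_count(resolution)


for res in (6, 9, 11):
    area_km2 = avg_cell_area_km2(res)
    print(f"res {res:2d}: {h3_cell_count(res):>18,d} cells, "
          f"~{area_km2 * 1e6:,.0f} m^2 per cell")
```

Each step up in resolution shrinks the average cell area by roughly a factor of 7, so the five steps from resolution 6 to 11 reduce it by about 7^5 = 16,807.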

## 9. Save Graph to GeoPackage

In [15]:
# --- Save Graph to GeoPackage File ---
# GeoPackage (.gpkg) is a SQLite-based format optimized for geospatial data.
# It creates separate layers for nodes and edges with spatial indexes.
#
# GeoPackage Advantages:
# ----------------------
# 1. Portability: Single-file format, easy to share and archive
# 2. No server required: Works offline, no database setup needed
# 3. Open standard: Supported by QGIS, ArcGIS, and most GIS tools
# 4. Fast read/write: SQLite backend with spatial indexing
# 5. Cross-platform: Works on Windows, Mac, Linux without modification
#
# Performance benchmarks (GeoPackage backend):
# - Fine graph (46K nodes, 181K edges): ~4s
# - H3 graph (945K nodes, 2.8M edges): ~2m
#
# The saved GeoPackage contains two layers:
#   - nodes: Point geometries with attributes (node_id, lon, lat, h3_index, resolution)
#   - edges: LineString geometries with attributes (source, target, length, bridge)
#
# Use cases:
#   - Opening in QGIS for visualization and analysis
#   - Sharing graphs with collaborators (no database setup required)
#   - Archiving graph snapshots for version control
#   - Loading back into Python with gpd.read_file() for further processing

start_time = time.perf_counter()
if save_gpkg:
    if graph_mode == "h3":
        name = f"{graph_mode}_graph_{gpkg_h3_name_suffix}.gpkg"
        h3.save_graph_to_gpkg(G_h3, output_path=output_dir / name)
        h3.save_grid_to_gpkg(fg_grid["land_grid_geom"], layer_name="land_grid", output_path=output_dir / name)
        h3.save_grid_to_gpkg(fg_grid["combined_grid_geom"], layer_name="sea_grid", output_path=output_dir / name)
    else:
        name = f"{graph_mode}_graph_{fine_grid_name_suffix}.gpkg"
        fg.save_graph_to_gpkg(G_fine, output_path=output_dir / name)
        fg.save_grid_to_gpkg(fg_grid["land_grid_geom"], layer_name="land_grid", output_path=output_dir / name)
        fg.save_grid_to_gpkg(fg_grid["combined_grid_geom"], layer_name="sea_grid", output_path=output_dir / name)
    end_time = time.perf_counter()
    performance_metrics['Save to GeoPackage'] = end_time - start_time
    print(f"Saving to GeoPackage took: {end_time - start_time:.2f}s")


2025-10-29 11:59:17,806 - src.maritime_module.core.graph - INFO - Saved 46,022 nodes to /home/vikont_tux/python_projects_wsl2/1_MaritimeModule_V1/docs/notebooks/output/fine_graph_20.gpkg in 1.899s
2025-10-29 11:59:21,311 - pyogrio._io - INFO - Created 180,521 records
2025-10-29 11:59:21,605 - src.maritime_module.core.graph - INFO - Saved 180,521 edges to /home/vikont_tux/python_projects_wsl2/1_MaritimeModule_V1/docs/notebooks/output/fine_graph_20.gpkg in 1.298s
2025-10-29 11:59:21,606 - src.maritime_module.core.graph - INFO - === Graph Save Operation Performance Summary ===
2025-10-29 11:59:21,606 - src.maritime_module.core.graph - INFO - Timing Metrics:
2025-10-29 11:59:21,606 - src.maritime_module.core.graph - INFO -   nodes_processing_time: 0.353s
2025-10-29 11:59:21,607 - src.maritime_module.core.graph - INFO -   nodes_save_time: 1.899s
2025-10-29 11:59:21,607 - src.maritime_module.core.graph - INFO -   edges_processing_time: 2.500s
2025-10-29 11:59:21,607 - src.maritime_module.cor
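
Because a GeoPackage is a SQLite database, its layers can be listed with the standard library alone by querying the `gpkg_contents` table that the GeoPackage standard requires. The sketch below is one way to check that a saved file contains the expected layers. `list_gpkg_layers` is a hypothetical helper, and the file name is taken from the save log above.

```python
import sqlite3
from contextlib import closing
from pathlib import Path
from typing import List, Tuple


def list_gpkg_layers(gpkg_path) -> List[Tuple[str, str]]:
    """Return (table_name, data_type) for every layer registered in a GeoPackage."""
    gpkg_path = Path(gpkg_path)
    if not gpkg_path.exists():
        # sqlite3.connect() would silently create an empty file otherwise.
        raise FileNotFoundError(gpkg_path)
    with closing(sqlite3.connect(str(gpkg_path))) as conn:
        rows = conn.execute(
            "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
        ).fetchall()
    return rows


demo_path = Path("output") / "fine_graph_20.gpkg"
if demo_path.exists():
    for table_name, data_type in list_gpkg_layers(demo_path):
        print(f"{table_name}: {data_type}")
```

For loading the actual geometries back into Python, `gpd.read_file()` with a `layer` argument remains the more convenient route, as noted in the cell above.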

## 10. Route Calculation on Fine Graph

In [16]:
# --- Calculate Route on Fine-Resolution Graph ---
# Use the A* algorithm to find the shortest path between departure and arrival points.
#
# A* Algorithm Benefits:
# ----------------------
# 1. Optimal pathfinding: Guaranteed to find shortest path (with admissible heuristic)
# 2. Efficient search: Uses heuristic to guide exploration toward goal
# 3. Graph-based: Works on any connected graph structure (fine grid or H3)
#
# The routing process:
# 1. Maps user-provided port coordinates to nearest graph nodes
# 2. Applies A* with Euclidean distance heuristic
# 3. Returns route geometry (LineString) and total distance (nautical miles)
#
# The current implementation uses distance-only weighting: each edge's cost is
# its length, with no other factors applied. Future enhancements can add:
#   - Traffic separation scheme priorities
#   - Depth-based routing (avoid shallow areas)
#   - Weather/current integration
#   - Vessel-specific constraints (draft, size)

start_time = time.perf_counter()
if calc_route:
    # Get the departure and arrival port geometries
    dep_point = port.get_port_by_name(departure_port_name)
    arr_point = port.get_port_by_name(arrival_port_name)

    # Select the appropriate graph based on mode
    if graph_mode == 'h3':
        graph_for_routing = G_h3
    else:
        graph_for_routing = G_fine

    # Initialize Route class with the graph and data manager.
    # The data manager provides GeoPackage connectivity for saving/loading routes.
    route = Route(graph=graph_for_routing, data_manager=gpkg_factory.manager)
    
    # Compute the route using A* algorithm.
    # The method automatically:
    #   1. Maps port coordinates to nearest graph nodes using spatial index
    #   2. Validates start/end nodes are in the graph
    #   3. Runs A* pathfinding with distance-based weighting
    #   4. Converts node sequence to route geometry (LineString)
    #   5. Calculates total distance in nautical miles
    route_geometry, distance = route.base_route(
        departure_point=dep_point.geometry,
        arrival_point=arr_point.geometry,
    )
    end_time = time.perf_counter()
    performance_metrics['Route Calculation'] = end_time - start_time
    print(f"Route calculation took: {end_time - start_time:.2f}s")


2025-10-29 11:59:21,981 - src.maritime_module.core.pathfinding_lite - INFO - Computing base route with Astar...
2025-10-29 11:59:21,981 - src.maritime_module.core.pathfinding_lite - INFO - Computing A* route...
2025-10-29 11:59:22,695 - src.maritime_module.core.pathfinding_lite - INFO - Mapped start point to graph node: (np.float64(-122.26866666666753), np.float64(37.00333333333333))
2025-10-29 11:59:22,696 - src.maritime_module.core.pathfinding_lite - INFO - Mapped end point to graph node: (np.float64(-122.41533333333406), np.float64(37.816666666665924))
2025-10-29 11:59:22,866 - src.maritime_module.core.pathfinding_lite - INFO - Successfully computed route with 265 nodes.
2025-10-29 11:59:22,867 - src.maritime_module.core.pathfinding_lite - INFO - Route computed successfully. Total distance: 58.38 nautical miles.
Route calculation took: 0.89s
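
The routing steps described above (snap the ports to the nearest nodes, then run A* with a straight-line heuristic) can be sketched with the standard library. This is a generic illustration on a tiny planar grid, not the code in `pathfinding_lite`; `nearest_node`, `astar`, and the toy graph are hypothetical. The heuristic is admissible here because edge weights and the heuristic use the same planar units. With edge lengths in nautical miles and node coordinates in degrees, the heuristic would need to be converted to the same unit.

```python
import heapq
import math
from typing import Dict, Iterable, List, Tuple

Node = Tuple[float, float]


def euclidean(a: Node, b: Node) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def nearest_node(nodes: Iterable[Node], point: Node) -> Node:
    """Brute-force snap of a point to the closest graph node."""
    return min(nodes, key=lambda n: euclidean(n, point))


def astar(adj: Dict[Node, Dict[Node, float]], start: Node,
          goal: Node) -> Tuple[List[Node], float]:
    """A* search; returns (path, total cost) or raises if unreachable."""
    open_heap = [(euclidean(start, goal), 0.0, start)]
    best_cost = {start: 0.0}
    parent: Dict[Node, Node] = {}
    while open_heap:
        _, cost, node = heapq.heappop(open_heap)
        if node == goal:
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return path[::-1], cost
        if cost > best_cost.get(node, math.inf):
            continue  # stale heap entry, a cheaper route was found already
        for nbr, weight in adj[node].items():
            new_cost = cost + weight
            if new_cost < best_cost.get(nbr, math.inf):
                best_cost[nbr] = new_cost
                parent[nbr] = node
                heapq.heappush(open_heap,
                               (new_cost + euclidean(nbr, goal), new_cost, nbr))
    raise ValueError("goal is not reachable from start")


# Toy 3x3 grid with the center cell blocked (an "obstacle"), 8-neighbor edges
nodes = {(x, y) for x in range(3) for y in range(3)} - {(1, 1)}
adj: Dict[Node, Dict[Node, float]] = {n: {} for n in nodes}
for (x, y) in nodes:
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            nbr = (x + dx, y + dy)
            if nbr != (x, y) and nbr in nodes:
                adj[(x, y)][nbr] = euclidean((x, y), nbr)

start = nearest_node(nodes, (0.2, -0.1))  # snaps to (0, 0)
goal = nearest_node(nodes, (1.9, 2.2))    # snaps to (2, 2)
path, cost = astar(adj, start, goal)
print(len(path), round(cost, 4))  # 4 3.4142
```

The route detours around the blocked center cell with two straight steps and one diagonal, for a cost of 2 + √2.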


### 10.2 Visualize Computed Route

In [17]:
if calc_route:
    ply_fine_route = go.Figure(ply_fig)
    ply.add_route_trace(figure=ply_fine_route,
                        line=route_geometry,
                        name="Base Route")
    ply.add_single_port_trace(ply_fine_route, dep_point, name=dep_point['PORT_NAME'], color='blue')
    ply.add_single_port_trace(ply_fine_route, arr_point, name=arr_point['PORT_NAME'], color='red')
    ply_fine_route.show()


## 11. Performance Summary and Benchmark Export

This section visualizes the time taken for each step of the pipeline.

In [18]:
# --- Export Performance Benchmarks to CSV ---
# Save detailed performance metrics to CSV for long-term tracking and analysis
# Uses unified schema with PostGIS notebook for cross-backend comparison
if performance_metrics:
    from datetime import datetime
    
    # Determine graph type and get node/edge counts
    if graph_mode == "h3":
        final_graph = G_h3
    else:
        final_graph = G_fine
    
    node_count = final_graph.number_of_nodes()
    edge_count = final_graph.number_of_edges()
    
    # Calculate normalized metrics (per 100K nodes)
    time_per_100k_nodes = {}
    if node_count > 0:
        for step, time_val in performance_metrics.items():
            time_per_100k_nodes[step] = (time_val / node_count) * 100000
    
    # Build benchmark record with metadata and metrics (unified schema)
    benchmark_record = {
        'timestamp': datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
        'workflow': 'graph_fine_GPKG_v2',
        'graph_mode': graph_mode,
        'db_schema': '',  # Not applicable for file-based backends
        'node_count': node_count,
        'edge_count': edge_count,
        'spacing_nm': fine_grid_spacing_nm if graph_mode == "fine" else 'h3',
        'buffer_size_nm': route_buffer_size_nm,
        'buffer_sliced': buffer_sliced,
        'keep_largest_component': keep_largest_component,
        # Individual timing metrics
        'load_base_route_sec': performance_metrics.get('Load Base Route', 0),
        'create_buffer_sec': performance_metrics.get('Create Buffer', 0),
        'slice_buffer_sec': performance_metrics.get('Slice Buffer', 0),
        'fine_grid_creation_sec': performance_metrics.get('Fine Grid Creation', 0),
        'fine_graph_creation_sec': performance_metrics.get('Fine Graph Creation', 0),
        'h3_graph_creation_sec': performance_metrics.get('H3 Graph Creation', 0),
        'save_gpkg_sec': performance_metrics.get('Save to GeoPackage', 0),
        'save_postgis_original_sec': 0,  # Not applicable for GPKG/SpatiaLite
        'save_postgis_optimized_sec': 0,  # Not applicable for GPKG/SpatiaLite
        'route_calculation_sec': performance_metrics.get('Route Calculation', 0),
        # Normalized metrics (time per 100K nodes)
        'grid_creation_per_100k_nodes': time_per_100k_nodes.get('Fine Grid Creation', 0) or time_per_100k_nodes.get('H3 Graph Creation', 0),
        'graph_creation_per_100k_nodes': time_per_100k_nodes.get('Fine Graph Creation', 0),
        'save_gpkg_per_100k_nodes': time_per_100k_nodes.get('Save to GeoPackage', 0),
        'save_postgis_orig_per_100k_nodes': 0,  # Not applicable for GPKG/SpatiaLite
        'save_postgis_opt_per_100k_nodes': 0,  # Not applicable for GPKG/SpatiaLite
        # Total pipeline time
        'total_pipeline_sec': sum(performance_metrics.values()),
    }
    
    # Convert to DataFrame
    benchmark_df = pd.DataFrame([benchmark_record])
    
    # UNIFIED CSV: Same file as PostGIS notebook for cross-backend comparison
    benchmark_csv = output_dir / 'benchmark_graph_fine.csv'
    
    # Append to existing CSV or create new one
    if benchmark_csv.exists():
        existing_df = pd.read_csv(benchmark_csv)
        combined_df = pd.concat([existing_df, benchmark_df], ignore_index=True)
        combined_df.to_csv(benchmark_csv, index=False)
        print(f"Appended benchmark to existing file: {benchmark_csv}")
        print(f"Total benchmark records: {len(combined_df)}")
    else:
        benchmark_df.to_csv(benchmark_csv, index=False)
        print(f"Created new benchmark file: {benchmark_csv}")
    
    # Display the current benchmark record
    print("\n=== Current Benchmark Record ===")
    print(f"Timestamp: {benchmark_record['timestamp']}")
    print(f"Graph Mode: {benchmark_record['graph_mode']}")
    print(f"Nodes: {benchmark_record['node_count']:,}")
    print(f"Edges: {benchmark_record['edge_count']:,}")
    print(f"Total Pipeline Time: {benchmark_record['total_pipeline_sec']:.2f}s")
    print("\nMost demanding operations:")

    # Show top 3 time-consuming operations
    top_operations = sorted(
        performance_metrics.items(),
        key=lambda x: x[1],
        reverse=True
    )[:3]
    for i, (op, time_val) in enumerate(top_operations, 1):
        print(f"  {i}. {op}: {time_val:.2f}s")
else:
    print("No performance metrics to export.")

Appended benchmark to existing file: /home/vikont_tux/python_projects_wsl2/1_MaritimeModule_V1/docs/notebooks/output/benchmark_graph_fine.csv
Total benchmark records: 46

=== Current Benchmark Record ===
Timestamp: 2025-10-29 11:59:23
Graph Mode: fine
Nodes: 46,022
Edges: 180,521
Total Pipeline Time: 22.30s

Most demanding operations:
  1. Fine Grid Creation: 10.96s
  2. Save to GeoPackage: 6.38s
  3. Fine Graph Creation: 2.96s
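
The normalized columns in the benchmark record scale each step time to a graph of 100,000 nodes, so that runs with different graph sizes can be compared. The sketch below applies the same formula, `(time / node_count) * 100000`, to the three timings and the node count reported in the output above. The helper name `per_100k_nodes` is illustrative only.

```python
def per_100k_nodes(seconds, node_count):
    """Scale a step duration to seconds per 100,000 graph nodes."""
    if node_count <= 0:
        # Mirrors the notebook's guard against an empty graph.
        return 0.0
    return seconds / node_count * 100_000


# Figures copied from the benchmark output above.
node_count = 46_022
steps = {
    "Fine Grid Creation": 10.96,
    "Save to GeoPackage": 6.38,
    "Fine Graph Creation": 2.96,
}

normalized = {step: per_100k_nodes(t, node_count) for step, t in steps.items()}
for step, value in normalized.items():
    print(f"{step}: {value:.1f}s per 100K nodes")
```

Because this graph has fewer than 100,000 nodes, each normalized value is larger than the measured time. The figure is a linear extrapolation and assumes step cost grows in proportion to node count, which may not hold for every step.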


In [19]:
if performance_metrics:
    # Convert the dictionary to a pandas DataFrame for easy plotting
    perf_df = pd.DataFrame(list(performance_metrics.items()), columns=['Step', 'Time (seconds)'])
    perf_df = perf_df.sort_values(by='Time (seconds)', ascending=False)

    # Create an interactive bar chart
    fig = px.bar(
        perf_df,
        x='Step',
        y='Time (seconds)',
        title='Fine Graph Creation Pipeline Performance (GeoPackage)',
        text_auto='.2f',
        labels={'Step': 'Pipeline Step', 'Time (seconds)': 'Time Taken (seconds)'}
    )
    fig.update_traces(textposition='outside')
    fig.show()
else:
    print("No performance metrics were recorded. Run the notebook cells to generate the summary.")