# ENCDataFactory Testing and Exploration

This notebook is a guide to testing and exploring the `ENCDataFactory` class, which offers a unified interface for accessing S-57 Electronic Navigational Chart (ENC) data from different backends (PostGIS, GeoPackage, SpatiaLite).

## What You'll Learn
- How to initialize the `ENCDataFactory` for different data sources
- How to query ENC metadata (edition, update numbers, NOAA versions)
- How to retrieve and analyze layer information across your ENC database
- How to filter ENCs by usage band (scale categories)
- How to obtain bounding boxes for geographic analysis
- When to use different backends for your specific use case

## Prerequisites
- S-57 data converted to GeoPackage, SpatiaLite, or PostGIS format
- Environment variables configured (`.env` file) for PostGIS connections (not needed for file-based backends)
- Internet access for NOAA database validation (optional)

## Architecture

The `ENCDataFactory` abstracts away the differences between backend systems, providing a unified interface for accessing ENC data regardless of storage format. This allows you to switch backends without changing your application code.
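
As a rough illustration of that abstraction, the sketch below picks a backend from the kind of `source` it is given. The log output further down in this notebook shows a dictionary being routed to a PostGIS manager and a `.gpkg` path to a GeoPackage manager. The function name `detect_backend` and the `.sqlite`/`.db` extensions for SpatiaLite are assumptions made for this example, not the library's actual code.

```python
from pathlib import Path


def detect_backend(source):
    """Hypothetical helper: guess the backend from the kind of source given."""
    # A dict of connection parameters implies a PostGIS server.
    if isinstance(source, dict):
        return "postgis"
    suffix = Path(source).suffix.lower()
    if suffix == ".gpkg":
        return "gpkg"
    # Assumed SpatiaLite extensions; adjust to match your own files.
    if suffix in {".sqlite", ".db"}:
        return "spatialite"
    raise ValueError(f"Unsupported source: {source!r}")


print(detect_backend({"dbname": "ENC_db"}))
print(detect_backend("output/enc_west.gpkg"))
```

Keeping the decision in one place is what lets calling code stay the same when the storage format changes.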

### Backend Comparison

| Feature | PostGIS | GeoPackage | SpatiaLite |
|---------|---------|-----------|-----------|
| **Setup** | Requires PostgreSQL + PostGIS | Single file, no server | Single file, no server |
| **Multi-user** | ✅ Yes, concurrent access | ⚠️ Limited (concurrent reads, one writer) | ⚠️ Limited (concurrent reads, one writer) |
| **Portability** | ❌ Requires infrastructure | ✅ Single file, easy backup | ✅ Single file, easy backup |
| **Performance** | ⭐ Best for large datasets | ⭐ Good for medium datasets | ⭐ Good for medium datasets |
| **Offline Use** | ❌ Requires a running database server | ✅ Completely offline | ✅ Completely offline |
| **Best For** | Multi-user server, complex queries, enterprise | Single-user, portability, archival | Single-user, portability, SQLite-based workflows |

### Choosing Your Backend

**Use PostGIS when you need:**
- Multiple concurrent users accessing the same database
- Very large ENC datasets (100,000+ features)
- Complex spatial queries and analysis
- Enterprise infrastructure already in place

**Use GeoPackage/SpatiaLite when you need:**
- Portability (single file, easily shared)
- No server setup required
- Offline access capabilities
- Simple backup and versioning
- Single-user or occasional multi-user access

## Key Components

The `ENCDataFactory` class provides four main capabilities:

1. **ENC Summary** - Retrieve metadata about all ENCs in your database (editions, updates, NOAA status)
2. **Layer Summary** - Examine available S-57 feature layers and their feature counts
3. **Usage Band Filtering** - Select ENCs by their chart scale category (overview through berthing)
4. **Bounding Box Retrieval** - Get geographic extents for spatial analysis and visualization

All capabilities work transparently across all supported backends.

## Step 1: Configuration and Setup

Adjust all user-configurable parameters in the next cell before running the notebook. All settings are centralized for easy modification without touching execution code.

### Configuration Section

The configuration cell covers the following groups of settings:

**Data Source Selection**
- Choose between **PostGIS** (server-based, multi-user) or **File-based** (GeoPackage/SpatiaLite, portable)
- Each has different tradeoffs depending on your use case

**Connection Parameters**  
- Database credentials for PostGIS, or file path for GeoPackage/SpatiaLite

**Query Parameters**
- **NOAA validation**: Compare your data against official NOAA versions (cached by default, can force live refresh)
- **Usage band filtering**: Select ENC charts by scale category (1=overview through 6=berthing)
- **ENC selection**: Query all ENCs or specific ones by name

**Workflow Control**
- Enable/disable individual analysis sections to run only what you need
- Useful for skipping expensive operations or re-running specific analyses

**Display Options**
- Control DataFrame output formatting and geometry display

In [1]:
# =============================================================================
# NOTEBOOK CONFIGURATION - Adjust these parameters before running
# =============================================================================

# --- Data Source Selection ---
# Choose which backend to use: 'postgis' for multi-user server, 'file' for portable single-file
# 
# PostGIS: Advantages - Multi-user concurrent access, large datasets, complex queries
#          Requires: PostgreSQL + PostGIS extension, .env configuration
#
# File-based (GeoPackage/SpatiaLite): Advantages - No server, portable, easy backup, offline work
#                                     Use when: Single user, need portability, no infrastructure
data_source_type = "postgis"  # Options: "postgis", "file"


# --- Data Query Parameters ---
# Control which ENCs and features to query, and how to validate them
query_params = {
    # NOAA Database Validation:
    #   - Compares local ENC editions/updates against official NOAA versions
    #   - check_noaa=True: Validate against NOAA (uses cached CSV, fast ~1-2 seconds)
    #   - force_noaa_refresh=True: Live scrape of NOAA website for latest data (slower ~10-30 seconds)
    #   
    #   When to use each:
    #   - check_noaa=True (default): Normal operations, regular data checks
    #   - check_noaa=False: When speed matters, you don't need NOAA comparison
    #   - force_noaa_refresh=True: After NOAA releases updates, during data audits, weekly validation
    'check_noaa': True,
    'force_noaa_refresh': False,
    
    # ENC Chart Filtering by Usage Band:
    #   Usage bands represent different chart scales and detail levels:
    #   1 = Overview (smaller than 1:10M): Ocean passages, strategic planning
    #   2 = General (1:4M-1:10M): Coastal navigation
    #   3 = Coastal (1:1.2M-1:3M): Approach planning, route selection
    #   4 = Approach (1:350K-1:1.2M): Harbor entrance navigation
    #   5 = Harbor (1:90K-1:350K): Port operations, maneuvering planning
    #   6 = Berthing (larger than 1:90K): Detailed in-port maneuvering
    #
    #   Examples:
    #   - [1] = Only overview charts for broad planning
    #   - [3, 4, 5] = Coastal to harbor details (recommended for routing)
    #   - [5, 6] = Harbor-level detail for port operations
    'usage_bands': [1, 3],
    
    # Specific ENC Names to Query:
    #   Leave empty list [] to query ALL ENCs in database
    #   Or specify exact ENC names like ['US3CA52M', 'US1GC09M']
    #   
    #   Tip: Use empty list for comprehensive analysis, specific names for focused work
    'specific_encs': []
}

# --- Workflow Steps ---
# Enable/disable different analysis sections to run only what you need
# Useful for: Skipping expensive operations, re-running specific analyses, testing subsets
#
# Toggle any of these to False to skip that section:
# - run_unit_tests: Validates ENCDataFactory functionality with real data and backends
# - show_enc_summary: Retrieves ENC metadata (editions, updates, NOAA comparison)
# - show_layer_summary: Lists all available S-57 feature layers and their counts
# - filter_by_usage_band: Filters ENCs by chart scale category
# - show_bounding_boxes: Displays geographic extents of selected ENCs
# - compare_postgis_vs_file: Compares results between backends (requires both sources)
workflow_steps = {
    "run_unit_tests": True,              # Run ENCDataFactory unit tests
    "show_enc_summary": True,            # Display ENC metadata and version info
    "show_layer_summary": True,          # Display available S-57 layers
    "filter_by_usage_band": True,        # Filter ENCs by usage band
    "show_bounding_boxes": True,         # Display geographic extent of ENCs
    "compare_postgis_vs_file": False     # Compare results between backends (requires both sources)
}

# --- Display Options ---
# Control how pandas DataFrames are displayed in notebook output
display_options = {
    'max_rows': None,                    # Maximum rows to display in DataFrames (None = show all)
    'float_precision': 2,                # Decimal places for float values in numeric columns
    'show_geometry': True                # Include geometry column in outputs (can be verbose)
}

print("✓ Configuration loaded successfully!")
print("  Adjust parameters above, then run subsequent cells to analyze your ENC data.")

✓ Configuration loaded successfully!
  Adjust parameters above, then run subsequent cells to analyze your ENC data.
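
Because every later cell reads these dictionaries, it can help to sanity-check them once before running anything expensive. The sketch below is one possible check and is not part of the toolkit: `validate_query_params` is a hypothetical helper name, and the rules it applies (boolean flags, usage bands between 1 and 6, ENC names as strings) are taken from the comments in the configuration cell.

```python
VALID_USAGE_BANDS = range(1, 7)  # 1 = overview ... 6 = berthing


def validate_query_params(params):
    """Hypothetical check: return a list of problems found in query_params."""
    problems = []
    for key in ("check_noaa", "force_noaa_refresh"):
        if not isinstance(params.get(key), bool):
            problems.append(f"{key} must be True or False")
    # Usage bands outside 1-6 do not correspond to any chart category.
    bad_bands = [b for b in params.get("usage_bands", []) if b not in VALID_USAGE_BANDS]
    if bad_bands:
        problems.append(f"usage_bands outside 1-6: {bad_bands}")
    bad_names = [n for n in params.get("specific_encs", []) if not isinstance(n, str)]
    if bad_names:
        problems.append(f"specific_encs must be strings: {bad_names}")
    return problems


example = {
    "check_noaa": True,
    "force_noaa_refresh": False,
    "usage_bands": [1, 3],
    "specific_encs": [],
}
print(validate_query_params(example))
```

An empty list means the configuration passed all checks.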


## Step 2: Setup and Imports

Load required libraries, initialize the project environment, and connect to the backend selected by `data_source_type` in Step 1.

In [2]:
import sys
import os
from pathlib import Path
import unittest
from dotenv import load_dotenv


# --- Setup Project Path ---
# This ensures you can import your own modules like 's57_data'
project_root = Path.cwd().parent.parent
if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))

# --- Load Environment Variables ---
# This loads your DB_USER, DB_PASSWORD, etc., from the .env file
load_dotenv(project_root / ".env")

# --- Import the Test Class ---
# Now you can import the test case you created
from tests.core__real_data.test_enc_factory import TestENCDataFactory
from src.nautical_graph_toolkit.core.s57_data import ENCDataFactory
from src.nautical_graph_toolkit.utils.plot_utils import PlotlyChart

# --- PostGIS Database Connection ---
# Only used when data_source_type = "postgis"
# Credentials loaded from environment variables in .env file
db_params = {
    'dbname': os.getenv('DB_NAME'),
    'user': os.getenv('DB_USER'),
    'password': os.getenv('DB_PASSWORD'),
    'host': os.getenv('DB_HOST'),
    'port': os.getenv('DB_PORT')
}
db_schema = "enc_west"  # Schema name for PostGIS source

# --- File-Based Database Connection ---
# Only used when data_source_type = "file"
# GeoPackage files are portable, self-contained, and work on any system with GDAL/GeoPandas
# Uses the same data_paths structure as import_s57.ipynb
data_paths = {
    'output_dir': Path.cwd() / 'output'  # Directory containing data files
}
file_db_name = "enc_west.gpkg"  # Filename in output directory

# --- Setup Plotly Visualization ---
# Warn early if MAPBOX_TOKEN is missing and fall back to the default basemap
mapbox_token = os.getenv('MAPBOX_TOKEN')
if not mapbox_token:
    print("⚠️  Note: MAPBOX_TOKEN not found in .env file.")
    print("   Maps will use default basemap. For custom Mapbox maps, add MAPBOX_TOKEN to .env")

ply = PlotlyChart()
ply_fig = ply.create_base_map(mapbox_token=mapbox_token)
ply.plotly_base_config(ply_fig)

# --- PostGIS Database Connection Test ---
# Connection errors are caught so the notebook can continue with helpful guidance
pg_factory = None
file_factory = None
if data_source_type == "postgis":
    try:
        pg_factory = ENCDataFactory(source=db_params, schema=db_schema)
        print("✓ PostGIS connection initialized successfully")
    except Exception as e:
        print(f"⚠️  Warning: Could not initialize PostGIS connection: {e}")
        print("   If using PostGIS backend, check your .env file configuration.")
        # os.getenv returns None for unset variables, so fall back explicitly
        print(f"   Database name: {db_params.get('dbname') or 'NOT SET'}")
        print(f"   Host: {db_params.get('host') or 'NOT SET'}")
elif data_source_type == "file":
    try:
        file_path = data_paths['output_dir'] / file_db_name
        file_factory = ENCDataFactory(source=file_path)
        print(f"✓ File-based database initialized: {file_db_name}")
    except Exception as e:
        print(f"⚠️  Warning: Could not initialize file-based database: {e}")
        print(f"   File path: {file_path}")

print(f"Project Root: {project_root}")
print("Setup complete. All required modules are available.")

2025-11-08 14:17:13,115 - src.nautical_graph_toolkit.core.s57_data - INFO - Source is a dictionary, initializing PostGISManager.
2025-11-08 14:17:13,137 - src.nautical_graph_toolkit.core.s57_data - INFO - Successfully connected to database 'ENC_db'
✓ PostGIS connection initialized successfully
Project Root: /home/vikont_tux/python_projects_wsl2/1_MaritimeModule_V1
Setup complete. All required modules are available.
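
When a PostGIS connection fails, the most common cause is a variable missing from `.env`. The sketch below lists which of the five variables read in the cell above are unset or empty. The helper name `missing_db_vars` is hypothetical; only the variable names come from this notebook.

```python
import os

REQUIRED_DB_VARS = ("DB_NAME", "DB_USER", "DB_PASSWORD", "DB_HOST", "DB_PORT")


def missing_db_vars(env=None):
    """Hypothetical helper: list the PostGIS variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_DB_VARS if not env.get(name)]


missing = missing_db_vars()
if missing:
    print("Missing PostGIS settings:", ", ".join(missing))
else:
    print("All PostGIS settings are present.")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.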


# Step 3: Data Integrity and Unit Testing

This section runs the complete unit test suite for `ENCDataFactory` to validate:
- Data integrity across conversions
- Correct NOAA version tracking
- Proper layer enumeration and summarization
- Edge cases and error handling

The tests use actual S-57 data and real database backends to ensure end-to-end reliability.

**Control**: Toggle `workflow_steps["run_unit_tests"]` in the configuration section to enable/disable this section.

In [3]:
# --- Run the Unittest Suite in the Notebook ---

if workflow_steps["run_unit_tests"]:
    print("Running ENCDataFactory unit test suite...\n")
    
    # 1. Create a TestLoader to find the tests in your class
    loader = unittest.TestLoader()
    suite = loader.loadTestsFromTestCase(TestENCDataFactory)

    # 2. Create a TextTestRunner to execute the tests
    # verbosity=2 provides detailed output for each test run
    runner = unittest.TextTestRunner(verbosity=2)
    result = runner.run(suite)

    # The output will appear below this cell, showing the setup, test execution, and teardown logs.
else:
    print("⊘ Skipping unit test suite")

Running ENCDataFactory unit test suite...


--- Setting up test environment for ENCDataFactory ---
Creating PostGIS data source...
2025-11-08 14:17:13,455 - nautical_graph_toolkit.core.s57_data - INFO - Found 6 S-57 file(s).
2025-11-08 14:17:13,456 - nautical_graph_toolkit.core.s57_data - INFO - --- Starting optimized 'by_layer' conversion ---
2025-11-08 14:17:13,456 - nautical_graph_toolkit.core.s57_data - INFO - Files to process: 6, Batch size: 3
2025-11-08 14:17:13,457 - nautical_graph_toolkit.core.s57_data - INFO - Pre-processing files to extract schemas and ENC names...
2025-11-08 14:17:13,548 - nautical_graph_toolkit.core.s57_data - INFO - Built unified schemas for 86 layers
2025-11-08 14:17:13,549 - nautical_graph_toolkit.utils.db_utils - INFO - Successfully connected to database 'ENC_db' for schema management.
2025-11-08 14:17:13,569 - nautical_graph_toolkit.utils.db_utils - INFO - Schema 'factory_test_schema' is ready.
2025-11-08 14:17:13,570 - nautical_graph_toolkit.core.s57_

2025-11-08 14:17:35,969 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'DSID' with 6 files
2025-11-08 14:17:35,970 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: IsolatedNode
2025-11-08 14:17:36,073 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'IsolatedNode' with 6 files
2025-11-08 14:17:36,074 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: ConnectedNode


2025-11-08 14:17:36,159 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'ConnectedNode' with 6 files
2025-11-08 14:17:36,160 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: Edge
2025-11-08 14:17:36,244 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'Edge' with 6 files
2025-11-08 14:17:36,244 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: Face
2025-11-08 14:17:36,328 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'Face' with 6 files
2025-11-08 14:17:36,329 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: ADMARE


2025-11-08 14:17:36,890 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'ADMARE' with 6 files
2025-11-08 14:17:36,890 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: AIRARE


2025-11-08 14:17:37,093 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'AIRARE' with 2 files
2025-11-08 14:17:37,094 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: BCNLAT


2025-11-08 14:17:37,398 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'BCNLAT' with 3 files
2025-11-08 14:17:37,399 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: BCNSPP


2025-11-08 14:17:37,722 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'BCNSPP' with 3 files
2025-11-08 14:17:37,722 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: BRIDGE


2025-11-08 14:17:38,034 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'BRIDGE' with 3 files
2025-11-08 14:17:38,035 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: BUISGL


2025-11-08 14:17:38,332 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'BUISGL' with 3 files
2025-11-08 14:17:38,333 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: BUAARE


2025-11-08 14:17:37,369 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'BUAARE' with 3 files
2025-11-08 14:17:37,370 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: BOYLAT


2025-11-08 14:17:37,694 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'BOYLAT' with 3 files
2025-11-08 14:17:37,695 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: BOYSAW


2025-11-08 14:17:38,006 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'BOYSAW' with 3 files
2025-11-08 14:17:38,007 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: BOYSPP


2025-11-08 14:17:38,501 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'BOYSPP' with 5 files
2025-11-08 14:17:38,502 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: CBLSUB


2025-11-08 14:17:38,916 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'CBLSUB' with 4 files
2025-11-08 14:17:38,917 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: CTNARE


2025-11-08 14:17:39,249 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'CTNARE' with 3 files
2025-11-08 14:17:39,250 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: CGUSTA


2025-11-08 14:17:39,374 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'CGUSTA' with 1 files
2025-11-08 14:17:39,375 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: COALNE


2025-11-08 14:17:40,029 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'COALNE' with 5 files
2025-11-08 14:17:40,029 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: CONZNE


2025-11-08 14:17:40,408 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'CONZNE' with 4 files
2025-11-08 14:17:40,408 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: DAYMAR


2025-11-08 14:17:40,729 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'DAYMAR' with 3 files
2025-11-08 14:17:40,730 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: DEPARE


2025-11-08 14:17:41,452 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'DEPARE' with 6 files
2025-11-08 14:17:41,453 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: DEPCNT


2025-11-08 14:17:42,141 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'DEPCNT' with 6 files
2025-11-08 14:17:42,141 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: DMPGRD


2025-11-08 14:17:42,461 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'DMPGRD' with 3 files
2025-11-08 14:17:42,462 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: EXEZNE


2025-11-08 14:17:43,098 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'EXEZNE' with 6 files
2025-11-08 14:17:43,098 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: FOGSIG


2025-11-08 14:17:43,427 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'FOGSIG' with 3 files
2025-11-08 14:17:43,427 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: HRBFAC


2025-11-08 14:17:43,545 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'HRBFAC' with 1 files
2025-11-08 14:17:43,546 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: LAKARE




2025-11-08 14:17:43,874 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'LAKARE' with 3 files
2025-11-08 14:17:43,875 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: LNDARE


2025-11-08 14:17:44,545 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'LNDARE' with 5 files
2025-11-08 14:17:44,546 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: LNDELV


2025-11-08 14:17:44,937 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'LNDELV' with 4 files
2025-11-08 14:17:44,938 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: LNDRGN


2025-11-08 14:17:45,280 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'LNDRGN' with 3 files
2025-11-08 14:17:45,280 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: LNDMRK


2025-11-08 14:17:45,698 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'LNDMRK' with 4 files
2025-11-08 14:17:45,699 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: LIGHTS


2025-11-08 14:17:46,280 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'LIGHTS' with 5 files
2025-11-08 14:17:46,281 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: MAGVAR
2025-11-08 14:17:46,891 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'MAGVAR' with 6 files
2025-11-08 14:17:46,892 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: MIPARE
2025-11-08 14:17:47,318 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'MIPARE' with 4 files
2025-11-08 14:17:47,319 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: OBSTRN
2025-11-08 14:17:47,846 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'OBSTRN' with 5 files
2025-11-08 14:17:47,847 - nautical_graph_toolkit.core.s57_data - INFO - Processing layer: PIPSOL
2025-11-08 14:17:48,189 - nautical_graph_toolkit.core.s57_data - INFO - -> Successfully processed layer 'PIP

test_unanimous_output_across_formats (tests.core__real_data.test_enc_factory.TestENCDataFactory.test_unanimous_output_across_formats)
Core test: Validates that ENCDataFactory produces consistent GeoDataFrames ... 

Using default layer: ['lndmrk']
--- Test setup complete ---

--- Running test: Unanimous Output Across Formats ---
Testing 1 layer(s): ['lndmrk']
Initializing factories for PostGIS, GPKG, and SpatiaLite...
2025-11-08 14:18:11,051 - nautical_graph_toolkit.core.s57_data - INFO - Source is a dictionary, initializing PostGISManager.
2025-11-08 14:18:11,052 - nautical_graph_toolkit.core.s57_data - INFO - Successfully connected to database 'ENC_db'
2025-11-08 14:18:11,052 - nautical_graph_toolkit.core.s57_data - INFO - Source is a .gpkg file, initializing GPKGManager.
2025-11-08 14:18:11,054 - nautical_graph_toolkit.core.s57_data - INFO - Routes will be managed in: /home/vikont_tux/python_projects_wsl2/1_MaritimeModule_V1/tests/core__real_data/test_output/temp_factory_output/maritime_routes.gpkg
2025-11-08 14:18:11,055 - nautical_graph_toolkit.core.s57_data - INFO - Successfully connected to GeoPackage 'factory_test.gpkg'
2025-11-08 14:18:11,056 - nautical_graph_toolkit.core.s57_data - INFO 

ok


    Feature counts match: 251 features
    ✅ Passed: Schema and content match

=== TEST RESULTS SUMMARY ===
Total layers tested: 1
✅ Passed: 1
❌ Failed: 0
⏭️  Skipped: 0

✅ Passed layers (1):
  - lndmrk

--- Tearing down test environment ---
Removed temporary directory: /home/vikont_tux/python_projects_wsl2/1_MaritimeModule_V1/tests/core__real_data/test_output/temp_factory_output
2025-11-08 14:18:11,278 - nautical_graph_toolkit.utils.db_utils - INFO - Successfully connected to database 'ENC_db' for schema management.
2025-11-08 14:18:11,403 - nautical_graph_toolkit.utils.db_utils - INFO - Successfully dropped schema 'factory_test_schema'



----------------------------------------------------------------------
Ran 1 test in 60.536s

OK


Dropped PostGIS schema: factory_test_schema
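
The test cell stores the runner's return value in `result` but only relies on the printed log. The sketch below shows one way to read the outcome programmatically from a standard `unittest` result object. The `_DemoTest` class is a stand-in so the example runs on its own, and `summarize` is a hypothetical helper name; in the notebook you would call `summarize(result)` on the real result instead.

```python
import io
import unittest


class _DemoTest(unittest.TestCase):
    """Stand-in test case used only to make this sketch runnable."""

    def test_passes(self):
        self.assertEqual(1 + 1, 2)

    @unittest.skip("demonstrates a skipped test")
    def test_skipped(self):
        pass


def summarize(result):
    """Condense a unittest result object into a small dictionary."""
    return {
        "run": result.testsRun,
        "failures": len(result.failures),
        "errors": len(result.errors),
        "skipped": len(result.skipped),
        "ok": result.wasSuccessful(),
    }


demo_suite = unittest.TestLoader().loadTestsFromTestCase(_DemoTest)
# Send the runner's text output to a buffer so only the summary is printed.
demo_result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(demo_suite)
print(summarize(demo_result))
```

A summary like this is handy when later cells should only run if the suite passed.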


# Step 4: Functional Testing and Data Exploration

This section demonstrates the key functions of `ENCDataFactory`:

1. **ENC Summary**: Retrieve metadata for all ENCs in your database, including version information
2. **Layer Summary**: Examine all available S-57 object classes and their feature counts
3. **Usage Band Filtering**: Select ENCs by their usage band (scale category)
4. **Bounding Box Retrieval**: Get geographic extents for spatial analysis

Each function works transparently across both PostGIS and file-based backends. Configure:
- `data_source_type`: Select which backend to query
- `workflow_steps`: Enable/disable individual analysis sections
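
To build intuition for usage bands before running the factory, note that under the S-57 ENC file-naming convention the third character of a cell name is its navigational purpose (1 to 6). The sketch below filters a few cell names that way. It is a standalone illustration with hypothetical helper names; whether `ENCDataFactory` filters by name or reads the band from chart metadata is not shown in this notebook.

```python
def usage_band_from_name(enc_name):
    """Read the usage band from the third character of an ENC cell name."""
    band = int(enc_name[2])  # e.g. 'US3CA52M' -> 3
    if not 1 <= band <= 6:
        raise ValueError(f"Unexpected usage band in {enc_name!r}")
    return band


def filter_by_band(enc_names, bands):
    """Keep only the cell names whose usage band is in `bands`."""
    wanted = set(bands)
    return [name for name in enc_names if usage_band_from_name(name) in wanted]


names = ["US5CA9AM", "US3CA52M", "US2WC11M", "US4CA11M"]
print(filter_by_band(names, [1, 3]))  # ['US3CA52M']
```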

## 4.1 ENC Summary

Retrieve a comprehensive summary of all ENC datasets in your database, including:
- Current edition and update numbers stored locally
- Latest available versions from NOAA (with internet access)
- Status comparison: whether your data is current or outdated
- Count of outdated editions vs. outdated updates

### Understanding NOAA Data Validation

When `check_noaa=True`, this function compares your local ENC versions against official NOAA versions. This is useful for:
- **Audit purposes**: Verify your charts are current before operations
- **Update planning**: Identify which charts need updating
- **Compliance**: Document that you're using approved chart versions
- **Data quality**: Ensure consistency across your fleet or organization

**Two validation modes:**

| Mode | Behavior | Speed | Use When |
|------|----------|-------|----------|
| **Cached (default)** | Uses the local CSV cache (its last-updated timestamp is logged on load) | Fast (1-2 sec) | Normal operations, routine checks |
| **Live refresh** | Scrapes NOAA website for latest data | Slower (10-30 sec) | After NOAA releases updates, audits, weekly checks |

**Why caching exists:** NOAA updates its database roughly weekly, so caching gives a significant speed benefit without sacrificing freshness for most operations. The cache age is logged so you can see exactly how current the comparison data is.

### Configuration
- `query_params['check_noaa']`: Set to `True` to validate against NOAA (uses cached data by default, adds 1-2 seconds)
- `query_params['force_noaa_refresh']`: Set to `True` to refresh NOAA data from live source instead of cache (scrapes NOAA website, adds 10-30 seconds but ensures latest data)
- Cache age is logged automatically so you can see how fresh the comparison data is

**Recommendation:** Start with default `check_noaa=True, force_noaa_refresh=False` for routine operations. Use `force_noaa_refresh=True` when:
- NOAA has announced new chart releases
- You're performing official audits or compliance documentation
- You want to be sure you have the very latest versions before a voyage
- You're running automated weekly validation processes
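
The recommendation above can be turned into a simple rule based on the cache timestamp that the toolkit logs. The sketch below is one way to do that. The helper name `should_force_refresh` and the seven-day threshold are assumptions chosen to match the roughly weekly NOAA update cycle, not settings provided by the library.

```python
from datetime import datetime, timedelta


def should_force_refresh(cache_updated, now=None, max_age=timedelta(days=7)):
    """Hypothetical rule: refresh when the cache is older than max_age."""
    now = datetime.now() if now is None else now
    return now - cache_updated > max_age


# Timestamp as it appears in the "last updated" log message.
cache_updated = datetime(2025, 11, 7, 18, 29, 4)
run_time = datetime(2025, 11, 8, 14, 18, 11)
print(should_force_refresh(cache_updated, now=run_time))  # False
```

The returned flag could be assigned to `query_params['force_noaa_refresh']` before calling `get_enc_summary`.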

The code below runs for **both PostGIS and file-based backends**. Use `data_source_type` in configuration to select which one runs.

### PostGIS Backend (if data_source_type = "postgis")

In [4]:
if workflow_steps["show_enc_summary"] and data_source_type == "postgis":
    print("PostGIS Backend: Getting ENC summary...\n")
    
    pg_factory = ENCDataFactory(source=db_params, schema=db_schema)
    pg_summary = pg_factory.get_enc_summary(
        check_noaa=query_params['check_noaa'],
        force_noaa_refresh=query_params['force_noaa_refresh']
    )
    
    if not display_options['show_geometry'] and 'geometry' in pg_summary.columns:
        # Hide geometry column for cleaner display
        display_cols = [col for col in pg_summary.columns if col != 'geometry']
        display(pg_summary[display_cols])
    else:
        display(pg_summary)
else:
    if workflow_steps["show_enc_summary"]:
        print("⊘ Skipping PostGIS summary (data_source_type not set to 'postgis')")
    else:
        print("⊘ Skipping PostGIS summary (workflow_steps['show_enc_summary'] disabled)")

PostGIS Backend: Getting ENC summary...

2025-11-08 14:18:11,463 - src.nautical_graph_toolkit.core.s57_data - INFO - Source is a dictionary, initializing PostGISManager.
2025-11-08 14:18:11,463 - src.nautical_graph_toolkit.core.s57_data - INFO - Successfully connected to database 'ENC_db'
2025-11-08 14:18:11,464 - src.nautical_graph_toolkit.core.s57_data - INFO - Factory: Getting ENC summary...
2025-11-08 14:18:11,491 - src.nautical_graph_toolkit.core.s57_data - INFO - Checking against NOAA database for latest versions...
2025-11-08 14:18:11,501 - src.nautical_graph_toolkit.utils.s57_utils - INFO - Loaded cached NOAA ENC data from /home/vikont_tux/python_projects_wsl2/1_MaritimeModule_V1/src/nautical_graph_toolkit/data/noaa_database.csv (last updated: 2025-11-07 18:29:04)


| | enc_name | db_edition | db_update | noaa_edition | noaa_update | noaa_update_date | status | is_outdated |
|---|----------|------------|-----------|--------------|-------------|------------------|--------|-------------|
| 0 | US5CA9AM.000 | 7 | 0 | 7 | 0 | 10/11/2024 | Up-to-date | False |
| 1 | US3CA52M.000 | 31 | 7 | 31 | 8 | 06/11/2024 | Outdated (Update) | True |
| 2 | US2WC11M.000 | 28 | 1 | 28 | 1 | 05/02/2023 | Up-to-date | False |
| 3 | US3CA70M.000 | 44 | 2 | 44 | 3 | 03/19/2025 | Outdated (Update) | True |
| 4 | US5CA43M.000 | 24 | 2 | 24 | 2 | 10/11/2024 | Up-to-date | False |
| 5 | US4CA11M.000 | 39 | 4 | 39 | 5 | 06/11/2024 | Outdated (Update) | True |
| 6 | US5CA53M.000 | 5 | 3 | 5 | 3 | 05/25/2022 | Up-to-date | False |
| 7 | US5CA41M.000 | 44 | 1 | 44 | 1 | 10/11/2024 | Up-to-date | False |
| 8 | US5CA67M.000 | 10 | 0 | 10 | 0 | 05/29/2024 | Up-to-date | False |
| 9 | US5CA44M.000 | 35 | 1 | 35 | 1 | 06/12/2025 | Up-to-date | False |
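The `is_outdated` flag in the table above is consistent with a simple comparison of edition and update numbers. The toolkit's actual comparison logic is not shown in this notebook, so the following is only one plausible rule that reproduces the rows above; the function name `outdated_reason` and its return values are assumptions.

```python
def outdated_reason(db_edition, db_update, noaa_edition, noaa_update):
    """Return None if the local chart is current, else 'edition' or 'update'.

    Assumed rule: a newer NOAA edition always wins; within the same edition,
    a higher NOAA update number means the local copy is behind.
    """
    if noaa_edition > db_edition:
        return "edition"
    if noaa_edition == db_edition and noaa_update > db_update:
        return "update"
    return None


# (enc_name, db_edition, db_update, noaa_edition, noaa_update) taken from the table above.
rows = [
    ("US5CA9AM.000", 7, 0, 7, 0),
    ("US3CA52M.000", 31, 7, 31, 8),
    ("US3CA70M.000", 44, 2, 44, 3),
]

outdated = {name: outdated_reason(*versions) for name, *versions in rows}
print(outdated)
```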


### File-Based Backend (if data_source_type = "file")

In [5]:
if workflow_steps["show_enc_summary"] and data_source_type == "file":
    print("File-Based Backend: Getting ENC summary...\n")
    
    file_path = data_paths["output_dir"] / file_db_name
    file_factory = ENCDataFactory(source=file_path)
    file_summary = file_factory.get_enc_summary(
        check_noaa=query_params['check_noaa'],
        force_noaa_refresh=query_params['force_noaa_refresh']
    )
    
    if not display_options['show_geometry'] and 'geometry' in file_summary.columns:
        # Hide geometry column for cleaner display
        display_cols = [col for col in file_summary.columns if col != 'geometry']
        display(file_summary[display_cols])
    else:
        display(file_summary)
else:
    if workflow_steps["show_enc_summary"]:
        print("⊘ Skipping file-based summary (data_source_type not set to 'file')")
    else:
        print("⊘ Skipping file-based summary (workflow_steps['show_enc_summary'] disabled)")

⊘ Skipping file-based summary (data_source_type not set to 'file')


## 4.2 Layer Summary

Examine the structure of your S-57 database by retrieving a summary of all layers:
- **LayerName**: Full English name of the S-57 object class (e.g., 'Sounding', 'Land area')
- **Acronym**: 6-letter S-57 code (e.g., 'soundg', 'lndare')
- **FeatureCount**: Number of features of this type in your database

Use the summary to confirm that the layers your analysis depends on are present and populated.
For example, a high feature count for 'soundg' indicates dense depth soundings, which routing analysis relies on.
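A quick check along these lines can flag layers that are missing or empty before you start an analysis. This is a sketch only: the rows and counts are made up, the set of layers required for routing is an assumption, and `missing_layers` is a hypothetical helper. Only the column names (`LayerName`, `Acronym`, `FeatureCount`) come from the description above.

```python
# Hypothetical rows shaped like the layer summary described above (counts are made up).
layer_rows = [
    {"LayerName": "Sounding", "Acronym": "soundg", "FeatureCount": 1520},
    {"LayerName": "Land area", "Acronym": "lndare", "FeatureCount": 310},
    {"LayerName": "Depth area", "Acronym": "depare", "FeatureCount": 0},
]


def missing_layers(rows, required):
    """Return the required acronyms that are absent or have zero features."""
    counts = {row["Acronym"]: row["FeatureCount"] for row in rows}
    return sorted(acronym for acronym in required if counts.get(acronym, 0) == 0)


# Assumed requirement set for a routing analysis; adjust to your own needs.
required_for_routing = {"soundg", "depare", "obstrn"}
print(missing_layers(layer_rows, required_for_routing))
```

If the summary is returned as a pandas DataFrame, `df.to_dict("records")` yields rows in the shape used here.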

### PostGIS Backend (if data_source_type = "postgis")

In [6]:
if workflow_steps["show_layer_summary"] and data_source_type == "postgis":
    print("PostGIS Backend: Layer summary\n")
    pg_layer_summary = pg_factory.get_layers_summary()
    display(pg_layer_summary)
else:
    if workflow_steps["show_layer_summary"]:
        print("⊘ Skipping PostGIS layer summary (data_source_type not set to 'postgis')")
    else:
        print("⊘ Skipping PostGIS layer summary (workflow_steps['show_layer_summary'] disabled)")

PostGIS Backend: Layer summary

2025-11-08 14:18:11,699 - src.nautical_graph_toolkit.core.s57_data - INFO - Factory: Getting layers summary...
2025-11-08 14:18:11,700 - src.nautical_graph_toolkit.core.s57_data - INFO - Fetching layer summary for schema 'enc_west'...


### File-Based Backend (if data_source_type = "file")

In [7]:
if workflow_steps["show_layer_summary"] and data_source_type == "file":
    print("File-Based Backend: Layer summary\n")
    layer_summary = file_factory.get_layers_summary()
    display(layer_summary)
else:
    if workflow_steps["show_layer_summary"]:
        print("⊘ Skipping file-based layer summary (data_source_type not set to 'file')")
    else:
        print("⊘ Skipping file-based layer summary (workflow_steps['show_layer_summary'] disabled)")

⊘ Skipping file-based layer summary (data_source_type not set to 'file')


## 4.3 Filter ENC by Usage Band

S-57 charts are classified into **usage bands**. Each band covers a range of chart scales and matches an operational context, from ocean passages (band 1) to berthing (band 6).

### Usage Band Reference

| Band | Name | Scale Range | Detail Level | Typical Use |
|------|------|-------------|--------------|------------|
| 1 | Overview | Smaller than 1:10M | Very low | Ocean basin passages, strategic planning |
| 2 | General | 1:4M - 1:10M | Low | Coastal route overview, passage planning |
| 3 | Coastal | 1:1.2M - 1:3M | Medium | Approach to coasts, route selection, regional analysis |
| 4 | Approach | 1:350K - 1:1.2M | High | Harbor entrance, detailed piloting, coastal maneuvering |
| 5 | Harbor | 1:90K - 1:350K | Very High | Port operations, anchorage selection, detailed navigation |
| 6 | Berthing | Larger than 1:90K | Extreme | Detailed maneuvering in tight spaces, pier/berth approach |

### When to Use Each Band

**For Route Planning:**
- Use bands 1-3: Overview → coastal detail, suitable for macro route planning
- Use bands 3-5: Coastal detail → harbor detail, suits most navigation routing algorithms
- Recommended: `[3, 4, 5]` for comprehensive routing coverage

**For Port Operations:**
- Use bands 5-6: Harbor and berthing detail for maneuvering in ports
- Recommended: `[5, 6]` for detailed port analysis

**For Real-time Navigation:**
- Start with band matching vessel position
- Band 3 for offshore, bands 4-5 for approaches, band 6 for maneuvering

**For Analysis/Planning:**
- Use bands 1-2: Quick overview without detail
- Use band 3: Sweet spot for most analyses (good coverage + reasonable detail)
- Use bands 3-5: Complete analysis from coastal waters to harbor operations
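The recommendations above can be captured as a small lookup so that `query_params['usage_bands']` is set from a named use case rather than a hand-typed list. This is a sketch: the dictionary keys and the `bands_for` helper are hypothetical, and only the band lists come from the guidance above.

```python
# Band lists follow the recommendations above; the key names are made up for illustration.
RECOMMENDED_BANDS = {
    "route_planning": [3, 4, 5],
    "port_operations": [5, 6],
    "overview": [1, 2],
    "general_analysis": [3],
}


def bands_for(use_case):
    """Return a fresh list of usage bands for a named use case."""
    try:
        return list(RECOMMENDED_BANDS[use_case])
    except KeyError:
        known = ", ".join(sorted(RECOMMENDED_BANDS))
        raise ValueError(f"Unknown use case {use_case!r}; expected one of: {known}") from None


query_params = {"usage_bands": bands_for("route_planning")}
print(query_params)
```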

**Why band filtering matters:**
- **Performance**: Fewer bands = faster queries on large databases
- **Relevance**: Use detail appropriate for your operational context
- **Consistency**: Charts at similar scales have consistent symbology and conventions
- **Data quality**: Lower bands (1-2) may have less detailed feature coverage

Filter by one or multiple bands to select appropriate charts for your operational area.
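To illustrate what band filtering does, the sketch below filters a list of ENC names on the client side. It relies on the S-57 cell naming convention, in which the third character of the name encodes the navigational purpose (for example, `US3CA52M` is band 3). The helper names are hypothetical, and `get_encs_by_usage_band` may use a different mechanism internally.

```python
def usage_band_from_name(enc_name):
    """Return the usage band encoded as the third character of an ENC cell name."""
    if len(enc_name) < 3 or not enc_name[2].isdigit():
        raise ValueError(f"Cannot read a usage band from {enc_name!r}")
    return int(enc_name[2])


def filter_by_band(enc_names, bands):
    """Keep only the ENC names whose usage band is in `bands`, preserving order."""
    wanted = set(bands)
    return [name for name in enc_names if usage_band_from_name(name) in wanted]


names = ["US1WC01M", "US3CA52M", "US4CA11M", "US5CA43M"]
filtered = filter_by_band(names, [1, 3])
print(filtered)
```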

**Configuration**: Edit `query_params['usage_bands']` in the configuration cell above to select which bands to query.

### PostGIS Backend (if data_source_type = "postgis")

In [8]:
if workflow_steps["filter_by_usage_band"] and data_source_type == "postgis":
    print("PostGIS Backend: Filtering ENCs by usage band\n")
    
    usage_bands = query_params['usage_bands']
    enc_list = pg_factory.get_encs_by_usage_band(usage_bands)
    
    print(f"ENCs found for usage band(s) {usage_bands}:")
    print(enc_list)
else:
    if workflow_steps["filter_by_usage_band"]:
        print("⊘ Skipping PostGIS usage band filter (data_source_type not set to 'postgis')")
    else:
        print("⊘ Skipping PostGIS usage band filter (workflow_steps['filter_by_usage_band'] disabled)")

PostGIS Backend: Filtering ENCs by usage band

2025-11-08 14:18:12,006 - src.nautical_graph_toolkit.core.s57_data - INFO - Factory: Getting ENCs for usage band [1, 3]...
ENCs found for usage band(s) [1, 3]:
['US1WC01M', 'US3CA14M', 'US3CA52M', 'US3CA69M', 'US3CA70M', 'US3CA85M']


### File-Based Backend (if data_source_type = "file")

In [9]:
if workflow_steps["filter_by_usage_band"] and data_source_type == "file":
    print("File-Based Backend: Filtering ENCs by usage band\n")
    
    usage_bands = query_params['usage_bands']
    enc_list = file_factory.get_encs_by_usage_band(usage_bands)
    
    print(f"ENCs found for usage band(s) {usage_bands}:")
    print(enc_list)
else:
    if workflow_steps["filter_by_usage_band"]:
        print("⊘ Skipping file-based usage band filter (data_source_type not set to 'file')")
    else:
        print("⊘ Skipping file-based usage band filter (workflow_steps['filter_by_usage_band'] disabled)")

⊘ Skipping file-based usage band filter (data_source_type not set to 'file')


## 4.4 ENC Bounding Box Retrieval

Obtain the geographic extent (bounding box) of selected ENCs for spatial analysis, visualization, and data assessment. Each ENC can have multiple bounding boxes (some charts cover non-contiguous areas).

### Use Cases for Bounding Boxes

**Spatial Filtering & Analysis:**
- **Area of Interest (AOI) Queries**: "Which ENCs cover this region?" → Filter bounding boxes against your AOI
- **Coverage Assessment**: "Do I have complete chart coverage?" → Visualize all ENCs, identify gaps
- **Data Availability**: "Which ENCs exist near this port?" → Query boxes near your location of interest
- **Regional Planning**: Create coverage maps for operational areas

**Visualization & Mapping:**
- **Chart Coverage Maps**: Overlay ENC footprints on basemaps to show coverage
- **Fleet Planning**: Visualize which charts vessels need for their routes
- **Coverage Gaps**: Identify areas with sparse or missing chart coverage
- **Navigation Briefings**: Create visual briefing maps showing available chart scales

**Routing & Pathfinding:**
- **Route Corridor Analysis**: Get ENCs along planned route corridor
- **Detailed Routing**: Use detailed (band 5-6) ENCs from boxes near your route
- **Alternative Route Comparison**: Analyze ENCs available for different possible routes

**Data Management:**
- **Chart Inventory**: Catalog which ENCs you have and their coverage areas
- **Update Tracking**: Track which geographic areas are covered by different ENC editions
- **Archive Management**: Understand historical coverage as you maintain ENC archives
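As a sketch of the area-of-interest use case, the example below reduces each ENC footprint to its rectangular extent and keeps the ENCs whose extent overlaps an AOI. The footprint coordinates and the AOI are hypothetical simplifications, and the helper names are made up. A rectangle test is only a coarse pre-filter: an exact polygon test needs a geometry library, and this version does not handle extents that cross the antimeridian.

```python
def bounds(ring):
    """Return (min_lon, min_lat, max_lon, max_lat) for a list of (lon, lat) points."""
    lons = [point[0] for point in ring]
    lats = [point[1] for point in ring]
    return min(lons), min(lats), max(lons), max(lats)


def boxes_intersect(a, b):
    """True if two (min_lon, min_lat, max_lon, max_lat) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Hypothetical simplified footprints; real polygons come from the bounding box query.
footprints = {
    "US3CA52M": [(-123.5, 36.0), (-121.5, 36.0), (-121.5, 37.0), (-123.5, 37.0)],
    "US3CA70M": [(-118.5, 32.5), (-117.0, 32.5), (-117.0, 33.5), (-118.5, 33.5)],
}

# Hypothetical area of interest as (min_lon, min_lat, max_lon, max_lat).
aoi = (-122.5, 36.4, -121.7, 36.9)

covering = [name for name, ring in footprints.items() if boxes_intersect(bounds(ring), aoi)]
print(covering)
```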

### Technical Details

The output includes:
- `dsid_dsnm`: ENC name (e.g., 'US3CA52M'), which associates each bounding box with its source chart
- `wkb_geometry` / `geometry`: Polygon geometry defining the ENC extent
- One or more polygons per ENC (some charts have non-contiguous coverage areas)
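Because one ENC can contribute several polygons, it can help to group the rows by `dsid_dsnm` before counting or plotting. A minimal sketch with placeholder data follows; the polygon values stand in for real geometries and the variable names are hypothetical.

```python
from collections import defaultdict

# Hypothetical (dsid_dsnm, polygon) pairs; the polygon values are placeholders.
records = [
    ("US3CA52M", "polygon-a"),
    ("US1WC01M", "polygon-b"),
    ("US1WC01M", "polygon-c"),
]

# Collect every polygon under the ENC it belongs to.
polygons_by_enc = defaultdict(list)
for enc_name, polygon in records:
    polygons_by_enc[enc_name].append(polygon)

# ENCs with more than one polygon have non-contiguous coverage.
multi_part = sorted(name for name, polys in polygons_by_enc.items() if len(polys) > 1)
print(multi_part)
```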

### Examples

**Query all ENCs:**
```python
# Leave query_params['specific_encs'] = [] to retrieve all ENCs
```

**Query specific ENCs:**
```python
# Set query_params['specific_encs'] = ['US3CA52M', 'US1WC01M']
```

**Filter by geographic region:**
(See "Spatial Filtering & Analysis" use cases above)

**Configuration**: Edit `query_params['specific_encs']` to specify which ENCs to retrieve (empty list `[]` queries all ENCs).
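The cells that follow choose ENC names in a fixed priority order: an explicit `specific_encs` list, then the result of an earlier usage band filter, then every band. The sketch below isolates that rule so it can be read and tested on its own. `resolve_enc_names` and `fake_fetch` are hypothetical; `fake_fetch` stands in for `get_encs_by_usage_band` and returns placeholder names.

```python
ALL_BANDS = [1, 2, 3, 4, 5, 6]


def resolve_enc_names(specific_encs, filtered_encs, fetch_by_bands):
    """Pick ENC names: explicit list first, then a prior band filter, then all bands."""
    if specific_encs:
        return list(specific_encs)
    if filtered_encs is not None:
        return list(filtered_encs)
    return list(fetch_by_bands(ALL_BANDS))


def fake_fetch(bands):
    # Stand-in for the factory's band query; returns one placeholder name per band.
    return [f"US{band}XXXXX" for band in bands]


print(resolve_enc_names(["US3CA52M"], ["US1WC01M"], fake_fetch))
print(resolve_enc_names([], ["US1WC01M"], fake_fetch))
print(resolve_enc_names([], None, fake_fetch))
```

Passing `None` for `filtered_encs` models the case where the usage band step never ran, while an empty list models a filter that ran and matched nothing.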

In [10]:
if workflow_steps["show_bounding_boxes"] and data_source_type == "postgis":
    print("PostGIS Backend: Retrieving ENC bounding boxes\n")
    
    # Pick ENC names in priority order: the explicit list, then the result of the
    # usage band step (section 4.3) if that cell ran, then every usage band.
    if query_params['specific_encs']:
        enc_names = query_params['specific_encs']
    elif 'enc_list' in globals():
        enc_names = enc_list
    else:
        print("Getting all ENCs from database (no usage band filter ran)...")
        enc_names = pg_factory.get_encs_by_usage_band([1, 2, 3, 4, 5, 6])
    
    if enc_names:
        print(f"Bounding boxes for ENCs: {enc_names}\n")
        enc_bbox = pg_factory.get_enc_bounding_boxes(enc_names)
    else:
        print("No ENCs to retrieve bounding boxes for")
        enc_bbox = None
    
    if enc_bbox is not None and len(enc_bbox) > 0:
        if display_options['show_geometry']:
            display_cols = list(enc_bbox.columns)
        else:
            display_cols = [col for col in enc_bbox.columns if col != 'wkb_geometry']
        display(enc_bbox[display_cols])

        ply.add_enc_bbox_trace(figure=ply_fig, bbox_df=enc_bbox, usage_bands=query_params['usage_bands'])
        ply_fig.show()
else:
    if workflow_steps["show_bounding_boxes"]:
        print("⊘ Skipping PostGIS bounding boxes (data_source_type not set to 'postgis')")
    else:
        print("⊘ Skipping PostGIS bounding boxes (workflow_steps['show_bounding_boxes'] disabled)")

PostGIS Backend: Retrieving ENC bounding boxes

Bounding boxes for ENCs: ['US1WC01M', 'US3CA14M', 'US3CA52M', 'US3CA69M', 'US3CA70M', 'US3CA85M']

2025-11-08 14:18:12,111 - src.nautical_graph_toolkit.core.s57_data - INFO - Factory: Getting bounding boxes for 6 ENCs...


| | dsid_dsnm | wkb_geometry |
|---|-----------|--------------|
| 0 | US1WC01M | POLYGON ((-132.35736 31.9999, -132.32185 31.99... |
| 1 | US3CA14M | POLYGON ((-124.58442 37.79306, -124.58442 37.5... |
| 2 | US3CA52M | POLYGON ((-123.50108 36.41847, -123.50108 36.1... |
| 3 | US3CA69M | POLYGON ((-121.48684 33.2, -121.48644 33.2, -1... |
| 4 | US3CA70M | POLYGON ((-117.08423 32.534, -117.08423 32.537... |
| 5 | US3CA85M | POLYGON ((-122.28437 34.22727, -122.28437 34.2... |


In [11]:
if workflow_steps["show_bounding_boxes"] and data_source_type == "file":
    print("File-Based Backend: Retrieving ENC bounding boxes\n")
    
    # Pick ENC names in priority order: the explicit list, then the result of the
    # usage band step (section 4.3) if that cell ran, then every usage band.
    if query_params['specific_encs']:
        enc_names = query_params['specific_encs']
    elif 'enc_list' in globals():
        enc_names = enc_list
    else:
        print("Getting all ENCs from database (no usage band filter ran)...")
        enc_names = file_factory.get_encs_by_usage_band([1, 2, 3, 4, 5, 6])
    
    if enc_names:
        print(f"Bounding boxes for ENCs: {enc_names}\n")
        enc_bbox = file_factory.get_enc_bounding_boxes(enc_names)
    else:
        print("No ENCs to retrieve bounding boxes for")
        enc_bbox = None
    
    if enc_bbox is not None and len(enc_bbox) > 0:
        if display_options['show_geometry']:
            display_cols = list(enc_bbox.columns)
        else:
            display_cols = [col for col in enc_bbox.columns if col != 'geometry']
        display(enc_bbox[display_cols])

        ply.add_enc_bbox_trace(figure=ply_fig, bbox_df=enc_bbox, usage_bands=query_params['usage_bands'])
        ply_fig.show()
else:
    if workflow_steps["show_bounding_boxes"]:
        print("⊘ Skipping file-based bounding boxes (data_source_type not set to 'file')")
    else:
        print("⊘ Skipping file-based bounding boxes (workflow_steps['show_bounding_boxes'] disabled)")

⊘ Skipping file-based bounding boxes (data_source_type not set to 'file')
