# CONFLUENCE Tutorial: Regional Domain Modeling - Iceland

This notebook demonstrates regional domain modeling using Iceland as an example. Unlike previous tutorials that focused on single watersheds, this example:

1. **Delineates an entire region** (not just a single watershed)
2. **Includes coastal watersheds** (watersheds that drain directly to the ocean)
3. **Uses a bounding box** instead of a pour point

## Key Differences from Previous Tutorials

- **DELINEATE_BY_POURPOINT**: `False` - We're not focusing on a single outlet
- **DELINEATE_COASTAL_WATERSHEDS**: `True` - Include watersheds that drain directly to the coast
- **Regional scale**: Multiple independent watersheds
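
The settings above can be read as a small set of overrides on top of a base configuration. The sketch below is illustrative only: `regional_overrides` and `base_config` are hypothetical names, not part of CONFLUENCE, and the keys are the ones that appear in this tutorial's configuration.

```python
def regional_overrides(bbox: str) -> dict:
    """Return config overrides for a bounding-box (regional) delineation.

    Hypothetical helper for illustration; it only uses configuration keys
    that appear in this tutorial.
    """
    return {
        "BOUNDING_BOX_COORDS": bbox,           # North/West/South/East
        "POUR_POINT_COORDS": "default",        # no single outlet is specified
        "DELINEATE_BY_POURPOINT": False,       # delineate the whole bounding box
        "DELINEATE_COASTAL_WATERSHEDS": True,  # keep basins draining to the coast
        "DOMAIN_DEFINITION_METHOD": "delineate",
    }


# A made-up base configuration that still targets a single pour point
base_config = {"DOMAIN_NAME": "Iceland_tutorial", "DELINEATE_BY_POURPOINT": True}

# Later keys win, so the regional overrides replace the pour-point setting
regional_config = {**base_config, **regional_overrides("66.56/-24.55/63.25/-13.5")}
print(regional_config["DELINEATE_BY_POURPOINT"])
```

Keeping the overrides in one place makes it easy to see exactly which settings distinguish a regional run from a single-watershed run.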

## Learning Objectives

1. Understand regional vs. watershed-scale modeling
2. Learn how to delineate coastal watersheds
3. Handle multiple independent drainage systems
4. Work with national/regional scale domains

## 1. Setup

In [None]:
# Import required libraries
import sys
import os
from pathlib import Path
import yaml
import pandas as pd
import matplotlib.pyplot as plt
import geopandas as gpd
import numpy as np
from shapely.geometry import box
import contextily as cx
from datetime import datetime
import xarray as xr
import warnings

# Suppress specific warnings
warnings.filterwarnings("ignore", category=UserWarning)
warnings.filterwarnings("ignore", category=FutureWarning)

# Add CONFLUENCE to path
confluence_path = Path('../').resolve()
sys.path.append(str(confluence_path))

# Import CONFLUENCE
from CONFLUENCE import CONFLUENCE

plt.style.use('default')
%matplotlib inline

print(f"Working from: {confluence_path}")

## 2. Initialize CONFLUENCE
First, let's set up our directories and load the configuration. We'll customize the Iceland configuration file.

In [None]:
# Set directory paths
CONFLUENCE_CODE_DIR = confluence_path
CONFLUENCE_DATA_DIR = Path('/work/comphyd_lab/data/CONFLUENCE_data')  # ← User should modify this path

# Fallback to a local path if the default doesn't exist (for easier testing)
if not CONFLUENCE_DATA_DIR.exists():
    CONFLUENCE_DATA_DIR = Path('./data/CONFLUENCE_data')
    print(f"Using local data directory: {CONFLUENCE_DATA_DIR}")

# Verify paths exist
if not CONFLUENCE_CODE_DIR.exists():
    raise FileNotFoundError(f"CONFLUENCE code directory not found: {CONFLUENCE_CODE_DIR}")

if not CONFLUENCE_DATA_DIR.exists():
    print(f"Data directory doesn't exist. Creating: {CONFLUENCE_DATA_DIR}")
    CONFLUENCE_DATA_DIR.mkdir(parents=True, exist_ok=True)

# Check if Iceland config exists
iceland_config_path = CONFLUENCE_CODE_DIR / '0_config_files' / 'config_iceland.yaml'

if not iceland_config_path.exists():
    print(f"Iceland configuration not found at {iceland_config_path}. Creating from template.")
    # Load template configuration as fallback
    config_template_path = CONFLUENCE_CODE_DIR / '0_config_files' / 'config_template.yaml'
    with open(config_template_path, 'r') as f:
        config_dict = yaml.safe_load(f)
    
    # Update with Iceland-specific settings
    config_dict['CONFLUENCE_CODE_DIR'] = str(CONFLUENCE_CODE_DIR)
    config_dict['CONFLUENCE_DATA_DIR'] = str(CONFLUENCE_DATA_DIR)
    
    # Set Iceland domain and region-specific settings
    config_dict['DOMAIN_NAME'] = "Iceland_region"  
    config_dict['EXPERIMENT_ID'] = "regional_run_1"
    config_dict['EXPERIMENT_TIME_START'] = "2018-01-01 01:00"
    config_dict['EXPERIMENT_TIME_END'] = "2020-12-31 23:00"
    config_dict['SPATIAL_MODE'] = "Distributed"
    
    # Iceland regional domain settings
    config_dict['BOUNDING_BOX_COORDS'] = "66.56/-24.55/63.25/-13.5"  # North/West/South/East
    config_dict['POUR_POINT_COORDS'] = "default"  # Not needed for regional domain
    config_dict['DELINEATE_BY_POURPOINT'] = False
    config_dict['DELINEATE_COASTAL_WATERSHEDS'] = True
    config_dict['DOMAIN_DEFINITION_METHOD'] = "delineate" 
    config_dict['STREAM_THRESHOLD'] = 5000
    config_dict['DOMAIN_DISCRETIZATION'] = "GRUs"
    config_dict['MIN_GRU_SIZE'] = 10  # Larger minimum size for regional domain
else:
    print("Found existing Iceland configuration. Loading it.")
    with open(iceland_config_path, 'r') as f:
        config_dict = yaml.safe_load(f)

# Create tutorial version with updated domain name
config_dict['DOMAIN_NAME'] = "Iceland_tutorial"  
config_dict['EXPERIMENT_ID'] = "tutorial_run_1"

# Create config directory if it doesn't exist
config_dir = CONFLUENCE_CODE_DIR / '0_config_files'
config_dir.mkdir(parents=True, exist_ok=True)

# Write to tutorial config file
tutorial_config_path = config_dir / 'config_iceland_tutorial.yaml'
with open(tutorial_config_path, 'w') as f:
    yaml.dump(config_dict, f)

try:
    # Initialize CONFLUENCE with tutorial config
    confluence = CONFLUENCE(tutorial_config_path)
    
    # Parse bounding box for visualization
    bbox = config_dict['BOUNDING_BOX_COORDS'].split('/')
    lat_max, lon_min, lat_min, lon_max = map(float, bbox)
    
    # Display configuration
    print("=== Iceland Tutorial Configuration ===")
    print(f"Domain Name: {confluence.config['DOMAIN_NAME']}")
    print(f"Bounding Box: {confluence.config['BOUNDING_BOX_COORDS']}")
    print(f"Delineate by Pour Point: {confluence.config['DELINEATE_BY_POURPOINT']} (Full region!)")
    print(f"Include Coastal Watersheds: {confluence.config.get('DELINEATE_COASTAL_WATERSHEDS', True)}")
    print(f"Stream Threshold: {confluence.config['STREAM_THRESHOLD']}")
    print(f"Domain Method: {confluence.config['DOMAIN_DEFINITION_METHOD']}")
    
    # Display geographic extent
    print("\nGeographic extent:")
    print(f"  North: {lat_max}°N")
    print(f"  South: {lat_min}°N")
    print(f"  West: {lon_min}°E")
    print(f"  East: {lon_max}°E")
    
except Exception as e:
    print(f"Error initializing CONFLUENCE: {str(e)}")
    print("Check that CONFLUENCE_DATA_DIR is correctly set and accessible.")
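
The bounding box string is easy to get wrong because the order is North/West/South/East rather than the more common min/max ordering. A minimal validation sketch follows; `BoundingBox` and `parse_bbox` are hypothetical helpers, not CONFLUENCE functions, and the checks assume the box does not cross the antimeridian.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    north: float
    west: float
    south: float
    east: float


def parse_bbox(coords: str) -> BoundingBox:
    """Parse a 'North/West/South/East' string and check that it is plausible."""
    parts = coords.split("/")
    if len(parts) != 4:
        raise ValueError(f"Expected 4 values separated by '/', got {len(parts)}")
    north, west, south, east = (float(p) for p in parts)
    if not (-90 <= south < north <= 90):
        raise ValueError("Latitudes must satisfy -90 <= south < north <= 90")
    # Assumption: the box does not cross the antimeridian, so west < east
    if not (-180 <= west < east <= 180):
        raise ValueError("Longitudes must satisfy -180 <= west < east <= 180")
    return BoundingBox(north, west, south, east)


iceland_bbox = parse_bbox("66.56/-24.55/63.25/-13.5")
print(iceland_bbox)
```

Swapping north and south (for example `"63.25/-24.55/66.56/-13.5"`) raises a `ValueError` instead of silently producing an inverted extent.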

## 4. Project Setup
The first step is to set up our project structure. Since we're doing a regional model, we need a different approach than for a single watershed.

In [None]:
# Step 1: Project Initialization
print("=== Step 1: Project Initialization ===")

try:
    # Setup project
    project_dir = confluence.managers['project'].setup_project()    
    pour_point_path = confluence.managers['project'].create_pour_point()

    # List created directories
    print("\nCreated directories:")
    created_dirs = []
    for item in sorted(project_dir.iterdir()):
        if item.is_dir():
            created_dirs.append(item.name)
            print(f"  📁 {item.name}")
    
    # Check if all required directories are created
    required_dirs = ['shapefiles', 'attributes', 'forcing', 'simulations', 'evaluation', 'plots']
    for dir_name in required_dirs:
        if dir_name not in created_dirs:
            print(f"Warning: Required directory '{dir_name}' not created")
    
    print("\nDirectory purposes:")
    print("  📁 shapefiles: Domain geometry (multiple watersheds, river network)")
    print("  📁 attributes: Static characteristics (elevation, soil, land cover)")
    print("  📁 forcing: Meteorological inputs (precipitation, temperature)")
    print("  📁 simulations: Model outputs")
    print("  📁 evaluation: Performance metrics and comparisons")
    print("  📁 plots: Visualizations")
except Exception as e:
    print(f"Error setting up project: {str(e)}")

## 5. Geospatial Domain Definition and Analysis - Data Acquisition
Before delineating the region, we need to acquire geospatial data (DEM, soil, land cover).

In [None]:
# Step 2: Geospatial Domain Definition and Analysis
print("=== Step 2: Geospatial Domain Definition and Analysis ===")

try:
    # Acquire attributes
    print("Acquiring geospatial attributes (DEM, soil, land cover)...")
    confluence.managers['data'].acquire_attributes()
    
    # Check if attributes were created
    dem_dir = project_dir / 'attributes' / 'elevation' / 'dem'
    soilclass_dir = project_dir / 'attributes' / 'soilclass'
    landclass_dir = project_dir / 'attributes' / 'landclass'
    
    if dem_dir.exists() and any(dem_dir.glob('*.tif')):
        print("✓ DEM data acquired successfully")
    else:
        print("⚠ DEM data acquisition may have failed")
        
    if soilclass_dir.exists() and any(soilclass_dir.glob('*.tif')):
        print("✓ Soil class data acquired successfully")
    else:
        print("⚠ Soil class data acquisition may have failed")
        
    if landclass_dir.exists() and any(landclass_dir.glob('*.tif')):
        print("✓ Land cover data acquired successfully")
    else:
        print("⚠ Land cover data acquisition may have failed")
        
except Exception as e:
    print(f"Error acquiring attributes: {str(e)}")
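
The three checks above follow the same pattern, so they can be folded into one helper. This is a sketch, not part of CONFLUENCE: `has_files`, `report_attributes`, and `ATTRIBUTE_LAYERS` are hypothetical names, and the relative paths are the ones used in the cell above. The demonstration runs on a temporary directory rather than a real project.

```python
from pathlib import Path
import tempfile


def has_files(directory: Path, pattern: str = "*.tif") -> bool:
    """Return True if `directory` exists and holds a file matching `pattern`."""
    return directory.is_dir() and any(directory.glob(pattern))


def report_attributes(project_dir: Path, layers: dict) -> dict:
    """Map each layer label to whether its directory holds at least one raster."""
    return {label: has_files(project_dir / rel) for label, rel in layers.items()}


# Relative locations checked in the cell above
ATTRIBUTE_LAYERS = {
    "DEM": Path("attributes") / "elevation" / "dem",
    "Soil class": Path("attributes") / "soilclass",
    "Land cover": Path("attributes") / "landclass",
}

# Demonstration on a throwaway directory where only a DEM file is present
with tempfile.TemporaryDirectory() as tmp:
    demo_project = Path(tmp)
    demo_dem_dir = demo_project / ATTRIBUTE_LAYERS["DEM"]
    demo_dem_dir.mkdir(parents=True)
    (demo_dem_dir / "demo.tif").touch()
    status = report_attributes(demo_project, ATTRIBUTE_LAYERS)

for label, ok in status.items():
    print(f"{'✓' if ok else '⚠'} {label}")
```

In the notebook, `report_attributes(project_dir, ATTRIBUTE_LAYERS)` would return the same information as the three `if`/`else` blocks.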

## 6. Regional Domain Delineation
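This is the critical step: we delineate the entire region inside the bounding box, including coastal watersheds. Unlike the single-watershed approach, delineation is not tied to one pour point, so the result is a set of independent drainage basins.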

In [None]:
try:
    # Define domain
    print(f"Delineating regional domain using method: {confluence.config['DOMAIN_DEFINITION_METHOD']}")
    print(f"Delineate by pour point: {confluence.config['DELINEATE_BY_POURPOINT']} (Full region!)")
    print(f"Include coastal watersheds: {confluence.config.get('DELINEATE_COASTAL_WATERSHEDS', True)}")
    print(f"Stream threshold: {confluence.config['STREAM_THRESHOLD']}")
    print("\nThis will create multiple independent drainage basins...")
    
    watershed_path = confluence.managers['domain'].define_domain()
    
    # Check results
    basin_path = project_dir / 'shapefiles' / 'river_basins'
    network_path = project_dir / 'shapefiles' / 'river_network'
    
    basin_count = 0
    basin_files = []
    basins = None
    if basin_path.exists():
        basin_files = list(basin_path.glob('*.shp'))
        if basin_files:
            try:
                basins = gpd.read_file(basin_files[0])
                basin_count = len(basins)
                print(f"\n✓ Created {basin_count} watersheds")
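                # Areas are only meaningful in a projected CRS; EPSG:6933 (equal-area) is one possible choice
                basins_for_area = basins.to_crs(epsg=6933) if (basins.crs is not None and basins.crs.is_geographic) else basins
                print(f"Total area: {basins_for_area.geometry.area.sum() / 1e6:.0f} km²")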
            except Exception as e:
                print(f"Error reading basin shapefile: {str(e)}")
    
    network_count = 0
    network_files = []
    rivers = None
    if network_path.exists():
        network_files = list(network_path.glob('*.shp'))
        if network_files:
            try:
                rivers = gpd.read_file(network_files[0])
                network_count = len(rivers)
                print(f"✓ Created river network with {network_count} segments")
            except Exception as e:
                print(f"Error reading river network shapefile: {str(e)}")
                
    if not basin_files:
        print("⚠ No basin shapefiles found. Domain delineation may have failed.")
    if not network_files:
        print("⚠ No river network shapefiles found. Stream delineation may have failed.")
        
except Exception as e:
    print(f"Error during domain delineation: {str(e)}")
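
A quick plausibility check for the reported total area: the delineated basins cannot cover more than the bounding box itself. The sketch below computes the area of a latitude/longitude box on a sphere, which is an approximation (it assumes a spherical Earth with the mean radius). `bbox_area_km2` is a hypothetical helper, not a CONFLUENCE function. Note that GeoPandas reports `area` in the units of the layer's CRS, so the comparison only makes sense when the basins are in a projected CRS with metre units.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def bbox_area_km2(north: float, west: float, south: float, east: float) -> float:
    """Area of a latitude/longitude box on a sphere, in km²."""
    d_lon = math.radians(east - west)
    # Difference of sines gives the height of the latitude band on a unit sphere
    band = math.sin(math.radians(north)) - math.sin(math.radians(south))
    return EARTH_RADIUS_KM ** 2 * d_lon * band


upper_bound_km2 = bbox_area_km2(66.56, -24.55, 63.25, -13.5)
print(f"Bounding box area: {upper_bound_km2:,.0f} km²")
```

If the total basin area is larger than this bound, or smaller by several orders of magnitude, the area was most likely computed in degrees rather than metres.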

## 7. Watershed Discretization
Now we need to discretize our domain into GRUs (Grouped Response Units) and HRUs (Hydrologic Response Units).

In [None]:
try:
    # Discretize domain
    print(f"Creating HRUs using method: {confluence.config['DOMAIN_DISCRETIZATION']}")
    # The return value is not used below; the output directory is checked instead
    confluence.managers['domain'].discretize_domain()
    
    # Check results
    hru_path = project_dir / 'shapefiles' / 'catchment'
    hru_gdf = None
    if hru_path.exists():
        hru_files = list(hru_path.glob('*.shp'))
        if hru_files:
            try:
                hru_gdf = gpd.read_file(hru_files[0])
                
                print(f"\n✓ Created {len(hru_gdf)} HRUs")
                print(f"Number of GRUs: {hru_gdf['GRU_ID'].nunique()}")
                
                # Show some statistics
                hru_stats = hru_gdf.groupby('GRU_ID').size()
                print("\nHRU distribution:")
                print(f"  Min HRUs per GRU: {hru_stats.min()}")
                print(f"  Max HRUs per GRU: {hru_stats.max()}")
                print(f"  Avg HRUs per GRU: {hru_stats.mean():.1f}")
            except Exception as e:
                print(f"Error reading HRU shapefile: {str(e)}")
        else:
            print("⚠ No HRU shapefiles found. Domain discretization may have failed.")
    else:
        print("⚠ Catchment directory not found. Domain discretization may have failed.")
        
except Exception as e:
    print(f"Error during domain discretization: {str(e)}")
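
The GRU/HRU relationship is a one-to-many grouping: each HRU belongs to exactly one GRU, and the statistics above count HRUs per GRU. The toy example below shows the same summary with made-up records; the IDs are hypothetical and do not come from the Iceland domain.

```python
from collections import Counter
from statistics import mean

# Hypothetical (hru_id, gru_id) pairs standing in for rows of the catchment shapefile
hru_records = [(1, 10), (2, 10), (3, 11), (4, 12), (5, 12), (6, 12)]

# Count how many HRUs fall in each GRU (equivalent to groupby('GRU_ID').size())
hrus_per_gru = Counter(gru_id for _, gru_id in hru_records)

print(f"Number of HRUs: {len(hru_records)}")
print(f"Number of GRUs: {len(hrus_per_gru)}")
print(f"Min HRUs per GRU: {min(hrus_per_gru.values())}")
print(f"Max HRUs per GRU: {max(hrus_per_gru.values())}")
print(f"Avg HRUs per GRU: {mean(hrus_per_gru.values()):.1f}")
```

When every GRU contains a single HRU, the minimum, maximum, and average are all 1, and the number of HRUs equals the number of GRUs.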

## 8. Visualize Regional Domain
Let's visualize what our regional domain looks like with all delineated watersheds.

In [None]:
try:
    # Create CONFLUENCE domain visualization
    print("Creating regional domain visualization...")
    if hasattr(confluence.managers['domain'], 'plot_domain'):
        plot_paths = confluence.managers['domain'].plot_domain()
    else:
        print("plot_domain method not available - using custom visualization instead")
    
    # Create custom visualization
    if basin_path.exists() and basin_files and basins is not None:
        fig, ax = plt.subplots(figsize=(14, 10))
        
        # Plot watersheds
        if 'GRU_ID' in basins.columns:
            basins.plot(ax=ax, column='GRU_ID', cmap='tab20', 
                       edgecolor='black', linewidth=0.5, legend=False)
        else:
            basins.plot(ax=ax, cmap='tab20', 
                       edgecolor='black', linewidth=0.5, legend=False)
        
        # Plot river network if available
        if network_path.exists() and network_files and rivers is not None:
            rivers.plot(ax=ax, color='blue', linewidth=1)
        
        ax.set_title(f'Iceland Regional Domain - {basin_count} Watersheds', 
                    fontsize=16, fontweight='bold')
        ax.set_xlabel('Longitude')
        ax.set_ylabel('Latitude')
        
        # Add annotation about coastal watersheds
        ax.text(0.02, 0.98, f'Including coastal watersheds\nTotal watersheds: {basin_count}',
                transform=ax.transAxes, va='top',
                bbox=dict(boxstyle='round', facecolor='white', alpha=0.8),
                fontsize=12)
        
        plt.tight_layout()
        plt.show()
    else:
        print("Cannot create visualization: Basin data not available")
except Exception as e:
    print(f"Error creating domain visualization: {str(e)}")

## 9. Analyze Regional Characteristics
Let's analyze the characteristics of our regional watersheds.