# SYMFLUENCE Tutorial 02a — Basin-Scale Workflow (Bow River at Banff, Lumped)

## Introduction

This tutorial demonstrates basin-scale hydrological modeling using SYMFLUENCE's lumped representation approach. Building on the point-scale workflows from Tutorials 01a and 01b, we now simulate streamflow from an entire watershed—the Bow River at Banff in the Canadian Rockies.

A lumped basin model treats the watershed as a single computational unit, spatially averaging all characteristics across the catchment. This simplification keeps each model run cheap, which suits calibration (where the model is run many times), and it establishes baseline performance before adding spatial complexity.
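
To make the idea of spatial averaging concrete, the sketch below computes an area-weighted mean over a few elevation bands. It is only an illustration: `area_weighted_mean` is a hypothetical helper (not part of SYMFLUENCE), and the band elevations and areas are made-up numbers, not Bow River data.

```python
def area_weighted_mean(values, areas):
    """Return the area-weighted mean of per-unit values."""
    if len(values) != len(areas):
        raise ValueError("values and areas must have the same length")
    total_area = sum(areas)
    if total_area <= 0:
        raise ValueError("total area must be positive")
    return sum(v * a for v, a in zip(values, areas)) / total_area


# Hypothetical elevation bands (illustrative numbers only)
band_elevation_m = [1500.0, 2000.0, 2500.0, 3000.0]
band_area_km2 = [400.0, 800.0, 700.0, 300.0]

# A lumped model would carry this single value instead of four bands
lumped_elevation_m = area_weighted_mean(band_elevation_m, band_area_km2)
print(f"Lumped mean elevation: {lumped_elevation_m:.1f} m")
```

The same weighting applies to any attribute or forcing variable that is collapsed to one value per basin.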

The **Bow River at Banff** watershed encompasses ~2,210 km² with elevations from 1,384 m to over 3,400 m. Water Survey of Canada station 05BB001 provides streamflow observations for model evaluation. This snow-dominated mountain system presents strong elevation gradients, complex snow dynamics, and pronounced spring freshet periods.

Through this tutorial, you will see how the SYMFLUENCE workflow used for point validation carries over to basin prediction with the same five stages: configuration → domain → data → model → evaluation.


## Step 1 — Configuration

We generate a basin-scale configuration that specifies the lumped representation approach, gauging station coordinates, and watershed delineation method.

In [None]:
# Step 1 — Create basin-scale configuration

from pathlib import Path
from symfluence import SYMFLUENCE
from symfluence.core.config.models import SymfluenceConfig

config = SymfluenceConfig.from_minimal(
    # Basic identification
    domain_name='Bow_at_Banff_lumped',
    experiment_id='run_1',

    # Gauging station coordinates (Banff WSC 05BB001)
    pour_point_coords='51.1722/-115.5717',

    # Lumped basin settings
    definition_method='lumped',
    discretization='GRUs',

    # Model configuration
    model='SUMMA',
    routing_model='mizuRoute',

    # Temporal extent
    time_start='2004-01-01 01:00',
    time_end='2007-12-31 23:00',
    calibration_period='2005-10-01, 2006-09-30',
    evaluation_period='2006-10-01, 2007-09-30',
    spinup_period='2004-01-01, 2005-09-30',

    # Streamflow observations
    station_id='05BB001',
    DOWNLOAD_WSC_DATA=True,

    # Calibration settings
    OPTIMIZATION_METHODS=['iteration'],
    params_to_calibrate='k_soil,theta_sat,aquiferBaseflowExp,aquiferBaseflowRate,qSurfScale,summerLAI,frozenPrecipMultip,Fcapil,tempCritRain,heightCanopyTop,heightCanopyBottom,windReductionParam,vGn_n',
    basin_params_to_calibrate='routingGammaScale,routingGammaShape',
    optimization_target='streamflow',
    optimization_algorithm='DDS',
    iterations=100,
    optimization_metric='KGE',
    calibration_timestep='hourly',
)

# Save configuration to YAML
import yaml
config_path = Path('./config_basin_lumped.yaml')
config_dict = config.to_dict(flatten=True)
with open(config_path, 'w') as f:
    yaml.dump(config_dict, f, default_flow_style=False, sort_keys=False)
print(f"Configuration saved: {config_path}")

# Initialize SYMFLUENCE with visualization enabled
symfluence = SYMFLUENCE(config, visualize=True)

# Create project structure
project_dir = symfluence.managers['project'].setup_project()
pour_point_path = symfluence.managers['project'].create_pour_point()

print(f"Project structure created at: {project_dir}")
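
The spin-up, calibration, and evaluation periods in the configuration are meant to follow one another inside the simulation window. The sketch below checks that for the date strings used above. `parse_period` is a hypothetical helper written for this illustration; it is not a SYMFLUENCE function, and SYMFLUENCE may perform its own validation.

```python
from datetime import datetime


def parse_period(text):
    """Parse a 'YYYY-MM-DD, YYYY-MM-DD' string into (start, end) datetimes."""
    start_text, end_text = (part.strip() for part in text.split(","))
    start = datetime.strptime(start_text, "%Y-%m-%d")
    end = datetime.strptime(end_text, "%Y-%m-%d")
    if end < start:
        raise ValueError(f"period ends before it starts: {text!r}")
    return start, end


# Values copied from the configuration cell above
sim_start = datetime.strptime("2004-01-01 01:00", "%Y-%m-%d %H:%M")
sim_end = datetime.strptime("2007-12-31 23:00", "%Y-%m-%d %H:%M")
periods = {
    "spinup": parse_period("2004-01-01, 2005-09-30"),
    "calibration": parse_period("2005-10-01, 2006-09-30"),
    "evaluation": parse_period("2006-10-01, 2007-09-30"),
}

ordered = ["spinup", "calibration", "evaluation"]
problems = []

# Each period should fall inside the simulation window (compared by date)
for name in ordered:
    start, end = periods[name]
    if start.date() < sim_start.date() or end.date() > sim_end.date():
        problems.append(f"{name} falls outside the simulation window")

# Consecutive periods should not overlap
for earlier, later in zip(ordered, ordered[1:]):
    if periods[earlier][1] >= periods[later][0]:
        problems.append(f"{earlier} overlaps {later}")

print(problems or "All periods are ordered and inside the simulation window")
```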

## Step 1b — Download Example Data (Optional)

You can download pre-processed example data from GitHub releases.

In [None]:
# Step 1b — Download example data from GitHub releases (optional)

import shutil
import subprocess
import urllib.error  # imported explicitly because the except clauses below use it
import urllib.request
from pathlib import Path

# Only download if data doesn't already exist
domain_data_dir = Path(str(config.system.data_dir)) / f"domain_{config.domain.name}" / "forcing"

if not domain_data_dir.exists() or not any(domain_data_dir.iterdir()):
    print("📥 Downloading example data from GitHub releases...")
    release_tag = "examples-data-v0.5.5"
    zip_filename = "example_data_v0.5.5.zip"
    zip_file = Path(f"/tmp/{zip_filename}")
    extract_dir = Path("/tmp/symfluence_example_data")
    
    # Direct URL download (no gh CLI authentication required)
    download_url = f"https://github.com/DarriEy/SYMFLUENCE/releases/download/{release_tag}/{zip_filename}"
    
    try:
        print(f"   Downloading from: {download_url}")
        urllib.request.urlretrieve(download_url, zip_file)
        print(f"✅ Downloaded {zip_file}")
        
        # Extract to temp directory first
        extract_dir.mkdir(parents=True, exist_ok=True)
        print("📦 Extracting to temp directory...")
        subprocess.run(["unzip", "-q", "-o", str(zip_file), "-d", str(extract_dir)], check=True)
        
        # Copy domain directories to SYMFLUENCE_DATA_DIR
        SYMFLUENCE_DATA_DIR = Path(str(config.system.data_dir))
        SYMFLUENCE_DATA_DIR.mkdir(parents=True, exist_ok=True)
        
        # Find and copy all domain_* directories from extracted content
        extracted_root = extract_dir / "example_data_v0.5.5"
        if extracted_root.exists():
            for domain_dir in extracted_root.glob("domain_*"):
                dest_dir = SYMFLUENCE_DATA_DIR / domain_dir.name
                print(f"   Copying {domain_dir.name} -> {dest_dir}")
                # Use dirs_exist_ok=True to merge/overwrite without deleting (avoids NFS issues)
                shutil.copytree(domain_dir, dest_dir, dirs_exist_ok=True)
        
        # Cleanup temp files
        zip_file.unlink()
        shutil.rmtree(extract_dir)
        print(f"✅ Example data installed to {SYMFLUENCE_DATA_DIR}")
        
    except urllib.error.HTTPError as e:
        print(f"⚠️  HTTP Error {e.code}: {e.reason}")
        print("   Download manually from: https://github.com/DarriEy/SYMFLUENCE/releases")
    except urllib.error.URLError as e:
        print(f"⚠️  Could not download: {e.reason}")
        print("   Download manually from: https://github.com/DarriEy/SYMFLUENCE/releases")
    except subprocess.CalledProcessError as e:
        print(f"⚠️  Extraction failed: {e}")
else:
    print(f"✅ Data already exists at: {domain_data_dir}")

## Step 2 — Domain definition

For basin-scale modeling, we delineate the watershed boundary and create a single lumped HRU representing the entire catchment.

### Step 2a — Geospatial attribute acquisition

Acquires watershed attributes (elevation, land cover, soils) that will be spatially averaged for the lumped representation.

- Uncomment the acquisition line below to download data (set `DATA_ACCESS` to `'cloud'` or `'maf'` in config)
- Alternatively, copy pre-downloaded attributes, forcing, and observation directories into the domain directory

In [None]:
# Step 2a — Attribute acquisition
# Uncomment to acquire data (set DATA_ACCESS to 'cloud' or 'maf' in config)
# symfluence.managers['data'].acquire_attributes()
print("✅ Attribute acquisition complete")

### Step 2b — Watershed delineation

Delineates the basin boundary using automated watershed analysis from the pour point coordinates.

In [None]:
# Step 2b — Watershed delineation
watershed_path = symfluence.managers['domain'].define_domain()
print("✅ Watershed delineation complete")
print(f"Watershed file: {watershed_path}")

### Step 2c — Domain discretization

Creates a single HRU that represents the lumped basin with spatially-averaged characteristics.

In [None]:
# Step 2c — Discretization (single lumped HRU)
hru_path = symfluence.managers['domain'].discretize_domain()
print("✅ Domain discretization complete")
print(f"HRU file: {hru_path}")

### Step 2d — Visualization

Quick visualization of the lumped basin boundary and pour point location.

In [None]:
# Step 2d — Basin visualization 

from IPython.display import Image, display

plot_path = symfluence.managers['domain'].visualize_domain()
print(f"Domain plot saved to: {plot_path}")

# Display the plot inline
if plot_path:
    display(Image(filename=str(plot_path)))

## Step 3 — Data acquisition and preprocessing

Process streamflow observations, meteorological forcing data, and prepare model-ready inputs.

### Step 3a — Streamflow observations

Download and process Water Survey of Canada streamflow data for model evaluation.

In [None]:
# Step 3a — Streamflow data processing
# Uncomment to download and process observations
# symfluence.managers['data'].process_observed_data()
print("✅ Streamflow data processing complete")

### Step 3b — Meteorological forcing

Acquire and spatially average meteorological forcing data over the basin.

In [None]:
# Step 3b — Forcing acquisition
# Uncomment to acquire data (set DATA_ACCESS to 'cloud' or 'maf' in config)
# symfluence.managers['data'].acquire_forcings()
print("✅ Forcing acquisition complete")

### Step 3c — Model-agnostic preprocessing

Standardize variable names, units, and time steps for model consumption.
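
As a rough picture of what standardizing names and units involves, the sketch below renames two forcing variables and converts their units. Everything here is an assumption made for illustration: the raw variable names are hypothetical, the target names are assumed to follow SUMMA's forcing conventions, and the code is not how SYMFLUENCE implements this step.

```python
# Hypothetical raw forcing record: names and units are illustrative only
raw_record = {"t2m_degC": -5.0, "precip_mm_per_h": 1.8}

# Map each raw name to (standard name, conversion function)
CONVERSIONS = {
    # degrees Celsius -> kelvin
    "t2m_degC": ("airtemp", lambda celsius: celsius + 273.15),
    # mm/h -> kg m-2 s-1 (1 mm of water over 1 m2 weighs about 1 kg)
    "precip_mm_per_h": ("pptrate", lambda mm_per_h: mm_per_h / 3600.0),
}


def standardize(record):
    """Rename variables and convert units according to CONVERSIONS."""
    standardized_record = {}
    for raw_name, value in record.items():
        standard_name, convert = CONVERSIONS[raw_name]
        standardized_record[standard_name] = convert(value)
    return standardized_record


standardized = standardize(raw_record)
print(standardized)
```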

In [None]:
# Step 3c — Model-agnostic preprocessing
symfluence.managers['data'].run_model_agnostic_preprocessing()
print("✅ Model-agnostic preprocessing complete")

## Step 4 — Model configuration and execution

Configure SUMMA for basin-scale simulation with mizuRoute routing, then execute the model.

In [None]:
# Step 4a — SUMMA-specific preprocessing
symfluence.managers['model'].preprocess_models()

In [None]:
# Step 4b — Model execution
print(f"Running {config.model.hydrological_model} with {config.model.routing_model or 'no routing'}...")
symfluence.managers['model'].run_models()

## Step 5 — Streamflow evaluation

Compare simulated and observed streamflow using standard hydrological metrics and visualization.
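
One of the metrics used later in this tutorial is the Kling-Gupta Efficiency (KGE). The sketch below implements the original 2009 formulation, which combines correlation, variability ratio, and bias ratio into one score with a perfect value of 1. It is a stand-alone illustration; whether SYMFLUENCE uses exactly this variant is an assumption, and the flow values are made up.

```python
import math


def kge(simulated, observed):
    """Kling-Gupta Efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    n = len(observed)
    if n != len(simulated) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_sim = sum(simulated) / n
    mean_obs = sum(observed) / n
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in simulated) / n)
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in observed) / n)
    if std_sim == 0 or std_obs == 0 or mean_obs == 0:
        raise ValueError("KGE is undefined for constant series or zero observed mean")
    covariance = sum(
        (s - mean_sim) * (o - mean_obs) for s, o in zip(simulated, observed)
    ) / n
    r = covariance / (std_sim * std_obs)  # linear correlation
    alpha = std_sim / std_obs             # variability ratio
    beta = mean_sim / mean_obs            # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Made-up flows in m3/s, for illustration only
observed_flow = [10.0, 20.0, 40.0, 30.0, 15.0]
doubled_flow = [2.0 * q for q in observed_flow]

print(f"Perfect match: {kge(observed_flow, observed_flow):.3f}")
print(f"Simulation doubled: {kge(doubled_flow, observed_flow):.3f}")
```

In the doubled case the correlation is still perfect, so the whole penalty comes from the variability and bias terms.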

In [None]:
# Step 5 — Streamflow evaluation (using Camille's model comparison plots)

from IPython.display import Image, display

# Generate model comparison overview (auto-detects model outputs)
plot_path = symfluence.managers['reporting'].generate_model_comparison_overview(
    experiment_id=config.domain.experiment_id,
    context='run_model'
)

if plot_path:
    print(f"Model comparison overview: {plot_path}")
    display(Image(filename=str(plot_path)))
else:
    print("No model outputs found for comparison. Check simulation outputs.")

print("\nStreamflow evaluation complete")

## Step 5b — Run calibration

This step calibrates the parameters selected in Step 1 using the DDS (Dynamically Dimensioned Search) algorithm. The objective is the KGE (Kling-Gupta Efficiency) of simulated streamflow, and the search runs for the 100 iterations set in the configuration.
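
For intuition about how DDS searches, the following is a simplified, stand-alone sketch of the algorithm written as a maximizer (since KGE is maximized). It is not SYMFLUENCE's implementation: the function name `dds`, its arguments, and the toy objective are all hypothetical, and details such as how the initial evaluation counts toward the budget are simplified.

```python
import math
import random


def dds(objective, lower, upper, x0=None, max_iter=100, r=0.2, seed=0):
    """Maximize `objective` within bounds using a minimal DDS-style search."""
    if max_iter < 2:
        raise ValueError("max_iter must be at least 2")
    rng = random.Random(seed)
    n_dim = len(lower)
    if x0 is None:
        best_x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
    else:
        best_x = list(x0)
    best_f = objective(best_x)

    for i in range(1, max_iter + 1):
        # Probability of perturbing each parameter shrinks from 1 to 0,
        # so the search moves from global to local as iterations progress
        p = 1.0 - math.log(i) / math.log(max_iter)
        chosen = [d for d in range(n_dim) if rng.random() < p]
        if not chosen:
            chosen = [rng.randrange(n_dim)]  # always perturb at least one

        candidate = list(best_x)
        for d in chosen:
            span = upper[d] - lower[d]
            value = candidate[d] + r * span * rng.gauss(0.0, 1.0)
            # Reflect at the bounds; fall back to the bound if still outside
            if value < lower[d]:
                value = lower[d] + (lower[d] - value)
                if value > upper[d]:
                    value = lower[d]
            elif value > upper[d]:
                value = upper[d] - (value - upper[d])
                if value < lower[d]:
                    value = upper[d]
            candidate[d] = value

        # Greedy acceptance: keep the candidate only if it is no worse
        f = objective(candidate)
        if f >= best_f:
            best_x, best_f = candidate, f

    return best_x, best_f


def toy_objective(x):
    # Peak value 0.0 at (1, -2); stands in for a KGE-like score
    return -((x[0] - 1.0) ** 2) - (x[1] + 2.0) ** 2


start = [-5.0, 5.0]
best_x, best_f = dds(toy_objective, [-5.0, -5.0], [5.0, 5.0], x0=start)
print(best_x, best_f)
```

In a real calibration each call to the objective is a full model run, which is why DDS is designed to make progress within a fixed, small evaluation budget.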

In [None]:
# Step 5b — Run calibration (DDS with KGE as the objective)
results_file = symfluence.managers['optimization'].calibrate_model()
print("Calibration results file:", results_file)

In [None]:
# Step 5c — Post-calibration visualization
# 
# This generates three visualizations:
# 1. Optimization progress: Shows KGE improvement over iterations
# 2. Model comparison: Calibrated model vs observations (time series, FDC, metrics)
# 3. Default vs Calibrated: Side-by-side comparison showing calibration improvement

from IPython.display import Image, display

# Generate post-calibration visualizations
plot_paths = symfluence.managers['reporting'].visualize_calibration_results(
    experiment_id=config.domain.experiment_id
)

# Display all generated plots
for plot_name, plot_path in plot_paths.items():
    print(f"\n{'='*60}")
    print(f"📊 {plot_name.replace('_', ' ').title()}")
    print(f"{'='*60}")
    display(Image(filename=str(plot_path)))

print("\n✅ Post-calibration visualization complete")