# SYMFLUENCE Tutorial 1b — Point-Scale Workflow (FLUXNET CA-NS7)

## Introduction
This notebook mirrors the concise, configuration-first style established in **Tutorial 01a** and adapts it for **energy-balance validation at a FLUXNET tower (CA-NS7)**. We simulate point-scale land–atmosphere exchanges and evaluate **latent heat flux (LE, the energy equivalent of evapotranspiration)** and **sensible heat flux (H)** against FLUXNET observations.

The workflow is strictly configuration-driven and fully reproducible:
1) write a minimal config, 2) initialize SYMFLUENCE and standard project layout, 3) define the point-scale domain, 4) acquire & preprocess inputs, 5) run **SUMMA**, and 6) evaluate fluxes.


# Step 1 — Configuration (pick or generate)

We start by generating a compact configuration for the **CA-NS7** FLUXNET site using the same pattern as 01a. This keeps initialization a one-liner and the workflow fully reproducible.

In [None]:
# Step 1 — Create a site-specific configuration for the CA-NS7 FLUXNET example
from pathlib import Path
import yaml
from symfluence.resources import get_config_template
SYMFLUENCE_CODE_DIR = Path.cwd().resolve()

# Path to the default template configuration (same pattern as 01a)
config_template = get_config_template()

# Load the base configuration
with open(config_template, "r") as f:
    config = yaml.safe_load(f)

# === Modify key entries for the CA-NS7 point-scale case ===
# Code & data directories
config['SYMFLUENCE_CODE_DIR'] = str(SYMFLUENCE_CODE_DIR)
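# Later verification cells read SYMFLUENCE_DATA_DIR from the saved config;
# uncomment and edit the line below to point it at your own data directory.
#config["SYMFLUENCE_DATA_DIR"] = str(Path("/path/to/SYMFLUENCE_data").resolve())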

# Point-scale domain settings
config["DOMAIN_DEFINITION_METHOD"] = "point"
config["DOMAIN_DISCRETIZATION"] = "GRUs"  # 1 GRU => 1 HRU
config["DOMAIN_NAME"] = "CA-NS7"
config["POUR_POINT_COORDS"] = "56.6358/-99.9483"  # CA-NS7 coordinates
config["BOUNDING_BOX_COORDS"] = "56.6858/-99.9983/56.5858/-99.8983"  # ±0.05° around the site


# Data/forcing & model
config["HYDROLOGICAL_MODEL"] = "SUMMA"
config["FORCING_DATASET"] = "ERA5"  # Used for meteorological inputs
config["DOWNLOAD_FLUXNET"] = True
config["FLUXNET_STATION"] = "CA-NS7"

# Define the temporal extent of the experiment
config["EXPERIMENT_TIME_START"] = "2001-01-01 01:00"
config["EXPERIMENT_TIME_END"] = "2005-12-31 23:00"
config['CALIBRATION_PERIOD'] = "2002-10-01, 2003-09-30"
config['EVALUATION_PERIOD'] = "2003-10-01, 2004-09-30"
config['SPINUP_PERIOD'] = "2001-01-01, 2002-09-30"

# (Optional) Paths to institutional data roots — customize if using shared infra
config['DATATOOL_DATASET_ROOT'] = '/path/to/meteorological-data/'
config['GISTOOL_DATASET_ROOT']  = '/path/to/geospatial-data/'
config['TOOL_CACHE']            = '/path/to/cache/dir'
config['CLUSTER_JSON']          = '/path/to/cluster.json'

# Basic optimization knobs if desired (example only)
config['PARAMS_TO_CALIBRATE'] = 'minStomatalResistance,cond2photo_slope,vcmax25_canopyTop,jmax25_scale,summerLAI,rootingDepth,soilStressParam,z0Canopy,windReductionParam'
config['OPTIMIZATION_TARGET'] = 'et'
config['ET_OBS_SOURCE'] = 'fluxnet'  # Use FLUXNET tower data (not MODIS MOD16)
config['ITERATIVE_OPTIMIZATION_ALGORITHM'] = 'DDS'
config['OPTIMIZATION_METRIC'] = 'KGE'
config['CALIBRATION_TIMESTEP'] = 'daily'  
config['NUMBER_OF_ITERATIONS'] = 100 

# Unique experiment ID for outputs
config["EXPERIMENT_ID"] = "run_fluxnet_1"

# === Save the customized configuration ===
out_config = Path("./config_fluxnet_CA-NS7.yaml")
with open(out_config, "w") as f:
    yaml.dump(config, f, default_flow_style=False, sort_keys=False)

print(f"✅ New configuration written to: {out_config}")
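
The spin-up, calibration, and evaluation windows above are easy to get subtly wrong (overlapping periods, or a period that falls outside the experiment window). The following is a minimal, standalone sketch of a consistency check on those strings. The helper names `parse_period` and `check_periods` are hypothetical and are not part of SYMFLUENCE; the date formats are assumed from the values used in this configuration.

```python
from datetime import datetime

# Values copied from the configuration above
experiment = ("2001-01-01 01:00", "2005-12-31 23:00")
periods = {
    "SPINUP_PERIOD": "2001-01-01, 2002-09-30",
    "CALIBRATION_PERIOD": "2002-10-01, 2003-09-30",
    "EVALUATION_PERIOD": "2003-10-01, 2004-09-30",
}

def parse_period(text):
    """Split 'YYYY-MM-DD, YYYY-MM-DD' into a (start, end) pair of dates."""
    start, end = (datetime.strptime(part.strip(), "%Y-%m-%d").date()
                  for part in text.split(","))
    return start, end

def check_periods(experiment, periods):
    """Return a list of human-readable problems (empty if none are found)."""
    exp_start, exp_end = (datetime.strptime(t, "%Y-%m-%d %H:%M").date()
                          for t in experiment)
    parsed = {name: parse_period(text) for name, text in periods.items()}
    problems = []
    for name, (start, end) in parsed.items():
        if start > end:
            problems.append(f"{name}: start is after end")
        if start < exp_start or end > exp_end:
            problems.append(f"{name}: outside the experiment window")
    # After sorting by start date, each period should begin after the previous one ends
    ordered = sorted(parsed.items(), key=lambda item: item[1][0])
    for (name_a, (_, end_a)), (name_b, (start_b, _)) in zip(ordered, ordered[1:]):
        if start_b <= end_a:
            problems.append(f"{name_a} overlaps {name_b}")
    return problems

problems = check_periods(experiment, periods)
print(problems or "Periods are consistent")
```

The check compares calendar dates only, so the 01:00 start time of the experiment window is ignored.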

## Step 1b — Initialize SYMFLUENCE
Initialize the framework using the configuration prepared above.

In [None]:
# Step 1b — Initialize SYMFLUENCE
from symfluence import SYMFLUENCE  # adjust if your import path differs

config_path = "./config_fluxnet_CA-NS7.yaml"
symfluence = SYMFLUENCE(config_path, visualize=True)

print("✅ SYMFLUENCE initialized successfully.")
print(f"Configuration loaded from: {config_path}")

## Step 1c — Project structure setup
Create the standardized project directory and a pour-point feature for the site.

In [None]:
# Step 1c — Project structure setup
from pathlib import Path

# 1) Create the standardized project layout (logs, config link, data/output folders, etc.)
project_dir = symfluence.managers['project'].setup_project()

# 2) Create a pour-point feature (site reference geometry for point-scale workflows)
pour_point_path = symfluence.managers['project'].create_pour_point()

print("✅ Project structure created.")
print(f"Project root: {project_dir}")
print(f"Pour point:   {pour_point_path}")

# 3) Brief top-level directory preview
print("\nTop-level structure:")
for p in sorted(Path(project_dir).iterdir()):
    if p.is_dir():
        print(f"├── {p.name}")

# Step 2 — Domain definition (point-scale GRU)
The domain is a **single GRU** around the flux tower footprint, ensuring a strictly point-scale (non-routed) experiment.

### Step 2a — Geospatial attribute acquisition

In [None]:
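# Step 2a — Acquire attributes (model-agnostic)
# NOTE: the call below is commented out, so as written this cell only prints.
# Uncomment it if the geospatial attributes have not been acquired yet.
#symfluence.managers['data'].acquire_attributes()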
print("✅ Attribute acquisition complete")

### Step 2b — Domain definition (point-scale)
Define a minimal footprint around **CA-NS7** consistent with the pour point.

In [None]:
# Step 2b — Define the point-scale domain
watershed_path = symfluence.managers['domain'].define_domain()
print("✅ Domain definition complete")
print(f"Domain file: {watershed_path}")
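
To make "consistent with the pour point" concrete, the sketch below derives a square bounding box from the site coordinates. It assumes the `lat_max/lon_min/lat_min/lon_max` ordering and the ±0.05° half-width that the Step 1 configuration values imply; `bounding_box_from_point` and `parse_point` are hypothetical helpers, not SYMFLUENCE functions.

```python
def parse_point(coords):
    """Parse a 'lat/lon' string into a (lat, lon) pair of floats."""
    lat, lon = (float(x) for x in coords.split("/"))
    return lat, lon

def bounding_box_from_point(lat, lon, half_width_deg=0.05):
    """Return 'lat_max/lon_min/lat_min/lon_max' for a square box around a point."""
    lat_max, lat_min = lat + half_width_deg, lat - half_width_deg
    lon_min, lon_max = lon - half_width_deg, lon + half_width_deg
    # Fixed-precision formatting keeps floating-point noise out of the config string
    return "/".join(f"{v:.4f}" for v in (lat_max, lon_min, lat_min, lon_max))

lat, lon = parse_point("56.6358/-99.9483")   # CA-NS7 coordinates
bbox = bounding_box_from_point(lat, lon)
print(bbox)  # → 56.6858/-99.9983/56.5858/-99.8983
```

Generating the string this way keeps the box centred on the tower if the coordinates are changed for another site.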

### Step 2c — Discretization (required even for 1 GRU = 1 HRU)
Creates the **catchment HRU** artifacts required by downstream steps (still 1:1 with the GRU for point scale).

In [None]:
# Step 2c — Discretization (GRUs → HRUs 1:1)
hru_path = symfluence.managers['domain'].discretize_domain()
print("✅ Domain discretization complete")
print(f"HRU file: {hru_path}")

### Step 2d — Verification & inspection (CA-NS7)
We use SYMFLUENCE's native plotting to draw a minimal GRU–HRU overlay and display the saved figure as a visual check of the domain outputs.

In [None]:
# Step 2d — Verify domain outputs and visualize (using native SYMFLUENCE plotting)
from IPython.display import Image, display

# Use native visualization
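plot_path = symfluence.managers['domain'].visualize_domain()

# Only report and display the figure if a path was actually returned
if plot_path:
    print(f"Domain plot saved to: {plot_path}")
    display(Image(filename=str(plot_path)))
else:
    print("No domain plot was returned; check the project logs.")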

# Step 3 — Input preprocessing (model-agnostic)
We prepare inputs in three small moves: 1) acquire **meteorological forcings**, 2) process **FLUXNET observations**, and 3) run **model-agnostic preprocessing** to standardize variables and time steps.

### Step 3a — Acquire meteorological forcings (ERA5)

In [None]:
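# Step 3a — Forcings
# NOTE: the call below is commented out, so as written this cell only prints.
# Uncomment it if the ERA5 forcings have not been acquired yet.
#symfluence.managers['data'].acquire_forcings()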
print("✅ Forcing data acquisition complete")

### Step 3b — Process observations (FLUXNET)

In [None]:
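# Step 3b — Observations
# NOTE: the call below is commented out, so as written this cell only prints.
# Uncomment it if the FLUXNET observations have not been processed yet.
#symfluence.managers['data'].process_observed_data()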
print("✅ FLUXNET observational data processing complete")

### Step 3c — Model-agnostic preprocessing

In [None]:
# Step 3c — Model-agnostic preprocessing
symfluence.managers['data'].run_model_agnostic_preprocessing()
print("✅ Model-agnostic preprocessing complete")
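
As a generic illustration of what "standardizing time steps" can involve, the sketch below averages an hourly series to daily means and drops incomplete days. It is not SYMFLUENCE's internal implementation; the `daily_means` helper and the 24-sample completeness threshold are assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def daily_means(timestamps, values, min_count=24):
    """Average sub-daily values per calendar day; skip days with too few samples."""
    buckets = defaultdict(list)
    for t, v in zip(timestamps, values):
        if v is not None:              # None marks a gap in the record
            buckets[t.date()].append(v)
    return {day: sum(vals) / len(vals)
            for day, vals in sorted(buckets.items())
            if len(vals) >= min_count}

# Synthetic hourly series: two complete days followed by six hours of a third day
start = datetime(2003, 7, 1, 0, 0)
timestamps = [start + timedelta(hours=h) for h in range(54)]
values = [float(h % 24) for h in range(54)]   # 0..23 repeating

result = daily_means(timestamps, values)
for day, mean in result.items():
    print(day, mean)
```

Requiring a full day of samples avoids biasing daily flux means toward daytime or nighttime values when the record has gaps.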

### Step 3d — Quick verification
Confirm the expected folders exist and contain files (derived from configuration; no hard-coded paths).

In [None]:
from pathlib import Path
import yaml

with open("./config_fluxnet_CA-NS7.yaml") as f:
    cfg = yaml.safe_load(f)

data_dir   = Path(cfg["SYMFLUENCE_DATA_DIR"])
domain_dir = data_dir / f"domain_{cfg['DOMAIN_NAME']}"

targets = {
    "forcing/raw_data":                        domain_dir / "forcing" / "raw_data",
    "forcing/basin_averaged_data":             domain_dir / "forcing" / "basin_averaged_data",
}

def count_files(p: Path) -> int:
    return sum(1 for x in p.iterdir() if x.is_file()) if p.exists() else 0

for label, path in targets.items():
    exists = path.exists()
    n = count_files(path)
    status = "✅" if exists and n > 0 else ("⚠️ empty" if exists else "❌ missing")
    suffix = f"({n} files)" if exists else ""
    print(f"{status} {label}  {suffix}")

# Step 4 — Model-specific preprocessing & model run (SUMMA)

### Step 4a — SUMMA-specific preprocessing

In [None]:
# Step 4a — SUMMA-specific preprocessing
symfluence.managers['model'].preprocess_models()
print("✅ Model-specific preprocessing complete")

### Step 4b — Instantiate & run the model

In [None]:
# Step 4b — Instantiate & run SUMMA
print(f"Running {symfluence.config['HYDROLOGICAL_MODEL']} for point-scale simulation…")
symfluence.managers['model'].run_models()
print("✅ Point-scale model run complete")

### Step 4c — Quick verification
Print where SUMMA inputs and run outputs were written (paths are derived from the configuration).

In [None]:
from pathlib import Path
import yaml

with open("./config_fluxnet_CA-NS7.yaml") as f:
    cfg = yaml.safe_load(f)

data_dir   = Path(cfg["SYMFLUENCE_DATA_DIR"])
domain_dir = data_dir / f"domain_{cfg['DOMAIN_NAME']}"

# Common locations used by the model manager
summa_in   = domain_dir / "forcing" / "SUMMA_input"
results    = domain_dir / "simulations" / cfg['EXPERIMENT_ID'] / 'SUMMA'

print("SUMMA input dir:", summa_in if summa_in.exists() else "(not found)")
print("Results dir:",    results if results.exists()    else "(not found)")

# Step 5 — ET & H Validation (FLUXNET vs Simulation)
We compare **observed** and **simulated** latent heat flux (LE, equivalent to ET) and sensible heat flux (H) using plots generated by SYMFLUENCE's reporting manager. The cell displays the expected flux plots when they exist and otherwise lists the plots that were generated.

In [None]:
# Step 5 — ET & H Evaluation 

from IPython.display import Image, display
from pathlib import Path

# Generate all SUMMA output visualizations (including energy fluxes with obs overlay)
plot_paths = symfluence.managers['reporting'].visualize_summa_outputs(
    experiment_id=symfluence.config['EXPERIMENT_ID']
)

# Display the energy flux comparison plots
flux_vars = ['scalarLatHeatTotal', 'scalarSenHeatTotal']  # SUMMA latent and sensible heat; missing keys are skipped
found_plots = False

for var in flux_vars:
    if var in plot_paths:
        plot_file = Path(plot_paths[var])
        var_label = "Latent Heat (ET)" if "Lat" in var else "Sensible Heat"
        print(f"\n{var_label} evaluation plot: {plot_file}")
        display(Image(filename=str(plot_file)))
        found_plots = True

if not found_plots:
    print("Energy flux plots not found. Available plots:")
    for var, path in plot_paths.items():
        print(f"  - {var}: {path}")

print("\nEnergy flux evaluation complete")
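
Because latent heat flux and ET are the same quantity in different units, it can help to convert between them when reading the plots. The sketch below assumes a constant latent heat of vaporization of 2.45 MJ kg⁻¹ (an approximation; the true value varies slightly with temperature). The function name is hypothetical. Sign conventions can differ between model output and tower data, so check them before comparing.

```python
LATENT_HEAT_VAPORIZATION = 2.45e6   # J kg^-1, approximate value near 20 °C
SECONDS_PER_DAY = 86_400

def latent_heat_to_et(le_w_m2, lam=LATENT_HEAT_VAPORIZATION):
    """Convert latent heat flux (W m^-2) to evapotranspiration (mm day^-1)."""
    # W m^-2 / (J kg^-1) = kg m^-2 s^-1, and 1 kg m^-2 of water is a 1 mm layer
    return le_w_m2 / lam * SECONDS_PER_DAY

et = latent_heat_to_et(100.0)
print(f"{et:.2f} mm/day")  # → 3.53 mm/day
```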

In [None]:
# Step 5b — Run calibration 

results_file = symfluence.managers['optimization'].calibrate_model()  
print("Calibration results file:", results_file)
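
The configuration sets `OPTIMIZATION_METRIC` to `KGE`. For reference, here is a minimal sketch of the Kling-Gupta efficiency in its 2009 form, written with the standard library only. It is an illustration of the metric, not the code SYMFLUENCE uses internally.

```python
from math import sqrt
from statistics import fmean, pstdev

def kge(obs, sim):
    """Kling-Gupta efficiency (2009 form); 1.0 indicates a perfect match."""
    # Keep only time steps where both series have a value
    pairs = [(o, s) for o, s in zip(obs, sim) if o is not None and s is not None]
    o = [p[0] for p in pairs]
    s = [p[1] for p in pairs]
    mo, ms = fmean(o), fmean(s)
    so, ss = pstdev(o), pstdev(s)
    cov = fmean([(a - mo) * (b - ms) for a, b in zip(o, s)])
    r = cov / (so * ss)      # linear correlation
    alpha = ss / so          # variability ratio
    beta = ms / mo           # bias ratio
    # Undefined when obs is constant (so == 0) or has zero mean (mo == 0)
    return 1.0 - sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

obs = [1.0, 2.0, 3.0, 4.0, 5.0]
print(kge(obs, obs))                     # identical series
print(kge(obs, [2 * x for x in obs]))    # doubled series: alpha = beta = 2
```

The doubled series is perfectly correlated with the observations but still scores poorly, which shows that KGE penalizes bias and variability errors as well as timing.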

In [None]:
# Step 5c — Post-calibration visualization (using native SYMFLUENCE plotting)
#
# Generates calibration-specific visualizations:
# - Optimization progress/convergence plot
# - Calibrated model comparison (LE/H obs vs sim with metrics)

from IPython.display import Image, display

# Generate post-calibration visualizations
plot_paths = symfluence.managers['reporting'].visualize_calibration_results(
    experiment_id=symfluence.config['EXPERIMENT_ID']
)

# Display all generated plots
for plot_name, plot_path in plot_paths.items():
    print(f"\n{plot_name}:")
    display(Image(filename=str(plot_path)))

print("\nPost-calibration visualization complete")
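
For intuition about the convergence plot, the sketch below implements a bare-bones Dynamically Dimensioned Search (DDS, after Tolson & Shoemaker, 2007) on a toy objective and records the best value found so far at each iteration. This is an illustrative sketch, not SYMFLUENCE's optimizer; `dds_minimize`, the toy `sphere` objective, and the bounds are all assumptions. To maximize KGE with a minimizer like this one, a common choice is to minimize `1 - KGE`.

```python
import math
import random

def dds_minimize(func, bounds, n_iter=100, r=0.2, seed=0):
    """Minimal DDS sketch; requires n_iter >= 2. Returns (best_x, best_f, trace)."""
    rng = random.Random(seed)
    best_x = [rng.uniform(lo, hi) for lo, hi in bounds]
    best_f = func(best_x)
    trace = [best_f]
    for i in range(1, n_iter + 1):
        # The probability of perturbing each parameter shrinks as the search proceeds
        p = 1.0 - math.log(i) / math.log(n_iter)
        dims = [j for j in range(len(bounds)) if rng.random() < p]
        if not dims:                      # always perturb at least one parameter
            dims = [rng.randrange(len(bounds))]
        cand = list(best_x)
        for j in dims:
            lo, hi = bounds[j]
            value = cand[j] + r * (hi - lo) * rng.gauss(0.0, 1.0)
            # Reflect at the bounds; fall back to the bound if reflection overshoots
            if value < lo:
                value = lo + (lo - value)
                if value > hi:
                    value = lo
            elif value > hi:
                value = hi - (value - hi)
                if value < lo:
                    value = hi
            cand[j] = value
        f = func(cand)
        if f <= best_f:                   # greedy: keep a candidate only if it is no worse
            best_x, best_f = cand, f
        trace.append(best_f)
    return best_x, best_f, trace

def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

bounds = [(-5.0, 5.0)] * 3
best_x, best_f, trace = dds_minimize(sphere, bounds, n_iter=100)
print(f"start: {trace[0]:.3f}, best after 100 iterations: {best_f:.3f}")
```

The `trace` list is non-increasing by construction, which is the staircase shape an optimization progress plot typically shows.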