# CONFLUENCE Tutorial - 2: Point-Scale Workflow (FLUXNET Example)

## Introduction

Building on the previous tutorial's foundation in CONFLUENCE workflow management and point-scale modeling, this notebook extends our analysis to focus on energy balance and evapotranspiration processes. While the SNOTEL tutorial emphasized snow dynamics and soil moisture in mountain environments, this example demonstrates CONFLUENCE's capabilities for simulating land-atmosphere interactions using eddy covariance flux tower observations.

### FLUXNET: A Global Network for Energy and Carbon Flux Observations

The FLUXNET network represents one of the most comprehensive global observational frameworks for studying land-atmosphere interactions, providing continuous measurements of energy, water, and carbon fluxes using the eddy covariance technique. These towers offer unique advantages for hydrological model evaluation:

1. **Direct flux measurements**: Evapotranspiration and sensible heat flux observations provide direct validation targets for land surface energy balance models
2. **High temporal resolution**: Sub-daily measurements capture diurnal cycles and rapid response to environmental drivers
3. **Multi-year records**: Long-term observations enable assessment of seasonal dynamics and interannual variability
4. **Ecosystem diversity**: Sites span major biomes, allowing process-based model evaluation across diverse vegetation types and climatic conditions

### Scientific Importance of Energy Balance Modeling

Accurate representation of land-atmosphere energy exchanges is fundamental to hydrological modeling for several reasons:

1. **Evapotranspiration partitioning**: Understanding the relative contributions of soil evaporation, plant transpiration, and canopy interception to total water loss
2. **Coupling with soil moisture**: Energy balance directly influences soil moisture dynamics through evapotranspiration demand and soil-plant-atmosphere feedback mechanisms
3. **Vegetation stress**: Accurate simulation of plant water stress and stomatal response to environmental conditions
4. **Climate sensitivity**: Land-atmosphere interactions represent a key feedback mechanism in climate variability and change
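
For reference, the surface energy balance behind the points above can be written compactly. This is the standard textbook form, not something specific to CONFLUENCE or SUMMA. It assumes that canopy heat storage and other minor storage terms are folded into the residual $\varepsilon$, and that $R_n$ is positive toward the surface, $H$ and $LE$ are positive away from it, and $G$ is positive into the soil.

```latex
R_n - G = H + LE + \varepsilon,
\qquad
\mathrm{CR} = \frac{\sum_t \left(H_t + LE_t\right)}{\sum_t \left(R_{n,t} - G_t\right)}
```

Here $\mathrm{CR}$ is the energy balance closure ratio over a set of time steps $t$. A value of 1 means the turbulent fluxes fully account for the available energy; eddy covariance records often fall below 1, which is worth keeping in mind when comparing simulated and observed fluxes.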

### Case Study: CA-NS7 Boreal Forest Site

This tutorial focuses on the CA-NS7 FLUXNET site, located in the boreal forest of northern Manitoba, Canada (56.6358°N, 99.9483°W). This site presents distinct scientific challenges compared to the mountain snow environment of the previous tutorial:

**Site characteristics:**
- **Ecosystem type**: Mature boreal forest dominated by black spruce (*Picea mariana*)
- **Climate regime**: Continental boreal climate with pronounced seasonal temperature variations
- **Elevation**: 260 m above sea level
- **Soil conditions**: Organic-rich soils with seasonal freezing and variable drainage
- **Observational period**: Multi-year records of energy, water, and carbon fluxes

**Scientific challenges:**
- **Seasonal vegetation dynamics**: Pronounced phenological cycles affecting canopy conductance and interception
- **Freeze-thaw processes**: Soil and vegetation interactions during spring thaw periods
- **Boreal forest energy balance**: Complex canopy structure effects on radiation partitioning and aerodynamic properties
- **Interannual variability**: Sensitivity to climate drivers and ecosystem disturbance history

## Learning Objectives

Through this tutorial, you will:

1. **Extend CONFLUENCE applications**: Apply the workflow to energy balance modeling and flux tower validation
2. **Understand ecosystem-specific modeling**: Configure SUMMA for boreal forest conditions and vegetation parameterizations
3. **Evaluate energy balance processes**: Compare simulated and observed evapotranspiration and sensible heat flux using established metrics
4. **Interpret land-atmosphere interactions**: Analyze the physical drivers of model-observation discrepancies in energy partitioning
5. **Connect point-scale to ecosystem scales**: Understand how flux tower "footprints" relate to model grid cell assumptions

### Tutorial Structure

This tutorial follows the established CONFLUENCE workflow while emphasizing energy balance processes:

1. **Configuration**: Adapt point-scale setup for boreal forest conditions
2. **Data acquisition**: Integrate FLUXNET observations with meteorological forcing
3. **Model execution**: Run SUMMA with appropriate vegetation and soil parameterizations
4. **Flux validation**: Compare simulated and observed energy balance components
5. **Process analysis**: Interpret results in the context of boreal ecosystem dynamics

By completing this tutorial, you'll develop expertise in energy balance modeling that complements the snow and soil moisture focus of the previous example, providing a more comprehensive foundation for distributed hydrological modeling applications.

## Step 1: Rapid Workflow Setup for FLUXNET Energy Balance Modeling
Building on the CONFLUENCE fundamentals established in Tutorial 01a, we can now streamline the initial workflow setup. This step efficiently configures the system for energy balance validation at the CA-NS7 boreal forest FLUXNET site, leveraging the same reproducible framework while focusing on ecosystem-specific parameterization.

**Key Differences from Tutorial 01a:**

- Location: CA-NS7 (Manitoba boreal forest) vs. Paradise SNOTEL (Cascade Mountains)
- Validation Focus: Energy fluxes (LE, H, Rn) vs. snow/soil moisture (SWE, SM)
- Ecosystem Type: Mature boreal forest vs. transitional snow zone
- Temporal Emphasis: Sub-daily energy cycles vs. seasonal snow dynamics
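
Before a new site configuration is written to disk, it can help to sanity-check the handful of values that change between tutorials. The sketch below is a minimal, standalone illustration: the helper names (`parse_pour_point`, `parse_period`, `check_periods`) are hypothetical and not part of CONFLUENCE, and the string formats are assumed to match the ones used in this notebook (`'lat/lon'` for coordinates, `'start, end'` for periods).

```python
from datetime import datetime


def parse_pour_point(coords):
    """Split a 'lat/lon' string into floats and range-check them."""
    lat_str, lon_str = coords.split("/")
    lat, lon = float(lat_str), float(lon_str)
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError(f"Coordinates out of range: {coords}")
    return lat, lon


def parse_period(period):
    """Split a 'YYYY-MM-DD, YYYY-MM-DD' string into start and end datetimes."""
    start_str, end_str = (part.strip() for part in period.split(","))
    start = datetime.strptime(start_str, "%Y-%m-%d")
    end = datetime.strptime(end_str, "%Y-%m-%d")
    if end <= start:
        raise ValueError(f"Period ends before it starts: {period}")
    return start, end


def check_periods(config):
    """Return the names of periods that fall outside the experiment window."""
    exp_start = datetime.strptime(config["EXPERIMENT_TIME_START"], "%Y-%m-%d %H:%M")
    exp_end = datetime.strptime(config["EXPERIMENT_TIME_END"], "%Y-%m-%d %H:%M")
    problems = []
    for key in ("SPINUP_PERIOD", "CALIBRATION_PERIOD", "EVALUATION_PERIOD"):
        start, end = parse_period(config[key])
        # Compare on dates only, since the periods carry no time of day
        if start.date() < exp_start.date() or end.date() > exp_end.date():
            problems.append(key)
    return problems


# Values copied from the configuration used in this tutorial
example_config = {
    "POUR_POINT_COORDS": "56.6358/-99.9483",
    "EXPERIMENT_TIME_START": "2001-01-01 01:00",
    "EXPERIMENT_TIME_END": "2005-12-31 23:00",
    "SPINUP_PERIOD": "2001-01-01, 2001-12-31",
    "CALIBRATION_PERIOD": "2002-01-01, 2003-12-31",
    "EVALUATION_PERIOD": "2004-01-01, 2005-12-31",
}

latitude, longitude = parse_pour_point(example_config["POUR_POINT_COORDS"])
out_of_window = check_periods(example_config)
print(f"Tower location: {latitude:.4f}, {longitude:.4f}")
print(f"Periods outside experiment window: {out_of_window or 'none'}")
```

Catching a swapped latitude/longitude or a period that extends past the experiment window at this stage is cheaper than discovering it after forcing data has been downloaded.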

In [None]:
# =============================================================================
# STEP 1: RAPID WORKFLOW SETUP FOR FLUXNET ENERGY BALANCE MODELING
# =============================================================================

# Import required libraries
import sys
import os
from pathlib import Path
import yaml
import pandas as pd
import matplotlib.pyplot as plt
import geopandas as gpd
from datetime import datetime
import xarray as xr
import numpy as np

# Add CONFLUENCE to path
confluence_path = Path('../').resolve()
sys.path.append(str(confluence_path))

# Import main CONFLUENCE class
from CONFLUENCE import CONFLUENCE

# Set up plotting style
plt.style.use('default')
%matplotlib inline

print("=== CONFLUENCE Tutorial 01b: FLUXNET Energy Balance Validation ===")
print(f"Building on Tutorial 01a foundations for rapid workflow deployment")

# =============================================================================
# CONFIGURATION FOR CA-NS7 BOREAL FOREST SITE
# =============================================================================

print("\n🌿 Configuring for CA-NS7 Boreal Forest FLUXNET Site")

# Set directory paths
CONFLUENCE_CODE_DIR = confluence_path
CONFLUENCE_DATA_DIR = Path('/Users/darrieythorsson/compHydro/data/CONFLUENCE_data')  # ← Update this path

# Load template configuration and customize for FLUXNET site
config_template_path = CONFLUENCE_CODE_DIR / '0_config_files' / 'config_point_template.yaml'

with open(config_template_path, 'r') as f:
    config_dict = yaml.safe_load(f)

# Update for CA-NS7 boreal forest site
config_updates = {
    'CONFLUENCE_CODE_DIR': str(CONFLUENCE_CODE_DIR),
    'CONFLUENCE_DATA_DIR': str(CONFLUENCE_DATA_DIR),
    'DOMAIN_NAME': 'CA_NS7',
    'EXPERIMENT_ID': 'boreal_energy_balance',
    'POUR_POINT_COORDS': '56.6358/-99.9483',  # CA-NS7 coordinates
    'DOWNLOAD_FLUXNET': 'true',
    'FLUXNET_STATION': 'CA-NS7',
    'EXPERIMENT_TIME_START': '2001-01-01 01:00',  # FLUXNET data availability
    'EXPERIMENT_TIME_END': '2005-12-31 23:00',
    'CALIBRATION_PERIOD': '2002-01-01, 2003-12-31',
    'EVALUATION_PERIOD': '2004-01-01, 2005-12-31',
    'SPINUP_PERIOD': '2001-01-01, 2001-12-31'
}

config_dict.update(config_updates)

# Add experiment metadata for traceability
config_dict['NOTEBOOK_CREATION_TIME'] = datetime.now().isoformat()
config_dict['NOTEBOOK_CREATOR'] = 'CONFLUENCE_Tutorial_01b'
config_dict['TARGET_PROCESSES'] = 'Energy balance, evapotranspiration, boreal forest dynamics'

# Save configuration
temp_config_path = CONFLUENCE_CODE_DIR / '0_config_files' / 'config_fluxnet_notebook.yaml'
with open(temp_config_path, 'w') as f:
    yaml.dump(config_dict, f, default_flow_style=False, sort_keys=False)

print(f"✅ Configuration saved: {temp_config_path}")

# =============================================================================
# SYSTEM INITIALIZATION AND PROJECT STRUCTURE
# =============================================================================

print("\n🏗️  Initializing CONFLUENCE System...")

# Initialize CONFLUENCE with FLUXNET configuration
confluence = CONFLUENCE(temp_config_path)

print(f"✅ System initialized with {len(confluence.managers)} managers")

# Create project structure
print(f"\n📁 Creating project structure for {config_dict['DOMAIN_NAME']}...")
project_dir = confluence.managers['project'].setup_project()
pour_point_path = confluence.managers['project'].create_pour_point()

print(f"✅ Project directory: {project_dir}")
print(f"✅ Pour point created: {pour_point_path}")

# =============================================================================
# CONFIGURATION SUMMARY AND VALIDATION TARGETS
# =============================================================================

print(f"\n📊 CA-NS7 Site Configuration Summary:")
site_info = {
    "Location": f"{config_dict['POUR_POINT_COORDS']} (Manitoba, Canada)",
    "Ecosystem": "Mature boreal forest (Picea mariana dominated)",
    "Elevation": "~260 m above sea level",
    "Climate": "Continental boreal with pronounced seasonality",
    "Simulation period": f"{config_dict['EXPERIMENT_TIME_START']} to {config_dict['EXPERIMENT_TIME_END']}",
    "Target fluxes": "Latent heat (LE), Sensible heat (H), Net radiation (Rn)"
}

for key, value in site_info.items():
    print(f"   🌲 {key}: {value}")

print(f"\n🎯 Energy Balance Validation Framework:")
validation_targets = [
    "Latent heat flux (LE) - Direct ET measurement",
    "Sensible heat flux (H) - Convective energy transfer", 
    "Net radiation (Rn) - Available energy for surface processes",
    "Ground heat flux (G) - Soil energy storage",
    "Energy balance closure - Physical consistency check"
]

for target in validation_targets:
    print(f"   ⚡ {target}")

print(f"\n🔄 Workflow Status:")
workflow_status = confluence.workflow_orchestrator.get_workflow_status()
print(f"   Total steps: {workflow_status['total_steps']}")
print(f"   Completed: {workflow_status['completed_steps']}")
print(f"   Ready for domain definition and data acquisition")

print(f"\n🚀 Setup complete - Ready for boreal forest energy balance modeling!")

## Step 2: Streamlined Geospatial Domain Setup for Boreal Forest Site
Having established the geospatial domain definition principles in Tutorial 01a, we can now efficiently configure the spatial framework for our boreal forest FLUXNET site. The same point-scale approach applies, but the underlying geospatial characteristics reflect the distinct boreal ecosystem.

**Geospatial Contrasts: CA-NS7 vs Paradise SNOTEL**

- Elevation: 260 m (boreal lowland) vs 1,630 m (mountain transitional zone)
- Vegetation: Mature black spruce forest vs mixed coniferous/alpine vegetation
- Soils: Organic-rich boreal soils vs mineral mountain soils
- Climate: Continental boreal vs maritime-influenced mountain climate
- Drainage: Variable boreal drainage vs steep mountain topography

The same CONFLUENCE spatial framework handles both environments seamlessly, demonstrating the transferability of the modeling approach across diverse ecosystems while capturing site-specific physical characteristics through the attribute acquisition process.
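
The verification code in this step reports the domain area in square degrees, which is hard to interpret at 56.6°N because a degree of longitude is much shorter than a degree of latitude there. As a rough aid, the sketch below builds a small box around the tower and estimates its area in km² on a spherical Earth. The function names and the 0.01° half-width are illustrative assumptions, and the `lat_max/lon_min/lat_min/lon_max` ordering of the bounding-box string is also an assumption that should be checked against your configuration template.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def point_bounding_box(lat, lon, half_width_deg=0.01):
    """Return (lat_max, lon_min, lat_min, lon_max) for a square box in degrees."""
    return (
        lat + half_width_deg,
        lon - half_width_deg,
        lat - half_width_deg,
        lon + half_width_deg,
    )


def box_area_km2(lat_max, lon_min, lat_min, lon_max):
    """Area of a lat/lon box on a sphere: R^2 * dlon * (sin(lat_max) - sin(lat_min))."""
    dlon = math.radians(lon_max - lon_min)
    band = math.sin(math.radians(lat_max)) - math.sin(math.radians(lat_min))
    return EARTH_RADIUS_KM ** 2 * dlon * band


# CA-NS7 tower coordinates from the configuration
box = point_bounding_box(56.6358, -99.9483)
bbox_string = "/".join(f"{value:.4f}" for value in box)
area_km2 = box_area_km2(*box)

print(f"Bounding box: {bbox_string}")
print(f"Approximate area: {area_km2:.2f} km²")
```

For an exact figure from the HRU shapefile itself, reprojecting the geometry to an equal-area coordinate reference system before calling `.area` is the more rigorous route.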

In [None]:
# =============================================================================
# STEP 2: STREAMLINED GEOSPATIAL DOMAIN SETUP FOR BOREAL FOREST SITE
# =============================================================================

print("=== Step 2: Geospatial Domain Setup for Boreal Forest Energy Balance ===")
print("Efficient spatial configuration leveraging Tutorial 01a foundations")

# =============================================================================
# RAPID ATTRIBUTE ACQUISITION FOR BOREAL FOREST CHARACTERISTICS
# =============================================================================

print(f"\n🌲 Acquiring Boreal Forest Geospatial Characteristics...")
print(f"   Location: CA-NS7 ({config_dict['POUR_POINT_COORDS']})")
print(f"   Bounding box: {config_dict.get('BOUNDING_BOX_COORDS', 'Auto-generated for point-scale')}")

print(f"\n📊 Expected Boreal Forest Attributes:")
boreal_attributes = [
    "Low elevation (~260m) with minimal topographic complexity",
    "Organic-rich soils with seasonal freeze-thaw dynamics",
    "Mature coniferous forest (Picea mariana dominated)", 
    "Continental climate with pronounced seasonal temperature range",
    "Variable drainage conditions typical of boreal landscapes"
]

for attr in boreal_attributes:
    print(f"   🌿 {attr}")

# Execute attribute acquisition
print(f"\n⬇️  Executing geospatial attribute acquisition...")
confluence.managers['data'].acquire_attributes()
print("✅ Attribute acquisition complete")

# =============================================================================
# DOMAIN DELINEATION AND DISCRETIZATION FOR POINT-SCALE FLUX TOWER
# =============================================================================

print(f"\n🎯 Point-Scale Domain Configuration for Flux Tower Footprint...")

# Domain delineation (single GRU representing flux tower footprint)
print(f"   Creating computational boundary representing flux tower footprint...")
watershed_path = confluence.managers['domain'].define_domain()

# Domain discretization (single HRU for energy balance modeling)  
print(f"   Creating single HRU for energy balance simulation...")
hru_path = confluence.managers['domain'].discretize_domain()

print(f"✅ Spatial domain configuration complete")

# =============================================================================
# VERIFICATION AND BOREAL FOREST CHARACTERIZATION
# =============================================================================

print(f"\n🔍 Verifying Boreal Forest Domain Characteristics...")

# Verify HRU creation and inspect characteristics
if hru_path and hru_path.exists():
    hru_gdf = gpd.read_file(hru_path)
    
    print(f"\n📋 Spatial Domain Summary:")
    print(f"   Number of HRUs: {len(hru_gdf)} (point-scale representation)")
    print(f"   Domain area: {hru_gdf.geometry.area.sum():.6f} degree²")
    print(f"   Centroid: ({hru_gdf.geometry.centroid.x.iloc[0]:.6f}, {hru_gdf.geometry.centroid.y.iloc[0]:.6f})")
    
    # Display representative characteristics if available
    if 'elevation' in hru_gdf.columns:
        print(f"   Elevation: {hru_gdf['elevation'].iloc[0]:.1f} m")
    if 'landclass' in hru_gdf.columns:
        print(f"   Dominant land cover: {hru_gdf['landclass'].iloc[0]}")
    if 'soilclass' in hru_gdf.columns:
        print(f"   Soil classification: {hru_gdf['soilclass'].iloc[0]}")
    
    print(f"\n🌲 Boreal Forest Spatial Context:")
    print(f"   → Single HRU represents flux tower measurement footprint")
    print(f"   → Uniform characteristics assumption appropriate for homogeneous forest")
    print(f"   → Contrasts with distributed modeling where spatial heterogeneity matters")
    
else:
    print("⚠️  HRU verification failed - check domain discretization")

# =============================================================================
# WORKFLOW PROGRESS AND NEXT STEPS
# =============================================================================

print(f"\n📊 Workflow Progress Check:")
workflow_status = confluence.workflow_orchestrator.get_workflow_status()
completed_spatial = [step['name'] for step in workflow_status['step_details'] 
                    if step['complete'] and step['name'] in ['setup_project', 'create_pour_point', 
                    'acquire_attributes', 'define_domain', 'discretize_domain']]

print(f"   ✅ Completed spatial steps: {len(completed_spatial)}/5")
for step in completed_spatial:
    print(f"      ✓ {step.replace('_', ' ').title()}")

print(f"\n🎯 Scientific Foundation Established:")
foundation_elements = [
    "Point-scale spatial framework configured for flux tower validation",
    "Boreal forest characteristics captured through attribute acquisition",
    "Single HRU represents homogeneous forest stand assumption",
    "Domain ready for meteorological forcing and FLUXNET data integration",
    "Same spatial principles as Tutorial 01a, different ecosystem context"
]

for element in foundation_elements:
    print(f"   🌿 {element}")

print(f"\n🚀 Ready for data preprocessing and energy balance modeling!")
print(f"   → Spatial domain: Configured for boreal forest conditions")
print(f"   → Next: Model-agnostic preprocessing and FLUXNET data integration") 
print(f"   → Target: Energy flux validation and process interpretation")

## Step 3: Data Pipeline
Leveraging the model-agnostic preprocessing concepts established in Tutorial 01a, we can now efficiently prepare the data pipeline for boreal forest energy balance modeling. The same standardized framework seamlessly handles the transition from snow/soil validation to energy flux evaluation, demonstrating CONFLUENCE's versatility across diverse validation objectives.

**Data Pipeline Adaptation: SNOTEL → FLUXNET**

- Forcing Data: Same ERA5 global reanalysis, different coordinates and period
- Validation Targets: Energy fluxes (LE, H, Rn, G) vs snow/soil states (SWE, SM)
- Temporal Focus: Sub-daily energy cycles vs seasonal snow dynamics
- Ecosystem Context: Boreal forest processes vs mountain snow processes
- Preprocessing Benefits: Same quality-controlled, standardized pipeline serves both applications

This demonstrates the core strength of CONFLUENCE's model-agnostic philosophy: consistent data preparation enables true process comparisons across sites, ecosystems, and validation targets.
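
Because flux tower records are sub-daily while the simulation output evaluated later in this notebook is daily, observations usually need to be aggregated before comparison. The sketch below shows one common approach with pandas: compute daily means, but discard days with too many missing half-hours. It is an illustration on synthetic data, not the routine CONFLUENCE runs internally; the function name and the 80% coverage threshold are assumptions.

```python
import numpy as np
import pandas as pd


def daily_mean_with_coverage(series, min_fraction=0.8):
    """Daily mean of a sub-daily series; days with too few valid samples become NaN."""
    # Infer the expected number of samples per day from the median time step
    step = series.index.to_series().diff().median()
    expected_per_day = pd.Timedelta("1D") / step

    grouped = series.resample("D")
    daily_mean = grouped.mean()    # NaN values are skipped
    valid_count = grouped.count()  # number of non-NaN samples per day

    return daily_mean.where(valid_count >= min_fraction * expected_per_day)


# Synthetic half-hourly latent heat flux (W m-2) for three days
index = pd.date_range("2002-07-01", periods=144, freq="30min")
le = pd.Series(100.0, index=index)
le.iloc[48:68] = np.nan  # 20 missing half-hours on the second day

daily = daily_mean_with_coverage(le)
print(daily)
```

In this example the second day keeps only 28 of 48 samples, so it is masked rather than averaged from a biased subset (gaps in flux records are not random and often cluster at night).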

In [None]:
# =============================================================================
# STEP 3: EFFICIENT DATA PIPELINE FOR ENERGY BALANCE VALIDATION
# =============================================================================

print("=== Step 3: Efficient Data Pipeline for Energy Balance Validation ===")
print("Applying model-agnostic preprocessing framework to FLUXNET energy flux validation")

# =============================================================================
# OBSERVATIONAL DATA: FLUXNET ENERGY BALANCE MEASUREMENTS
# =============================================================================

print(f"\n⚡ Processing FLUXNET Energy Balance Observations...")
print(f"   Site: {config_dict['FLUXNET_STATION']}")
print(f"   Ecosystem: Mature boreal forest (Picea mariana)")
print(f"   Measurement technique: Eddy covariance flux tower")

print(f"\n🎯 FLUXNET Validation Targets:")
fluxnet_variables = [
    "Latent heat flux (LE) - Direct evapotranspiration measurement",
    "Sensible heat flux (H) - Convective energy transfer to atmosphere",
    "Net radiation (Rn) - Available energy driving surface processes", 
    "Ground heat flux (G) - Soil energy storage and conduction",
    "Additional: Air temperature, humidity, wind for process interpretation"
]

for var in fluxnet_variables:
    print(f"   📊 {var}")

# Execute observational data processing
print(f"\n📥 Processing FLUXNET observational datasets...")
confluence.managers['data'].process_observed_data()
print("✅ FLUXNET data processing complete")

print(f"\n🔬 Scientific Value of Energy Flux Validation:")
validation_benefits = [
    "Direct measurement of land-atmosphere energy exchange",
    "Sub-daily resolution captures diurnal energy cycles", 
    "Multi-year records enable seasonal and interannual assessment",
    "Energy balance closure provides physical consistency check",
    "Ecosystem-specific validation for boreal forest processes"
]

for benefit in validation_benefits:
    print(f"   🌲 {benefit}")

# =============================================================================
# METEOROLOGICAL FORCING: ERA5 FOR BOREAL FOREST LOCATION
# =============================================================================

print(f"\n🌦️  Acquiring ERA5 Forcing for Boreal Forest Site...")
print(f"   Location: {config_dict['POUR_POINT_COORDS']} (Manitoba)")
print(f"   Period: {config_dict['EXPERIMENT_TIME_START']} to {config_dict['EXPERIMENT_TIME_END']}")
print(f"   Climate context: Continental boreal with pronounced seasonality")

print(f"\n📈 Forcing Data Characteristics for Boreal Energy Balance:")
forcing_context = [
    "Large seasonal temperature range (-30°C to +25°C)",
    "Pronounced radiation seasonality (long summer days, short winter days)",
    "Variable precipitation (snow/rain transition seasons)",
    "Continental air mass influences on humidity and wind",
    "Freeze-thaw cycles affecting soil and vegetation dynamics"
]

for context in forcing_context:
    print(f"   🌡️  {context}")

# Forcing acquisition is commented out for this demonstration; uncomment the call to download ERA5
print(f"\n⬇️  Forcing data acquisition (skipped in this demonstration)...")
# confluence.managers['data'].acquire_forcings()
print("ℹ️  ERA5 forcing acquisition not executed (call commented out)")

# =============================================================================
# MODEL-AGNOSTIC PREPROCESSING: STANDARDIZED PIPELINE
# =============================================================================

print(f"\n🔧 Model-Agnostic Preprocessing Pipeline...")
print(f"   Same framework as Tutorial 01a, different validation targets")

print(f"\n⚙️  Preprocessing Components:")
preprocessing_steps = [
    "Spatial remapping: ERA5 grid to flux tower footprint", 
    "Temporal alignment: Forcing and observation synchronization",
    "Quality control: Gap filling and outlier detection",
    "Format standardization: Model-independent NetCDF products",
    "Attribute integration: Boreal forest characteristics"
]

for step in preprocessing_steps:
    print(f"   🔄 {step}")

# Execute model-agnostic preprocessing
print(f"\n⚙️  Executing model-agnostic preprocessing...")
confluence.managers['data'].run_model_agnostic_preprocessing()
print("✅ Model-agnostic preprocessing complete")

print(f"\n🎯 Key Outputs for Energy Balance Modeling:")
agnostic_outputs = [
    "Basin-averaged forcing: Boreal forest-specific meteorology",
    "HRU attributes: Forest structure and soil characteristics", 
    "Spatial mapping: Conservative remapping for mass/energy conservation",
    "Quality reports: Data integrity and gap-filling documentation"
]

for output in agnostic_outputs:
    print(f"   📦 {output}")

# =============================================================================
# MODEL-SPECIFIC PREPROCESSING: SUMMA ENERGY BALANCE CONFIGURATION
# =============================================================================

print(f"\n🌿 SUMMA-Specific Configuration for Boreal Forest Energy Balance...")
print(f"   Focus: Energy partitioning and evapotranspiration processes")
print(f"   Ecosystem: Mature coniferous forest parameterization")

print(f"\n🔧 SUMMA Configuration for Boreal Forest:")
summa_config = [
    "Vegetation parameters: Boreal coniferous forest (LAI, albedo, roughness)",
    "Soil parameters: Organic-rich boreal soil hydraulic properties",
    "Energy balance: Canopy radiation interception and partitioning",
    "Stomatal conductance: Temperature and moisture stress responses",
    "Seasonal dynamics: Phenological controls on transpiration"
]

for config in summa_config:
    print(f"   🌲 {config}")

# Execute model-specific preprocessing
print(f"\n🔧 Executing SUMMA-specific preprocessing...")
confluence.managers['model'].preprocess_models()
print("✅ SUMMA configuration complete")

print(f"\n📋 SUMMA Output Configuration for Energy Balance Validation:")
summa_outputs = [
    "Latent heat flux (scalarLatHeatTotal) - For LE comparison",
    "Sensible heat flux (scalarSenHeatTotal) - For H comparison",
    "Net radiation (scalarNetRadiation) - For Rn comparison", 
    "Ground heat flux (scalarGroundHeatFlux) - For G comparison",
    "Soil temperature profile - For process interpretation",
    "Canopy temperature - For energy balance analysis"
]

for output in summa_outputs:
    print(f"   📊 {output}")

# =============================================================================
# DATA PIPELINE SUMMARY AND VALIDATION READINESS
# =============================================================================

print(f"\n✅ Data Pipeline Summary for Energy Balance Validation:")

pipeline_achievements = [
    "✅ FLUXNET energy flux observations processed and quality-controlled",
    "✅ ERA5 forcing acquired and spatially mapped to tower footprint", 
    "✅ Model-agnostic preprocessing creates standardized, reusable products",
    "✅ SUMMA configured for boreal forest energy balance simulation",
    "✅ Output variables aligned with FLUXNET validation targets"
]

for achievement in pipeline_achievements:
    print(f"   {achievement}")

print(f"\n🔬 Scientific Benefits Achieved:")
scientific_benefits = [
    "Consistent preprocessing eliminates data preparation as confounding variable",
    "Same framework enables comparison with Tutorial 01a snow/soil results",
    "Quality-controlled inputs support robust energy balance evaluation",
    "Standardized outputs enable automated benchmarking and model comparison",
    "Reproducible pipeline supports collaborative boreal forest research"
]

for benefit in scientific_benefits:
    print(f"   🎯 {benefit}")

print(f"\n🌐 Framework Versatility Demonstrated:")
print(f"   📊 Same preprocessing pipeline handles:")
print(f"      • Tutorial 01a: Snow/soil validation at mountain SNOTEL site")
print(f"      • Tutorial 01b: Energy flux validation at boreal FLUXNET site")
print(f"      • Future: Any point-scale validation across diverse ecosystems")

print(f"\n🚀 Ready for boreal forest energy balance simulation!")
print(f"   → Preprocessed inputs: Standardized and quality-controlled")
print(f"   → Model configuration: Optimized for boreal forest conditions")
print(f"   → Validation targets: FLUXNET energy fluxes prepared")
print(f"   → Next step: SUMMA execution and energy balance evaluation")

## Step 4: Model Execution for Energy Balance Simulation
Building on the detailed model instantiation concepts from Tutorial 01a, we can now efficiently execute the energy balance simulation. The same SUMMA process-based physics applies, but with emphasis on land-atmosphere energy exchange rather than snow accumulation and soil moisture dynamics.

**Key Process Differences: Energy Balance Focus**

- Primary Outputs: Latent heat (LE), sensible heat (H), net radiation (Rn), ground heat flux (G)
- Temporal Resolution: Sub-daily energy cycles vs seasonal snow evolution
- Validation Emphasis: Energy partitioning vs snow/soil state variables
- Ecosystem Context: Boreal forest canopy processes vs mountain snow processes

The same workflow orchestration and quality assurance framework ensures reproducible, physically realistic simulations across both applications.
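
Latent heat flux is reported in W m⁻², whereas evapotranspiration is usually discussed in mm per day, so a unit conversion is needed to move between the two. The sketch below shows the arithmetic. It assumes a constant latent heat of vaporization of 2.45 MJ kg⁻¹ (a common approximation near 20 °C, and not appropriate for sublimation), and the `positive_upward` flag is a hypothetical convenience for handling sign conventions: tower data are conventionally positive away from the surface, and you should confirm the convention of your simulated flux before comparing.

```python
LATENT_HEAT_VAPORIZATION = 2.45e6  # J kg-1, approximate value near 20 °C (assumption)
SECONDS_PER_DAY = 86400.0


def latent_heat_to_et_mm_day(le_w_m2, positive_upward=True):
    """Convert a mean latent heat flux (W m-2) to evapotranspiration (mm day-1)."""
    # Flip the sign if the flux is defined as positive toward the surface
    le_upward = le_w_m2 if positive_upward else -le_w_m2
    # W m-2 / (J kg-1) = kg m-2 s-1, and 1 kg m-2 of water is a 1 mm layer
    return le_upward / LATENT_HEAT_VAPORIZATION * SECONDS_PER_DAY


# A daily-mean flux of 100 W m-2, expressed under each sign convention
et_obs = latent_heat_to_et_mm_day(100.0)
et_sim = latent_heat_to_et_mm_day(-100.0, positive_upward=False)

print(f"Observed ET:  {et_obs:.2f} mm/day")
print(f"Simulated ET: {et_sim:.2f} mm/day")
```

A mismatch in sign convention shows up as a near-perfect negative correlation between simulated and observed fluxes, which is a quick diagnostic to look for in the evaluation step.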

In [None]:
# =============================================================================
# STEP 4: STREAMLINED MODEL EXECUTION FOR ENERGY BALANCE SIMULATION
# =============================================================================

print("=== Step 4: Energy Balance Simulation Execution ===")
print("Efficient model execution leveraging Tutorial 01a process-based foundations")

# =============================================================================
# SUMMA EXECUTION FOR BOREAL FOREST ENERGY BALANCE
# =============================================================================

print(f"\n🌲 Executing SUMMA for Boreal Forest Energy Balance...")
print(f"   Model: {config_dict['HYDROLOGICAL_MODEL']}")
print(f"   Domain: {config_dict['DOMAIN_NAME']} (single HRU)")
print(f"   Focus: Land-atmosphere energy exchange and evapotranspiration")
print(f"   Period: {config_dict['EXPERIMENT_TIME_START']} to {config_dict['EXPERIMENT_TIME_END']}")

print(f"\n⚡ Energy Balance Process Emphasis:")
energy_processes = [
    "Canopy radiation interception and partitioning",
    "Stomatal conductance and transpiration control",
    "Soil evaporation and surface energy exchange", 
    "Sensible heat transfer and boundary layer coupling"
]

for process in energy_processes:
    print(f"   🍃 {process}")

# Execute the model
print(f"\n🏃‍♂️ Running SUMMA energy balance simulation...")
confluence.managers['model'].run_models()
print("✅ Boreal forest energy balance simulation complete")

# =============================================================================
# QUICK VERIFICATION AND OUTPUT SUMMARY
# =============================================================================

print(f"\n🔍 Simulation Output Verification...")

# Locate and verify simulation outputs
sim_dir = confluence.project_dir / "simulations" / config_dict['EXPERIMENT_ID'] / "SUMMA"
expected_outputs = [f"{config_dict['EXPERIMENT_ID']}_day.nc"]

for output_file in expected_outputs:
    file_path = sim_dir / output_file
    exists = "✅" if file_path.exists() else "❌"
    if file_path.exists():
        file_size = file_path.stat().st_size / (1024*1024)  # MB
        print(f"   {exists} {output_file} ({file_size:.1f} MB)")
    else:
        print(f"   {exists} {output_file}")

print(f"\n📊 Key Energy Balance Variables Available:")
energy_variables = [
    "scalarLatHeatTotal (LE) - Latent heat flux", 
    "scalarSenHeatTotal (H) - Sensible heat flux",
    "scalarNetRadiation (Rn) - Net radiation",
    "scalarGroundHeatFlux (G) - Ground heat flux"
]

for var in energy_variables:
    print(f"   ⚡ {var}")

print(f"\n✅ Energy Balance Simulation Summary:")
print(f"   → Process-based simulation completed successfully")
print(f"   → Energy flux outputs generated for FLUXNET validation")
print(f"   → Quality-assured results ready for comprehensive evaluation")
print(f"   → Same workflow reliability as Tutorial 01a, different process focus")

print(f"\n🚀 Ready for energy balance evaluation and FLUXNET comparison!")

## Step 5: Energy Balance Evaluation and ET Process Validation
Building on the comprehensive evaluation framework established in Tutorial 01a, we now focus on energy flux validation using FLUXNET observations. The same scientific evaluation principles apply, but with emphasis on land-atmosphere energy exchange rather than snow/soil state variables.

**Evaluation Framework Adaptation: Snow/Soil → Energy Balance**

- Validation Targets: Latent heat (LE), sensible heat (H), net radiation (Rn) vs SWE, soil moisture
- Process Focus: Evapotranspiration partitioning vs snow accumulation/melt dynamics
- Temporal Scales: Sub-daily energy cycles vs seasonal snow evolution
- Performance Metrics: Energy balance closure and flux magnitude accuracy
- Physical Interpretation: Stomatal conductance and canopy processes vs snow physics
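
The two performance measures named above, flux magnitude accuracy and energy balance closure, can be sketched in a few lines of NumPy. This is a minimal illustration on made-up numbers, not the evaluation code used by CONFLUENCE; the function names are hypothetical, and the closure ratio follows the simple sum-of-fluxes definition (turbulent fluxes divided by available energy).

```python
import numpy as np


def flux_metrics(sim, obs):
    """Bias, RMSE, and Pearson correlation over time steps where both series are valid."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    valid = np.isfinite(sim) & np.isfinite(obs)
    sim, obs = sim[valid], obs[valid]
    error = sim - obs
    return {
        "bias": float(error.mean()),
        "rmse": float(np.sqrt((error ** 2).mean())),
        "r": float(np.corrcoef(sim, obs)[0, 1]),
        "n": int(valid.sum()),
    }


def closure_ratio(h, le, rn, g):
    """Sum of turbulent fluxes (H + LE) divided by sum of available energy (Rn - G)."""
    h, le, rn, g = (np.asarray(x, dtype=float) for x in (h, le, rn, g))
    valid = np.isfinite(h) & np.isfinite(le) & np.isfinite(rn) & np.isfinite(g)
    return float((h[valid] + le[valid]).sum() / (rn[valid] - g[valid]).sum())


# Illustrative daily-mean latent heat fluxes (W m-2); one observation is missing
observed_le = [50.0, 80.0, np.nan, 120.0]
simulated_le = [60.0, 90.0, 100.0, 130.0]
metrics = flux_metrics(simulated_le, observed_le)

# Illustrative energy balance components (W m-2)
ratio = closure_ratio(h=[40.0, 60.0], le=[50.0, 70.0], rn=[110.0, 150.0], g=[10.0, 10.0])

print(metrics)
print(f"Closure ratio: {ratio:.3f}")
```

Computing the closure ratio for the observations first gives context for the model comparison: if the tower fluxes themselves do not close the energy balance, some of the apparent model bias in LE or H may reflect measurement imbalance.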

In [None]:
# =============================================================================
# STEP 5: ENERGY BALANCE EVALUATION AND ET PROCESS VALIDATION
# =============================================================================

print("=== Step 5: Energy Balance Evaluation and ET Process Validation ===")
print("Comprehensive assessment of land-atmosphere energy exchange processes")

# =============================================================================
# SIMULATION DATA LOADING AND ENERGY VARIABLE INVENTORY
# =============================================================================

print(f"\n⚡ Loading Energy Balance Simulation Results...")

# Load simulation data
sim_dir = confluence.project_dir / "simulations" / config_dict['EXPERIMENT_ID'] / "SUMMA"
daily_output_path = sim_dir / f"{config_dict['EXPERIMENT_ID']}_day.nc"

if daily_output_path.exists():
    # Load and prepare evaluation dataset
    ds = xr.open_dataset(daily_output_path)
    
    # Skip spinup period
    start_year = ds.time.dt.year.min().values + 1
    spinup_end = f"{start_year}-01-01"
    time_mask = ds.time >= pd.to_datetime(spinup_end)
    evaluation_data = ds.isel(time=time_mask)
    
    print(f"✅ Simulation data loaded")
    print(f"   Evaluation period: {evaluation_data.time.min().values} to {evaluation_data.time.max().values}")
    
    # Identify available energy balance variables
    energy_variables = {
        'scalarLatHeatTotal': 'Latent heat flux (LE) - Evapotranspiration energy',
        'scalarSenHeatTotal': 'Sensible heat flux (H) - Convective energy transfer',
        'scalarNetRadiation': 'Net radiation (Rn) - Available energy',
        'scalarGroundHeatFlux': 'Ground heat flux (G) - Soil energy storage'
    }
    
    available_energy_vars = {var: desc for var, desc in energy_variables.items() 
                           if var in evaluation_data.data_vars}
    
    print(f"\n📊 Available Energy Balance Variables:")
    for var, desc in available_energy_vars.items():
        print(f"   ⚡ {var}: {desc}")
    
    # ET component variables for detailed analysis
    et_components = {
        'scalarCanopyTranspiration': 'Plant transpiration',
        'scalarCanopyEvaporation': 'Canopy interception evaporation', 
        'scalarGroundEvaporation': 'Soil surface evaporation',
        'scalarCanopySublimation': 'Canopy sublimation',
        'scalarSnowSublimation': 'Snow sublimation'
    }
    
    available_et_components = {var: desc for var, desc in et_components.items()
                             if var in evaluation_data.data_vars}
    
    print(f"\n🌿 Available ET Component Variables:")
    for var, desc in available_et_components.items():
        print(f"   🍃 {var}: {desc}")

else:
    print(f"❌ Simulation output not found: {daily_output_path}")
    raise FileNotFoundError("Cannot proceed with energy balance evaluation")

# =============================================================================
# FLUXNET OBSERVATION DATA LOADING
# =============================================================================

print(f"\n📊 Loading FLUXNET Energy Flux Observations...")

# Load FLUXNET data
fluxnet_path = confluence.project_dir / "observations" / "energy_fluxes" / "fluxnet" / "processed" / f"{config_dict['DOMAIN_NAME']}_fluxnet_processed.csv"

if fluxnet_path.exists():
    fluxnet_df = pd.read_csv(fluxnet_path, parse_dates=['timestamp'])
    fluxnet_df.set_index('timestamp', inplace=True)
    
    print(f"✅ FLUXNET data loaded")
    print(f"   Period: {fluxnet_df.index.min()} to {fluxnet_df.index.max()}")
    print(f"   Available flux variables: {fluxnet_df.columns.tolist()}")
    
    # Convert LE (W/m²) to ET (mm/day) using standard conversion
    if 'LE_F_MDS' in fluxnet_df.columns:
        fluxnet_df['ET_from_LE_mm_per_day'] = fluxnet_df['LE_F_MDS'] * 0.0353
        print(f"   ✅ LE converted to ET (LE × 0.0353 mm/day per W/m²)")
    
else:
    print(f"⚠️  FLUXNET data not found at {fluxnet_path}")
    print("   Proceeding with simulation-only analysis")
    fluxnet_df = None

# =============================================================================
# ENERGY BALANCE EVALUATION: LATENT HEAT FLUX (ET)
# =============================================================================

print(f"\n🌿 Latent Heat Flux (Evapotranspiration) Evaluation...")

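if ('scalarLatHeatTotal' in evaluation_data.data_vars and fluxnet_df is not None
        and 'ET_from_LE_mm_per_day' in fluxnet_df.columns):
    
    # Extract simulated latent heat flux; squeeze() drops length-1 dimensions
    # (e.g. a single HRU) so to_pandas() returns a time-indexed Series.
    # Assumes the same sign convention as FLUXNET (positive = evaporation).
    sim_le = evaluation_data['scalarLatHeatTotal'].squeeze(drop=True).to_pandas()  # W/m²
    sim_et_mm_day = sim_le * 0.0353  # Convert to mm/day (0.0353 mm/day per W/m²)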
    
    print(f"   ✅ SUMMA LE extracted and converted")
    print(f"   Range: {sim_le.min():.1f} to {sim_le.max():.1f} W/m²")
    print(f"   ET equivalent: {sim_et_mm_day.min():.2f} to {sim_et_mm_day.max():.2f} mm/day")
    
    # Find common period and align data
    start_date = max(fluxnet_df.index.min(), sim_et_mm_day.index.min())
    end_date = min(fluxnet_df.index.max(), sim_et_mm_day.index.max())
    
    print(f"\n🔄 Data Alignment:")
    print(f"   Common period: {start_date} to {end_date}")
    print(f"   Duration: {(end_date - start_date).days} days")
    
    # Resample to daily and filter to common period
    obs_daily = fluxnet_df['ET_from_LE_mm_per_day'].resample('D').mean().loc[start_date:end_date]
    sim_daily = sim_et_mm_day.resample('D').mean().loc[start_date:end_date]
    
    # Remove NaN values for metrics calculation
    valid_mask = ~(obs_daily.isna() | sim_daily.isna())
    obs_valid = obs_daily[valid_mask]
    sim_valid = sim_daily[valid_mask]
    
    print(f"   Valid paired observations: {len(obs_valid)} days")
    
    # Calculate performance metrics
    print(f"\n📊 Evapotranspiration Performance Metrics:")
    
    rmse = np.sqrt(((obs_valid - sim_valid) ** 2).mean())
    bias = (sim_valid - obs_valid).mean()
    mae = np.abs(obs_valid - sim_valid).mean()
    corr = obs_valid.corr(sim_valid)
    nse = 1 - ((obs_valid - sim_valid) ** 2).sum() / ((obs_valid - obs_valid.mean()) ** 2).sum()
    
    print(f"   📈 RMSE: {rmse:.2f} mm/day")
    print(f"   📈 Bias: {bias:+.2f} mm/day")
    print(f"   📈 MAE: {mae:.2f} mm/day") 
    print(f"   📈 Correlation: {corr:.3f}")
    print(f"   📈 Nash-Sutcliffe Efficiency: {nse:.3f}")
    
    # Seasonal analysis (meteorological seasons, Northern Hemisphere labels)
    print(f"\n🗓️ Seasonal ET Performance:")
    month_to_season = {12: 'Winter', 1: 'Winter', 2: 'Winter',
                       3: 'Spring', 4: 'Spring', 5: 'Spring',
                       6: 'Summer', 7: 'Summer', 8: 'Summer',
                       9: 'Fall', 10: 'Fall', 11: 'Fall'}
    seasonal_data = pd.DataFrame({
        'obs': obs_valid,
        'sim': sim_valid,
        'season': obs_valid.index.month.map(month_to_season)
    })
    
    # Aggregate all days in each season rather than a single month per season
    seasonal_stats = seasonal_data.groupby('season')[['obs', 'sim']].apply(
        lambda x: pd.Series({
            'obs_mean': x['obs'].mean(),
            'sim_mean': x['sim'].mean(),
            'bias': x['sim'].mean() - x['obs'].mean(),
            'corr': x['obs'].corr(x['sim']) if len(x) > 3 else np.nan
        })
    )
    
    for label in ['Winter', 'Spring', 'Summer', 'Fall']:
        if label in seasonal_stats.index:
            season_stats = seasonal_stats.loc[label]
            print(f"   {label}: Obs={season_stats['obs_mean']:.2f}, Sim={season_stats['sim_mean']:.2f} mm/day, "
                  f"Bias={season_stats['bias']:+.2f}, r={season_stats['corr']:.3f}")
    
    # Create comprehensive ET visualization
    print(f"\n📈 Creating ET comparison visualization...")
    
    fig, axes = plt.subplots(2, 2, figsize=(15, 10))
    
    # Time series comparison
    ax1 = axes[0, 0]
    ax1.plot(obs_daily.index, obs_daily, 'o-', label='FLUXNET ET', 
             color='blue', alpha=0.7, markersize=2, linewidth=1)
    ax1.plot(sim_daily.index, sim_daily, '-', label='SUMMA ET', 
             color='red', linewidth=2)
    ax1.set_title('Evapotranspiration Time Series', fontweight='bold')
    ax1.set_ylabel('ET (mm/day)')
    ax1.legend()
    ax1.grid(True, alpha=0.3)
    
    # Scatter plot
    ax2 = axes[0, 1]
    ax2.scatter(obs_valid, sim_valid, alpha=0.6, c='green', s=20)
    max_val = max(obs_valid.max(), sim_valid.max())
    ax2.plot([0, max_val], [0, max_val], 'k--', label='1:1 line')
    ax2.set_xlabel('FLUXNET ET (mm/day)')
    ax2.set_ylabel('SUMMA ET (mm/day)')
    ax2.set_title('Observed vs. Simulated ET', fontweight='bold')
    ax2.legend()
    ax2.grid(True, alpha=0.3)
    
    # Add metrics text
    metrics_text = f'r = {corr:.3f}\nRMSE = {rmse:.2f}\nBias = {bias:+.2f}'
    ax2.text(0.05, 0.95, metrics_text, transform=ax2.transAxes,
             bbox=dict(facecolor='white', alpha=0.8), fontsize=10, verticalalignment='top')
    
    # Monthly climatology (reindexed to all 12 months so missing months plot as gaps)
    ax3 = axes[1, 0]
    months = list(range(1, 13))
    monthly_obs = obs_valid.groupby(obs_valid.index.month).mean().reindex(months)
    monthly_sim = sim_valid.groupby(sim_valid.index.month).mean().reindex(months)
    month_names = ['J', 'F', 'M', 'A', 'M', 'J', 'J', 'A', 'S', 'O', 'N', 'D']
    
    ax3.plot(months, monthly_obs, 'o-', label='FLUXNET', color='blue', linewidth=2)
    ax3.plot(months, monthly_sim, 'o-', label='SUMMA', color='red', linewidth=2)
    ax3.set_xticks(months)
    ax3.set_xticklabels(month_names)
    ax3.set_ylabel('ET (mm/day)')
    ax3.set_title('Monthly ET Climatology', fontweight='bold')
    ax3.legend()
    ax3.grid(True, alpha=0.3)
    
    # Residuals
    ax4 = axes[1, 1]
    residuals = sim_valid - obs_valid
    ax4.scatter(obs_valid.index, residuals, alpha=0.6, c='purple', s=15)
    ax4.axhline(y=0, color='black', linestyle='-', alpha=0.5)
    ax4.axhline(y=residuals.std(), color='red', linestyle='--', alpha=0.5)
    ax4.axhline(y=-residuals.std(), color='red', linestyle='--', alpha=0.5)
    ax4.set_ylabel('Residuals (mm/day)')
    ax4.set_title('Model Residuals', fontweight='bold')
    ax4.grid(True, alpha=0.3)
    
    plt.suptitle(f'Evapotranspiration Evaluation - {config_dict["DOMAIN_NAME"]}', 
                 fontsize=14, fontweight='bold')
    plt.tight_layout()
    plt.show()

else:
    print("⚠️  Cannot perform ET evaluation - missing simulation or observation data")

# =============================================================================
# ET COMPONENT ANALYSIS
# =============================================================================

if available_et_components:
    print(f"\n🍃 ET Component Process Analysis...")
    
    # Extract and convert ET components
    et_comp_data = {}
    for comp_var, description in available_et_components.items():
        # squeeze() drops length-1 dimensions (e.g. a single HRU)
        comp_ts = evaluation_data[comp_var].squeeze(drop=True).to_pandas()
        # Convert from kg m-2 s-1 to mm/day (1 kg m-2 of water = 1 mm depth)
        comp_ts_mm_day = comp_ts * 86400
        et_comp_data[comp_var] = comp_ts_mm_day
        
        print(f"   🌱 {comp_var}: {comp_ts_mm_day.mean():.3f} ± {comp_ts_mm_day.std():.3f} mm/day")
    
    # Create component visualization
    print(f"\n📊 Creating ET component analysis...")
    
    fig, axes = plt.subplots(2, 1, figsize=(14, 10))
    
    # Component time series
    ax1 = axes[0]
    colors = plt.cm.Set2.colors
    
    for i, (comp_var, comp_data) in enumerate(et_comp_data.items()):
        # 'MS' (month start) works across pandas versions; the 'M' alias is deprecated in pandas >= 2.2
        monthly_comp = comp_data.resample('MS').mean()
        ax1.plot(monthly_comp.index, monthly_comp.values, 
                label=comp_var.replace('scalar', ''), 
                color=colors[i % len(colors)], linewidth=2)
    
    ax1.set_title('SUMMA ET Components (Monthly Means)', fontweight='bold')
    ax1.set_ylabel('ET Component (mm/day)')
    ax1.legend(loc='upper right')
    ax1.grid(True, alpha=0.3)
    
    # Component seasonal climatology
    ax2 = axes[1]
    monthly_means = {}
    for comp_var, comp_data in et_comp_data.items():
        monthly_mean