# CONFLUENCE Tutorial - 1: Point-Scale Workflow (SNOTEL Example)

## Introduction

### The Scientific Importance of Point-Scale Modeling

Point-scale modeling represents the fundamental building block of distributed hydrological modeling, where vertical energy and water balance processes are simulated at a single location without the complexities of lateral flow routing. This approach is scientifically valuable for several reasons:

1. **Process Understanding**: Point-scale simulations isolate vertical processes (precipitation, evapotranspiration, snowmelt, infiltration, and soil moisture dynamics), allowing researchers to evaluate model physics without confounding effects from spatial heterogeneity and routing processes.

2. **Model Validation**: Single-point simulations provide controlled conditions for testing model assumptions and parameter sensitivity, serving as a prerequisite for successful distributed modeling applications.

3. **Observational Constraints**: Point-scale modeling leverages high-quality, long-term observational datasets to constrain model parameters and validate process representations before scaling to larger, distributed domains.

### Case Study: Paradise SNOTEL Station

This tutorial demonstrates point-scale simulations in CONFLUENCE using the Paradise SNOTEL station (ID: 602) in Washington State. Located at 1,630 m elevation in the Cascade Range, this site represents a transitional snow climate with observations of both Snow Water Equivalent (SWE) and Soil Moisture (SM) at four depths.

## Learning Objectives

By the end of this tutorial, you will:

1. **Understand CONFLUENCE architecture**: Learn how the modular framework manages complex hydrological modeling workflows
2. **Configure point-scale simulations**: Set up CONFLUENCE for single-point SUMMA simulations with minimal spatial complexity
3. **Evaluate model performance**: Compare simulated and observed snow water equivalent and soil moisture using quantitative metrics
4. **Interpret results scientifically**: Analyze model-observation discrepancies in the context of process representation and parameter uncertainty
5. **Assess workflow efficiency**: Experience CONFLUENCE's automated workflow management and reproducible modeling practices

This foundation in point-scale modeling prepares you for more complex distributed modeling applications while building confidence in model physics and parameter estimation approaches.

In [1]:
# Import required libraries
import sys
import os
from pathlib import Path
import yaml
import pandas as pd
import matplotlib.pyplot as plt
import geopandas as gpd
from datetime import datetime
import contextily as cx
import xarray as xr
import numpy as np

# Add CONFLUENCE to path
confluence_path = Path('../').resolve()
sys.path.append(str(confluence_path))

# Import main CONFLUENCE class
from CONFLUENCE import CONFLUENCE

# Set up plotting style
plt.style.use('default')
%matplotlib inline

## Create Point-Scale Configuration

We'll use the default point configuration template in 0_config_files/config_point_template.yaml as our baseline.

We copy the template to a new location and update the key configuration settings for our specific experiment.

In [None]:
# Set directory paths
CONFLUENCE_CODE_DIR = confluence_path
CONFLUENCE_DATA_DIR = Path('/Users/darrieythorsson/compHydro/data/CONFLUENCE_data')  # ← User should modify this path

# Load template configuration
config_template_path = CONFLUENCE_CODE_DIR / '0_config_files' / 'config_point_template.yaml'

# Read config file
with open(config_template_path, 'r') as f:
    config_dict = yaml.safe_load(f)

# Update paths and settings 
config_dict['CONFLUENCE_CODE_DIR'] = str(CONFLUENCE_CODE_DIR)
config_dict['CONFLUENCE_DATA_DIR'] = str(CONFLUENCE_DATA_DIR)

# Update name and experiment id
config_dict['DOMAIN_NAME'] = 'paradise'
config_dict['EXPERIMENT_ID'] = 'point_scale_tutorial'

# Save point-scale configuration to temporary file
temp_config_path = CONFLUENCE_CODE_DIR / '0_config_files' / 'config_point_notebook.yaml'
with open(temp_config_path, 'w') as f:
    yaml.dump(config_dict, f)

# Initialize CONFLUENCE with the new configuration
confluence = CONFLUENCE(temp_config_path)

# Print summary of the key settings
print("=== Point-Scale Configuration ===")
print(f"Domain Name: {config_dict['DOMAIN_NAME']}")
print(f"Spatial Mode: {config_dict['DOMAIN_DEFINITION_METHOD']}")
print(f"Location: {config_dict['POUR_POINT_COORDS']}")
print(f"Period: {config_dict['EXPERIMENT_TIME_START']} to {config_dict['EXPERIMENT_TIME_END']}")
print(f"Forcing Data: {config_dict['FORCING_DATASET']}")

## 3. Setup Project Structure

1. We set up the basic directory structure under the root data directory specified by the CONFLUENCE_DATA_DIR setting in the configuration file.
2. We create a point shapefile from the coordinates given in POUR_POINT_COORDS.

In [None]:
# Step 1: Project Initialization
print("=== Step 1: Project Initialization ===")
print("Creating point-scale project structure...")

# Setup project
project_dir = confluence.managers['project'].setup_project()

# Create pour point (in this case, our SNOTEL location)
pour_point_path = confluence.managers['project'].create_pour_point()

# List created directories
print("\nCreated directories:")
for item in sorted(project_dir.iterdir()):
    if item.is_dir():
        print(f"  📁 {item.name}")

## 4. Geospatial Domain Definition - Data acquisition
Now we will acquire the geospatial attributes we need to set up our model:

- Elevation
- Land Cover
- Soil Classifications

We use the Model Agnostic Framework [gistool (Keshavarz et al., 2025)](https://github.com/CH-Earth/gistool) to subset the data using the coordinates set in BOUNDING_BOX_COORDS.

When SPATIAL_MODE is set to Point, CONFLUENCE automatically updates BOUNDING_BOX_COORDS to a square buffer around the point. The buffer is 0.001 degrees by default and can be changed with the POINT_BUFFER_DISTANCE setting.
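To make the buffering described above concrete, here is a minimal sketch of how a square bounding box could be derived from a point and a buffer distance. The function name `point_to_bounding_box`, the `lat_max/lon_min/lat_min/lon_max` ordering, the "/" separator, and the example coordinates are all assumptions made for illustration; this is not CONFLUENCE's actual implementation.

```python
def point_to_bounding_box(lat, lon, buffer_deg=0.001):
    """Return (lat_max, lon_min, lat_min, lon_max) for a square around a point.

    Assumption: the buffer is applied in every direction, so the box is
    2 * buffer_deg wide. The real behaviour in CONFLUENCE may differ.
    """
    lat_max = lat + buffer_deg
    lat_min = lat - buffer_deg
    lon_min = lon - buffer_deg
    lon_max = lon + buffer_deg
    return lat_max, lon_min, lat_min, lon_max


# Hypothetical coordinates, used only to exercise the function
bbox = point_to_bounding_box(46.78, -121.75, buffer_deg=0.001)

# Hypothetical string layout for a bounding-box setting
bbox_string = "/".join(f"{value:.4f}" for value in bbox)
print(bbox_string)
```

A larger buffer makes it more likely that the subset contains at least one full raster cell of each attribute, at the cost of averaging over terrain that is less representative of the station.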

In [None]:
# Acquire attributes
print("Acquiring geospatial attributes for point location...")

print(f"Minimal bounding box: {confluence.config.get('BOUNDING_BOX_COORDS')}")

#confluence.managers['data'].acquire_attributes()

## 6. Geospatial Domain Definition - Domain creation

CONFLUENCE creates the shapefiles required to pre-process and configure the models.

When run in point configuration, a square polygon is produced in path/to/domain_dir/shapefiles/catchment.

In [None]:
# Geospatial Domain Definition and Analysis
print("=== Step 2: Geospatial Domain Definition and Analysis ===")

# Define domain
print("\nDefining minimal domain for point-scale simulation...")
watershed_path = confluence.managers['domain'].define_domain()
# Discretize domain (single HRU for point-scale)
print("\nCreating single HRU for point-scale simulation...")
hru_path = confluence.managers['domain'].discretize_domain()

# Check outputs
print("\nDomain definition complete:")
print(f"  - Watershed defined: {watershed_path is not None}")
print(f"  - HRUs created: {hru_path is not None}")

## 7. Model Agnostic Data Pre-Processing - Data Acquisition
Next we need meteorological data to run our models and observations to compare them with. 

We use the Model Agnostic Framework [datatool (Keshavarz et al., 2025)](https://github.com/CH-Earth/datatool) to subset the forcing dataset defined in FORCING_DATASET to the spatial bounding box and to the period between EXPERIMENT_TIME_START and EXPERIMENT_TIME_END.

A temperature lapse rate can be applied to the forcing data by setting APPLY_LAPSE_RATE: True. The rate itself is set with LAPSE_RATE, which defaults to 0.0065 K m⁻¹.
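As an illustration of what a lapse-rate correction does, the sketch below shifts an air temperature from the elevation of a forcing grid cell to the elevation of the station. The function name, its arguments, and the grid-cell elevation are hypothetical; only the default rate of 0.0065 K m⁻¹ and the station elevation of 1,630 m come from this tutorial. The exact formulation used inside CONFLUENCE may differ.

```python
def apply_lapse_rate(temperature_k, forcing_elevation_m, site_elevation_m,
                     lapse_rate_k_per_m=0.0065):
    """Shift a temperature from the forcing grid elevation to the site elevation.

    Temperature decreases with height, so a site above the grid cell gets colder
    and a site below it gets warmer.
    """
    elevation_difference = site_elevation_m - forcing_elevation_m
    return temperature_k - lapse_rate_k_per_m * elevation_difference


# Hypothetical forcing grid cell at 1,200 m; the station sits at 1,630 m
adjusted = apply_lapse_rate(275.0, forcing_elevation_m=1200.0, site_elevation_m=1630.0)
print(f"{adjusted:.3f} K")
```

In this hypothetical case the correction is about 2.8 K, enough to move the air temperature from above to below freezing, which matters for rain/snow partitioning at a snow-dominated site.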

In [None]:
# Step 3: Model Agnostic Data Pre-Processing
print("=== Step 3: Model Agnostic Data Pre-Processing ===")

# Acquire forcings
print(f"\nAcquiring forcing data for point location...")
print(f"Dataset: {confluence.config['FORCING_DATASET']}")
#confluence.managers['data'].acquire_forcings()

MAF has the SNOTEL dataset in storage, and CONFLUENCE can acquire data by station ID when DOWNLOAD_SNOTEL is set to 'true' and SNOTEL_STATION is set to the station ID, which in our case is '602'. The path to the SNOTEL data can be set manually with the SNOTEL_PATH setting.
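A minimal sketch of how these settings might be added to a configuration dictionary is shown below. The setting names and values are the ones given in the text; the stand-in `config_dict` is hypothetical, and whether the settings must be written to the YAML file before CONFLUENCE is initialized is an assumption to check against your installation.

```python
# Settings named in the text above; the values follow this tutorial's station
snotel_settings = {
    'DOWNLOAD_SNOTEL': 'true',
    'SNOTEL_STATION': '602',
}

# Hypothetical stand-in for the dictionary loaded from the template
config_dict = {'DOMAIN_NAME': 'paradise'}
config_dict.update(snotel_settings)

print(config_dict)
```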


In [None]:
# Process observed data 
print("Processing observed data...")
confluence.managers['data'].process_observed_data()

## 8. Model Agnostic Data Pre-Processing - Remapping and zonal statistics

Remapping of the forcing data and zonal statistics calculations for the geospatial attributes are performed in one model-agnostic pre-processing step.

The forcing data are remapped onto the defined hydrofabric using [EASYMORE (Gharari et al., 2023)](https://www.sciencedirect.com/science/article/pii/S2352711023002431).
Zonal statistics are run using [rasterstats](https://pypi.org/project/rasterstats/).
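Conceptually, remapping gridded forcing onto a polygon amounts to an area-weighted average of the grid cells that the polygon overlaps. The sketch below illustrates that idea in plain Python. It does not use the EASYMORE or rasterstats APIs, and the function name and all numbers are hypothetical.

```python
def area_weighted_mean(cell_values, overlap_areas):
    """Average gridded values onto one polygon, weighted by overlap area."""
    if len(cell_values) != len(overlap_areas):
        raise ValueError("cell_values and overlap_areas must have the same length")
    total_area = sum(overlap_areas)
    if total_area <= 0:
        raise ValueError("polygon does not overlap any grid cell")
    weighted_sum = sum(value * area for value, area in zip(cell_values, overlap_areas))
    return weighted_sum / total_area


# Hypothetical example: a small square catchment straddling two grid cells,
# with 75% of its area in a cell at 270.0 K and 25% in a cell at 274.0 K
remapped_temperature = area_weighted_mean([270.0, 274.0], [0.75, 0.25])
print(remapped_temperature)
```

For a point-scale domain the square polygon is usually much smaller than a forcing grid cell, so in practice a single cell tends to carry all of the weight.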

In [None]:
# Run model-agnostic preprocessing
print("\nRunning model-agnostic preprocessing...")

confluence.managers['data'].run_model_agnostic_preprocessing()

print("\nModel-agnostic preprocessing complete")

## 9. Model Specific - Pre Processing 

Using the model-agnostic output, the model-specific input files are prepared in one model-specific preprocessing step.

In [None]:
# Step 4: Model Specific Processing and Initialization
print("=== Step 4: Model Specific Processing and Initialization ===")

confluence.managers['model'].preprocess_models()

print("\nModel-specific preprocessing complete")

## 10. Model Specific - Initialisation

Once the model input files have been created, the models are instantiated and run with their default configurations.

In [None]:
# Run models
print(f"\nRunning {confluence.config['HYDROLOGICAL_MODEL']} for point-scale simulation...")
confluence.managers['model'].run_models()

print("\nPoint-scale model run complete")

## 11. Result visualisation

Now let's look at how our simulations turned out.

In [None]:
# Step 11: Visualize Observed vs. Simulated SWE
print("=== Step 11: Comparing Observed vs. Simulated SWE ===")

# 1. Load the observed SWE data
obs_swe_path = Path(config_dict['CONFLUENCE_DATA_DIR']) / f"domain_{config_dict['DOMAIN_NAME']}" / "observations" / "snow" / "snotel" / "processed" / f"{config_dict['DOMAIN_NAME']}_swe_processed.csv"

# Check if observed data exists
if not obs_swe_path.exists():
    print(f"Warning: Observed SWE data not found at {obs_swe_path}")
    print("Checking for alternative locations...")
    # Try to find data in parent directories
    alt_paths = list(Path(config_dict['CONFLUENCE_DATA_DIR']).glob(f"**/observations/snow/swe/*_swe_processed.csv"))
    if alt_paths:
        obs_swe_path = alt_paths[0]
        print(f"Found alternative SWE data at: {obs_swe_path}")
    else:
        print("No observed SWE data found. Only simulated data will be displayed.")

# Load observed SWE data if available
if obs_swe_path.exists():
    print(f"Loading observed SWE data from: {obs_swe_path}")
    obs_swe = pd.read_csv(obs_swe_path, parse_dates=['Date'])
    obs_swe.set_index('Date', inplace=True)
    
    # Ensure the index is a proper DatetimeIndex
    if not isinstance(obs_swe.index, pd.DatetimeIndex):
        print("Converting index to DatetimeIndex...")
        # Try explicit day-first and month-first formats, then let pandas infer
        try:
            obs_swe.index = pd.to_datetime(obs_swe.index, format='%d/%m/%Y')
        except (ValueError, TypeError):
            try:
                obs_swe.index = pd.to_datetime(obs_swe.index, format='%m/%d/%Y')
            except (ValueError, TypeError):
                # infer_datetime_format is deprecated in pandas 2.x; inference is the default
                obs_swe.index = pd.to_datetime(obs_swe.index)
    
    print(f"Observed data period: {obs_swe.index.min()} to {obs_swe.index.max()}")
    print(f"Observed SWE range: {obs_swe['SWE'].min():.2f} to {obs_swe['SWE'].max():.2f} mm")
else:
    obs_swe = None

# 2. Load the simulated SWE data
sim_path = Path(config_dict['CONFLUENCE_DATA_DIR']) / f"domain_{config_dict['DOMAIN_NAME']}" / "simulations" / config_dict['EXPERIMENT_ID'] / "SUMMA" / f"{config_dict['EXPERIMENT_ID']}_day.nc"

# Check for alternative NetCDF file patterns if not found
if not sim_path.exists():
    print(f"Simulated data not found at {sim_path}")
    print("Checking for alternative NetCDF files...")
    alt_sim_paths = list(Path(config_dict['CONFLUENCE_DATA_DIR']).glob(f"domain_{config_dict['DOMAIN_NAME']}/simulations/{config_dict['EXPERIMENT_ID']}/SUMMA/*day.nc"))
    
    if alt_sim_paths:
        sim_path = alt_sim_paths[0]
        print(f"Found alternative simulation data at: {sim_path}")
    else:
        raise FileNotFoundError(f"No simulation results found for experiment {config_dict['EXPERIMENT_ID']}")

# Load simulated data
print(f"Loading simulated data from: {sim_path}")
ds = xr.open_dataset(sim_path)

# Skip the first year as spinup
start_year = ds.time.dt.year.min().values + 1
spinup_end = f"{start_year}-01-01"
print(f"Skipping spinup period before: {spinup_end}")

# Filter data after spinup
time_mask = ds.time >= pd.to_datetime(spinup_end)
ds_filtered = ds.isel(time=time_mask)

# Extract scalarSWE for the first HRU (the only one in a point-scale run)
# Positional selection avoids relying on the values stored in the hru coordinate
sim_swe = ds_filtered['scalarSWE'].isel(hru=0).to_dataframe().reset_index()
# Keep only the time and SWE columns
sim_swe = sim_swe[['time', 'scalarSWE']]
sim_swe.columns = ['Date', 'SWE']
sim_swe.set_index('Date', inplace=True)
print(f"Simulated data period (after spinup): {sim_swe.index.min()} to {sim_swe.index.max()}")
print(f"Simulated SWE range: {sim_swe['SWE'].min():.2f} to {sim_swe['SWE'].max():.2f} mm")

# 3. Find common date range if observed data exists
if obs_swe is not None:
    # Ensure same frequency for both datasets
    obs_swe = obs_swe.resample('D').mean()  # Daily mean if multiple obs per day
    sim_swe = sim_swe.resample('D').mean()  # Daily mean if sub-daily sim data
    
    # Find common date range
    start_date = max(obs_swe.index.min(), sim_swe.index.min())
    end_date = min(obs_swe.index.max(), sim_swe.index.max())
    
    print(f"\nCommon data period: {start_date} to {end_date}")
    
    # Filter to common period
    obs_period = obs_swe.loc[start_date:end_date]
    sim_period = sim_swe.loc[start_date:end_date]
    
    # Calculate performance metrics
    rmse = np.sqrt(((obs_period['SWE'] - sim_period['SWE']) ** 2).mean())
    bias = (sim_period['SWE'] - obs_period['SWE']).mean()
    corr = obs_period['SWE'].corr(sim_period['SWE'])
    
    print(f"Performance metrics:")
    print(f"  - RMSE: {rmse:.2f} mm")
    print(f"  - Bias: {bias:.2f} mm")
    print(f"  - Correlation: {corr:.2f}")
    
    # 4. Visualize the comparison
    plt.figure(figsize=(12, 6))
    
    # Plot both time series
    plt.plot(obs_period.index, obs_period['SWE'], 'o-', label='Observed SWE', color='black', alpha=0.7, markersize=4)
    plt.plot(sim_period.index, sim_period['SWE'], '-', label='Simulated SWE', color='blue', linewidth=2)
        
    # Styling
    plt.title(f"SWE Comparison at {config_dict['DOMAIN_NAME'].replace('_', ' ').title()}", fontsize=14)
    plt.xlabel('Date', fontsize=12)
    plt.ylabel('Snow Water Equivalent (mm)', fontsize=12)
    plt.grid(True, alpha=0.3)
    plt.legend(fontsize=12)
    
    # Add annotation with metrics
    plt.text(0.02, 0.95, f"RMSE: {rmse:.2f} mm\nBias: {bias:.2f} mm\nCorr: {corr:.2f}", 
             transform=plt.gca().transAxes, fontsize=12, 
             bbox=dict(facecolor='white', alpha=0.8, boxstyle='round,pad=0.5'))
    
    plt.tight_layout()
    plt.show()
    
    # 5. Scatter plot
    plt.figure(figsize=(8, 8))
    plt.scatter(obs_period['SWE'], sim_period['SWE'], color='blue', alpha=0.7)
    
    # Add 1:1 line
    max_val = max(obs_period['SWE'].max(), sim_period['SWE'].max())
    plt.plot([0, max_val], [0, max_val], 'k--', label='1:1 line')
    
    # Styling
    plt.title(f"Observed vs. Simulated SWE", fontsize=14)
    plt.xlabel('Observed SWE (mm)', fontsize=12)
    plt.ylabel('Simulated SWE (mm)', fontsize=12)
    plt.grid(True, alpha=0.3)
    plt.legend(fontsize=12)
    plt.axis('equal')
    
    # Add annotation with metrics
    plt.text(0.02, 0.95, f"RMSE: {rmse:.2f} mm\nBias: {bias:.2f} mm\nCorr: {corr:.2f}", 
             transform=plt.gca().transAxes, fontsize=12, 
             bbox=dict(facecolor='white', alpha=0.8, boxstyle='round,pad=0.5'))
    
    plt.tight_layout()
    plt.show()

else:
    # If no observed data, just plot simulated
    plt.figure(figsize=(12, 6))
    plt.plot(sim_swe.index, sim_swe['SWE'], '-', label='Simulated SWE', color='blue', linewidth=2)
    plt.title(f"Simulated SWE at {config_dict['DOMAIN_NAME'].replace('_', ' ').title()}", fontsize=14)
    plt.xlabel('Date', fontsize=12)
    plt.ylabel('Snow Water Equivalent (mm)', fontsize=12)
    plt.grid(True, alpha=0.3)
    plt.legend(fontsize=12)
    plt.tight_layout()
    plt.show()

# Close the dataset
ds.close()

print("\nSWE visualization complete")

This SNOTEL station also has an ISMN soil moisture probe. Let's compare our simulations with the observed soil moisture across the depth profile.
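Comparing across the depth profile means pairing each sensor depth with a model layer. One way to do that, assuming the model reports the thickness of each soil layer, is to convert thicknesses to midpoint depths and pick the nearest midpoint. The sketch below shows this in plain Python; the function names, layer thicknesses, and sensor depths are hypothetical, and the notebook cell that follows uses its own matching logic.

```python
from itertools import accumulate


def layer_midpoints(layer_thicknesses):
    """Convert layer thicknesses (m) to midpoint depths below the surface (m)."""
    bottoms = list(accumulate(layer_thicknesses))
    return [bottom - thickness / 2
            for bottom, thickness in zip(bottoms, layer_thicknesses)]


def closest_layer(observation_depth, layer_thicknesses):
    """Return the index of the layer whose midpoint is nearest the sensor depth."""
    midpoints = layer_midpoints(layer_thicknesses)
    return min(range(len(midpoints)),
               key=lambda i: abs(midpoints[i] - observation_depth))


# Hypothetical soil layer thicknesses (m) and sensor depths (m)
thicknesses = [0.05, 0.10, 0.25, 0.60]
for sensor_depth in (0.05, 0.20, 0.50):
    print(sensor_depth, closest_layer(sensor_depth, thicknesses))
```

Matching on midpoints rather than thicknesses matters when layers get thicker with depth: a thick layer can have a thickness close to a shallow sensor depth while its midpoint lies far below the sensor.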

In [None]:
# Step 12: Visualize Observed vs. Simulated Soil Moisture

# 1. Load the observed SM data
obs_sm_path = Path(config_dict['CONFLUENCE_DATA_DIR']) / f"domain_{config_dict['DOMAIN_NAME']}" / "observations" / "soil_moisture" / "ismn" / "pre_processed" / f"{config_dict['DOMAIN_NAME']}_sm_processed.csv"

# Load observed SM data
if obs_sm_path.exists():
    print(f"Loading observed SM data from: {obs_sm_path}")
    obs_sm = pd.read_csv(obs_sm_path, parse_dates=['timestamp'])
    obs_sm.set_index('timestamp', inplace=True)
    print(f"Observed data period: {obs_sm.index.min()} to {obs_sm.index.max()}")
    
    # Get observed depths from column names
    obs_depths = [col for col in obs_sm.columns if col.startswith('sm_')]
    print(f"Available observed depths: {obs_depths}")
    
    # Print ranges for each depth
    for depth_col in obs_depths:
        print(f"Observed soil moisture range at {depth_col}: {obs_sm[depth_col].min():.3f} to {obs_sm[depth_col].max():.3f}")
else:
    obs_sm = None
    obs_depths = []

# 2. Load the simulated SM data
sim_path = Path(config_dict['CONFLUENCE_DATA_DIR']) / f"domain_{config_dict['DOMAIN_NAME']}" / "simulations" / config_dict['EXPERIMENT_ID'] / "SUMMA" / f"{config_dict['EXPERIMENT_ID']}_day.nc"

# Load simulated data
print(f"Loading simulated data from: {sim_path}")
ds = xr.open_dataset(sim_path)

# Get layer depths and soil moisture data
layer_depths = ds['mLayerDepth'].isel(hru=0)  # First HRU
soil_moisture = ds['mLayerVolFracLiq'].isel(hru=0)  # First HRU

# Skip the first year as spinup
start_year = ds.time.dt.year.min().values + 1
spinup_end = f"{start_year}-01-01"
print(f"Skipping spinup period before: {spinup_end}")

# Filter data after spinup
time_mask = ds.time >= pd.to_datetime(spinup_end)
soil_moisture_filtered = soil_moisture.isel(time=time_mask)
layer_depths_filtered = layer_depths.isel(time=time_mask)

print(f"Simulated data period (after spinup): {soil_moisture_filtered.time.min().values} to {soil_moisture_filtered.time.max().values}")

# 3. Find common date range if observed data exists
if obs_sm is not None:
    # Ensure same frequency for both datasets
    obs_sm = obs_sm.resample('D').mean()  # Daily mean if multiple obs per day
    
    # Find common date range
    start_date = max(obs_sm.index.min(), pd.to_datetime(soil_moisture_filtered.time.min().values))
    end_date = min(obs_sm.index.max(), pd.to_datetime(soil_moisture_filtered.time.max().values))
    
    print(f"\nCommon data period: {start_date} to {end_date}")
    
    # Filter observed data to common period
    obs_period = obs_sm.loc[start_date:end_date]
    
    # Filter simulated data to common period
    sim_time_mask = (soil_moisture_filtered.time >= start_date) & (soil_moisture_filtered.time <= end_date)
    sim_sm_common = soil_moisture_filtered.isel(time=sim_time_mask)
    sim_depths_common = layer_depths_filtered.isel(time=sim_time_mask)
    
    # Convert simulated data to DataFrame for easier handling
    sim_df = sim_sm_common.to_dataframe().reset_index()
    depths_df = sim_depths_common.to_dataframe().reset_index()
    
    # Create figure with subplots for each observed depth
    n_depths = len(obs_depths)
    fig, axes = plt.subplots(n_depths, 1, figsize=(14, 4*n_depths))
    
    # If only one depth, make axes a list for consistency
    if n_depths == 1:
        axes = [axes]
    
    # Define depth mapping (observed depth to simulated layer)
    # Extract numeric depths from observed column names
    obs_depth_values = []
    for depth_col in obs_depths:
        # Extract the first depth value from column name like 'sm_0.0508_0.0508'
        depth_str = depth_col.split('_')[1]
        obs_depth_values.append(float(depth_str))
    
    # For each observed depth, find the closest simulated layer
    for i, (depth_col, obs_depth) in enumerate(zip(obs_depths, obs_depth_values)):
        ax = axes[i]
        
        # Find the closest simulated layer depth
        # Calculate mean layer depths over time to find the best match
        mean_layer_depths = sim_depths_common.mean(dim='time')
        
        # Filter out missing values (negative values)
        valid_layers = mean_layer_depths > 0
        if valid_layers.sum() > 0:
            valid_depths = mean_layer_depths.where(valid_layers)
            # Find closest layer
            depth_diff = np.abs(valid_depths - obs_depth)
            closest_layer_idx = depth_diff.argmin().values
            
            # Extract simulated data for this layer
            sim_layer_data = sim_sm_common.isel(midToto=closest_layer_idx)
            
            # Filter out missing values (negative values indicate missing data)
            sim_layer_data = sim_layer_data.where(sim_layer_data > -100)
            
            # Plot observed data
            ax.plot(obs_period.index, obs_period[depth_col], 'o-', 
                   label=f'Observed (depth: {obs_depth:.4f}m)', 
                   color='black', alpha=0.7, markersize=3)
            
            # Plot simulated data
            ax.plot(sim_layer_data.time, sim_layer_data.values, '-', 
                   label=f'Simulated (layer {closest_layer_idx}, depth: {valid_depths[closest_layer_idx].values:.4f}m)', 
                   color='blue', linewidth=2)
            
            # Calculate performance metrics
            # Align time series
            sim_resampled = sim_layer_data.resample(time='D').mean()
            sim_interp = sim_resampled.interp(time=obs_period.index)
            
            # Remove NaN values for metrics calculation
            valid_mask = ~(np.isnan(obs_period[depth_col]) | np.isnan(sim_interp.values))
            if valid_mask.sum() > 0:
                obs_valid = obs_period[depth_col][valid_mask]
                sim_valid = sim_interp.values[valid_mask]
                
                rmse = np.sqrt(((obs_valid - sim_valid) ** 2).mean())
                bias = (sim_valid - obs_valid).mean()
                corr = np.corrcoef(obs_valid, sim_valid)[0, 1]
                
                # Add annotation with metrics
                ax.text(0.02, 0.95, f"RMSE: {rmse:.3f}\nBias: {bias:.3f}\nCorr: {corr:.3f}", 
                       transform=ax.transAxes, fontsize=10, 
                       bbox=dict(facecolor='white', alpha=0.8, boxstyle='round,pad=0.3'))
        
        # Styling
        ax.set_title(f"Soil Moisture at {obs_depth:.4f}m depth", fontsize=12)
        ax.set_ylabel('Soil Moisture (m³/m³)', fontsize=10)
        ax.grid(True, alpha=0.3)
        ax.legend(fontsize=10)
        
        # Only add x-label to bottom subplot
        if i == n_depths - 1:
            ax.set_xlabel('Date', fontsize=10)
    
    plt.suptitle(f"Soil Moisture Comparison at {config_dict['DOMAIN_NAME'].replace('_', ' ').title()}", 
                 fontsize=14, y=0.98)
    plt.tight_layout()
    plt.show()
    
    # Create scatter plots for each depth
    fig, axes = plt.subplots(1, n_depths, figsize=(5*n_depths, 4))
    
    # If only one depth, make axes a list for consistency
    if n_depths == 1:
        axes = [axes]
    
    for i, (depth_col, obs_depth) in enumerate(zip(obs_depths, obs_depth_values)):
        ax = axes[i]
        
        # Find the closest simulated layer (same as above)
        mean_layer_depths = sim_depths_common.mean(dim='time')
        valid_layers = mean_layer_depths > 0
        if valid_layers.sum() > 0:
            valid_depths = mean_layer_depths.where(valid_layers)
            depth_diff = np.abs(valid_depths - obs_depth)
            closest_layer_idx = depth_diff.argmin().values
            
            sim_layer_data = sim_sm_common.isel(midToto=closest_layer_idx)
            sim_layer_data = sim_layer_data.where(sim_layer_data > -100)
            
            # Align time series for scatter plot
            sim_resampled = sim_layer_data.resample(time='D').mean()
            sim_interp = sim_resampled.interp(time=obs_period.index)
            
            # Remove NaN values
            valid_mask = ~(np.isnan(obs_period[depth_col]) | np.isnan(sim_interp.values))
            if valid_mask.sum() > 0:
                obs_valid = obs_period[depth_col][valid_mask]
                sim_valid = sim_interp.values[valid_mask]
                
                # Scatter plot
                ax.scatter(obs_valid, sim_valid, color='blue', alpha=0.6, s=20)
                
                # Add 1:1 line
                min_val = min(obs_valid.min(), sim_valid.min())
                max_val = max(obs_valid.max(), sim_valid.max())
                ax.plot([min_val, max_val], [min_val, max_val], 'k--', label='1:1 line')
                
                # Calculate and display metrics
                rmse = np.sqrt(((obs_valid - sim_valid) ** 2).mean())
                bias = (sim_valid - obs_valid).mean()
                corr = np.corrcoef(obs_valid, sim_valid)[0, 1]
                
                ax.text(0.02, 0.95, f"RMSE: {rmse:.3f}\nBias: {bias:.3f}\nCorr: {corr:.3f}", 
                       transform=ax.transAxes, fontsize=10, 
                       bbox=dict(facecolor='white', alpha=0.8, boxstyle='round,pad=0.3'))
        
        # Styling
        ax.set_title(f"Depth: {obs_depth:.4f}m", fontsize=12)
        ax.set_xlabel('Observed SM (m³/m³)', fontsize=10)
        ax.set_ylabel('Simulated SM (m³/m³)', fontsize=10)
        ax.grid(True, alpha=0.3)
        ax.legend(fontsize=10)
        ax.set_aspect('equal', adjustable='box')
    
    plt.suptitle(f"Observed vs. Simulated Soil Moisture Scatter Plots", fontsize=14)
    plt.tight_layout()
    plt.show()

else:
    # If no observed data, just plot simulated data for all layers
    print("No observed data available. Plotting simulated data only.")
    
    # Plot simulated data for first few layers
    plt.figure(figsize=(12, 8))
    
    # Select first 4 layers that have valid data
    valid_layers = []
    mean_depths = layer_depths_filtered.mean(dim='time')
    for i in range(min(4, soil_moisture_filtered.sizes['midToto'])):
        if mean_depths.isel(midToto=i) > 0:
            valid_layers.append(i)
    
    for i, layer_idx in enumerate(valid_layers):
        sim_layer = soil_moisture_filtered.isel(midToto=layer_idx)
        sim_layer = sim_layer.where(sim_layer > -100)  # Remove missing values
        mean_depth = mean_depths.isel(midToto=layer_idx).values
        
        plt.plot(sim_layer.time, sim_layer.values, '-', 
                label=f'Layer {layer_idx} (depth: {mean_depth:.4f}m)', 
                linewidth=2)
    
    plt.title(f"Simulated Soil Moisture at {config_dict['DOMAIN_NAME'].replace('_', ' ').title()}", fontsize=14)
    plt.xlabel('Date', fontsize=12)
    plt.ylabel('Soil Moisture (m³/m³)', fontsize=12)
    plt.grid(True, alpha=0.3)
    plt.legend(fontsize=12)
    plt.tight_layout()
    plt.show()

# Close the dataset
ds.close()

print("\nSoil Moisture visualization complete")

## Summary: Point-Scale Modeling Insights

