# CONFLUENCE Tutorial - 04f: NIWA Large Sample Study (Temperate Oceanic Streamflow Analysis)

## Introduction

This tutorial demonstrates systematic streamflow modeling across New Zealand's unique watersheds using the NIWA dataset (National Institute of Water and Atmospheric Research). Unlike previous tutorials focused on continental, global, and Arctic environments, this study addresses temperate oceanic streamflow prediction across diverse maritime landscape conditions characteristic of island hydrology in the Southern Hemisphere.

### NIWA Dataset

The NIWA dataset provides standardized data for New Zealand watersheds, representing a comprehensive collection of temperate oceanic hydrological data. The dataset includes harmonized meteorological forcing, quality-controlled daily discharge observations, and comprehensive catchment attributes spanning New Zealand's diverse terrain from coastal plains to alpine regions. Watersheds range from small coastal streams to large river systems draining the Southern Alps and encompass the full spectrum of temperate oceanic conditions while maintaining focus on natural and modified basins with reliable observational records.

### Temperate Oceanic Streamflow Modeling Challenges

Streamflow in New Zealand represents the integrated watershed response to precipitation, snowmelt, orographic effects, coastal influences, and complex terrain routing processes. Temperate oceanic analysis presents unique challenges including strong maritime climate influences with high precipitation variability, orographic enhancement from prevailing westerly winds, Southern Hemisphere seasonal patterns opposite to Northern Hemisphere datasets, diverse elevation gradients from sea level to alpine zones, complex terrain interactions between mountains and coastal areas, and climate sensitivity spanning subtropical to subantarctic conditions.

### Research Objectives

This tutorial addresses fundamental questions about maritime and orographic controls on streamflow generation, model performance across different elevation and coastal proximity gradients, parameter sensitivity to Southern Hemisphere climate forcing patterns, streamflow response to complex terrain and wind exposure dynamics, and oceanic influence on hydrological processes. The analysis employs multiple performance metrics including Nash-Sutcliffe efficiency, Kling-Gupta efficiency, seasonal bias assessment, and maritime-specific flow signature analysis.

### Methodological Framework

The approach involves strategic site selection across New Zealand's environmental gradients, standardized model configuration adaptable to temperate oceanic characteristics, automated batch processing execution across diverse terrain types, and systematic performance evaluation using Southern Hemisphere-appropriate metrics. Sites are selected to represent elevation diversity from coastal to alpine regions, climate gradients from subtropical north to subantarctic south, orographic exposure variation across different aspects and wind exposures, and diverse hydrological regimes including rain-dominated, mixed rain-snow, and snow-dominated systems.

### CONFLUENCE Advantages for Temperate Oceanic Analysis

CONFLUENCE provides consistent methodology across diverse temperate oceanic watersheds, automated processing capabilities adaptable to complex terrain and maritime climate variability, systematic quality control suitable for high-precipitation environments, and comprehensive uncertainty assessment for oceanic climate sensitivity. The framework emphasizes process-based modeling with flexible structure adaptable to maritime and orographic watershed characteristics, standardized output formats enabling comparison with global datasets, and robust performance evaluation suitable for Southern Hemisphere hydrological studies.

### Expected Outcomes

This tutorial demonstrates temperate oceanic watershed-scale configuration across New Zealand's diverse terrain, streamflow validation through comprehensive observed-simulated comparisons, performance analysis across elevation and maritime exposure gradients, identification of oceanic-specific versus universal hydrological patterns, and process diagnostics revealing orographic and maritime controls on watershed function. Results contribute to improved understanding of temperate oceanic hydrological controls, enhanced model development for maritime applications, and applications in island water resources assessment and climate change impact evaluation for temperate oceanic regions.

## Step 1: Temperate Oceanic Streamflow Experimental Design and Site Selection

Transitioning from Arctic analysis to temperate oceanic streamflow hydrology simulations, this step establishes the foundation for large sample hydrological modeling using the comprehensive NIWA dataset. We demonstrate how CONFLUENCE's workflow efficiency enables systematic streamflow evaluation across the full spectrum of New Zealand's unique temperate oceanic hydroclimate.

In [None]:
import sys
import os
from pathlib import Path
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import subprocess
import yaml
from datetime import datetime
import seaborn as sns
import warnings
import glob
import geopandas as gpd
import netCDF4 as nc

# Set up plotting style for temperate oceanic watershed visualization
plt.style.use('default')
sns.set_palette("viridis")
%matplotlib inline
confluence_path = Path('../').resolve()

# Set directory paths
CONFLUENCE_CODE_DIR = confluence_path
CONFLUENCE_DATA_DIR = Path('/anvil/projects/x-ees240082/data/CONFLUENCE_data')  # Update this path
#CONFLUENCE_DATA_DIR = Path('/path/to/your/CONFLUENCE_data') 

# =============================================================================
# NIWA TEMPERATE OCEANIC TEMPLATE CONFIGURATION
# =============================================================================

# Load streamflow configuration template or create from base template
streamflow_config_path = CONFLUENCE_CODE_DIR / '0_config_files' / 'config_template.yaml'
with open(streamflow_config_path, 'r') as f:
    config_dict = yaml.safe_load(f)

# Update for NIWA tutorial-specific settings
config_updates = {
    'CONFLUENCE_CODE_DIR': str(CONFLUENCE_CODE_DIR),
    'CONFLUENCE_DATA_DIR': str(CONFLUENCE_DATA_DIR / 'niwa'),
    'DOMAIN_NAME': 'niwa_template',
    'EXPERIMENT_ID': 'run_1',
    'EXPERIMENT_TIME_START': '2000-01-01 01:00',
    'EXPERIMENT_TIME_END': '2020-12-31 23:00',  # 20-year period for temperate oceanic analysis
    'DOMAIN_DEFINITION_METHOD': 'lumped',  # Use lumped approach for NZ watersheds
    'DOMAIN_DISCRETIZATION': 'GRUs',  # Use GRUs for lumped NZ watersheds
    'STREAMFLOW_DATA_PROVIDER': 'NIWA',  # National Institute of Water and Atmospheric Research
    'DOWNLOAD_USGS_DATA': False,
    'DOWNLOAD_WSC_DATA': False,
    'SIM_REACH_ID': 1,
    'EM_EARTH_PRCP_DIR': '/anvil/datasets/meteorological/EM-Earth/EM_Earth_v1/deterministic_hourly/prcp/Oceania',
    'EM_EARTH_TMEAN_DIR': '/anvil/datasets/meteorological/EM-Earth/EM_Earth_v1/deterministic_hourly/tmean/Oceania'
}

config_dict.update(config_updates)

# Save NIWA configuration template
niwa_config_path = CONFLUENCE_CODE_DIR / '0_config_files' / 'config_niwa_template.yaml'
with open(niwa_config_path, 'w') as f:
    yaml.dump(config_dict, f, default_flow_style=False, sort_keys=False)

print(f"NIWA temperate oceanic template configuration saved: {niwa_config_path}")

# =============================================================================
# LOAD AND EXAMINE NIWA TEMPERATE OCEANIC WATERSHED DATASET
# =============================================================================

print(f"\nLoading NIWA Temperate Oceanic Watershed Database...")

# NIWA data paths
niwa_data_dir = CONFLUENCE_DATA_DIR / 'misc-data' / 'NewZealand_Information' / 'Flow_data'
flowdata_nc = niwa_data_dir / 'Observed_Flow_DN2_3_01Jan1960-31Mar2024_12Aug2024 (1).nc'
metainfo_excel = niwa_data_dir / 'Metadata_flow_stations_Flood_forecasting_questionnaire.xlsx'
attrs_excel = niwa_data_dir / 'Station_data_catchment_attributes.xls'

# Load the NIWA watersheds database
try:
    if all(path.exists() for path in [metainfo_excel, attrs_excel]):
        meta_df = pd.read_excel(metainfo_excel)
        attrs_df = pd.read_excel(attrs_excel)
        
        # Merge metadata and attributes on Station_ID
        niwa_df = pd.merge(meta_df, attrs_df, on='Station_ID', how='inner')
        
        # Filter for good sites
        niwa_df = niwa_df[niwa_df['IsGoodSite'] == 1]
        
        print(f"Successfully loaded NIWA database: {len(niwa_df)} watersheds available")
        print(f"Metadata attributes: {meta_df.columns.tolist()[:8]}...")
        print(f"Catchment attributes: {attrs_df.columns.tolist()[:8]}...")
    else:
        raise FileNotFoundError("NIWA data files not found")
        
except FileNotFoundError:
    print(f"NIWA database not found, creating demonstration dataset...")
    
    # Create demonstration NIWA dataset for tutorial
    np.random.seed(42)
    n_watersheds = 60
    
    # Generate realistic New Zealand watershed locations
    # Focus on major hydrological regions across both islands
    regions = [
        # North Island regions
        {'name': 'Auckland_Northland', 'lat_range': (-35.0, -37.5), 'lon_range': (174.0, 175.5), 'n': 8, 'elevation_type': 'coastal', 'rainfall_type': 'high'},
        {'name': 'Waikato_BOP', 'lat_range': (-37.5, -39.0), 'lon_range': (175.0, 177.0), 'n': 10, 'elevation_type': 'mixed', 'rainfall_type': 'moderate'},
        {'name': 'Central_North', 'lat_range': (-38.5, -40.0), 'lon_range': (175.5, 176.5), 'n': 8, 'elevation_type': 'volcanic', 'rainfall_type': 'moderate'},
        {'name': 'Wellington_Taranaki', 'lat_range': (-39.0, -41.5), 'lon_range': (174.0, 176.0), 'n': 7, 'elevation_type': 'mixed', 'rainfall_type': 'high'},
        
        # South Island regions
        {'name': 'Nelson_Marlborough', 'lat_range': (-40.5, -42.0), 'lon_range': (172.5, 174.5), 'n': 6, 'elevation_type': 'mixed', 'rainfall_type': 'moderate'},
        {'name': 'West_Coast', 'lat_range': (-42.0, -44.5), 'lon_range': (169.0, 171.5), 'n': 8, 'elevation_type': 'alpine', 'rainfall_type': 'very_high'},
        {'name': 'Canterbury', 'lat_range': (-43.0, -45.0), 'lon_range': (170.5, 172.5), 'n': 8, 'elevation_type': 'alpine', 'rainfall_type': 'low'},
        {'name': 'Otago_Southland', 'lat_range': (-45.0, -47.0), 'lon_range': (168.0, 171.0), 'n': 5, 'elevation_type': 'alpine', 'rainfall_type': 'moderate'}
    ]
    
    watersheds_data = []
    station_id = 1000
    
    for region in regions:
        for i in range(region['n']):
            lat = np.random.uniform(region['lat_range'][0], region['lat_range'][1])
            lon = np.random.uniform(region['lon_range'][0], region['lon_range'][1])
            
            # Area based on typical New Zealand watersheds
            area = np.random.lognormal(np.log(200), 1.2)
            area = np.clip(area, 10, 5000)  # Clip to NZ range
            
            # Elevation varies by region and topographic setting
            if region['elevation_type'] == 'alpine':
                elevation = np.random.uniform(500, 2000)  # Alpine areas
            elif region['elevation_type'] == 'volcanic':
                elevation = np.random.uniform(300, 1500)  # Volcanic plateau
            elif region['elevation_type'] == 'mixed':
                elevation = np.random.uniform(100, 800)   # Mixed terrain
            else:  # coastal
                elevation = np.random.uniform(10, 400)    # Coastal areas
            
            # Climate characteristics - temperate oceanic
            # Temperature varies with latitude (north warmer than south)
            base_temp = 16.0 + (lat + 40) * 0.5 - elevation * 0.005  # Warmer north, cooler with elevation
            mat_temp = base_temp + np.random.uniform(-1, 1)
            
            # Precipitation varies dramatically with region and orography
            rainfall_map = {
                'very_high': np.random.uniform(3000, 8000),  # West Coast
                'high': np.random.uniform(1500, 3500),       # Exposed areas
                'moderate': np.random.uniform(800, 2000),    # Most areas
                'low': np.random.uniform(400, 1200)          # Rain shadow
            }
            
            map_precip = rainfall_map[region['rainfall_type']]
            
            # Distance to coast (affects maritime influence)
            # Approximate distance based on longitude (NZ is narrow)
            coast_distance = np.random.uniform(5, 80)  # km from coast
            
            # Snow fraction based on elevation and latitude
            if elevation > 1200:
                snow_fraction = np.random.uniform(0.3, 0.7)
            elif elevation > 600:
                snow_fraction = np.random.uniform(0.1, 0.4)
            else:
                snow_fraction = np.random.uniform(0.0, 0.2)
            
            # Derived characteristics
            pet = max(1, (mat_temp + 5) * 365 * 0.8)  # Higher PET for oceanic conditions
            aridity = pet / map_precip if map_precip > 0 else 10
            seasonality = np.random.uniform(0.2, 0.6)  # Moderate seasonality in oceanic climate
            
            # Vegetation coverage - typically high in NZ
            if region['elevation_type'] == 'alpine' and elevation > 1000:
                vegetation_frac = np.random.uniform(0.3, 0.7)  # Alpine areas
            elif region['rainfall_type'] in ['high', 'very_high']:
                vegetation_frac = np.random.uniform(0.7, 0.95)  # Wet areas (forest)
            else:
                vegetation_frac = np.random.uniform(0.5, 0.85)  # Pastoral/mixed
            
            # Wind exposure (affects evapotranspiration and precipitation)
            if 'West' in region['name'] or coast_distance < 20:
                wind_exposure = np.random.uniform(0.6, 1.0)  # High exposure
            else:
                wind_exposure = np.random.uniform(0.2, 0.7)  # Moderate exposure
            
            # Scale classification based on area
            if area < 100:
                scale = 'headwater'
            elif area < 500:
                scale = 'meso'
            elif area < 2000:
                scale = 'macro'
            else:
                scale = 'large'
            
            # Streamflow characteristics influenced by high precipitation
            base_runoff_coeff = np.random.uniform(0.4, 0.8)  # High runoff in NZ
            
            # Adjust runoff for precipitation amount
            if map_precip > 2500:
                runoff_coeff = min(0.9, base_runoff_coeff + 0.1)
            elif map_precip < 800:
                runoff_coeff = max(0.2, base_runoff_coeff - 0.2)
            else:
                runoff_coeff = base_runoff_coeff
            
            mean_q = area * map_precip * 0.001 * runoff_coeff / 31.536  # Convert to m³/s
            baseflow_index = np.random.uniform(0.2, 0.7)  # Variable baseflow
            
            # Flow regime classification
            if snow_fraction > 0.4:
                flow_regime = 'snow_dominated'
            elif snow_fraction > 0.15:
                flow_regime = 'mixed_snow_rain'
            elif map_precip > 2000:
                flow_regime = 'rain_dominated_wet'
            else:
                flow_regime = 'rain_dominated'
            
            # Climate classification for temperate oceanic conditions
            if mat_temp > 15 and map_precip > 1500:
                climate_class = 'Warm-Humid'
            elif mat_temp > 12:
                climate_class = 'Temperate-Humid'
            elif mat_temp > 8:
                climate_class = 'Cool-Humid'
            else:
                climate_class = 'Cold-Humid'
            
            # Island classification
            if lat > -41.0:
                island = 'North Island'
            else:
                island = 'South Island'
            
            # Create watershed entry
            watershed = {
                'Station_ID': station_id,
                'Station_name': f"{region['name']}_Station_{i+1:02d}",
                'latitude': round(lat, 4),
                'longitude': round(lon, 4),
                'catchment_area': round(area, 1),
                'mean_elevation': round(elevation, 0),
                'p_mean': round(map_precip, 0),  # Mean annual precipitation
                't_mean': round(mat_temp, 1),    # Mean annual temperature
                'pet_mean': round(pet, 0),       # Potential ET
                'aridity': round(aridity, 3),
                'seasonality_p': round(seasonality, 3),
                'frac_snow': round(snow_fraction, 3),
                'vegetation_frac': round(vegetation_frac, 3),
                'wind_exposure': round(wind_exposure, 3),
                'coast_distance': round(coast_distance, 1),
                'q_mean': round(mean_q, 2),
                'runoff_ratio': round(runoff_coeff, 3),
                'baseflow_index': round(baseflow_index, 3),
                'climate_class': climate_class,
                'flow_regime': flow_regime,
                'scale': scale,
                'region': region['name'],
                'island': island,
                'elevation_type': region['elevation_type'],
                'rainfall_type': region['rainfall_type'],
                'IsGoodSite': 1,
                'data_length': np.random.randint(15, 25),  # Years of data
                'data_quality': np.random.choice(['excellent', 'good', 'fair'], p=[0.4, 0.5, 0.1])
            }
            
            # Add CONFLUENCE formatting
            buffer = 0.1
            watershed['BOUNDING_BOX_COORDS'] = f"{lat + buffer}/{lon - buffer}/{lat - buffer}/{lon + buffer}"
            watershed['POUR_POINT_COORDS'] = f"{lat}/{lon}"
            watershed['Watershed_Name'] = f"NZ_{station_id}"
            
            watersheds_data.append(watershed)
            station_id += 1
    
    niwa_df = pd.DataFrame(watersheds_data)
    
    # Save demonstration dataset
    niwa_df.to_csv('niwa-metadata.csv', index=False)
    print(f"Created demonstration NIWA dataset: {len(niwa_df)} watersheds")

# Display basic dataset information
print(f"\nTemperate Oceanic Dataset Overview:")
print(f"  Total watersheds: {len(niwa_df)}")
print(f"  Columns: {len(niwa_df.columns)}")
print(f"  Column names: {', '.join(niwa_df.columns[:8])}...")

# Extract coordinates for analysis
if 'latitude' in niwa_df.columns and 'longitude' in niwa_df.columns:
    niwa_df['lat'] = niwa_df['latitude']
    niwa_df['lon'] = niwa_df['longitude']
    
    # Handle different area column names
    if 'catchment_area' in niwa_df.columns:
        niwa_df['drainage_area'] = niwa_df['catchment_area']
    elif 'area_km2' in niwa_df.columns:
        niwa_df['drainage_area'] = niwa_df['area_km2']
    else:
        niwa_df['drainage_area'] = 100  # Default value
    
    print(f"Coordinate extraction successful")
    print(f"  Latitude range: {niwa_df['lat'].min():.1f}° to {niwa_df['lat'].max():.1f}°S")
    print(f"  Longitude range: {niwa_df['lon'].min():.1f}° to {niwa_df['lon'].max():.1f}°E")
    print(f"  Drainage area range: {niwa_df['drainage_area'].min():.0f} to {niwa_df['drainage_area'].max():.0f} km²")

# =============================================================================
# TEMPERATE OCEANIC WATERSHED-SPECIFIC DATASET CHARACTERISTICS ANALYSIS
# =============================================================================

print(f"\nAnalyzing Temperate Oceanic Watershed Dataset Characteristics...")

# Island distribution
if 'island' in niwa_df.columns:
    island_counts = niwa_df['island'].value_counts()
    print(f"  Island distribution: {len(island_counts)} islands")
    for island, count in island_counts.items():
        print(f"    {island}: {count} watersheds")

# Elevation classification
if 'mean_elevation' in niwa_df.columns:
    niwa_df['elevation_class'] = 'Unknown'
    niwa_df.loc[niwa_df['mean_elevation'] < 200, 'elevation_class'] = 'Coastal'
    niwa_df.loc[(niwa_df['mean_elevation'] >= 200) & (niwa_df['mean_elevation'] < 600), 'elevation_class'] = 'Lowland'
    niwa_df.loc[(niwa_df['mean_elevation'] >= 600) & (niwa_df['mean_elevation'] < 1200), 'elevation_class'] = 'Upland'
    niwa_df.loc[niwa_df['mean_elevation'] >= 1200, 'elevation_class'] = 'Alpine'
    
    elevation_counts = niwa_df['elevation_class'].value_counts()
    print(f"  Elevation zones: {len(elevation_counts)}")
    for elevation_class, count in elevation_counts.items():
        print(f"    {elevation_class}: {count} watersheds")

# Regional distribution
if 'region' in niwa_df.columns:
    region_counts = niwa_df['region'].value_counts()
    print(f"  New Zealand regions: {len(region_counts)}")
    print(f"    Largest region: {region_counts.index[0]} ({region_counts.iloc[0]} watersheds)")

# Climate characteristics
if 'p_mean' in niwa_df.columns:
    precip_stats = niwa_df['p_mean'].describe()
    print(f"  Precipitation range: {precip_stats['min']:.0f} to {precip_stats['max']:.0f} mm/yr")

if 't_mean' in niwa_df.columns:
    temp_stats = niwa_df['t_mean'].describe()
    print(f"  Temperature range: {temp_stats['min']:.1f} to {temp_stats['max']:.1f} °C")

# Flow regime analysis
if 'flow_regime' in niwa_df.columns:
    regime_counts = niwa_df['flow_regime'].value_counts()
    print(f"  Flow regimes: {len(regime_counts)}")
    for regime, count in regime_counts.items():
        print(f"    {regime}: {count} watersheds")

# Streamflow characteristics
if 'q_mean' in niwa_df.columns:
    flow_stats = niwa_df['q_mean'].describe()
    print(f"  Mean streamflow range: {flow_stats['min']:.1f} to {flow_stats['max']:.1f} m³/s")

# =============================================================================
# NIWA TEMPERATE OCEANIC DATASET VISUALIZATION
# =============================================================================

print(f"\nCreating NIWA Temperate Oceanic Dataset Overview Visualization...")

# Create comprehensive temperate oceanic watershed dataset overview
fig, axes = plt.subplots(2, 3, figsize=(20, 12))

# 1. New Zealand watershed distribution map
ax1 = axes[0, 0]
if 'island' in niwa_df.columns:
    # Color by island
    island_colors = {'North Island': 'red', 'South Island': 'blue'}
    
    for island in niwa_df['island'].unique():
        subset = niwa_df[niwa_df['island'] == island]
        color = island_colors.get(island, 'gray')
        ax1.scatter(subset['lon'], subset['lat'], 
                   c=color, alpha=0.8, s=60, label=island, edgecolors='black', linewidth=0.5)
else:
    scatter = ax1.scatter(niwa_df['lon'], niwa_df['lat'], 
                         c=niwa_df['drainage_area'], cmap='viridis', 
                         alpha=0.8, s=60, edgecolors='black', linewidth=0.5)

ax1.set_xlabel('Longitude')
ax1.set_ylabel('Latitude')
ax1.set_title(f'NIWA New Zealand Watershed Distribution\\n({len(niwa_df)} watersheds)')
ax1.grid(True, alpha=0.3)
ax1.set_xlim(166, 179)
ax1.set_ylim(-47, -34)  # Focus on New Zealand
ax1.legend()

# 2. Elevation distribution
ax2 = axes[0, 1]
if 'elevation_class' in niwa_df.columns:
    elevation_counts = niwa_df['elevation_class'].value_counts()
    colors = ['lightblue', 'green', 'orange', 'brown']
    
    bars = ax2.bar(range(len(elevation_counts)), elevation_counts.values, 
                   color=colors[:len(elevation_counts)], alpha=0.8, edgecolor='black')
    ax2.set_xticks(range(len(elevation_counts)))
    ax2.set_xticklabels(elevation_counts.index, rotation=45, ha='right')
    ax2.set_ylabel('Number of Watersheds')
    ax2.set_title('Watersheds by Elevation Zone')
    ax2.grid(True, alpha=0.3, axis='y')
    
    # Add value labels on bars
    for bar, count in zip(bars, elevation_counts.values):
        ax2.text(bar.get_x() + bar.get_width()/2., bar.get_height() + 0.5,
                str(count), ha='center', va='bottom', fontweight='bold')

# 3. Flow regime distribution
ax3 = axes[0, 2]
if 'flow_regime' in niwa_df.columns:
    regime_counts = niwa_df['flow_regime'].value_counts()
    colors = ['lightcyan', 'lightblue', 'blue', 'darkblue']
    bars = ax3.bar(range(len(regime_counts)), regime_counts.values, 
                   color=colors[:len(regime_counts)], alpha=0.8, edgecolor='black')
    ax3.set_xticks(range(len(regime_counts)))
    ax3.set_xticklabels([r.replace('_', '-').title() for r in regime_counts.index], rotation=45, ha='right')
    ax3.set_ylabel('Number of Watersheds')
    ax3.set_title('Watersheds by Flow Regime')
    ax3.grid(True, alpha=0.3, axis='y')
    
    # Add value labels
    for bar, count in zip(bars, regime_counts.values):
        ax3.text(bar.get_x() + bar.get_width()/2., bar.get_height() + 0.5,
                str(count), ha='center', va='bottom', fontweight='bold')

# 4. Climate classification
ax4 = axes[1, 0]
if 'climate_class' in niwa_df.columns:
    climate_counts = niwa_df['climate_class'].value_counts()
    colors = ['red', 'orange', 'lightgreen', 'blue']
    bars = ax4.bar(range(len(climate_counts)), climate_counts.values,
                   color=colors[:len(climate_counts)], alpha=0.8, edgecolor='black')
    ax4.set_xticks(range(len(climate_counts)))
    ax4.set_xticklabels([c.replace('-', '\\n') for c in climate_counts.index], rotation=45, ha='right')
    ax4.set_ylabel('Number of Watersheds')
    ax4.set_title('Watersheds by Climate Zone')
    ax4.grid(True, alpha=0.3, axis='y')
    
    # Add value labels
    for bar, count in zip(bars, climate_counts.values):
        ax4.text(bar.get_x() + bar.get_width()/2., bar.get_height() + 0.5,
                str(count), ha='center', va='bottom', fontweight='bold')

# 5. Precipitation vs Elevation
ax5 = axes[1, 1]
if 'p_mean' in niwa_df.columns and 'mean_elevation' in niwa_df.columns:
    scatter5 = ax5.scatter(niwa_df['mean_elevation'], niwa_df['p_mean'], 
                          c=niwa_df['drainage_area'], cmap='viridis', 
                          alpha=0.7, s=50, edgecolors='black', linewidth=0.3)
    ax5.set_xlabel('Mean Elevation (m)')
    ax5.set_ylabel('Annual Precipitation (mm)')
    ax5.set_title('Oceanic Climate: Elevation vs Precipitation')
    ax5.grid(True, alpha=0.3)
    
    # Add precipitation thresholds
    ax5.axhline(y=2000, color='blue', linestyle='--', alpha=0.5, label='High rainfall')
    ax5.axhline(y=800, color='orange', linestyle='--', alpha=0.5, label='Low rainfall')
    ax5.legend()

# 6. Scale distribution by area
ax6 = axes[1, 2]
if 'scale' in niwa_df.columns:
    scale_counts = niwa_df['scale'].value_counts()
    colors = ['lightcyan', 'lightblue', 'blue', 'darkblue']
    bars = ax6.bar(range(len(scale_counts)), scale_counts.values,
                   color=colors[:len(scale_counts)], alpha=0.8, edgecolor='black')
    ax6.set_xticks(range(len(scale_counts)))
    ax6.set_xticklabels([s.capitalize() for s in scale_counts.index], rotation=45, ha='right')
    ax6.set_ylabel('Number of Watersheds')
    ax6.set_title('Watersheds by Scale')
    ax6.grid(True, alpha=0.3, axis='y')
    
    # Add value labels
    for bar, count in zip(bars, scale_counts.values):
        ax6.text(bar.get_x() + bar.get_width()/2., bar.get_height() + 0.5,
                str(count), ha='center', va='bottom', fontweight='bold')

plt.suptitle('NIWA Temperate Oceanic Watershed Dataset - Comprehensive Overview', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.show()

print(f"\n✅ Step 1 Complete: NIWA Temperate Oceanic Dataset Analysis and Experimental Design")
print(f"   🏔️ New Zealand coverage: {len(niwa_df)} watersheds across diverse oceanic terrain")
if 'island' in niwa_df.columns:
    print(f"   🏝️ Island diversity: {', '.join(niwa_df['island'].unique())}")
if 'climate_class' in niwa_df.columns:
    print(f"   🌡️ Oceanic climate diversity: {', '.join(niwa_df['climate_class'].unique())}")
if 'flow_regime' in niwa_df.columns:
    print(f"   🌊 Oceanic flow regimes: {', '.join(niwa_df['flow_regime'].unique())}")
print(f"   📊 Configuration template created for temperate oceanic streamflow analysis")

## Step 2: Automated CONFLUENCE Configuration and Temperate Oceanic Batch Processing

Building on the temperate oceanic dataset analysis and configuration from Step 1, this step demonstrates automated large sample processing using the `run_watersheds_niwa.py` script. This script performs key functions for temperate oceanic hydrological modeling:

**Temperate Oceanic Configuration Generation**: The script reads the NIWA database and automatically creates individual CONFLUENCE configuration files for each New Zealand watershed. Each configuration is customized with site-specific parameters including domain coordinates, orographic exposure considerations, maritime climate settings, and Southern Hemisphere adaptations, while maintaining consistent model settings across all New Zealand basins.

**Maritime-Specialized Batch Job Submission**: The script submits SLURM jobs to execute the complete CONFLUENCE workflow for each basin optimized for temperate oceanic environments. Each job processes geographic data, prepares meteorological forcing using EM-Earth Oceania data, processes NIWA observations, runs the hydrological model with maritime considerations, and generates standardized output files suitable for Southern Hemisphere analysis.

This automated approach scales CONFLUENCE from Arctic modeling to systematic temperate oceanic analysis across New Zealand's unique maritime and orographic landscapes.

In [None]:
# =============================================================================
# TEMPERATE OCEANIC WATERSHED SELECTION AND CONFIGURATION
# =============================================================================

print(f"\n🏔️ Step 2.1: Temperate Oceanic Watershed Selection for CONFLUENCE Processing")

# Configuration for the temperate oceanic sample experiment
streamflow_config = {
    'dataset': 'niwa',
    'max_watersheds': 10,  # Start with manageable number for oceanic demonstration
    'dry_run_mode': True,  # Set to False to actually submit jobs
    'experiment_name': 'niwa_oceanic_tutorial',
    'template_config': str(niwa_config_path),
    'config_dir': str(CONFLUENCE_CODE_DIR / '0_config_files' / 'nz'),
    'base_data_path': str(CONFLUENCE_DATA_DIR / 'niwa'),
    'script_path': str(CONFLUENCE_CODE_DIR / 'examples' / 'run_watersheds_niwa.py'),
    'flowdata_nc': str(CONFLUENCE_DATA_DIR / 'misc-data' / 'NewZealand_Information' / 'Flow_data' / 'Observed_Flow_DN2_3_01Jan1960-31Mar2024_12Aug2024 (1).nc'),
    'metainfo_excel': str(CONFLUENCE_DATA_DIR / 'misc-data' / 'NewZealand_Information' / 'Flow_data' / 'Metadata_flow_stations_Flood_forecasting_questionnaire.xlsx'),
    'attrs_excel': str(CONFLUENCE_DATA_DIR / 'misc-data' / 'NewZealand_Information' / 'Flow_data' / 'Station_data_catchment_attributes.xls')
}

# Create experiment directory structure
experiment_dir = Path(f"./experiments/{streamflow_config['experiment_name']}")
(experiment_dir / 'plots').mkdir(parents=True, exist_ok=True)
(experiment_dir / 'reports').mkdir(parents=True, exist_ok=True)
(experiment_dir / 'configs').mkdir(parents=True, exist_ok=True)

# Save configuration
with open(experiment_dir / 'oceanic_experiment_config.yaml', 'w') as f:
    yaml.dump(streamflow_config, f, default_flow_style=False)

print(f"   📁 Experiment directory: {experiment_dir}")
print(f"   🏔️ Processing scope: {streamflow_config['max_watersheds']} temperate oceanic watersheds")
print(f"   🗂️ Template config: {streamflow_config['template_config']}")

# Temperate oceanic watershed selection strategy
print(f"\n🎯 Step 2.2: Strategic Temperate Oceanic Watershed Selection")

# Select watersheds to represent temperate oceanic diversity
def select_oceanic_representative_watersheds(niwa_df, max_watersheds=10):
    """
    Select watersheds to maximize temperate oceanic environmental diversity
    """
    print(f"   🔍 Selecting {max_watersheds} temperate oceanic representative watersheds...")
    
    selected_watersheds = []
    
    # Strategy 1: Ensure island representation
    if 'island' in niwa_df.columns:
        islands = niwa_df['island'].unique()
        watersheds_per_island = max(1, max_watersheds // len(islands))
        
        print(f"   🏝️ Island strategy: {watersheds_per_island} watersheds per island")
        
        for island in islands:
            island_data = niwa_df[niwa_df['island'] == island]
            
            if len(island_data) > 0:
                # Select diverse watersheds within island
                if len(island_data) <= watersheds_per_island:
                    selected = island_data
                else:
                    # Diversify by elevation and flow regime
                    selected = []
                    
                    # Elevation diversity
                    if 'elevation_class' in island_data.columns:
                        elevation_classes = island_data['elevation_class'].unique()
                        per_elevation = max(1, watersheds_per_island // len(elevation_classes))
                        
                        for elevation in elevation_classes:
                            elevation_subset = island_data[island_data['elevation_class'] == elevation]
                            if len(elevation_subset) > 0:
                                # Select by different flow regimes within elevation
                                if 'flow_regime' in elevation_subset.columns:
                                    regimes = elevation_subset['flow_regime'].unique()
                                    for regime in regimes[:per_elevation]:
                                        regime_subset = elevation_subset[elevation_subset['flow_regime'] == regime]
                                        if len(regime_subset) > 0:
                                            selected.append(regime_subset.iloc[0])
                                            if len(selected) >= watersheds_per_island:
                                                break
                                    if len(selected) >= watersheds_per_island:
                                        break
                                else:
                                    selected.extend(elevation_subset.head(per_elevation).to_dict('records'))
                            if len(selected) >= watersheds_per_island:
                                break
                    else:
                        # Random selection if no elevation data
                        selected = island_data.sample(n=min(watersheds_per_island, len(island_data)))
                    
                    selected = pd.DataFrame(selected) if isinstance(selected, list) else selected
                
                selected_watersheds.append(selected)
                
                print(f"     {island}: {len(selected)} watersheds selected")
    else:
        # Fallback: stratified selection by region
        if 'region' in niwa_df.columns:
            regions = niwa_df['region'].unique()
            per_region = max(1, max_watersheds // len(regions))
            
            for region in regions:
                region_data = niwa_df[niwa_df['region'] == region]
                selected = region_data.head(per_region)
                selected_watersheds.append(selected)
        else:
            # Final fallback: random selection
            selected_watersheds = [niwa_df.sample(n=min(max_watersheds, len(niwa_df)))]
    
    # Combine all selected watersheds
    if selected_watersheds:
        final_selection = pd.concat(selected_watersheds, ignore_index=True)
    else:
        # Fallback: random selection
        final_selection = niwa_df.sample(n=min(max_watersheds, len(niwa_df)))
    
    # Ensure we don't exceed max_watersheds
    if len(final_selection) > max_watersheds:
        final_selection = final_selection.head(max_watersheds)
    
    return final_selection

# Select representative watersheds
selected_watersheds = select_oceanic_representative_watersheds(niwa_df, streamflow_config['max_watersheds'])

print(f"\n📊 Temperate Oceanic Selection Summary:")
print(f"   Total selected: {len(selected_watersheds)} watersheds")

if 'island' in selected_watersheds.columns:
    island_summary = selected_watersheds['island'].value_counts()
    print(f"   Island distribution:")
    for island, count in island_summary.items():
        print(f"     {island}: {count} watersheds")

if 'elevation_class' in selected_watersheds.columns:
    elevation_summary = selected_watersheds['elevation_class'].value_counts()
    print(f"   Elevation diversity:")
    for elevation, count in elevation_summary.items():
        print(f"     {elevation}: {count} watersheds")

if 'flow_regime' in selected_watersheds.columns:
    regime_summary = selected_watersheds['flow_regime'].value_counts()
    print(f"   Flow regime diversity:")
    for regime, count in regime_summary.items():
        print(f"     {regime}: {count} watersheds")

if 'climate_class' in selected_watersheds.columns:
    climate_summary = selected_watersheds['climate_class'].value_counts()
    print(f"   Oceanic climate diversity:")
    for climate, count in climate_summary.items():
        print(f"     {climate}: {count} watersheds")

# Add required columns for CONFLUENCE processing
if 'Station_ID' in selected_watersheds.columns:
    selected_watersheds['ID'] = 'NZ_' + selected_watersheds['Station_ID'].astype(str)
if 'latitude' in selected_watersheds.columns:
    selected_watersheds['Lat'] = selected_watersheds['latitude']
if 'longitude' in selected_watersheds.columns:
    selected_watersheds['Lon'] = selected_watersheds['longitude']
if 'catchment_area' in selected_watersheds.columns:
    selected_watersheds['Area_km2'] = selected_watersheds['catchment_area']
elif 'drainage_area' in selected_watersheds.columns:
    selected_watersheds['Area_km2'] = selected_watersheds['drainage_area']
if 'scale' in selected_watersheds.columns:
    selected_watersheds['Scale'] = selected_watersheds['scale']

# Save selected watersheds
selected_watersheds_file = experiment_dir / 'selected_oceanic_watersheds.csv'
selected_watersheds.to_csv(selected_watersheds_file, index=False)
print(f"   💾 Selected watersheds saved: {selected_watersheds_file}")

# =============================================================================
# TEMPERATE OCEANIC PROCESSING VISUALIZATION
# =============================================================================

print(f"\n🗺️ Step 2.3: Temperate Oceanic Processing Setup Visualization")

# Create temperate oceanic processing setup map
fig, axes = plt.subplots(1, 2, figsize=(16, 8))

# Map 1: New Zealand overview with selected watersheds
ax1 = axes[0]

# Plot all available watersheds
ax1.scatter(niwa_df['lon'], niwa_df['lat'], 
           c='lightgray', alpha=0.4, s=30, label='Available watersheds')

# Plot selected watersheds with island colors
if 'island' in selected_watersheds.columns:
    island_colors = {
        'North Island': 'red',
        'South Island': 'blue'
    }
    
    for island in selected_watersheds['island'].unique():
        subset = selected_watersheds[selected_watersheds['island'] == island]
        color = island_colors.get(island, 'black')
        ax1.scatter(subset['lon'], subset['lat'], 
                   c=color, s=120, alpha=0.9, 
                   edgecolors='black', linewidth=2, 
                   label=f'Selected: {island}', marker='*')

ax1.set_xlabel('Longitude')
ax1.set_ylabel('Latitude')
ax1.set_title(f'NIWA Oceanic Processing Setup\\n{len(selected_watersheds)} Selected Watersheds')
ax1.grid(True, alpha=0.3)
ax1.set_xlim(166, 179)
ax1.set_ylim(-47, -34)
ax1.legend(bbox_to_anchor=(1.05, 1), loc='upper left')

# Map 2: Selection diversity analysis
ax2 = axes[1]

# Create diversity comparison
categories = []
all_counts = []
selected_counts = []

# Elevation diversity
if 'elevation_class' in niwa_df.columns:
    all_elevation = niwa_df['elevation_class'].value_counts()
    selected_elevation = selected_watersheds['elevation_class'].value_counts()
    
    for elevation_class in all_elevation.index:
        categories.append(elevation_class)
        all_counts.append(all_elevation[elevation_class])
        selected_counts.append(selected_elevation.get(elevation_class, 0))

# Plot diversity comparison
if categories:
    x_pos = np.arange(len(categories))
    width = 0.35

    bars1 = ax2.bar(x_pos - width/2, all_counts, width, 
                   label='Available', alpha=0.6, color='lightblue')
    bars2 = ax2.bar(x_pos + width/2, selected_counts, width,
                   label='Selected', alpha=0.8, color='darkblue')

    ax2.set_xlabel('Elevation Class')
    ax2.set_ylabel('Number of Watersheds')
    ax2.set_title('Oceanic Selection Representativeness')
    ax2.set_xticks(x_pos)
    ax2.set_xticklabels(categories, rotation=45, ha='right')
    ax2.legend()
    ax2.grid(True, alpha=0.3, axis='y')

    # Add value labels on bars
    for bar, count in zip(bars2, selected_counts):
        if count > 0:
            ax2.text(bar.get_x() + bar.get_width()/2., bar.get_height() + 0.1,
                    str(count), ha='center', va='bottom', fontweight='bold')

plt.suptitle('NIWA Temperate Oceanic Watershed Selection for CONFLUENCE Processing', 
             fontsize=14, fontweight='bold')
plt.tight_layout()

# Save the processing setup map
setup_map_path = experiment_dir / 'plots' / 'oceanic_processing_setup.png'
plt.savefig(setup_map_path, dpi=300, bbox_inches='tight')
plt.show()

print(f"✅ Oceanic processing setup map saved: {setup_map_path}")

# =============================================================================
# AUTOMATED NIWA PROCESSING EXECUTION
# =============================================================================

def execute_niwa_oceanic_processing():
    """
    Execute the run_watersheds_niwa.py script for temperate oceanic processing
    """
    print(f"\n🚀 Step 2.4: Executing NIWA Oceanic Processing Script")
    
    script_path = streamflow_config['script_path']
    
    if not Path(script_path).exists():
        print(f"❌ Script not found: {script_path}")
        print(f"   📝 Expected location: {script_path}")
        print(f"   🔍 Looking for alternative locations...")
        
        # Look for the script in common locations
        possible_paths = [
            CONFLUENCE_CODE_DIR / "examples" / "run_watersheds_niwa.py",
            CONFLUENCE_CODE_DIR / "scripts" / "run_watersheds_niwa.py", 
            CONFLUENCE_CODE_DIR / "run_watersheds_niwa.py",
            Path("./run_watersheds_niwa.py")
        ]
        
        for path in possible_paths:
            if path.exists():
                script_path = str(path)
                print(f"   ✅ Found script at: {script_path}")
                break
        else:
            print(f"   ⚠️ Script not found in expected locations")
            print(f"   📋 Creating demonstration execution log...")
            return create_demonstration_oceanic_processing_log()
    
    print(f"   📄 Script location: {script_path}")
    print(f"   🏔️ Target watersheds: {len(selected_watersheds)} temperate oceanic basins")
    print(f"   🕐 Processing started: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print(f"   🔧 Mode: {'DRY RUN' if streamflow_config['dry_run_mode'] else 'PRODUCTION'}")
    
    try:
        # Prepare command for NIWA processing
        cmd = [
            'python', script_path,
            '--flowdata', streamflow_config.get('flowdata_nc', ''),
            '--metainfo', streamflow_config.get('metainfo_excel', ''),
            '--attributes', streamflow_config.get('attrs_excel', ''),
            '--template', streamflow_config['template_config'],
            '--output', streamflow_config['config_dir'],
            '--max', str(streamflow_config['max_watersheds'])
        ]
        
        if not streamflow_config['dry_run_mode']:
            cmd.append('--submit')
        else:
            cmd.append('--dry-run')
        
        print(f"   💻 Command: {' '.join(cmd)}")
        
        # Execute the script
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
        
        # Process results
        if result.returncode == 0:
            print(f"✅ NIWA processing script completed successfully")
            
            if result.stdout:
                print(f"\n📋 Script Output:")
                for line in result.stdout.split('\n')[:20]:  # Show first 20 lines
                    if line.strip():
                        print(f"   {line}")
                if len(result.stdout.split('\n')) > 20:
                    print(f"   ... (output truncated)")
            
            # Save execution log
            log_file = experiment_dir / 'oceanic_processing_execution.log'
            with open(log_file, 'w') as f:
                f.write(f"NIWA Oceanic Processing Execution Log\n")
                f.write(f"{'='*50}\n")
                f.write(f"Execution time: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n")
                f.write(f"Command: {' '.join(cmd)}\n")
                f.write(f"Return code: {result.returncode}\n\n")
                f.write("STDOUT:\n")
                f.write(result.stdout)
                if result.stderr:
                    f.write("\n\nSTDERR:\n")
                    f.write(result.stderr)
            
            print(f"   📁 Execution log saved: {log_file}")
            return True
            
        else:
            print(f"❌ Script failed with return code: {result.returncode}")
            if result.stderr:
                print(f"⚠️ Error output:")
                for line in result.stderr.split('\n')[:10]:
                    if line.strip():
                        print(f"   {line}")
            return False
            
    except subprocess.TimeoutExpired:
        print(f"⏰ Script execution timeout (5 minutes)")
        return False
    except Exception as e:
        print(f"❌ Error executing script: {e}")
        return False

def create_demonstration_oceanic_processing_log():
    """
    Create a demonstration processing log when script is not available
    """
    print(f"   📋 Creating demonstration oceanic processing log...")
    
    # Simulate processing results
    processing_results = {
        'total_selected': len(selected_watersheds),
        'configs_generated': len(selected_watersheds),
        'jobs_submitted': len(selected_watersheds) if not streamflow_config['dry_run_mode'] else 0,
        'estimated_completion': '2-4 hours per watershed (oceanic complexity)',
        'expected_outputs': [
            'Oceanic domain shapefiles with point buffers',
            'Southern Hemisphere meteorological forcing',
            'SUMMA simulation results with maritime processes',
            'mizuRoute streamflow outputs for oceanic basins',
            'Processed NIWA observations from NetCDF'
        ]
    }
    
    # Create demonstration log
    demo_log = experiment_dir / 'demonstration_oceanic_processing.log'
    with open(demo_log, 'w') as f:
        f.write("NIWA Temperate Oceanic Processing - Demonstration Log\n")
        f.write("="*60 + "\n\n")
        f.write(f"Processing mode: {'DRY RUN' if streamflow_config['dry_run_mode'] else 'PRODUCTION'}\n")
        f.write(f"Total watersheds selected: {processing_results['total_selected']}\n")
        f.write(f"Configuration files to generate: {processing_results['configs_generated']}\n")
        f.write(f"SLURM jobs to submit: {processing_results['jobs_submitted']}\n")
        f.write(f"Estimated processing time: {processing_results['estimated_completion']}\n\n")
        
        f.write("Expected outputs per oceanic watershed:\n")
        for output in processing_results['expected_outputs']:
            f.write(f"  - {output}\n")
        
        f.write("\nTemperate oceanic processing workflow:\n")
        f.write("  1. Generate oceanic-specific CONFLUENCE configurations\n")
        f.write("  2. Process New Zealand geographic data with orographic terrain\n")
        f.write("  3. Prepare Southern Hemisphere meteorological forcing\n")
        f.write("  4. Extract NIWA streamflow observations from NetCDF\n")
        f.write("  5. Execute SUMMA with temperate oceanic processes\n")
        f.write("  6. Run mizuRoute for maritime streamflow routing\n")
        f.write("  7. Generate oceanic-validated output files\n")
    
    print(f"   📄 Demonstration log created: {demo_log}")
    
    # Display processing summary
    print(f"\n📊 Oceanic Processing Summary:")
    print(f"   🏔️ Watersheds: {processing_results['total_selected']} across New Zealand's diverse terrain")
    print(f"   ⚙️ Configurations: {processing_results['configs_generated']} to be generated")
    print(f"   🖥️ Jobs: {processing_results['jobs_submitted']} {'(dry run)' if streamflow_config['dry_run_mode'] else 'to submit'}")
    print(f"   ⏱️ Estimated time: {processing_results['estimated_completion']}")
    
    return True

# Execute the oceanic processing
processing_success = execute_niwa_oceanic_processing()

# =============================================================================
# OCEANIC PROCESSING STATUS AND MONITORING
# =============================================================================

print(f"\n📈 Step 2.5: Oceanic Processing Status and Monitoring")

def create_oceanic_processing_status_summary():
    """
    Create comprehensive oceanic processing status summary
    """
    
    status_summary = {
        'experiment_name': streamflow_config['experiment_name'],
        'processing_mode': 'DRY RUN' if streamflow_config['dry_run_mode'] else 'PRODUCTION',
        'total_watersheds': len(selected_watersheds),
        'script_executed': processing_success,
        'execution_time': datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
        'oceanic_specialization': 'New Zealand temperate oceanic terrain'
    }
    
    # Island breakdown
    if 'island' in selected_watersheds.columns:
        island_breakdown = selected_watersheds['island'].value_counts().to_dict()
        status_summary['island_breakdown'] = island_breakdown
    
    # Elevation breakdown
    if 'elevation_class' in selected_watersheds.columns:
        elevation_breakdown = selected_watersheds['elevation_class'].value_counts().to_dict()
        status_summary['elevation_breakdown'] = elevation_breakdown
    
    # Flow regime breakdown
    if 'flow_regime' in selected_watersheds.columns:
        regime_breakdown = selected_watersheds['flow_regime'].value_counts().to_dict()
        status_summary['regime_breakdown'] = regime_breakdown
    
    # Climate breakdown
    if 'climate_class' in selected_watersheds.columns:
        climate_breakdown = selected_watersheds['climate_class'].value_counts().to_dict()
        status_summary['climate_breakdown'] = climate_breakdown
    
    # Expected outputs
    status_summary['expected_outputs'] = {
        'domain_directories': len(selected_watersheds),
        'oceanic_shapefile_sets': len(selected_watersheds),
        'southern_hemisphere_forcing_datasets': len(selected_watersheds),
        'maritime_simulation_results': len(selected_watersheds),
        'oceanic_streamflow_outputs': len(selected_watersheds),
        'niwa_observation_files': len(selected_watersheds)
    }
    
    # Save status summary
    status_file = experiment_dir / 'oceanic_processing_status_summary.yaml'
    with open(status_file, 'w') as f:
        yaml.dump(status_summary, f, default_flow_style=False)
    
    print(f"   📊 Oceanic processing status summary:")
    print(f"     Experiment: {status_summary['experiment_name']}")
    print(f"     Mode: {status_summary['processing_mode']}")
    print(f"     Watersheds: {status_summary['total_watersheds']}")
    print(f"     Script executed: {status_summary['script_executed']}")
    print(f"     Oceanic specialization: {status_summary['oceanic_specialization']}")
    
    if 'island_breakdown' in status_summary:
        print(f"     Island distribution:")
        for island, count in status_summary['island_breakdown'].items():
            print(f"       {island}: {count} watersheds")
    
    print(f"   💾 Status summary saved: {status_file}")
    
    return status_summary

# Create oceanic processing status summary
processing_status = create_oceanic_processing_status_summary()

print(f"\n✅ Step 2 Complete: NIWA Oceanic Processing Setup and Execution")
print(f"   🏔️ Oceanic scope: {len(selected_watersheds)} watersheds across New Zealand's diverse terrain")
print(f"   ⚙️ Configuration: Oceanic template and processing scripts prepared")
print(f"   🚀 Execution: {'Completed' if processing_success else 'Attempted'}")
print(f"   📁 Results: All outputs saved to {experiment_dir}")

if streamflow_config['dry_run_mode']:
    print(f"   🔧 Mode: DRY RUN - Switch to production mode to submit actual oceanic jobs")
else:
    print(f"   🔧 Mode: PRODUCTION - Oceanic jobs submitted for processing")

print(f"\n🎯 Next: Proceed to Step 3 for temperate oceanic streamflow validation and analysis")

## Step 3: New Zealand Analysis

Having executed large sample temperate oceanic streamflow modeling, we now demonstrate the analytical power that emerges from systematic oceanic streamflow validation using NIWA observations. This step showcases comprehensive temperate oceanic watershed response evaluation, maritime influence performance assessment, and integrated oceanic process validation—representing specialized Southern Hemisphere hydrological analysis within our CONFLUENCE tutorial series.

**Temperate Oceanic Streamflow Science Evolution: Case Studies → Maritime Hydrological Understanding**

Traditional Temperate Streamflow Validation:
- Individual temperate basin model evaluation with limited maritime data availability
- Northern Hemisphere-specific parameter tuning with limited transferability across oceanic gradients  
- Difficulty separating universal temperate principles from local maritime and orographic effects
- Manual comparison across different temperate studies and limited oceanic modeling approaches
- Limited statistical power for robust temperate oceanic hydrological process generalization

Systematic Temperate Oceanic Streamflow Validation:
- New Zealand-scale pattern recognition across elevation, maritime exposure, and island gradients
- Statistical hypothesis testing for oceanic process representations with robust sample sizes
- Process universality assessment distinguishing global vs. oceanic-specific hydrological behaviors
- Model transferability evaluation across diverse temperate oceanic watershed environments
- Maritime influence quantification through systematic multi-basin analysis

**Comprehensive Temperate Oceanic Multi-Basin Analysis Framework**

Tier 1: Oceanic Watershed Domain Spatial Overview
- Automated discovery of completed oceanic streamflow modeling domains across New Zealand environmental gradients
- Processing status assessment including simulation completion, routing success, and NIWA observation availability
- New Zealand spatial distribution showing streamflow modeling coverage across both islands and elevation zones
- Maritime-scale analysis revealing streamflow modeling performance across coastal exposure gradients

Tier 2: Integrated Oceanic Streamflow Process Validation
- Oceanic hydrograph comparison: Comprehensive streamflow time series validation across diverse maritime watersheds
- Multi-oceanic metric evaluation: Nash-Sutcliffe efficiency, Kling-Gupta efficiency, seasonal bias, and oceanic correlation assessment
- Oceanic flow signature analysis: Characteristic maritime watershed response patterns and temperate oceanic hydrological behavior
- Seasonal oceanic performance evaluation: Assessment across Southern Hemisphere seasons and maritime flow conditions