# TEVAL: T-Route Ensemble Evaluation Demo

This notebook demonstrates the capabilities of the **teval** toolkit. We will walk through:
1.  **Setup & Configuration**: Initializing the tool.
2.  **Calculation Run**: Processing raw ensemble data.
3.  **Visualization Run**: Rapidly generating plots from cached stats.
4.  **Deep Dive**: Creating detailed "Spaghetti Plots" and Animations.

In [None]:
import os
import glob
from pathlib import Path
import matplotlib.pyplot as plt
from IPython.display import Image, display

# Import teval modules
from teval.config import TevalConfig, generate_default_config
from teval.pipeline import run_pipeline

# Set up our workspace
WORKDIR = Path("demo_output")
WORKDIR.mkdir(exist_ok=True)

print(f"Working directory set to: {WORKDIR}")

### Step 0: (Optional) Generate Sample Data
If you do not have your own T-Route ensemble output files yet, you can run the cell below to generate realistic dummy data based on a sample hydrofabric GeoPackage.

**Note:** You must have a valid `.gpkg` file (e.g., `gage_10023000.gpkg`) in your `data/` folder for this to work.

In [None]:
# Import the dummy data generator from the root directory
import sys
sys.path.append("..") # Ensure root dir is in path if notebook is in a subfolder
try:
    from create_dummy_data import create_realistic_output_from_gpkg
    
    # USER INPUT REQUIRED
    # Update this path to point to your local hydrofabric file
    GPKG_PATH = Path("../data/gage_10023000.gpkg") 
    OUTPUT_DIR = Path("../data/")
    
    if GPKG_PATH.exists():
        print(f"Generating dummy ensemble data in {OUTPUT_DIR}...")
        create_realistic_output_from_gpkg(
            output_dir=str(OUTPUT_DIR),
            gpkg_path=str(GPKG_PATH),
            num_members=10
        )
        print("Sample data generated.")
    else:
        print(f"Skipping data generation: {GPKG_PATH} not found.")
        print("To run this step, please provide a valid hydrofabric GeoPackage path in the code above.")
        
except ImportError:
    print("Could not import 'create_dummy_data'. Ensure the script is in your project root.")

### Step 1: Initialize Configuration
We start by generating a default configuration file. In a real workflow, you would edit this YAML file to point to your specific NetCDF data.
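
Before pointing the configuration at your own data, it can help to check which file names a pattern such as `troute_output_formulation_*.nc` (the value used in the cell below) would select. The sketch below assumes the pattern is interpreted as a shell-style glob, which this notebook does not confirm for `teval`. The file names in the list are made up for illustration.

```python
from fnmatch import fnmatchcase

# Pattern taken from this notebook; treating it as a shell-style glob is an assumption.
pattern = "troute_output_formulation_*.nc"

# Hypothetical directory listing, for illustration only.
candidates = [
    "troute_output_formulation_0.nc",
    "troute_output_formulation_12.nc",
    "ensemble_stats.nc",
    "troute_output_formulation_3.csv",
]

# Keep only the names that match the pattern (case-sensitive on every platform).
matched = [name for name in candidates if fnmatchcase(name, pattern)]
print(matched)
```

Only the two `troute_output_formulation_*.nc` names survive the filter, so a cached statistics file in the same folder would not be mistaken for an ensemble member under this assumption.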

In [None]:
config_path = "demo_config.yaml"

# Generate default if it doesn't exist
if not os.path.exists(config_path):
    generate_default_config(config_path)
    print(f"Created default config: {config_path}")

# Load the config object
config = TevalConfig.from_yaml(config_path)

# Point to our sample data (Adjust these paths to match your actual sample data location)
config.io.input_dir = Path("../data") 
config.io.output_dir = WORKDIR
config.io.ensemble_pattern = "troute_output_formulation_*.nc"

# Let's verify what we have
print(f"Input: {config.io.input_dir}")
print(f"Output: {config.io.output_dir}")

### Step 2: The Ensemble Calculations
Processing large ensemble files can take a long time, depending on the size of the domain and the duration of the simulation. `teval` can do this work once and save the processed statistics to a file, so later steps do not have to repeat it.

We will run the pipeline with `stats.enabled = True`. This reads the raw NetCDF files, calculates mean/median/quantiles, and saves them to `ensemble_stats.nc`.
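
To make the statistics concrete, here is a minimal sketch of a per-time-step mean, median, and 5%/95% quantiles across ensemble members. It uses only the Python standard library and made-up flow values. It is not how `teval` computes its statistics internally, and the function name `ensemble_stats` is hypothetical.

```python
import statistics

def ensemble_stats(members):
    """Summarize equal-length flow series, one list per ensemble member."""
    stats = {"mean": [], "median": [], "q05": [], "q95": []}
    for values in zip(*members):  # all members' values at one time step
        # n=20 yields cut points at 5%, 10%, ..., 95%; "inclusive" interpolates
        # linearly between the sample minimum and maximum.
        cuts = statistics.quantiles(values, n=20, method="inclusive")
        stats["mean"].append(statistics.fmean(values))
        stats["median"].append(statistics.median(values))
        stats["q05"].append(cuts[0])
        stats["q95"].append(cuts[-1])
    return stats

# Five hypothetical members, two time steps each.
members = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
summary = ensemble_stats(members)
print(summary)
```

The band between `q05` and `q95` covers the central 90% of the members at each time step, which is what the `[0.05, 0.95]` setting in the next cell requests.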

In [None]:
# Configure for Calculation
config.stats.enabled = True
config.stats.quantiles = [0.05, 0.95] # 90% uncertainty bands

# Disable viz for now to keep it fast
config.viz.hydrographs.enabled = False
config.viz.static_maps.enabled = False
config.viz.interactive_map.enabled = False
config.viz.animation.enabled = False

# Run the pipeline
run_pipeline(config)

print("Statistics calculated and cached.")

### Step 3: Hydrograph Visualization
Now that we have `ensemble_stats.nc`, we can generate plots without re-reading the heavy raw data. 
Let's generate standard hydrographs with uncertainty bands.
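
The compute-once, reuse-later idea can be sketched generically as follows. This is an illustration only: `teval` stores its statistics in NetCDF (`ensemble_stats.nc`), whereas the sketch uses a JSON file as a stand-in, and the names `load_or_compute`, `expensive_stats`, and `stats_cache.json` are hypothetical.

```python
import json
import tempfile
from pathlib import Path

def load_or_compute(cache_path, compute):
    """Return (result, from_cache); run compute() only when no cache file exists."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        return json.loads(cache_path.read_text()), True
    result = compute()
    cache_path.write_text(json.dumps(result))
    return result, False

calls = []  # records how often the expensive step actually runs

def expensive_stats():
    calls.append(1)
    return {"mean": [3.0, 30.0]}  # made-up result

with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "stats_cache.json"  # hypothetical file name
    first, first_cached = load_or_compute(cache_file, expensive_stats)
    second, second_cached = load_or_compute(cache_file, expensive_stats)

print(first_cached, second_cached, len(calls))
```

The second call returns the same values without invoking `expensive_stats` again, which mirrors why the visualization run can skip the raw ensemble files.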

In [None]:
# Configure for Viz
config.stats.enabled = False # Load from cache
config.viz.hydrographs.enabled = True
config.viz.hydrographs.plot_members = False
config.viz.hydrographs.plot_uncertainty = True
config.viz.hydrographs.target_ids = [2860507]
# Enable the USGS gage downloader to add observations to the plot
config.io.auto_download_usgs = True

# Run
run_pipeline(config)

# Display one of the results
plots = glob.glob(str(WORKDIR / "hydrographs" / "*.png"))
if plots:
    print(f"Displaying: {plots[0]}")
    display(Image(filename=plots[0]))

### Step 4: Deep Dive (Spaghetti Plots)
Sometimes the mean isn't enough and you want to see the spread of the individual members.
We can enable `plot_members` to overlay the raw traces. Note: This *does* require reading the raw files again, but only for the specific IDs being plotted.

In [None]:
# Enable Spaghetti Mode
config.viz.hydrographs.plot_members = True
config.viz.hydrographs.plot_uncertainty = False

run_pipeline(config)

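# Display the new style (glob order is arbitrary, so pick the most recently written plot)
plots = glob.glob(str(WORKDIR / "hydrographs" / "*.png"))
if plots:
    latest = max(plots, key=os.path.getmtime)
    print(f"Displaying Spaghetti Plot: {latest}")
    display(Image(filename=latest))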

### Step 5: Spatial Analysis (Maps & Animations)
Finally, let's look at the spatial distribution. We will generate a static map and an animation.
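
The cell below enables `log_scale` for the animation and notes that it is recommended for streamflow. Streamflow across a river network often spans several orders of magnitude, so a linear color scale tends to squeeze most reaches into the lowest colors. The sketch below illustrates this with made-up discharge values. The `normalize` helper is hypothetical, assumes positive values that are not all equal, and is not part of `teval`.

```python
import math

def normalize(values, log_scale=False):
    """Scale positive values to [0, 1], optionally in log10 space."""
    transformed = [math.log10(v) for v in values] if log_scale else list(values)
    lo, hi = min(transformed), max(transformed)
    return [(t - lo) / (hi - lo) for t in transformed]

flows = [0.1, 1.0, 10.0, 100.0, 1000.0]  # made-up discharges
linear = normalize(flows)
logged = normalize(flows, log_scale=True)

print([round(x, 3) for x in linear])
print([round(x, 3) for x in logged])
```

On the linear scale the three smallest flows all land below 0.01, while on the log scale the five values are spread evenly between 0 and 1.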

In [None]:
# Configure Spatial Viz
config.viz.hydrographs.enabled = False 
config.viz.static_maps.enabled = True
config.viz.animation.enabled = True

# Optional: uncomment to set the animation frame rate
# config.viz.animation.fps = 8
config.viz.animation.log_scale = True # Recommended for streamflow
config.data.time_slice = ['2023-05-01', '2023-05-03']

# Run
run_pipeline(config)

# Show Static Map
maps = glob.glob(str(WORKDIR / "maps" / "*.png"))
if maps:
    display(Image(filename=maps[0]))
    
# Show Animation
gifs = glob.glob(str(WORKDIR / "*.gif"))
if gifs:
    print(f"Animation saved to: {gifs[0]}")
    # Note: GitHub/Jupyter usually renders GIFs automatically
    display(Image(filename=gifs[0]))