# Calibration Tutorial - Fort Peck, MT - Unirrigated Flux Plot

## Step 2: Calibrate with PEST++

Now we see if we can improve the model's performance through calibration.

### PEST++ Resources

PEST++ has excellent documentation covering both theory and practice:

1. **The PEST Manual 4th Ed.** (Doherty, 2002): https://www.epa.gov/sites/default/files/documents/PESTMAN.PDF
2. **GMDSI tutorial notebooks**: https://github.com/gmdsi/GMDSI_notebooks
3. **PEST++ User's Manual**: https://github.com/usgs/pestpp/blob/master/documentation/pestpp_users_manual.md
4. **PEST Book** (Doherty, 2015): https://pesthomepage.org/pest-book

**Note:** We calibrate against SSEBop ETf and SNODAS SWE, not against flux observations. The flux data are reserved for validation only.

### PEST++ Installation

Before running this notebook, install PEST++:

1. Get the latest release: https://github.com/usgs/pestpp/releases
2. Follow installation instructions: https://github.com/usgs/pestpp/blob/master/documentation/cmake.md
3. Ensure `pestpp-ies` is in your PATH or update the executable path below.
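
A quick way to confirm step 3 before going further is to ask Python where it finds the executable. This is a minimal sketch using only the standard library; `find_executable` is a hypothetical helper name, not part of `swimrs`.

```python
import shutil


def find_executable(name="pestpp-ies"):
    """Return the absolute path to `name` if it is on PATH, else None."""
    # shutil.which also resolves Windows extensions such as .exe via PATHEXT
    return shutil.which(name)


exe_path = find_executable("pestpp-ies")
if exe_path is None:
    print("pestpp-ies not found on PATH; use the full path to the executable instead")
else:
    print(f"Found pestpp-ies at {exe_path}")
```

If the lookup returns `None`, either add the install directory to PATH or pass the full path wherever the executable name is given.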

In [None]:
import os
import sys

root = os.path.abspath('../..')
sys.path.append(root)

from swimrs.prep.prep_plots import preproc
from swimrs.calibrate.pest_builder import PestBuilder
from swimrs.swim.config import ProjectConfig
from swimrs.calibrate.run_pest import run_pst

## 1. Load Configuration

Load the project configuration and prepare paths.

In [None]:
project = '2_Fort_Peck'
project_ws = os.path.abspath('.')

config_path = os.path.join(project_ws, '2_Fort_Peck.toml')

config = ProjectConfig()
config.read_config(config_path, project_ws)

## 2. Generate Observation Files

PEST++ needs observation files (ETf and SWE) to compare against model predictions. We use `preproc()` to generate these files.

**Note:** If you have a SwimContainer with ingested data, you can alternatively use:
```python
from swimrs.container import open_container
container = open_container("data/2_Fort_Peck.swim", mode="r")
container.export.observations(
    output_dir="obs",
    etf_model="ssebop",
    masks=("irr", "inv_irr"),
    irr_threshold=0.1,
)
container.close()
```

In [None]:
# Write the observed data to files within project workspace
preproc(config_path, project_ws)

## 3. Build PEST++ Control Files

The `PestBuilder` class sets up everything needed for PEST++ calibration. The calibration loop and the parameters it adjusts are outlined below.

### Calibration Loop
1. Initialize model with initial parameter values
2. Run model, write results
3. Compare results to observations (ETf, SWE)
4. Propose new parameters
5. Repeat until convergence
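
To make the loop concrete, here is a toy sketch of an ensemble-based update on a single scalar parameter. It is not the algorithm `pestpp-ies` implements: it assumes a hypothetical linear `forward_model`, one observation, and a simple multiple-data-assimilation-style scheme in which the observation noise is inflated by the number of iterations. All names and numbers are illustrative.

```python
import math
import random
import statistics


def forward_model(p):
    """Hypothetical stand-in for a model run: the prediction is twice the parameter."""
    return 2.0 * p


def smoother_update(params, obs, obs_std, inflation, rng):
    """Apply one Kalman-type ensemble update to a list of scalar parameters."""
    preds = [forward_model(p) for p in params]  # run the model for every realization
    n = len(params)
    p_mean, d_mean = statistics.fmean(params), statistics.fmean(preds)
    # Sample (co)variances estimated from the ensemble itself
    cov_pd = sum((p - p_mean) * (d - d_mean) for p, d in zip(params, preds)) / (n - 1)
    var_d = sum((d - d_mean) ** 2 for d in preds) / (n - 1)
    noise_std = math.sqrt(inflation) * obs_std  # inflated observation noise
    gain = cov_pd / (var_d + noise_std ** 2)
    # Compare each prediction to a noise-perturbed observation and shift the parameter
    return [p + gain * (obs + rng.gauss(0.0, noise_std) - d) for p, d in zip(params, preds)]


rng = random.Random(42)
n_iterations = 3
ensemble = [rng.gauss(1.0, 1.0) for _ in range(50)]  # initial parameter realizations
for iteration in range(n_iterations):
    ensemble = smoother_update(ensemble, obs=6.0, obs_std=0.1,
                               inflation=n_iterations, rng=rng)
    print(iteration + 1, round(statistics.fmean(ensemble), 3),
          round(statistics.stdev(ensemble), 3))
```

With an observation of 6.0 and a model that doubles its input, the ensemble mean should move from its prior value near 1 toward 3, and the spread should shrink as the observation constrains the parameter.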

### Tunable Parameters
SWIM uses 8 tunable parameters:
- **Soil water:** `aw` (available water), `rew` (readily evaporable water), `tew` (total evaporable water)
- **NDVI-Kcb relationship:** `ndvi_alpha`, `ndvi_beta`
- **Stress threshold:** `mad` (management allowable depletion)
- **Snow melt:** `swe_alpha`, `swe_beta`

In [None]:
py_script = os.path.join(project_ws, 'custom_forward_run.py')

builder = PestBuilder(config, use_existing=False, python_script=py_script)

### Build the .pst Control File

The `build_pest()` method:
- Copies project files to a `pest/` directory
- Creates the `.pst` control file
- Sets up parameter templates (`.tpl`) and instruction files (`.ins`)

**Note:** During ETf data processing, capture-date markers (`*_ct`) were created. Observations get weight 1.0 on capture dates, weight 0.0 otherwise. This ensures we only calibrate against actual satellite observations, not interpolated values.
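
The weighting rule described in the note can be sketched in a few lines. This is an illustration of the idea only, with hypothetical dates and a hypothetical helper name; it does not reproduce how `swimrs` stores capture-date markers.

```python
from datetime import date, timedelta


def observation_weights(dates, capture_dates):
    """Return 1.0 for dates with a satellite capture and 0.0 for all others."""
    captures = set(capture_dates)
    return [1.0 if d in captures else 0.0 for d in dates]


# Hypothetical example: ten daily time steps with two satellite overpasses
start = date(2010, 6, 1)
days = [start + timedelta(days=i) for i in range(10)]
overpasses = [date(2010, 6, 3), date(2010, 6, 10)]

weights = observation_weights(days, overpasses)
for d, w in zip(days, weights):
    print(d.isoformat(), w)
```

Zero-weighted observations stay in the control file but contribute nothing to the objective function, so interpolated values cannot influence the calibration.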

In [None]:
# Build the pest control file
# WARNING: This will erase any existing pest directory!
builder.build_pest(target_etf=config.etf_target_model, members=config.etf_ensemble_members)

In [None]:
# Show the files created in the pest directory
pest_files = [f for f in sorted(os.listdir(builder.pest_dir)) 
              if os.path.isfile(os.path.join(builder.pest_dir, f))]
print("Files in pest directory:")
for f in pest_files:
    print(f"  {f}")

### Examine the .pst File

The PEST++ version 2 control file is concise, delegating details to external files:

In [None]:
with open(builder.pst_file, 'r') as f:
    print(f.read())

## 4. Configure and Test

Now we:
1. Build the localizer matrix (links parameters to relevant observations)
2. Do a dry run to verify everything works
3. Set control parameters for the full calibration
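
Conceptually, the localizer from step 1 is a 0/1 table with observation groups as rows and parameters as columns. The sketch below builds such a table as nested dictionaries. The group names `"etf"` and `"swe"` and the helper itself are assumptions for illustration; the file that `build_localizer()` writes may be laid out differently.

```python
SWE_PARAMS = ["swe_alpha", "swe_beta"]
ETF_PARAMS = ["aw", "rew", "tew", "ndvi_alpha", "ndvi_beta", "mad"]


def sketch_localizer(obs_groups, params):
    """Return {obs_group: {param: 0.0 or 1.0}} linking groups to parameters."""
    localizer = {}
    for group in obs_groups:
        # Assumed naming: the SWE group is literally called "swe"
        allowed = SWE_PARAMS if group == "swe" else ETF_PARAMS
        localizer[group] = {p: 1.0 if p in allowed else 0.0 for p in params}
    return localizer


localizer = sketch_localizer(["etf", "swe"], ETF_PARAMS + SWE_PARAMS)
for group, row in localizer.items():
    print(group, row)
```

A zero entry tells the smoother to ignore any apparent correlation between that observation group and that parameter, which guards against spurious updates when the ensemble is small.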

In [None]:
# Build localizer matrix
# - SWE observations update only swe_alpha and swe_beta
# - ETf observations update all other parameters
builder.build_localizer()

# Run a minimal model run to verify the setup
print("\nRunning dry run...")
builder.dry_run()

# Configure for 3 optimization iterations with 20 realizations each
# Increase realizations (e.g., 100-200) for production runs
builder.write_control_settings(noptmax=3, reals=20)

In [None]:
# Show the updated control file with calibration settings
with open(builder.pst_file, 'r') as f:
    print(f.read())

## 5. Run PEST++ Calibration

Now we launch PEST++ with parallel workers. Adjust `workers` to match your machine's capabilities.

**Important:** This can take a while. Watch the progress indicator, which shows completed (C), failed (F), and total (T) runs.
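
One rough way to choose a worker count is to cap the requested number at one less than the available cores, leaving a core free for the master process. This is a rule of thumb rather than a PEST++ requirement, and `pick_workers` is a hypothetical helper.

```python
import os


def pick_workers(requested, cpu_count=None):
    """Cap the requested worker count at (available cores - 1), with a minimum of 1."""
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, min(requested, cores - 1))


print(pick_workers(6))
```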

In [None]:
workers = config.workers or 6
_pst = f"{config.project_name}.pst"

print(f"Starting PEST++ with {workers} workers...")
print(f"Control file: {_pst}")

run_pst(builder.pest_dir,
        'pestpp-ies',
        _pst,
        num_workers=workers,
        worker_root=builder.workers_dir,
        master_dir=builder.master_dir,
        cleanup=True,
        verbose=True)

## Debugging Tips

If calibration fails, try these steps:

### Never saw the PEST++ panther logo?
- Ensure `pestpp-ies` is in your PATH
- Try running with the full path: `/path/to/pestpp-ies`
- On Windows, include the `.exe` extension

### Python error traceback?
Debug from the bottom up:
1. Verify `swimrs/calibrate/run_mp.py` runs standalone
2. Test `custom_forward_run.py` from the pest directory
3. Try running `pestpp-ies` directly from the pest directory

### Saw panther but got errors?
Common error: `output file 'pred/pred_swe_US-FPe.np' not found`
- This usually means SWIM failed before writing predictions
- Set `cleanup=False` above to preserve worker directories
- Check `workers/worker_0/panther_worker.rec` for errors
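
When there are many workers, scanning every record file by hand gets tedious. The sketch below walks a workers directory and prints lines that mention "error". The directory name `workers` and the `*.rec` pattern follow the paths mentioned above; the helper itself is hypothetical, and it only helps if the run used `cleanup=False`.

```python
import glob
import os


def find_error_lines(workers_dir, pattern="*.rec", keyword="error"):
    """Return (path, line_number, text) for lines containing `keyword`, ignoring case."""
    hits = []
    search = os.path.join(workers_dir, "**", pattern)
    for path in sorted(glob.glob(search, recursive=True)):
        with open(path, "r", errors="replace") as f:
            for number, line in enumerate(f, start=1):
                if keyword.lower() in line.lower():
                    hits.append((path, number, line.rstrip()))
    return hits


for path, number, text in find_error_lines("workers"):
    print(f"{path}:{number}: {text}")
```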

**Most errors (90%) are due to incorrect paths!**

## 6. Check Results

After successful calibration, parameter files are saved for each optimization iteration:

In [None]:
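def _iteration(name):
    # '<case>.<iteration>.par.csv' -> iteration as int, so that 10 sorts after 9
    token = name.split('.')[-3]
    return int(token) if token.isdigit() else -1

par_files = sorted((f for f in os.listdir(builder.pest_dir) if f.endswith('.par.csv')),
                   key=_iteration)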
print("Parameter files:")
for f in par_files:
    print(f"  {f}")

if par_files:
    print(f"\nFinal calibrated parameters: {par_files[-1]}")

## Summary

You've set up and run a PEST++ calibration using:
- SSEBop ETf observations from Landsat
- SNODAS SWE observations
- Iterative Ensemble Smoother (pestpp-ies)

The parameter files (e.g., `2_Fort_Peck.3.par.csv`) contain the calibrated parameter sets. Each row represents a different realization from the ensemble.
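
A quick look at the ensemble mean and spread for each parameter shows how tightly the calibration constrained it. The sketch below uses only the standard library and assumes the first column holds the realization name and every other column holds one parameter. The demo file, its column names, and its values are made up for illustration.

```python
import csv
import os
import statistics
import tempfile


def summarize_ensemble(csv_path, index_column=0):
    """Return {parameter_name: (mean, stdev)} across the realizations in csv_path."""
    with open(csv_path, newline="") as f:
        rows = list(csv.reader(f))
    header, body = rows[0], rows[1:]
    summary = {}
    for col, name in enumerate(header):
        if col == index_column:
            continue  # assumed to hold the realization name, not a parameter
        values = [float(row[col]) for row in body]
        spread = statistics.stdev(values) if len(values) > 1 else 0.0
        summary[name] = (statistics.fmean(values), spread)
    return summary


# Hypothetical two-realization file, written to a temporary directory
demo = "real_name,aw_demo,mad_demo\n0,100.0,0.5\n1,120.0,0.7\n"
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.3.par.csv")
    with open(demo_path, "w") as f:
        f.write(demo)
    summary = summarize_ensemble(demo_path)

for name, (mean, spread) in summary.items():
    print(f"{name}: mean={mean:.3f}, stdev={spread:.3f}")
```

To apply this to a real run, point `summarize_ensemble` at the final parameter file in the pest directory, after checking that its layout matches the assumption above.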

**Next step:** In notebook `03_calibrated_model.ipynb`, we'll run the model in forecast mode with calibrated parameters and validate against flux tower observations.