# Calibration Tutorial - Fort Peck, MT - Unirrigated Flux Plot

## Step 2: Calibrate with PEST++

Now we see if we can improve the model's performance through calibration.

### PEST++ Resources

PEST++ has excellent documentation covering both theory and practice:

1. **The PEST Manual 4th Ed.** (Doherty, 2002): https://www.epa.gov/sites/default/files/documents/PESTMAN.PDF
2. **GMDSI tutorial notebooks**: https://github.com/gmdsi/GMDSI_notebooks
3. **PEST++ User's Manual**: https://github.com/usgs/pestpp/blob/master/documentation/pestpp_users_manual.md
4. **PEST Book** (Doherty, 2015): https://pesthomepage.org/pest-book

**Note:** We calibrate against PT-JPL ETf and SNODAS SWE, not against flux observations. The flux data are reserved for validation.

### PEST++ Installation

Install PEST++ via conda-forge (recommended):

```bash
conda install conda-forge::pestpp
```

This installs `pestpp-ies` and other PEST++ tools directly into your conda environment. Verify the installation:

```bash
pestpp-ies --version
```

**Alternative:** For manual installation from source, see the [PEST++ GitHub releases](https://github.com/usgs/pestpp/releases).
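
To confirm from Python that the executable is discoverable before building anything, a minimal sketch using only the standard library might look like this. The helper name `find_executable` is ours for illustration; it is not part of PEST++ or swimrs.

```python
import shutil


def find_executable(name="pestpp-ies"):
    """Return the full path of `name` if it is on PATH, else None."""
    # shutil.which also honors PATHEXT on Windows, so `.exe` is resolved there
    return shutil.which(name)


path = find_executable()
if path is None:
    print("pestpp-ies not found on PATH; activate the conda environment it was installed into")
else:
    print(f"pestpp-ies found at {path}")
```

Running this inside the notebook kernel checks the same environment that the calibration cells below will use.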

In [1]:
import os
import sys

root = os.path.abspath('../..')
sys.path.append(root)

from swimrs.container import SwimContainer
from swimrs.calibrate.pest_builder import PestBuilder
from swimrs.swim.config import ProjectConfig
from swimrs.calibrate.run_pest import run_pst



## 1. Load Configuration

Load the project configuration and prepare paths.

In [2]:
project = '2_Fort_Peck'
project_ws = os.path.abspath('.')

config_path = os.path.join(project_ws, '2_Fort_Peck.toml')

config = ProjectConfig()
config.read_config(config_path, project_ws)

## 2. Generate Observation Files

PEST++ needs observation files (ETf and SWE) to compare against model predictions. We export these from the SwimContainer built in the data preparation step.

The export creates per-field numpy files:
- `obs_etf_{field_id}.np`: ETf observations (PT-JPL), switching between the `irr` and `inv_irr` masks
- `obs_swe_{field_id}.np`: SWE observations from SNODAS

In [3]:
# Open container for observations export and PEST++ setup
container_path = os.path.join(project_ws, 'data', f'{project}.swim')
container = SwimContainer.open(container_path, mode='r')

# Export observation files for PEST++ calibration
# Note: obs_folder is set to {pest_run_dir}/obs in the TOML config
obs_dir = os.path.join(project_ws, 'data', 'pestrun', 'obs')
os.makedirs(obs_dir, exist_ok=True)

container.export.observations(
    output_dir=obs_dir,
    etf_model='ptjpl',
    masks=('irr', 'inv_irr'),
    irr_threshold=0.1,
)
print(f"Observation files written to {obs_dir}")

Observation files written to /home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/obs


## 3. Build PEST++ Control Files

The `PestBuilder` class sets up everything needed for PEST++ calibration:

### Calibration Loop
1. Initialize model with initial parameter values
2. Run model, write results
3. Compare results to observations (ETf, SWE)
4. Propose new parameters
5. Repeat until convergence
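
The loop above can be sketched on a toy problem. This is not the algorithm in `pestpp-ies`; it is a simpler ensemble smoother with multiple data assimilation (ES-MDA), chosen only because it fits in a few lines. The straight-line `forward` model, the synthetic observations, and all numbers are assumptions made for illustration.

```python
import numpy as np


def forward(theta, x):
    """Toy stand-in for a model run: a line with slope theta[0], intercept theta[1]."""
    return theta[0] * x + theta[1]


def smoother_update(params, sims, obs, obs_sd, inflation, rng):
    """One ES-MDA update of the parameter ensemble (rows are realizations)."""
    n_reals = params.shape[0]
    dp = params - params.mean(axis=0)             # parameter anomalies
    dd = sims - sims.mean(axis=0)                 # simulated-output anomalies
    c_pd = dp.T @ dd / (n_reals - 1)              # parameter/output cross-covariance
    c_dd = dd.T @ dd / (n_reals - 1)              # output covariance
    r = inflation * obs_sd ** 2 * np.eye(obs.size)  # inflated observation noise
    noisy_obs = obs + np.sqrt(inflation) * obs_sd * rng.standard_normal(sims.shape)
    gain = c_pd @ np.linalg.inv(c_dd + r)
    return params + (noisy_obs - sims) @ gain.T


rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
truth = np.array([2.0, 0.5])
obs_sd = 0.05
obs = forward(truth, x) + rng.normal(0.0, obs_sd, x.size)  # synthetic observations

# 1. Initialize: draw a prior ensemble of 100 realizations
params = rng.normal(loc=[1.0, 0.0], scale=1.0, size=(100, 2))
prior_error = np.abs(params.mean(axis=0) - truth).max()

n_iter = 3  # inflation factors of n_iter over n_iter passes sum to one data assimilation
for _ in range(n_iter):                                # 5. repeat
    sims = np.array([forward(p, x) for p in params])   # 2. run the model
    # 3. compare to observations and 4. propose new parameters
    params = smoother_update(params, sims, obs, obs_sd, n_iter, rng)

posterior_error = np.abs(params.mean(axis=0) - truth).max()
print(f"max error in ensemble mean: prior {prior_error:.2f}, posterior {posterior_error:.2f}")
```

The point of the sketch is the shape of the loop: every realization is run through the model, the spread of the ensemble supplies the sensitivity information, and the whole ensemble is nudged toward the observations on each pass.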

### Tunable Parameters
SWIM uses 8 tunable parameters. The names below match the `.tpl` template files that `build_pest()` generates:
- **Soil water:** `aw` (available water)
- **Evaporation and stress coefficients:** `kr_alpha`, `ks_alpha`
- **NDVI-Kcb relationship:** `ndvi_k`, `ndvi_0`
- **Stress threshold:** `mad` (management allowable depletion)
- **Snow melt:** `swe_alpha`, `swe_beta`

In [4]:
py_script = os.path.join(project_ws, 'custom_forward_run.py')

# Pass container to PestBuilder for ETf/SWE data access
builder = PestBuilder(config, container=container, use_existing=False, python_script=py_script)

### Build the .pst Control File

The `build_pest()` method:
- Copies project files to a `pest/` directory
- Creates the `.pst` control file
- Sets up parameter templates (`.tpl`) and instruction files (`.ins`)

**Note:** ETf observations are sparse (only on satellite capture dates). The PEST builder automatically assigns weight 1.0 to valid observations and weight 0.0 to missing dates, ensuring we only calibrate against actual satellite captures.
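
The weighting rule in the note can be illustrated with a few lines of plain Python. The series below is made up, and this is only a sketch of the idea, not the code `PestBuilder` runs.

```python
import math

# Hypothetical daily ETf series: NaN marks days without a satellite capture
etf = [math.nan, 0.42, math.nan, math.nan, 0.57, math.nan, 0.61]

# Weight 1.0 where an observation exists, 0.0 where it is missing
weights = [0.0 if math.isnan(v) else 1.0 for v in etf]

# Only the non-zero weights contribute to the objective function
n_nonzero = sum(w > 0 for w in weights)
print(weights)
print(f"non-zero weighted observations: {n_nonzero}")
```

A zero weight removes a date from the objective function without changing the length of the observation series, so model output and observations stay aligned day by day.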

In [5]:
# Build the pest control file
# WARNING: This will erase any existing pest directory!
builder.build_pest(target_etf=config.etf_target_model, members=config.etf_ensemble_members)

2026-01-15 11:19:51.153534 starting: opening PstFrom.log for logging
2026-01-15 11:19:51.153828 starting PstFrom process
2026-01-15 11:19:51.154010 starting: setting up dirs
2026-01-15 11:19:51.154322 starting: removing existing new_d '/home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/pest'
2026-01-15 11:19:51.208702 finished: removing existing new_d '/home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/pest' took: 0:00:00.054380
2026-01-15 11:19:51.208764 starting: copying original_d '/home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun' to new_d '/home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/pest'
2026-01-15 11:19:54.711347 finished: copying original_d '/home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun' to new_d '/home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/pest' took: 0:00:03.502583
2026-01-15 11:19:54.713239 finished: setting up dirs took: 0:00:03.559229
2026-01-15 11:19:54.713533 starting: adding cons

In [6]:
# Show the files created in the pest directory
pest_files = [f for f in sorted(os.listdir(builder.pest_dir)) 
              if os.path.isfile(os.path.join(builder.pest_dir, f))]
print("Files in pest directory:")
for f in pest_files:
    print(f"  {f}")

Files in pest directory:
  2_Fort_Peck.idx.csv
  2_Fort_Peck.pst
  2_fort_peck.insfile_data.csv
  2_fort_peck.obs_data.csv
  2_fort_peck.par_data.csv
  2_fort_peck.pargp_data.csv
  2_fort_peck.tplfile_data.csv
  custom_forward_run.py
  etf_US-FPe.ins
  mult2model_info.csv
  p_aw_US-FPe_0_constant.csv.tpl
  p_kr_alpha_US-FPe_0_constant.csv.tpl
  p_ks_alpha_US-FPe_0_constant.csv.tpl
  p_mad_US-FPe_0_constant.csv.tpl
  p_ndvi_0_US-FPe_0_constant.csv.tpl
  p_ndvi_k_US-FPe_0_constant.csv.tpl
  p_swe_alpha_US-FPe_0_constant.csv.tpl
  p_swe_beta_US-FPe_0_constant.csv.tpl
  params.csv
  spinup.json
  swe_US-FPe.ins


### Examine the .pst File

The control file uses the PEST++ version 2 format (`pcf version=2`). It is concise because it delegates details to external files:

In [7]:
with open(builder.pst_file, 'r') as f:
    print(f.read())

pcf version=2
* control data keyword
pestmode                                 estimation
noptmax                                 0
svdmode                                 1
maxsing                          10000000
eigthresh                           1e-06
eigwrite                                1
* parameter groups external
2_fort_peck.pargp_data.csv
* parameter data external
2_fort_peck.par_data.csv
* observation data external
2_fort_peck.obs_data.csv
* model command line
python custom_forward_run.py
* model input external
2_fort_peck.tplfile_data.csv
* model output external
2_fort_peck.insfile_data.csv



## 4. Configure and Test

Now we:
1. Build the localizer matrix (links parameters to relevant observations)
2. Run spinup to save initial water balance state
3. Do a dry run to verify everything works
4. Set control parameters for the full calibration

In [8]:
# Build localizer matrix
# - SWE observations update only swe_alpha and swe_beta
# - ETf observations update all other parameters
builder.build_localizer()

# Run spinup to save water balance state as initial conditions for calibration runs
# This runs the model once and saves the final state to spinup.json
print("Running spinup...")
builder.spinup(overwrite=True)

# Do a minimal dry run to verify the setup
print("\nRunning dry run...")
builder.dry_run()

# Configure for 3 optimization iterations with 20 realizations each
# Increase realizations (e.g., 100-200) for production runs
builder.write_control_settings(noptmax=3, reals=20)

noptmax:0, npar_adj:8, nnz_obs:2868
Running spinup...
RUNNING SPINUP
USING PARAMETER DEFAULTS

Running dry run...
noptmax:3, npar_adj:8, nnz_obs:2868
writing /home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/pest/2_Fort_Peck.pst with noptmax=3, 20 realizations
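
The localizer built by `build_localizer()` can be pictured as a table of ones and zeros with one row per observation group and one column per parameter. The sketch below reproduces the rule stated in the cell comments. The parameter names are taken from the `.tpl` files listed earlier; the row labels `swe` and `etf` and the dictionary layout are illustrative assumptions, not the format of `loc.mat`.

```python
# Parameter names from the .tpl files; row labels are illustrative only
parameters = ["aw", "kr_alpha", "ks_alpha", "mad", "ndvi_0", "ndvi_k", "swe_alpha", "swe_beta"]
snow_parameters = {"swe_alpha", "swe_beta"}

# 1.0 means "this observation group may update this parameter"
localizer = {
    "swe": {p: 1.0 if p in snow_parameters else 0.0 for p in parameters},
    "etf": {p: 0.0 if p in snow_parameters else 1.0 for p in parameters},
}

for group, row in localizer.items():
    active = [p for p, w in row.items() if w > 0]
    print(f"{group} observations update: {', '.join(active)}")
```

Zeroing these entries keeps spurious correlations in a small ensemble from letting, for example, a snow observation adjust a soil parameter.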


In [9]:
# Show the updated control file with calibration settings
with open(builder.pst_file, 'r') as f:
    print(f.read())

pcf version=2
* control data keyword
pestmode                                 estimation
noptmax                                 3
svdmode                                 1
maxsing                          10000000
eigthresh                           1e-06
eigwrite                                1
ies_localizer                  loc.mat
ies_num_reals                  20
ies_drop_conflicts             true
* parameter groups external
2_fort_peck.pargp_data.csv
* parameter data external
2_fort_peck.par_data.csv
* observation data external
2_fort_peck.obs_data.csv
* model command line
python custom_forward_run.py
* model input external
2_fort_peck.tplfile_data.csv
* model output external
2_fort_peck.insfile_data.csv



## 5. Run PEST++ Calibration

Now we launch PEST++ with parallel workers. Adjust `workers` to match your machine's capabilities.

**Important:** This can take a while. Watch for the progress indicator showing completed runs (C), failed (F), and total (T).
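
If you are unsure what worker count to use, one rough starting point is to cap it at the ensemble size and leave a core free for the master process. The helper `pick_workers` and its `reserve` argument are hypothetical, and the rule is a heuristic rather than a PEST++ recommendation.

```python
import os


def pick_workers(n_reals, cpu_count=None, reserve=1):
    """Cap workers at the ensemble size and keep `reserve` cores free."""
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, min(n_reals, cpus - reserve))


# With 20 realizations, as configured above
print(pick_workers(n_reals=20))
```

The result can be passed as `num_workers` in place of the value read from the configuration.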

In [10]:
workers = config.workers or 6
_pst = f"{config.project_name}.pst"

print(f"Starting PEST++ with {workers} workers...")
print(f"Control file: {_pst}")

run_pst(builder.pest_dir,
        'pestpp-ies',
        _pst,
        num_workers=workers,
        worker_root=builder.workers_dir,
        master_dir=builder.master_dir,
        cleanup=True,
        verbose=True)

Starting PEST++ with 20 workers...
Control file: 2_Fort_Peck.pst
rmtree: /home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/workers/worker_6
rmtree: /home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/workers/worker_14
rmtree: /home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/workers/worker_3
rmtree: /home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/workers/worker_12
rmtree: /home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/workers/worker_18
rmtree: /home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/workers/worker_15
rmtree: /home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/workers/worker_19
rmtree: /home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/workers/worker_8
rmtree: /home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/workers/worker_16
rmtree: /home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/workers/worker_2
rmtree: /home/dgketchum/code/swim-rs/examples/2_F

## Debugging Tips

If calibration fails, try these steps:

### Never saw the PEST++ panther logo?
- Ensure `pestpp-ies` is on your PATH
- Try running it with the full path: `/path/to/pestpp-ies`
- On Windows, include the `.exe` extension

### Python error traceback?
Debug from the bottom up:
1. Verify that `swimrs/calibrate/run_mp.py` runs standalone
2. Test `custom_forward_run.py` from the pest directory
3. Try running `pestpp-ies` directly from the pest directory

### Saw panther but got errors?
Common error: `output file 'pred/pred_swe_US-FPe.np' not found`
- This usually means SWIM failed before writing predictions
- Set `cleanup=False` above to preserve worker directories
- Check `workers/worker_0/panther_worker.rec` for errors
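
With many workers, opening each record file by hand gets tedious. The sketch below gathers the last few lines of every `panther_worker.rec` it finds. It assumes the `worker_N` directory layout seen in the run output above; the helper name and the example path are ours.

```python
from pathlib import Path


def tail_worker_records(workers_dir, n_lines=5):
    """Return the last `n_lines` of each worker's panther_worker.rec."""
    tails = {}
    for rec in sorted(Path(workers_dir).glob("worker_*/panther_worker.rec")):
        lines = rec.read_text(errors="replace").splitlines()
        tails[rec.parent.name] = lines[-n_lines:]
    return tails


# Example call; in the notebook, pass builder.workers_dir instead of this path
for worker, lines in tail_worker_records("data/pestrun/workers").items():
    print(f"--- {worker} ---")
    print("\n".join(lines))
```

This only helps if the worker directories still exist, so run it after a calibration launched with `cleanup=False`.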

**Most errors (90%) are due to incorrect paths!**

## 6. Check Results

After successful calibration, parameter files are saved for each optimization iteration:

In [11]:
# Parameter files are in master_dir after calibration
import os
import shutil

results_dir = builder.master_dir if os.path.exists(builder.master_dir) else builder.pest_dir

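def _iteration(name):
    # '2_Fort_Peck.3.par.csv' -> 3; names without a numeric iteration sort first
    token = name.split('.')[-3]
    return int(token) if token.isdigit() else -1


# Sort numerically so that iteration 10 follows iteration 9, not iteration 1
par_files = sorted((f for f in os.listdir(results_dir) if f.endswith('.par.csv')), key=_iteration)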
print(f"Parameter files (in {results_dir}):")
for f in par_files:
    print(f"  {f}")

if par_files:
    final_par = par_files[-1]
    print(f"\nFinal calibrated parameters: {final_par}")
    
    # Copy final parameters to pest/archive/ for consistency with calibration.py script
    archive_dir = os.path.join(builder.pest_dir, 'archive')
    os.makedirs(archive_dir, exist_ok=True)
    src = os.path.join(results_dir, final_par)
    dst = os.path.join(archive_dir, final_par)
    shutil.copy2(src, dst)
    print(f"Archived to: {dst}")

# Close the container
container.close()
print("\nContainer closed.")

Parameter files (in /home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/master):
  2_Fort_Peck.0.par.csv
  2_Fort_Peck.1.par.csv
  2_Fort_Peck.2.par.csv
  2_Fort_Peck.3.par.csv

Final calibrated parameters: 2_Fort_Peck.3.par.csv
Archived to: /home/dgketchum/code/swim-rs/examples/2_Fort_Peck/data/pestrun/pest/archive/2_Fort_Peck.3.par.csv

Container closed.


## Summary

You've set up and run a PEST++ calibration using:
- PT-JPL ETf observations from Landsat (via OpenET)
- SNODAS SWE observations
- Iterative Ensemble Smoother (pestpp-ies)

The parameter files (e.g., `2_Fort_Peck.3.par.csv`) contain the calibrated parameter sets. Each row represents a different realization from the ensemble.
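
Because each row is one realization, a quick way to inspect a parameter file is to summarize each column across rows. The sketch below does this with the standard library on a made-up three-row sample. The column names, the `real_name` index column, and every value are assumptions for illustration, not results from this calibration.

```python
import csv
import io
import statistics

# Made-up stand-in for a .par.csv file; real column names and values will differ
sample = io.StringIO(
    "real_name,aw,mad\n"
    "0,150.0,0.50\n"
    "1,170.0,0.60\n"
    "2,160.0,0.40\n"
)

rows = list(csv.DictReader(sample))
summary = {}
for name in rows[0]:
    if name == "real_name":
        continue  # realization label, not a parameter
    values = [float(r[name]) for r in rows]
    summary[name] = (statistics.mean(values), statistics.stdev(values))

for name, (mean, sd) in summary.items():
    print(f"{name}: mean={mean:.3f}, sd={sd:.3f}")
```

To apply this to a real file, replace `sample` with an open file handle. The spread of each column indicates how tightly the observations constrain that parameter.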

**Next step:** In notebook `03_calibrated_model.ipynb`, we'll run the model in forecast mode with calibrated parameters and validate against flux tower observations.