# 📘 AM2 Model: Simulation & Calibration (Production Ready)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/benmola/OpenAD-lib/blob/main/notebooks/02_AM2_Modelling_Full.ipynb)

**Simplified 4-state AD model** with Optuna-based parameter calibration.

**⚠️ This notebook mirrors the following example scripts:**
- `examples/02_am2_simulation.py` (Simulation)
- `examples/03_am2_calibration.py` (Calibration)

---

## 📚 References
- **AM2 Model**: [Dekhici et al. (2024) - ACM DL](https://dl.acm.org/doi/10.1145/3680281)
- **Optuna**: [Akiba et al. (2019)](https://arxiv.org/abs/1907.10902)

## 🔬 AM2 Model Background

### Simplified 4-State Model

| State | Description | Unit |
|-------|-------------|------|
| $S_1$ | Organic substrate (COD) | g COD/L |
| $S_2$ | VFA concentration | g COD/L |
| $X_1$ | Acidogenic biomass | g/L |
| $X_2$ | Methanogenic biomass | g/L |

### Model Equations

**Mass Balances:**

$$\frac{dS_1}{dt} = D(S_{1,in} - S_1) - k_1 \mu_1(S_1) X_1$$

$$\frac{dS_2}{dt} = D(S_{2,in} - S_2) + k_2 \mu_1(S_1) X_1 - k_3 \mu_2(S_2) X_2$$

$$\frac{dX_1}{dt} = (\mu_1(S_1) - D) X_1$$

$$\frac{dX_2}{dt} = (\mu_2(S_2) - D) X_2$$

### Kinetics

**Monod (Acidogenesis):**
$$\mu_1(S_1) = \frac{\mu_{1,max} \cdot S_1}{K_1 + S_1}$$

**Haldane with Inhibition (Methanogenesis):**
$$\mu_2(S_2) = \frac{\mu_{2,max} \cdot S_2}{K_2 + S_2 + S_2^2/K_i}$$

**Biogas:**
$$Q = k_6 \mu_2(S_2) X_2$$
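
The three rate laws above can be sketched in a few lines of plain Python. The function names and numeric values here are illustrative assumptions; they are not OpenAD-lib's API or its default parameters. The last lines check two properties that follow directly from the formulas: the Monod rate equals half its maximum at $S_1 = K_1$, and the Haldane rate peaks at $S_2 = \sqrt{K_2 K_i}$ and declines on either side.

```python
import math

def monod(S1, m1, K1):
    """Monod growth rate mu_1(S1) for acidogenesis (1/d)."""
    return m1 * S1 / (K1 + S1)

def haldane(S2, m2, K2, Ki):
    """Haldane growth rate mu_2(S2) for methanogenesis (1/d)."""
    return m2 * S2 / (K2 + S2 + S2**2 / Ki)

def biogas(S2, X2, k6, m2, K2, Ki):
    """Biogas flow Q = k6 * mu_2(S2) * X2."""
    return k6 * haldane(S2, m2, K2, Ki) * X2

# Illustrative values only (assumed, not the library defaults)
m1, K1 = 0.25, 10.0
m2, K2, Ki = 0.5, 20.0, 20.0

# Monod: half of the maximum rate is reached at S1 == K1
mu1_at_K1 = monod(K1, m1, K1)

# Haldane: the rate is highest at sqrt(K2 * Ki) and lower on both sides
S2_peak = math.sqrt(K2 * Ki)
mu2_peak = haldane(S2_peak, m2, K2, Ki)
mu2_below = haldane(0.5 * S2_peak, m2, K2, Ki)
mu2_above = haldane(2.0 * S2_peak, m2, K2, Ki)

print(f"mu1(K1) = {mu1_at_K1:.4f} 1/d")
print(f"mu2 peaks at S2 = {S2_peak:.1f} g COD/L")
```

The drop in $\mu_2$ above the peak is what makes VFA accumulation self-reinforcing: more VFA slows the methanogens that consume it.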

### Key Parameters (to be calibrated)
- $\mu_{1,max}$ (`m1`): Max acidogenic growth rate
- $K_1$: Half-saturation for S1
- $\mu_{2,max}$ (`m2`): Max methanogenic growth rate
- $K_2$: Half-saturation for S2
- $K_i$: Substrate inhibition constant
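
To see how the mass balances and kinetics fit together, the sketch below codes the four equations as one right-hand-side function and integrates them with a fixed-step Runge-Kutta scheme in plain Python. It is a minimal illustration, not how `AM2Model` is implemented: the function names, the yield coefficients `k1`, `k2`, `k3`, the operating conditions, and the initial state are all assumed values chosen for the example.

```python
def am2_rhs(state, D, S1_in, S2_in, p):
    """Right-hand side of the four AM2 mass balances."""
    S1, S2, X1, X2 = state
    mu1 = p["m1"] * S1 / (p["K1"] + S1)                    # Monod
    mu2 = p["m2"] * S2 / (p["K2"] + S2 + S2**2 / p["Ki"])  # Haldane
    dS1 = D * (S1_in - S1) - p["k1"] * mu1 * X1
    dS2 = D * (S2_in - S2) + p["k2"] * mu1 * X1 - p["k3"] * mu2 * X2
    dX1 = (mu1 - D) * X1
    dX2 = (mu2 - D) * X2
    return [dS1, dS2, dX1, dX2]

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    a = f(state)
    b = f([s + 0.5 * dt * k for s, k in zip(state, a)])
    c = f([s + 0.5 * dt * k for s, k in zip(state, b)])
    d = f([s + dt * k for s, k in zip(state, c)])
    return [s + dt / 6.0 * (ka + 2 * kb + 2 * kc + kd)
            for s, ka, kb, kc, kd in zip(state, a, b, c, d)]

# Illustrative values only (assumed, not the library defaults)
p = {"m1": 0.25, "K1": 10.0, "m2": 0.5, "K2": 20.0, "Ki": 20.0,
     "k1": 20.0, "k2": 10.0, "k3": 30.0}
D, S1_in, S2_in = 0.05, 30.0, 2.0   # dilution rate (1/d), inlet concentrations

# Integrate 10 days from an assumed initial state [S1, S2, X1, X2]
state = [5.0, 1.0, 1.0, 0.5]
dt = 0.01
for _ in range(int(10 / dt)):
    state = rk4_step(lambda s: am2_rhs(s, D, S1_in, S2_in, p), state, dt)

# Analytical steady state of the acidogenic pair: mu_1(S1*) = D
S1_star = p["K1"] * D / (p["m1"] - D)
X1_star = (S1_in - S1_star) / p["k1"]
residual = am2_rhs([S1_star, 1.0, X1_star, 0.5], D, S1_in, S2_in, p)

print("State after 10 days:", [round(s, 3) for s in state])
print("dS1/dt and dX1/dt at steady state:", residual[0], residual[2])
```

The steady-state check uses the fact that $dX_1/dt = 0$ with $X_1 > 0$ requires $\mu_1(S_1) = D$, which fixes $S_1^* = K_1 D / (\mu_{1,max} - D)$ and then $X_1^* = (S_{1,in} - S_1^*)/k_1$.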

## 1️⃣ Setup

In [None]:
# Uncomment to install OpenAD-lib (the calibration in section 4 relies on Optuna)
# !pip install git+https://github.com/benmola/OpenAD-lib.git

import sys
import os

IN_COLAB = 'google.colab' in sys.modules

if not IN_COLAB:
    sys.path.append(os.path.join(os.getcwd(), '..', 'src'))

print(f"Running in Colab: {IN_COLAB}")

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from openad_lib.models.mechanistic import AM2Model
from openad_lib.optimisation import AM2Calibrator

print("✅ Imports successful!")

## 2️⃣ Load Lab Data

**Dataset:** `sample_AM2_Lab_data.csv`
- Lab-scale AD reactor data
- Contains: S1(COD), S2(VFA), Q(Biogas) measurements

In [None]:
# Download for Colab
if IN_COLAB:
    !wget -q https://raw.githubusercontent.com/benmola/OpenAD-lib/main/src/openad_lib/data/sample_AM2_Lab_data.csv
    data_path = 'sample_AM2_Lab_data.csv'
else:
    base_path = os.path.dirname(os.getcwd())
    data_path = os.path.join(base_path, 'src', 'openad_lib', 'data', 'sample_AM2_Lab_data.csv')

# Initialize model and load data
print("Initializing AM2 model...")
model = AM2Model()
model.load_data(data_path)

print("✅ Data loaded")

## 3️⃣ Simulation with Default Parameters

**Default parameters** (from literature):
- A good starting point, but rarely optimal for a given reactor
- Usually need calibration against data from the specific reactor and feedstock

In [None]:
# Display default parameters (same output as examples/02_am2_simulation.py)
print("Default AM2 Parameters:")
print(f"  µ₁ₘₐₓ (m1): {model.params.m1} d⁻¹")
print(f"  K₁:         {model.params.K1} g COD/L")
print(f"  µ₂ₘₐₓ (m2): {model.params.m2} d⁻¹")
print(f"  Kᵢ:         {model.params.Ki} g COD/L")
print(f"  K₂:         {model.params.K2} g COD/L")

In [None]:
# Run initial simulation
print("\n🚀 Running AM2 simulation...")
initial_results = model.run(verbose=True)

# Evaluate
print("\nEvaluation Metrics:")
model.print_metrics()
initial_metrics = model.evaluate()

In [None]:
# Plot initial results
print("\nGenerating plots...")
model.plot_results(figsize=(12, 10), show_measured=True)

## 4️⃣ Parameter Calibration with Optuna

### Optimization Problem

**Objective:** Minimize weighted error

$$J = \sum_{i} w_i \cdot \frac{\text{MSE}_i}{\text{Var}(y_i)}$$

**Decision variables:** 5 kinetic parameters

**Constraints:** Physically realistic bounds
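
The objective can be sketched as follows. This is a minimal illustration of the formula above, not the code inside `AM2Calibrator`: the function names are hypothetical, the measurements are made-up numbers, and the use of the population variance is an assumption. Dividing each MSE by the variance of the measurements puts S1, S2, and Q on a comparable scale, so a normalized error of 1 means "no better than predicting the mean".

```python
from statistics import pvariance

def normalized_mse(measured, simulated):
    """MSE divided by the (population) variance of the measurements."""
    n = len(measured)
    mse = sum((y - y_hat) ** 2 for y, y_hat in zip(measured, simulated)) / n
    return mse / pvariance(measured)

def weighted_objective(measured, simulated, weights):
    """J = sum_i w_i * MSE_i / Var(y_i) over the weighted outputs."""
    return sum(w * normalized_mse(measured[name], simulated[name])
               for name, w in weights.items())

# Weights as used later in this notebook
weights = {"S1": 0.5, "S2": 1.0, "Q": 1.0}

# Made-up measurements, for illustration only
measured = {
    "S1": [1.0, 2.0, 3.0, 4.0],
    "S2": [0.5, 0.7, 0.6, 0.8],
    "Q": [10.0, 12.0, 11.0, 13.0],
}

# Two reference "models": a perfect fit and a constant equal to the mean
perfect = {name: list(values) for name, values in measured.items()}
mean_only = {name: [sum(v) / len(v)] * len(v) for name, v in measured.items()}

J_perfect = weighted_objective(measured, perfect, weights)
J_mean = weighted_objective(measured, mean_only, weights)

print(f"J (perfect fit)     = {J_perfect:.3f}")
print(f"J (mean prediction) = {J_mean:.3f}")
```

A constant prediction at the mean gives a normalized error of 1 per output, so its objective equals the sum of the weights. A calibrated model should score well below that. Real lab data may contain missing values, which this sketch does not handle.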

### Why Optuna?
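- **Bayesian optimization**: uses the results of earlier trials to choose the next candidate, so it typically needs far fewer model runs than grid search
- **Tree-structured Parzen Estimator (TPE)**: Optuna's default sampling algorithm
- Usually finds good values for these 5 parameters in about 50 trials; increase `n_trials` if the fit is still poor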

### Parameter Bounds

| Parameter | Min | Max | Unit | Physical Meaning |
|-----------|-----|-----|------|------------------|
| m1 | 0.01 | 0.5 | d⁻¹ | Maximum acidogenic growth rate (too low → slow acidification) |
| K1 | 5.0 | 50.0 | g COD/L | Half-saturation constant for S1 |
| m2 | 0.1 | 1.0 | d⁻¹ | Maximum methanogenic growth rate |
| Ki | 5.0 | 50.0 | g COD/L | VFA inhibition constant |
| K2 | 10.0 | 80.0 | g COD/L | Half-saturation constant for S2 (VFA) |

In [None]:
# Configure calibration (same settings as examples/03_am2_calibration.py)
print("Configuring calibration...")
calibrator = AM2Calibrator(model)

# Parameters to tune
params_to_tune = ['m1', 'K1', 'm2', 'Ki', 'K2']

# Custom bounds (from domain knowledge)
param_bounds = {
    'm1': (0.01, 0.5),
    'K1': (5.0, 50.0),
    'm2': (0.1, 1.0),
    'Ki': (5.0, 50.0),
    'K2': (10.0, 80.0)
}

# Optimization weights (focus on VFA and Biogas)
weights = {'S1': 0.5, 'S2': 1.0, 'Q': 1.0}

print(f"Parameters to tune: {params_to_tune}")
print(f"Optimization weights: {weights}")

In [None]:
# Run calibration (50 trials, as in examples/03_am2_calibration.py)
print("\n🚀 Starting optimization (50 trials)...\n")
best_params = calibrator.calibrate(
    params_to_tune=params_to_tune,
    param_bounds=param_bounds,
    n_trials=50,
    weights=weights,
    show_progress_bar=True
)

print("\n✅ Calibration complete!")
print(f"Best parameters: {best_params}")

## 5️⃣ Compare Before vs After

**Expected improvement** (indicative; actual gains depend on the data and the number of trials):
- RMSE reduction: 20-50%
- Better VFA tracking (critical for stability)
- Better biogas prediction
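
The percentage reduction reported in the next cell is computed as $(\text{RMSE}_{initial} - \text{RMSE}_{final}) / \text{RMSE}_{initial} \times 100$. A small worked example with made-up numbers (the helper names are hypothetical, not part of OpenAD-lib):

```python
import math

def rmse(measured, simulated):
    """Root-mean-square error between two equally long sequences."""
    n = len(measured)
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(measured, simulated)) / n)

def pct_reduction(initial, final):
    """Relative RMSE reduction in percent; negative means the fit got worse."""
    return (initial - final) / initial * 100.0

# Made-up data: the "initial" model is off by 2 everywhere, the "calibrated" one by 1
measured = [1.0, 2.0, 3.0]
before = [3.0, 4.0, 5.0]
after = [2.0, 3.0, 4.0]

rmse_before = rmse(measured, before)
rmse_after = rmse(measured, after)
reduction = pct_reduction(rmse_before, rmse_after)

print(f"RMSE: {rmse_before:.2f} -> {rmse_after:.2f} (Reduction: {reduction:.1f}%)")
```

Because the weights favor S2 and Q, an individual output such as S1 can show a small or even negative reduction while the overall objective still improves.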

In [None]:
# Run with calibrated parameters
print("Running simulation with calibrated parameters...")
final_results = model.run(verbose=False)
final_metrics = model.evaluate()

# Compare metrics
print("\n📊 Calibration Improvement:")
print("=" * 60)
for var in initial_metrics:
    initial_rmse = initial_metrics[var]['RMSE']
    final_rmse = final_metrics[var]['RMSE']
    improvement = initial_rmse - final_rmse
    pct = (improvement / initial_rmse) * 100
    print(f"{var} RMSE: {initial_rmse:.4f} → {final_rmse:.4f} (Reduction: {pct:.1f}%)")
print("\n✅ Metrics should match examples/03_am2_calibration.py")

In [None]:
# Plot comparison (same layout as examples/03_am2_calibration.py)
plt.style.use('bmh')
fig, axes = plt.subplots(3, 1, figsize=(12, 10), sharex=True)

variables = ['S1', 'S2', 'Q']
labels = ['COD (S1)', 'VFA (S2)', 'Biogas (Q)']
time = final_results['time']

for i, var in enumerate(variables):
    ax = axes[i]
    
    # Measured data
    if f'{var}_measured' in final_results.columns:
        valid = ~final_results[f'{var}_measured'].isna()
        ax.plot(time[valid], final_results[f'{var}_measured'][valid], 
                'o', color='#2E86C1', markersize=6, label='Measured', alpha=0.7)
    
    # Initial model
    ax.plot(time, initial_results[var], '--', color='gray', 
            linewidth=2, label='Initial Model', alpha=0.7)
    
    # Calibrated model
    ax.plot(time, final_results[var], '-', color='#27AE60', 
            linewidth=2, label='Calibrated Model')
    
    ax.set_ylabel(labels[i], fontsize=14, fontweight='bold')
    ax.set_title(f'{labels[i]} Calibration Comparison', fontsize=16, pad=20)
    ax.legend(fontsize=12, frameon=True, facecolor='white', edgecolor='gray')
    ax.grid(True, linestyle='--', alpha=0.7)

axes[-1].set_xlabel('Time (days)', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

## 📝 Summary

This notebook demonstrated:

1. **AM2 Model** - Simplified 4-state AD model
2. **Simulation** - Running with default parameters
3. **Calibration** - Optuna-based parameter optimization
4. **Comparison** - Before/after fit on the calibration data

### 🎯 When to Calibrate?

✅ **Calibrate when:**
- New reactor/feedstock
- RMSE >20% with default parameters
- VFA predictions poor (stability critical!)

❌ **Don't calibrate when:**
- Limited data (<50 points)
- Default parameters already good (RMSE <10%)
- Just exploring scenarios

### 📚 Model Selection Guide

| Use Case | ADM1 | AM2 | LSTM/MTGP |
|----------|------|-----|------------|
| **Process understanding** | ✅ | ❌ | ❌ |
| **Fast simulation** | ❌ | ✅ | ✅ |
| **Control/MPC** | ❌ | ✅ | ✅ |
| **Uncertainty** | ❌ | ❌ | ✅ MTGP |
| **Limited data** | ❌ | ✅ | ✅ MTGP |
| **Temporal patterns** | ❌ | ❌ | ✅ LSTM |

### Next Steps

- Apply to [MPC Control](05_MPC_Control_Full.ipynb)
- Compare with [ADM1](01_ADM1_Tutorial_Full.ipynb) full model
- Use [MTGP](04_MTGP_Prediction_Full.ipynb) for uncertainty