# 📘 ADM1 Complete Pipeline (Production Ready)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/benmola/OpenAD-lib/blob/main/notebooks/01_ADM1_Tutorial_Full.ipynb)

**Complete workflow:** ACoD → ADM1 Simulation → Validation

**⚠️ This notebook matches `examples/01_adm1_simulation.py` exactly**

---

## 📚 References
- **ADM1 Implementation**: [PyADM1](https://github.com/CaptainFerMag/PyADM1)
- **ADM1 Theory**: [Rosén & Jeppsson (2021)](https://www.biorxiv.org/content/biorxiv/early/2021/03/04/2021.03.03.433746.full.pdf)
- **ACoD Method**: [Astals et al. (2015)](https://pubmed.ncbi.nlm.nih.gov/27088248/)

## 🔬 ADM1 Background

### Model Complexity

ADM1 (Anaerobic Digestion Model No. 1) is the **most detailed** of the AD models covered in this tutorial series:
- **35+ state variables**
- **19 biochemical processes**
- **Physicochemical reactions** (pH, gas-liquid transfer)

### Biochemical Cascade

```
Complex Organics
       ↓ Disintegration
Carbs, Proteins, Lipids
       ↓ Hydrolysis  
Sugars, Amino Acids, LCFA
       ↓ Acidogenesis
VFAs (Acetate, Propionate, Butyrate)
       ↓ Acetogenesis
Acetate + H₂
       ↓ Methanogenesis
CH₄ + CO₂
```

### Key Kinetics

**Monod with Inhibition:**
$$\rho_j = k_{m,j} \cdot \frac{S_j}{K_{S,j} + S_j} \cdot X_j \cdot I_{pH} \cdot I_{NH_3}$$

**Gas-Liquid Transfer:**
$$\rho_{T,i} = k_{L}a \cdot (S_i - K_{H,i} \cdot p_{gas,i})$$
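The two rate expressions above can be written as small Python functions. This is a minimal sketch for intuition only: the function names and the numeric values are illustrative assumptions, not calibrated ADM1 parameters and not the OpenAD-lib API. The two inhibition factors are lumped into a single `inhibition` argument.

```python
def monod_rate(k_m, S, K_S, X, inhibition=1.0):
    """Uptake rate rho_j = k_m * S / (K_S + S) * X * I.

    `inhibition` stands for the product of inhibition factors (e.g. I_pH * I_NH3),
    each between 0 (fully inhibited) and 1 (no inhibition).
    """
    return k_m * S / (K_S + S) * X * inhibition


def gas_transfer_rate(kLa, S, K_H, p_gas):
    """Liquid-to-gas transfer rate rho_T = kLa * (S - K_H * p_gas)."""
    return kLa * (S - K_H * p_gas)


# Illustrative values only (hypothetical, chosen for easy arithmetic)
rho = monod_rate(k_m=8.0, S=0.2, K_S=0.2, X=1.0, inhibition=0.5)
rho_T = gas_transfer_rate(kLa=200.0, S=0.1, K_H=0.05, p_gas=1.0)
print(f"rho = {rho:.3f}, rho_T = {rho_T:.3f}")
```

When `S` equals `K_S`, the Monod term is exactly one half, which is a quick sanity check on any implementation.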

### Why Use ADM1?

✅ **Best for:**
- Process understanding (mechanistic insights)
- pH prediction
- VFA speciation (acetate vs propionate)
- Research and validation

❌ **Not ideal for:**
- Real-time control (slow simulation)
- Online optimization
- Limited data (35+ state variables and many kinetic parameters to calibrate)

## 1️⃣ Setup

In [None]:
# Install OpenAD-lib
# !pip install git+https://github.com/benmola/OpenAD-lib.git

import sys
import os

IN_COLAB = 'google.colab' in sys.modules

if not IN_COLAB:
    sys.path.append(os.path.join(os.getcwd(), '..', 'src'))

print(f"Running in Colab: {IN_COLAB}")

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error, r2_score

from openad_lib.preprocessing import acod
from openad_lib.models.mechanistic import ADM1Model

print("✅ Imports successful!")

## 2️⃣ Step 1: ACoD Preprocessing

### What is ACoD?

**Anaerobic Co-Digestion** characterization method:
- **Input:** Feedstock mixture ratios (tonnes/day of each substrate)
- **Output:** 35 ADM1 state variables (concentrations)

### Method ([Astals et al., 2015](https://pubmed.ncbi.nlm.nih.gov/27088248/))

1. **Feedstock database** - Each substrate has known composition:
   - Carbohydrate fraction (X_ch)
   - Protein fraction (X_pr)  
   - Lipid fraction (X_li)
   - Particulate vs soluble distribution

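2. **Mass balance** - Calculate weighted average:
   $$S_{su,in} = \sum_i \frac{m_i \cdot f_{carb,i}}{Q_{total}}$$
   where $m_i$ is the feed rate of substrate $i$, $f_{carb,i}$ its carbohydrate fraction, and $Q_{total}$ the total influent flow.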

3. **COD conversion** - Map to ADM1 units (kg COD/m³)
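The mass-balance step above can be sketched in a few lines of NumPy. The feed rates, carbohydrate contents, and units below are made-up assumptions for illustration. They are not taken from the OpenAD-lib feedstock database, and `acod.generate_influent_data` may compute this differently.

```python
import numpy as np

# Hypothetical inputs (assumed units):
#   m      feed rate of each substrate, tonnes/day
#   f_carb carbohydrate content of each substrate, kg COD/tonne
#   Q      total influent flow, m³/day
m = np.array([10.0, 2.0])
f_carb = np.array([200.0, 100.0])
Q_total = 12.0

# Flow-weighted average: sum_i(m_i * f_carb_i) / Q_total  ->  kg COD/m³
conc_in = np.sum(m * f_carb) / Q_total
print(f"Influent carbohydrate concentration: {conc_in:.2f} kg COD/m³")
```

The same pattern would repeat for the protein and lipid fractions, giving one influent concentration per ADM1 input variable.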

### Input File Format

```csv
time, Maize, Chicken_Litter, Wholecrop, ...
1,    10.5,  2.3,             5.1,       ...
2,    12.1,  2.5,             4.8,       ...
```
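If a file is padded with spaces after the commas, as in the layout above, pandas can strip them while reading. The sketch below uses an in-memory sample with three substrates. Whether the real `Feed_Data.csv` is padded this way, and whether `acod.generate_influent_data` needs this handling, is not something the notebook states.

```python
import io

import pandas as pd

# In-memory stand-in for a feedstock ratios file (values copied from the layout above)
sample = """time, Maize, Chicken_Litter, Wholecrop
1,    10.5,  2.3,             5.1
2,    12.1,  2.5,             4.8
"""

# skipinitialspace=True drops the spaces that follow each comma,
# so column names come out as 'Maize' rather than ' Maize'
ratios = pd.read_csv(io.StringIO(sample), skipinitialspace=True)

# Total feed per day across all substrates (tonnes/day)
total_feed = ratios.drop(columns="time").sum(axis=1)
print(ratios.columns.tolist())
print(total_feed.tolist())
```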

In [None]:
# Download data for Colab
if IN_COLAB:
    !wget -q https://raw.githubusercontent.com/benmola/OpenAD-lib/main/src/openad_lib/data/feedstock/Feed_Data.csv
    !wget -q https://raw.githubusercontent.com/benmola/OpenAD-lib/main/src/openad_lib/data/Biogas_Plant_Outputs.csv
    ratios_file = 'Feed_Data.csv'
    measured_file = 'Biogas_Plant_Outputs.csv'
else:
    base_path = os.path.dirname(os.getcwd())
    ratios_file = os.path.join(base_path, 'src', 'openad_lib', 'data', 'feedstock', 'Feed_Data.csv')
    measured_file = os.path.join(base_path, 'src', 'openad_lib', 'data', 'Biogas_Plant_Outputs.csv')

print("✅ Data paths configured")

In [None]:
print("🔄 Running ACoD preprocessing...\n")
influent_df = acod.generate_influent_data(ratios_file)

print("✅ ACoD complete!")
print(f"Generated {influent_df.shape[0]} time points")
print(f"With {influent_df.shape[1]} ADM1 state variables\n")

# Show first few state variables
print("Sample columns:", list(influent_df.columns[:10]))
influent_df.head()

## 3️⃣ Step 2: ADM1 Simulation

### ODE System

For each state $S_i$:

$$\frac{dS_i}{dt} = \frac{Q_{in}}{V_{liq}}(S_{i,in} - S_i) + \sum_j \nu_{i,j} \cdot \rho_j$$

Where:
- $Q_{in}/V_{liq}$ = dilution rate (washout)
- $\nu_{i,j}$ = stoichiometric coefficients
- $\rho_j$ = process rates (Monod kinetics)
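The balance above maps directly onto a matrix-vector product. The following is a minimal sketch with two states (substrate and biomass) and one growth process. The function name, the parameter values, and the yield `Y` are illustrative assumptions, not part of `ADM1Model`.

```python
import numpy as np


def cstr_rhs(S, S_in, D, nu, rho):
    """dS/dt = D * (S_in - S) + nu @ rho.

    nu has shape (n_states, n_processes), matching the index order nu_{i,j};
    rho has shape (n_processes,).
    """
    return D * (S_in - S) + nu @ rho


# Hypothetical two-state example: S[0] = substrate, S[1] = biomass
S = np.array([1.0, 0.5])
S_in = np.array([5.0, 0.0])
D = 0.1                       # dilution rate Q_in / V_liq, 1/day
Y = 0.1                       # biomass yield (assumed)
nu = np.array([[-1.0 / Y],    # substrate consumed per unit of process rate
               [1.0]])        # biomass produced per unit of process rate

# Single Monod growth process: mu_max * S / (K_S + S) * X
mu_max, K_S = 0.4, 1.0
rho = np.array([mu_max * S[0] / (K_S + S[0]) * S[1]])

dSdt = cstr_rhs(S, S_in, D, nu, rho)
print(dSdt)
```

ADM1 has the same structure, only with a much larger stoichiometric matrix and additional algebraic terms for pH and gas transfer.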

### Numerical Integration

- **Solver:** BDF (Backward Differentiation Formula)
- **Tolerances:** rtol=1e-5, atol=1e-6
- **Adaptive stepping** for stiff equations
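To see these solver settings in action on a small problem, the sketch below integrates a two-state chemostat with SciPy's `solve_ivp` using `method="BDF"` and the tolerances listed above. The chemostat model and its parameters are illustrative assumptions. Whether `ADM1Model` calls `solve_ivp` internally is not stated in this notebook.

```python
import numpy as np
from scipy.integrate import solve_ivp


def chemostat(t, y, D=0.1, S_in=5.0, mu_max=0.4, K_S=1.0, Y=0.1):
    """Toy CSTR: substrate S consumed by biomass X with Monod kinetics."""
    S, X = y
    rho = mu_max * S / (K_S + S) * X
    dS = D * (S_in - S) - rho / Y
    dX = -D * X + rho
    return [dS, dX]


sol = solve_ivp(
    chemostat,
    t_span=(0.0, 150.0),   # days
    y0=[1.0, 0.5],
    method="BDF",          # implicit multistep method suited to stiff systems
    rtol=1e-5,
    atol=1e-6,
)

S_end, X_end = sol.y[:, -1]
print(f"success={sol.success}, steps={sol.t.size}, S={S_end:.4f}, X={X_end:.4f}")
```

For this toy system the analytical steady state follows from setting the growth rate equal to `D`, which gives `S = 1/3` and `X = Y * (S_in - S)`, so the final values can be checked by hand.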

### Computational Cost

Approximate wall-clock times for a 150-day simulation (hardware-dependent):
- **ADM1:** ~2-5 minutes (35 ODEs)
- **AM2:** ~1-2 seconds (4 ODEs)
- **LSTM:** ~0.1 seconds (forward pass)
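Because these timings depend on hardware, it can be worth measuring them locally. A minimal sketch of a timing helper follows. `timed` is a hypothetical name introduced here, not an OpenAD-lib function.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload; any callable works the same way
result, seconds = timed(sum, range(1_000_000))
print(f"Elapsed: {seconds:.4f} s")
```

In this notebook the same helper could wrap the simulation call, for example `timed(model.simulate, influent_df)`.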

In [None]:
# Initialize ADM1
print("Initializing ADM1 model...")
model = ADM1Model()

print(f"  Liquid volume: {model.V_liq} m³")
print(f"  Gas volume: {model.V_gas} m³")
print(f"  Temperature: {model.T_op - 273.15:.1f}°C\n")

# Run simulation
print("🚀 Starting ADM1 simulation...")
simulation_output = model.simulate(influent_df)

# Extract results
df_res = simulation_output['results']
df_qgas = simulation_output['q_gas']

print("\n✅ Simulation complete!")
print(f"   Simulated {len(df_res)} time points")
print(f"   Mean biogas: {df_qgas['q_gas'].mean():.2f} m³/day")

## 4️⃣ Step 3: Validation Against Measured Data

**Comparing:**
- Simulated biogas (from ADM1)
- Measured biogas (from real plant)

**Metrics:**
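- **RMSE:** Root-mean-square prediction error, in the same units as the biogas flow (m³/day)
- **R²:** Fraction of variance in the measured data explained by the simulation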
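For reference, both metrics can be computed directly with NumPy. The sketch below uses made-up measured and simulated values. The definitions follow the standard formulas, which is also what `mean_squared_error` and `r2_score` from scikit-learn compute.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square error: sqrt(mean((y_true - y_pred)^2))."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Made-up example values (m³/day)
measured_demo = [100.0, 110.0, 120.0, 130.0]
simulated_demo = [102.0, 108.0, 123.0, 129.0]

print(f"RMSE = {rmse(measured_demo, simulated_demo):.3f}")
print(f"R²   = {r2(measured_demo, simulated_demo):.3f}")
```

Note that R² can be negative when the simulation fits worse than simply predicting the mean of the measured data.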

In [None]:
# Load measured data
biogas_data = pd.read_csv(measured_file)
print(f"📊 Loaded {len(biogas_data)} days of measured data")
print(f"   Mean measured biogas: {biogas_data['Biogas (m3/day)'].mean():.2f} m³/day")

In [None]:
# Calculate metrics. Truncating to the shorter series assumes both
# start on the same day and share the same (daily) time step.
common_len = min(len(df_qgas), len(biogas_data))
simulated = df_qgas['q_gas'].values[:common_len]
measured = biogas_data['Biogas (m3/day)'].values[:common_len]

rmse = np.sqrt(mean_squared_error(measured, simulated))
r2 = r2_score(measured, simulated)

print("📊 Validation Metrics:")
print("=" * 50)
print(f"  RMSE: {rmse:.2f} m³/day")
print(f"  R²:   {r2:.3f}")
print(f"  Mean Error: {(simulated - measured).mean():.2f} m³/day")
print("\n✅ These metrics should match examples/01_adm1_simulation.py")
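The cell above aligns the two series by truncating to the shorter length. If the simulated and measured series do not start on the same day, joining on the `time` column is a safer alternative. The sketch below uses small made-up frames with the same column names as this notebook and assumes the `time` values match exactly in both.

```python
import pandas as pd

# Made-up stand-ins for df_qgas and biogas_data
sim_demo = pd.DataFrame({"time": [1, 2, 3, 4], "q_gas": [95.0, 101.0, 108.0, 112.0]})
meas_demo = pd.DataFrame({"time": [2, 3, 4, 5], "Biogas (m3/day)": [100.0, 110.0, 111.0, 120.0]})

# Inner join keeps only the days present in both series
aligned = sim_demo.merge(meas_demo, on="time", how="inner")

simulated_aligned = aligned["q_gas"].to_numpy()
measured_aligned = aligned["Biogas (m3/day)"].to_numpy()
print(aligned)
```

If the solver returns non-integer time stamps, an exact join will drop rows. In that case `pd.merge_asof` with a tolerance is one option for nearest-time matching.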

In [None]:
# Plot comparison (MATCH example layout)
plt.style.use('bmh')
fig, ax = plt.subplots(figsize=(14, 7))

# Measured data (dashed line)
ax.plot(biogas_data['time'], biogas_data['Biogas (m3/day)'], 
        label='Measured Data', 
        linestyle='--', 
        color='#2E86C1', 
        linewidth=3, 
        alpha=0.8)

# Simulated data (solid line)
ax.plot(df_qgas['time'], df_qgas['q_gas'], 
        label='ADM1 Prediction', 
        linestyle='-', 
        color='#E67E22', 
        linewidth=2)

ax.legend(fontsize=12, frameon=True, facecolor='white', edgecolor='gray')
ax.set_xlabel('Time (days)', fontsize=14, fontweight='bold')
ax.set_ylabel('Biogas Production Rate (m³/day)', fontsize=14, fontweight='bold')
ax.set_title(f'ADM1 Validation (RMSE={rmse:.2f}, R²={r2:.3f})', fontsize=16, pad=20)
ax.grid(True, linestyle='--', alpha=0.7)
plt.tight_layout()
plt.show()

## 5️⃣ Bonus: VFA Speciation

**ADM1's unique capability:** Predict individual VFA species

**Why does this matter?**
- **Propionate accumulation** → early warning of instability
- **Acetate/Propionate ratio** → process health indicator
- Among the models in this series, only ADM1 resolves individual VFA species (AM2 tracks total VFA only)

In [None]:
fig, ax = plt.subplots(figsize=(14, 6))

# Plot individual VFAs
ax.plot(df_res['time'], df_res['S_ac'], label='Acetate', linewidth=2, color='#27AE60')
ax.plot(df_res['time'], df_res['S_pro'], label='Propionate', linewidth=2, color='#E74C3C')
ax.plot(df_res['time'], df_res['S_bu'], label='Butyrate', linewidth=2, color='#F39C12')
ax.plot(df_res['time'], df_res['S_va'], label='Valerate', linewidth=2, color='#9B59B6')

ax.set_xlabel('Time (days)', fontsize=14, fontweight='bold')
ax.set_ylabel('Concentration (kg COD/m³)', fontsize=14, fontweight='bold')
ax.set_title('VFA Speciation (ADM1 Only)', fontsize=16, pad=20)
ax.legend(fontsize=12)
ax.grid(True, linestyle='--', alpha=0.7)
plt.tight_layout()
plt.show()

# Stability check on the final simulated state
AC_PRO_THRESHOLD = 1.4  # heuristic threshold used in this notebook
final_ac = df_res['S_ac'].iloc[-1]
final_pro = df_res['S_pro'].iloc[-1]
# Guard against division by zero when no propionate remains
ratio = final_ac / final_pro if final_pro > 0 else float('inf')

print("\n🔍 Process Stability Check:")
print(f"   Acetate: {final_ac:.4f} kg COD/m³")
print(f"   Propionate: {final_pro:.4f} kg COD/m³")
print(f"   Ac/Pro Ratio: {ratio:.2f}")

if ratio > AC_PRO_THRESHOLD:
    print(f"   ✅ Process stable (Ac/Pro > {AC_PRO_THRESHOLD})")
else:
    print(f"   ⚠️ Warning: Propionate accumulation (Ac/Pro ≤ {AC_PRO_THRESHOLD})")
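The check above looks only at the final time point. To see when the ratio dips during the run, the same test can be applied to the whole trajectory. The sketch below uses a small made-up frame with the `time`, `S_ac`, and `S_pro` column names from this notebook and reuses the 1.4 threshold from the cell above.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for df_res (kg COD/m³)
demo = pd.DataFrame({
    "time": [0, 1, 2, 3],
    "S_ac": [0.10, 0.12, 0.09, 0.08],
    "S_pro": [0.02, 0.05, 0.09, 0.00],
})

# Mask zero propionate before dividing, then treat those rows as an infinite ratio
ratio_ts = demo["S_ac"] / demo["S_pro"].where(demo["S_pro"] > 0)
ratio_ts = ratio_ts.fillna(np.inf)

# Days on which the ratio is at or below the notebook's threshold
flagged_days = demo.loc[ratio_ts <= 1.4, "time"].tolist()
print("Days with Ac/Pro at or below 1.4:", flagged_days)
```

With the real results, replacing `demo` by `df_res` gives a list of simulated days that would trigger the warning.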

## 📝 Summary

This notebook demonstrated:

1. **ACoD Preprocessing** - Feedstock ratios → ADM1 states
2. **ADM1 Simulation** - Full 35-state ODE system
3. **Validation** - Comparison with measured biogas
4. **VFA Speciation** - Process stability monitoring

### 🎯 ADM1 vs Simpler Models

| Capability | ADM1 | AM2 | LSTM/MTGP |
|------------|------|-----|------------|
| **pH prediction** | ✅ | ❌ | ❌ |
| **VFA species** | ✅ | ❌ Total only | ❌ |
| **Inhibition effects** | ✅ pH, NH₃, H₂ | ✅ VFA only | ❌ |
| **Simulation time (150 days)** | ~3 min | ~2 sec | ~0.1 sec |
| **Parameter count** | 35+ | 10 | Varies |
| **Mechanistic insights** | ✅ | ⚠️ Limited | ❌ |

### 📚 When to Use Each Model?

**Use ADM1 when:**
- Research & validation studies
- pH critical for your process
- Need VFA speciation
- Understanding failure modes

**Use AM2 when:**
- Control applications (MPC)
- Fast repeated simulations
- pH relatively stable

**Use LSTM/MTGP when:**
- Data-driven approach
- Uncertainty quantification (MTGP)
- Real-time prediction
- No mechanistic model needed

### Next Steps

- Try [AM2 simplified model](02_AM2_Modelling_Full.ipynb)
- Apply [MPC Control](05_MPC_Control_Full.ipynb)
- Compare with [LSTM](03_LSTM_Prediction_Full.ipynb) data-driven approach