# Notebook 6.1: Introduction to Data-Driven Models in MPC – Learning from Experience

Welcome to Part 6 of our Interactive MPC series! In the previous parts, particularly the bioreactor case studies, we relied on **first-principles models** (based on known physical, chemical, and biological laws) for our NMPC controllers. However, developing accurate first-principles models can be challenging, time-consuming, or even impossible for highly complex systems or when the underlying mechanisms are poorly understood.

This is where **data-driven modeling** comes into play. By leveraging historical or experimentally generated input-output data, we can train machine learning models to learn the system dynamics. These learned models can then serve as the predictive engine within an MPC framework.

**Goals of this Notebook:**
1.  Understand the motivation for using data-driven models in MPC.
2.  Get an overview of the data-driven modeling techniques we will explore in subsequent notebooks:
    *   Artificial Neural Networks (ANNs)
    *   Physics-Informed Neural Networks (PINNs)
    *   Gaussian Processes (GPs)
3.  Outline the general workflow for developing and using data-driven models for MPC.
4.  Guide you through setting up or verifying the necessary machine learning libraries (e.g., TensorFlow/Keras or PyTorch, GPy/GPflow) in your Python environment.
5.  Discuss conceptual examples where data-driven MPC could be beneficial.

## 1. Why Data-Driven Models for MPC?

While first-principles models offer deep process insight, their development faces several hurdles:
*   **Complexity:** Real-world systems (especially biological ones like bioreactors, or large industrial plants) can have incredibly complex, interacting dynamics.
*   **Lack of Knowledge:** The fundamental mechanisms might not be fully understood, or key parameters might be unknown and hard to measure.
*   **Time and Effort:** Deriving, implementing, and validating detailed mechanistic models can be a significant engineering effort.
*   **Adaptation:** First-principles models might not easily adapt to changes in the process (e.g., new raw materials, equipment wear, biological evolution) without re-derivation or re-parameterization.

**Data-driven models offer an alternative when:**
*   Sufficient historical or experimental data is available.
*   The underlying physics are too complex or unknown.
*   A rapid model development cycle is needed.
*   The system exhibits behaviors that are difficult to capture with simple mechanistic equations (e.g., subtle correlations, "soft" phenomena).

By learning directly from data, these models can potentially capture intricate system dynamics without explicit prior knowledge of all governing equations. When integrated into an MPC framework (often called **Learning-based MPC**), they enable predictive control for systems that were previously too challenging to model mechanistically.

## 2. Overview of Data-Driven Models to be Explored

In the upcoming notebooks (6.2, 6.3, 6.4), we will focus on three prominent data-driven modeling techniques for MPC:

1.  **Artificial Neural Networks (ANNs) - Notebook 6.2:**
    *   Highly flexible function approximators capable of learning complex nonlinear input-output mappings.
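    *   Feedforward NNs (FNNs/MLPs) can model dynamics as a one-step map from the current (and possibly lagged) states and inputs to the next state, while Recurrent NNs (RNNs, LSTMs, GRUs) carry an internal state across time steps.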
    *   **Pros:** Universal approximation capabilities.
    *   **Cons:** Can be data-hungry, "black-box" nature, prone to overfitting, extrapolation can be unreliable, no inherent uncertainty quantification.

2.  **Physics-Informed Neural Networks (PINNs) - Notebook 6.3:**
    *   A hybrid approach that embeds known physical laws (ODEs/PDEs) into the training process of an ANN.
    *   The ANN learns to satisfy both the data and the physical constraints.
    *   **Pros:** Improved generalization from less data, more physically plausible predictions, can aid in parameter discovery.
    *   **Cons:** Requires knowledge of the governing equations (even if some parameters are unknown), training can be more complex.

3.  **Gaussian Processes (GPs) - Notebook 6.4:**
    *   A non-parametric, Bayesian approach that defines a distribution over functions.
    *   Learns a mean function and a covariance function (kernel) that describes the similarity between data points.
    *   **Pros:** Principled uncertainty quantification (provides predictive mean and variance), good with small datasets, allows incorporation of prior knowledge via kernels.
    *   **Cons:** Computationally intensive for large datasets (exact inference scales as $O(N^3)$ in the number of training points $N$), choice of kernel can be crucial, can struggle with very high-dimensional inputs.

Each of these methods has its strengths and weaknesses, making them suitable for different types of problems and data availability.
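To make the "predictive mean and variance" property of GPs concrete, here is a minimal NumPy sketch of exact GP regression with a zero prior mean and a squared-exponential kernel. The function names (`rbf_kernel`, `gp_posterior`), the toy data, and the fixed hyperparameters are assumptions made for illustration only; the libraries used in Notebook 6.4 would also fit the hyperparameters to data.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel for 1-D inputs (hyperparameters are fixed, not fitted)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-4, signal_var=1.0):
    """Exact GP regression with zero prior mean: predictive mean and variance."""
    K = rbf_kernel(x_train, x_train, signal_var=signal_var)
    K += noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test, signal_var=signal_var)
    L = np.linalg.cholesky(K)  # this factorization is the O(N^3) step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    var = signal_var - np.sum(v**2, axis=0)  # k(x, x) = signal_var for this kernel
    return mean, var

# Toy data: a sine wave sampled on [0, 4].
x_train = np.linspace(0.0, 4.0, 9)
y_train = np.sin(x_train)

# Query at a training point, between training points, and far outside the data.
x_test = np.array([2.0, 2.25, 8.0])
mean, var = gp_posterior(x_train, y_train, x_test)

for xq, m, s2 in zip(x_test, mean, var):
    print(f"x = {xq:5.2f}  mean = {m:+.3f}  variance = {s2:.4f}")
```

The variance stays small near the training data and approaches the prior variance far from it, which is the signal an MPC could use to avoid trusting the model where it has seen no data.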

## 3. General Workflow for Data-Driven MPC

Regardless of the specific data-driven modeling technique chosen, the general workflow for developing and using it in MPC typically involves these steps:

1.  **Data Collection & Preprocessing:**
    *   Gather representative input-output data from the system (e.g., $u(t), y(t)$ or $x_k, u_k \rightarrow x_{k+1}$). This data should cover the expected operating range and be sufficiently exciting.
    *   Clean the data (remove outliers, handle missing values).
    *   Normalize or scale the data.
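    *   Split into training, validation, and test sets. For time-series data, split by experiment or contiguous time block rather than by shuffling individual samples, so that the test set is truly unseen.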

2.  **Model Structure Selection & Training:**
    *   Choose an appropriate model type (ANN, PINN, GP) and its specific architecture (e.g., number of layers/neurons for ANN, kernel type for GP).
    *   Train the model using the training dataset to learn the system dynamics (e.g., by minimizing a loss function that compares model predictions to actual data, and for PINNs, also includes physics residuals).
    *   Use the validation set to tune hyperparameters and prevent overfitting.

3.  **Model Validation:**
    *   Evaluate the trained model's performance on the unseen test set.
    *   Assess its ability to predict multi-step ahead (simulation capability), not just one-step ahead.
    *   Check for physical plausibility where possible.

4.  **Integration into MPC Framework:**
    *   Use the trained data-driven model as the predictive engine within an NMPC controller.
    *   This typically involves formulating an NLP where the model $f_{DDM}(x_k, u_k)$ is called repeatedly for prediction.
    *   Gradients of the data-driven model with respect to inputs $u_k$ (and potentially states $x_k$) are often needed by the NLP solver. This is where Automatic Differentiation (AD) capabilities of ML libraries become essential.

5.  **Closed-Loop Simulation & Testing:**
    *   Simulate the MPC controller with the learned model controlling a (simulated or real) plant.
    *   Evaluate performance, robustness, and constraint handling.

6.  **(Optional) Online Adaptation/Learning:**
    *   For some systems, the data-driven model might be updated or re-trained online as new data becomes available.
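
Steps 1 to 3 of this workflow can be sketched in a few lines of NumPy. Everything here is a stand-in chosen for brevity: `plant` is a hypothetical system used only to generate data, and a least-squares fit on hand-picked features takes the place of the ANN, PINN, or GP that later notebooks train. The point is the structure: contiguous splits, scaling with training statistics only, and checking multi-step rollouts in addition to one-step error.

```python
import numpy as np

rng = np.random.default_rng(0)

def plant(x, u):
    """Hypothetical 'true' system, used here only to generate data."""
    return 0.9 * x + 0.2 * u - 0.05 * x**3

# Step 1: collect data by exciting the system with random inputs.
n_steps = 600
u = rng.uniform(-1.0, 1.0, n_steps)
x = np.zeros(n_steps + 1)
for k in range(n_steps):
    x[k + 1] = plant(x[k], u[k]) + 0.001 * rng.standard_normal()

def features(x, u):
    """Regressors of the stand-in model; an ANN or GP would replace this."""
    return np.column_stack([x, u, x**3])

# Contiguous split: 400 training steps, 100 validation steps, 100 test steps.
# The validation block would be used for hyperparameter tuning; this toy
# model has no hyperparameters, so it is left unused.
n_train, n_val = 400, 100
Phi_train = features(x[:n_train], u[:n_train])
y_train = x[1 : n_train + 1]

# Scale with statistics from the training block only.
mu, sigma = Phi_train.mean(axis=0), Phi_train.std(axis=0)
y_mu = y_train.mean()

# Step 2: "train" by least squares on the scaled training data.
theta, *_ = np.linalg.lstsq((Phi_train - mu) / sigma, y_train - y_mu, rcond=None)

def model(x, u):
    """One-step prediction x_next = f(x, u) of the fitted model."""
    phi = (features(np.atleast_1d(x), np.atleast_1d(u)) - mu) / sigma
    return (phi @ theta + y_mu)[0]

# Step 3: validate on the unseen test block.
start, stop = n_train + n_val, n_steps
x_true = x[start + 1 : stop + 1]

# One-step-ahead: every prediction starts from the measured state.
one_step = np.array([model(x[k], u[k]) for k in range(start, stop)])
one_step_rmse = np.sqrt(np.mean((one_step - x_true) ** 2))

# Multi-step: the model is fed its own predictions, as it would be inside MPC.
x_sim = np.zeros(stop - start + 1)
x_sim[0] = x[start]
for i, k in enumerate(range(start, stop)):
    x_sim[i + 1] = model(x_sim[i], u[k])
multi_step_rmse = np.sqrt(np.mean((x_sim[1:] - x_true) ** 2))

print(f"one-step RMSE:   {one_step_rmse:.4f}")
print(f"multi-step RMSE: {multi_step_rmse:.4f}")
```

Here the model structure matches the plant, so both errors stay close to the noise level. With a misspecified or overfitted model, the multi-step error is typically the first place the problem shows up.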

## 4. Setting Up Machine Learning Libraries

For the upcoming notebooks, we will need specific machine learning libraries. We already installed PyTorch in Notebook 0.0. Let's ensure it's correctly configured and also discuss other potential libraries.

Make sure your virtual environment from Notebook 0.0 (e.g., `.venv`) is activated before running installation commands.

### 4.1 PyTorch (Already Installed)

PyTorch will be used for the ANN and potentially the PINN examples because of its strong support for automatic differentiation (Autograd) and dynamic computation graphs.

Let's verify it again.

```python
import torch

print(f"PyTorch version: {torch.__version__}")
if torch.cuda.is_available():
    print(f"PyTorch CUDA is available. Device: {torch.cuda.get_device_name(0)}")
else:
    print("PyTorch CUDA not available, will use CPU.")
```
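
Because the NLP solver will later need gradients of the learned model with respect to states and inputs, a quick Autograd check can be worthwhile at this point. The sketch below uses `torch.autograd.functional.jacobian` on a small untrained network. The network size, the name `f_ddm`, and the operating point are assumptions for illustration, and the guarded import simply lets the cell run even if PyTorch is missing.

```python
try:
    import torch
except ImportError:  # lets the cell run even without PyTorch
    torch = None

def model_jacobians(f, x, u):
    """Return (df/dx, df/du) of a one-step model x_next = f(x, u) via Autograd."""
    return torch.autograd.functional.jacobian(f, (x, u))

if torch is not None:
    torch.manual_seed(0)
    # Hypothetical stand-in for a trained model: 2 states + 1 input -> 2 next states.
    net = torch.nn.Sequential(
        torch.nn.Linear(3, 8),
        torch.nn.Tanh(),
        torch.nn.Linear(8, 2),
    )

    def f_ddm(x, u):
        """One-step prediction of the (untrained) network."""
        return net(torch.cat([x, u]))

    x0 = torch.tensor([0.5, -0.2])  # arbitrary operating point
    u0 = torch.tensor([0.1])
    A, B = model_jacobians(f_ddm, x0, u0)
    print("df/dx shape:", tuple(A.shape))
    print("df/du shape:", tuple(B.shape))
else:
    A = B = None
    print("PyTorch not found; skipping the Autograd check.")
```

The two Jacobians play the role of the local linearization matrices of the learned model at the chosen operating point.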

### 4.2 TensorFlow with Keras (Alternative for ANNs/PINNs)

TensorFlow is another major deep learning framework. Keras is a high-level API that can run on top of TensorFlow (and other backends).

If you prefer to use TensorFlow/Keras, you can install it:
```bash
# Make sure .venv is activated
uv pip install tensorflow # For CPU
# For GPU support with TensorFlow, installation is more involved and system-dependent.
# Refer to: https://www.tensorflow.org/install/pip
```
Our examples will primarily use PyTorch for consistency, but the concepts are transferable.

### 4.3 Libraries for Gaussian Processes (GPs)

For Notebook 6.4 on GP-MPC, we'll need a library for Gaussian Processes. Popular choices include:

*   **GPy:** A well-established GP framework in Python built on NumPy/SciPy.
*   **GPflow:** A GP library built on TensorFlow, allowing for more complex models and leveraging TensorFlow's AD and GPU capabilities.
*   **Scikit-learn:** Includes a basic `GaussianProcessRegressor` which can be good for simple cases.

Let's install GPy, GPflow, and scikit-learn for flexibility (note that GPflow depends on TensorFlow):
```bash
# Make sure .venv is activated
uv pip install gpy gpflow scikit-learn
```

```python
# Verify GP library installations (optional check)
try:
    import GPy
    print(f"GPy version: {GPy.__version__}")
except ImportError:
    print("GPy not found or installation issue.")

try:
    import gpflow
    print(f"GPflow version: {gpflow.__version__}")
except ImportError:
    print("GPflow not found or installation issue.")

# Guard scikit-learn the same way so a missing package does not abort the cell
try:
    import sklearn
    print(f"Scikit-learn version: {sklearn.__version__}")
except ImportError:
    print("Scikit-learn not found or installation issue.")
```

## 5. Conceptual Examples: Where Data-Driven MPC Shines

*   **Complex Chemical Reactions:** If reaction kinetics are unknown or involve many intermediate species, an ANN or GP could learn the relationship between inputs (temperature, catalyst concentration) and outputs (product yield, impurity levels) for use in MPC.
*   **Biological Systems (Beyond our simplified bioreactor):**
    *   Modeling subtle metabolic shifts in response to environmental cues that are hard to capture with fixed kinetic parameters.
    *   Predicting the impact of raw material variability (e.g., different lots of media components) on cell growth or product quality, if features of the raw materials can be used as inputs to the data-driven model.
    *   *Our Bioreactor Example:* If the specific productivity $q_P$ in our bioreactor model showed complex, time-varying behavior not well described by a simple Luedeking-Piret model, we might try to learn $q_P = f_{ANN}(X_v, S, L_{lac}, \text{time})$ from past batch data.
*   **Manufacturing Processes with Human-in-the-Loop or Unmodeled Wear:** Learning how operator adjustments or machine degradation (not easily modeled mechanistically) affect process outputs.
*   **Systems with Dominant Unmeasured Disturbances:** If the effect of unmeasured disturbances can be learned from past data (e.g., by correlating deviations from a nominal model with other measurable signals), a data-driven model might capture these disturbance dynamics for better rejection by MPC.

The key is that the MPC still performs its optimization over a future horizon, but its "crystal ball" (the prediction model) is now a learned function rather than one derived purely from first principles.
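
As a rough illustration of this idea, the sketch below closes the loop around a learned one-step model. Several simplifications are assumptions made for brevity: `plant` is a hypothetical scalar system, the "learned" model is a least-squares fit of $x_{k+1} \approx a x_k + b u_k$, and random shooting (sample many input sequences, keep the cheapest) is swapped in for the gradient-based NLP solver that an NMPC controller would normally use.

```python
import numpy as np

rng = np.random.default_rng(1)

def plant(x, u):
    """Hypothetical true system; the controller only sees data generated by it."""
    return 0.9 * x + 0.2 * u

# Learn a one-step model x_next ~ a*x + b*u from logged data (least squares).
x_log = rng.uniform(-2.0, 2.0, 200)
u_log = rng.uniform(-1.0, 1.0, 200)
regressors = np.column_stack([x_log, u_log])
a, b = np.linalg.lstsq(regressors, plant(x_log, u_log), rcond=None)[0]

def learned_model(x, u):
    """The 'crystal ball': predictions come from fitted parameters, not physics."""
    return a * x + b * u

def mpc_action(x0, setpoint, horizon=5, n_candidates=1000, u_max=1.0):
    """Random-shooting MPC: roll out sampled input sequences and keep the best."""
    U = rng.uniform(-u_max, u_max, (n_candidates, horizon))
    x = np.full(n_candidates, x0)
    cost = np.zeros(n_candidates)
    for k in range(horizon):
        x = learned_model(x, U[:, k])  # prediction step uses the learned model
        cost += (x - setpoint) ** 2 + 0.01 * U[:, k] ** 2
    return U[np.argmin(cost), 0]  # apply only the first move (receding horizon)

# Closed loop: the learned model plans, the plant responds.
setpoint = 1.0
x = 0.0
x_hist = [x]
for t in range(40):
    u = mpc_action(x, setpoint)
    x = plant(x, u)
    x_hist.append(x)

print(f"learned parameters: a = {a:.3f}, b = {b:.3f}")
print(f"mean state over last 10 steps: {np.mean(x_hist[-10:]):.3f}")
```

Random shooting respects the input bounds by construction but scales poorly with horizon length and input dimension, which is one reason gradient-based solvers are the usual choice.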

## 6. Key Takeaways

*   Data-driven models provide a powerful alternative or complement to first-principles models for MPC when mechanistic understanding is limited or system complexity is high.
*   Techniques like ANNs, PINNs, and GPs each offer unique strengths for learning system dynamics from data.
*   The general workflow involves data collection, model training, validation, and integration into the MPC's prediction step.
*   Leveraging appropriate machine learning libraries (like PyTorch, TensorFlow, GPy, GPflow) along with CasADi (for integrating with NLP solvers) is crucial for implementing data-driven MPC.

In the next notebook (**Notebook 6.2: MPC with Artificial Neural Network (ANN) Models**), we will dive into our first hands-on example of building an ANN model for a dynamic system and using it within an NMPC controller.