<img src="../Images/DSC_Logo.png" style="width: 400px;">

In [None]:
!pip install numpy scipy tensorflow

# Hybrid Modeling

Domain knowledge is often encoded in the form of physical laws or process-based models. Rather than replacing this knowledge, machine learning (ML) can be combined with it. The goal of hybrid modeling is to achieve the best of both worlds: models that obey known physical laws and remain interpretable, yet adapt to data and can learn unknown dynamics. In practice, this means coupling physical process equations with ML components in various ways. Hybrid models often show improved predictive performance over both standalone process-based models and standalone ML. Introducing ML incrementally into geoscientific tools, rather than switching completely to a black-box model, also facilitates adoption and trust, because the model’s core behavior can still be evaluated against known benchmarks and theory.

However, the trade-off with hybrid approaches is generally between enforcing known theory and allowing the network to discover new patterns. In addition, the chosen physics constraints or information must be correct and relevant. While these models do enforce known laws, the parts of the model that are learned (like a neural network (NN) representing an unknown functional relationship) can still be hard to interpret and may require post-hoc XAI methods to analyze (see Notebook 4). Furthermore, developing a robust hybrid model can be considerably more complex than working with either approach alone. 

This notebook provides a brief overview of hybrid modeling approaches, based on multiple comprehensive reviews that synthesized ongoing research and current use of ML and hybrid models in the geosciences (Reichstein et al. 2019; Irrgang et al. 2021; Shen et al. 2023; Zhao et al. 2024) and beyond (Cuomo et al. 2022; Meng et al. 2025). The table below lists the main categories of hybrid modeling approaches, classified by how ML and physical knowledge are combined, and the respective roles of each component. Across these use cases, hybrids can be weakly coupled (information flows mostly in one direction, or the ML and process-based components operate largely independently) or strongly coupled (dynamic two-way interaction).

| **Category**                                  | **Integration Strategy**                                               | **ML Role**       | **Physics Role**        |
|-----------------------------------------------|------------------------------------------------------------------------|-------------------|-------------------------|
| **1. Physics-informed ML**                    | Physics added via constraints in loss functions, architecture, or data | Primary model     | Constraints or priors   |
| **2. ML inside process-based models (PBMs)**  | ML components embedded inside a physics-based model                    | Sub-component     | Structural backbone     |
| **3. Surrogates/emulators**                   | ML trained to approximate outputs of a physical model                  | Replacement model | Source of training data |


## 1. Physics-informed or Physics-guided Machine Learning

One broad class of hybrid methods can be termed physics-informed or physics-guided ML. Here, the primary model is a NN, but it is augmented with physical knowledge to improve its realism and generalization. The motivation is **to impose scientific consistency on data-driven models**, thereby mitigating problems like overfitting, data scarcity, and unrealistic outputs. There are several techniques to integrate physics into an ML model, including:

- **Physics-based loss functions:** One common approach is adding penalty terms to the loss function of the network that represent deviations from known physical laws. This guides the learning process to satisfy conservation laws or other constraints. For example, in a deep learning model for lake temperatures, researchers included an energy conservation term in the loss (Read et al. 2019), so that the NN was trained not only to fit observed temperature data but also to minimize any violation of the lake heat budget. Daw et al. (2022) similarly penalized predictions that violate the physical relationship between water density and depth (example below).

- **Architecture design and hard constraints:** Another avenue is building physics into the structure of the NN itself. This could mean encoding known invariants or symmetries (like conservation of energy or mass) directly into the model’s architecture, or using custom layers that implement physical operators. With this approach, outputs respect fundamental constraints (mass balance, energy, symmetry, etc.) by construction.

- **Data augmentation with simulations:** Physics knowledge can also inform the training data itself. In many geoscience problems, real observations are limited or do not sample extreme events well. To address this, one can generate additional synthetic data from physics-based simulations or analytical solutions, and use these to train or pretrain the ML model. This approach was effectively used in the above lake temperature studies by pretraining the network on output from a process-based model.
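
The "hard constraint" idea from the list above can be sketched with a toy case that is not taken from the cited studies. Suppose a model must split an incoming water flux into three components that sum exactly to the input. Passing the raw outputs through a softmax and scaling by the input guarantees this mass balance by construction, whatever the network weights are. The function name `partition_flux`, the three-way split, and all numbers are illustrative assumptions.

```python
import numpy as np

def partition_flux(raw_outputs, total_input):
    """Map unconstrained outputs to non-negative fluxes that sum to total_input."""
    raw_outputs = np.asarray(raw_outputs, dtype=float)
    # Softmax turns arbitrary real numbers into fractions in (0, 1) that sum to 1
    shifted = raw_outputs - raw_outputs.max(axis=-1, keepdims=True)  # numerical stability
    fractions = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)
    # Scaling the fractions by the input enforces the mass balance exactly
    return fractions * np.asarray(total_input, dtype=float)[..., None]

raw = np.array([[2.0, -1.0, 0.5],   # stand-in for the raw outputs of a network
                [0.0, 0.0, 0.0]])
precip = np.array([10.0, 4.0])      # hypothetical incoming flux per sample

fluxes = partition_flux(raw, precip)
print(fluxes)
print(fluxes.sum(axis=1))           # matches precip up to floating-point rounding
```

In contrast to a loss penalty, which only discourages violations, a constraint built into the output layer like this cannot be violated at prediction time.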

---

This example, adapted from Daw et al. (2022), demonstrates how physics-guided NNs can combine physical principles with data-driven learning to improve the accuracy and realism of environmental predictions (here: lake temperature modeling).

The core idea is to guide the training of a NN not only by its fit to observed data, but also by how well its predictions obey known physical laws. In the context of lake temperatures, one such law is that denser water tends to settle below lighter water, meaning that water density should increase with depth. Since water density is a known nonlinear function of temperature, we can use this relationship to impose a physical constraint.

In [None]:
import numpy as np
import scipy.io as spio

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping, TerminateOnNaN
from tensorflow.keras import backend as K

Define physical and statistical loss components in functions:
- This is the standard empirical error measure (RMSE) between observed and predicted temperatures. It is used below as a metric to monitor prediction accuracy, while the training loss itself uses the mean squared error.

In [None]:
def root_mean_squared_error(y_true, y_pred):
    return K.sqrt(K.mean(K.square(y_pred - y_true), axis=-1))
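
A note on what this metric reports, sketched in NumPy with made-up numbers. The sketch assumes that predictions have shape `(batch, 1)` and that Keras averages the per-sample values returned by a metric function. Under those assumptions, the mean over `axis=-1` runs over a single element, so each per-sample value is just the absolute error, and the reported average behaves like a mean absolute error rather than a true RMSE.

```python
import numpy as np

y_true = np.array([[10.0], [12.0], [15.0], [20.0]])  # made-up observations
y_pred = np.array([[11.0], [12.0], [12.0], [20.0]])  # made-up predictions

# Same arithmetic as root_mean_squared_error above: reduce over the last axis only
per_sample = np.sqrt(np.mean(np.square(y_pred - y_true), axis=-1))
averaged = per_sample.mean()  # result of averaging the per-sample values

mae = np.mean(np.abs(y_pred - y_true))               # mean absolute error
rmse = np.sqrt(np.mean(np.square(y_pred - y_true)))  # RMSE over all samples

print(per_sample)             # per-sample absolute errors
print(averaged, mae, rmse)    # averaged equals mae; rmse is larger here
```

If a true RMSE is wanted, one option is to reduce over all elements (no `axis` argument), which gives the RMSE of each batch.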

- This function models the nonlinear relationship between water temperature and density, which is central to the physical constraint. Water is densest at around 4°C, and this function encodes that physical behavior.

In [None]:
def density(temp):
    return 1000 * (1 - (temp + 288.9414) * (temp - 3.9863)**2 / (508929.2 * (temp + 68.12963)))
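
To see the behavior described above, the same formula can be evaluated at a few temperatures. This is a small NumPy check, assuming temperatures in °C and densities presumably in kg/m³; the sample temperatures are arbitrary.

```python
import numpy as np

def density(temp):
    # Same empirical formula as above; temp in °C
    return 1000 * (1 - (temp + 288.9414) * (temp - 3.9863)**2 / (508929.2 * (temp + 68.12963)))

temps = np.array([0.0, 2.0, 3.9863, 10.0, 20.0])  # arbitrary sample temperatures
rho = density(temps)

for t, r in zip(temps, rho):
    print(f"{t:7.4f} °C -> {r:.3f}")

# Temperature (among the samples) at which the density is largest
print(temps[np.argmax(rho)])
```

Because density peaks near 4°C, the density-depth rule is not the same as "colder water below": water at 2°C is lighter than water at 4°C and can sit on top of it without violating the constraint.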

- The following function defines the total loss used during training. It combines:
    - Mean squared error on the observed temperatures, and
    - A penalty for violating the density-depth rule (if upper layers are denser than lower ones). Only violations are penalized using ReLU, and λ (lambda) controls how strongly physics violations affect training.

In [None]:
def combined_loss(params):
    udendiff, lam = params
    def loss(y_true, y_pred):
        mse = K.mean(K.square(y_pred - y_true))       # MSE
        penalty = lam * K.mean(K.relu(udendiff))       # Physics constraint
        return mse + penalty
    return loss
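
The arithmetic of this loss can be traced with a few made-up numbers. The NumPy sketch below mirrors the two terms (MSE on labeled data plus λ times the mean ReLU of the density difference); all temperatures are hypothetical and only serve to show which depth pairs are penalized.

```python
import numpy as np

def density(temp):
    # Same empirical formula as in the notebook; temp in °C
    return 1000 * (1 - (temp + 288.9414) * (temp - 3.9863)**2 / (508929.2 * (temp + 68.12963)))

# Labeled part: made-up observed and predicted temperatures
y_true = np.array([18.0, 12.0, 6.0])
y_pred = np.array([17.0, 12.5, 6.0])
mse = np.mean((y_pred - y_true) ** 2)

# Unlabeled part: hypothetical predictions for three (shallow, deep) pairs
t_shallow = np.array([20.0, 4.0, 2.0])
t_deep = np.array([10.0, 10.0, 4.0])
dendiff = density(t_shallow) - density(t_deep)  # > 0 means denser water on top

violation = np.maximum(dendiff, 0.0)  # ReLU: keep only the violations
lam = 100.0                           # same role as λ in the notebook
total = mse + lam * violation.mean()

print(dendiff)    # only the second pair is positive
print(violation)
print(mse, total)
```

Only the second pair (4°C above 10°C) contributes to the penalty; the other two pairs are physically consistent and add nothing, however large their density difference is.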

- This function is used as a separate metric to monitor how much the model violates the physics constraint, averaged across all unlabeled depth pairs (i.e., all samples in uX1 and uX2 - see below).

In [None]:
def phy_loss_mean(params):
    udendiff, lam = params
    def loss(y_true, y_pred):
        return K.mean(K.relu(udendiff))
    return loss

Load and prepare data:

- We load the input features and observed temperatures for the Mille Lacs lake dataset provided with the study:
    - Xc_doy: Input features
    - Y: Observed temperatures (targets for supervised learning).

In [None]:
data_dir = "../Data/Daw_et_al_2022/datasets/"
lake_name = "mille_lacs"

filename = f"{data_dir}{lake_name}.mat"
mat = spio.loadmat(filename, squeeze_me=True, variable_names=['Y', 'Xc_doy', 'Modeled_temp'])

Xc = mat['Xc_doy']
Y = mat['Y']

- We split the data into training and test sets, using the first 3000 samples for training and the remainder for testing:

In [None]:
trainX, trainY = Xc[:3000, :], Y[:3000]
testX, testY = Xc[3000:, :], Y[3000:]

- We also load unlabeled data, consisting of depth-paired features where physical constraints apply but temperature is not observed. This enables semi-supervised learning, where even unlabeled samples contribute to training via physics constraints:

In [None]:
unsup_filename = f"{data_dir}{lake_name}_sampled.mat"
unsup_mat = spio.loadmat(unsup_filename, squeeze_me=True, variable_names=['Xc_doy1', 'Xc_doy2'])

# Depth-paired inputs without temperature labels
uX1 = unsup_mat['Xc_doy1']  # Features at shallower depth
uX2 = unsup_mat['Xc_doy2']  # Features at deeper depth

Define the NN:

We build a simple fully connected feedforward network that maps environmental inputs to predicted temperatures. Dropout is not needed here and is therefore not applied, but it could be added to the network. Dropout layers are typically placed after dense (or other trainable) layers to regularize their outputs.

In [None]:
model = Sequential()  # Sequential model: layers are stacked one after another
model.add(Dense(12, activation='relu', input_shape=(trainX.shape[1],)))  # First hidden layer: 12 neurons, ReLU activation; input size = number of features in the training data
# model.add(Dropout(0.0))  # Optional dropout after the first hidden layer
model.add(Dense(12, activation='relu'))  # Second hidden layer: 12 neurons, ReLU activation
# model.add(Dropout(0.0))  # Optional dropout after the second hidden layer
model.add(Dense(1, activation='linear'))  # Output layer: 1 neuron with linear activation, producing the predicted temperature (see Notebook 3; Sect. 4)

Apply the physics constraint:

- We convert the unlabeled inputs (for depth-paired data) into TensorFlow constants. These are used to calculate predicted temperatures at different depths, enabling the physics penalty.

In [None]:
uin1 = tf.constant(uX1, dtype=tf.float32)  # shallower inputs
uin2 = tf.constant(uX2, dtype=tf.float32)  # deeper inputs
lam = tf.constant(100.0, dtype=tf.float32)  # λ: strength of physics penalty

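- We compute the difference in water density between shallower and deeper predicted temperatures. If this difference is positive, denser water lies on top, which violates the physical constraint. Note that this construction follows a graph-mode style: under TensorFlow 2's default eager execution, `udendiff` is evaluated once with the current weights, so it is worth checking that the physics metric actually changes during training.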

In [None]:
uout1 = model(uin1)  # temperature prediction at shallow depth
uout2 = model(uin2)  # temperature prediction at deeper depth

udendiff = density(uout1) - density(uout2) # compute the difference: should be <= 0 (else, physics is violated)

Compile the model with:
- Combined loss (empirical + physics)
- Adam optimizer (with gradient clipping to improve stability)
- Metrics for both prediction accuracy and physics consistency

In [None]:
model.compile(
    loss=combined_loss([udendiff, lam]),           # Combined loss
    optimizer=Adam(clipnorm=1.0),                  # Adam optimizer 
    metrics=[
        phy_loss_mean([udendiff, lam]),            # Metric to monitor physics violation
        root_mean_squared_error                    # RMSE on labeled data
    ]
)

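We train the model using early stopping to prevent overfitting: training halts if the validation loss does not improve for `patience` consecutive epochs. Note that with `patience=100` and `epochs=100`, as set below, early stopping cannot trigger; for longer runs, increase `epochs` or reduce `patience`: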

In [None]:
early_stopping = EarlyStopping(monitor='val_loss', patience=100, verbose=1, mode='min')

history = model.fit(
    trainX, trainY,
    batch_size=1000,
    epochs=100,
    validation_split=0.1,  # Use 10% of training data for validation
    verbose=1,
    callbacks=[early_stopping, TerminateOnNaN()]  # Stop training if loss becomes NaN
)
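
The patience rule can be illustrated in plain Python. This is a simplified stand-in for the logic, not the Keras implementation; the function name `early_stopping_epoch` and the validation losses are made up.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None if it never does."""
    best = float("inf")
    wait = 0  # number of consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Made-up validation losses for ten epochs
val_losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.70, 0.69, 0.70, 0.70, 0.70]

print(early_stopping_epoch(val_losses, patience=2))   # stops early
print(early_stopping_epoch(val_losses, patience=5))   # never stops within ten epochs

# With as many epochs as the patience value, the rule can never fire
print(early_stopping_epoch([1.0] * 100, patience=100))
```

The last call shows why a patience equal to the number of epochs has no effect: the first epoch always counts as an improvement, so at most 99 epochs without improvement can follow.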

Evaluate the trained model on test data:
- Test RMSE to measure prediction accuracy.
- Mean density violation to assess how well the model respects physics.

In [None]:
score = model.evaluate(testX, testY, verbose=0)
print(f"Test RMSE (temperature error): {score[2]:.4f}")
print(f"Physics Loss (mean density violation): {score[1]:.6f}")

- An RMSE of ~2.66°C means that the model's typical prediction error is about 2.7 degrees; whether this is acceptable depends on the application and the typical temperature range.
- A mean density violation of ~0.003 is close to zero, which means that the model respects the physical constraint for most depth pairs: violations are either rare or very small.

## 2. Machine Learning in Physical Models

Another major category of integration focuses on enhancing process-based models with ML components. Instead of using ML as the primary model, here the backbone is a physical model (for instance, a climate model or hydrological model), which is augmented by learned elements. The idea is to retain the trusted core of physical models while using statistical learning to improve those aspects that are uncertain, empirical, or too complex to derive from first principles. There are multiple points where ML can be integrated:

- **Learning parameterizations:** Physical models often contain tunable parameters or simplified subgrid parameterizations that represent unresolved processes (e.g. cloud microphysics, soil properties, vegetation traits). Traditionally these are set by empirical formulas or calibration. ML offers a way to learn optimal parameter values or relationships from data, rather than prescribing them. For example, instead of assigning fixed soil hydraulic parameters based on soil type, an ML model can be trained to map from easily observed soil and climate attributes to the best-fitting parameters for a given location. This was prototyped in hydrology by training a NN on data from thousands of catchments to predict catchment-specific model parameters, yielding a global hydrologic application that was more dynamic and data-informed (Beck et al. 2016).

- **Replacing empirical sub-models:** If a particular component of a process-based model is highly empirical or uncertain in form, one can replace that sub-model with an ML model trained on data. This yields a hybrid model that is part mechanistic (for the well-understood processes) and part data-driven (for the poorly understood parts).

- **Bias correction and model output calibration:** ML can be used in a post-processing role to correct systematic errors. Here, the physical model is run in the usual way, and an ML model is trained on the residuals or mismatches between model outputs and observations. The ML model then learns to predict the bias as a function of relevant variables, allowing it to adjust the raw model output. Additionally, ML can aid in downscaling coarse model output to finer resolution by learning relationships between large-scale patterns and local outcomes.
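
The bias-correction idea can be sketched with synthetic data. In this toy example, a linear least-squares fit stands in for the ML model (a deliberately simple substitute), and the "physical model", the "observations", and the covariate `air_temp` are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: observations and a physical model with a covariate-dependent bias
air_temp = rng.uniform(0.0, 30.0, size=200)                   # hypothetical covariate
obs = 0.8 * air_temp + 2.0 + rng.normal(0.0, 0.3, size=200)   # pretend observations
pbm_output = 0.7 * air_temp + 4.0                             # pretend physical model output

# Learn the residual (observation minus model) as a function of the covariate
residual = obs - pbm_output
A = np.column_stack([air_temp, np.ones_like(air_temp)])
coef, *_ = np.linalg.lstsq(A, residual, rcond=None)

# Corrected output = raw physical model output + predicted bias
corrected = pbm_output + A @ coef

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

print(f"RMSE raw model:       {rmse(pbm_output, obs):.3f}")
print(f"RMSE after correction: {rmse(corrected, obs):.3f}")
```

The errors here are computed on the same data used for fitting; in practice the correction would be evaluated on held-out data, and a more flexible model could replace the linear fit.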

## 3. Surrogates/Emulators

Surrogate modeling, also called emulation, refers to using ML to mimic a physical model. The goal is to create a fast and lightweight ML model that can reproduce the outputs of a much slower, more complex simulation. This is useful because many physical models (like those used in climate or hydrology) are computationally expensive. They can take hours or days to run. With a trained ML surrogate, one can predict results almost instantly, which makes it possible to explore many scenarios, tune parameters, or do optimization tasks that would otherwise be too slow. 

How it works:
- Run the original physical model many times with different inputs (e.g., weather conditions, soil types, etc.).
- Train a NN on this input–output data.
- Use the ML model as a stand-in for the original simulator. It gives fast predictions that approximate what the physical model would have returned.
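
The three steps above can be sketched with a toy example. Here a cheap analytic function plays the role of the expensive simulator, and a polynomial fit is swapped in for the NN to keep the example dependency-free; `slow_simulator` and the input range are hypothetical.

```python
import numpy as np

def slow_simulator(x):
    """Toy stand-in for an expensive physical model (hypothetical)."""
    return np.sin(x) + 0.1 * x**2

# 1. Run the "simulator" for a set of input values
x_train = np.linspace(0.0, 3.0, 30)
y_train = slow_simulator(x_train)

# 2. Train a cheap surrogate on the input-output pairs (here: degree-5 polynomial)
surrogate = np.poly1d(np.polyfit(x_train, y_train, deg=5))

# 3. Use the surrogate in place of the simulator at new inputs
x_new = np.linspace(0.0, 3.0, 200)
max_err = np.max(np.abs(surrogate(x_new) - slow_simulator(x_new)))
print(f"max abs error on [0, 3]: {max_err:.2e}")
```

The surrogate is only trustworthy within the range of inputs it was trained on; outside `[0, 3]` in this sketch, the polynomial can diverge quickly from the simulator.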

## References and Further Learning

Cuomo, S., Di Cola, V. S., Giampaolo, F., Rozza, G., Raissi, M., and Piccialli, F.: Scientific machine learning through physics–informed neural networks: Where we are and what’s next, Journal of Scientific Computing, 92, 88, doi:10.1007/s10915-022-01939-z, 2022.

Daw, A., Karpatne, A., Watkins, W. D., Read, J. S., and Kumar, V.: Physics-guided neural networks (PGNN): An application in lake temperature modeling, in: Knowledge guided machine learning, Chapman and Hall/CRC, 353–372, 2022.

Irrgang, C., Boers, N., Sonnewald, M., Barnes, E. A., Kadow, C., Staneva, J., and Saynisch-Wagner, J.: Towards neural Earth system modelling by integrating artificial intelligence in Earth system science, Nat Mach Intell, 3, 667–674, doi:10.1038/s42256-021-00374-3, 2021.

Meng, C., Griesemer, S., Cao, D., Seo, S., and Liu, Y.: When physics meets machine learning: A survey of physics-informed machine learning, Machine Learning for Computational Science and Engineering, 1, 1–23, doi:10.1007/s44379-025-00016-0, 2025.

Read, J. S., Jia, X., Willard, J., Appling, A. P., Zwart, J. A., Oliver, S. K., Karpatne, A., Hansen, G. J. A., Hanson, P. C., and Watkins, W.: Process‐guided deep learning predictions of lake water temperature, Water Resources Research, 55, 9173–9190, doi:10.1029/2019WR024922, 2019.

Reichstein, M., Camps-Valls, G., Stevens, B., Jung, M., Denzler, J., Carvalhais, N., and Prabhat: Deep learning and process understanding for data-driven Earth system science, Nature, 566, 195–204, doi:10.1038/s41586-019-0912-1, 2019.

Shen, C., Appling, A. P., Gentine, P., Bandai, T., Gupta, H., Tartakovsky, A., Baity-Jesi, M., Fenicia, F., Kifer, D., and Li, L.: Differentiable modelling to unify machine learning and physical models for geosciences, Nature Reviews Earth & Environment, 4, 552–567, doi:10.1038/s43017-023-00450-9, 2023.

Zhao, T., Wang, S., Ouyang, C., Chen, M., Liu, C., Zhang, J., Yu, L., Wang, F., Xie, Y., Li, J., Wang, F., Grunwald, S., Wong, B. M., Zhang, F., Qian, Z., Xu, Y., Yu, C., Han, W., Sun, T., Shao, Z., Qian, T., Chen, Z., Zeng, J., Zhang, H., Letu, H., Zhang, B., Wang, L., Luo, L., Shi, C., Su, H., Zhang, H., Yin, S., Huang, N., Zhao, W., Li, N., Zheng, C., Zhou, Y., Huang, C., Feng, D., Xu, Q., Wu, Y., Hong, D., Wang, Z., Lin, Y., Zhang, T., Kumar, P., Plaza, A., Chanussot, J., Zhang, J., Shi, J., and Wang, L.: Artificial intelligence for geoscience: Progress, challenges, and perspectives, Innovation (Cambridge (Mass.)), 5, 100691, doi:10.1016/j.xinn.2024.100691, 2024.