<a href="https://colab.research.google.com/github/EvenSol/NeqSim-Colab/blob/master/notebooks/process/heat_exchanger_ml_external_unit.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Machine-learning-driven heat exchanger unit operations in NeqSim

This notebook explores how to create a machine-learning-based surrogate for a heat exchanger, using NeqSim to generate training data and scikit-learn to fit the model. The surrogate is then embedded as an external unit operation so that it can exchange streams with the rest of a process flowsheet. The workflow is aimed at students who want to combine first-principles simulation with data-driven modeling.

## Learning objectives

By the end of this notebook you will be able to:

* configure and run a shell-and-tube heat exchanger model in NeqSim;
* generate synthetic plant data by sampling the rigorous model over a range of operating conditions;
* train a regression model with scikit-learn to predict outlet stream properties and heat duty;
* package the regression model as an external NeqSim unit operation with hot and cold inlet/outlet streams;
* compare the surrogate model with the rigorous heat exchanger to understand benefits and limitations.

## Background and recommended reading

The idea of combining first-principles models with data-driven surrogates is a recurring theme in process systems engineering. Rigorous process simulators such as NeqSim provide detailed physical fidelity, while machine learning models can approximate unit operations at a fraction of the computational cost once they have been trained on high-quality data. Hybrid modelling approaches allow engineers to blend mechanistic insight with plant or simulation data to support online optimization, real-time digital twins, and advanced control.

You can find more context and theory in the following resources:

* [Cengel, Y. A., & Ghajar, A. J. (2015). *Heat and Mass Transfer: Fundamentals & Applications*. McGraw-Hill.](https://www.mheducation.com/highered/product/heat-mass-transfer-fundamentals-applications-cengel-ghajar/M9780073398198.html)
* [Thompson, M. L., & Kramer, M. A. (1994). Modeling chemical processes using prior knowledge and neural networks. *AIChE Journal*, 40(8), 1328-1340.](https://doi.org/10.1002/aic.690400803)
* [Qin, S. J. (2012). Survey on data-driven industrial process monitoring and diagnosis. *Annual Reviews in Control*, 36(2), 220-234.](https://doi.org/10.1016/j.arcontrol.2012.09.004)
* [Pedregosa, F., et al. (2011). Scikit-learn: Machine learning in Python. *Journal of Machine Learning Research*, 12, 2825-2830.](https://doi.org/10.1145/1953048.2078195)
* [Bhosekar, A., & Ierapetritou, M. (2018). Advances in surrogate based modeling, feasibility analysis, and optimization: A review. *Computers & Chemical Engineering*, 108, 250-267.](https://doi.org/10.1016/j.compchemeng.2017.08.006)
* [Sun, M., & Dong, J. (2022). Data-driven and hybrid modeling for process systems engineering. *Computers & Chemical Engineering*, 158, 107624.](https://doi.org/10.1016/j.compchemeng.2021.107624)
* [NeqSim documentation portal](https://neqsim.com) for thermodynamic and process simulation background.


### How the hybrid unit fits into a process flowsheet

Before diving into the code, the schematic below summarises how NeqSim generates rigorous heat-exchanger data, how scikit-learn calibrates a surrogate, and how the trained `MachineLearningHeatExchanger` exchanges streams with the rest of a flowsheet. Use it as a conceptual map while experimenting with the notebook.


In [None]:
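    # Imports are repeated here so this cell also runs before the setup cell in section 2.
    import matplotlib.pyplot as plt
    from matplotlib.patches import FancyArrowPatch, FancyBboxPatch
    from matplotlib.patheffects import withStroke

    fig, ax = plt.subplots(figsize=(11, 4))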
    ax.axis('off')
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)

    palette = {
        'rigorous': '#457b9d',
        'data': '#264653',
        'ml': '#2a9d8f',
        'unit': '#e76f51',
        'flowsheet': '#f4a261',
    }

    TEXT_STYLE = dict(
        color='white',
        fontsize=12,
        ha='center',
        va='center',
        fontweight='bold',
        path_effects=[withStroke(linewidth=3, foreground='black')],
    )

    boxes = {}

    def register_box(name, xy, width, height, label, color):
        patch = FancyBboxPatch(
            xy,
            width,
            height,
            boxstyle='round,pad=0.03',
            linewidth=1.6,
            edgecolor='black',
            facecolor=color,
            alpha=0.95,
        )
        ax.add_patch(patch)
        ax.text(xy[0] + width / 2, xy[1] + height / 2, label, **TEXT_STYLE)
        boxes[name] = {'xy': xy, 'width': width, 'height': height}

    register_box('rigorous', (0.03, 0.58), 0.26, 0.32, 'Rigorous NeqSim\nheat exchanger model', palette['rigorous'])
    register_box('data', (0.36, 0.58), 0.26, 0.32, 'Synthetic training\ncases (features + targets)', palette['data'])
    register_box('ml', (0.69, 0.58), 0.26, 0.32, 'scikit-learn\nsurrogate training', palette['ml'])
    register_box('unit', (0.22, 0.12), 0.30, 0.32, 'MachineLearning\nHeatExchanger unit', palette['unit'])
    register_box('flowsheet', (0.60, 0.12), 0.32, 0.32, 'Integrated NeqSim\nprocess flowsheet', palette['flowsheet'])

    def box_anchor(name, side):
        info = boxes[name]
        x, y = info['xy']
        w, h = info['width'], info['height']
        anchors = {
            'left': (x, y + h / 2),
            'right': (x + w, y + h / 2),
            'top': (x + w / 2, y + h),
            'bottom': (x + w / 2, y),
            'center': (x + w / 2, y + h / 2),
        }
        return anchors[side]

    def connect(src, dst, src_side, dst_side, text=None, text_offset=(0, 0), linestyle='-', rad=0.0):
        start = box_anchor(src, src_side)
        end = box_anchor(dst, dst_side)
        arrow = FancyArrowPatch(
            start,
            end,
            arrowstyle='-|>',
            mutation_scale=18,
            linewidth=2.0,
            color='#1d3557',
            linestyle=linestyle,
            connectionstyle=f'arc3,rad={rad}',
        )
        ax.add_patch(arrow)
        if text:
            mid_x = (start[0] + end[0]) / 2 + text_offset[0]
            mid_y = (start[1] + end[1]) / 2 + text_offset[1]
            ax.text(mid_x, mid_y, text, fontsize=11, ha='center', va='center', color='#1d3557', fontweight='semibold')

    connect('rigorous', 'data', 'right', 'left', text='Sample operating\nconditions')
    connect('data', 'ml', 'right', 'left', text='Train regression\nmodel')
    connect('ml', 'unit', 'bottom', 'top', text='Export surrogate', text_offset=(0, -0.04), rad=-0.1)
    connect('unit', 'flowsheet', 'right', 'left', text='Predict duty &\noutlet properties')
    connect('flowsheet', 'data', 'top', 'bottom', text='Plant historian\nor lab data', text_offset=(0, 0.04), linestyle='--', rad=0.2)

    ax.text(0.02, 0.04, 'Arrows indicate data and information flow between components.', fontsize=11, color='#1d3557')
    plt.show()


## 1. Set up the Python environment

If you are executing the notebook in a fresh environment (for example on Google Colab), uncomment and run the following cell to install the required packages.

In [None]:
# %pip install neqsim==2.5.35 scikit-learn pandas matplotlib

## 2. Import libraries and define helper functions

We will use NeqSim for process simulation, pandas and NumPy for data handling, matplotlib for quick visualisation, and scikit-learn for the machine learning workflow.

In [None]:
import json
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from matplotlib.patches import FancyArrowPatch, FancyBboxPatch
from matplotlib.patheffects import withStroke

from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score

from neqsim.thermo import fluid, TPflash
from neqsim.process import clearProcess, stream, heatExchanger, runProcess
from neqsim import jNeqSim
from neqsim.process.unitop import unitop

plt.style.use('seaborn-v0_8')
plt.rcParams['axes.titleweight'] = 'bold'
pd.options.display.float_format = '{:,.3f}'.format
np.random.seed(42)


In [None]:
def make_natural_gas_fluid():
    gas = fluid('srk')
    gas.addComponent('methane', 0.88)
    gas.addComponent('ethane', 0.06)
    gas.addComponent('propane', 0.03)
    gas.addComponent('n-butane', 0.02)
    gas.addComponent('n-pentane', 0.01)
    gas.setMixingRule(2)
    return gas


def make_cooling_water_fluid():
    water = fluid('srk')
    water.addComponent('water', 1.0)
    water.setMixingRule(2)
    return water


def simulate_case(case_id, hot_temp_C, hot_press_bara, hot_flow_MSm3_day,
                  cold_temp_C, cold_press_bara, cold_flow_kg_hr, ua_value):
    clearProcess()
    hot_fluid = make_natural_gas_fluid()
    hot_fluid.setTemperature(hot_temp_C, 'C')
    hot_fluid.setPressure(hot_press_bara, 'bara')
    hot_fluid.setTotalFlowRate(hot_flow_MSm3_day, 'MSm3/day')

    cold_fluid = make_cooling_water_fluid()
    cold_fluid.setTemperature(cold_temp_C, 'C')
    cold_fluid.setPressure(cold_press_bara, 'bara')
    cold_fluid.setTotalFlowRate(cold_flow_kg_hr, 'kg/hr')

    hot_stream = stream(f'hot_{case_id}', hot_fluid)
    cold_stream = stream(f'cold_{case_id}', cold_fluid)

    hx = heatExchanger(f'HX_{case_id}', hot_stream, cold_stream)
    hx.setUAvalue(ua_value)
    runProcess()

    hot_out = hx.getOutStream(0)
    cold_out = hx.getOutStream(1)

    record = {
        'case_id': case_id,
        'hot_in_T_C': hot_temp_C,
        'hot_in_p_bara': hot_press_bara,
        'hot_in_massflow_kg_per_hr': hot_stream.getFlowRate('kg/hr'),
        'cold_in_T_C': cold_temp_C,
        'cold_in_p_bara': cold_press_bara,
        'cold_in_massflow_kg_per_hr': cold_stream.getFlowRate('kg/hr'),
        'UA_W_per_K': ua_value,
        'hot_out_T_C': hot_out.getTemperature('C'),
        'cold_out_T_C': cold_out.getTemperature('C'),
        'duty_kW': hx.getDuty() / 1e3,
    }
    return record


def describe_case(record):
    display(pd.DataFrame([record]).set_index('case_id').T)

## 3. Inspect a single rigorous NeqSim heat exchanger simulation

We first run the built-in NeqSim heat exchanger model for one operating point to understand the inputs and outputs that we want our surrogate to learn.

In [None]:
baseline_case = simulate_case(
    case_id='baseline',
    hot_temp_C=130.0,
    hot_press_bara=55.0,
    hot_flow_MSm3_day=1.1,
    cold_temp_C=15.0,
    cold_press_bara=5.0,
    cold_flow_kg_hr=220_000.0,
    ua_value=120_000.0,
)
describe_case(baseline_case)
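
A quick plausibility check on any single case is a water-side energy balance, $Q \approx \dot{m}\,c_p\,(T_{\mathrm{out}} - T_{\mathrm{in}})$. The sketch below is not part of the NeqSim API: `water_side_duty_kW` is a hypothetical helper, and the constant heat capacity of 4.18 kJ/(kg·K) is an assumed textbook value for liquid water, so some deviation from the duty that NeqSim computes with the SRK equation of state is to be expected.

```python
def water_side_duty_kW(mass_flow_kg_per_hr, t_in_C, t_out_C, cp_kJ_per_kg_K=4.18):
    """Estimate the heat picked up by the cooling water in kW.

    Assumes liquid water with a constant heat capacity (an assumption,
    not a value taken from NeqSim).
    """
    mass_flow_kg_per_s = mass_flow_kg_per_hr / 3600.0
    return mass_flow_kg_per_s * cp_kJ_per_kg_K * (t_out_C - t_in_C)


# Illustrative numbers only: 220 t/h of cooling water heated from 15 °C to 25 °C.
estimate_kW = water_side_duty_kW(220_000.0, 15.0, 25.0)
print(f"Water-side duty estimate: {estimate_kW:,.0f} kW")
```

For a simulated record, the same helper could be called with `record['cold_in_massflow_kg_per_hr']`, `record['cold_in_T_C']`, and `record['cold_out_T_C']`, and the result compared with `record['duty_kW']`.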

## 4. Generate a training data set from NeqSim

To train a machine learning model we need examples that cover the relevant range of operation. The nested loops below build a full-factorial grid over hot-side temperature, pressure, and flow rate, cold-side inlet temperature and flow rate, and the exchanger UA value (the overall heat-transfer coefficient multiplied by the transfer area). The cold-side pressure is held at 5 bara. The grid contains 3 × 2 × 3 × 3 × 3 × 2 = 324 combinations. Each combination is simulated with the rigorous NeqSim heat exchanger, and the results are stored in a pandas `DataFrame`.

In [None]:
hot_temps = [100.0, 130.0, 160.0]
hot_pressures = [40.0, 55.0]
hot_flows = [0.8, 1.1, 1.4]
cold_temps = [5.0, 15.0, 25.0]
cold_flows = [180_000.0, 220_000.0, 260_000.0]
ua_values = [90_000.0, 130_000.0]

records = []
case_counter = 0
for hot_temp in hot_temps:
    for hot_pressure in hot_pressures:
        for hot_flow in hot_flows:
            for cold_temp in cold_temps:
                for cold_flow in cold_flows:
                    for ua in ua_values:
                        record = simulate_case(
                            case_id=f'case_{case_counter}',
                            hot_temp_C=float(hot_temp),
                            hot_press_bara=float(hot_pressure),
                            hot_flow_MSm3_day=float(hot_flow),
                            cold_temp_C=float(cold_temp),
                            cold_press_bara=5.0,
                            cold_flow_kg_hr=float(cold_flow),
                            ua_value=float(ua),
                        )
                        records.append(record)
                        case_counter += 1

training_data = pd.DataFrame(records)
training_data.head()

In [None]:
training_data.describe()
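
The six nested loops can also be written as a single Cartesian product, which makes the size of the design explicit and is easier to extend with further factors. This is a minimal sketch using only the standard library; the dictionary keys are chosen to match the keyword arguments of `simulate_case`, and the name `levels` is introduced here for illustration.

```python
from itertools import product

# Same levels as the nested loops in this section.
levels = {
    'hot_temp_C': [100.0, 130.0, 160.0],
    'hot_press_bara': [40.0, 55.0],
    'hot_flow_MSm3_day': [0.8, 1.1, 1.4],
    'cold_temp_C': [5.0, 15.0, 25.0],
    'cold_flow_kg_hr': [180_000.0, 220_000.0, 260_000.0],
    'ua_value': [90_000.0, 130_000.0],
}

# One dict per case; the last factor varies fastest, as in the nested loops.
cases = [dict(zip(levels, combo)) for combo in product(*levels.values())]
print(f"{len(cases)} cases in the full-factorial grid")
```

Each entry could then be passed on as `simulate_case(case_id=f'case_{i}', cold_press_bara=5.0, **case)`.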

### Visualise the design of experiments

Machine-learning surrogates are only as good as the operating envelope they see during training. The scatter matrix below shows how the NeqSim sampling campaign covers hot and cold inlet temperature and cooling-water flow, and how the heat duty responds across that grid.


In [None]:
plot_features = pd.DataFrame({
    'Hot inlet T [°C]': training_data['hot_in_T_C'],
    'Cold inlet T [°C]': training_data['cold_in_T_C'],
    'Cooling-water flow [t/h]': training_data['cold_in_massflow_kg_per_hr'] / 1000.0,
    'Heat duty [kW]': training_data['duty_kW'],
})
axes = pd.plotting.scatter_matrix(
    plot_features,
    figsize=(10, 10),
    diagonal='hist',
    color='#2a9d8f',
    alpha=0.6,
    range_padding=0.1,
)
for ax in axes.ravel():
    ax.set_facecolor('#f6f6f6')
    ax.grid(False)
plt.suptitle('Design space coverage and resulting duty', y=1.02, fontsize=15, fontweight='bold')
plt.show()


## 5. Train a machine learning surrogate

We now split the data set into training and test subsets, fit a random forest regressor that predicts hot outlet temperature, cold outlet temperature, and heat duty, and evaluate the performance with the coefficient of determination ($R^2$) and mean absolute error (MAE).

In [None]:
feature_columns = [
    'hot_in_T_C',
    'hot_in_p_bara',
    'hot_in_massflow_kg_per_hr',
    'cold_in_T_C',
    'cold_in_p_bara',
    'cold_in_massflow_kg_per_hr',
    'UA_W_per_K',
]
target_columns = ['hot_out_T_C', 'cold_out_T_C', 'duty_kW']

X = training_data[feature_columns]
y = training_data[target_columns]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

regressor = RandomForestRegressor(n_estimators=300, random_state=42)
regressor.fit(X_train, y_train)

y_pred = regressor.predict(X_test)
r2 = r2_score(y_test, y_pred, multioutput='raw_values')
mae = mean_absolute_error(y_test, y_pred, multioutput='raw_values')

metrics = pd.DataFrame({
    'Target': target_columns,
    'R2 score': r2,
    'Mean absolute error': mae,
})
metrics
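
To make the two metrics concrete, the sketch below computes them column by column with NumPy, which is what `multioutput='raw_values'` reports for each target. The helper name `r2_and_mae` and the demo arrays are made up for illustration. Note that $R^2$ is undefined for a target that is constant in the test set, because the total sum of squares is then zero.

```python
import numpy as np


def r2_and_mae(y_true, y_pred):
    """Return per-column R^2 and MAE for arrays of shape (n_samples, n_targets)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual_ss = np.sum((y_true - y_pred) ** 2, axis=0)
    total_ss = np.sum((y_true - y_true.mean(axis=0)) ** 2, axis=0)
    r2 = 1.0 - residual_ss / total_ss
    mae = np.mean(np.abs(y_true - y_pred), axis=0)
    return r2, mae


# Tiny made-up example with two targets (values are illustrative only).
y_true_demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
y_pred_demo = np.array([[1.0, 12.0], [2.0, 20.0], [3.0, 28.0]])
r2_demo, mae_demo = r2_and_mae(y_true_demo, y_pred_demo)
print("R2 per target: ", r2_demo)
print("MAE per target:", mae_demo)
```

Because MAE carries the unit of its target (°C or kW) while $R^2$ is dimensionless, the two are best read together.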

In [None]:
y_pred_df = pd.DataFrame(y_pred, columns=target_columns, index=y_test.index)
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
for idx, target in enumerate(target_columns):
    ax = axes[idx]
    ax.scatter(y_test[target], y_pred_df[target], alpha=0.6)
    lims = [
        min(y_test[target].min(), y_pred_df[target].min()),
        max(y_test[target].max(), y_pred_df[target].max()),
    ]
    ax.plot(lims, lims, 'k--', linewidth=1)
    ax.set_xlabel('NeqSim reference')
    ax.set_ylabel('ML prediction')
    ax.set_title(target)
    ax.grid(True)
plt.tight_layout()
plt.show()

### Inspect residual distributions

Residual histograms complement the parity plots. A distribution centred on zero indicates an unbiased surrogate, whereas a shifted or skewed distribution reveals a systematic bias relative to the rigorous NeqSim model.


In [None]:
residuals = y_test[target_columns] - y_pred_df[target_columns]
fig, axes = plt.subplots(1, len(target_columns), figsize=(15, 4))
for idx, target in enumerate(target_columns):
    ax = axes[idx]
    ax.hist(residuals[target], bins=12, color='#577590', alpha=0.85, edgecolor='white')
    ax.axvline(0.0, color='black', linestyle='--', linewidth=1)
    ax.set_title(target.replace('_', ' '))
    ax.set_xlabel('Residual (NeqSim - ML)')
    ax.set_ylabel('Frequency')
    ax.grid(False)
plt.tight_layout()
plt.show()


## 6. Create an external machine-learning heat exchanger unit

The class below extends the NeqSim `unitop` interface and wraps the trained scikit-learn regressor. The unit accepts hot and cold inlet streams, uses their thermodynamic states to build the feature vector, predicts outlet conditions and heat duty, and returns outlet `Stream` objects just like the rigorous heat exchanger. This makes it possible to plug the surrogate into any `ProcessSystem`.

In [None]:
class MachineLearningHeatExchanger(unitop):
    def __init__(self, model, feature_names, target_names):
        super().__init__()
        self.model = model
        self.feature_names = list(feature_names)
        self.target_names = list(target_names)
        self.ua_value = None
        self.hot_inletstream = None
        self.cold_inletstream = None
        self.hot_outletstream = None
        self.cold_outletstream = None
        self.duty = None
        self.latest_results = None
        self.setName('ML Heat Exchanger')

    def setUAvalue(self, ua_value):
        self.ua_value = ua_value

    def setHotInletStream(self, stream_in):
        self.hot_inletstream = stream_in
        self.hot_outletstream = stream_in.clone()
        self.hot_outletstream.setName(f'{stream_in.getName()}_out')

    def setColdInletStream(self, stream_in):
        self.cold_inletstream = stream_in
        self.cold_outletstream = stream_in.clone()
        self.cold_outletstream.setName(f'{stream_in.getName()}_out')

    def getOutStream(self, index):
        if index == 0:
            return self.hot_outletstream
        if index == 1:
            return self.cold_outletstream
        raise IndexError('Heat exchanger has two outlet streams: 0 for hot, 1 for cold.')

    def getDuty(self):
        return self.duty

    def toJson(self):
        snapshot = {
            'name': self.getName(),
            'UA_W_per_K': self.ua_value,
            'duty_kW': None if self.duty is None else self.duty / 1e3,
        }
        if self.hot_outletstream is not None:
            snapshot['hot_out_T_C'] = self.hot_outletstream.getTemperature('C')
        if self.cold_outletstream is not None:
            snapshot['cold_out_T_C'] = self.cold_outletstream.getTemperature('C')
        return json.dumps(snapshot)

    def run(self, identifier):
        if self.hot_inletstream is None or self.cold_inletstream is None:
            raise ValueError('Both hot and cold inlet streams must be set before running the ML heat exchanger.')
        if self.ua_value is None:
            raise ValueError('Please provide a UA value with setUAvalue before running the ML heat exchanger.')
        self.serialVersionUID = identifier

        hot_features = {
            'hot_in_T_C': self.hot_inletstream.getTemperature('C'),
            'hot_in_p_bara': self.hot_inletstream.getPressure('bara'),
            'hot_in_massflow_kg_per_hr': self.hot_inletstream.getFlowRate('kg/hr'),
        }
        cold_features = {
            'cold_in_T_C': self.cold_inletstream.getTemperature('C'),
            'cold_in_p_bara': self.cold_inletstream.getPressure('bara'),
            'cold_in_massflow_kg_per_hr': self.cold_inletstream.getFlowRate('kg/hr'),
        }
        feature_vector = {**hot_features, **cold_features, 'UA_W_per_K': self.ua_value}
        # Use a DataFrame so the column names match those seen during fitting.
        X = pd.DataFrame([feature_vector], columns=self.feature_names)
        predictions = self.model.predict(X)[0]
        results = dict(zip(self.target_names, predictions))
        self.latest_results = {**feature_vector, **results}

        hot_fluid = self.hot_inletstream.getFluid().clone()
        hot_fluid.setTemperature(float(results['hot_out_T_C']), 'C')
        hot_fluid.setPressure(float(hot_features['hot_in_p_bara']), 'bara')
        TPflash(hot_fluid)
        self.hot_outletstream.setFluid(hot_fluid)

        cold_fluid = self.cold_inletstream.getFluid().clone()
        cold_fluid.setTemperature(float(results['cold_out_T_C']), 'C')
        cold_fluid.setPressure(float(cold_features['cold_in_p_bara']), 'bara')
        TPflash(cold_fluid)
        self.cold_outletstream.setFluid(cold_fluid)

        self.duty = float(results['duty_kW']) * 1e3
        return self.latest_results
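
Because the regressor predicts each target statistically, nothing forces its outputs to be physically consistent. One simple guard, sketched below, checks that both outlet temperatures lie between the cold and hot inlet temperatures. The function `temperature_bound_violations` is a hypothetical helper, not part of NeqSim, and the check assumes a plain two-stream exchanger with no heat losses or other effects that could push an outlet outside the inlet temperature range.

```python
def temperature_bound_violations(hot_in_T_C, cold_in_T_C, hot_out_T_C, cold_out_T_C):
    """List outlet temperatures outside [cold inlet, hot inlet]; empty means plausible."""
    violations = []
    if not cold_in_T_C <= hot_out_T_C <= hot_in_T_C:
        violations.append('hot_out_T_C')
    if not cold_in_T_C <= cold_out_T_C <= hot_in_T_C:
        violations.append('cold_out_T_C')
    return violations


# Illustrative values only (not NeqSim results).
print(temperature_bound_violations(142.0, 12.0, 60.0, 35.0))   # plausible case
print(temperature_bound_violations(142.0, 12.0, 150.0, 35.0))  # hot outlet above hot inlet
```

The argument names match keys stored in `latest_results`, so the check could be applied to the dictionary returned by `run`.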

## 7. Validate the surrogate inside a NeqSim process system

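We now compare the surrogate with the rigorous model at an operating point that is not in the training grid. Most inputs fall between sampled levels, but the hot-side pressure of 58 bara lies slightly above the sampled range of 40 to 55 bara, so this case also probes mild extrapolation, where a random forest cannot follow trends beyond its training data. The surrogate is added to a `ProcessSystem`, receives the hot and cold inlet streams, and produces outlet streams and duty predictions.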

In [None]:
validation_conditions = {
    'hot_in_T_C': 142.0,
    'hot_in_p_bara': 58.0,
    'hot_in_flow_MSm3_day': 1.25,
    'cold_in_T_C': 12.0,
    'cold_in_p_bara': 5.0,
    'cold_in_flow_kg_per_hr': 235_000.0,
    'UA_W_per_K': 125_000.0,
}

physics_record = simulate_case(
    case_id='validation_reference',
    hot_temp_C=validation_conditions['hot_in_T_C'],
    hot_press_bara=validation_conditions['hot_in_p_bara'],
    hot_flow_MSm3_day=validation_conditions['hot_in_flow_MSm3_day'],
    cold_temp_C=validation_conditions['cold_in_T_C'],
    cold_press_bara=validation_conditions['cold_in_p_bara'],
    cold_flow_kg_hr=validation_conditions['cold_in_flow_kg_per_hr'],
    ua_value=validation_conditions['UA_W_per_K'],
)

clearProcess()
hot_fluid_val = make_natural_gas_fluid()
hot_fluid_val.setTemperature(validation_conditions['hot_in_T_C'], 'C')
hot_fluid_val.setPressure(validation_conditions['hot_in_p_bara'], 'bara')
hot_fluid_val.setTotalFlowRate(validation_conditions['hot_in_flow_MSm3_day'], 'MSm3/day')

cold_fluid_val = make_cooling_water_fluid()
cold_fluid_val.setTemperature(validation_conditions['cold_in_T_C'], 'C')
cold_fluid_val.setPressure(validation_conditions['cold_in_p_bara'], 'bara')
cold_fluid_val.setTotalFlowRate(validation_conditions['cold_in_flow_kg_per_hr'], 'kg/hr')

hot_stream_ml = stream('validation_hot_in', hot_fluid_val)
cold_stream_ml = stream('validation_cold_in', cold_fluid_val)

ml_hex = MachineLearningHeatExchanger(regressor, feature_columns, target_columns)
ml_hex.setName('ML Heat Exchanger (validation)')
ml_hex.setHotInletStream(hot_stream_ml)
ml_hex.setColdInletStream(cold_stream_ml)
ml_hex.setUAvalue(validation_conditions['UA_W_per_K'])

process_ml = jNeqSim.processSimulation.processSystem.ProcessSystem()
process_ml.add(hot_stream_ml)
process_ml.add(cold_stream_ml)
process_ml.add(ml_hex)

hot_out_stream_ml = jNeqSim.processSimulation.processEquipment.stream.Stream('validation_hot_out')
hot_out_stream_ml.setStream(ml_hex.getOutStream(0))
cold_out_stream_ml = jNeqSim.processSimulation.processEquipment.stream.Stream('validation_cold_out')
cold_out_stream_ml.setStream(ml_hex.getOutStream(1))

process_ml.add(hot_out_stream_ml)
process_ml.add(cold_out_stream_ml)
process_ml.run()

comparison = pd.DataFrame({
    'Quantity': [
        'Hot outlet temperature [°C]',
        'Cold outlet temperature [°C]',
        'Heat duty [kW]',
    ],
    'NeqSim rigorous model': [
        physics_record['hot_out_T_C'],
        physics_record['cold_out_T_C'],
        physics_record['duty_kW'],
    ],
    'ML surrogate': [
        ml_hex.getOutStream(0).getTemperature('C'),
        ml_hex.getOutStream(1).getTemperature('C'),
        ml_hex.getDuty() / 1e3,
    ],
})
comparison['Absolute error'] = np.abs(comparison['NeqSim rigorous model'] - comparison['ML surrogate'])
comparison['Relative error [%]'] = 100.0 * comparison['Absolute error'] / np.maximum(np.abs(comparison['NeqSim rigorous model']), 1e-6)
comparison

In [None]:
ml_hex.toJson()
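
Before trusting a surrogate prediction, it helps to check whether the query lies inside the sampled envelope. The sketch below hard-codes the minimum and maximum levels from section 4 and reports which features of the validation point fall outside them. `TRAINING_RANGES` and `features_outside_range` are hypothetical names introduced for illustration, and the keys follow `feature_columns`. The hot-side mass flow is omitted because NeqSim derives it from the standard volumetric flow.

```python
# Min/max of the sampled levels in section 4 (hot mass flow omitted, see text).
TRAINING_RANGES = {
    'hot_in_T_C': (100.0, 160.0),
    'hot_in_p_bara': (40.0, 55.0),
    'cold_in_T_C': (5.0, 25.0),
    'cold_in_massflow_kg_per_hr': (180_000.0, 260_000.0),
    'UA_W_per_K': (90_000.0, 130_000.0),
}


def features_outside_range(point, ranges=TRAINING_RANGES):
    """Return the names of features in `point` that fall outside `ranges`."""
    return [
        name for name, (low, high) in ranges.items()
        if name in point and not low <= point[name] <= high
    ]


# The validation operating point from this section, keyed by feature name.
validation_point = {
    'hot_in_T_C': 142.0,
    'hot_in_p_bara': 58.0,
    'cold_in_T_C': 12.0,
    'cold_in_massflow_kg_per_hr': 235_000.0,
    'UA_W_per_K': 125_000.0,
}
print(features_outside_range(validation_point))
```

With the training `DataFrame` at hand, the ranges could instead be taken from `training_data[feature_columns].agg(['min', 'max'])`. A box check like this only tests each feature separately, so it is a necessary rather than sufficient condition for interpolation.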

## 8. Next steps

* Replace the random forest with other regressors (for example Gaussian process regression or gradient boosting) and compare accuracy and computational cost.
* Augment the feature set with additional measurements such as inlet enthalpies, pressure drops, or fouling factors if they are important in your plant.
* Combine the surrogate unit operation with online plant data to build soft sensors or to calibrate the model continuously.
* Deploy the surrogate in optimisation or control studies where a fast-running model is advantageous.
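
As a starting point for the first suggestion, note that scikit-learn's `GradientBoostingRegressor` fits a single target, whereas the random forest handles all three targets at once. A common workaround is to wrap it in `MultiOutputRegressor`, which trains one independent model per output column. The sketch below uses synthetic stand-in data with the same shapes as the notebook's features and targets. The arrays `X_gb` and `y_gb` are made up and do not come from NeqSim.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in data: 7 features and 3 targets, mirroring the shapes
# of feature_columns and target_columns (values are not from NeqSim).
X_gb = rng.uniform(size=(200, 7))
y_gb = np.column_stack([
    X_gb[:, 0] + X_gb[:, 1],
    X_gb[:, 2] * X_gb[:, 3],
    X_gb.sum(axis=1),
])

# One gradient-boosting model is fitted per target column.
gbr = MultiOutputRegressor(GradientBoostingRegressor(n_estimators=100, random_state=0))
gbr.fit(X_gb, y_gb)

print(gbr.predict(X_gb[:5]).shape)
```

The wrapped model exposes the same `fit` and `predict` methods as the random forest, so it should be usable as the `model` argument of `MachineLearningHeatExchanger` without further changes.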

## 9. Application possibilities enabled by the surrogate unit

The external `MachineLearningHeatExchanger` makes it straightforward to weave hybrid physics/data-driven models into full NeqSim process simulations. Below are representative opportunities to spark student discussions and project work.

1. **Rapid scenario screening for process engineers.** Because the surrogate predicts results in milliseconds, entire operating envelopes can be explored interactively. The code cell below sweeps inlet temperatures, cooling-water rates, and overall heat-transfer coefficients to map the resulting duty and outlet temperatures.
2. **Online digital twins and soft sensors.** The `toJson()` snapshot, combined with live plant measurements, can populate dashboards for fouling detection, steam-demand forecasting, or anomaly alerts without the computational expense of rigorous solvers.
3. **Closed-loop optimisation and control studies.** The unit can sit inside model-predictive-control (MPC) or reinforcement-learning (RL) environments that require fast rollouts. Students can swap the random forest for differentiable regressors (e.g., neural networks or Gaussian process regression) when gradient information is valuable; tree ensembles, including gradient-boosted trees, give piecewise-constant predictions and therefore no useful gradients.
4. **What-if energy integration and debottlenecking.** Multiple surrogates may be chained to approximate heat-recovery networks, enabling pinch-style trade-off studies or brownfield retrofit planning while the rigorous models remain available for final verification.
5. **Transfer learning with plant data.** New plant campaigns can be appended to the training data to adapt the surrogate to fouled or cleaned exchangers, illustrating how machine learning maintains fidelity throughout an asset's lifecycle.


In [None]:
scenario_space = []
for hot_temp in [130.0, 140.0, 150.0]:
    for cold_flow in [200_000.0, 250_000.0, 300_000.0]:
        for ua in [100_000.0, 125_000.0]:
            scenario_space.append({
                'case_id': f'T{int(hot_temp)}_CW{int(cold_flow/1000)}_UA{int(ua/1000)}',
                'hot_in_T_C': hot_temp,
                'hot_in_p_bara': 58.0,
                'hot_in_massflow_kg_per_hr': training_data['hot_in_massflow_kg_per_hr'].median(),
                'cold_in_T_C': 10.0,
                'cold_in_p_bara': 5.0,
                'cold_in_massflow_kg_per_hr': cold_flow,
                'UA_W_per_K': ua,
            })
scenario_df = pd.DataFrame(scenario_space)
predicted = regressor.predict(scenario_df[feature_columns])
scenario_results = scenario_df[['case_id', 'hot_in_T_C', 'cold_in_massflow_kg_per_hr', 'UA_W_per_K']].copy()
for idx, target in enumerate(target_columns):
    scenario_results[target] = predicted[:, idx]
scenario_results
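
Some values in this sweep (a hot-side pressure of 58 bara and cooling-water flows up to 300 t/h) lie outside the training grid. Tree ensembles such as random forests average training targets, so their predictions level off beyond the sampled range instead of following the trend. The self-contained sketch below demonstrates this on a synthetic one-dimensional problem, and the data are made up and unrelated to the heat exchanger.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic 1-D problem (illustrative only): y = 2x sampled on [0, 10].
x_rf = np.linspace(0.0, 10.0, 50).reshape(-1, 1)
y_rf = 2.0 * x_rf.ravel()

forest = RandomForestRegressor(n_estimators=50, random_state=0)
forest.fit(x_rf, y_rf)

inside = forest.predict([[5.0]])[0]    # within the training range
outside = forest.predict([[20.0]])[0]  # far beyond the training range

print(f"x = 5  -> {inside:.2f} (true value 10)")
print(f"x = 20 -> {outside:.2f} (true value 40)")
```

The prediction at `x = 20` cannot exceed the largest target seen in training, which is why sweep results outside the sampled envelope are best confirmed with the rigorous model.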


In [None]:
ua_levels = sorted(scenario_results['UA_W_per_K'].unique())
cw_values = sorted(scenario_results['cold_in_massflow_kg_per_hr'].unique())
hot_values = sorted(scenario_results['hot_in_T_C'].unique())

fig, axes = plt.subplots(2, len(ua_levels), figsize=(5 * len(ua_levels), 8), sharex=True, sharey=True)
axes = np.array(axes)
if axes.ndim == 1:
    axes = axes.reshape(2, 1)

for col, ua in enumerate(ua_levels):
    subset = scenario_results[np.isclose(scenario_results['UA_W_per_K'], ua)]
    pivot_duty = subset.pivot(index='cold_in_massflow_kg_per_hr', columns='hot_in_T_C', values='duty_kW').sort_index().sort_index(axis=1)
    pivot_cold = subset.pivot(index='cold_in_massflow_kg_per_hr', columns='hot_in_T_C', values='cold_out_T_C').sort_index().sort_index(axis=1)

    # Named grid_T/grid_flow to avoid overwriting the training feature matrix X.
    grid_T, grid_flow = np.meshgrid(pivot_duty.columns, pivot_duty.index / 1000.0)
    pcm0 = axes[0, col].pcolormesh(grid_T, grid_flow, pivot_duty.values, shading='nearest', cmap='inferno')
    pcm1 = axes[1, col].pcolormesh(grid_T, grid_flow, pivot_cold.values, shading='nearest', cmap='viridis')

    axes[0, col].set_title(f'UA = {ua:,.0f} W/K')
    axes[1, col].set_xlabel('Hot inlet T [°C]')

    fig.colorbar(pcm0, ax=axes[0, col], fraction=0.046, pad=0.04, label='Heat duty [kW]')
    fig.colorbar(pcm1, ax=axes[1, col], fraction=0.046, pad=0.04, label='Cold outlet T [°C]')

axes[0, 0].set_ylabel('Cooling-water flow [t/h]')
axes[1, 0].set_ylabel('Cooling-water flow [t/h]')

for ax in axes[1, :]:
    ax.set_xticks(hot_values)
for ax in axes[:, 0]:
    ax.set_yticks([val / 1000.0 for val in cw_values])

fig.suptitle('Surrogate-based scenario sweeping', fontsize=16, fontweight='bold', y=0.93)
plt.tight_layout(rect=[0, 0, 1, 0.9])
plt.show()


The tabulated predictions and heatmaps illustrate how the surrogate can support front-end engineering questions: identify duty spikes when hot-feed temperature rises, observe how additional cooling-water flow influences cold outlet temperature, or shortlist UA upgrades with the largest payoff. The same pattern can power dashboard widgets, optimisation routines, or operator-training simulators that require instant feedback.

Use the coloured panels to spot sweet spots or constraint violations at a glance, which is especially useful when briefing cross-disciplinary teams about the impact of uncertain feed conditions or equipment fouling.


