# Surrogate model

With the predictors detemined and presented in [Prediction](Prediction.ipynb) now it is possible to build the surrogate models based on the FCNN, LSTM and Decoder Predictor and finally test the surrogate models.

## Development

### Preliminary steps

Import modules

In [1]:
# HDF5 and Scientific computing
import h5py
import numpy as np
import scipy.integrate as it

# Neural networks libraries
import keras.layers as layers
from keras.models import Model, load_model

# Custom modules
from utils import plot_red_comp, format_data, anim_comp
from utils_keras import loss_norm_error

Using TensorFlow backend.


In [2]:
# Keras custom objects
from keras.utils.generic_utils import get_custom_objects

# Set custom keras functions
get_custom_objects().update({"loss_norm_error": loss_norm_error})

In [3]:
# Import plot modules
import matplotlib.pyplot as plt
from matplotlib import animation, rc
from IPython.display import HTML, Image

# Seaborn plot style
import seaborn as sns
sns.set()

# Matplotlib settings
rc('animation', html='html5')
%matplotlib inline

Directories where the models are stored

In [4]:
# Architecture
arch_fcnn = "fcnn_v2"
arch_lstm = "lstm_v2"
arch_dec = "dec_v2"

# Paths to the model and Optuna study
path_model = "./hyperparameter-optimisation/models/model_pred-{}.h5"
path_study = "./hyperparameter-optimisation/models/study_{}.pkl"

# Autoencoder path
ae_conv = "smp_v4"
ae_path = "./hyperparameter-optimisation/models/model_ae-{}.h5"

Load data generated from the OpenLB

In [5]:
# Datasets to load
dt_fl = "nn_data.h5"
dt_dst = "scaled_data"

# Open data file
f = h5py.File(dt_fl, "r")
dt = f[dt_dst]

In order to obtain the reduced dimension data, it was developed a Python script that splits the autoencoder model into an encoder and decoder part and then generates the data in the lower dimension space. After generating it, the data was standardized to remove the mean and set the standard deviation to be $1$. Finally, the data was saved in a file named `data_compact.h5`. The Python script can be accessed at [script](./pre_proc_latent.py).

In [6]:
# Reduced dimensions
dt_fl_red = "data_compact.h5"
dt_dst_red = "model_ae-smp_4_scaled"

# Open data file
f_red = h5py.File(dt_fl_red, "r")
dt_red = f_red[dt_dst_red]

#Get the mean and standard deviation of the standardised data
mean = dt_red.attrs["mean"]
std = dt_red.attrs["std"]

### Autoencoder

Load the encoder and decoder from the autoencoder discussed at [Autoencoder](Autoencoder.ipynb).

In [7]:
# Load the autoencoder model
ae = load_model(ae_path.format(ae_conv))

# Generate the encoder
inputs = layers.Input(shape=ae.layers[0].input_shape[1:])
enc_lyr = inputs
for layer in ae.layers[1:5]:
    enc_lyr = layer(enc_lyr)
enc = Model(inputs=inputs, outputs=enc_lyr)

# Generate the decoder
inputs = layers.Input(shape=ae.layers[5].input_shape[1:])
dec_lyr = inputs
for layer in ae.layers[5:]:
    dec_lyr = layer(dec_lyr)

dec = Model(inputs=inputs, outputs=dec_lyr)

Create a function to scale and reconstruct the data back.

In [8]:
def reconstruct(x, mean, std):
    return dec.predict((x + mean)*std)

### FCNN


Load the FCNN model.

In [9]:
# Load trained model
fcnn = load_model(path_model.format(arch_fcnn))

As the FCNN does not depend the time step it can be passed only the initial time and final time. It is possible to save intermediate time steps by passing the number of steps and setting the optional argument `save=True`.

In [10]:
def fcnn_surrogate(x, mean, std, t0=0, tf=0, stps=1, dt=10, save=False):
    
    # Set the FCNN as the function to integrate
    def func(t, y):
        return fcnn.predict(y[np.newaxis])[0]
    
    # Set the times to simulate
    if tf > 0:
        t_stp = np.linspace(t0, tf, stps+1)
    else:
        t_stp = np.array([t0 + dt*i for i in range(stps + 1)])
    t_stp = np.vstack([t0*np.ones(t_stp.shape), t_stp]).T[1:]
    
    # Encode the input data
    x_enc = enc.predict(x)
    # standardizing the data
    x_enc = (x_enc - mean)/std
    
    # Initialize the array to reconstruct
    x_rec = np.empty((stps**save, *x_enc.shape[1:]))
    
    for i, j in enumerate(t_stp):
        # Integrate using explicit Runge-Kutta method of order 5(4)
        out = it.solve_ivp(func, [j[0], j[1]], x_enc[0])
        x_rec[i] = out.y[:, -1]
        

    # Reconstruct the data
    return reconstruct(x_rec, mean, std)

### LSTM


Load the LSTM model.

In [11]:
# Load trained model
lstm = load_model(path_model.format(arch_lstm))
# LSTM window
wd = 4
lstm_x, lstm_idx = format_data(dt, wd=wd, cont=True, get_idx=True)
lstm_idx = {idx[0]: i for i, idx in enumerate(lstm_idx)}

Create a function where the input is data in its original dimensions and the output is the predicted next time step.

In [12]:
# lstm_x[[lstm_idx[3]]]
def lstm_surrogate(x, mean, std, stps=1, save=False):
    # Save the x shape
    shp = x.shape
    # Flat time dimension
    x = x.reshape((x.shape[0]*x.shape[1], *x.shape[2:]))

    # Encode the input data
    x_enc = enc.predict(x)
    # standardizing the data
    x_enc = (x_enc - mean)/std

    # Initialize the array to reconstruct
    x_rec = np.empty((stps**save, *x_enc.shape[1:]))
    # Create the time dimension back
    x_enc = x_enc.reshape((shp[0], shp[1], x_enc.shape[-1]))

    # Iterate n time steps
    for i in range(stps):
        # Get next step
        nxt_x_enc = lstm.predict(x_enc)
        # Roll array to replace the last time step
        x_enc = np.roll(x_enc, -1, axis=1)
        x_enc[:, -1, :] = nxt_x_enc
        # Store the step
        x_rec[i*save] = nxt_x_enc[0]

    # Reconstruct the data
    return reconstruct(x_rec, mean, std)

### Decoder predictor

Load the decoder predictor model.

In [13]:
# Load trained model
pdec = load_model(path_model.format(arch_dec))

Create a function where the input is data in its original dimensions and the output is the predicted next time step.

In [14]:
def dec_surrogate(x, mean, std, stps=1, save=False):
    # Initialize the array to reconstruct
    x_rec = np.empty((stps*save, *x.shape[1:]))
    
    # Iterate n time steps
    for i in range(stps):
        # Encode the input data
        x_enc = enc.predict(x)
        # standardizing the data
        x_enc = (x_enc - mean)/std
        # Next step
        x = pdec.predict(x_enc)
        x_rec[i] = x[0]
        
    
    return x_rec

## Videos

Once generated the surrogate models, it is possible to assess their quality in predict successively the next time steps. Therefore, it was generated a video for the surrogate simulations, where the pressure, temperature, and velocity being predicted are shown in sequence together with the original data, and the MSE errors are shown for each time step and variable.

The first test will be case first case, starting at the time step $10$ and ending at the time step $30$, this cases were trained.

In [15]:
# Start index
start = 10
stps = 30

In [16]:
# FCNN
red = fcnn_surrogate(dt[[start]], mean, std, t0=10*start, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
fcnn_anim = anim_comp(org, red, [2], n_dim=90, alg="FCNN")
fcnn_anim

In [17]:
# LSTM
# The LSTM network requires previous time steps
red = fcnn_surrogate(lstm_x[lstm_idx[start]], mean, std, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
lstm_anim = anim_comp(org, red, [2], n_dim=90, alg="LSTM")
lstm_anim

In [18]:
# Decoder predictor
red = dec_surrogate(dt[[start]], mean, std, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
dec_anim = anim_comp(org, red, [2], n_dim=90, alg="Decoder")
dec_anim

It is clear that both the FCNN and LSTM neural networks performed poorly while sucessfully predicing the next steps, they  quickly start making wrong predictions after a few time steps for all variables. At the beggining, the FCNN predict better than the LSTM. After the time step $21$ the LSTM had a lower MSE and the FCNN MSE value. However, the spatial distribution for both cases are considerably different than the original data. It is worth mentioning that for the FCNN and LSTM the errors for the pressure were not as high as for the velocity. On the other hand, the decoder predictor performed resonably well. The worst MSE during the simulation was of $0.4940$, at the time step $12$, for the velocity and then decreases.

Finally, it is important to note that the FCNN had a slightly better results than the LSTM. It is considerably smaller than the LSTM and does not require previous time steps to run.

It will be presented test dataset from the study case $1$.

In [19]:
# Start index
case = 1
start = 90
stps = 10

# Convert to the global index
start = dt.attrs['idx'][case - 1 ][0] + start

In [20]:
# FCNN
red = fcnn_surrogate(dt[[start]], mean, std, t0=10*start, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
fcnn_anim = anim_comp(org, red, [2], n_dim=90, alg="FCNN")
fcnn_anim

In [21]:
# LSTM
# The LSTM network requires previous time steps
red = fcnn_surrogate(lstm_x[lstm_idx[start]], mean, std, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
lstm_anim = anim_comp(org, red, [2], n_dim=90, alg="LSTM")
lstm_anim

In [22]:
# Decoder predictor
red = dec_surrogate(dt[[start]], mean, std, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
dec_anim = anim_comp(org, red, [2], n_dim=90, alg="Decoder")
dec_anim

The overall performance of the three methods was similar, differently for the previous case. Meanwhile, similarly for the previous study case, the performance of the FCNN was slightly better than the LSTM at the first time steps, and then by the end of the simulation the performance of the FCNN degraded quickly surpassing the LSTM. However, differently from the previous case, the Decoder predictor appears to settle at a fixed point spatial distribution afterwhile, despite lower MSE values.

Finally, it will be presented another study cases just for visual analysis.

### Study case 2

In [23]:
# Start index
case = 2
start = 20
stps = 30

# Convert to the global index
start = dt.attrs['idx'][case - 1 ][0] + start

In [24]:
# FCNN
red = fcnn_surrogate(dt[[start]], mean, std, t0=10*start, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
fcnn_anim = anim_comp(org, red, [2], n_dim=90, alg="FCNN")
fcnn_anim

In [25]:
# LSTM
# The LSTM network requires previous time steps
red = fcnn_surrogate(lstm_x[lstm_idx[start]], mean, std, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
lstm_anim = anim_comp(org, red, [2], n_dim=90, alg="LSTM")
lstm_anim

In [26]:
# Decoder predictor
red = dec_surrogate(dt[[start]], mean, std, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
dec_anim = anim_comp(org, red, [2], n_dim=90, alg="Decoder")
dec_anim

### Study case 4

In [27]:
# Start index
case = 4
start = 50
stps = 30

# Convert to the global index
start = dt.attrs['idx'][case - 1 ][0] + start

In [28]:
# FCNN
red = fcnn_surrogate(dt[[start]], mean, std, t0=10*start, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
fcnn_anim = anim_comp(org, red, [2], n_dim=90, alg="FCNN")
fcnn_anim

In [29]:
# LSTM
# The LSTM network requires previous time steps
red = fcnn_surrogate(lstm_x[lstm_idx[start]], mean, std, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
lstm_anim = anim_comp(org, red, [2], n_dim=90, alg="LSTM")
lstm_anim

In [30]:
# Decoder predictor
red = dec_surrogate(dt[[start]], mean, std, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
dec_anim = anim_comp(org, red, [2], n_dim=90, alg="Decoder")
dec_anim

### Study case 6

In [31]:
# Start index
case = 6
start = 20
stps = 30

# Convert to the global index
start = dt.attrs['idx'][case - 1 ][0] + start

In [32]:
# FCNN
red = fcnn_surrogate(dt[[start]], mean, std, t0=10*start, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
fcnn_anim = anim_comp(org, red, [2], n_dim=90, alg="FCNN")
fcnn_anim

In [33]:
# LSTM
# The LSTM network requires previous time steps
red = fcnn_surrogate(lstm_x[lstm_idx[start]], mean, std, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
lstm_anim = anim_comp(org, red, [2], n_dim=90, alg="LSTM")
lstm_anim

In [34]:
# Decoder predictor
red = dec_surrogate(dt[[start]], mean, std, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
dec_anim = anim_comp(org, red, [2], n_dim=90, alg="Decoder")
dec_anim

### Study case 13

In [35]:
# Start index
case = 13
start = 9
stps = 30

# Convert to the global index
start = dt.attrs['idx'][case - 1 ][0] + start

In [36]:
# FCNN
red = fcnn_surrogate(dt[[start]], mean, std, t0=10*start, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
fcnn_anim = anim_comp(org, red, [2], n_dim=90, alg="FCNN")
fcnn_anim

In [37]:
# LSTM
# The LSTM network requires previous time steps
red = fcnn_surrogate(lstm_x[lstm_idx[start]], mean, std, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
lstm_anim = anim_comp(org, red, [2], n_dim=90, alg="LSTM")
lstm_anim

In [38]:
# Decoder predictor
red = dec_surrogate(dt[[start]], mean, std, stps=stps, save=True)
org = dt[start + 1: start + stps + 1]
# Animate
dec_anim = anim_comp(org, red, [2], n_dim=90, alg="Decoder")
dec_anim

## Conclusion

### General comments

The three techniques presented a resonable performance during the training and tests against the test dataset. However, when they were coupled to build a surrogate model, the Decoder Predictor presented the best performance while the FCNN and LSTM presented similar results being the FCNN slightly better. It is important to note that the FCNN is a considerably smaller neural network and quite fast to train differently from the LSTM. The decoder predictor work reasonably well for the different cases analysed, however, it is considerably larger than the FCNN and LSTM neural networks. Also, it took more time to train than the FCNN and LSTM.

### Computational performance

Despite being costly to train, the surrogate based Decoder Predictor was the fastest for simulating, followed by the LSTM and for the last FCNN. The FCNN low computational performance it is due to the integration using explicit Runge-Kutta method of order 5(4), most part of the computational time during the simulation using the FCNN surrogate is on the Runge-Kutta integration library used.