# Lorenz '63 DIRESA tutorial

<a target="_blank" href="https://colab.research.google.com/github/gdepaepe/diresa/blob/main/diresa.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

### 1. Install packages
The `diresa` package depends on the `tensorflow` package. This tutorial also uses `numpy` and `matplotlib`. 

In [None]:
# Install needed packages
!pip install numpy
!pip install matplotlib
!pip install tensorflow
!pip install diresa

### 2. Load the dataset
In this tutorial, we compress the 3D Lorenz '63 butterfly into a 2D latent space. The file `lorenz.csv` contains a list of butterfly points, with three columns for the X, Y and Z coordinates. The DIRESA model has 2 inputs: the original dataset and a shuffled version of this dataset for the twin encoder.

In [None]:
!wget https://gitlab.com/etrovub/ai4wcm/public/diresa/-/raw/master/docs/lorenz.csv

In [None]:
import numpy as np

# Load the butterfly points: one row per point, columns X, Y, Z
data_file = "lorenz.csv"
data = np.loadtxt(data_file, delimiter=",")
print("Shape", data_file, ":", data.shape)

# First 30000 points for training, the remainder for validation
train = data[:30000]
val = data[30000:]

# Random row orderings for the shuffled (twin) datasets
id_train = np.random.permutation(train.shape[0])
id_val = np.random.permutation(val.shape[0])
train_twin = train[id_train]
val_twin = val[id_val]
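
The twin input is the same set of points in a different random order. A minimal sketch with synthetic stand-in data (the array `points` and the seed are illustrative and not part of the tutorial dataset) shows that indexing with a permutation reorders the rows without altering them:

```python
import numpy as np

rng = np.random.default_rng(seed=0)    # illustrative seed, for reproducibility
points = rng.normal(size=(8, 3))       # stand-in for the (N, 3) butterfly points

# A random permutation of the row indices plays the role of id_train / id_val
idx = rng.permutation(points.shape[0])
points_twin = points[idx]

# Same shape and same values per column, only the row order differs
same_rows = np.array_equal(np.sort(points, axis=0), np.sort(points_twin, axis=0))
print(points_twin.shape, same_rows)
```

Because only the order changes, each original sample is paired with a randomly chosen other sample from the same dataset.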

### 3. Build the DIRESA model
The `build_diresa` function builds a DIRESA model from convolutional, attention and/or dense layers. A DIRESA model can also be built from a custom encoder and decoder with the `diresa_model` function (see below). Here we build a model with an input shape of `(3,)` for the 3D butterfly points. The encoder has 3 dense layers with 40, 20 and 2 units (the last is the dimension of the latent space). The decoder mirrors the encoder. The DIRESA model has 3 loss functions: the reconstruction loss (usually the MSE), the covariance loss and a distance loss (here the MSE distance loss). The weights for the different loss functions are also specified.

In [None]:
from diresa.models import build_diresa
from diresa.loss import mse_dist_loss, LatentCovLoss

diresa = build_diresa(input_shape=(3,), dense_units=(40, 20, 2))

diresa.compile(loss=['MSE', LatentCovLoss(), mse_dist_loss], loss_weights=[1., 3., 1.5], optimizer="adam")
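
The tutorial does not spell out the distance loss. As rough intuition only, a distance-preserving loss can compare how far apart a sample and its twin are in the original space with how far apart their encodings are in latent space. The NumPy sketch below is an assumption about the general idea, not the `mse_dist_loss` implementation, and `dist_mse` is a hypothetical name:

```python
import numpy as np

def dist_mse(x, x_twin, z, z_twin):
    """Hypothetical distance-preservation penalty (not diresa's mse_dist_loss).

    Compares the squared Euclidean distance of each (sample, twin) pair in the
    original space with the squared distance of the same pair in latent space.
    """
    d_orig = np.sum((x - x_twin) ** 2, axis=1)
    d_latent = np.sum((z - z_twin) ** 2, axis=1)
    return np.mean((d_orig - d_latent) ** 2)

x = np.array([[0., 0., 0.], [1., 2., 2.]])
x_twin = np.array([[3., 4., 0.], [1., 2., 2.]])

# A latent encoding that keeps the pair distances intact gives zero penalty
z = np.array([[0., 0.], [1., 1.]])
z_twin = np.array([[3., 4.], [1., 1.]])
print(dist_mse(x, x_twin, z, z_twin))

# A collapsed latent space (all points identical) is penalized
z_flat = np.zeros((2, 2))
print(dist_mse(x, x_twin, z_flat, z_flat))
```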

To reduce the effort of tuning the loss weights, we use annealing for the covariance loss. The covariance weight then starts from an initial value (here the Keras backend variable `cov_weight` is initialized to 0.) and is increased until the covariance loss reaches a certain target.

In [None]:
from tensorflow.keras import backend as K
from diresa.callback import LossWeightAnnealing

cov_weight = K.variable(0.)
diresa.compile(loss=['MSE', LatentCovLoss(cov_weight), mse_dist_loss], loss_weights=[1., 1., 1.], optimizer="adam")
diresa.summary(expand_nested=True)
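
The annealing idea can be pictured as a small update applied once per epoch. This is a sketch under stated assumptions, not the `LossWeightAnnealing` source: the function `anneal_weight` is hypothetical, an additive step and 0-based epoch numbering are assumed, and the loss values are made up for the demonstration. The parameter names mirror those passed to the callback in the training step.

```python
def anneal_weight(weight, cov_loss, epoch,
                  target_loss=1e-4, anneal_step=0.2, start_epoch=3):
    """Hypothetical per-epoch update: raise the weight while the loss exceeds the target."""
    if epoch >= start_epoch and cov_loss > target_loss:
        return weight + anneal_step   # assumption: additive increase
    return weight

weight = 0.0
# Illustrative covariance-loss values, one per epoch (not real training output)
losses = [0.05, 0.04, 0.03, 0.02, 0.01, 0.00005, 0.00005]
history = []
for epoch, cov_loss in enumerate(losses):
    weight = anneal_weight(weight, cov_loss, epoch)
    history.append(round(weight, 1))
print(history)   # [0.0, 0.0, 0.0, 0.2, 0.4, 0.4, 0.4]
```

The weight stays at its initial value before `start_epoch`, grows while the covariance loss is above the target, and stops growing once the target is reached.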

### 4. Train the DIRESA model
We train the DIRESA model in a standard way. The model is fit with 2 inputs: the original dataset and the shuffled dataset. It has 3 outputs, so 3 targets are passed: the first is the original dataset, used for the reconstruction loss; the last 2 are not used, but are needed in Keras 3. The batch size should be large enough for the covariance loss, which computes the covariance matrix of the latent components over the batch. In the `LossWeightAnnealing` callback, we specify the target (`target_loss`) for the mean squared covariance between the latent components, the step size by which the annealing weight factor is increased (`anneal_step`) and the epoch from which annealing starts (`start_epoch`). If annealing is not used, the `fit` method is called without the callback.

In [None]:
callback = [LossWeightAnnealing(cov_weight, target_loss=0.0001, anneal_step=0.2, start_epoch=3)]
diresa.fit((train, train_twin), (train, train, train), 
           validation_data=((val, val_twin), (val, val, val)),
           epochs=20, batch_size=512, shuffle=True, verbose=2, callbacks=callback)
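
To see why the batch size matters, recall that the covariance loss is estimated from the latent vectors of a single batch. The NumPy sketch below computes a mean squared off-diagonal covariance on synthetic data. It is an illustration only, not `LatentCovLoss` (the library's normalization and averaging may differ), and `mean_sq_cov` is a hypothetical helper:

```python
import numpy as np

def mean_sq_cov(latent):
    """Mean of the squared off-diagonal entries of the batch covariance matrix."""
    cov = np.cov(latent, rowvar=False)                  # (d, d) covariance over the batch
    off_diag = cov[~np.eye(cov.shape[0], dtype=bool)]   # drop the variances on the diagonal
    return np.mean(off_diag ** 2)

rng = np.random.default_rng(seed=0)
a = rng.normal(size=512)
b = rng.normal(size=512)
independent = np.stack([a, b], axis=1)             # roughly uncorrelated components
correlated = np.stack([a, a + 0.1 * b], axis=1)    # strongly correlated components
print(mean_sq_cov(independent), mean_sq_cov(correlated))
```

With very small batches the sample covariance of even independent components is noisy, so the estimate becomes unreliable.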

### 5. Encoder and decoder submodel
We cut out the encoder and decoder submodels with the `encoder_decoder` function. If a dataset is given, the R2-scores of the latent components are calculated and a ranking layer, which orders the latent components based on the R2-scores, is added to the submodels.

In [None]:
from diresa.toolbox import encoder_decoder
compress_model, decode_model = encoder_decoder(diresa, dataset=val)
latent = compress_model.predict(val)
predict = decode_model.predict(latent)

### 6. Show latent space
We plot the 2D latent space.

In [None]:
import matplotlib.pyplot as plt
plt.figure()
plt.title("Latent space")
plt.scatter(latent[:, 0], latent[:, 1], marker='.', s=0.1, color='C2')
plt.show()

### 7. Original versus decoded dataset
We compare the original dataset with the decoded one.

In [None]:
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(val[:, 0], val[:, 1], val[:, 2], marker='.', s=0.1)
ax.scatter(predict[:, 0], predict[:, 1], predict[:, 2], marker='.', s=0.1, color='C1')
plt.show()
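
A numeric check can complement the visual comparison. The sketch below computes the mean squared reconstruction error per coordinate and overall. The helper `reconstruction_mse` and the small arrays are illustrative; in the notebook, `val` and `predict` would be passed instead.

```python
import numpy as np

def reconstruction_mse(original, decoded):
    """Mean squared error per coordinate (X, Y, Z) and averaged over coordinates."""
    per_axis = np.mean((original - decoded) ** 2, axis=0)
    return per_axis, float(np.mean(per_axis))

# Synthetic stand-ins for the original and decoded points
original = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
decoded = np.array([[1.0, 2.0, 2.0], [4.0, 7.0, 6.0]])
per_axis, overall = reconstruction_mse(original, decoded)
print(per_axis, overall)
```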

### 8. A convolutional and attention example
If your dataset consists of several variables (e.g. temperature and pressure, so 2 variables) over a 2-dimensional grid, convolutional layers can be used in the encoder/decoder. Here is an example for a grid (y, x) = (32, 64). The dataset would then have a shape of (nbr_of_samples, 32, 64, 2). We use a stack of 3 convolutional/max-pooling blocks in the encoder (the decoder mirrors the encoder). The first block uses 3 Conv2D layers, the second block 2 and the third block 1, followed by a MaxPooling2D layer (`stack=(3, 2, 1)`). The number of filters in the first block is 32, in the second 16 and in the third 8 (`stack_filters=(32, 16, 8)`). The number of filters in the latent space, before flattening, is 1 (`latent_filters=1`). This results in a latent size (before flattening) of (8, 16, 1).

In [10]:
diresa = build_diresa(input_shape=(32, 64, 2), stack=(3, 2, 1), stack_filters=(32, 16, 8), latent_filters=1)
diresa.summary(expand_nested=True)
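
The spatial part of the latent size follows from how often the grid is halved by pooling. The helper below is hypothetical and only does the arithmetic for non-overlapping pooling. How many pooling steps `build_diresa` applies for a given `stack` is not verified here, so the model summary remains the reference.

```python
def pooled_shape(grid, n_pool, pool_size=2):
    """Grid size after n_pool non-overlapping pooling steps (hypothetical helper)."""
    y, x = grid
    for _ in range(n_pool):
        y, x = y // pool_size, x // pool_size
    return y, x

for n_pool in (1, 2, 3):
    print(n_pool, pooled_shape((32, 64), n_pool))
```

For the (32, 64) grid, two halvings give (8, 16) and three give (4, 8).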




We can add an attention layer after the last convolutional layer in a block to capture long-range relations. Here we add an attention layer in the second and third blocks. After the convolutional/attention blocks, 2 dense layers are added, bringing the dimension of the latent space to 10.

In [None]:
diresa = build_diresa(input_shape=(32, 64, 2), stack=(3, 2, 1), stack_filters=(32, 16, 8), attention=(False, True, True), dense_units=(30, 10))
diresa.summary(expand_nested=True)

### 9. Build DIRESA with custom encoder and decoder
We can also build DIRESA models with custom encoder and decoder (reconstruction) models. We define those two here.

In [11]:
from tensorflow.keras import layers, Input
from tensorflow.keras.models import Model
def encoder_model(input_shape=(3,), output_shape=2, units=40):
    x = Input(shape=input_shape)
    y = layers.Dense(units=units, activation="relu")(x)
    y = layers.Dense(units=units // 2, activation="relu")(y)
    y = layers.Dense(output_shape, activation="linear")(y)
    model = Model(x, y, name="Encoder")
    return model
def decoder_model(input_shape=(2,), output_shape=3, units=40):
    x = Input(shape=input_shape)
    y = layers.Dense(units=units // 2, activation="relu")(x)
    y = layers.Dense(units=units, activation="relu")(y)
    y = layers.Dense(output_shape, activation="linear")(y)
    model = Model(x, y, name="Recon")
    return model
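
As a sanity check on custom models, the number of trainable parameters can be worked out by hand and compared with the output of `summary()`. The sketch assumes plain `Dense` layers with bias terms (the Keras default), and `dense_param_count` is a hypothetical helper:

```python
def dense_param_count(input_dim, units):
    """Trainable parameters of a stack of Dense layers with bias terms."""
    total = 0
    for n_out in units:
        total += input_dim * n_out + n_out   # weights + biases
        input_dim = n_out
    return total

encoder_params = dense_param_count(3, (40, 20, 2))   # 3 -> 40 -> 20 -> 2
decoder_params = dense_param_count(2, (20, 40, 3))   # 2 -> 20 -> 40 -> 3
print(encoder_params, decoder_params)   # 1022 1023
```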




Based on the custom encoder and decoder model, we now build the DIRESA model with the `diresa_model` function. 

In [12]:
from diresa.models import diresa_model
from diresa.loss import mse_dist_loss, LatentCovLoss

diresa = diresa_model(x=Input(shape=(3,)), x_twin=Input(shape=(3,)), encoder=encoder_model(), decoder=decoder_model())

diresa.compile(loss=['MSE', LatentCovLoss(), mse_dist_loss], loss_weights=[1., 3., 1.])
diresa.summary(expand_nested=True)




### 10. Reference

In [13]:
from diresa import *
help(models)
#help(loss)

Help on module diresa.models in diresa:

NAME
    diresa.models

DESCRIPTION
    Creates DIRESA and (V)AE models out of an encoder and decoder model
    Creates DIRESA and AE models from hyperparameters
    :Author:  Geert De Paepe
    :Email:   geert.de.paepe@vub.be
    :License: Apache 2.0
    
    1. Creating (V)AE and Diresa models out of an encoder and decoder model:
     - autoencoder_model(x, encoder, decoder)
     - diresa_model(x, x_twin, encoder, decoder)
    
    2. Creating AE and Diresa models from hyperparameters
       - build_ae(input_shape, stack, stack_filters, latent_filters, kernel_size=(3, 3),
                  conv_transpose=False, up_first=False, residual=False, batchnorm=False,
                  dense_units=(), activation='relu', encoder_activation='linear', decoder_activation='linear')
       - build_diresa(input_shape, stack, stack_filters, latent_filters, kernel_size=(3, 3),
                      conv_transpose=False, up_first=False, residual=False, batchnorm