# Example of VQ-VAE with the MNIST dataset

In [1]:
# First install the library

# %pip install aepy

Rapidae is built on Keras 3, which supports multiple backends.
We can select among the three available backends (TensorFlow, PyTorch and JAX) by setting the environment variable "KERAS_BACKEND".
The next cell sets it.

In [2]:
import os

os.environ["KERAS_BACKEND"] = "torch"
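
Keras reads this variable when it is first imported, so it has to be set before any `import keras` runs. A minimal sketch of a guard for that, where `select_backend` and `VALID_BACKENDS` are hypothetical names used for illustration and are not part of Rapidae or Keras:

```python
import os
import sys

# The three backends named in the text above
VALID_BACKENDS = ("tensorflow", "torch", "jax")

def select_backend(name):
    """Set KERAS_BACKEND, rejecting unknown names and calls made too late."""
    if name not in VALID_BACKENDS:
        raise ValueError(f"Unknown backend {name!r}; expected one of {VALID_BACKENDS}")
    if "keras" in sys.modules:
        # The backend cannot be switched once keras has been imported
        raise RuntimeError("keras is already imported; restart the kernel first")
    os.environ["KERAS_BACKEND"] = name
    return name

select_backend("torch")
```

If the backend needs to change later, restarting the kernel and rerunning this cell first is the safe route.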

In [3]:
import sys

notebook_dir = os.path.abspath('')
sys.path.append(os.path.join(notebook_dir, '..', 'src'))

from keras import utils
from rapidae.pipelines.training import TrainingPipeline
from rapidae.models.base.default_architectures import Encoder_Conv_VQ_MNIST, Decoder_Conv_VQ_MNIST
from rapidae.models.vq_vae.vq_vae_model import VQ_VAE
from rapidae.data.utils import display_diff
from rapidae.data.datasets import load_MNIST

# For reproducibility in Keras 3. This will set:
# 1) `numpy` seed
# 2) backend random seed
# 3) `python` random seed
utils.set_random_seed(1)


### Download and preprocess the dataset

Download and preprocess the dataset. In this example, the selected dataset is the well-known MNIST, composed of images of handwritten digits.

The "persistant" parameter of load_MNIST() is a flag that determines whether the dataset is cached in the datasets folder.

Since we are using convolutional layers, we don't need to flatten the data.

Train and test labels are converted to one-hot encoding.

In [None]:
# Load MNIST dataset
x_train, y_train, x_test, y_test = load_MNIST(persistant=True)

x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

# Obtain the number of classes
n_classes = len(set(y_train))

# Convert labels to categorical
y_train = utils.to_categorical(y_train, n_classes)
y_test = utils.to_categorical(y_test, n_classes)
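
To make the two preprocessing steps concrete, the following NumPy-only sketch reproduces them on a tiny made-up array. The `one_hot` helper is a hypothetical stand-in that mimics what `utils.to_categorical` does for integer labels; it is not the Keras implementation.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Return a (len(labels), n_classes) float32 matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.shape[0], n_classes), dtype="float32")
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

# Toy stand-ins for MNIST: two 2x2 "images" with uint8 pixels and integer labels
x = np.array([[[0, 255], [51, 102]], [[255, 0], [0, 255]]], dtype="uint8")
y = np.array([2, 0])

x_scaled = x.astype("float32") / 255   # pixel values now lie in [0, 1]
y_encoded = one_hot(y, n_classes=3)    # one row per label, one column per class

print(x_scaled.min(), x_scaled.max())
print(y_encoded)
```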

### Model creation


This step creates the model. The encoder and decoder are taken from the Keras tutorial. They are very similar to the ones used in the vanilla VAE example for MNIST, with a few changes related to the latent space, since a VQ-VAE has no 'z_mean' and 'z_log_var' arrays.

In [None]:
# Model creation
model = VQ_VAE(input_dim=(x_train.shape[1], x_train.shape[2]),
               latent_dim=2, encoder=Encoder_Conv_VQ_MNIST, decoder=Decoder_Conv_VQ_MNIST, layers_conf=[32, 64])
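
Instead of sampling from a distribution described by 'z_mean' and 'z_log_var', a VQ-VAE replaces each encoder output vector with its nearest entry in a learned codebook. The NumPy sketch below illustrates only that lookup step, using a random codebook with assumed sizes. The names `quantize`, `codebook` and `z_e` are hypothetical, and this is not Rapidae's internal code.

```python
import numpy as np

def quantize(z_e, codebook):
    """Map each row of z_e to its nearest codebook vector (Euclidean distance)."""
    # Squared distances between every encoder vector and every codebook entry,
    # giving an array of shape (n_vectors, n_embeddings)
    distances = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = distances.argmin(axis=1)
    return codebook[indices], indices

rng = np.random.default_rng(1)
codebook = rng.normal(size=(8, 2))   # 8 embeddings of dimension 2 (assumed sizes)
z_e = rng.normal(size=(5, 2))        # 5 encoder output vectors

z_q, indices = quantize(z_e, codebook)
print(indices)      # discrete codes, one per encoder vector
print(z_q.shape)    # same shape as z_e
```

The number of codebook rows and their dimension correspond to the number and dimensionality of the embeddings, which are the main quantities to tune in this kind of model.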

### Training pipeline 

Define the training pipeline. Here you can set hyperparameters for the training phase of the autoencoder, such as learning rate, batch size, number of epochs, etc.
You can also attach callbacks to the model.

In [None]:
pipe = TrainingPipeline(name='training_pipeline_mnist_vq_vae',
                        model=model, num_epochs=20)

trained_model = pipe(x=x_train, y=y_train)

### Evaluation step

Let's now check the performance of this model. The original images are shown in the first row and the reconstructions in the second. These results look decent, but you can experiment with hyperparameters such as the number and dimensionality of the embeddings to improve them.

In [None]:
reconstructions_test = trained_model.predict(x_test)

display_diff(x_test, reconstructions_test['recon'])
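
Alongside the visual comparison, a numeric score can help when comparing hyperparameter settings. Below is a minimal sketch of a per-image mean squared error, run on random arrays so that it works standalone. `per_image_mse` is a hypothetical helper; in the notebook the inputs would be `x_test` and `reconstructions_test['recon']`, assuming both hold the same number of pixels per image.

```python
import numpy as np

def per_image_mse(originals, reconstructions):
    """Mean squared error for each image, flattening all non-batch axes."""
    a = np.asarray(originals, dtype="float32").reshape(len(originals), -1)
    b = np.asarray(reconstructions, dtype="float32").reshape(len(reconstructions), -1)
    return ((a - b) ** 2).mean(axis=1)

rng = np.random.default_rng(1)
originals = rng.random((4, 28, 28))                   # stand-in for x_test
noise = rng.normal(scale=0.05, size=(4, 28, 28, 1))
reconstructions = originals[..., None] + noise        # stand-in with a trailing channel axis

errors = per_image_mse(originals, reconstructions)
print(errors.shape)    # one error value per image
print(errors.mean())   # average reconstruction error over the batch
```

Flattening both inputs lets the helper accept reconstructions with or without a trailing channel axis.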