# CVAE

In a conditional variational autoencoder (CVAE), both the encoder and the decoder receive, in addition to the input data, an embedding vector that represents a condition. Because the decoder gets the condition as an extra input, the encoder does not need to represent it in the latent space.

In this example we use the MNIST dataset: the encoder can regress out the condition (the digit class) and learn the handwriting style as the latent representation.
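
The conditioning described above can be sketched with plain NumPy. This is only an illustration of the idea, not the way `rapidae` implements it: the one-hot encoding of the digit label, the helper `one_hot`, and the toy batch are all assumptions made for this sketch.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Return a (len(labels), n_classes) one-hot matrix (hypothetical helper)."""
    out = np.zeros((labels.shape[0], n_classes), dtype=np.float32)
    out[np.arange(labels.shape[0]), labels] = 1.0
    return out

rng = np.random.default_rng(0)
x = rng.random((4, 784), dtype=np.float32)  # toy batch of flattened 28x28 images
y = np.array([3, 0, 7, 3])                  # toy digit labels (the condition)
c = one_hot(y, n_classes=10)                # assumed encoding of the condition

# Encoder side: the condition is appended to the image vector.
encoder_input = np.concatenate([x, c], axis=1)

# Decoder side: the condition is appended to the latent code (2-D, as in this notebook).
z = rng.standard_normal((4, 2)).astype(np.float32)
decoder_input = np.concatenate([z, c], axis=1)

print(encoder_input.shape, decoder_input.shape)
```

Since the decoder sees `c` directly, the latent code `z` is free to describe what the label does not, such as stroke thickness or slant.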

In [None]:
# Install the library

!pip install rapidae

In [1]:
%load_ext autoreload
%autoreload 2

import os
import sys

notebook_dir = os.path.abspath('') # get the current notebook directory
sys.path.append(os.path.join(notebook_dir, '..', 'src')) # add src folder to path to import modules
                                                        # '..', 'src' if you are in the 'examples' folder

In [3]:
from rapidae.data import load_dataset
from rapidae.models import CVAE
from rapidae.models.base import VAE_Encoder_MLP, VAE_Decoder_MLP
from rapidae.pipelines import TrainingPipeline
from rapidae.evaluate import plot_latent_space, plot_reconstructions

### Data

In [4]:
# Load MNIST dataset
data = load_dataset("MNIST")

# Flatten each image to a vector and scale pixel values to [0, 1]
x_train = data["x_train"].reshape(data["x_train"].shape[0], -1) / 255
x_test = data["x_test"].reshape(data["x_test"].shape[0], -1) / 255
y_train = data["y_train"]
y_test = data["y_test"]

print("Data shape:", x_train.shape)

2024-05-10 11:59:15 [INFO]: Downloading data...


Data shape: (60000, 784)


Keep only the images of digits 0–7.

In [5]:
import numpy as np

# Indices of training samples whose label is neither 8 nor 9
idx_07 = np.where((data["y_train"] != 8) & (data["y_train"] != 9))[0]
x_train_07 = x_train[idx_07]
y_train_07 = y_train[idx_07]
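
A filter that keeps digits 0–7 can also be written with `np.isin`, which states the set of wanted labels directly. The sketch below runs on small made-up arrays (`labels` and `features` are hypothetical stand-ins for `y_train` and `x_train`), so it can be checked by eye.

```python
import numpy as np

labels = np.array([0, 8, 3, 9, 7, 8, 1])              # made-up labels
features = np.arange(labels.size * 2).reshape(-1, 2)  # made-up data rows

mask = np.isin(labels, np.arange(8))  # True where the label is in 0..7
idx = np.where(mask)[0]               # positions of the kept samples

kept_labels = labels[idx]
kept_features = features[idx]
print(kept_labels)
```

Indexing the data and the labels with the same `idx` keeps both arrays aligned row by row.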

### Model

In [13]:
input_dim = x_train.shape[1]
n_classes = np.unique(y_train).shape[0]
# Model creation
model = CVAE(input_dim=input_dim,
             latent_dim=2,
             encoder=VAE_Encoder_MLP(input_dim=input_dim, latent_dim=2),
             decoder=VAE_Decoder_MLP(input_dim=input_dim, latent_dim=2),
             n_classes=n_classes)

2024-05-10 12:03:10 [INFO]: Using provided encoder
2024-05-10 12:03:11 [INFO]: Using provided decoder


### Training

In [14]:
pipe = TrainingPipeline(name='CVAE_MNIST', 
                        learning_rate=0.001,
                        model=model, 
                        num_epochs=30, 
                        batch_size=128,
                        graph_mode=True,)

trained_model = pipe(x=(x_train, y_train))

2024-05-10 12:03:12 [INFO]: +++ CVAE_MNIST +++
2024-05-10 12:03:12 [INFO]: Creating folder in ./output_dir/CVAE_MNIST_2024-05-10_12-03
2024-05-10 12:03:12 [INFO]: 
TRAINING STARTED
	Backend: tensorflow
	Eager mode: True
	Validation data available: False
	Callbacks set: ['EarlyStopping', 'ModelCheckpoint']


Epoch 1/30

Epoch 1: loss improved from inf to 10.83467, saving model to ./output_dir/CVAE_MNIST_2024-05-10_12-03/model.weights.h5
469/469 - 27s - 57ms/step - kl_loss: 0.3622 - loss: 10.8347 - reconstruction_loss: 10.4701
Epoch 2/30

Epoch 2: loss improved from 10.83467 to 7.94343, saving model to ./output_dir/CVAE_MNIST_2024-05-10_12-03/model.weights.h5
469/469 - 26s - 55ms/step - kl_loss: 0.0054 - loss: 7.9434 - reconstruction_loss: 7.9370
Epoch 3/30

Epoch 3: loss improved from 7.94343 to 7.68684, saving model to ./output_dir/CVAE_MNIST_2024-05-10_12-03/model.weights.h5
469/469 - 27s - 57ms/step - kl_loss: 0.0051 - loss: 7.6868 - reconstruction_loss: 7.6806
Epoch 4/30

Epoch 4: loss improved from 7.68684 to 7.62186, saving model to ./output_dir/CVAE_MNIST_2024-05-10_12-03/model.weights.h5
469/469 - 26s - 55ms/step - kl_loss: 0.0037 - loss: 7.6219 - reconstruction_loss: 7.6170
Epoch 5/30

Epoch 5: loss improved from 7.62186 to 7.53352, saving model to ./output_dir/CVAE_MNIST_2024-05-

2024-05-10 12:15:52 [INFO]: Restoring best model
2024-05-10 12:15:52 [INFO]: Best model restored


Apparently z is a tuple; find out why.