# Deep Learning
## Formative assessment
### Week 10: Variational autoencoder

#### Instructions

In this notebook, you will write code to implement the variational autoencoder algorithm for an image dataset of celebrity faces. You will use the trained encoder and decoder networks to reconstruct and generate images. You will also see how the latent space encodes high-level information about the images.

Some code cells are provided for you in the notebook. You should avoid editing provided code, and make sure to execute the cells in order to avoid unexpected errors. Some cells begin with the line:

`#### GRADED CELL ####`

These cells require you to write your own code to complete them.

#### Let's get started!

We'll start by running some imports, and loading the dataset.

In [None]:
#### PACKAGE IMPORTS ####

# Run this cell first to import all required packages. Do not make any imports elsewhere in the notebook

import keras
from keras import ops
import tensorflow as tf
import torch
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from pathlib import Path

from keras import Sequential, Model
from keras.layers import Layer, Input, Dense, Flatten, Reshape, Conv2D, Conv2DTranspose, BatchNormalization
from keras.metrics import Mean

<center><img src="figures/celeba.png" title="CelebA" style="width: 650px;"/></center>

#### The Large-scale CelebFaces Attributes (CelebA) Dataset

For this assignment you will use a subset of the [CelebFaces Attributes (CelebA) dataset](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html). The full dataset contains over 200K colour images of the faces of celebrities, together with tagged attributes such as 'Smiling', 'Wearing glasses', or 'Wearing lipstick'. It also contains information about bounding boxes and facial part localisation. CelebA is a popular dataset that is commonly used for face attribute recognition, face detection, landmark (or facial part) localisation, and face editing and synthesis.

* Z. Liu, P. Luo, X. Wang, and X. Tang. "Deep Learning Face Attributes in the Wild", Proceedings of International Conference on Computer Vision (ICCV), 2015.

Your goal is to implement the variational autoencoder algorithm for a subset of the CelebA dataset. For practical reasons we will keep the dataset and the network size relatively small.

#### Load and preprocess the dataset

For this assignment, you will use a subset of the CelebA dataset. Note that the full dataset can be downloaded from [the CelebA dataset webpage](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html), but this is not necessary for this assignment.

In addition, attribute labels for the subset have been saved in the CSV file `list_attr_celeba_subset.csv`. We will use this in the last part of the assignment.

You should now write the following `load_dataset` function to create a `tf.data.Dataset` object from the files saved in the images folder.

* The function takes the arguments `split` (which will be one of the strings `"train"`, `"val"` or `"test"`), `batch_size`, an optional `shuffle_buffer`, and an `image_dir` string
* The function should create a Dataset containing the filepaths saved in the corresponding `split` subfolder of the `image_dir` directory
* The function should include a nested/inner function used to map the Dataset
  * This function will take the `filepath` as an argument
  * It should read the contents of the file saved at `filepath` - this will be a jpeg image
  * It should then decode the jpeg and scale the pixel values to lie in the range $[0, 1]$
  * You should use the [`set_shape`](https://www.tensorflow.org/api_docs/python/tf/Tensor#set_shape) Tensor method to fix the (static) shape of the image Tensor to `(64, 64, 3)`
  * It should then return the image Tensor
* The function should then apply the nested function using the `map` method
* If `shuffle_buffer` is not None, then it should be used to shuffle the Dataset
* It should then batch the Dataset using the `batch_size` argument
* Finally, the function should prefetch the Dataset using the argument `tf.data.AUTOTUNE`
* The function should then return the Dataset

_Hint: The Dataset can be created using_ `tf.data.Dataset.list_files`, _and using a wildcard character_ `'*.jpg'`_. Make sure that you set_ `shuffle=False` _when calling this method._
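
As a rough illustration of two of the steps above, the sketch below builds the wildcard pattern for one split and scales a made-up `uint8` image into $[0, 1]$. It uses NumPy rather than TensorFlow, and `fake_image` is a hypothetical stand-in for a decoded jpeg. Dividing by 255 is one common way to do the scaling; it is not the only way to meet the specification.

```python
from pathlib import Path
import numpy as np

# Wildcard pattern for the jpeg files of one split
# (the directory matches the default image_dir in the function signature)
pattern = str(Path('data', 'images', 'train', '*.jpg'))

# Hypothetical stand-in for a decoded 64x64 RGB jpeg (uint8 values in 0..255)
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Cast to float32 and divide by the largest uint8 value to land in [0, 1]
scaled_image = fake_image.astype(np.float32) / 255.0

print(pattern)
print(scaled_image.dtype, scaled_image.shape)
print(float(scaled_image.min()), float(scaled_image.max()))
```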

In [None]:
#### GRADED CELL ####

# Complete the following function. 
# Make sure to not change the function name or arguments.

def load_dataset(split, batch_size, shuffle_buffer=None, image_dir=str(Path('data', 'images'))):
    """
    This function should create a tf.data.Dataset object for one of the train/valid/test
    splits, according to the above specification.
    It should then return the Dataset.
    """
    dataset = tf.data.Dataset.list_files(str(Path(image_dir, split, '*.jpg')), shuffle=False)
    
    def load_image(filepath):
        raw_img = tf.io.read_file(filepath) 
        img_tensor = tf.image.decode_jpeg(raw_img, channels=3)
        img_tensor.set_shape((64, 64, 3))
        img_tensor = tf.image.convert_image_dtype(img_tensor, tf.float32)
        return img_tensor
    
    dataset = dataset.map(load_image)
    if shuffle_buffer is not None:
        dataset = dataset.shuffle(shuffle_buffer)
    dataset = dataset.batch(batch_size)
    dataset = dataset.prefetch(tf.data.AUTOTUNE)
    return dataset

In [None]:
# Use your function to obtain the train, valid and test Datasets

train_ds = load_dataset('train', 32, shuffle_buffer=500)
valid_ds = load_dataset('val', 32)
test_ds = load_dataset('test', 8)

In [None]:
# Display a few examples

n_rows, n_cols = 4, 8
f, axs = plt.subplots(n_rows, n_cols, figsize=(16, 8))

for img_batch in train_ds.take(1):
    img_batch = ops.convert_to_numpy(img_batch)
    for n, image in enumerate(img_batch):
        i = n // n_cols
        j = n % n_cols
        axs[i, j].imshow(image)
        axs[i, j].axis('off')

#### Define the encoder network

We will now define the encoder network as part of the VAE. The approximate posterior $q_\phi(z\mid x)$ defined by the encoder will be a diagonal Gaussian distribution. You should complete the following function to define the encoder network, according to the following specification:

* The function takes the `latent_dim` as an argument
* Use the functional API to define the model, which has the following layers:
  * An Input layer that sets the input shape to `(64, 64, 3)`
  * A Conv2D layer with 32 filters, 3x3 kernel size, ReLU activation, stride of 2x2, and 'SAME' padding
  * BatchNormalization layer
  * Conv2D layer with 64 filters, 3x3 kernel size, ReLU activation, stride of 2x2, and 'SAME' padding
  * BatchNormalization layer
  * Conv2D layer with 128 filters, 3x3 kernel size, ReLU activation, stride of 2x2, and 'SAME' padding
  * BatchNormalization layer
  * Conv2D layer with 256 filters, 3x3 kernel size, ReLU activation, stride of 2x2, and 'SAME' padding
  * BatchNormalization layer
  * Flatten layer
  * Dense layer with no activation function, and the right number of units to parameterise the means and log variance of a diagonal Gaussian distribution of dimension `latent_dim`
  * The resulting Tensor should be split into `z_mean` and `z_log_var` Tensors
  * The encoder Model should output the mean and log variance Tensors in a list `[z_mean, z_log_var]`
* The function should then return the encoder model
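
To see how the shapes evolve through the layers listed above, here is a small sketch of the arithmetic in plain Python. It relies on the fact that a convolution with 'SAME' padding produces a spatial size of `ceil(input_size / stride)`. The helper name `same_conv_output_size` is made up for this illustration.

```python
import math

def same_conv_output_size(input_size, stride):
    """Spatial size after a convolution with 'same' padding."""
    return math.ceil(input_size / stride)

latent_dim = 50  # the value used later in this notebook
size = 64        # input images are 64x64
shapes = []
for n_filters in [32, 64, 128, 256]:
    size = same_conv_output_size(size, stride=2)
    shapes.append((size, size, n_filters))

# Number of features after the Flatten layer
flattened_units = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
# One mean and one log variance per latent dimension
dense_units = 2 * latent_dim

print(shapes)
print(flattened_units, dense_units)
```

Each stride-2 layer halves the spatial size, so the final feature map is `(4, 4, 256)`, which flattens to 4096 features.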

In [None]:
#### GRADED CELL ####

# Complete the following function. 
# Make sure to not change the function name or arguments.

def get_encoder(latent_dim):
    """
    This function should build a CNN encoder model according to the above specification. 
    The function takes latent_dim as an argument, which should be used to define the model.
    Your function should return the encoder model.
    """
    encoder_inputs = Input(shape=(64, 64, 3))
    h = Conv2D(32, 3, activation='relu', strides=2, padding='same')(encoder_inputs)  # (32, 32, 32)
    h = BatchNormalization()(h)
    h = Conv2D(64, 3, activation='relu', strides=2, padding='same')(h)  # (16, 16, 64)
    h = BatchNormalization()(h)
    h = Conv2D(128, 3, activation='relu', strides=2, padding='same')(h)  # (8, 8, 128)
    h = BatchNormalization()(h)
    h = Conv2D(256, 3, activation='relu', strides=2, padding='same')(h)  # (4, 4, 256)
    h = BatchNormalization()(h)
    h = Flatten()(h)
    h = Dense(2 * latent_dim)(h)
    z_mean, z_log_var = ops.split(h, 2, axis=-1)
    return Model(inputs=encoder_inputs, outputs=[z_mean, z_log_var], name='encoder')  

In [None]:
# Run your function to get the encoder

encoder = get_encoder(latent_dim=50)

In [None]:
# Print the encoder summary

encoder.summary()

#### Define the decoder network

You should now define the decoder network for the VAE, using the functional API. This should be a neural network that returns a logits Tensor of shape `(64, 64, 3)` that will be used to parameterise independent Bernoulli distributions per pixel and colour channel.

* The function takes the `latent_dim` as an argument
* Use the functional API to define the model with the following layers:
  * An Input layer that sets the input shape to `(latent_dim,)`
  * A Dense layer with 4096 units and ReLU activation
  * A Reshape layer, that reshapes its input to `(4, 4, 256)`
  * BatchNormalization layer
  * Conv2DTranspose layer with 128 filters, 3x3 kernel size, ReLU activation, stride of 2x2 and 'SAME' padding
  * BatchNormalization layer
  * Conv2DTranspose layer with 64 filters, 3x3 kernel size, ReLU activation, stride of 2x2 and 'SAME' padding
  * BatchNormalization layer
  * Conv2DTranspose layer with 32 filters, 3x3 kernel size, ReLU activation, stride of 2x2 and 'SAME' padding
  * BatchNormalization layer
  * Conv2DTranspose layer with 3 filters, 3x3 kernel size, no activation function, stride of 2x2 and 'SAME' padding
* The Conv2DTranspose layers will need to be configured so that the final Conv2DTranspose layer outputs a Tensor of shape `(64, 64, 3)`. You should pass in the `output_padding` argument to each of these layers.
* The function should then return the decoder model
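
The effect of `output_padding` on the spatial size can be checked with a little arithmetic. The sketch below assumes the length formula `(input - 1) * stride + kernel - 2 * (kernel // 2) + output_padding` for 'SAME' padding with an explicit `output_padding`. This is our understanding of how Keras sizes the output, so treat it as an assumption and confirm with `decoder.summary()`. The helper name is made up for this illustration.

```python
def same_conv_transpose_output_size(input_size, kernel_size, stride, output_padding):
    """Assumed output length for 'same' padding with an explicit output_padding."""
    pad = kernel_size // 2
    return (input_size - 1) * stride + kernel_size - 2 * pad + output_padding

size = 4  # spatial size after the Reshape layer
sizes = [size]
for _ in range(4):  # four Conv2DTranspose layers
    size = same_conv_transpose_output_size(size, kernel_size=3, stride=2, output_padding=1)
    sizes.append(size)

print(sizes)
```

Under this formula, each layer doubles the spatial size when `output_padding=1`, whereas `output_padding=0` would turn a size of 4 into 7 rather than 8.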

In [None]:
#### GRADED CELL ####

# Complete the following function. 
# Make sure to not change the function name or arguments.

def get_decoder(latent_dim):
    """
    This function should build a CNN decoder model according to the above specification. 
    The function takes latent_dim as an argument, which should be used to define the model.
    Your function should return the decoder model.
    """
    event_shape = (64, 64, 3)
    
    decoder_inputs = Input(shape=(latent_dim,))
    h = Dense(4096, activation='relu')(decoder_inputs)
    h = Reshape((4, 4, 256))(h)
    h = BatchNormalization()(h)
    h = Conv2DTranspose(128, 3, activation='relu', strides=2, padding='same', output_padding=1)(h)
    h = BatchNormalization()(h)
    h = Conv2DTranspose(64, 3, activation='relu', strides=2, padding='same', output_padding=1)(h)
    h = BatchNormalization()(h)
    h = Conv2DTranspose(32, 3, activation='relu', strides=2, padding='same', output_padding=1)(h)
    h = BatchNormalization()(h)
    decoder_outputs = Conv2DTranspose(3, 3, strides=2, padding='same', output_padding=1)(h)

    return Model(inputs=decoder_inputs, outputs=decoder_outputs, name='decoder')

In [None]:
# Run your function to get the decoder

decoder = get_decoder(latent_dim=50)

In [None]:
# Print the decoder summary

decoder.summary()

#### Build the end-to-end architecture

Now that the encoder and decoder networks are defined, you should complete the following `CelebAVAE` class to build the complete encoder-decoder architecture.

* The `CelebAVAE` class subclasses from the base `Model` class
* The class initialiser takes the `encoder` and `decoder` networks as arguments
* You should complete the `_get_losses` method
* The `_get_losses` method should compute and return the loss, KL divergence loss and negative log-likelihood loss as a tuple `(loss, kl_loss, nll_loss)`
  * The prior distribution $p_\theta(z)$ should be a zero-mean, isotropic Gaussian with identity covariance matrix
  * You should use the following form of the SGVB estimator with $L=3$:
$$
\hat{\mathcal{L}}^A(\theta,\phi;x) := \frac{1}{L} \sum_{j=1}^L \left[ \log p_\theta(x \mid z^{(j)}) + \log p_\theta(z^{(j)}) - \log q_\phi(z^{(j)} \mid x) \right]
$$
where $z^{(j)} = g_\phi(\epsilon^{(j)}, x)$, $\epsilon^{(j)}\sim p(\epsilon)$ and $p(\epsilon) = N(\mathbf{0}, \mathbf{I})$
* The `train_step` method is completed for you. It computes the losses, performs the gradient update and updates the metrics
* The `test_step` method is completed for you. It computes the losses and updates the metrics
* The `call` method is completed for you. It passes a batch of inputs through the end-to-end encoder-decoder architecture. It uses a single Monte Carlo sample to evaluate the likelihood
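
The last two terms of the estimator, with the sign flipped, give a Monte Carlo estimate of the KL divergence between the approximate posterior and the prior. The NumPy sketch below illustrates this for a single made-up data point: it draws reparameterised samples $z^{(j)} = \mu + \sigma \epsilon^{(j)}$, averages $\log q_\phi(z^{(j)} \mid x) - \log p_\theta(z^{(j)})$, and compares the result with the closed-form KL divergence for diagonal Gaussians. The values of `z_mean` and `z_log_var` are hypothetical, and the helper names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder outputs for one image with a 2-dimensional latent space
z_mean = np.array([0.5, -1.0])
z_log_var = np.array([0.0, -1.0])
z_std = np.exp(0.5 * z_log_var)

def log_normal_diag(z, mean, std):
    """Log density of a diagonal Gaussian, summed over the latent dimensions."""
    log_norm = 0.5 * np.log(2 * np.pi)
    return np.sum(-0.5 * ((z - mean) / std) ** 2 - np.log(std) - log_norm, axis=-1)

def kl_estimate(num_samples):
    # Reparameterisation: z = mean + std * eps, with eps ~ N(0, I)
    epsilon = rng.standard_normal((num_samples, z_mean.shape[0]))
    z = z_mean + z_std * epsilon
    log_q = log_normal_diag(z, z_mean, z_std)  # log q(z | x)
    log_p = log_normal_diag(z, 0.0, 1.0)       # log p(z) under the N(0, I) prior
    return np.mean(log_q - log_p)

# Closed-form KL between the diagonal Gaussian posterior and the N(0, I) prior
kl_exact = 0.5 * np.sum(z_mean ** 2 + np.exp(z_log_var) - 1.0 - z_log_var)

kl_three_samples = kl_estimate(3)         # L = 3, as in the assignment
kl_many_samples = kl_estimate(200_000)    # a large L, for comparison only
print(kl_exact, kl_three_samples, kl_many_samples)
```

With only three samples the estimate is noisy for a single data point, but it is unbiased, and averaging over a batch reduces the variance further.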

In [None]:
#### GRADED CELL ####

# Complete the _get_losses method in the following class.
# Make sure to not change the class or method names or arguments.

class CelebAVAE(Model):
    
    def __init__(self, encoder, decoder, **kwargs):
        super().__init__(**kwargs)
        self.encoder = encoder
        self.decoder = decoder
        self.loss_metric = Mean(name='loss')
        self.nll_metric = Mean(name='nll')
        self.kl_metric = Mean(name='kl')
        self.pi = ops.array(np.pi)
    
    def _get_losses(self, data):
        """
        This method should compute and return the loss, kl_loss and nll_loss.
        It should use 3 Monte Carlo samples and the first form of the SGVB estimator.
        """
        z_mean, z_log_var = self.encoder(data)
        batch_size, latent_dim = ops.shape(z_mean)[0], ops.shape(z_mean)[1]
        epsilon = keras.random.normal((3, batch_size, latent_dim))
        z_std = ops.exp(0.5 * z_log_var)
        posterior_samples = z_mean + (z_std * epsilon)  # (3, B, L)

        log_Z = 0.5 * ops.log(2 * self.pi)
        prior_log_prob = -0.5 * ops.square(posterior_samples) - log_Z
        prior_log_prob = ops.mean(ops.sum(prior_log_prob, axis=-1))
        
        posterior_log_prob = -0.5 * ops.square((posterior_samples - z_mean) / z_std) - ops.log(z_std) - log_Z
        posterior_log_prob = ops.mean(ops.sum(posterior_log_prob, axis=-1))
        
        kl_loss = posterior_log_prob - prior_log_prob
        
        posterior_samples = ops.reshape(posterior_samples, (3 * batch_size, latent_dim))
        
        x_logits = self.decoder(posterior_samples)  # (3*B, 64, 64, 3)
        x_logits = ops.reshape(x_logits, (3, batch_size, 64, 64, 3, 1))
        data = ops.reshape(data, (1, batch_size, 64, 64, 3, 1))
        data = ops.repeat(data, 3, axis=0)
        nll_loss = keras.losses.binary_crossentropy(data, x_logits, from_logits=True)  # (3, B, 64, 64, 3)
        nll_loss = ops.mean(ops.sum(nll_loss, axis=[-1, -2, -3]))

        loss = kl_loss + nll_loss
        return loss, kl_loss, nll_loss

    def train_step(self, data):
        if keras.config.backend() == 'tensorflow':
            with tf.GradientTape() as tape:
                loss, kl_loss, nll_loss = self._get_losses(data)
            grads = tape.gradient(loss, self.trainable_weights)
            self.optimizer.apply_gradients(zip(grads, self.trainable_weights))
        else:
            assert keras.config.backend() == 'torch'
            self.zero_grad()
            loss, kl_loss, nll_loss = self._get_losses(data)

            loss.backward()

            gradients = [v.value.grad for v in self.trainable_weights]    
            with torch.no_grad():
                self.optimizer.apply(gradients, self.trainable_weights)
            
        self.loss_metric.update_state(loss)
        self.nll_metric.update_state(nll_loss)
        self.kl_metric.update_state(kl_loss)
        return {m.name: m.result() for m in self.metrics}

    def test_step(self, data):
        loss, kl_loss, nll_loss = self._get_losses(data)
        self.loss_metric.update_state(loss)
        self.nll_metric.update_state(nll_loss)
        self.kl_metric.update_state(kl_loss)
        return {m.name: m.result() for m in self.metrics}

    def call(self, inputs):
        z_mean, z_log_var = self.encoder(inputs)
        epsilon = keras.random.normal(ops.shape(z_mean))
        # The encoder outputs the log variance, so halve it to obtain the log standard deviation
        z_std = ops.exp(0.5 * z_log_var)
        z_sample = z_mean + (z_std * epsilon)
        return self.decoder(z_sample)

    @property
    def metrics(self):
        return [self.loss_metric, self.nll_metric, self.kl_metric]

In [None]:
# Run your function to define and compile the end-to-end architecture

vae = CelebAVAE(encoder, decoder)
vae.compile(optimizer=keras.optimizers.Adam(learning_rate=0.0005))

#### Train the model

In [None]:
# Fit the model

early_stopping = keras.callbacks.EarlyStopping(patience=3, monitor='val_loss')
vae.fit(train_ds, validation_data=valid_ds, epochs=40, callbacks=[early_stopping])

In [None]:
# Evaluate the model on the test set

vae.evaluate(test_ds, return_dict=True)

#### Compute reconstructions of test images

We will now take a look at some image reconstructions from the encoder-decoder architecture.

You should complete the following function, which uses `encoder` and `decoder` to reconstruct images from the test dataset.

* This function takes the `encoder`, `decoder` and a Tensor `batch_of_images` as arguments
* It should then compute the reconstructions as follows:
  * Compute the means of the encoding distributions from passing the batch of images into the encoder
  * Pass these latent vectors through the decoder to get the Bernoulli distribution probabilities
* The function should then return the resulting Tensor, which will be of shape `(batch_size, 64, 64, 3)`
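
The decoder outputs logits, so a sigmoid is needed to turn them into Bernoulli probabilities. The NumPy sketch below shows this conversion on a few made-up values. It also checks that the Bernoulli negative log-likelihood computed from the probabilities agrees with a numerically stable form computed directly from the logits. The arrays `logits` and `pixels` are hypothetical, and the helper names are ours.

```python
import numpy as np

def sigmoid(logits):
    """Map logits to Bernoulli probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-logits))

def bernoulli_nll_from_logits(x, logits):
    """Numerically stable -log p(x | logits), computed per element."""
    return np.maximum(logits, 0.0) - logits * x + np.log1p(np.exp(-np.abs(logits)))

# Hypothetical decoder logits and target pixel values for three pixels
logits = np.array([-2.0, 0.0, 3.0])
pixels = np.array([0.0, 1.0, 1.0])

probs = sigmoid(logits)
nll_from_probs = -(pixels * np.log(probs) + (1.0 - pixels) * np.log(1.0 - probs))
nll_from_logits = bernoulli_nll_from_logits(pixels, logits)

print(probs)
print(nll_from_probs)
print(nll_from_logits)
```

A logit of zero maps to a probability of 0.5, and the two ways of computing the negative log-likelihood give the same values.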

In [None]:
#### GRADED CELL ####

# Complete the following function. 
# Make sure to not change the function name or arguments.

def reconstruct(encoder, decoder, batch_of_images):
    """
    This function should compute reconstructions of the batch_of_images according
    to the above instructions.
    The function takes the encoder, decoder and batch_of_images as inputs, which
    should be used to compute the reconstructions.
    The function should then return the reconstructions Tensor.
    """
    mean_latent_vectors, _ = encoder(batch_of_images)
    mean_reconstructions = ops.sigmoid(decoder(mean_latent_vectors))
    return mean_reconstructions

In [None]:
# Use your function to compute and visualise reconstructions

for test_batch in test_ds.shuffle(100).take(1):
    reconstructions = reconstruct(encoder, decoder, test_batch)

test_batch_size = 8
f, axs = plt.subplots(2, test_batch_size, figsize=(16, 6))
axs[0, 0].set_title("Original test images", loc='left')
axs[1, 0].set_title("Reconstructed images", loc='left')
for j in range(test_batch_size):
    axs[0, j].imshow(ops.convert_to_numpy(test_batch)[j])
    axs[1, j].imshow(ops.convert_to_numpy(reconstructions)[j])
    axs[0, j].axis('off')
    axs[1, j].axis('off')
plt.tight_layout()

#### Manipulate images in the latent space

In this final section, we will see how the latent space encodes high-level information about the images, even though it has not been trained with any information apart from the images themselves.

As mentioned earlier, each image in the CelebA dataset is labelled according to the attributes of the person pictured. The cell below will load these labels.

In [None]:
# Load the attribute labels

labels = pd.read_csv(Path('./data/list_attr_celeba_subset.csv'))
labels.head()

As can be seen above, each image is labelled with a binary indicator (1 true, -1 false), according to whether it possesses the attribute. The list of attributes contained in the `labels` DataFrame is shown below.

In [None]:
# List the attributes contained in the DataFrame

labels.columns[1:]

We would like to perform some computations in the latent space, depending on the attribute values in the `labels` DataFrame. To do this, we will construct a new TensorFlow Dataset object, containing the images and attribute information.

You should now complete the following `get_labelled_dataset` function to construct this new Dataset.

* The function takes the arguments `split` (which again will be one of the strings `'train'`, `'val'` or `'test'`), an `attribute` string, the `labels` DataFrame and `image_dir` string
  * The `attribute` will be one of the column headers listed above
* As before, the function should create a Dataset containing the filepaths saved in the corresponding `split` subfolder of the `image_dir` directory
* The function should include a nested function used to map the Dataset similar to before
  * It should again read the contents of the file, decode the jpeg and scale the pixel values to lie in the range $[0, 1]$
  * It should then look up the `attribute` value for the image from the `labels` DataFrame
  * It should return a tuple containing the image Tensor, and scalar `tf.int32` label Tensor
* The function should then apply the nested function using the `map` method
* The function should then return the Dataset

_Hint: convert the filenames and attribute columns of the_ `labels` _DataFrame into separate Tensor objects for use in the map function. The_ `tf.strings.split` _and_ `tf.where` _functions will be useful to extract the label for a given image._
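
The lookup described in the hint can be sketched in NumPy, which mirrors the logic without needing TensorFlow. The filenames and attribute values below are made up for illustration, and `lookup_label` is a hypothetical helper. In the graded function the same idea would be expressed with `tf.strings.split` and `tf.where` inside the mapped function.

```python
import numpy as np

# Hypothetical stand-ins for the 'image_id' column and one attribute column
filenames = np.array(['000001.jpg', '000002.jpg', '000003.jpg'])
attribute_values = np.array([-1, 1, -1], dtype=np.int32)

def lookup_label(filepath):
    # Keep only the part after the last '/', i.e. the file name
    filename = filepath.split('/')[-1]
    # Indices where the file name matches; exactly one match is expected
    matches = np.where(filenames == filename)[0]
    return attribute_values[matches[0]]

label = lookup_label('data/images/train/000002.jpg')
print(label)
```

Note that splitting on `'/'` assumes POSIX-style path separators.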

In [None]:
#### GRADED CELL ####

# Complete the following function. 
# Make sure to not change the function name or arguments.

def get_labelled_dataset(split, attribute, labels=labels, image_dir=str(Path('data', 'images'))):
    """
    This function should create a tf.data.Dataset object for one of the train/valid/test
    splits, according to the above specification.
    It should then return the Dataset.
    """
    filenames = tf.constant(labels['image_id'])
    # Use a separate name so that the `labels` DataFrame argument is not shadowed
    attribute_values = tf.constant(labels[attribute], dtype=tf.int32)
    dataset = tf.data.Dataset.list_files(str(Path(image_dir, split, '*.jpg')), shuffle=False)
    
    def load_image(filepath):
        # Find the row of the labels table that matches this file name
        filename = tf.strings.split(filepath, '/')[-1]
        i = tf.where(filenames == filename)
        raw_img = tf.io.read_file(filepath)
        img_tensor = tf.image.decode_jpeg(raw_img, channels=3)
        img_tensor = tf.image.convert_image_dtype(img_tensor, tf.float32)
        label = attribute_values[tf.squeeze(i)]
        return img_tensor, label
    
    dataset = dataset.map(load_image)
    return dataset

In [None]:
# Create the labelled Dataset from the train split

labelled_train_ds = get_labelled_dataset('train', 'Eyeglasses', labels=labels)

We would now like to compute the 'attribute vector' for the chosen attribute. This will be the average latent vector corresponding to all images that have the attribute, minus the average latent vector corresponding to all images that do not have the attribute. The intuition is that this vector will correspond to the high-level property of adding the attribute to an image.

You should now complete the following function to compute the attribute vector.

* The function takes `labelled_dataset` as an argument, as well as the `encoder` network
* The function should compute the encoding distribution mean (latent vector) for all images that have the attribute, and (separately) all the images that do not
* It should then compute the average of each of these two sets of latent vectors
* It should then compute `avg_latent_with_attribute - avg_latent_without_attribute`. This is the attribute vector
* The function should then return the attribute vector as a numpy array of shape `(latent_dim,)`
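
The computation itself is a difference of two group means. Here is a minimal NumPy sketch on a toy set of latent vectors; the arrays `latents` and `labels_toy` are made up and stand in for the encoder means and attribute labels.

```python
import numpy as np

# Hypothetical latent means (one row per image) and their attribute labels
latents = np.array([[1.0, 0.0],
                    [3.0, 2.0],
                    [0.0, 1.0],
                    [2.0, 1.0]])
labels_toy = np.array([1, 1, -1, -1])

# Average latent vector of each group
avg_latent_with_attribute = latents[labels_toy == 1].mean(axis=0)
avg_latent_without_attribute = latents[labels_toy == -1].mean(axis=0)

# The attribute vector points from the 'without' group towards the 'with' group
attribute_vector_toy = avg_latent_with_attribute - avg_latent_without_attribute
print(attribute_vector_toy)
```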

In [None]:
#### GRADED CELL ####

# Complete the following function. 
# Make sure to not change the function name or arguments.

def get_attribute_vector(labelled_dataset, encoder):
    """
    This function should compute and return the attribute vector according 
    to the above specification.
    """
    latent_dim = encoder.outputs[0].shape[1]
    with_attribute_ds = labelled_dataset.filter(lambda i, l: tf.math.equal(l, 1)).batch(128)
    without_attribute_ds = labelled_dataset.filter(lambda i, l: tf.math.equal(l, -1)).batch(128)
    
    avg_latent_with_attribute = np.empty(shape=(0, latent_dim), dtype=np.float32)
    for images, _ in with_attribute_ds:
        latents, _  = encoder(images)
        avg_latent_with_attribute = np.concatenate((avg_latent_with_attribute, ops.convert_to_numpy(latents)), 
                                                   axis=0)
    avg_latent_with_attribute = np.mean(avg_latent_with_attribute, axis=0)    
    
    avg_latent_without_attribute = np.empty(shape=(0, latent_dim), dtype=np.float32)
    for images, _ in without_attribute_ds:
        latents, _ = encoder(images)
        avg_latent_without_attribute = np.concatenate((avg_latent_without_attribute, ops.convert_to_numpy(latents)), 
                                                      axis=0)
    avg_latent_without_attribute = np.mean(avg_latent_without_attribute, axis=0)    
    
    return avg_latent_with_attribute - avg_latent_without_attribute

In [None]:
# Get the attribute vector using your function

attribute_vector = get_attribute_vector(labelled_train_ds, encoder)

We can view this attribute vector by decoding it:

In [None]:
# Display the decoded attribute vector

decoded_a = ops.sigmoid(decoder(attribute_vector[np.newaxis, ...]))
plt.imshow(ops.convert_to_numpy(decoded_a).squeeze())
plt.axis('off');

We can now use the attribute vector to add the attribute to an image reconstruction, where that attribute wasn't present before. To do this, we can just add the attribute vector to the latent vector encoding of the image, and then decode the result. We can also adjust the strength of the attribute vector by scaling with a multiplicative parameter.
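
The latent-space arithmetic can be sketched in NumPy as follows. The `encoding` and `attribute_vector_toy` arrays are made up, and `shift_encoding` is a hypothetical helper. In the notebook the shifted encoding would then be passed through the decoder.

```python
import numpy as np

# Hypothetical latent encoding of one image and a toy attribute vector
encoding = np.array([[0.2, -0.4, 1.0]])             # shape (1, latent_dim)
attribute_vector_toy = np.array([1.0, 0.0, -0.5])   # shape (latent_dim,)

def shift_encoding(encoding, attribute_vector, k):
    """Move the encoding along the attribute direction, scaled by k."""
    return encoding + k * attribute_vector

# A positive k adds the attribute; a negative k moves away from it
added = shift_encoding(encoding, attribute_vector_toy, k=2.5)
removed = shift_encoding(added, attribute_vector_toy, k=-2.5)
print(added)
print(removed)
```

Broadcasting adds the `(latent_dim,)` attribute vector to every row of the encoding, so the same function works for a whole batch of encodings.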

In [None]:
# Add the attribute vector to a sample of images that don't have the attribute

k = 2.5  # Weighting of attribute vector
num_examples = 8
labelled_test_ds = get_labelled_dataset('test', 'Eyeglasses', labels=labels).shuffle(100)
images_without_attribute = []
reconstructions = []
modified_images = []
for image, label in labelled_test_ds:
    if label == 1:  # Only process images without the attribute
        continue
    else:
        images_without_attribute.append(ops.convert_to_numpy(image))
        encoding, _ = encoder(image[tf.newaxis, ...])
        encoding = ops.convert_to_numpy(encoding)
        decoded_image = ops.sigmoid(decoder(encoding))
        reconstructions.append(np.squeeze(ops.convert_to_numpy(decoded_image)))
        modified_encoding = encoding + (k * attribute_vector)
        modified_reconstruction = ops.sigmoid(decoder(modified_encoding))
        modified_images.append(np.squeeze(ops.convert_to_numpy(modified_reconstruction)))
    if len(modified_images) >= num_examples:
        break

In [None]:
# Display the original images, their reconstructions, and modified reconstructions

num_examples = 8
f, axs = plt.subplots(3, num_examples, figsize=(16, 6))
axs[0, 0].set_title("Original images", loc='left')
axs[1, 0].set_title("Reconstructed images", loc='left')
axs[2, 0].set_title("Images with added attribute", loc='left')
for j in range(num_examples):
    axs[0, j].imshow(images_without_attribute[j])
    axs[1, j].imshow(reconstructions[j])
    axs[2, j].imshow(modified_images[j])
    for ax in axs[:, j]: ax.axis('off')
    
plt.tight_layout();

You could also try removing the attribute from images that possess the attribute, or experiment with a different attribute.

Congratulations on completing this week's assignment! In this assignment you have developed the variational autoencoder algorithm for the CelebA dataset, and used the trained networks to compute reconstructions and modify dataset images with high-level semantic information extracted from the latent space.