# Linear and convolutional autoencoders

An autoencoder is a neural network trained to encode data into a compact representation and then decode it back. The general structure of an autoencoder is shown in the figure below. It consists of two parts: an encoder and a decoder. The encoder compresses the input data into a lower-dimensional representation (often referred to as the *latent space representation*) by extracting the most salient features of the data, while the decoder reconstructs the input data from the compressed representation. For this reason, autoencoders are often used for *dimensionality reduction*.

In this tutorial, our goal is to compare the performance of two types of autoencoders, a linear autoencoder and a convolutional autoencoder, on reconstructing the [`Fashion-MNIST`](https://github.com/zalandoresearch/fashion-mnist) images. With the help of Covalent, we will see how to break a complex workflow into smaller, more manageable tasks, which makes it possible to track the task dependencies and execution results of individual steps. Another advantage of Covalent is its ability to auto-parallelize the execution of subtasks.

<div align="center">
<img src="././autoencoder_images/schematic.png" style="width: 40%; height: 40%"/>
</div>

## Building the autoencoders

We will build the two types of autoencoders in [PyTorch](https://pytorch.org/). The linear autoencoder is built from [`Linear`](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear) layers, while the convolutional autoencoder is built from [`Conv2d`](https://pytorch.org/docs/stable/nn.html#torch.nn.Conv2d) layers. Let us first install all the necessary dependencies for this tutorial. Note that we will also use the [Covalent Dask plugin](https://github.com/AgnostiqHQ/covalent-dask-plugin).

In [2]:
# !pip install cova
# !pip install covalent-dask-plugin
# !pip install torch torchvision
# !pip install matplotlib

We can then start the Covalent UI and the local dispatcher server by running `!covalent start`. The UI will be available at http://localhost:8080. Next, we import the following modules:

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
from torchvision import datasets, transforms
import covalent as ct
from covalent.executor import DaskExecutor
from dask.distributed import LocalCluster

cluster = LocalCluster(processes=True)
dask_executor = DaskExecutor(scheduler_address=cluster.scheduler_address)

The linear and convolutional autoencoders are implemented as classes inheriting from the [`nn.Module`](https://pytorch.org/docs/stable/generated/torch.nn.Module.html) class in PyTorch. For the linear autoencoder, the encoder uses four `Linear` layers, with a `ReLU` activation function applied between consecutive layers. The decoder is essentially the "inverse" of the encoder: it mirrors the same architecture, except that an additional `Sigmoid` activation function is applied at the end. The choice of this final activation depends on the range of pixel intensities in the input data; `Sigmoid` maps the output to [0, 1]. Note that the `Fashion-MNIST` dataset contains 28x28-pixel gray-scale images, so the input dimension of each image is 28x28x1. Each image is flattened to a vector of length 784 before it is passed to the first `Linear` layer, and the encoder compresses it to a latent vector of length 3.

In [None]:
class LinearAutoencoder(nn.Module):
    """Autoencoder with 4 hidden layers."""

    def __init__(self):
        super(LinearAutoencoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Linear(28*28, 128),  # input size = 784 -> hidden size = 128
            nn.ReLU(True),
            nn.Linear(128, 64),  # hidden size = 128 -> hidden size = 64
            nn.ReLU(True),
            nn.Linear(64, 12),  # hidden size = 64 -> hidden size = 12
            nn.ReLU(True),
            nn.Linear(12, 3),  # hidden size = 12 -> output size = 3
        )
        self.decoder = nn.Sequential(
            nn.Linear(3, 12),  # input size = 3 -> hidden size = 12
            nn.ReLU(True),
            nn.Linear(12, 64),  # hidden size = 12 -> hidden size = 64
            nn.ReLU(True),
            nn.Linear(64, 128),  # hidden size = 64 -> hidden size = 128
            nn.ReLU(True),
            nn.Linear(128, 28*28),  # hidden size = 128 -> output size = 784
            nn.Sigmoid()  # output with pixel intensity in [0,1]
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded
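
As a quick sanity check on the architecture above, the layer widths can be used to count trainable parameters and to see how strongly each image is compressed. The sketch below is plain Python and does not depend on PyTorch. The helper name `linear_param_count` is hypothetical and introduced only for illustration, and it assumes each `Linear` layer holds a weight matrix plus a bias vector (the PyTorch default).

```python
# Layer widths copied from the LinearAutoencoder class above.
ENCODER_SIZES = [28 * 28, 128, 64, 12, 3]
DECODER_SIZES = ENCODER_SIZES[::-1]  # the decoder mirrors the encoder


def linear_param_count(sizes):
    """Count weights and biases in a stack of fully connected layers.

    Hypothetical helper: a layer mapping n_in -> n_out holds
    n_in * n_out weights plus n_out biases. Activations such as
    ReLU and Sigmoid add no parameters.
    """
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


encoder_params = linear_param_count(ENCODER_SIZES)
decoder_params = linear_param_count(DECODER_SIZES)
compression_factor = ENCODER_SIZES[0] / ENCODER_SIZES[-1]

print(f"encoder parameters: {encoder_params}")
print(f"decoder parameters: {decoder_params}")
print(f"values per image: {ENCODER_SIZES[0]} -> {ENCODER_SIZES[-1]} "
      f"(about {compression_factor:.0f}x fewer)")
```

Most of the parameters sit in the first encoder layer and the last decoder layer, since those are the only layers that touch all 784 pixel values.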

For the convolutional autoencoder, we will use three convolutional layers in the encoder. The first two use the `Conv2d` construction with a kernel size of 3x3, a stride of 2, and a padding of 1, each followed by a `ReLU` activation function, so each one halves the spatial size (28x28 to 14x14 to 7x7). The last encoder layer uses a 7x7 kernel with no padding, which collapses the 7x7 feature map into a 64x1x1 latent representation. The decoder uses `ConvTranspose2d` layers to reverse the action of the `Conv2d` layers in the encoder, ending with a `Sigmoid` activation function.

In [None]:
class ConvAutoencoder(nn.Module):
    """Autoencoder with 3 hidden layers."""

    def __init__(self):
        super(ConvAutoencoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1),  # input size = 1x28x28 -> hidden size = 16x14x14
            nn.ReLU(True),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),  # hidden size = 16x14x14 -> hidden size = 32x7x7
            nn.ReLU(True),
            nn.Conv2d(32, 64, 7),  # hidden size = 32x7x7 -> hidden size = 64x1x1
        )

        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 7),  # input size = 64x1x1 -> hidden size = 32x7x7
            nn.ReLU(True),
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),  # hidden size = 32x7x7 -> hidden size = 16x14x14
            nn.ReLU(True),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1),  # hidden size = 16x14x14 -> hidden size = 1x28x28
            nn.Sigmoid()  # output with pixels in [0,1]
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded
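
The sizes noted in the comments above can be checked with the standard output-size formulas for `Conv2d` and `ConvTranspose2d`, assuming a dilation of 1 (the PyTorch default). The sketch below is plain Python; the helper names `conv2d_out` and `conv_transpose2d_out` are hypothetical and are not part of PyTorch.

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a Conv2d layer (dilation assumed to be 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose2d_out(size, kernel, stride=1, padding=0, output_padding=0):
    """Spatial output size of a ConvTranspose2d layer (dilation assumed to be 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


# Hyperparameters copied from the ConvAutoencoder class above.
encoder_layers = [
    dict(kernel=3, stride=2, padding=1),
    dict(kernel=3, stride=2, padding=1),
    dict(kernel=7),
]
decoder_layers = [
    dict(kernel=7),
    dict(kernel=3, stride=2, padding=1, output_padding=1),
    dict(kernel=3, stride=2, padding=1, output_padding=1),
]

size = 28  # Fashion-MNIST images are 28x28
encoder_sizes = [size]
for layer in encoder_layers:
    size = conv2d_out(size, **layer)
    encoder_sizes.append(size)

decoder_sizes = [size]
for layer in decoder_layers:
    size = conv_transpose2d_out(size, **layer)
    decoder_sizes.append(size)

print("encoder spatial sizes:", encoder_sizes)
print("decoder spatial sizes:", decoder_sizes)
```

The `output_padding=1` argument matters here: by the same formula, a stride-2 transposed convolution with a 3x3 kernel and a padding of 1 maps a 7x7 input to 13x13 without it, so the decoder would not recover the original 28x28 size.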

## Creating the workflow