Website:  https://www.geeksforgeeks.org/implementing-an-autoencoder-in-pytorch/

In [None]:
import torch 
from torchvision import datasets
from torchvision import transforms
import matplotlib.pyplot as plt

# 1. Data preparation

In [4]:
# Transforms images to a PyTorch Tensor
tensor_transform = transforms.ToTensor()

# Download the MNIST Dataset
dataset = datasets.MNIST(
                root="/Users/mac/我的文件/Notebook/Quantum_Mechanics/algorithm_implementation/7.Encoder&Decoder/code/data",
                train=True,
                download=True,
                transform=tensor_transform,
                )

# DataLoader is used to load the dataset for training
loader = torch.utils.data.DataLoader(dataset=dataset,
                                    batch_size=32,
                                    shuffle=True,
                                    )

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to /Users/mac/我的文件/Notebook/Quantum_Mechanics/algorithm_implementation/7.Encoder&Decoder/code/data/MNIST/raw/train-images-idx3-ubyte.gz



# 2. Create Autoencoder Class
In this snippet, the encoder reduces the dimensionality of the data layer by layer:
```shell
28*28 = 784 ==> 128 ==> 64 ==> 36 ==> 18 ==> 9
```
Here the 784 input nodes are encoded into 9 nodes in the latent space. In the decoder, the dimensionality is increased step by step back to the original input size in order to reconstruct the input:
```shell
9 ==> 18 ==> 36 ==> 64 ==> 128 ==> 784 = 28*28
```
Here the input is the 9-node latent space representation and the output is the reconstructed 28*28 image.

The encoder starts with a Linear layer that takes 28*28 inputs, followed by a ReLU activation, and this pattern repeats until the dimensionality is reduced to 9 nodes. The decoder uses this 9-value representation to reconstruct the original image by mirroring the encoder architecture. Its final layer is a Sigmoid, which constrains the output values to the range 0 to 1.
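To make the layer sizes concrete, the sketch below counts the trainable parameters implied by the chain above. It assumes every step is a fully connected layer with a bias term, which is the default for `torch.nn.Linear`. The helper name `count_linear_params` is introduced here only for illustration.

```python
# Layer widths taken from the encoder/decoder description above.
encoder_sizes = [28 * 28, 128, 64, 36, 18, 9]
decoder_sizes = list(reversed(encoder_sizes))


def count_linear_params(sizes):
    """Weights (n_in * n_out) plus biases (n_out) for each consecutive pair of widths."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


encoder_params = count_linear_params(encoder_sizes)
decoder_params = count_linear_params(decoder_sizes)

# How many input values are squeezed into each latent value.
compression = encoder_sizes[0] / encoder_sizes[-1]

print(encoder_params, decoder_params, round(compression, 1))
# → 111913 112688 87.1
```

Almost all of the parameters sit in the first encoder layer and the last decoder layer, the two layers that connect directly to the 784-pixel image.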

In [None]:
# Creating a PyTorch class
class AE(torch.nn.Module):
    def __init__(self):
        super().__init__()

        self.encoder = torch.nn.Sequential(
            torch.nn.Linear(28 * 28, 128),
            torch.nn.ReLU(),
            torch.nn.Linear(128, 64),
            torch.nn.ReLU(),
            torch.nn.Linear(64, 36),
            torch.nn.ReLU(),
            torch.nn.Linear(36, 18),
            torch.nn.ReLU(),
            torch.nn.Linear(18, 9)
        )
    
        self.decoder = torch.nn.Sequential(
            torch.nn.Linear(9, 18),
            torch.nn.ReLU(),
            torch.nn.Linear(18, 36),
            torch.nn.ReLU(),
            torch.nn.Linear(36, 64),
            torch.nn.ReLU(),
            torch.nn.Linear(64, 128),
            torch.nn.ReLU(),
            torch.nn.Linear(128, 28 * 28),
            torch.nn.Sigmoid()
        )
    

    def forward(self, x):
        # Compress to the 9-dimensional latent code, then reconstruct from it
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded

# 3. Initializing Model
We measure the reconstruction error with the `Mean Squared Error function`, and we train with an Adam Optimizer using a `learning rate` of $0.1$ and a `weight decay` of $10^{-8}$.
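As a small worked example of what the mean squared error measures, the sketch below computes it by hand on three made-up pixel values. The function `mse` and the sample numbers are illustrative only. With its default settings, `torch.nn.MSELoss` performs the same averaging over all elements of a batch.

```python
def mse(reconstructed, original):
    """Mean of the squared element-wise differences."""
    assert len(reconstructed) == len(original)
    return sum((r - o) ** 2 for r, o in zip(reconstructed, original)) / len(original)


# Made-up pixel intensities in [0, 1], for illustration only.
original = [0.0, 0.5, 1.0]
reconstructed = [0.1, 0.5, 0.8]

# (0.1^2 + 0.0^2 + 0.2^2) / 3 = 0.05 / 3
error = mse(reconstructed, original)
print(round(error, 4))
# → 0.0167
```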

In [None]:
# Model Initialization
model = AE()

# Reconstruction error measured with the MSE loss function
loss_function = torch.nn.MSELoss()

# Using an Adam Optimizer with lr=0.1
optimizer = torch.optim.Adam(
                        model.parameters(),
                        lr=1e-1,
                        weight_decay=1e-8,
                        )

# 4. Create Output Generation
1. In each epoch, every batch of images is reshaped to (-1, 784) and passed to the autoencoder, which returns the reconstructed images. The loss between the reconstruction and the input is computed with the MSELoss function and recorded for plotting. The gradients held by the optimizer are reset to zero with zero_grad(), loss.backward() computes and stores the new gradients, and step() updates the model parameters. The last batch of each epoch and its reconstruction are stored in the outputs list.
2. The original and reconstructed images are detached from the computation graph so that they can be converted to NumPy arrays for plotting.
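The role of zero_grad() in the steps above can be seen on a one-parameter toy problem. This is only a sketch: it uses plain SGD instead of Adam so the update is easy to check by hand, and the tensor `w` and the loss `2 * w` are made up for illustration.

```python
import torch

# A single trainable weight; the "loss" is simply 2 * w, so d(loss)/dw = 2.
w = torch.tensor([1.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)  # SGD chosen only to keep the arithmetic simple

(2 * w).sum().backward()
grad_first = w.grad.item()          # gradient of the first backward pass

(2 * w).sum().backward()            # no zero_grad(): the new gradient is added to the old one
grad_accumulated = w.grad.item()

optimizer.zero_grad()               # reset before computing a fresh gradient
(2 * w).sum().backward()
grad_after_reset = w.grad.item()

optimizer.step()                    # w <- w - lr * grad = 1.0 - 0.1 * 2.0
w_updated = w.item()

print(grad_first, grad_accumulated, grad_after_reset)
# → 2.0 4.0 2.0
```

Without the reset, each batch would contribute to a running sum of gradients, which is why the training loop calls zero_grad() before every backward().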

Note
----
This snippet takes 15 to 20 minutes to execute, depending on the processor type. Set epochs = 1 for quick results, or use a GPU/TPU runtime for faster computation.
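One possible way to use a GPU when it is available is sketched below. It assumes a CUDA-capable runtime and falls back to the CPU otherwise. TPU runtimes need additional libraries and are not covered here. The `layer` and `batch` objects are stand-ins for the autoencoder and an MNIST batch.

```python
import torch

# Pick the GPU if PyTorch can see one, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the autoencoder: any torch.nn.Module is moved the same way.
layer = torch.nn.Linear(28 * 28, 128).to(device)

# Each batch must live on the same device as the model.
batch = torch.rand(32, 28 * 28).to(device)
out = layer(batch)

print(out.shape, out.device.type)
```

In the training loop this would amount to calling `.to(device)` once on the model and once on every batch before the forward pass.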

In [None]:
epochs = 20
outputs = []
losses = []

for epoch in range(epochs):
    for (image, _) in loader:

        # Reshaping the image to (-1, 784)
        image = image.reshape(-1, 28*28)

        # Output of AutoEncoder
        reconstructed = model(image)

        # Calculating the loss function
        loss = loss_function(reconstructed, image)

        # Reset the gradients, backpropagate the loss,
        # then let .step() update the parameters
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Store the loss as a plain float for plotting
        losses.append(loss.item())
    # Keep the last batch of the epoch, detached from the graph
    outputs.append((epoch, image, reconstructed.detach()))

# Defining the Plot Style
plt.style.use('fivethirtyeight')
plt.xlabel('Iterations')
plt.ylabel('Loss')
 
# Plotting the last 100 values
plt.plot(losses[-100:])

# 5. Input to and Reconstruction from the Autoencoder

In [None]:
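# Each imshow call draws over the previous one, so only the
# last image of the batch remains visible in each figure
for i, item in enumerate(image):

    # Reshape the array for plotting
    item = item.reshape(-1, 28, 28)
    plt.imshow(item[0])
plt.show()

for i, item in enumerate(reconstructed):
    # Detach from the computation graph before plotting
    item = item.detach().reshape(-1, 28, 28)
    plt.imshow(item[0])
plt.show()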

<font color="steelblue" size="4">

Summary
-------
1. Although the reconstructed images are recognizable, they are extremely grainy.
2. To improve this result, extra layers and/or neurons may be added, or the autoencoder could be built on a convolutional neural network architecture.
3. Autoencoders are well suited to `dimensionality reduction`.
4. They can also be used for data denoising and for understanding a dataset's spread.

</font>
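The denoising use mentioned in the summary could be sketched as follows: corrupt each input with noise and train the network to reproduce the clean image. The helper `add_noise` and the noise level `std=0.2` are hypothetical choices for illustration and are not part of the notebook above.

```python
import torch


def add_noise(images, std=0.2):
    """Add Gaussian noise and clip back to the valid pixel range [0, 1]."""
    noisy = images + std * torch.randn_like(images)
    return noisy.clamp(0.0, 1.0)


torch.manual_seed(0)
clean = torch.rand(4, 28 * 28)      # stand-in for a flattened MNIST batch
noisy = add_noise(clean)

# In a denoising setup the model sees `noisy`, but the loss compares against `clean`:
#   loss = loss_function(model(noisy), clean)
print(noisy.shape)
```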