<a href="https://colab.research.google.com/github/thad75/TP_ENSEA_ELEVE/blob/main/2A/Majeure%20Signal/AutoEncoder.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Image Compression using AutoEncoders: A Deep Learning Task

Time : 8h

# Goal of this lab

* Get to know Deep Learning and PyTorch Framework
* Learn how to compress images using AutoEncoders
* Understand the differences between JPEG compression and Deep Learning for compression
* Know how to read and use code given by someone else


# Parts of this lab

The lab is constructed as a basic Data Science Pipeline as follows

* 0 - Introduction and Documents Reading
* 1 - Discovering a Dataset
* 2 - Creating a Model
* 3 - Training the Model
* 4 - Testing the Model
* 5 - Reporting the findings



---


# Disclaimer

We will refer to :
- AutoEncoder as AE
- Multi Layer Perceptron as MLP

Before beginning this lab, please make sure that your environment has a GPU. To do so, in the Colab menu go to : 
* Runtime
* Change runtime type
* Hardware accelerator : GPU


---



# 0 - Introduction

In part 1 of this lab, we saw JPEG compression. JPEG is a general-purpose algorithm that can compress any image. Let's first check your understanding of the JPEG algorithm :
* What are the components of the Encoding part of the algorithm ?
* What are the components of the Decoding part of the algorithm ?
* Is the down-sampling phase of JPEG linear ? (i.e. : of the form y = ax+b)
* Is it a lossless compression algorithm ?

# Your Mission : AI for the Win


Hi there, we are a Big Company and we need your help. We have some images that we want to compress. We tried the JPEG algorithm but we want to try some new state-of-the-art methods. Your mission is to help us as a Data Scientist in this task by creating a DL model that compresses images with as little loss as possible. 


Alright, let's get started.

<img src="https://i.pinimg.com/originals/16/b2/96/16b296afb78ec57d12c931bc72b42eec.gif">

In [None]:
%matplotlib inline
import torch
import torchvision
from torch import nn
from torch.nn import functional as F
from torch.utils.data import DataLoader
from torch.utils.data import random_split
from torchvision.datasets import MNIST, CIFAR10
from torchvision import transforms
import matplotlib.pyplot as plt
import numpy as np
import skimage
from skimage import io
from google.colab.patches import cv2_imshow
import cv2

# Reading the Docs 

Welcome to Big Company. As usual when beginning a new project, you have some reading to do. Take 5-10 minutes to read the following documentation to understand the basics of Deep Learning and AutoEncoders.

## What is Deep Learning ?

Deep Learning is a branch of AI where you **teach a Model** a certain **task** using a **Dataset**. The model, a neural network, is built from multiple consecutive **layers** of neuron-like units, loosely inspired by neurons in the human brain. Typically, many consecutive layers are used, which is why it is referred to as deep learning. In those layers, each neuron has several **parameters** (**weights**) that are updated during **training** by minimizing a **loss** (error) function, using **Stochastic Gradient Descent**. Besides the model parameters, which are learned from the dataset, there are also **hyperparameters** that you have to tune yourself, for example how many layers are used in your model, how many neurons there are per layer, and so on. The Model infers a prediction from an **input**. In fact, a Deep Neural Network can be seen as a complex function ${f}$ that maps the input data to a space learned from the Dataset. 

Note the bold words. These are the important things you need to understand about Deep Learning. 
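
To make this vocabulary concrete, here is a minimal sketch (not part of the original lab) that fits a one-parameter model $y = w \cdot x$ by gradient descent. The toy data, the variable names and the hyperparameter values are assumptions chosen for illustration. For simplicity it uses the whole toy dataset at every step, whereas Stochastic Gradient Descent would use a random mini-batch.

```python
import torch

# Toy dataset (assumption for illustration): inputs x and targets y = 3 * x
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x

# One trainable parameter (weight), initialised at 0
w = torch.zeros(1, requires_grad=True)

learning_rate = 0.01   # hyperparameter: chosen by hand, not learned
num_steps = 200        # hyperparameter: number of update steps

for step in range(num_steps):
    prediction = w * x                       # the model infers a prediction from the input
    loss = ((prediction - y) ** 2).mean()    # loss (error) function
    loss.backward()                          # compute d(loss)/d(w)
    with torch.no_grad():
        w -= learning_rate * w.grad          # gradient descent update of the parameter
    w.grad.zero_()                           # reset the gradient for the next step

learned_w = w.item()
print(learned_w)  # close to 3.0
```

Here `w` is a parameter (learned from the data), while `learning_rate` and `num_steps` are hyperparameters (chosen by you).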

## Generalities on AutoEncoder


"Autoencoding" is a data compression algorithm where the compression and decompression functions are 1) data-specific, 2) lossy, and 3) learned automatically from examples rather than engineered by a human. In almost all contexts where the term "autoencoder" is used, the compression and decompression functions are implemented with neural networks.

1) Autoencoders are data-specific, which means that they will only be able to compress data similar to what they have been trained on. 

2) Autoencoders are lossy.

3) Autoencoders are learned automatically from data examples.

To build an autoencoder, you need three things: an encoding function, a decoding function, and a distance function that measures how much information is lost between the original data and its decompressed representation. In fact, we can look at the model as a big function :

* The Encoder : This part of the model compresses the input image, $f(x)= z$, where $x$ is the input image and $z$ its compressed representation.
* The Decoder : This part of the model decompresses the compressed representation $z$ to the decompressed image $\tilde{x}$, in other terms we have a function $g$ where $\tilde{x} = g(z)$
* The AutoEncoder : by stacking the Encoder and the Decoder, and as we want $\tilde{x} \approx x$ (i.e. the reconstructed image should be as similar as possible to the input image), we can write the AE as $\tilde{x} = g(z) = g\circ f(x) \approx x$. Ideally $g$ would be the inverse of $f$, but because the compression is lossy the equality only holds approximately.
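
The composition $g \circ f$ can be sketched in a few lines. This is only an illustration under assumptions: the single `nn.Linear` layers, the latent size of 32 and the random input are hypothetical choices, not the model you will build in the lab.

```python
import torch
from torch import nn

torch.manual_seed(0)

input_size, latent_size = 28 * 28, 32    # sizes are illustrative assumptions

f = nn.Linear(input_size, latent_size)   # encoder: x -> z
g = nn.Linear(latent_size, input_size)   # decoder: z -> x_tilde

x = torch.rand(1, input_size)            # a fake flattened image with values in [0, 1)
z = f(x)                                 # compressed representation
x_tilde = g(z)                           # reconstruction g(f(x))

print(x.shape, z.shape, x_tilde.shape)

# Before any training, the reconstruction is far from the input
reconstruction_error = torch.mean((x_tilde - x) ** 2).item()
print(reconstruction_error)
```

Training is what brings `x_tilde` close to `x`; the shapes already show that `z` holds fewer values than `x`.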


You will code two types of AE models :
* MLP style
* Conv style


## Generalities on Deep Learning

In this part, we are going to talk about important stuff in Deep Learning.

### Layers

A Layer is the key building block of a DL Model. A Layer takes information from the previous layer, transforms it, and passes the result to the next layer. In a DL Network, each layer extracts features. 
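
As a small illustration (the layer sizes are assumptions, chosen to match a flattened 28 x 28 image), a dense layer in PyTorch can be inspected to see the parameters it holds:

```python
from torch import nn

layer = nn.Linear(in_features=784, out_features=128)

# A dense layer stores one weight per (input, output) pair plus one bias per output
num_weights = layer.weight.numel()   # 128 * 784
num_biases = layer.bias.numel()      # 128
num_parameters = sum(p.numel() for p in layer.parameters())

print(num_weights, num_biases, num_parameters)
```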

### Activation Functions

An Activation Function is a function that is applied to the output of a Neural Layer. It plays the role of the excitation threshold that decides whether a neuron reacts or not.
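
For instance, the two activations used later in this lab behave as follows on a few hand-picked values (the input values are arbitrary, for illustration only):

```python
import torch
from torch import nn

values = torch.tensor([-2.0, 0.0, 3.0])

relu_out = nn.ReLU()(values)         # negative values are set to 0
sigmoid_out = nn.Sigmoid()(values)   # every value is squashed into (0, 1)

print(relu_out)      # tensor([0., 0., 3.])
print(sigmoid_out)
```

Sigmoid is a natural choice for an output layer when pixel values lie in [0, 1].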


# 1- Data : Exploring the Unknown

## Dataset

<img src="https://labelyourdata.com/img/article-illustrations/splitting_data.png" height=200>


When training  a DL model, we use a Dataset. The model uses the data to learn something for a task. We usually divide the data into Training, Validation, Test sets.
- Training set is used to train the Model (i.e., to find the parameters of Model).
- Validation set is used to watch the Model's training (to verify whether the training procedure goes well).
- Test set is used to evaluate the performance of the Model (in our case, to measure if the model compresses and decompresses well new images).


<img src="https://i.imgflip.com/653bu2.jpg" height=400>

**In our case of AE, we do not need the image labels. Our method is an unsupervised algorithm.**

### 1 - a - The first one : the Training set

#### Understanding the Data : Data Exploration

Ok let's have a look at what the Data Engineering Team sent us. Let's understand the Dataset. Let's use their API.


- What is the size of the train dataset ?
- What are the elements available in one piece of data ? (image,label)
- What is the shape of one piece of data ?
- What is the type of one piece of data ?
- Plot a few elements of the dataset using Matplotlib.

In [None]:
# TODO: Load MNIST Train Dataset from TorchVision

dataset = MNIST('', train=True, download=True, transform=transforms.ToTensor())

# TODO: What's the size of the Dataset ?
# TODO: Retrieve one element of the Dataset ? What is the shape of one piece of Data ? 

size_of_dataset =  len(dataset)
data = dataset[0]
print(data[0].shape)

# TODO: Plot the retrieved Data

plt.imshow(data[0].permute(1, 2, 0)[...,0])

### 1 - b - The second one : the Test Dataset

As you can see, the MNIST class has a `train` argument. When it is set to `True`, you are loading the train Dataset. Hence, change it to `False` to load the test Dataset.

In [None]:
# TODO : Load the test dataset
mnist_test =  MNIST('', train=False, download=True, transform=transforms.ToTensor())

Let's verify that the Data in the Test dataset are in the same style as the Train Dataset :    
- Plot a few Data from the Test Dataset with their labels.
- Are the data similar ? Are the labels similar ?

In [None]:
# TODO: What's the size of the Dataset ?
# TODO: Retrieve one element of the Dataset ? What is the shape of one piece of Data ? 
# TODO: Plot the retrieved Data
size_of_dataset =  len(mnist_test)  # use the test dataset here, not the train dataset
data = mnist_test[0]
print(data[0].shape)


plt.imshow(data[0].permute(1, 2, 0)[...,0])
plt.title(f'label: {data[1]}')

### 1 - c - The third one : Creating the Validation Dataset

At this moment, we have a Train and a Test Dataset. We also like having a Validation Dataset. The validation dataset is often a smaller part of the training dataset. The Validation Dataset allows us to follow the model's training. In fact, the data of the Validation Dataset is sent to the model while training. However, no gradients are computed for the validation data, so the weights are not updated.

- What do 55000 and 5000 mean ?
- Determine the split value of the training set to create the validation dataset
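
To see how `random_split` behaves before applying it to MNIST, here is a small sketch on a stand-in dataset of 10 items. The toy dataset, the split sizes and the seed are assumptions for illustration; the `generator` argument is optional and only makes the split reproducible.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in dataset of 10 items (an assumption for illustration, not MNIST)
toy_dataset = TensorDataset(torch.arange(10))

# Fixing the generator seed makes the split reproducible
generator = torch.Generator().manual_seed(0)
toy_train, toy_val = random_split(toy_dataset, [8, 2], generator=generator)

print(len(toy_train), len(toy_val))  # 8 2

# The two subsets do not overlap and together cover the whole dataset
train_ids = {int(item[0]) for item in toy_train}
val_ids = {int(item[0]) for item in toy_val}
print(sorted(train_ids), sorted(val_ids))
```

The lengths passed to `random_split` must add up to the length of the dataset being split.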

In [None]:
# Questions : what does 55000 and 5000 mean ? Hint: look at the Dataset length and determine the split value
mnist_train, mnist_val = random_split(dataset, [55000, 5000])


### 1 - d - Creating the Dataloader

So the Dataset returns one element at a time. In DL, we like sending many items at the same time to the model. We form a BATCH of Data using a DataLoader. A DataLoader is an iterable over the dataset: it forms BATCHES of Data for you and fetches them.
- Create a DataLoader for your Training, Validation and Testing Datasets
- What is the drop_last attribute ?

More information on dataloader : https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader
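
The effect of batching can be observed on a tiny stand-in dataset. The sketch below assumes 10 fake images and a batch size of 4 (both arbitrary) and compares the batch sizes obtained with and without `drop_last`:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 10 fake "images" of shape (1, 28, 28); the sizes are illustrative assumptions
toy_dataset = TensorDataset(torch.zeros(10, 1, 28, 28))

loader_keep = DataLoader(toy_dataset, batch_size=4, drop_last=False)
loader_drop = DataLoader(toy_dataset, batch_size=4, drop_last=True)

sizes_keep = [batch[0].shape[0] for batch in loader_keep]
sizes_drop = [batch[0].shape[0] for batch in loader_drop]

print(sizes_keep)  # [4, 4, 2]
print(sizes_drop)  # [4, 4]
```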

In [None]:
# Shuffle the training data so that each epoch sees the batches in a different order
train_loader = DataLoader(mnist_train, batch_size=128, shuffle=True, drop_last=True)
val_loader = DataLoader(mnist_val, batch_size=128, drop_last=True)
test_loader = DataLoader(mnist_test, batch_size=32, drop_last=True)

Did the Data Engineer do good work ?
- Is there enough data ?
- Are they easily accessible ?
- Are they correctly labeled ?



# 2 - Creating the Model : The AutoEncoder

Now that we have seen what the data looks like and created our datasets, we need to fulfill our mission. We need a model.

<img src="https://i.imgflip.com/640uob.jpg" height=300>

We are going to explore the path of AutoEncoders ! Alright, let's write some readable code. Our code must be modular and easy to read. We should try two types of AutoEncoders :    
- MLP Style
- Conv Style

- Use PyTorch and create modular and stackable Models that inherit from nn.Module
- Choose wisely our Optimizer and Loss

## I - MLP Style : Exploring the Neurons

<img src='https://www.researchgate.net/publication/344394387/figure/fig1/AS:974657746399232@1609387923440/Figure-Computational-Schematics-of-the-MLP-and-the-autoencoder.png'>

We will first try a MLP AE.




#### Creating a Model in PyTorch

Creating a model in PyTorch is simple. A PyTorch model is an object that inherits from nn.Module. The pseudo-code is the following :     

```
class Model(nn.Module):
  def __init__(self,...):
    """
    Defines the model. You can put the input size as a parameter if needed.
    """
    super().__init__() # to init the parent class
    self.layers = ... # defining the model : could be Conv2d, Linear, RNN, LSTM


  def forward(self,x):
    """
    The input x is forwarded through the neural net. 
    """
    output = self.layers(x)
    return output

  # Other methods go below
```

More information : https://pytorch.org/docs/stable/nn.html
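
As a runnable counterpart to the pseudo-code, here is a minimal sketch of the pattern. `TinyModel` and its layer sizes are hypothetical and only illustrate how `__init__` and `forward` fit together; it is not the AutoEncoder you have to write.

```python
import torch
from torch import nn


class TinyModel(nn.Module):
    """Hypothetical two-layer model used only to illustrate the nn.Module pattern."""

    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()  # initialise the parent class
        self.layers = nn.Sequential(
            nn.Linear(input_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, output_size),
        )

    def forward(self, x):
        # Calling tiny(x) runs this method
        return self.layers(x)


tiny = TinyModel(input_size=8, hidden_size=4, output_size=2)
batch = torch.rand(5, 8)    # a batch of 5 inputs with 8 features each
output = tiny(batch)
print(output.shape)         # torch.Size([5, 2])
```

Note that the first dimension of the input and of the output is the batch size.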


### Correcting the Mistakes

First, we will try a really simple model :
* an Input Dense Layer
* a Latent Space
* an Output Dense Layer


We have received some code from the other members of the Deep Learning Engineering Team. Looks like there are lots of mistakes.. Let's correct them.

In [None]:
# TODO : Correct the Following Class. 
# Hint : Look at the input, output size, the activations, how the data is forwarded ...

class AutoEncoder_MLP(nn.Module):
  def __init__(self, input_size, compressed_space_size):
    """
    The model is an Input Layer, a Hidden Layer and an Output layer 
    """
    super().__init__() 
    # Init the class attributes
    self.input_size = input_size
    self.output_size = input_size
    self.compressed_space_size = compressed_space_size
    # Init the model
    self.input = nn.Sequential(nn.Linear(self.input_size, self.compressed_space_size),
                               nn.ReLU()) 
    self.output = nn.Sequential(nn.Linear(self.compressed_space_size,self.output_size ),
                               nn.Sigmoid())


  def forward(self,x):
    """
    The input x is forwarded through the neural net. 
    """
    compressed_image = self.input(x)
    decompressed_image = self.output(compressed_image)
    return decompressed_image


In [None]:
# TODO : Create an Instance of the Model by calling the Class with the correct values
model = AutoEncoder_MLP(28*28,128)

# TODO : Print the model.
print(model)

AutoEncoder_MLP(
  (input): Sequential(
    (0): Linear(in_features=784, out_features=128, bias=True)
    (1): ReLU()
  )
  (output): Sequential(
    (0): Linear(in_features=128, out_features=784, bias=True)
    (1): Sigmoid()
  )
)


# 3 - Training

We can now train the model. We have a Model and a Dataset. We need a few more things..



## 3 - a - A Loss

We need a Loss Function. The Loss function must tell us how far our predictions are from the labels (or not ?). It could do so by comparing the distributions of the two inputs, or by directly comparing the data using a distance.

Let's reason. We are recreating an Image from its compressed version and we want the recreated image to be as similar as possible to the original image, i.e. $\tilde{x} \approx x$

<img src="https://i.imgflip.com/653jbl.jpg" height=400>



- How can you calculate the similarity between two vectors ?
- What types of loss do you know that compute the **distance** between two inputs ?

More information : https://pytorch.org/docs/stable/nn.html#loss-functions
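
One common distance is the mean squared error. The sketch below uses two hand-made 4-pixel "images" (an assumption for illustration) to check that `nn.MSELoss` matches the mean of the squared differences computed by hand:

```python
import torch
from torch import nn

original = torch.tensor([[0.0, 0.5, 1.0, 1.0]])
reconstructed = torch.tensor([[0.0, 0.5, 0.0, 1.0]])

criterion_demo = nn.MSELoss()   # default reduction is the mean over all elements
mse = criterion_demo(reconstructed, original).item()

# Same value computed by hand: mean of the squared differences
manual = ((reconstructed - original) ** 2).mean().item()

print(mse, manual)  # 0.25 0.25
```

Only one of the four pixels differs (by 1), so the mean squared error is 1/4.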

In [None]:
# TODO : Load the needed loss function
criterion = nn.MSELoss()

## 3 - b - An Optimizer

<img src="https://i.imgflip.com/640sfs.jpg" height= 400>

We need something to update the weights of the model. In fact, we need to perform Gradient Descent to update the weights of each layer according to the gradient of the loss. The optimizer searches for an optimum. However, it needs a step size to perform this search. This step size is called the Learning Rate. The learning rate has a huge effect on the learning. 

<img src="https://miro.medium.com/max/918/0*uIa_Dz3czXO5iWyI." height=300>

In this case, we will use the Adam Optimizer as it is a really efficient Optimizer. Don't hesitate to have a look at the other optimizers. 

More information : https://pytorch.org/docs/stable/optim.html
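
The effect of the learning rate can be seen on a deliberately simple problem. The sketch below is an illustration only: it uses plain gradient descent (not Adam) to minimise $f(w) = w^2$, and the function name and the three learning rates are assumptions.

```python
def gradient_descent(learning_rate, num_steps=20, start=1.0):
    """Minimise f(w) = w**2 (whose gradient is 2*w) and return the final w."""
    w = start
    for _ in range(num_steps):
        w = w - learning_rate * 2 * w
    return w


small_step = gradient_descent(0.01)   # moves towards 0, but slowly
good_step = gradient_descent(0.4)     # converges quickly to 0
large_step = gradient_descent(1.1)    # overshoots at every step and diverges

print(small_step, good_step, large_step)
```

A learning rate that is too small wastes epochs, while one that is too large prevents the loss from decreasing at all.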

In [None]:
# TODO : define a small learning rate
learning_rate = 1e-3

# TODO : load the Adam optimizer in the optimizer variable
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate )

## 3 - c - Training

Now that we have everything that is needed for training, we have to create the training loop. We need hyperparameters, i.e. parameters that control the learning. We also need to send the model and the data to the GPU for accelerated computation.


The loop consists of :
* Sending Data through the model to obtain Predictions
* Computing the Loss 
* Backpropagating the Loss to compute the Gradients 
* Updating the weights with the Optimizer
* Logging the losses and accuracies (if they exist)


The number of epochs is a hyperparameter that defines the number of times that the learning algorithm will work through the entire training dataset.

Think of it as you trying to solve an exercise : the first time you won't understand, the second time you'll succeed a bit more, and so on..

In [None]:
# Pre-defined and useful variables
device = 'cuda' if torch.cuda.is_available() else 'cpu' # To send to the GPU
mini_batches_print = 10 # To print every 10 mini-batches
# TODO : Send the model to the device using .to
net = model.to(device)

# TODO : Define your number of epochs
num_epochs = 20

loss_train, loss_val = [], []
for epoch in range(num_epochs):
    running_loss_t, running_loss_v = 0.0, 0.0
    # TODO : Create your Training Loop
    for i, data in enumerate(train_loader, 0):
        # TODO : load the data into two variables
        image, label = data
        # TODO : reshape the input image so that it fits the number of neurons of the input layer. Don't forget the Batch Size, the 1st dimension must always be the Batch Size
        image_reshaped = image.view(image.shape[0], -1).to(device)
        optimizer.zero_grad()
        # TODO : send the image to the model
        outputs = net(image_reshaped)
        # TODO/Questions : Do we need to reshape the output ? If yes, reshape it
        outputs = outputs.view(image.shape)
        # TODO : Compute the loss
        loss = criterion(outputs, image.to(device))

        loss.backward()
        optimizer.step()
        running_loss_t += loss.item()
        if (i + 1) % mini_batches_print == 0:
            # Mean training loss over the batches seen so far in this epoch
            print('training loss is :', running_loss_t / (i + 1))
    # Log one value per epoch so that both curves share the same x-axis
    loss_train.append(running_loss_t / len(train_loader))
    # TODO : Create your Validation Loop
    with torch.no_grad():
        for i, data in enumerate(val_loader, 0):
            # TODO : Do the same as the Train loop but delete everything related to weight update (optimizer, loss backwards ...)
            image = data[0].view(data[0].shape[0], -1).to(device)
            outputs = net(image)
            outputs = outputs.view(data[0].shape)
            loss = criterion(outputs, data[0].to(device))

            running_loss_v += loss.item()
    # Mean validation loss over all validation batches of the epoch
    loss_val.append(running_loss_v / len(val_loader))
    print('validation loss is :', loss_val[-1])

## 3 - d - Did it learn somethin' ?

As you might have seen, we've logged the losses computed while training into two lists (loss_train, loss_val). Let's plot them (don't forget to add a title and axis labels).
- How can you tell that the training is over ?

In [None]:
# TODO : plot the train and val loss on the same graph using matplotlib.pyplot 

plt.plot(loss_train, label='train')
plt.plot(loss_val, label='validation')
plt.title('Reconstruction loss during training')
plt.xlabel('Epoch')
plt.ylabel('MSE loss')
plt.legend()
plt.show()

# 4 - Testing the Compression

Now that we have trained our model, let's test it on the test dataset. What does testing mean ? Wait, there's an email from the senior Data Scientist :


---

 🔊 🔊 🔊 **Message from the Senior Data Scientist** 🔊 🔊 🔊

Sup' heard you've trained your model. 
Ok so let's test it, shall we ?
Take the test dataloader, iterate through it and send the test data to the model. We need to check how good the decompressed images are..
Don't forget to disable all gradient calculations, they take time and memory for nothing. 


---





### 4 - a - Testing on the Test Dataset


Interesting, the Senior talked about disabling all gradient calculations 
* Why must we not compute the gradients for the testing step ?
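
To observe what `torch.no_grad()` changes, here is a small sketch on a stand-in layer (the layer and input sizes are arbitrary assumptions). The outputs are identical, but no computation graph is kept inside the `no_grad` block:

```python
import torch
from torch import nn

layer = nn.Linear(4, 2)
x = torch.rand(3, 4)

out_train = layer(x)       # the computation graph is recorded for backpropagation
with torch.no_grad():
    out_test = layer(x)    # no graph is recorded inside this block

print(out_train.requires_grad)  # True
print(out_test.requires_grad)   # False
```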

In [None]:
def imshow(img,name= 'GT'):
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.title(name)
    plt.show()

# TODO : form your testing loop.
losses = 0.0
with torch.no_grad():
  for i, data in enumerate(test_loader, 0):    
    image = data[0].view(data[0].shape[0],-1).to(device)
    outputs = net(image)
    outputs = outputs.view(data[0].shape)
    loss = criterion(outputs, data[0].to(device))
    losses += loss.item()  # accumulate a Python float rather than a tensor

# Mean reconstruction error (MSE) over all test batches
test_loss = losses / len(test_loader)

# Plot the last batch
imshow(torchvision.utils.make_grid(outputs.detach().cpu()),'Pred')
imshow(torchvision.utils.make_grid(data[0]),'GT')

# TODO : Print the difference in decompression and write it somewhere
print('The mean difference between the Real Images and the Decompressed Images is: ',test_loss)

Ok, now to see the effect of the compression, change the compressed_space_size to different values. For example, try : 512, 128, 16, 1.


<img src="https://i.imgflip.com/64elyi.jpg" height=200>

In [None]:
# TODO : Change your model, test different sizes of Compressed Space. We advise you to try 1, 16, 128, 512
model = AutoEncoder_MLP(28*28,512)
# Train it
num_epochs = 20
device = 'cuda' if torch.cuda.is_available() else 'cpu'
net = model.to(device)
criterion = nn.MSELoss()
learning_rate = 1e-3
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate )
for epoch in range(num_epochs) : 
    running_loss_t,running_loss_v = 0.0, 0.0
    for i, data in enumerate(train_loader, 0): 
        # TODO : reshape the input image so that it fits the number of neurons of the input layer. Don't forget the Batch Size, the 1st dimension must always be the Batch Size
        image, label = data[0].view(data[0].shape[0],-1).to(device), data[1]
        optimizer.zero_grad()
        outputs = net(image)
        outputs = outputs.view(data[0].shape)
        loss = criterion(outputs, data[0].to(device))
        loss.backward()
        optimizer.step()
        running_loss_t += loss.item()
    # Mean loss over all training batches of the epoch
    print('training loss is :',running_loss_t/len(train_loader))
    with torch.no_grad():
      for i, data in enumerate(val_loader, 0):    
        image, label = data[0].view(data[0].shape[0],-1).to(device), data[1]
        outputs = net(image)
        outputs = outputs.view(data[0].shape)
        loss = criterion(outputs, data[0].to(device))
        running_loss_v += loss.item()
    # Mean loss over all validation batches of the epoch
    print('validation loss is :',running_loss_v/len(val_loader))

# Test it

with torch.no_grad():
  running_loss = []
  for i, data in enumerate(test_loader, 0):    
    image, label = data[0].view(data[0].shape[0],-1).to(device), data[1]
    outputs = net(image)
    outputs = outputs.view(data[0].shape)

    loss = criterion(outputs, data[0].to(device))
    running_loss.append(loss.item())  # store Python floats rather than tensors

# Plot the last batch
imshow(torchvision.utils.make_grid(outputs.detach().cpu()),'Pred')
imshow(torchvision.utils.make_grid(data[0]),'GT')
# Mean reconstruction error (MSE) over all test batches
sum(running_loss) / len(running_loss)

### 4 - b - Testing on unseen Data

Alright, we can obviously say that it works fairly well on the Test Data. We should be ok, right ?

Let's test it on other Data.

* Write the code to test the inference on Images taken from the internet.

In [None]:
# TODO : Test some images from the internet to see the compression effect 
image_filename = "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQLwsAJaf0m-QmpP056GQMqvY3jTD6cF6FU7Q&usqp=CAU"
# skimage reads images in RGB order (not BGR like cv2.imread), hence COLOR_RGB2GRAY
image_numpy = cv2.cvtColor(skimage.io.imread(image_filename ),cv2.COLOR_RGB2GRAY)
transform=transforms.Compose([transforms.ToTensor(),
                              transforms.Resize((28,28))])

# TODO : Transform the images and add a dimension for the batch size using unsqueeze
image = transform(image_numpy).unsqueeze(0)

# TODO : Send the image to the model and process the prediction
with torch.no_grad():
    pred = model(image.view(1,-1).to(device))
pred = pred.view(image.shape).squeeze(0).squeeze(0)

# TODO : Plot the results
fig,axarr = plt.subplots(1, 2)
axarr[0].imshow(pred.detach().cpu().numpy())
axarr[0].set_title('Pred')
axarr[1].imshow(image.squeeze(0).squeeze(0))
axarr[1].set_title('GT')


* What can you say ?

# 5 - Write your Research

Now we need to present our results to the Senior Data Scientist. 
* Sum up all your research on this subject in 2 pages.

You must explain :
* The dataset you used 
* The tests you did
* The models you tested
* The results you had
* The explanation of the results (why it works, why it doesn't)
* How can we compute the compression rate of the model ? 
* Do we need to consider the model's size ?
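
As a starting point for the last two questions, here is one possible way to put numbers on them. It is a sketch under assumptions, not the expected answer: the compression ratio is taken as the number of input values divided by the number of latent values, and the parameter count assumes the one-layer encoder and one-layer decoder MLP used above. Both helper functions are hypothetical.

```python
def compression_ratio(input_size, latent_size):
    """Number of input values per value stored in the compressed representation."""
    return input_size / latent_size


def mlp_autoencoder_parameters(input_size, latent_size):
    """Weights and biases of one Linear encoder layer plus one Linear decoder layer."""
    encoder = input_size * latent_size + latent_size
    decoder = latent_size * input_size + input_size
    return encoder + decoder


for latent_size in (512, 128, 16, 1):
    print(latent_size,
          compression_ratio(28 * 28, latent_size),
          mlp_autoencoder_parameters(28 * 28, latent_size))
```

This ratio counts values only; it ignores how many bits each value takes and the storage needed for the decoder itself, which you may want to discuss in your report.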

# II - Convolutional Layer Style : Seeing a Region

<img src="https://miro.medium.com/max/1838/1*LSYNW5m3TN7xRX61BZhoZA.png" height = 300>

The first sprint is over. The Senior Data Scientist sent a new message.


---
 🔊 🔊 🔊 **Message from the Senior Data Scientist** 🔊 🔊 🔊


Sup'

Good work for your first results. However, we need to try another type of model.
Test the same using Conv2d layers. I know that it is new for you but here are some explanations:  Convolutional Layers are filters that "scan" the input image in order to extract features. 

I sent you some classes that you must reuse in your code. Tell me if it has better results.

Peace

Senior Data Scientist

P.S here's a funny image for ya 

<img src="https://i.imgflip.com/65b89l.jpg" height=400>


---



# TODO : Need more explanation on conv layers

Take 5 minutes and play with the following link :   

* https://ezyang.github.io/convolution-visualizer/

Questions :    
* What is the stride parameter ?
* What is the padding parameter ?
* How does increasing the Kernel Size change the output ?
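
To check your answers, the sketch below compares the usual output-size formula, $\lfloor (n + 2p - k)/s \rfloor + 1$ for an $n \times n$ input with kernel size $k$, stride $s$ and padding $p$ (dilation 1), with what `nn.Conv2d` actually returns. The helper `conv_output_size` and the tested configurations are assumptions for illustration.

```python
import torch
from torch import nn


def conv_output_size(n, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution on an n x n input (dilation 1)."""
    return (n + 2 * padding - kernel_size) // stride + 1


x = torch.rand(1, 1, 28, 28)   # one grayscale 28 x 28 image

results = []
for kernel_size, stride, padding in [(3, 1, 0), (3, 1, 1), (3, 2, 1), (5, 1, 0)]:
    conv = nn.Conv2d(1, 8, kernel_size=kernel_size, stride=stride, padding=padding)
    actual = conv(x).shape[-1]                                   # measured output width
    predicted = conv_output_size(28, kernel_size, stride, padding)
    results.append((kernel_size, stride, padding, actual, predicted))
    print(kernel_size, stride, padding, actual, predicted)
```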


## Some Definitions

Receptive Field : the receptive field is the set of input pixels that the kernel sees when computing one output value.

# 1 - Model Definition : The code they sent us

Hmmm, let's have a look at the model sent by the senior Data Scientist. Looks like there are a bunch of submodules.

### 1 - a - SubModules : The stem

#### Conv Down

ConvDown is used to compress the input image. It applies a convolution between the input image and the kernel. In fact, it is used to extract features. Our ConvDown Model will be composed of two layers :     
* Conv2d layer
* Non Linearity (ReLU)

<img src="https://www.jeremyjordan.me/content/images/2017/07/no_padding_no_strides.gif">

In [None]:
class ConvDown(nn.Module):

    def __init__(self, input_channel, output_channel, kernel_size = 3):
        super().__init__()
        self.input_channel = input_channel
        self.output_channel = output_channel
        self.kernel_size = kernel_size
        self.model = nn.Sequential(nn.Conv2d(self.input_channel, self. output_channel, kernel_size =self.kernel_size ),
                                    nn.ReLU())
                                    
    def forward(self,x):
        return self.model(x)

#### Conv Up

ConvUp is used to decompress the input image. In fact, it uses extracted features to propose a reconstructed output feature map.

* From what you've seen on the website, is it possible to increase the output map size using Conv2d layers ?

We introduce ConvTranspose2d layers, which apply a Transposed Convolution over an input image. These layers upsample their input and, since their kernels are trainable, they learn how to upsample the images.
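
A quick shape check illustrates the idea. In this sketch (the channel counts are arbitrary assumptions), a `Conv2d` with kernel size 3 and no padding shrinks a 28 x 28 image to 26 x 26, and a `ConvTranspose2d` with the same kernel size brings it back to 28 x 28:

```python
import torch
from torch import nn

x = torch.rand(1, 1, 28, 28)

down = nn.Conv2d(1, 4, kernel_size=3)          # 28 -> 26
up = nn.ConvTranspose2d(4, 1, kernel_size=3)   # 26 -> 28

z = down(x)
x_tilde = up(z)

print(tuple(z.shape))        # (1, 4, 26, 26)
print(tuple(x_tilde.shape))  # (1, 1, 28, 28)
```

Only the shapes match at this point; the pixel values of `x_tilde` become meaningful once the layers are trained.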

<img src="https://miro.medium.com/max/1400/1*HnxnJDq-IgsSS0q3Lut4xA.gif" height=300>

In [None]:
class ConvUp(nn.Module):
    """
    ConvUp stacks a ConvTranspose2d layer with an Activation 
    If output is True : the Activation is Sigmoid
    If output is False : the Activation is ReLU
    """
    def __init__(self, input_channel, output_channel, kernel_size = 3 , output = True):
        super().__init__()
        self.input_channel = input_channel
        self.output_channel = output_channel
        self.kernel_size = kernel_size
        self.output = output
        # Sigmoid keeps the final output in [0, 1]; ReLU is used for intermediate layers
        activation = nn.Sigmoid() if output else nn.ReLU()
        self.model = nn.Sequential(nn.ConvTranspose2d(self.input_channel, self.output_channel, kernel_size=self.kernel_size),
                                   activation)
        
    def forward(self,x):
        return self.model(x)

### 1 - b - Modules : The Wrappers

#### Encoder

The encoder stacks multiple ConvDown to compress and extract features.

<img src="https://i.imgflip.com/65bqe0.jpg" height=300>

In [None]:
class Encoder(nn.Module):

    def __init__(self,input_channel, output_channel, kernel_size = 3):
        super().__init__()
        self.input_channel = input_channel
        self.output_channel = output_channel
        self.kernel_size = kernel_size
        self.model = nn.Sequential( ConvDown(self.input_channel, self.output_channel,self.kernel_size))
                    
    def forward(self,x):
        return self.model(x)

#### Decoder

The decoder stacks multiple ConvUp to decompress and upsample the input.

<img src="https://i.imgflip.com/65br5o.jpg" height=300>

In [None]:
class Decoder(nn.Module):

    def __init__(self,input_channel, output_channel, kernel_size = 3, output = True):
        super().__init__()
        self.input_channel = input_channel
        self.output_channel = output_channel
        self.kernel_size = kernel_size
        self.output = output
        # The decoder must map input_channel feature maps back to output_channel channels
        self.model = nn.Sequential( ConvUp(self.input_channel,self.output_channel,self.kernel_size, output))
                    
    def forward(self,x):
        return self.model(x)

### 1 - c - The Conv AutoEncoder : Final Model 

To form the AutoEncoder, we just have to stack the Encoder and the Decoder. The stacking is different here as we refer to the input and output channels of each layer.


In [None]:
class AutoEncoder_Conv(nn.Module):

    def __init__(self,input_size,latent_size, output= True):
        super().__init__()
        self.input_size = input_size
        self.latent_size = latent_size
        self.output = output
        self.model = nn.Sequential(Encoder(self.input_size,self.latent_size),
                                   Decoder(self.latent_size, self.input_size, output=self.output))

    def forward(self,x):
        return self.model(x)



# 2 -Training and Testing 

We can reuse the previously written code. However, we need to make some changes...

* What changes must we make ?


In [None]:
# TODO : Change your model 
model = AutoEncoder_Conv(1,128)
# TODO : Reload your HyperParameters
num_epochs = 20
device = 'cuda' if torch.cuda.is_available() else 'cpu'
net = model.to(device)
criterion = nn.MSELoss()
learning_rate = 1e-3
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate )

# TODO : Rewrite your Training Loop
for epoch in range(num_epochs) : 
    running_loss_t,running_loss_v = 0.0, 0.0
    for i, data in enumerate(train_loader, 0): 
        # TODO : load the data into two variables
        # Conv2d layers take (batch, channel, height, width) inputs, so no flattening is needed
        image, label = data[0].to(device), data[1]
        optimizer.zero_grad()
        outputs = net(image)
        loss = criterion(outputs, image)
        loss.backward()
        optimizer.step()
        running_loss_t += loss.item()
    # Mean loss over all training batches of the epoch
    print('training loss is :',running_loss_t/len(train_loader))
    with torch.no_grad():
      for i, data in enumerate(val_loader, 0):    
        image, label = data[0].to(device), data[1]
        outputs = net(image)
        loss = criterion(outputs, image)
        running_loss_v += loss.item()
    # Mean loss over all validation batches of the epoch
    print('validation loss is :',running_loss_v/len(val_loader))

# TODO : Rewrite your Testing Loop



with torch.no_grad():
  running_loss = []
  for i, data in enumerate(test_loader, 0):    
    image, label = data[0].to(device), data[1]
    outputs = net(image)
    loss = criterion(outputs, data[0].to(device))
    running_loss.append(loss.item())  # store Python floats rather than tensors

# Plot the last batch
imshow(torchvision.utils.make_grid(outputs.detach().cpu()),'Pred')
imshow(torchvision.utils.make_grid(data[0]),'GT')
# Mean reconstruction error (MSE) over all test batches
sum(running_loss) / len(running_loss)

Hmm interesting.
Let's compare the results :  For the same compressed space size (512, 128, 16, 1) :
* What are the reconstruction errors on the test set ? 
* What are the training times ?
* Which model would you advise ? 

# 3 - Deepening The Models

All the models we've created are only composed of 3 layers : 

      input layer => hidden layer => output layer

We can obviously stack more layers :    
* Modify the Convolutional Encoder and the Decoder so that the AE becomes :     
      input layer => hidden layer => hidden layer => hidden layer => output layer


We want the first and last hidden layers to have the same size.

In [None]:
class Encoder(nn.Module):

    def __init__(self,input_channel, output_channel, kernel_size = 3):
        super().__init__()
        self.input_channel = input_channel
        self.output_channel = output_channel
        self.kernel_size = kernel_size
        self.model = nn.Sequential( ConvDown(self.input_channel, self.input_channel*2,self.kernel_size),
                                   ConvDown(self.input_channel*2, self.output_channel,self.kernel_size))
                    
    def forward(self,x):
        return self.model(x)

class Decoder(nn.Module):

    def __init__(self,input_channel, output_channel, kernel_size = 3, output = True):
        super().__init__()
        self.input_channel = input_channel
        self.output_channel = output_channel
        self.kernel_size = kernel_size
        self.output = output
        self.model = nn.Sequential( ConvUp(self.input_channel,self.output_channel*2,self.kernel_size, False),
                                   ConvUp(self.output_channel*2,self.output_channel,self.kernel_size, output))
                    
    def forward(self,x):
        return self.model(x)

class AutoEncoder_Conv(nn.Module):

    def __init__(self,input_size,latent_size, output= True):
        super().__init__()
        self.input_size = input_size
        self.latent_size = latent_size
        self.output = output
        # Forward the `output` flag to the Decoder instead of hard-coding True
        self.model = nn.Sequential(Encoder(self.input_size,self.latent_size),
                                   Decoder(self.latent_size, self.input_size, output=self.output))

    def forward(self,x):
        return self.model(x)

print(AutoEncoder_Conv(1,128))


Retrain your Model for a Compressed Space of 128 and 64. 

* Compare the results
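
When comparing the two compressed space sizes, the number of trainable parameters is a useful figure to report next to the loss. Here is a minimal sketch, assuming a tiny stand-in convolutional AE: `tiny_conv_ae` and `count_parameters` are hypothetical helpers, and the stand-in is built from plain `nn.Conv2d` / `nn.ConvTranspose2d` layers, not from the lab's `ConvDown` / `ConvUp` blocks.

```python
import torch
import torch.nn as nn


def count_parameters(model: nn.Module) -> int:
    """Number of trainable parameters of a model."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


def tiny_conv_ae(latent_channels: int) -> nn.Module:
    """Stand-in AE (assumption): one strided conv down, one transposed conv up."""
    return nn.Sequential(
        nn.Conv2d(1, latent_channels, kernel_size=3, stride=2, padding=1),
        nn.ReLU(),
        nn.ConvTranspose2d(latent_channels, 1, kernel_size=3, stride=2,
                           padding=1, output_padding=1),
    )


batch = torch.rand(4, 1, 28, 28)  # stand-in for a batch of MNIST images
for latent_channels in (128, 64):
    model = tiny_conv_ae(latent_channels)
    with torch.no_grad():
        out = model(batch)
    print(latent_channels, count_parameters(model), tuple(out.shape))
```

The same `count_parameters` call should work on your own `AutoEncoder_Conv` instances, since it only relies on `nn.Module.parameters()`.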

# 4 - Where's that noise ?

The second sprint is over, but we received an email for the third sprint.


---

 🔊 🔊 🔊 **Message from the Senior Data Scientist** 🔊 🔊 🔊

Sup'
I read somewhere that AEs can denoise images. That means that if you add some noise to the input image, the AE will be able to reconstruct the image without the noise. Can you verify that ?

Senior Data Scientist

---


<img src="https://miro.medium.com/max/1400/1*z7SUcHkWp7jT1D_SqvTvgA.png" height=300>

Ok, let's verify that. We received a function that adds Gaussian noise to an input image.

In [None]:
def add_noise(inputs):
    # Additive zero-mean Gaussian noise with a standard deviation of 0.2
    noise = torch.randn_like(inputs) * 0.2
    return inputs + noise

test_image = mnist_test[2][0]
test_image = add_noise(test_image.unsqueeze(0))  # add the Batch dimension
with torch.no_grad():
    denoised_image = model(test_image.to(device))
fig, axarr = plt.subplots(1, 2)
# squeeze() drops the Batch and Channel dimensions, leaving (Height, Width)
axarr[0].imshow(test_image.squeeze())
axarr[1].imshow(denoised_image.cpu().squeeze())



* What do you observe ?
* Does it work with lots of noise ?
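
One way to answer the second question is to sweep the noise level and measure the error against the clean image. This is a minimal sketch under stated assumptions: `add_noise` is re-declared here with a hypothetical `std` parameter, the "image" is random stand-in data, and `denoiser` is an identity placeholder that you would replace with your trained AE.

```python
import torch


def add_noise(inputs: torch.Tensor, std: float = 0.2) -> torch.Tensor:
    """Same idea as the function above, with the noise level exposed."""
    return inputs + torch.randn_like(inputs) * std


def mse(a: torch.Tensor, b: torch.Tensor) -> float:
    """Mean squared error between two tensors of the same shape."""
    return torch.mean((a - b) ** 2).item()


torch.manual_seed(0)
clean = torch.rand(1, 1, 28, 28)  # stand-in for one MNIST image in [0, 1]
denoiser = torch.nn.Identity()    # placeholder: swap in your trained AE

for std in (0.1, 0.2, 0.5, 1.0):
    noisy = add_noise(clean, std)
    with torch.no_grad():
        restored = denoiser(noisy)
    print(f"std={std}: noisy MSE={mse(noisy, clean):.4f}, "
          f"restored MSE={mse(restored, clean):.4f}")
```

With a real AE, denoising works at a given noise level if the restored MSE is lower than the noisy MSE. With the identity placeholder both values are equal by construction.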

# III - Getting Some Colors (Optional)

This is the fourth and final sprint. After that, we can send the model to production. Let's read the Senior Data Scientist's email.



---
 🔊 🔊 🔊 **Message from the Senior Data Scientist** 🔊 🔊 🔊

Sup'

I like colors. We like colors. But does your model work on colors ? I'm asking the Data Engineering Team to send you the API for CIFAR10 so that you can test it. Just reapply the same methodology and send me the results asap.

Kiss kiss XOXO

Senior Data Scientist


---







## 1 - CIFAR10 : Colors and Classes

 🔊 🔊 🔊 **Message from the Data Engineering Team** 🔊 🔊 🔊

Wazzaaaaaaaaa, the API is ready.
Have a good day :D

Data Engineering Team

### Understanding the Data : Data Exploration

Ok, so let's do the same thing as before. Let's use their API.


- What is the size of the train dataset ?
- What are the elements available in one piece of data ? (image,label)
- What is the shape of one piece of data ?
- What is the type of one piece of data ?
- Plot a few elements of the dataset using Matplotlib.
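
The questions above can be answered with a handful of calls. The sketch below runs on a synthetic stand-in dataset (an assumption, so that nothing has to be downloaded) with CIFAR10-like shapes: 3x32x32 images and 10 classes. The same calls apply to `dataset_train`, except that torchvision's `CIFAR10` returns the label as a plain `int` rather than a tensor.

```python
import torch
from torch.utils.data import TensorDataset

# Stand-in data (assumption): 8 random "images" shaped like CIFAR10 samples.
images = torch.rand(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))
dataset = TensorDataset(images, labels)

print("size  :", len(dataset))                 # number of samples
image, label = dataset[0]                      # one piece of data = (image, label)
print("types :", type(image), type(label))
print("shape :", tuple(image.shape), "dtype:", image.dtype)
print("range :", image.min().item(), image.max().item())

# Matplotlib expects (Height, Width, Channel), PyTorch stores (Channel, Height, Width).
image_hwc = image.permute(1, 2, 0)
print("for plotting :", tuple(image_hwc.shape))
```

To plot a color image, pass the permuted tensor to `plt.imshow`, for example `plt.imshow(image_hwc)`.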



In [None]:
transform = transforms.Compose([transforms.ToTensor()])
dataset_train = CIFAR10(root='./data', train=True,download=True, transform=transform)
dataset_test =  CIFAR10(root='./data', train=False,download=True, transform=transform)

# TODO : Do all the Data Exploration Things. 
# Questions : How similar are the data ? Coding Wise ? Data Wise ?



In [None]:
# TODO : Create the DataLoaders
# shuffle=True so that each epoch sees the training samples in a different order
train_loader = DataLoader(dataset_train, batch_size=128, shuffle=True, drop_last=True)
test_loader = DataLoader(dataset_test, batch_size= 32,drop_last = True)

## 2 - Train and Test your MLP Model

In [None]:
# TODO : Create your model
model = AutoEncoder_MLP(32*32*3,1024)
# TODO : Reload your HyperParameters
num_epochs = 20
device = 'cuda' if torch.cuda.is_available() else 'cpu'
net = model.to(device)
criterion = nn.MSELoss()
learning_rate = 1e-3
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)

# TODO : Rewrite your Training Loop
net.train()
for epoch in range(num_epochs):
    running_loss_t = 0.0
    for i, data in enumerate(train_loader, 0):
        # TODO : load the data into two variables
        target, label = data[0].to(device), data[1]
        # TODO : reshape the input image so that it fits the number of neurons of the input layer. Don't forget the Batch Size, the 1st dimension must always be the Batch Size
        image = target.view(target.shape[0], -1)
        optimizer.zero_grad()
        outputs = net(image)
        # Back to (Batch, Channel, Height, Width) to compare with the original image
        outputs = outputs.view(target.shape)
        loss = criterion(outputs, target)
        loss.backward()
        optimizer.step()
        running_loss_t += loss.item()
        if (i + 1) % 50 == 0:
            # Print the loss averaged over the last 50 batches, then reset
            print('training loss is :', running_loss_t / 50)
            running_loss_t = 0.0
# TODO : Rewrite your Testing Loop

import torch
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np

net.eval()
with torch.no_grad():
    running_loss = []
    for i, data in enumerate(test_loader, 0):
        target, label = data[0].to(device), data[1]
        image = target.view(target.shape[0], -1)
        outputs = net(image)
        outputs = outputs.view(target.shape)
        loss = criterion(outputs, target)
        # Append inside the loop so that every batch is counted
        running_loss.append(loss.item())

# Plot the last batch
imshow(torchvision.utils.make_grid(outputs.detach().cpu()),'Pred')
imshow(torchvision.utils.make_grid(data[0]),'GT')
# Mean reconstruction loss per batch on the test set
sum(running_loss) / len(running_loss)

## 3 - Train and Test your Conv Model

In [None]:
# TODO : Create your model
model = AutoEncoder_Conv(3,512)
# TODO : Reload your HyperParameters
num_epochs = 20
device = 'cuda' if torch.cuda.is_available() else 'cpu'
net = model.to(device)
criterion = nn.MSELoss()
learning_rate = 1e-3
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)

# TODO : Rewrite your Training Loop
net.train()
for epoch in range(num_epochs):
    running_loss_t = 0.0
    for i, data in enumerate(train_loader, 0):
        # TODO : load the data into two variables
        # TODO : reshape the input image so that it fits the number of neurons of the input layer. Don't forget the Batch Size, the 1st dimension must always be the Batch Size
        # (no reshape is needed here: the convolutional model takes (Batch, Channel, Height, Width) images)
        image, label = data[0].to(device), data[1]
        optimizer.zero_grad()
        outputs = net(image)
        loss = criterion(outputs, image)
        loss.backward()
        optimizer.step()
        running_loss_t += loss.item()
        if (i + 1) % 50 == 0:
            # Print the loss averaged over the last 50 batches, then reset
            print('training loss is :', running_loss_t / 50)
            running_loss_t = 0.0

# TODO : Rewrite your Testing Loop

import torch
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np

net.eval()
with torch.no_grad():
    running_loss = []
    for i, data in enumerate(test_loader, 0):
        image, label = data[0].to(device), data[1]
        outputs = net(image)
        loss = criterion(outputs, image)
        print(loss.item())
        # Store plain floats rather than tensors
        running_loss.append(loss.item())

# Plot the last batch
imshow(torchvision.utils.make_grid(outputs.detach().cpu()),'Pred')
imshow(torchvision.utils.make_grid(data[0]),'GT')
# Mean reconstruction loss per batch on the test set
sum(running_loss) / len(running_loss)

# 4 - Results

Now analyze your results and write your report.
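
For the report, the MSE values can be hard to compare with figures usually quoted for JPEG. One common option is to convert them to PSNR (peak signal-to-noise ratio, in dB). This is a minimal sketch, assuming pixel values in [0, 1] as produced by `transforms.ToTensor()`; `psnr_from_mse` is a hypothetical helper, not part of the lab code.

```python
import math


def psnr_from_mse(mse: float, max_value: float = 1.0) -> float:
    """PSNR in dB for pixel values in [0, max_value]."""
    if mse == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(max_value ** 2 / mse)


# Example with made-up MSE values, only to show the trend
for mse in (0.1, 0.01, 0.001):
    print(f"MSE={mse}: PSNR={psnr_from_mse(mse):.1f} dB")
```

A higher PSNR means a reconstruction closer to the original: dividing the MSE by 10 adds 10 dB.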

