# Advanced Certification in AIML
## A Program by IIIT-H and TalentSprint


## Learning Objectives





By the end of this experiment, you will be able to:
*  reconstruct images using a convolutional autoencoder.




## Dataset

### Description

The fingerprint dataset contains 320 images, 80 images per sensor, and the image size varies from sensor to sensor. The fingerprints come from 4 different sensors, namely:

* Low-cost Optical Sensor
* Low-cost Capacitive Sensor
* Optical Sensor 
* Synthetic Generator





### Autoencoder

An autoencoder is made up of two components: the encoder network and the decoder network. The task of the encoder is to generate a lower-dimensional embedding Z, which is referred to as the latent vector or latent representation. The decoder then reconstructs X' from Z, where X' is an approximation of the input X.


![alt text](https://cdn.talentsprint.com/aiml/Experiment_related_data/IMAGES/6.png)
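The encoder and decoder stages described above can be written compactly as follows. The symbols f_θ (encoder), g_φ (decoder) and N (number of pixels) are notation introduced here for illustration, and mean squared error is shown as one common choice of reconstruction loss.

```latex
Z = f_\theta(X), \qquad X' = g_\phi(Z), \qquad
\mathcal{L}(X, X') = \frac{1}{N} \sum_{i=1}^{N} \left( X_i - X'_i \right)^2
```

Training adjusts θ and φ so that the loss becomes small, which pushes X' toward X.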






In [None]:
! wget https://cdn.iiith.talentsprint.com/aiml/Experiment_related_data/Fingerprints.zip
! unzip /content/Fingerprints.zip

### Importing Required Packages

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from glob import glob
from tifffile import imread
from skimage.transform import resize
import torch
import torch.nn as nn      
import torch.nn.functional as F
import torch.optim as optim


### Load the data

#### About glob.glob:

The glob library provides functions for traversing the file system and returning the paths that match a given glob pattern. glob.glob returns all matches as a list, while glob.iglob returns an iterator over the same matches.

**Note:** Refer to the [glob documentation](https://docs.python.org/3/library/glob.html)
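As a minimal sketch of how such a pattern behaves, the snippet below builds a temporary directory tree and matches it with a `DB*/*` pattern. The directory and file names are made up for illustration and are not the contents of the fingerprint archive.

```python
import os
import tempfile
from glob import glob, iglob

# Build a throwaway directory tree with a DB*/file layout (hypothetical names)
root = tempfile.mkdtemp()
for db in ("DB1", "DB2"):
    os.makedirs(os.path.join(root, db))
    for name in ("a.tif", "b.tif"):
        open(os.path.join(root, db, name), "w").close()

# A directory that the pattern should NOT match
os.makedirs(os.path.join(root, "other"))
open(os.path.join(root, "other", "c.tif"), "w").close()

pattern = os.path.join(root, "DB*", "*")

# glob returns a list; sorting makes the order deterministic
matches = sorted(glob(pattern))

# iglob yields the same paths lazily, one at a time
lazy_matches = sorted(iglob(pattern))

print(len(matches))             # 4
print(matches == lazy_matches)  # True
```

The order in which glob returns paths is not guaranteed, so sorting the result is one way to keep runs reproducible.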

In [None]:
data = glob('/content/fingerprints/DB*/*')
images = []
for path in data:

    # Read the image using imread
    img = imread(path)

    # Resize every image to 224 x 224, since the images come in different sizes
    img = resize(img, (224, 224))

    # Append the resized image to the list
    images.append(img)

# Convert the list of images into a float32 array
images_arr = np.asarray(images)
images_arr = images_arr.astype('float32')
# The array has shape (320, 224, 224): 320 samples, each a 224 x 224 matrix
print("Dataset:", images_arr.shape)

### Visualizing the Images 

In [None]:
# Display the first 6 images in the dataset
for i in range(6):
  plt.subplot(3,3, i+1)
  plt.axis('off')
  plt.imshow(images_arr[i], cmap="gray")

### Data Preparation

The images are grayscale with a dimension of 224 x 224, so the number of channels is 1. Reshape the input array into four dimensions, 320 x 1 x 224 x 224 (nsamples, nchannels, height, width), to feed it into the neural network.

In [None]:
images_arr = images_arr.reshape(-1,1, 224,224)
print(images_arr.shape)
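The effect of this reshape can be checked on a small stand-in array. The sketch below uses 5 images of 4 x 4 pixels, sizes chosen only for illustration, and confirms that adding the channel axis leaves the pixel values untouched.

```python
import numpy as np

# Small stand-in for the dataset: 5 grayscale images of 4 x 4 pixels (illustrative sizes)
batch = np.arange(5 * 4 * 4, dtype="float32").reshape(5, 4, 4)

# Insert a channel axis of length 1:
# (nsamples, height, width) -> (nsamples, nchannels, height, width)
batch_nchw = batch.reshape(-1, 1, 4, 4)

print(batch_nchw.shape)  # (5, 1, 4, 4)

# The pixel values are unchanged; only the shape differs
print(np.array_equal(batch_nchw[:, 0], batch))  # True
```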

### Split the data into a training set and a validation set

In [None]:
from sklearn.model_selection import train_test_split
# The images act as both the input and the ground truth, playing the role that labels play in a classification task
train_X,valid_X,train_ground,valid_ground = train_test_split(images_arr,images_arr,test_size=0.2,random_state=13)
train_X.shape,valid_X.shape,train_ground.shape,valid_ground.shape

In [None]:
# Convert the NumPy arrays to FloatTensors and wrap them in TensorDatasets
train_dataset = torch.utils.data.TensorDataset(torch.FloatTensor(train_X),torch.FloatTensor(train_ground))
test_dataset = torch.utils.data.TensorDataset(torch.FloatTensor(valid_X),torch.FloatTensor(valid_ground))
 
# Create data loaders for the train dataset and the test dataset
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=64)
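The sizes that result from this split and batching can be worked out by hand. The sketch below assumes that scikit-learn rounds the validation count up and that the data loader keeps a final partial batch (its default), and it takes the 320-image count from the dataset description.

```python
import math

n_samples = 320   # images in the dataset, per the description
test_size = 0.2   # fraction passed to train_test_split
batch_size = 64   # batch size passed to DataLoader

# Assumption: the validation count is rounded up and the rest goes to training
n_valid = math.ceil(test_size * n_samples)
n_train = n_samples - n_valid

# A final partial batch is kept by default, hence the ceil
train_batches = math.ceil(n_train / batch_size)
valid_batches = math.ceil(n_valid / batch_size)

print(n_train, n_valid)              # 256 64
print(train_batches, valid_batches)  # 4 1
```

With these numbers, each training epoch runs 4 weight updates and the whole validation set fits in a single batch.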

**Initializing CUDA**

CUDA is used as an interface between our code and the GPU.

Normally, the code runs on the CPU. To run it on the GPU, we need CUDA. Check if CUDA is available:

In [None]:
# To test whether GPU instance is present in the system or not.
use_cuda = torch.cuda.is_available()
print('Using PyTorch version:', torch.__version__, 'CUDA:', use_cuda)

If it's False, the code will run on the CPU. If it's True, the code can run on the GPU.

Let us initialize some GPU-related variables:

In [None]:
device  =  torch.device("cuda" if torch.cuda.is_available() else "cpu")

### Defining the Convolutional Autoencoder Architecture

Define the Convolutional Autoencoder as a class where the encoding network is made up of two convolutional layers that compress the data, and the decoding network is made up of two transposed convolutional layers. The output of each layer in the encoding and decoding networks is passed through the ReLU activation function in the forward function.




The autoencoder is divided into two parts:

**Encoder**

The first layer will have 32 output channels with filter size 3 x 3

The second layer will have 64 output channels with filter size 3 x 3, followed by a downsampling (max-pooling) layer



**Decoder**

The first layer will be a transposed convolution with 32 output channels and filter size 2 x 2

The second layer will be a transposed convolution with 1 output channel and filter size 2 x 2




In [None]:
# Define the Convolutional Autoencoder
class ConvAutoencoder(nn.Module):
    def __init__(self):
        super(ConvAutoencoder, self).__init__()
       
        # Encoder
        # Defining the convolution layer with input_channels = 1, output_channels = 32, kernel_size = 3, padding =1
        self.conv1 = nn.Conv2d(1, 32, 3, padding=1)  
        # Defining the convolution layer with input_channels = 32, output_channels = 64, kernel_size = 3, padding =1
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        # Max pooling layer with filter size 2x2
        self.pool = nn.MaxPool2d(2, 2)
      
        
        # Decoder 
        # Defining the transposed convolution layer with input_channels = 64, output_channels = 32, kernel_size = 2, stride = 2
        self.t_conv1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        # Defining the transposed convolution layer with input_channels = 32, output_channels = 1, kernel_size = 2, stride = 2
        self.t_conv2 = nn.ConvTranspose2d(32, 1, 2, stride=2)
      

    def forward(self, x):
        # Encoder: convolution layers with ReLU activation, then max pooling (224 -> 112)
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = self.pool(x)
        # Decoder: transposed convolution layers with ReLU activation
        x = F.relu(self.t_conv1(x))  # 112 -> 224
        # Pooling again (224 -> 112) so that the last layer returns a 224 x 224 output
        x = self.pool(x)
        x = F.relu(self.t_conv2(x))  # 112 -> 224
               
        return x
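To see why the output of the forward pass has the same height and width as the input, the spatial size can be traced layer by layer. The two helper functions below are written for this illustration. They use the standard output-size formulas for convolution, pooling and transposed convolution, assuming dilation 1 and no output padding.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output size of a convolution or pooling layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride=1, padding=0):
    """Output size of a transposed convolution along one spatial axis
    (assuming dilation 1 and no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


size = 224
trace = [size]

size = conv_out(size, kernel=3, padding=1)           # conv1
trace.append(size)
size = conv_out(size, kernel=3, padding=1)           # conv2
trace.append(size)
size = conv_out(size, kernel=2, stride=2)            # pool
trace.append(size)
size = conv_transpose_out(size, kernel=2, stride=2)  # t_conv1
trace.append(size)
size = conv_out(size, kernel=2, stride=2)            # pool
trace.append(size)
size = conv_transpose_out(size, kernel=2, stride=2)  # t_conv2
trace.append(size)

print(trace)  # [224, 224, 224, 112, 224, 112, 224]
```

The second pooling step is what keeps the final output at 224 x 224. Without it, the two stride-2 transposed convolutions would upsample 112 to 448, and the output would no longer match the target image.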

#### Creating an instance of the network

In [None]:

# Instantiate the model
model = ConvAutoencoder().to(device)
print(model)

#### Defining the loss function and optimizer

In [None]:
# Initialization of Mean Square Error
loss_func = nn.MSELoss()

# Initialization of Optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)
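For intuition, mean squared error averages the squared difference between each reconstructed value and its target. The plain-Python sketch below uses a helper named `mse` and made-up pixel values, both introduced only for illustration, to show the arithmetic on a four-pixel example.

```python
def mse(prediction, target):
    """Mean of the squared element-wise differences between two equal-length sequences."""
    assert len(prediction) == len(target)
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)


target = [0.0, 0.5, 1.0, 0.5]          # made-up pixel values
reconstruction = [0.0, 0.0, 1.0, 1.0]  # made-up reconstruction

print(mse(reconstruction, target))  # 0.125
print(mse(target, target))          # 0.0
```

A perfect reconstruction gives a loss of 0, and larger pixel errors are penalized quadratically.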

### Training the Model 

Train the autoencoder on the training data and report the loss on the training dataset.

In [None]:
EPOCH = 10
for epoch in range(EPOCH):
    for x, y in train_loader:
        t_x = x.to(device)
        t_y = y.to(device)
        # Zero the parameter gradients
        optimizer.zero_grad()

        # Passing the data to the model (Forward Pass)
        decoded1 = model(t_x)

        # Calculating mean square error loss
        loss = loss_func(decoded1, t_y) 
        train_loss = loss.item()

        # Performing backward pass (Backpropagation)
        loss.backward() 
      
        # optimizer.step() updates the weights using the computed gradients
        optimizer.step()
    # Print the loss of the last batch of this epoch
    print('Epoch: ', epoch, '| train loss: %.4f' % train_loss)
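The value printed by the loop is the loss of the last batch in each epoch. If an average over the whole epoch is preferred, the per-batch losses can be combined, weighted by batch size. The sketch below uses a hypothetical helper `epoch_mean_loss` and made-up loss values to show the calculation.

```python
def epoch_mean_loss(batch_losses, batch_sizes):
    """Average per-batch mean losses, weighting each batch by its number of samples."""
    total = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)


# Hypothetical per-batch losses for one epoch of four batches
batch_losses = [0.5, 0.25, 0.25, 0.125]
batch_sizes = [64, 64, 64, 64]

print(batch_losses[-1])                            # 0.125
print(epoch_mean_loss(batch_losses, batch_sizes))  # 0.28125
```

In the training loop, the same quantities could be collected with `loss.item()` and `x.size(0)` for each batch.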

### Evaluate the Model

Apply the trained autoencoder to the test data and compute the loss on the test dataset.

In [None]:
# Keeping the network in evaluation mode
model.eval()
# Gradients are not needed during evaluation
with torch.no_grad():
  for x, y in test_loader:
    # Move the images and targets to the selected device (GPU if available)
    eval_x = x.to(device)
    eval_y = y.to(device)
    # Passing the data to the model (Forward Pass)
    decoded2 = model(eval_x)
    # Calculating mean square error loss
    loss = loss_func(decoded2, eval_y)
# Loss of the last test batch (the 64 test images fit in a single batch of 64)
print(loss.item())

### Visualizing the reconstructed images of the test data

In [None]:
f, a = plt.subplots(2, 5, figsize=(8,6))
# Top row: original test images
for i in range(5):
  a[0,i].imshow(eval_x[i].detach().cpu().numpy().reshape(224,224), cmap='gray')
  a[0,i].set_xticks(())
  a[0,i].set_yticks(())
a[0,0].title.set_text('Test Images')

# Bottom row: reconstructions produced by the model
for i in range(5):
  a[1,i].imshow(decoded2[i].detach().cpu().numpy().reshape(224,224), cmap='gray')
  a[1,i].set_xticks(())
  a[1,i].set_yticks(())
a[1,0].title.set_text('Reconstructed Test Images')

plt.show()

From the above figures, you can observe that the model did a good job of reconstructing the test images. At least visually, the test images and the reconstructed images look very similar.