# Project 4: Image Classification and Feature Extraction
---

## Assignments

Complete your report by working through the following list of assignments.

**Introduction.** A short summary of the goals of the project and of the sections composing the report.

**Section 1. Data loading and preparation**

Download, decompress, analyse:

https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz

* 1a. Download the dataset;
* 1b. Load the dataset and check its characteristics;
* 1c. Show the first 4 images for each category;
* 1d. Data preparation.

**Section 2. The AlexNet model** 
* 2a. Download the net, not pre-trained.

**Section 3. Training the Net**
* 3a. Design the architecture;
* 3b. Define the batch size and load the training and test set;
* 3c. Define a training function;
* 3d. Run the training;
* 3e. Save the trained network.

**Section 4. Test and performance evaluations**
* 4a. Define and execute a test function;
* 4b. Performance curves.

**Section 5. Extracting features**
* 5a. Load the saved network and set it in evaluation mode;
* 5b. Visualize the learned kernels of the first convolutional layer;
* 5c. Visualize the feature maps.

**Results, Observations and Conclusions**
Write your own notes, observations and conclusions about the results of your work.

**Full Code**
Report the complete code of the project.

## Introduction

This report is divided into ...TODO

The aim of this project is ...TODO

## 1. Data loading and preparation

Import the libraries.

In [None]:
!pip install torch-inspect

In [None]:
import numpy as np
import matplotlib.pyplot as plt

import torch
import torch.nn as nn

import torchvision
import torchvision.transforms as transforms

# 1a. Download the dataset
from torchvision.datasets.utils import download_and_extract_archive

# 2a. (show AlexNet structure)
import torch_inspect as ti 

Define the device to use during the computation.

In [None]:
## TODO: if available use GPU, else use CPU
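A common way to pick the device is to ask PyTorch whether CUDA is available and fall back to the CPU otherwise. The following is a minimal sketch under that assumption; the tensor `x` is only a hypothetical placeholder to show how data is moved to the chosen device.

```python
import torch

# use the GPU when PyTorch can see one, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

# hypothetical tensor, moved to the selected device
x = torch.zeros(2, 3).to(device)
print(x.device)
```

Models are moved in the same way, with `model.to(device)`.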


### 1a. Download the dataset
Description of the dataset: [tf_flowers](https://www.tensorflow.org/datasets/catalog/tf_flowers).

In [None]:
# download flower_photos
url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz" #URL to download file from
filename = "flower_photos.tgz" #passed below as the 3rd positional argument, which is 'extract_root'
root = "~/tmp/" #directory to place downloaded file in
download_and_extract_archive(url, root, filename) #the archive is extracted into './flower_photos.tgz/'

### 1b. Load the dataset and check its characteristics

In [None]:
transform = transforms.Compose([
    transforms.ToTensor()
])

## TODO: load the dataset into the 'data' variable
# use ImageFolder -> https://pytorch.org/vision/stable/datasets.html#imagefolder
# hints: 
#   * you only need 'root' and 'transform';
#   * use 'flower_photos.tgz/flower_photos/' as root.
data = #...

## TODO: check the dataset size and get the names of the classes
# hint: check the attributes of the ImageFolder class to get the classes' names
# https://pytorch.org/vision/stable/_modules/torchvision/datasets/folder.html#ImageFolder


#### Summarize the number of images for each category and display the summary in a table

In [None]:
# calculate the number of samples in each class:
#   * torch.unique returns the unique elements of the input tensor;
#   * the .targets attribute gets the class_index value for each image in the dataset.
_, imgs_per_class = torch.unique(torch.tensor(data.targets), return_counts=True)

## TODO: print the number of elements in each class
# hint: print 'imgs_per_class' to inspect it
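To see what `torch.unique` returns with `return_counts=True`, it can help to try it on a tiny tensor first. The sketch below uses made-up class indices (not the flower dataset) and prints one line per class.

```python
import torch

# hypothetical class indices for a tiny 3-class dataset
targets = torch.tensor([0, 0, 0, 1, 1, 2, 2, 2, 2])

# sorted unique values, plus the number of occurrences of each value
classes, counts = torch.unique(targets, return_counts=True)

for class_index, count in zip(classes.tolist(), counts.tolist()):
    print(f"class {class_index}: {count} images")
```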


#### Print the resolution of the first image for each category

Note that the samples are arranged per class, meaning that the first N images belong to class 0, the following M images belong to class 1, and so on.

In [None]:
## TODO: print the resolution of the first image for each category

# Note: there are many ways to do this, here's a suggestion:
#   * get the number of the classes;
#   * create a dictionary that maps the labels (integers) to the name of the class;
#   * create a list with the indices of the first image of each class. You already have
#     the number of elements in each class, you could incrementally add them to get the
#     indices using 'accumulate' from 'itertools' ('from itertools import accumulate');
#   * now you have all the data you need and can print the resolution of the 
#     first image for each category.
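The `accumulate` step in the suggestion above can be tried on toy numbers. This sketch assumes a hypothetical dataset with three classes of 3, 2 and 4 images; the names `ends` and `first_indices` are illustrative only.

```python
from itertools import accumulate

# hypothetical number of images per class, in class order
imgs_per_class = [3, 2, 4]

# running totals give the index one past the last image of each class
ends = list(accumulate(imgs_per_class))

# the first class starts at 0; every other class starts where the previous one ends
first_indices = [0] + ends[:-1]

print(first_indices)
```

Because the samples are sorted by class, `data[first_indices[k]]` would then be the first sample of class `k`.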


### 1c. Show the first 4 images for each category
Print them in a grid with 4 columns (samples) and 5 rows (classes).

In [None]:
## TODO: show the first 4 images for each category


### 1d. Data preparation

#### Load and normalize the data

Resize the images to 64x64 pixels and normalize the pixel values.

To normalize ([documentation](https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.Normalize)) the tensor images, calculate the mean and standard deviation for the 3 colour channels of the images over the entire dataset.

In [None]:
IMGS_DIM = 64

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Resize((IMGS_DIM, IMGS_DIM)) #resize all images to IMGS_DIMxIMGS_DIM pixels
])

## TODO: load the dataset into the 'data' variable as before
data = #...

# stack all the images by iterating over the dataset and extracting the images
imgs = torch.stack([img for img, _ in data], dim=3)
print(imgs.shape)

# keep 3 channels and merge all the remaining dimensions into one 
temp = imgs.view(3, -1)
print(temp.shape)

# calculate the mean over the elements of each channel
mean = temp.mean(dim=1)

# calculate the standard deviation over the elements of each channel
std = temp.std(dim=1)

print(mean, std)
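The stack-and-flatten approach used above can be checked on random data. The sketch below replaces the dataset with ten hypothetical 8x8 RGB tensors and compares the result with a direct reduction over the non-channel dimensions.

```python
import torch

torch.manual_seed(0)

# hypothetical stand-in for the dataset: 10 random RGB images of 8x8 pixels
fake_images = [torch.rand(3, 8, 8) for _ in range(10)]

# stack along a new last dimension -> shape (3, 8, 8, 10)
imgs = torch.stack(fake_images, dim=3)

# keep the channel dimension and flatten the rest -> shape (3, 640)
flat = imgs.view(3, -1)
mean = flat.mean(dim=1)
std = flat.std(dim=1)

# cross-check: reduce over height, width and image index directly
mean_check = imgs.mean(dim=(1, 2, 3))

print(mean, std)
```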

Use these values to normalize our data.

In [None]:
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=mean, std=std), #normalize the pixel values
    transforms.Resize((IMGS_DIM, IMGS_DIM)) #resize all images to IMGS_DIMxIMGS_DIM pixels
])

## TODO: load the dataset into the 'data' variable as before
data = #...
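`transforms.Normalize` computes `output = (input - mean) / std` separately for each channel. The sketch below reproduces that arithmetic in plain PyTorch on a hypothetical batch of random images, to show that the normalized channels end up with mean close to 0 and standard deviation close to 1.

```python
import torch

torch.manual_seed(0)

# hypothetical batch of 10 RGB images, shape (N, C, H, W)
imgs = torch.rand(10, 3, 8, 8)

# per-channel statistics over the whole batch
mean = imgs.mean(dim=(0, 2, 3))
std = imgs.std(dim=(0, 2, 3))

# same per-channel arithmetic as Normalize, written with broadcasting
normalized = (imgs - mean[None, :, None, None]) / std[None, :, None, None]

print(normalized.mean(dim=(0, 2, 3)), normalized.std(dim=(0, 2, 3)))
```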

#### Constrain to 70 the number of images per category. Check this by printing a table

In [None]:
## TODO: get all the first 70 indices for each class
# put them in a list called 'indices'; use the list with the indices of 
# the first image of each class you created in 1b.


# create a subset of the original dataset with just 70 samples per class
data2 = torch.utils.data.Subset(data, indices)

## TODO: 
#   * check if the size of the dataset is 70x5=350
#   * check if each class contains 70 elements. Hint: you could use 'Counter'
#     ('from collections import Counter')
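The index selection and the `Counter` check can be rehearsed on toy numbers before applying them to the real dataset. This sketch assumes a hypothetical sorted dataset with three classes and keeps 3 samples per class instead of 70; all names and values are illustrative.

```python
from collections import Counter

# hypothetical sorted dataset: 5 images of class 0, 4 of class 1, 6 of class 2
imgs_per_class = [5, 4, 6]
first_indices = [0, 5, 9]
targets = [label for label, n in enumerate(imgs_per_class) for _ in range(n)]

# keep the first 'per_class' indices of every class
per_class = 3
indices = [start + offset for start in first_indices for offset in range(per_class)]

# count how many selected samples fall in each class
counts = Counter(targets[i] for i in indices)

print(indices)
print(counts)
```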


#### Define the Training and Test sets (300/50) and randomize the images

Define the size of the training set and test set, and construct them randomizing the images using `random_split`.

In [None]:
train_size = 300
test_size = 50

train_set, test_set = torch.utils.data.random_split(data2, [train_size, test_size])

## TODO: check training set and test set size
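`random_split` shuffles the indices before splitting, so every run produces a different partition unless a seeded generator is passed. The sketch below shows this on a hypothetical dataset of 350 integers; fixing the seed is an optional choice for reproducibility, not something the assignment requires.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# hypothetical dataset with 350 samples
dataset = TensorDataset(torch.arange(350))

# a seeded generator makes the random partition repeatable
generator = torch.Generator().manual_seed(0)
train_set, test_set = random_split(dataset, [300, 50], generator=generator)

print(len(train_set), len(test_set))
```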


## 2. The AlexNet model

### 2a. Download the net, not pre-trained
Go to the [AlexNet page](https://pytorch.org/hub/pytorch_vision_alexnet/) on the PyTorch website and load the model.

In [None]:
## TODO: download the AlexNet network, not pre-trained


#### Show in your report a table with its structure
Visualize the network structure. Use the `torch-inspect` library ([PyPI](https://pypi.org/project/torch-inspect/), [GitHub](https://github.com/jettify/pytorch-inspect)) to visualize the network structure in more detail.

In [None]:
## TODO: visualize the network structure
# use both 'model.eval()' (a PyTorch method that returns the model, so its
# structure is displayed) and 'summary()' from the 'torch-inspect' library


## 3. Training the Net

### 3a. Design the architecture
We modify some of the AlexNet layers to better suit our dataset. Consult the [documentation](https://pytorch.org/vision/stable/_modules/torchvision/models/alexnet.html).

In [None]:
## TODO: modify the 2 layers following these specifications:

# first convolutional layer
#   * set kernel size to 7
#   * set stride to 2
#   * set padding to 3 
model.features[0] = nn.Conv2d(#...)

# last linear layer (classifier)
#   * set output shape to 5 (there are 5 categories) 
#...
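The effect of the new kernel size, stride and padding on a 64x64 input can be estimated with the standard convolution output-size formula, floor((n + 2p - k) / s) + 1. The helper `conv_output_size` below is hypothetical (it is not a PyTorch function); the values 11/4/2 are those of the first convolution in the torchvision AlexNet source linked above.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution (hypothetical helper, not a PyTorch API)."""
    return (size + 2 * padding - kernel_size) // stride + 1

# first convolution as modified in this section: kernel 7, stride 2, padding 3
modified = conv_output_size(64, kernel_size=7, stride=2, padding=3)

# first convolution of the stock torchvision AlexNet: kernel 11, stride 4, padding 2
stock = conv_output_size(64, kernel_size=11, stride=4, padding=2)

print(modified, stock)  # 32 15
```

With the modified layer, the first feature maps keep half of the input resolution instead of roughly a quarter, which matters for images as small as 64x64.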


Visualize the modified architecture.

In [None]:
## TODO: visualize the modified architecture


### 3b. Define the batch size and load the training and test set
We define a batch size of 4, and load the training set and test set with `DataLoader`.

In [None]:
## TODO:
#   * define the 'BATCH_SIZE';
#   * load the training set and the test set in 'train_loader'
#     and 'test_loader'.
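What a `DataLoader` yields is easiest to see on a small example. The sketch below wraps 10 hypothetical random images in a `TensorDataset` and iterates over them in batches of 4; note that the last batch is smaller when the dataset size is not a multiple of the batch size.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# hypothetical data: 10 RGB images of 64x64 pixels with labels in [0, 5)
images = torch.rand(10, 3, 64, 64)
labels = torch.randint(0, 5, (10,))

loader = DataLoader(TensorDataset(images, labels), batch_size=4, shuffle=True)

# collect the shape of every batch of images
batch_shapes = [tuple(batch_images.shape) for batch_images, _ in loader]
print(len(loader), batch_shapes)
```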


### 3c. Define a training function
Define a training function that loops over the epochs. First, specify the number of epochs (80) and the learning rate (0.01). Then, use Stochastic Gradient Descent (SGD) as the optimization algorithm and `CrossEntropyLoss` as the loss function used to estimate performance.


In [None]:
## TODO:
#   * define 'NUM_EPOCHS' and 'LEARNING_RATE';
#   * define 'optimizer' (SGD) and 'error' (CrossEntropyLoss);
#   * complete the 'train' function. This function skeleton misses the
#     code to calculate and register the training accuracy. You can add
#     it to plot the accuracy curve later.

#...

# the training function
def train(model, train_loader, NUM_EPOCHS):
    train_loss = [] #list to memorize the loss values

    for epoch in range(NUM_EPOCHS): #loop over the epochs  
        running_loss = 0.0
        
        for images, labels in train_loader:
            # get the inputs and load them to the computation device 
            #...

            # set the gradients to zero
            #...

            # feedforward pass
            #...

            # backpropagation
            #...

            # update the parameters
            #...

            running_loss += loss.item()

        # calculate the loss value and add it to the list    
        loss = running_loss / len(train_loader)
        train_loss.append(loss)
        # keep track of the training process
        print('Epoch {} of {}, Train Loss: {:.4f}'.format(epoch+1, NUM_EPOCHS, loss))

    return train_loss
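The accuracy that the skeleton leaves out is usually derived from the network outputs by taking the index of the largest score per sample. The sketch below shows one way to do this on hypothetical scores for 4 samples and 3 classes; inside a training loop the same counts would be accumulated over all batches of an epoch.

```python
import torch

# hypothetical network outputs (one row per sample, one column per class)
logits = torch.tensor([[2.0, 0.1, 0.3],
                       [0.2, 1.5, 0.1],
                       [0.3, 0.2, 0.1],
                       [0.1, 0.2, 3.0]])
labels = torch.tensor([0, 1, 2, 2])

# the predicted class is the index of the highest score in each row
predictions = logits.argmax(dim=1)

# count the matches and convert them to a percentage
correct = (predictions == labels).sum().item()
accuracy = 100.0 * correct / labels.size(0)

print(predictions.tolist(), accuracy)
```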

### 3d. Run the training

In [None]:
# set the model in train mode
model.train()

## TODO:
#   * load the model to the computation device;
#   * execute the training function.


### 3e. Save the trained network

In [None]:
# save the model
print("Saving the model...")

try:
    torch.save(model.state_dict(), "AlexNet_flowers_saved_network.pth")
    print("Model saved!")
except Exception as exc: #catch regular errors only and report the reason
    print("Could not save the model:", exc)

Saving the model...
Model saved!


## 4. Test and performance evaluations

In [None]:
# set the model in evaluation mode
model.eval()

### 4a. Define and execute a test function 
Define a test function that calculates the accuracy of the trained network on the test set. While doing so, display the first 2 batches of test images, check whether each class prediction is correct, and print the predicted class next to the correct one.

In [None]:
## TODO: complete the 'test' function
# hint: it's similar to the testing part in the training loop in Project 1

def test(model, test_loader):
    predictions_list = [] 
    total = 0
    correct = 0
    batch = 1 #batch counter
    
    for images, labels in test_loader:
        # get the inputs and load them to the computation device 
        #...

        # prediction on the trained network
        outputs = #... 

        # get the predicted class
        predictions = #... 
        # append the predicted class to 'predictions_list'
        #...
        correct += (predictions == labels).sum() #increase the correct predictions counter
                                                 #when the prediction matches the correct label
        total += len(labels) 

        if batch <= 2: #print the images only for the first 2 batches
            figure = plt.figure(figsize=(15, 20))
            cols, rows = BATCH_SIZE, 1
            for j in range(cols):
                # visualize the images of the batch
                figure.add_subplot(rows, cols, j+1)
                if labels[j] == predictions_list[batch-1][j]: #for each image, check if
                    pred = "CORRECT"                          #it was correctly predicted
                else:
                    pred = "WRONG"
                plt.title("Correct class: {}\nPredicted class: {}\nPrediction: {}".format(labels_dict[labels[j].item()],
                                                                                          labels_dict[predictions_list[batch-1][j].item()],
                                                                                          pred))
                # move the image to the CPU and convert it to NumPy before plotting
                plt.imshow(np.clip(images[j].cpu().permute(1, 2, 0).numpy(), 0., 1.))
        batch += 1

    # calculate the accuracy
    accuracy = #...

    return accuracy

#### Execute the test function and print the accuracy value

In [None]:
## TODO: execute the test function and print the 
# accuracy value with 2 decimal places


### 4b. Performance curves
Plot the loss and accuracy curves.

In [None]:
## TODO:
#   * plot the loss curve;
#   * plot the accuracy curve (remember you have to modify
#     and add the code to calculate it in the 'train' function
#     if you haven't already done so).

## 5. Extracting Features

### 5a. Load the saved network and set it to evaluation mode

Read about saving and loading models in the [documentation](https://pytorch.org/tutorials/beginner/saving_loading_models.html).

It's more convenient to save the model using `state_dict` (see section **3e. Save the trained network**):
```
torch.save(model.state_dict(), PATH)
```

Steps to load a model, especially if you open this notebook when it is not connected to a runtime (that is, no code has been executed, so no variables are defined):
* upload to Colab the `AlexNet_flowers_saved_network.pth` file using the file browser on the left side panel (select `Files` tab > click on `Upload to session storage`);
* run all the code blocks BEFORE the training. This will re-define the model structure, which is needed to load the model;
* use `model.load_state_dict(torch.load(PATH))` to load the model (remember to read the [documentation](https://pytorch.org/tutorials/beginner/saving_loading_models.html)).
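The save and load steps above can be rehearsed on a tiny model before doing it with the trained network. The sketch below uses a hypothetical `nn.Linear` layer and a temporary file instead of AlexNet and the `.pth` file from section 3e; `map_location="cpu"` is an optional choice that lets weights saved on a GPU be loaded on a CPU-only runtime.

```python
import os
import tempfile

import torch
import torch.nn as nn

# hypothetical stand-in for the trained network
model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_model.pth")

    # save only the parameters
    torch.save(model.state_dict(), path)

    # the architecture must be re-created before the parameters can be loaded
    restored = nn.Linear(4, 2)
    restored.load_state_dict(torch.load(path, map_location="cpu"))

# switch to evaluation mode before inference
restored.eval()
print(torch.equal(model.weight, restored.weight))
```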


In [None]:
## TODO: load the model and set it to evaluation mode (model.eval()).
# IMPORTANT: read the instructions above this code cell.


### 5b. Visualize the learned kernels of the first convolutional layer

#### Extract the convolutional layers from the model

Create a list called `layers_list` to save the desired layers. Add to this list the first 2 convolutional layers. 

Then, save into a variable (`first_conv_layer_filters`) the weights of the filters of the first convolutional layer.

In [None]:
# get all the model children as list
model_children = list(model.children())

layers_list = [] # list to save the layers in

## TODO: add to the list the desired layers 
# hint: print model_children to examine it, and extract the layers from here
layers_list.extend([
                    #...
                  ]) 

# weights of the first convolutional layer
first_conv_layer_filters = model_children[0][0].weight

Print the layers in `layers_list` to check them.

In [None]:
## TODO: print the layers in layers_list


#### Visualize the first convolutional layer filters

Visualize 16 of the first convolutional layer filters in a 4x4 grid.

In [None]:
# visualize 16 of the first convolutional layer kernels
plt.figure(figsize=(8, 8)) #set the width and height of the figure
for i, kernel in enumerate(first_conv_layer_filters): #loop over the kernels in the first conv layer
    if i < 16: #we want to visualize only 16 kernels
        # plot the first input channel of the kernel
        plt.subplot(4, 4, i+1) 
        plt.imshow(kernel[0, :, :].detach().cpu(), cmap='gray')
        plt.axis('off')
plt.show()

### 5c. Visualize the feature maps

#### Get an image from the test set
Get a random image from the test set and visualize it.

In [None]:
## TODO: 
# * get an image and its label from the test set
# * print the image size (it should be [1, 3, 64, 64])
# * visualize the image

# the image can be randomly selected, but it's better if the flower is
# clearly visible and occupies a lot of space. So if you select an 
# image at random, run the code until a good image is selected.
#
# Name the image as 'img' (we will use it later)

#### Pass the input image through each convolutional layer
Pass the image through the layers in `layers_list`.

In [None]:
# TODO: pass the image through the convolutional layers
pass_first_conv = layers_list[0](img) #image passed through the first conv layer
pass_second_conv = #... #output of first conv layer passed
                        #through the second conv layer

outputs = [pass_first_conv, pass_second_conv]
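The shapes to expect from this step can be previewed with freshly initialized layers. The sketch below is only an illustration: `conv1` mirrors the modified first layer from section 3a, `conv2` is assumed to match the second convolution of torchvision's AlexNet (64 to 192 channels, kernel 5, padding 2), and the random `img` replaces a real test image. As in the cell above, the first output is fed directly into the second convolution.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# untrained stand-ins for the first two convolutional layers
conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3)
conv2 = nn.Conv2d(64, 192, kernel_size=5, padding=2)

# hypothetical input: one RGB image of 64x64 pixels
img = torch.rand(1, 3, 64, 64)

# no gradients are needed for visualization
with torch.no_grad():
    out1 = conv1(img)
    out2 = conv2(out1)

# first 16 feature maps of the first layer, ready to be plotted one by one
feature_maps = out1[0, :16]

print(out1.shape, out2.shape, feature_maps.shape)
```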

#### Visualize the feature maps from the 1st and 2nd convolutional layer

Visualize 16 feature maps from the first and the second convolutional layer. Note that each of the layers has more than 16 feature maps, but we want to show only 16 to visualize them better. 

In [None]:
## TODO: visualize 16 feature maps (filters) from the 1st convolutional layer 
#        and 16 feature maps from the 2nd convolutional layer, each of them in 
#        a 4x4 grid.
# hints:
# * The code to get the 64 feature maps of the 1st conv layer is the following:
#       feature_maps = outputs[0][0, :, :, :]
#       feature_maps = feature_maps.data
#   (print its size to inspect it). Remember you have to visualize the first 16 
#   out of 64. 
# * To tile the images (i.e. no white space between them) use
#       plt.subplots_adjust(hspace=-.02, wspace=-.02)
#   right before plt.show().


## Results, Observations and Conclusions
TODO: Write your own notes, observations and conclusions about the results of your work.



## Full Code
TODO: Report the complete code of the project.

---
## Bibliography

* [Visualizing Filters and Feature Maps in Convolutional Neural Networks using PyTorch](https://debuggercafe.com/visualizing-filters-and-feature-maps-in-convolutional-neural-networks-using-pytorch/)