# Deep Learning: Ex.6 - **Features visualization**

Submitted by: [... **your name and ID** ...]



In [None]:
# TensorFlow 
import tensorflow as tf
from tensorflow.keras.models import Sequential, Model

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt

# PCA and tSNE:
from sklearn.manifold import TSNE
from sklearn.decomposition import PCA

print(tf.__version__)

---
In this question we will use a "VGG-like" model (similar to the one you trained in past exercises) that was pre-trained on the CIFAR-10 dataset. You will need to download the model and put it in your working directory.

This model consists of the following layers: 

 - input (32x32x3) -> (Conv -> Conv -> Pool) -> (Conv -> Conv -> Pool) -> (Conv -> Conv -> Pool) -> 2048-Dense -> 10-output
 
There are also some batch-normalization and dropout layers in between. All `Conv2D` layers use 3x3 kernels with `padding='same'`.

We will use the 10,000 validation images of the CIFAR-10 dataset (the test split, `x_test`) for the following analysis.

In [None]:
# download the pre-trained model:
!git clone https://github.com/rubinj/cifar_model.git

# load the model:
model = tf.keras.models.load_model('cifar_model/model.h5') 

for i,l in enumerate(model.layers):
    print('%-5i' % i,
          '%-20s' % (l.name,),
          '%s' % (l.output_shape[1:],))

In [None]:
# download the cifar10 dataset:

from tensorflow.keras.datasets import cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

print('x_test.shape = ',x_test.shape)
print('y_test.shape = ',y_test.shape)

---
### 1. Embedding of the feature space in 2-D (using PCA)

In this question we will use our pre-trained model as a smart feature-extractor.
To this aim, we will use the output from the layer before the last one. This layer produces a 2048-D vector for any given image input of size (32,32,3).
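
A minimal sketch of the feature-extractor idea, assuming a Keras model whose second-to-last layer is the one of interest. The `toy_model` below is a hypothetical stand-in with random weights and a 16-D dense layer in place of the 2048-D one; it is not the pre-trained network, and any input scaling the real model expects is not shown here.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical stand-in for the pre-trained network (random weights, tiny sizes).
toy_model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    layers.Conv2D(4, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(16, activation='relu'),    # plays the role of the 2048-D layer
    layers.Dense(10, activation='softmax'),
])

# Sub-model that shares the weights but stops at the layer before the last one.
feature_extractor = tf.keras.Model(inputs=toy_model.inputs,
                                   outputs=toy_model.layers[-2].output)

images = np.random.rand(8, 32, 32, 3).astype('float32')  # fake image batch
features = feature_extractor.predict(images, verbose=0)
print(features.shape)  # (8, 16)
```

With the real model, the same pattern applied to the 10,000 images would be expected to give a (10000, 2048) matrix.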

- We will use this method to extract a 2048-D feature representation for each of the 10,000 test images.

- Use **PCA** to reduce the dimensionality of the features **from 2048-D to 2-D**, and use a scatter plot to visualize all samples in this 2-D space. Color the samples by their true label (use the `tab10` color map).
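
The PCA projection and the colored scatter plot can be sketched as follows. The feature matrix here is synthetic (random class centers plus noise, 64-D instead of 2048-D), so the names `features` and `labels` and all sizes are illustrative assumptions, not the exercise data.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic stand-in for the real (10000, 2048) feature matrix.
n_samples, n_dims, n_classes = 500, 64, 10
labels = rng.integers(0, n_classes, size=n_samples)
class_centers = rng.normal(scale=3.0, size=(n_classes, n_dims))
features = class_centers[labels] + rng.normal(size=(n_samples, n_dims))

# Project onto the two directions of largest variance.
embedding_2d = PCA(n_components=2).fit_transform(features)

# One point per sample, colored by its class label.
fig, ax = plt.subplots(figsize=(6, 6))
points = ax.scatter(embedding_2d[:, 0], embedding_2d[:, 1],
                    c=labels, cmap='tab10', s=5)
fig.colorbar(points, ax=ax, ticks=range(n_classes), label='class')
ax.set_xlabel('PC 1')
ax.set_ylabel('PC 2')
```

With the real data, note that `cifar10.load_data()` returns labels of shape (10000, 1), so something like `y_test.ravel()` is needed for the `c=` argument.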



In [None]:
    ################################
    ###  your code goes here...  ###
    ################################

---
### 2. Embedding of the feature space in 2-D (using PCA + tSNE)

Usually, tSNE gives better results for this kind of task. The problem is that running tSNE on a large matrix (10,000 x 2,048) can take too long.

Therefore, we will first use **PCA** to reduce the dimensionality of the features: **from 2048-D to 50-D**.

Then, we will use **tSNE** to further reduce the dimensionality from **50-D to 2-D**.

As before, use a scatter plot to visualize all samples in this 2-D space. Color the samples by their true label (use the `tab10` color map).
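
The two-stage reduction might look like the sketch below. It runs on a small synthetic matrix (300 samples, 128-D) so that it finishes quickly; the sizes, the `random_state` value, and the default tSNE settings are assumptions chosen for illustration, not tuned recommendations.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Synthetic stand-in for the real (10000, 2048) feature matrix.
n_samples, n_dims, n_classes = 300, 128, 10
labels = rng.integers(0, n_classes, size=n_samples)
class_centers = rng.normal(scale=3.0, size=(n_classes, n_dims))
features = class_centers[labels] + rng.normal(size=(n_samples, n_dims))

# Stage 1: PCA down to 50 dimensions (cheap, linear).
features_50d = PCA(n_components=50).fit_transform(features)

# Stage 2: tSNE from 50-D to 2-D (slow, non-linear).
embedding_2d = TSNE(n_components=2, random_state=0).fit_transform(features_50d)

print(features_50d.shape, embedding_2d.shape)  # (300, 50) (300, 2)
```

The resulting `embedding_2d` can be passed to `plt.scatter(..., c=labels, cmap='tab10')` to produce the colored plot.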




In [None]:
    ################################
    ###  your code goes here...  ###
    ################################

---
### 3. Maximally activating patches - `conv2d_1` layer

In this question, we will explore the different filters along the pre-trained model.


- Pick the **2nd conv** layer in the model (`conv2d_1`), and calculate its activations for each of the 10,000 images. The result should be a tensor of shape (10000,32,32,32).
- Pick the **1st filter (channel) in that layer** (out of the 32 available), and look for the neuron with the highest activation there (over all 32x32 neurons and 10,000 images).
- Print the location `(i,j)` of this neuron and the index number (1-10,000) of the chosen image.
- Plot the corresponding patch in that image.

Hint: `np.argmax` and `np.unravel_index` might come in handy.
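
As a small illustration of how these two functions combine, the sketch below uses a tiny random array in place of the real activations; the shape (6, 4, 4, 3) and the variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy activations with layout (images, rows, cols, channels).
activations = rng.random((6, 4, 4, 3))

# Keep only the first channel: shape (6, 4, 4).
channel = activations[..., 0]

# argmax works on the flattened array and returns a single integer...
flat_index = np.argmax(channel)

# ...which unravel_index converts back to (image, row, col) coordinates.
image_idx, i, j = np.unravel_index(flat_index, channel.shape)

print(image_idx, (i, j), channel[image_idx, i, j])
```

Note that the indices returned here are 0-based.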


In [None]:
    ################################
    ###  your code goes here...  ###
    ################################

- Now, instead of finding the single highest activation, find the **8 highest activations** (in descending order) for the same filter as before. In other words, out of the total (10000,32,32,1) activations, find the highest 8.
- Find the corresponding image patch for each of these 8 activations, and plot them in a single row of subplots.
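
One possible way to get the top-k locations in descending order is sketched below. The helper name `top_k_locations` is hypothetical, and the input is a small random array rather than real activations.

```python
import numpy as np

def top_k_locations(channel, k):
    """Return (image, row, col) index arrays of the k largest values, largest first."""
    flat = channel.ravel()
    # argpartition isolates the k largest entries without a full sort...
    candidates = np.argpartition(flat, -k)[-k:]
    # ...then only those k entries are sorted, in descending order.
    order = candidates[np.argsort(flat[candidates])[::-1]]
    return np.unravel_index(order, channel.shape)

rng = np.random.default_rng(1)
channel = rng.random((20, 8, 8))          # toy (images, rows, cols) activations

image_idx, rows, cols = top_k_locations(channel, k=8)
top_values = channel[image_idx, rows, cols]
print(top_values)
```

Several of the top activations may come from the same image, or from neighboring positions in it, so the resulting patches can overlap.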

In [None]:
    ################################
    ###  your code goes here...  ###
    ################################

- Finally, repeat the same process for **9 more filters** (channels) in the same layer (`conv2d_1`): for each of these filters, find the 8 highest activations and extract their corresponding image patches.

- Plot all the patches you extracted (10x8 patches in total: 10 filters x 8 patches each). Use `10x8` subplots.
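
A possible layout for the subplot grid is sketched below, with one row per filter and one column per patch. The `patches` array holds random 5x5 RGB values purely as a placeholder for the extracted image patches.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n_filters, n_top = 10, 8

# Random stand-in patches, shape (filters, top-k, height, width, RGB).
patches = rng.random((n_filters, n_top, 5, 5, 3))

# One row per filter, one column per top activation.
fig, axes = plt.subplots(n_filters, n_top, figsize=(n_top, n_filters))
for f in range(n_filters):
    for k in range(n_top):
        axes[f, k].imshow(patches[f, k])
        axes[f, k].axis('off')
```

Since each axis is drawn separately, patches that were clipped at an image border (and are therefore smaller) can still be shown in the same grid.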

In [None]:
    ################################
    ###  your code goes here...  ###
    ################################

---
### 4. Maximally activating patches - `conv2d_2` layer

Repeat the same process for a different layer now: `conv2d_2`.

- Extract all activations of that layer: (10000,16,16,64)

- Pick 10 filters (out of its 64), and find the 8 highest activations for each filter.

- Plot all the patches you extracted (10x8 patches in total: 10 filters x 8 patches each). Use 10x8 subplots.



In [None]:
    ################################
    ###  your code goes here...  ###
    ################################

---
### 5. Maximally activating patches - `conv2d_3` layer

Repeat the same process for a different layer now: `conv2d_3`.

- Extract all activations of that layer: (10000,16,16,64)

- Pick 10 filters (out of its 64), and find the 8 highest activations for each filter.

- Plot all the patches you extracted (10x8 patches in total: 10 filters x 8 patches each). Use 10x8 subplots.


Pay careful attention to the way you transform the index `(i,j)` in the activation layer to the correct patch in the image.
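
One way to reason about this mapping is to track the receptive field through the layer stack. The sketch below assumes 3x3 convolutions with `padding='same'` (one pixel of padding) and 2x2 pooling with stride 2, which matches the layer list and shapes given above; the helper names are hypothetical, and the convention used in the presentation slides may differ.

```python
def receptive_field(layers):
    """layers: (kernel, stride, padding) tuples from the input to the target layer.

    Returns (size, jump, offset): along one axis, output index o sees input
    pixels o*jump + offset ... o*jump + offset + size - 1.
    """
    size, jump, offset = 1, 1, 0
    for kernel, stride, padding in layers:
        size += (kernel - 1) * jump   # each extra tap widens the field by one jump
        offset -= padding * jump      # padding shifts the field start to the left
        jump *= stride                # striding spreads output indices apart
    return size, jump, offset

def patch_bounds(index, size, jump, offset, image_size=32):
    """Clip the receptive field of one output index to the image (stop is exclusive)."""
    start = index * jump + offset
    return max(start, 0), min(start + size, image_size)

CONV = (3, 1, 1)   # assumed: 3x3 kernel, stride 1, 'same' padding
POOL = (2, 2, 0)   # assumed: 2x2 pooling, stride 2
stacks = {
    'conv2d_1': [CONV, CONV],
    'conv2d_2': [CONV, CONV, POOL, CONV],
    'conv2d_3': [CONV, CONV, POOL, CONV, CONV],
}
for name, stack in stacks.items():
    size, jump, offset = receptive_field(stack)
    print(name, size, jump, offset, patch_bounds(5, size, jump, offset))
# conv2d_1 5 1 -2 (3, 8)
# conv2d_2 10 2 -4 (6, 16)
# conv2d_3 14 2 -6 (4, 18)
```

Given bounds `(r0, r1)` for row `i` and `(c0, c1)` for column `j`, the patch would be `image[r0:r1, c0:c1]`. Batch-normalization and dropout layers act element-wise, so they are left out of the stacks.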

You should get results similar to the ones in the presentation slides.

In [None]:
    ################################
    ###  your code goes here...  ###
    ################################

***
## Good Luck!