# Visualizing what ConvNets learn


There are many approaches to visualizing convolutional networks. In this notebook we will look at a few of them:

* Layer Activation
* Conv/FC Filters
* Embedding the codes with t-SNE


### Layer Activation

The most straightforward visualization technique is to show the activations of the network during the forward pass. For ReLU networks, the activations usually start out looking relatively blobby and dense, but as training progresses they usually become more sparse and localized. One dangerous pitfall that this visualization makes easy to spot is that some activation maps may be all zero for many different inputs, which can indicate dead filters and can be a symptom of high learning rates.


![https://cs231n.github.io/assets/cnnvis/act1.jpeg](https://cs231n.github.io/assets/cnnvis/act1.jpeg)![https://cs231n.github.io/assets/cnnvis/act2.jpeg](https://cs231n.github.io/assets/cnnvis/act2.jpeg)
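
The dead-filter check described above can be sketched in a few lines. This is a minimal illustration and not part of the original notebook: the tiny untrained `nn.Sequential` model, the random input batch, and the helper name `dead_filter_mask` are all assumptions, chosen so the example runs without downloading any weights. A forward hook records the ReLU output, and a channel is flagged when its activation map is zero for every input in the batch.

```python
import torch
import torch.nn as nn

def dead_filter_mask(activations):
    """Return a boolean vector marking channels that are zero for every input.

    activations: tensor of shape (batch, channels, height, width), post-ReLU.
    """
    # Largest activation per channel over the batch and spatial dimensions.
    max_per_channel = activations.amax(dim=(0, 2, 3))
    return max_per_channel == 0

# Hypothetical stand-in network (untrained); any model with ReLU layers would do.
torch.manual_seed(0)
model = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU())

# Force one filter to be dead for demonstration: zero weights and a negative
# bias make its pre-activation negative everywhere, so the ReLU outputs zero.
with torch.no_grad():
    model[0].weight[2].zero_()
    model[0].bias[2].fill_(-1.0)

captured = {}

def save_activation(module, inputs, output):
    # Forward hooks receive the module, its inputs, and its output.
    captured["relu"] = output.detach()

handle = model[1].register_forward_hook(save_activation)
with torch.no_grad():
    model(torch.randn(16, 3, 32, 32))   # a batch of random "images"
handle.remove()

mask = dead_filter_mask(captured["relu"])
print("dead channels:", mask.nonzero().flatten().tolist())
```

In practice the batch should contain many real, varied inputs: a channel that is zero on a single image is not necessarily dead.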





In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F

from torch.utils.data import  DataLoader
from torchvision import models

import torchvision.transforms as transforms
import torchvision.datasets as dataset

import matplotlib.pyplot as plt
import numpy as np
import cv2 as cv

In [None]:
!wget https://download.pytorch.org/tutorial/hymenoptera_data.zip

In [None]:
!unzip hymenoptera_data.zip

In [None]:
# Load ImageNet-pretrained VGG16 and switch to inference mode (disables dropout).
modelVGG = models.vgg16(pretrained=True).eval()

In [None]:
print(modelVGG)

In [None]:
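# OpenCV loads images as BGR, so convert to RGB before plotting and preprocessing.
img=cv.imread("/content/hymenoptera_data/val/bees/1297972485_33266a18d9.jpg")
img=cv.cvtColor(img,cv.COLOR_BGR2RGB)
plt.imshow(img)
plt.show()

# Deterministic ImageNet preprocessing: a random crop would change the
# visualized activations on every run.
transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
img=transform(img)      # cv.imread already returns a NumPy array
img=img.unsqueeze(0)    # add the batch dimension: (1, 3, 224, 224)
print(img.size())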

In [None]:
# Collect every Conv2d layer, including those nested inside nn.Sequential blocks.
no_of_layers=0
conv_layers=[]

model_children=list(modelVGG.children())

for child in model_children:
  if isinstance(child, nn.Conv2d):
    no_of_layers+=1
    conv_layers.append(child)
  elif isinstance(child, nn.Sequential):
    for layer in child.children():
      if isinstance(layer, nn.Conv2d):
        no_of_layers+=1
        conv_layers.append(layer)
print(no_of_layers)  # VGG16 has 13 convolutional layers

In [None]:
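# Run the image through modelVGG.features so the ReLU and max-pool layers between convolutions
# are applied too; clone each Conv2d output because VGG's in-place ReLU would overwrite it.
outputs, x = [], img
with torch.no_grad():
    for layer in modelVGG.features:
        x = layer(x)
        if isinstance(layer, nn.Conv2d):
            outputs.append(x.clone())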

In [None]:
# Visualize the first 16 feature maps from each convolutional layer
for num_layer in range(len(outputs)):
    plt.figure(figsize=(50, 10))
    layer_viz = outputs[num_layer][0, :, :, :]
    layer_viz = layer_viz.detach()
    print("Layer ",num_layer+1)
    for i, feature_map in enumerate(layer_viz):
        if i == 16:
            break
        plt.subplot(2, 8, i + 1)
        plt.imshow(feature_map, cmap='gray')
        plt.axis("off")
    plt.show()
    plt.close()

### Conv/FC Filters
The second common strategy is to visualize the weights. These are usually most interpretable on the first CONV layer which is looking directly at the raw pixel data, but it is possible to also show the filter weights deeper in the network. The weights are useful to visualize because well-trained networks usually display nice and smooth filters without any noisy patterns. Noisy patterns can be an indicator of a network that hasn’t been trained for long enough, or possibly a very low regularization strength that may have led to overfitting.

![https://cs231n.github.io/assets/cnnvis/filt1.jpeg](https://cs231n.github.io/assets/cnnvis/filt1.jpeg)![https://cs231n.github.io/assets/cnnvis/filt2.jpeg](https://cs231n.github.io/assets/cnnvis/filt2.jpeg)
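
One way to turn a weight tensor into a picture like the ones above is to rescale each filter to [0, 1] and tile the filters into a grid. The sketch below is an illustration under stated assumptions rather than the notebook's own solution: `filters_to_grid` is a hypothetical helper, and the randomly initialized `nn.Conv2d` stands in for a trained first layer (with the VGG16 loaded earlier, `modelVGG.features[0].weight` would be the natural input).

```python
import numpy as np
import torch
import torch.nn as nn

def filters_to_grid(weight, ncols=8, pad=1):
    """Tile conv filters of shape (out_ch, 3, kH, kW) into one RGB image array."""
    w = weight.detach().cpu().numpy()
    n, c, kh, kw = w.shape
    assert c == 3, "RGB display only makes sense for 3-channel (first-layer) filters"
    # Rescale each filter independently to [0, 1] so its structure is visible.
    w_min = w.reshape(n, -1).min(axis=1).reshape(n, 1, 1, 1)
    w_max = w.reshape(n, -1).max(axis=1).reshape(n, 1, 1, 1)
    w = (w - w_min) / (w_max - w_min + 1e-8)
    w = w.transpose(0, 2, 3, 1)          # -> (n, kH, kW, 3), the layout imshow expects
    nrows = int(np.ceil(n / ncols))
    # White canvas with `pad` pixels of border around every filter.
    grid = np.ones((nrows * (kh + pad) + pad, ncols * (kw + pad) + pad, 3), dtype=w.dtype)
    for idx in range(n):
        row, col = divmod(idx, ncols)
        top, left = pad + row * (kh + pad), pad + col * (kw + pad)
        grid[top:top + kh, left:left + kw] = w[idx]
    return grid

# Randomly initialized stand-in for a trained first conv layer (assumption).
torch.manual_seed(0)
conv = nn.Conv2d(3, 16, kernel_size=3)
grid = filters_to_grid(conv.weight, ncols=8)
print(grid.shape)   # (9, 33, 3)
```

The returned array can be displayed with `plt.imshow(grid)` followed by `plt.axis("off")`. Deeper layers have more than three input channels, so their filters are usually shown one input channel at a time as grayscale images instead.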

In [None]:
#plot the first and second layer of convolutional layers


### Embedding the codes with t-SNE
(WE WILL STUDY THIS LATER)

ConvNets can be interpreted as gradually transforming the images into a representation in which the classes are separable by a linear classifier. We can get a rough idea about the topology of this space by embedding images into two dimensions so that distances in the low-dimensional representation approximately match those in the high-dimensional one. Many embedding methods have been developed with the aim of embedding high-dimensional vectors in a low-dimensional space while preserving the pairwise distances between points. Among these, t-SNE is one of the best-known methods and consistently produces visually pleasing results.

To produce an embedding, we can take a set of images and use the ConvNet to extract the CNN codes (e.g. in AlexNet the 4096-dimensional vector right before the classifier, and crucially, including the ReLU non-linearity). We can then plug these into t-SNE and get a 2-dimensional vector for each image. The corresponding images can then be visualized in a grid:

![https://cs.stanford.edu/people/karpathy/cnnembed/cnn_embed_1k.jpg](https://cs.stanford.edu/people/karpathy/cnnembed/cnn_embed_1k.jpg)
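
The extract-then-embed procedure can be sketched as follows. Everything concrete here is an assumption made to keep the example small and offline: `feature_extractor` is a hypothetical untrained stand-in for a pretrained network, random tensors stand in for preprocessed images, and the t-SNE settings (`perplexity=10`, PCA initialization) are illustrative choices, not recommendations from the source.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.manifold import TSNE

torch.manual_seed(0)

# Hypothetical stand-in for a pretrained network: the "code" is the output of
# the last hidden layer, taken after its ReLU, just before the final classifier.
feature_extractor = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 16 * 16, 64),
    nn.ReLU(),
)

# Random tensors stand in for a batch of preprocessed images (assumption).
images = torch.randn(60, 3, 16, 16)

with torch.no_grad():
    codes = feature_extractor(images).numpy()   # shape (60, 64)

# perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=0)
embedding = tsne.fit_transform(codes)           # one 2-D point per image
print(embedding.shape)
```

With the VGG16 model from earlier in this notebook, the first five modules of `modelVGG.classifier` (ending at the second ReLU) would play the role of `feature_extractor`, applied to the pooled and flattened convolutional features. Each row of `embedding` can then be used as the plot position of the corresponding image.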

In [1]:
# Extract the CNN codes for a set of images, then embed them in 2-D with t-SNE.

from sklearn.manifold import TSNE
#TODO: visualize code using t-sne