This project aims to study the kind of representations learned by ConvNets at different layers. The implementations in this project are based on the ideas presented in the papers:
- Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps by Simonyan et al.
- Understanding deep image representations by inverting them by Mahendran et al.
This idea was presented in the paper Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps by Simonyan et al. The authors explore the following two ideas:
- What kind of representations are learned by CNNs when we consider a particular classification score in the softmax layer? This can be visualized by learning an image from random noise that maximizes the class-specific score in the CNN.
- What is the contribution of individual pixels of an image to the final score produced by the network? This is explored by extracting visual saliency maps, computed as the derivative of the final score w.r.t. the input image (this can be done in a single backprop step). The authors also use the visual saliency maps to initialize GraphCut segmentation, which can be used for object localization and detection. We do not explore this aspect of the saliency map in this implementation.
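The saliency computation described above amounts to a single backward pass. A minimal sketch using TensorFlow/Keras follows; the tiny two-class model and the 32x32 input here are hypothetical stand-ins for the pretrained network used in the notebook:

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in classifier; in the notebook this is the pretrained MobileNetV2.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2),  # e.g. cat vs dog class scores
])

def saliency_map(model, image, class_idx):
    """Derivative of the class score w.r.t. the input image (one backprop step)."""
    x = tf.convert_to_tensor(image[None, ...], dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)  # the input is not a variable, so watch it explicitly
        score = model(x)[0, class_idx]
    grads = tape.gradient(score, x)[0]
    # Collapse the channel axis by taking the max absolute gradient per pixel.
    return tf.reduce_max(tf.abs(grads), axis=-1).numpy()

sal = saliency_map(model, np.random.rand(32, 32, 3).astype("float32"), class_idx=0)
```

The max-over-channels step follows the common convention for turning the 3-channel gradient into a single per-pixel saliency value.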
The code for this visualization can be found in the file Class_score_and_Saliency_Map_Visualization.ipynb. It can be run in a top-down manner like any other IPython notebook.
I pretrain a MobileNetV2 architecture on the cats vs dogs dataset and then learn the visualizations. Would it be better to optimize the learnable image directly against one of the ImageNet class scores rather than pretraining the model?
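Learning the image itself can be sketched as gradient ascent over the pixels, starting from noise. This is a minimal sketch assuming TensorFlow/Keras; the small randomly initialized model stands in for the trained network, and the step count and learning rate are illustrative:

```python
import tensorflow as tf

tf.random.set_seed(0)

# Stand-in classifier; in the notebook this is the pretrained MobileNetV2.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2),
])

target_class = 0
image = tf.Variable(tf.random.uniform((1, 32, 32, 3)))  # start from random noise
opt = tf.keras.optimizers.Adam(learning_rate=0.1)

score_before = float(model(image)[0, target_class])
for _ in range(50):
    with tf.GradientTape() as tape:
        # Negate the score: the optimizer minimizes, so this ascends the score.
        loss = -model(image)[0, target_class]
    opt.apply_gradients([(tape.gradient(loss, image), image)])
score_after = float(model(image)[0, target_class])
```

Only the image is a variable here; the model weights are never updated, which is the key difference from ordinary training.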
[WIP]
This idea was presented in the paper Understanding deep image representations by inverting them by Mahendran et al. The authors explore the following idea:
What kind of representations are learned by CNNs at different convolutional blocks? This is explored by trying to invert the representations of a test image at different layers of the network. This can provide us with alternate forms of the same image that have similar representations for the ConvNet. This alternative formulation helps us visualize the kind of images that are sufficient to form meaningful representations at different layers of the network.
The inversion procedure involves the following steps:
- Randomly initialize an image.
- Compute the reconstruction loss between the representation of the actual image and the random image at a layer l.
- Update the random image by backpropagating through it using any standard optimizer.
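The steps above can be sketched as follows (a minimal sketch using TensorFlow/Keras; the two-layer model and layer names are hypothetical stand-ins for the pretrained VGG16 and a layer such as block3_conv1):

```python
import tensorflow as tf

tf.random.set_seed(0)

# Hypothetical stand-in network; in the notebook this is the pretrained VGG16
# and the chosen layer l is e.g. "block3_conv1".
inp = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(8, 3, activation="relu", name="conv_a")(inp)
out = tf.keras.layers.Conv2D(8, 3, activation="relu", name="conv_b")(x)
feature_model = tf.keras.Model(inp, out)  # image -> representation at layer l

real_image = tf.random.uniform((1, 32, 32, 3))  # stands in for the test image
target = feature_model(real_image)              # representation to invert

recon = tf.Variable(tf.random.uniform((1, 32, 32, 3)))  # step 1: random init
opt = tf.keras.optimizers.Adam(learning_rate=0.05)

def step():
    with tf.GradientTape() as tape:
        # Step 2: reconstruction loss between the two representations at layer l.
        loss = tf.reduce_mean(tf.square(feature_model(recon) - target))
    # Step 3: update only the random image; the network weights stay fixed.
    opt.apply_gradients([(tape.gradient(loss, recon), recon)])
    return float(loss)

loss_first = step()
for _ in range(100):
    loss_last = step()
```

At convergence, `recon` is an image whose layer-l representation matches that of the original, which is exactly the alternative formulation described above.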
The code for this visualization can be found in the file Inverting_convnet_representations.ipynb. It can be run in a top-down manner like any other IPython notebook.
I use a pretrained VGG16 network for visualization at various layers of the network.
Original Image:
Reconstructed Images:
Block3 Conv1 | Block3 Conv2 | Block3 Conv3 |
Block4 Conv1 | Block4 Conv2 | Block4 Conv3 |
Block5 Conv1 | Block5 Conv2 | Block5 Conv3 |
- Inverting the image representations clearly shows that as we move down a deep hierarchical model like the VGG16 convnet, the representations become more abstract, with the network discarding most of the finer details in the image and retaining only higher-level features like shape, eyes, ears, etc.
- Visualizing the image saliency maps clearly shows the main image pixels responsible for the classification score assigned by the network. This is somewhat obvious and expected.
- Updating the image by maximizing the class score does not give any clear representations in my implementation at convergence, but this might be because I fine-tuned the network rather than optimizing against an ImageNet class score. This is plausible since the convolutional layers are frozen when pretraining the network.