Constrained Attention Filter (CAF)

(ECCV 2020) TensorFlow implementation of A Generic Visualization Approach for Convolutional Neural Networks

Paper | 1 Min Video | 10 Mins Video

Qualitative Evaluation -- L2-CAF Slow Motion Convergence

[Animated GIF grid: columns "One Object" and "Two Objects"; rows "Last Conv" and "Intermediate Conv".]

TL;DR

L2-CAF has three core components:

1- TF filter: This is the function that inserts L2-CAF inside a network (e.g., inside a DenseNet). L2-CAF is disabled by default; it is passive during classification. To activate or deactivate L2-CAF (turn the filter on and off), I use the bool atten_var_gate. False deactivates L2-CAF, while True activates the filter.

2- Optimization loop: In this loop, we compute the class-oblivious and class-specific loss and use gradient descent to minimize it. When the loss stabilizes (loss - prev_loss < 10e-5), we break out of the loop.

3- Finalize filter before saving: After convergence, the output filter is normalized (L2-Norm, Softmax, or Gaussian) before generating the heatmap.
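The gate, the stopping rule, and the L2 normalization described above can be sketched in plain Python. This is a minimal illustration, not the repository's TensorFlow code: the function names (l2_normalize, apply_gated_filter, optimize_until_stable), the toy 2x2 feature map, and the toy quadratic loss are all assumptions made for the example. The stopping rule compares the absolute change in loss, which is also an assumption; the tolerance 10e-5 is the value quoted above.

```python
import math


def l2_normalize(filt, eps=1e-12):
    """Scale a 2-D filter (list of rows) so that its L2 norm is 1."""
    norm = math.sqrt(sum(v * v for row in filt for v in row))
    return [[v / max(norm, eps) for v in row] for row in filt]


def apply_gated_filter(features, filt, gate):
    """Return the features unchanged when gate is False (passive mode);
    otherwise weight each spatial cell by the unit-norm filter."""
    if not gate:
        return [row[:] for row in features]
    unit = l2_normalize(filt)
    return [[a * w for a, w in zip(a_row, w_row)]
            for a_row, w_row in zip(features, unit)]


def optimize_until_stable(step, tol=10e-5, max_iters=1000):
    """Call step() (one gradient update that returns the loss) until the
    loss changes by less than tol between consecutive iterations."""
    prev_loss = float("inf")
    loss = prev_loss
    for iteration in range(1, max_iters + 1):
        loss = step()
        if abs(loss - prev_loss) < tol:
            return loss, iteration
        prev_loss = loss
    return loss, max_iters


# Gate demo on a toy 2x2 single-channel feature map (hypothetical data).
features = [[1.0, 2.0], [3.0, 4.0]]
filt = [[1.0, 1.0], [1.0, 1.0]]
passive = apply_gated_filter(features, filt, gate=False)  # untouched copy
active = apply_gated_filter(features, filt, gate=True)    # filtered

# Stopping-rule demo: gradient descent on the toy loss (x - 3)^2.
state = {"x": 0.0}


def toy_step(lr=0.1):
    state["x"] -= lr * 2.0 * (state["x"] - 3.0)  # gradient update
    return (state["x"] - 3.0) ** 2               # loss after the update


final_loss, iterations = optimize_until_stable(toy_step)
```

In the real code the step would run one optimizer update on the attention filter inside the network; the toy loss only stands in for that so the loop can be run on its own.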

Requirements

  • Python 3+ [Tested on 3.7]
  • TensorFlow 1.x [Tested on 1.14]

ImageNet Pretrained Models

I used the following

Usage example

Update base_config._load_user_setup with your configuration. Mainly, set the location of the pre-trained model (e.g., densenet). The released code optimizes the constrained attention filter on sample images from the "input_imgs" directory. However, if you plan to run the code on a whole dataset (e.g., ImageNet), you should set local_datasets_dir in _load_user_setup.

The unit L2-Norm constrained attention filter has two operating modes.

  • visualize_attention.py is the script for the vanilla "slow" (4 seconds) mode. I recommend running this first before experimenting with the fast L2-CAF version. The code of this mode is much easier to understand. The script's main function sets all the hyper-parameters needed. I will elaborate more on each hyper-parameter soon.

  • visualize_attention_fast.py is the script for the fast (0.3 seconds) mode. The script only supports DenseNet. I will add support for Inception and ResNet soon. This script only works for visualizing attention in the last conv layer. I only use it for quantitative evaluation experiments, for instance, when I evaluate L2-CAF on the ImageNet validation split.

TODO LIST

  • Add Fast L2-CAF on DenseNet
  • Add InceptionNet and ResNet support
  • Document to use the code
  • Document the intermediate layer visualization
  • Document extra technical tricks not mentioned in the paper

Contributing

It would be great if someone re-implemented this in PyTorch. Let me know and I will add a link to your PyTorch implementation here.

MISC Notes

  • We did not write localization evaluation code. We used the MATLAB code released by CAM for Tables 1 and 3, and the Python code released by ADL for Table 2. Feel free to evaluate L2-CAF localization with other evaluation code.

  • The softmax and Gaussian filters were released at a reviewer's request. The current Gaussian filter implementation is hard-coded to support only a 7x7 attention filter. It is straightforward to extend it to any odd filter size (e.g., 13x13). However, for even filter sizes I think more changes are required. The last conv layer in standard architectures produces a 7x7 feature map, so the current configuration should cover most cases.

  • I used modules of this code (especially the nets package) in multiple projects, so there is a lot of code that is not related to L2-CAF. I will iteratively clean the code. The TL;DR section, at the top of the readme file, highlights the core functions related to L2-CAF.
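As a rough illustration of the Gaussian filter note above, the sketch below builds a 2-D Gaussian for any odd size. It is not the repository's implementation: the function name gaussian_filter, the default sigma of size / 6, and the sum-to-one normalization are assumptions made for the example. The sketch relies on an odd size having a single centre cell, which is why it rejects even sizes instead of trying to handle them.

```python
import math


def gaussian_filter(size=7, sigma=None):
    """Build a size x size Gaussian centred on the middle cell.

    Hypothetical helper: only odd sizes are accepted because the
    centre is taken to be the single cell at index size // 2.
    """
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd integer")
    if sigma is None:
        sigma = size / 6.0  # assumed default, not taken from the repository
    center = size // 2
    kernel = [
        [math.exp(-((r - center) ** 2 + (c - center) ** 2) / (2.0 * sigma ** 2))
         for c in range(size)]
        for r in range(size)
    ]
    total = sum(sum(row) for row in kernel)
    # Normalize so the weights sum to 1 (an assumption for this sketch).
    return [[v / total for v in row] for row in kernel]


k7 = gaussian_filter(7)    # the hard-coded size mentioned above
k13 = gaussian_filter(13)  # a larger odd size, e.g. an intermediate layer
```

Swapping the hard-coded 7 for a size argument is the only change needed for other odd sizes in this sketch; an even size would need a different definition of the centre.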

Release History

  • 1.0.0
    • First commit: Vanilla L2-CAF on DenseNet, InceptionV1, and ResNet50V2 on 12, 15, and 18 July 2020
    • Add Fast L2-CAF on DenseNet 21 July 2020
    • Add Fast L2-CAF on Inception 22 July 2020
    • Add Fast L2-CAF on ResNet 23 July 2020

Citation

@inproceedings{taha2020generic,
title={A Generic Visualization Approach for Convolutional Neural Networks},
author={Taha, Ahmed and Yang, Xitong and Shrivastava, Abhinav and Davis, Larry},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
year={2020}
}