Some Drawbacks of ViT

This is a PyTorch implementation for exploring the attention mechanism of ViT. The experiments cover ViT, Integrated Gradients (IG), attention rollout, PGD, and a new attack method designed specifically for ViT that fools the attention mechanism while keeping the prediction correct. On the Imagenette dataset, the attack drives the cosine similarity between the clean and adversarial attention maps down to about 0.4 while roughly 100% of the predicted labels remain correct.

Experimental Results

We use a pretrained ViT model and run a variety of experiments on it to show the drawbacks of ViT. The experimental results are shown below.

Running the code

Dependencies

  • python 3.7
  • CUDA 10.2
  • PyTorch with GPU
  • Anaconda3
  • pytorch_pretrained_vit
  • seaborn
  • matplotlib
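
For example, assuming Anaconda and a CUDA 10.2 build of PyTorch are already installed (following the instructions on pytorch.org), the remaining Python packages can typically be installed with pip:

pip install pytorch_pretrained_vit seaborn matplotlib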

Downloading Out-of-Distribution Datasets

We provide a download link for the Imagenette dataset. To reproduce the experimental results, please download the dataset and unzip it into the \Attention-ViT folder so that it ends up as \Attention-ViT\data.

Rollout and IG

For rollout:

python ViT_rollout.py

For IG:

python ViT_IG.py

We find that rollout is the better of the two methods for visualizing the ViT network.
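
For reference, attention rollout combines the per-layer attention matrices by adding the identity (to account for the residual connections), re-normalizing, and multiplying across layers. A minimal sketch, assuming the per-layer attention weights have already been collected (e.g. with forward hooks) as tensors of shape (heads, tokens, tokens); ViT_rollout.py may differ in its details:

import torch

def attention_rollout(attentions):
    # attentions: list of (heads, tokens, tokens) tensors, one per layer
    result = torch.eye(attentions[0].size(-1), device=attentions[0].device)
    for attn in attentions:
        attn = attn.mean(dim=0)                                     # average over heads
        attn = attn + torch.eye(attn.size(-1), device=attn.device)  # residual connection
        attn = attn / attn.sum(dim=-1, keepdim=True)                # re-normalize rows
        result = attn @ result                                      # accumulate across layers
    # Row 0 is the CLS token; its weights over the patch tokens give the attention map.
    return result[0, 1:]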

PGD and AttentionAttack

For PGD:

python ViT_pgd.py
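
For reference, a standard untargeted L-infinity PGD loop looks roughly like the sketch below; model, images, labels, and the hyperparameters eps, alpha, and steps are placeholders, and the exact settings in ViT_pgd.py may differ:

import torch
import torch.nn.functional as F

def pgd_attack(model, images, labels, eps=8/255, alpha=2/255, steps=10):
    # Untargeted L-infinity PGD: maximize cross-entropy within an eps-ball.
    adv = images.clone().detach()
    adv = (adv + torch.empty_like(adv).uniform_(-eps, eps)).clamp(0, 1)  # random start
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), labels)
        grad = torch.autograd.grad(loss, adv)[0]
        adv = adv.detach() + alpha * grad.sign()
        adv = images + (adv - images).clamp(-eps, eps)  # project back into the eps-ball
        adv = adv.clamp(0, 1)
    return adv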

For AttentionAttack:

python ViT_change_attention.py
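
The attention attack optimizes a different objective: it pushes the adversarial attention map away from the clean one (low cosine similarity) while keeping the cross-entropy on the true label small, so the prediction stays correct. A minimal sketch of that idea, assuming a hypothetical get_attention_map(model, x) helper that returns a (batch, tokens) attention map (e.g. the rollout map above); the exact loss and weighting used in ViT_change_attention.py may differ:

import torch
import torch.nn.functional as F

def attention_attack(model, get_attention_map, images, labels,
                     eps=8/255, alpha=2/255, steps=40, lam=1.0):
    # Make the attention map diverge from the clean one while the label stays correct.
    clean_map = get_attention_map(model, images).detach()
    adv = images.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        adv_map = get_attention_map(model, adv)
        cos = F.cosine_similarity(adv_map, clean_map, dim=1).mean()  # attention similarity
        ce = F.cross_entropy(model(adv), labels)                     # keep the true label
        loss = cos + lam * ce
        grad = torch.autograd.grad(loss, adv)[0]
        adv = adv.detach() - alpha * grad.sign()                     # descend on both terms
        adv = images + (adv - images).clamp(-eps, eps)
        adv = adv.clamp(0, 1)
    return adv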

Visualize Attention Trend

We point out the drawbacks of the attention mechanism in ViT: the attention maps of the deep layers often attend to quite unreasonable regions.

python ViT_rollout_trend.py

The attention maps are in the folders \panda1 and \monkey1. You can also find them in the presentation slides \Some Draw Backs of ViT.pptx.
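
A minimal sketch of how such a per-layer trend can be plotted with seaborn, assuming the per-layer attention maps have already been collected as (heads, tokens, tokens) tensors and that the patch grid is square; ViT_rollout_trend.py may differ in its details:

import seaborn as sns
import matplotlib.pyplot as plt

def save_attention_trend(attentions, out_prefix="layer"):
    # attentions: list of (heads, tokens, tokens) tensors, one per layer
    for i, attn in enumerate(attentions):
        cls_attn = attn.mean(dim=0)[0, 1:]            # CLS-token attention, heads averaged
        side = int(cls_attn.numel() ** 0.5)           # square patch grid assumed
        heat = cls_attn.reshape(side, side).detach().cpu().numpy()
        plt.figure()
        sns.heatmap(heat, cbar=True)
        plt.title(f"layer {i}")
        plt.savefig(f"{out_prefix}_{i:02d}.png")
        plt.close()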

Video

You can download the video here.
