# CLIP-Dissect

Keywords: Neuron-level Interpretability, Network Dissection

Link to paper: https://arxiv.org/abs/2204.10965

About the datasets:

- CIFAR100: A standard dataset containing 60k RGB images of size 32×32, belonging to 100 classes of general objects and animals.

- Places365: A scene recognition dataset with 365 classes: http://places.csail.mit.edu/

- Broden: A diverse dataset for probing neurons, with some overlap with Places365, introduced in Network Dissection: http://netdissect.csail.mit.edu/

Notes:
- Make sure to enable the GPU: Runtime -> Change runtime type -> Hardware accelerator: GPU
- Free Colab RAM is not large enough to run every experiment, but it should be enough for the ones here. If the runtime crashes due to RAM constraints, restarting it sometimes helps.
- Run all the cells in order.
- Do not edit the cells marked with !!DO NOT EDIT!!

## Qualitative evaluation for hidden layers

This notebook generates descriptions for neurons in the hidden layers of a neural network. Each neuron description is shown together with the 5 most highly activating images for that neuron, so that the quality of the description can be evaluated.

In [None]:
# !!DO NOT EDIT!!
!git clone https://github.com/Trustworthy-ML-Lab/CLIP-dissect
!pip install ftfy regex
import os
os.chdir('CLIP-dissect')

In [None]:
# Downloads the Broden dataset and a ResNet-18 trained on Places
!bash dlbroden.sh
!bash dlzoo_example.sh

In [None]:

import torch

import matplotlib
from matplotlib import pyplot as plt

import utils
import data_utils
import similarity

## Settings

In [None]:
clip_name = 'ViT-B/16'
d_probe = 'cifar100_train'
concept_set = 'data/3k.txt'
batch_size = 200
device = 'cuda'
pool_mode = 'avg'

save_dir = 'saved_activations'
similarity_fn = similarity.soft_wpmi

In [None]:
target_name = 'resnet18_places'
target_layer = 'layer4'

## Run CLIP-Dissect

In [None]:
utils.save_activations(clip_name = clip_name, target_name = target_name, target_layers = [target_layer],
                       d_probe = d_probe, concept_set = concept_set, batch_size = batch_size,
                       device = device, pool_mode=pool_mode, save_dir = save_dir)

with open(concept_set, 'r') as f:
    words = (f.read()).split('\n')

pil_data = data_utils.get_data(d_probe)

In [None]:
save_names = utils.get_save_names(clip_name = clip_name, target_name = target_name,
                                  target_layer = target_layer, d_probe = d_probe,
                                  concept_set = concept_set, pool_mode=pool_mode,
                                  save_dir = save_dir)

target_save_name, clip_save_name, text_save_name = save_names

similarities, target_feats = utils.get_similarity_from_activations(target_save_name, clip_save_name,
                                                                text_save_name, similarity_fn, device=device)

## Visualize

In [None]:
top_vals, top_ids = torch.topk(target_feats, k=5, dim=0)
# Alternative: pick the 20 neurons whose best-matching concept has the highest similarity
#neurons_to_check = torch.sort(torch.max(similarities, dim=1)[0], descending=True)[1][0:20]
neurons_to_check = range(10)
font_size = 14
font = {'size'   : font_size}

matplotlib.rc('font', **font)

fig = plt.figure(figsize=[10, len(neurons_to_check)*2])#constrained_layout=True)
subfigs = fig.subfigures(nrows=len(neurons_to_check), ncols=1)
for j, orig_id in enumerate(neurons_to_check):
    vals, ids = torch.topk(similarities[orig_id], k=5, largest=True)

    subfig = subfigs[j]
    subfig.text(0.13, 0.96, "Neuron {}:".format(int(orig_id)), size=font_size)
    subfig.text(0.27, 0.96, "CLIP-Dissect: {}, {:.2f}".format(words[int(ids[0])], vals[0]), size=font_size)
    #subfig.text(0.4, 0.96, words[int(ids[0])], size=font_size)
    axs = subfig.subplots(nrows=1, ncols=5)
    for i, top_id in enumerate(top_ids[:, orig_id]):
        im, label = pil_data[top_id]
        im = im.resize([375,375])
        axs[i].imshow(im)
        axs[i].axis('off')
plt.show()

# To do

Complete the following tasks and record the results in Google Slides.

**TASK 1:** Change neurons_to_check to a different set of neurons of your choosing, then visualize and save the results (hint: you only need to rerun the last cell).
- Keep the list of neurons and the output image for later comparisons.
- Evaluate whether the neuron descriptions match the highly activating images.
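
One possible way to choose neurons for Task 1 is sketched below. It draws a reproducible random sample of neuron indices so the same list can be reused in later comparisons. The helper name pick_neurons and the value 512 (the number of channels in layer4 of a ResNet-18) are assumptions made for illustration; in the notebook the neuron count could be read from similarities.shape[0] instead.

```python
import random


def pick_neurons(num_neurons, count=10, seed=0):
    """Return `count` distinct neuron indices in ascending order.

    A fixed seed makes the selection reproducible, so the same neurons
    can be inspected again after changing d_probe or other settings.
    """
    if count > num_neurons:
        raise ValueError("count cannot exceed the number of neurons")
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_neurons), count))


# Assumption: 512 neurons, as in layer4 of a ResNet-18.
neurons_to_check = pick_neurons(num_neurons=512, count=10, seed=0)
print(neurons_to_check)
```

The resulting list can be assigned to neurons_to_check in the visualization cell in place of range(10).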

**TASK 2:** Change d_probe to 'broden'. Evaluate again with the same neurons. Are the concepts similar to before? Which dataset gives better matching concepts and why?

**TASK 3:** Vary other parameters and evaluate how they change the results. Some possibilities:
- Different layers: try 'conv1', 'layer1', 'layer2' or 'layer3'
- Different similarity functions: see similarity.py for options
- Different concept sets: for example, try broden_labels_clean.txt (10k.txt and 20k.txt might run out of RAM in Colab)
- Different models: for example, 'resnet50' or 'resnet101' (by default, this loads an ImageNet-trained model from torchvision)
- Modify the code to display the top-k best descriptions for a neuron
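
For the last item, the visualization cell already computes the 5 best matches for each neuron with torch.topk (vals and ids) but only prints the first one. The sketch below illustrates the same ranking idea on toy data in plain Python. The helper name top_k_descriptions and the toy words and scores are hypothetical; in the notebook the scores would come from similarities[orig_id] and the labels from words.

```python
import heapq


def top_k_descriptions(scores, labels, k=5):
    """Return the k (label, score) pairs with the highest scores, best first."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    return heapq.nlargest(k, zip(labels, scores), key=lambda pair: pair[1])


# Toy data standing in for one row of the similarity matrix (hypothetical values).
toy_words = ["bed", "tree", "kitchen", "sky", "water"]
toy_scores = [0.12, 0.80, 0.33, 0.75, 0.05]

top3 = top_k_descriptions(toy_scores, toy_words, k=3)
for word, score in top3:
    print("{}: {:.2f}".format(word, score))
```

In the plotting loop, a comparable result could be shown by joining words[int(i)] for every i in ids, rather than only ids[0], when building the text passed to subfig.text.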