# CLIP-Dissect

Keywords: Neuron-level Interpretability, Network Dissection

Link to paper: https://arxiv.org/abs/2204.10965

About the datasets:

- CIFAR100: A standard dataset containing 60k RGB images of size 32×32, belonging to 100 classes of general objects and animals.

- Places365: A scene recognition dataset with 365 classes: http://places.csail.mit.edu/

- Broden: A diverse dataset for probing neurons, with some overlap with Places365, introduced in Network Dissection: http://netdissect.csail.mit.edu/

Notes:
- Make sure to enable the GPU: Runtime -> Change runtime type -> Hardware accelerator: GPU
- Free Colab RAM is not large enough to run all experiments but should be enough for these. Restarting the runtime sometimes helps if it crashes due to RAM constraints.
- Run all the cells in order.
- Do not edit the cells marked with !!DO NOT EDIT!!
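
As a quick sanity check for the GPU note above, the following sketch reports which device PyTorch would use. The helper name `pick_device` is hypothetical and not part of the CLIP-Dissect repository; the notebook itself simply sets `device = 'cuda'` in the Settings section.

```python
import importlib.util


def pick_device():
    """Return 'cuda' if PyTorch is installed and sees a GPU, else 'cpu'.

    Hypothetical helper for illustration only; not part of CLIP-Dissect.
    """
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment
        return "cpu"
    import torch
    return "cuda" if torch.cuda.is_available() else "cpu"


print("Device:", pick_device())
```

If this prints `cpu` in Colab, the hardware accelerator is probably not enabled yet.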

## Quantitative evaluation for final layer neurons

This notebook generates descriptions for neurons in the final (classification) layer of the neural network. We then compare how well they match the ground truth (the name of the class each neuron corresponds to).
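
The match between a description and the ground truth is scored later in the notebook as a cosine similarity between text embeddings (from the CLIP text encoder and from a sentence encoder). As a minimal sketch of the metric itself, the example below computes cosine similarity on made-up 3-dimensional vectors. The vectors and the function name `cosine_similarity` are illustrative assumptions; the notebook uses `utils.get_cos_similarity` with real embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up "embeddings" for illustration; real ones come from a text encoder
ground_truth = [1.0, 0.0, 0.0]
good_description = [0.9, 0.1, 0.0]   # points in almost the same direction
poor_description = [0.0, 1.0, 0.0]   # orthogonal to the ground truth

print(cosine_similarity(ground_truth, good_description))
print(cosine_similarity(ground_truth, poor_description))
```

A value close to 1 means the description embedding points in nearly the same direction as the ground-truth embedding; a value near 0 means they are unrelated.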

In [None]:
# !!DO NOT EDIT!!
# run this only once per runtime
import os
!git clone https://github.com/Trustworthy-ML-Lab/CLIP-dissect
os.chdir('CLIP-dissect')
!pip install -q -r requirements.txt

In [None]:
import torch

import matplotlib
from matplotlib import pyplot as plt

import clip
import utils
import data_utils
import similarity
import pandas as pd
from sentence_transformers import SentenceTransformer

In [None]:
# Download the Broden dataset and a ResNet-18 trained on Places365
!bash dlbroden.sh
!bash dlzoo_example.sh

## Settings

In [None]:
clip_name = 'ViT-B/16'
d_probe = 'broden'
concept_set = 'data/broden_labels_clean.txt'
batch_size = 200
device = 'cuda'
pool_mode = 'avg'

save_dir = 'saved_activations'
similarity_fn = similarity.soft_wpmi

In [None]:
target_name = 'resnet18_places'
target_layer = 'fc'

## Run CLIP-Dissect

In [None]:
utils.save_activations(clip_name = clip_name, target_name = target_name, target_layers = [target_layer],
                       d_probe = d_probe, concept_set = concept_set, batch_size = batch_size,
                       device = device, pool_mode=pool_mode, save_dir = save_dir)

with open(concept_set, 'r') as f:
    words = (f.read()).split('\n')

pil_data = data_utils.get_data(d_probe)

In [None]:
save_names = utils.get_save_names(clip_name = clip_name, target_name = target_name,
                                  target_layer = target_layer, d_probe = d_probe,
                                  concept_set = concept_set, pool_mode=pool_mode,
                                  save_dir = save_dir)

target_save_name, clip_save_name, text_save_name = save_names

similarities, target_feats = utils.get_similarity_from_activations(target_save_name, clip_save_name,
                                                                text_save_name, similarity_fn, device=device)

In [None]:
model = SentenceTransformer('all-mpnet-base-v2')
clip_model, _ = clip.load(clip_name, device=device)

with open('data/categories_places365.txt', 'r') as f:
    classnames = (f.read()).split('\n')

## Collect results of CLIP-Dissect and baselines

In [None]:
clip_preds = torch.argmax(similarities, dim=1)
clip_preds = [words[int(pred)] for pred in clip_preds]
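
The cell above picks, for each neuron, the concept with the highest similarity score. A toy version of the same selection in plain Python is sketched below. The names `toy_words`, `toy_similarities`, and `best_concepts` and all numbers are made up for illustration, assuming one row per neuron and one column per concept as in `torch.argmax(similarities, dim=1)`.

```python
# Toy similarity matrix: one row per neuron, one column per concept
toy_words = ["dog", "kitchen", "forest"]
toy_similarities = [
    [0.1, 0.7, 0.2],  # neuron 0 is most similar to "kitchen"
    [0.6, 0.3, 0.1],  # neuron 1 is most similar to "dog"
]


def best_concepts(similarities, words):
    """Return the highest-scoring concept name for each neuron (row)."""
    preds = []
    for row in similarities:
        # Index of the largest score in this row (argmax over concepts)
        best = max(range(len(row)), key=row.__getitem__)
        preds.append(words[best])
    return preds


toy_preds = best_concepts(toy_similarities, toy_words)
print(toy_preds)  # ['kitchen', 'dog']
```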

In [None]:
netdissect_res = pd.read_csv('data/NetDissect_results/resnet18_places365_fc.csv')
nd_preds = netdissect_res['label'].values

## Visualize

In [None]:
# Indices of the 5 most highly activating probe images for each neuron
top_vals, top_ids = torch.topk(target_feats, k=5, dim=0)

pil_data = data_utils.get_data(d_probe)

# Neurons to inspect; edit this list to look at other neurons
ids_to_check = [0, 1, 2]

for orig_id in ids_to_check:
    orig_id = int(orig_id)
    print('\n Layer:{} Neuron:{}, Gt:{}'.format(target_layer, orig_id, classnames[orig_id]))

    clip_cos, mpnet_cos = utils.get_cos_similarity(nd_preds[orig_id:orig_id+1],
                                                   classnames[orig_id:orig_id+1],
                                                   clip_model, model, device, batch_size)
    print('Network Dissection: {} {:.4f} {:.4f}'.format(nd_preds[orig_id], clip_cos, mpnet_cos))

    clip_cos, mpnet_cos = utils.get_cos_similarity(clip_preds[orig_id:orig_id+1],
                                                  classnames[orig_id:orig_id+1],
                                                  clip_model, model, device, batch_size)
    print('CLIP-Dissect: {} {:.4f} {:.4f}'.format(clip_preds[orig_id], clip_cos, mpnet_cos))


    fig = plt.figure(figsize=(15, 7))
    for i, top_id in enumerate(top_ids[:, orig_id]):
        im, label = pil_data[top_id]
        im = im.resize([375,375])
        fig.add_subplot(1, 5, i+1)
        plt.imshow(im)
        plt.axis('off')

    plt.title('Layer:{} Neuron:{}'.format(target_layer, (int(orig_id))))
    plt.show()
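
The visualization relies on `torch.topk(target_feats, k=5, dim=0)` to find the images that activate each neuron most strongly. The sketch below reproduces that idea in plain Python on a made-up activation table, assuming rows are images and columns are neurons. The names `top_k_per_neuron` and `toy_feats` are hypothetical, and unlike `torch.topk` the result is returned as one list of image indices per neuron.

```python
def top_k_per_neuron(activations, k):
    """For each neuron (column), return the indices of the k images (rows)
    with the highest activation, strongest first."""
    n_images = len(activations)
    n_neurons = len(activations[0])
    result = []
    for j in range(n_neurons):
        # Sort image indices by this neuron's activation, descending
        order = sorted(range(n_images),
                       key=lambda i: activations[i][j],
                       reverse=True)
        result.append(order[:k])
    return result


# Made-up activations: 4 images (rows) x 2 neurons (columns)
toy_feats = [
    [0.2, 0.9],
    [0.8, 0.1],
    [0.5, 0.4],
    [0.1, 0.7],
]

toy_top = top_k_per_neuron(toy_feats, k=2)
print(toy_top)  # [[1, 2], [0, 3]]
```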

# Problems

Complete the following tasks and record the results in Google Slides.

**TASK 1:** Compare explanations for 10 different final layer neurons (hint: you only need to rerun the last cell).
- Which explanations were better, Network Dissection or CLIP-Dissect? Does the highest cosine similarity to the ground truth class name correspond to a good match based on your intuition?


**TASK 3:** Vary other parameters and evaluate how they change the results. Some possibilities:
- Different similarity functions: see similarity.py for options
- Different concept sets: for example, try data/3k.txt or data/imagenet_labels (10k.txt and 20k.txt might run out of RAM in Colab)
- Change the probing dataset to CIFAR100_train (note: this only changes the probing data for CLIP-Dissect; Network Dissection results are precalculated using Broden)
- Extra: Can you postprocess classnames to be more readable, for example without "/"? How does this change the cosine similarity numbers?
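
As one possible starting point for the extra item, the sketch below turns raw class names into plain phrases. It assumes the entries in `classnames` look like `apartment_building/outdoor` (underscores between words, `/` before a qualifier), which is an assumption about the file format. The helper `clean_classname` is hypothetical and other cleanups may work better.

```python
def clean_classname(name):
    """Replace '/' and '_' with spaces and collapse repeated whitespace.

    Hypothetical helper; assumes names such as 'apartment_building/outdoor'.
    """
    return " ".join(name.replace("/", " ").replace("_", " ").split())


examples = ["airfield", "apartment_building/outdoor"]
print([clean_classname(n) for n in examples])
# ['airfield', 'apartment building outdoor']
```

To try it in the notebook, apply it to every entry, for example `classnames = [clean_classname(n) for n in classnames]`, and rerun the visualization cell to compare the similarity numbers.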