This notebook implements Task 1 of [Assignment 4](https://github.com/sprintml/tml_2024/blob/main/Assignment4.pdf) for the Trustworthy Machine Learning course offered in the Summer Semester 2024 at Saarland University. The task focuses on obtaining explanations for the last three layers (`layer3`, `layer4`, `fc`) of ResNet-18 models trained on the Places365 and ImageNet datasets using the [CLIP-Dissect](https://github.com/Trustworthy-ML-Lab/CLIP-dissect) library, and on explaining the predictions made by a ResNet-50 model. The report analyzing the results of this task can be accessed [here](https://github.com/nupur412/TML_Assignment4_Explainability/blob/main/TML_Task_1_Report.pdf).

In [None]:
# Standard library
import datetime
import json
import os
import sys
from collections import OrderedDict

# Third-party
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import torch
from PIL import Image
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms
from torchvision.models import ResNet18_Weights

Download the ResNet-18 model trained on Places365.

In [None]:
! wget --progress=bar http://places2.csail.mit.edu/models_places365/resnet18_places365.pth.tar
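The downloaded file is a training checkpoint rather than a bare weights file. Below is a minimal sketch of preparing it for torchvision's `resnet18`, assuming (as is common for checkpoints saved from `torch.nn.DataParallel`) that the weights sit under a `state_dict` key and that each parameter name carries a `module.` prefix. The helper name `strip_module_prefix` is hypothetical and not part of any library.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading prefix removed from each key.

    Keys that do not start with the prefix are copied unchanged.
    """
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Tiny stand-in for a real state dict, just to show the key renaming.
example = OrderedDict([("module.fc.weight", 1), ("conv1.weight", 2)])
print(list(strip_module_prefix(example).keys()))

# Assumed usage with the downloaded checkpoint (not executed here):
# checkpoint = torch.load("resnet18_places365.pth.tar", map_location="cpu")
# model = models.resnet18(num_classes=365)
# model.load_state_dict(strip_module_prefix(checkpoint["state_dict"]))
```

If the checkpoint turns out to store plain parameter names, the helper leaves them untouched, so calling it is harmless in either case.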

In [None]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Clone the CLIP-Dissect library, which is used to obtain explanations of the model predictions.

In [None]:
! git clone https://github.com/Trustworthy-ML-Lab/CLIP-dissect.git

Download the Broden dataset, which serves as the probing dataset.

In [None]:
! bash /content/CLIP-dissect/dlbroden.sh

Before running the commands in the next cells, we modify `describe_neurons.py` so that the images corresponding to each neuron can be accessed. The updated file is available [here](https://drive.google.com/file/d/1Qhzn1mCPiNsVAMp0avN8g1AmOY_N_ULt/view?usp=drive_link). The changes build on the code from [this experiment](https://github.com/Trustworthy-ML-Lab/CLIP-dissect/blob/main/experiments/fig5_use_case.ipynb) in the CLIP-Dissect library.

In [None]:
! python /content/CLIP-dissect/describe_neurons.py --target_model resnet18_places --target_layers layer3,layer4,fc --d_probe broden --batch_size 200 --device cuda --pool_mode avg

In [None]:
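# Same run for the ResNet-18 trained on ImageNet; the dangling --concept_set flag (no value given) is dropped so the script's default concept set is used
! python /content/CLIP-dissect/describe_neurons.py --target_model resnet18 --target_layers layer3,layer4,fc --d_probe broden --batch_size 200 --device cuda --pool_mode avg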

Running the commands above generates an activation file for each layer, plus a description file that lists, for every neuron in every layer, the concept it was matched to and the corresponding similarity score.
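To get a quick feel for the description file, one option is to list the highest-scoring neurons per layer. The sketch below assumes the file has `layer`, `unit`, `description`, and `similarity` columns. The helper name `top_units_per_layer` and the toy rows are made up for illustration and are not actual results.

```python
import pandas as pd


def top_units_per_layer(description_df, k=3):
    """Return the k highest-similarity rows for each layer."""
    # Sort by layer name, then by similarity from high to low.
    ranked = description_df.sort_values(
        ["layer", "similarity"], ascending=[True, False]
    )
    # head(k) keeps the first k rows of every layer group in sorted order.
    return ranked.groupby("layer").head(k).reset_index(drop=True)


# Toy data standing in for descriptions.csv (values are illustrative only).
toy = pd.DataFrame({
    "layer": ["layer3", "layer3", "layer3", "layer4", "layer4"],
    "unit": [0, 1, 2, 0, 1],
    "description": ["grass", "sky", "water", "kitchen", "bed"],
    "similarity": [0.1, 0.5, 0.3, 0.4, 0.2],
})
top_toy = top_units_per_layer(toy, k=2)
print(top_toy)
```

On the real file, the same call would be `top_units_per_layer(pd.read_csv(description_file))` once `description_file` points at the generated CSV.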

In [None]:
activation_files = {
    'fc': '/content/CLIP-dissect/saved_activations/broden_resnet18_places_fc.pt',
    'layer3': '/content/CLIP-dissect/saved_activations/broden_resnet18_places_layer3.pt',
    'layer4': '/content/CLIP-dissect/saved_activations/broden_resnet18_places_layer4.pt'
}
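The three paths above share one naming pattern, `<probe>_<target model>_<layer>.pt`. When the analysis is repeated for another model, a small helper can rebuild the dictionary instead of editing each path by hand. This is only a sketch: `activation_paths` is a hypothetical helper, and the assumption that the ImageNet run saves its files under the target name `resnet18` is inferred from the `--target_model` flag, not verified.

```python
def activation_paths(target_name, layers, d_probe="broden",
                     root="/content/CLIP-dissect/saved_activations"):
    """Build a {layer: path} dictionary following the naming pattern above."""
    return {
        layer: f"{root}/{d_probe}_{target_name}_{layer}.pt"
        for layer in layers
    }


layers = ["fc", "layer3", "layer4"]

# Reproduces the Places365 dictionary defined above.
places_files = activation_paths("resnet18_places", layers)

# Assumed file names for the ImageNet-trained model.
imagenet_files = activation_paths("resnet18", layers)

print(places_files["fc"])
print(imagenet_files["fc"])
```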

The following code blocks are shown for the Places365 model. They are re-executed with the corresponding paths to obtain results for ResNet-18 trained on ImageNet.

In [None]:
# Load the description file into a DataFrame
description_file = '/content/CLIP-dissect/results/resnet18_places/descriptions.csv'
df = pd.read_csv(description_file, sep=',', header=0)
concepts_places365 = df['description'].tolist()
unique_concepts_places365 = set(concepts_places365)
num_objects_places365 = len(unique_concepts_places365)

# Count how many neurons were assigned each concept
concept_counts = df['description'].value_counts()
print(concept_counts)

concept_counts.to_csv('/content/CLIP-dissect/results/resnet18_places/concept_counts.csv', header=['count'])

Plot, for each layer, the concepts learned by the largest number of neurons.

In [None]:
if 'description' in df.columns and 'layer' in df.columns:
    top_concepts_per_layer = {}

    # Only the layer names are needed here; the activation file paths are not read
    for layer in activation_files:
        layer_df = df[df['layer'] == layer]

        # Count the occurrences of each concept in the current layer
        concept_counts = layer_df['description'].value_counts()

        # Get the top 20 concepts for the current layer
        top_concepts = concept_counts.nlargest(20)
        top_concepts_per_layer[layer] = top_concepts

        # Save the top concepts to a csv file
        output_csv_path = f'/content/CLIP-dissect/results/resnet18_places/concept_counts_{layer}.csv'
        top_concepts.to_csv(output_csv_path, header=['count'])

        # Plot the top 20 concept counts
        plt.figure(figsize=(10, 5))
        top_concepts.plot(kind='bar')
        plt.title(f'Top 20 Concepts Learned by Most Neurons in {layer}. Total concepts - {len(concept_counts)}')
        plt.xlabel('Concepts')
        plt.ylabel('Number of Neurons')
        plt.tight_layout()

        output_image_path = f'/content/CLIP-dissect/results/resnet18_places/concept_counts_plot_{layer}.png'
        plt.savefig(output_image_path)
        plt.show()

The following function plots the distribution of similarity scores as a histogram with a kernel density estimate.

In [None]:
def analyze_similarity_scores(description_df, model_name,
                              output_path='/content/CLIP-dissect/results/resnet18_places/similarity_scores.png'):
    # Similarity between each neuron and its assigned concept
    similarity_scores = description_df['similarity']

    plt.figure(figsize=(10, 6))
    sns.histplot(similarity_scores, bins=30, kde=True)
    plt.title(f'Similarity Scores Distribution in {model_name}')
    plt.xlabel('Similarity Score')
    plt.ylabel('Frequency')
    # Pass a different output_path when re-running for another model to avoid overwriting this plot
    plt.savefig(output_path)
    plt.show()

analyze_similarity_scores(df, 'ResNet18 (Places365)')
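The histogram pools all three layers together. As a complement, a per-layer numeric summary can show whether some layers match their concepts more strongly than others. The sketch below assumes the description DataFrame has `layer` and `similarity` columns. The helper name `summarize_similarity_by_layer` and the toy values are illustrative only and are not actual results.

```python
import pandas as pd


def summarize_similarity_by_layer(description_df):
    """Return count, mean, and median similarity for each layer."""
    return description_df.groupby("layer")["similarity"].agg(
        ["count", "mean", "median"]
    )


# Toy data standing in for the real description file.
toy_scores = pd.DataFrame({
    "layer": ["layer3", "layer3", "fc"],
    "similarity": [0.25, 0.75, 0.5],
})
summary = summarize_similarity_by_layer(toy_scores)
print(summary)
```

With the notebook's data, the call would be `summarize_similarity_by_layer(df)`.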