# AIPI 590 - XAI | Assignment #3

# Description

Generate local explanations for individual predictions from a pre-trained black-box model (e.g., ResNet34, Inception, BERT, YOLO, GPT-2). You may use LIME, SHAP, or Anchors for this assignment. At least one visualization of your explanation is required.



Include a discussion that explains why you chose the explanation technique you did. In this discussion, include strengths, limitations, and potential improvements to your approach.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/suneel-nadipalli/xai-assignments-duke-fall24/blob/main/Assignment%203/XAI_Assignment_3_Interpretable_Models.ipynb)

# Suneel Nadipalli

# Setting Up

## Importing Libraries

In [1]:
!pip install lime shap --quiet
!pip install datasets --quiet
!pip install transformers --quiet


In [2]:
import datasets
import numpy as np
import transformers

import shap

## Load the IMDB movie review dataset

In [3]:
dataset = datasets.load_dataset("imdb", split="test")

# truncate each review to its first 500 characters so it fits within the model's input length limit
short_data = [v[:500] for v in dataset["text"][:20]]




## Load and run a sentiment analysis pipeline

In [4]:
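# return scores for every label, not only the top one
# (recent transformers releases deprecate return_all_scores in favor of top_k=None)
classifier = transformers.pipeline("sentiment-analysis", return_all_scores=True)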
classifier(short_data[:2])

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.





[[{'label': 'NEGATIVE', 'score': 0.07582098245620728},
  {'label': 'POSITIVE', 'score': 0.924178957939148}],
 [{'label': 'NEGATIVE', 'score': 0.018342554569244385},
  {'label': 'POSITIVE', 'score': 0.9816573858261108}]]
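
The nested output above holds one list of label/score dictionaries per input text. As a minimal sketch of how it could be reduced to a single predicted label per review, the snippet below copies the two results shown above into a literal; `top_label` is a hypothetical helper, not part of the `transformers` API.

```python
# Output copied from the cell above: one list of {label, score} dicts per input text.
outputs = [
    [{"label": "NEGATIVE", "score": 0.07582098245620728},
     {"label": "POSITIVE", "score": 0.924178957939148}],
    [{"label": "NEGATIVE", "score": 0.018342554569244385},
     {"label": "POSITIVE", "score": 0.9816573858261108}],
]


def top_label(scores):
    """Return (label, score) for the highest-scoring class of one input (hypothetical helper)."""
    best = max(scores, key=lambda entry: entry["score"])
    return best["label"], best["score"]


predictions = [top_label(scores) for scores in outputs]
for label, score in predictions:
    print(f"{label}: {score:.3f}")
```

Both truncated reviews are classified as positive, so the SHAP plots for the `POSITIVE` output are the natural ones to read first.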

# Explanation - SHAP Library

## Explain the sentiment analysis pipeline

In [5]:
# define the explainer
shap_explainer = shap.Explainer(classifier)

In [6]:
# explain the predictions of the pipeline on the first two samples
shap_values = shap_explainer(short_data[:2])

PartitionExplainer explainer: 3it [04:31, 135.54s/it]


In [7]:
shap.plots.text(shap_values[:, :, "POSITIVE"])

In [8]:
shap.plots.text(shap_values[:, :, "NEGATIVE"])


In [9]:
shap_values = shap_explainer(["This movie is great but it could have been shorter!"])
shap.plots.text(shap_values[:, :, "POSITIVE"])

PartitionExplainer explainer: 2it [00:15, 15.31s/it]               


In [11]:
shap_values = shap_explainer(["For all its big-hitting visual ambition, philosophical window dressing and pick-and-mix literary references, this is a work of screaming emptiness."])
shap.plots.text(shap_values[:, :, "POSITIVE"])

PartitionExplainer explainer: 2it [00:47, 47.12s/it]               


In [12]:
classifier(["For all its big-hitting visual ambition, philosophical window dressing and pick-and-mix literary references, this is a work of screaming emptiness."])

[[{'label': 'NEGATIVE', 'score': 0.996156632900238},
  {'label': 'POSITIVE', 'score': 0.003843332175165415}]]

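# Explanation - Why SHAP

I chose SHAP for the following reasons:

- Can be faster than LIME, depending on the explainer and on how many model evaluations each method needs
- Can visualize explanations for a subset of predictions, an aggregate over many predictions, or a single data point
- Its visualizations convey more information than LIME's, e.g. the text plot shows each token's contribution relative to the base value
- Well suited to complex models such as BERT, not only to simpler models
- Generally more stable than LIME, whose random perturbation sampling can produce different explanations across runs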
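
To make concrete what the per-token values represent, here is a minimal sketch that computes exact Shapley values for a three-token input. `toy_model`, its word weights, and the token list are hypothetical stand-ins for the real classifier. The `shap` library does not enumerate coalitions this way for text; this is only the textbook definition applied to a tiny case.

```python
from itertools import combinations
from math import factorial


def toy_model(words):
    """Hypothetical sentiment score: per-word weights plus one interaction term."""
    weights = {"great": 2.0, "boring": -1.5, "not": -0.5}
    score = sum(weights.get(word, 0.0) for word in words)
    # "not" reverses the effect of "great" when both words are present.
    if "not" in words and "great" in words:
        score -= 4.0
    return score


def shapley_values(tokens, model):
    """Exact Shapley values: weighted average of each token's marginal
    contribution over every coalition of the other tokens."""
    n = len(tokens)
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [tokens[j] for j in subset]
                with_i = without_i + [tokens[i]]
                phi += weight * (model(with_i) - model(without_i))
        values.append(phi)
    return values


tokens = ["not", "great", "movie"]
phi = shapley_values(tokens, toy_model)
for token, value in zip(tokens, phi):
    print(f"{token}: {value:+.3f}")
```

The values always sum to the difference between the model output on the full input and on the empty input, which is the property that lets a SHAP plot be read as a decomposition of a single prediction.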

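Situations where it could fail:

- For larger models, longer inputs, and larger datasets, SHAP can slow down considerably; explaining the two 500-character reviews above already took about four and a half minutes
- Explanations can mask existing biases in the model or data rather than expose them
- Kernel SHAP ignores feature dependence; note that this notebook ran SHAP's Partition explainer, not Kernel SHAP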
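
The slowdown follows from how Shapley values are defined: an exact computation scores every subset of tokens, which grows as 2^n. The sketch below contrasts that count with a permutation-sampling estimate. This is a different and simpler approximation than the Partition explainer used in this notebook, and `toy_model` is a hypothetical stand-in for the classifier.

```python
import random


def toy_model(present):
    """Hypothetical score over a set of token indices (stand-in for a real classifier)."""
    score = 0.1 * len(present)
    if 0 in present and 1 in present:  # a single interaction between tokens 0 and 1
        score += 1.0
    return score


def sampled_shapley(n_tokens, model, n_permutations, seed=0):
    """Estimate Shapley values by averaging marginal contributions over random orderings."""
    rng = random.Random(seed)
    totals = [0.0] * n_tokens
    for _ in range(n_permutations):
        order = list(range(n_tokens))
        rng.shuffle(order)
        present = set()
        previous = model(present)
        for index in order:
            present.add(index)
            current = model(present)
            totals[index] += current - previous
            previous = current
    return [total / n_permutations for total in totals]


n_tokens = 20
n_permutations = 200
exact_evaluations = 2 ** n_tokens                        # coalitions an exact computation would score
sampled_evaluations = n_permutations * (n_tokens + 1)    # model calls made by the estimate below
estimate = sampled_shapley(n_tokens, toy_model, n_permutations)
print(exact_evaluations, sampled_evaluations)
```

Sampling trades accuracy for cost: each individual value is only an estimate, although the estimates still sum to the full-input score minus the empty-input score.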

Potential Improvements:

- Use SHAP with different kinds of models to compare how long it takes and what insights it provides
- Compare against LIME to see how the interpretations differ
- Try different dataset sizes, and try Kernel SHAP as well
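
One way to start on the dataset-size experiment is a small timing harness, sketched below. `time_explainer` and `stand_in_explainer` are hypothetical names; in the notebook the callable would presumably be `shap_explainer`, applied to slices of `short_data`.

```python
import time


def time_explainer(explain_fn, texts, sizes):
    """Return {batch_size: seconds} for running explain_fn on the first `size` texts."""
    timings = {}
    for size in sizes:
        batch = texts[:size]
        start = time.perf_counter()
        explain_fn(batch)
        timings[size] = time.perf_counter() - start
    return timings


def stand_in_explainer(batch):
    """Placeholder workload so the sketch runs without downloading a model."""
    return [sum(ord(char) for char in text) for text in batch]


texts = ["sample review text"] * 8
timings = time_explainer(stand_in_explainer, texts, sizes=[1, 2, 4, 8])
for size, seconds in timings.items():
    print(f"{size} texts: {seconds:.6f}s")
```

The same harness could time a LIME explainer on identical batches, which would give the speed comparison between the two methods a measured basis.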