# Testing Captum + Text Generation

This notebook tests the Captum package with text generation models from the Hugging Face Hub, loaded via Hugging Face Transformers.

### Tested Models
- GPT-2
- GODEL
- Mistral 7B Instruct
- Llama 2 7B Chat (HF Version)

#### Tested Interpretability Implementation
Tests were run with thesis-custom, which introduces a few minor fixes.

### Hardware Acceleration
This was run on a hardware-accelerated Google Colab notebook with 50GB of RAM. **Using less RAM will lead to issues.** Loading all models in the same session will also lead to crashes (e.g. Mistral Instruct alone takes up 30GB of memory).


Additionally, a GPU **should be used**.
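
As a rough sanity check on these numbers, the memory needed for model weights can be estimated from the parameter count and the numeric precision. The sketch below is only a back-of-the-envelope calculation: `estimate_weight_memory_gb` is a hypothetical helper, the 7 billion parameter count is a rounded assumption, and activations and framework overhead are ignored.

```python
# Hypothetical helper (not part of the notebook): weights-only memory estimate.
def estimate_weight_memory_gb(n_params: int, bytes_per_param: int) -> float:
    """Return the approximate size of the model weights in gigabytes (10^9 bytes)."""
    return n_params * bytes_per_param / 1e9


# Assumption: a "7B" model has roughly 7 billion parameters.
n_params_7b = 7_000_000_000

fp32_gb = estimate_weight_memory_gb(n_params_7b, 4)  # 32-bit floats, 4 bytes each
fp16_gb = estimate_weight_memory_gb(n_params_7b, 2)  # 16-bit floats, 2 bytes each

print(f"float32: {fp32_gb:.1f} GB, float16: {fp16_gb:.1f} GB")
```

With 4 bytes per parameter the estimate is about 28 GB, which is in the same range as the roughly 30GB observed for Mistral Instruct.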

---

**Do not run this entire Jupyter notebook at once, as it will almost certainly crash. Instead, check the individual code blocks and run them one by one.**

---


## Installation, Imports and Setup

### Tokens for Downloads

Without a GitHub token, the custom variants of shap and captum cannot be installed. Without a Hugging Face token, Llama 2 cannot be loaded from the Hugging Face Hub. Both tokens are therefore needed to run the notebook.

This is set up for Colab. Alternatively, the commented-out string variant below can be used; replace the placeholder string with an actual token.

*   GitHub [Token Info](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens)
*   Hugging Face [Token Info](https://huggingface.co/docs/hub/security-tokens)

In [7]:
# grabbing tokens for repository and model access
from google.colab import userdata

gh_token = userdata.get("GITHUB_TOKEN")
hgf_token = userdata.get("HGF_TOKEN")

# gh_token="TOKEN"
# hgf_token="TOKEN"
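
Outside of Colab, hard-coding tokens in the notebook can be avoided by reading them from environment variables. The following is a minimal sketch of such a fallback; `get_token` is a hypothetical helper that is not part of the notebook, and it assumes the tokens are exported under the same names as the Colab secrets.

```python
import os


def get_token(name: str, default: str = "") -> str:
    """Return a secret from Colab userdata if available, else from the environment."""
    try:
        # only importable inside a Colab runtime
        from google.colab import userdata

        value = userdata.get(name)
        if value:
            return value
    except Exception:
        # not running in Colab, or the secret is missing or not accessible
        pass
    # fall back to an environment variable with the same name
    return os.environ.get(name, default)


gh_token = get_token("GITHUB_TOKEN")
hgf_token = get_token("HGF_TOKEN")
```

If neither source provides a value, the helper returns an empty string, so the later install and download steps would still fail for the private repository and the gated model.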

### Installs and Imports

In [2]:
# basic installs and additional utilities (usually not needed in colab)
!pip install matplotlib
!pip install numpy
!pip install pandas
!pip install seaborn
!pip install ipywidgets
!pip install ipython

# model package installs
!pip install torch
!pip install transformers
!pip install huggingface_hub
!pip install accelerate
# the PyPI package is named scikit-learn; installing "sklearn" fails
!pip install scikit-learn


In [1]:
# installing captum package from GitHub repository
!pip install git+https://${gh_token}@github.com/LennardZuendorf/thesis-captum.git

# alternatively captum can be installed from pip
## !pip install captum

Collecting git+https://****@github.com/LennardZuendorf/thesis-captum.git
  Cloning https://****@github.com/LennardZuendorf/thesis-captum.git to /tmp/pip-req-build-xyj8anie
  Running command git clone --filter=blob:none --quiet 'https://****@github.com/LennardZuendorf/thesis-captum.git' /tmp/pip-req-build-xyj8anie
  Resolved https://****@github.com/LennardZuendorf/thesis-captum.git to commit 7dd85e4a2762b0d2c9850c33c966fb9d049dd909
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


In [2]:
# basic imports
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# model imports
import torch
import transformers

# interpretability import
import captum

### Setup Models



In [3]:
# setting device based on available hardware
if torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")

print(f"Device set to {device}.")

Device set to cpu.


In [4]:
# setup gpt-2 and godel model and tokenizer
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoModelForSeq2SeqLM

# gpt and godel loading function so this can be run individually
def load_gd_gpt():

    # load tokenizer and model from huggingface
    gpt_tokenizer = AutoTokenizer.from_pretrained("gpt2", use_fast=True)
    gpt_model = AutoModelForCausalLM.from_pretrained("gpt2")

    # manage setup based on available device
    gpt_model.to(device)

    # update model config
    gpt_model.config.is_decoder = True
    gpt_model.config.max_new_tokens = 50
    gpt_model.config.do_sample = True

    # load tokenizer and model from huggingface
    gd_tokenizer = AutoTokenizer.from_pretrained("microsoft/GODEL-v1_1-large-seq2seq")
    gd_model = AutoModelForSeq2SeqLM.from_pretrained(
        "microsoft/GODEL-v1_1-large-seq2seq"
    )

    # manage setup based on available device
    gd_model.to(device)

    # update GODEL model config
    gd_model.config.is_decoder = True
    gd_model.config.max_new_tokens = 50
    gd_model.config.do_sample = True

    return gpt_model, gpt_tokenizer, gd_model, gd_tokenizer

#### Setup Mistral and Llama 2

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM

# mistral loading function, so this doesn't run automatically on load
def load_mistral():

    # load tokenizer and model from huggingface
    mistral_tokenizer = AutoTokenizer.from_pretrained(
        "mistralai/Mistral-7B-Instruct-v0.2"
    )
    mistral_model = AutoModelForCausalLM.from_pretrained(
        "mistralai/Mistral-7B-Instruct-v0.2"
    )

    # manage setup based on available device
    mistral_model.to(device)

    # update model config
    mistral_model.config.is_decoder = True
    mistral_model.config.max_length = 50
    mistral_model.config.no_repeat_ngram_size = 2
    mistral_model.config.do_sample = True

    return mistral_model, mistral_tokenizer

In [5]:
from transformers import AutoTokenizer, AutoModelForCausalLM

# llama loading function, so this doesn't run automatically on load
def load_llama():

    # load tokenizer and model from huggingface
    llama_tokenizer = AutoTokenizer.from_pretrained(
        "meta-llama/Llama-2-7b-chat-hf", token=hgf_token
    )
    llama_model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-chat-hf", token=hgf_token
    )

    # manage setup based on available device
    llama_model.to(device)

    # update model config
    llama_model.config.is_decoder = True
    llama_model.config.max_length = 50

    # update tokenizer config
    llama_tokenizer.pad_token = llama_tokenizer.eos_token

    return llama_model, llama_tokenizer

**(Loading all models in parallel will overload the 50GB of RAM)**

-> load either GPT-2 + GODEL **or** Mistral **or** Llama 2

In [None]:
# loading gpt and godel model and tokenizer
gpt_model, gpt_tokenizer, gd_model, gd_tokenizer = load_gd_gpt()
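
Before loading another model in the same session, the models loaded so far have to be released. A minimal sketch, assuming the notebook variables are the only references to the models; `free_memory` is a hypothetical helper and not part of the notebook:

```python
import gc


def free_memory() -> None:
    """Run garbage collection and, if PyTorch with CUDA is present, clear its cache."""
    gc.collect()
    try:
        import torch  # optional here, so the helper also works without PyTorch

        if torch.cuda.is_available():
            # release cached GPU blocks that are no longer referenced
            torch.cuda.empty_cache()
    except ImportError:
        pass


# usage: drop the references first, then free memory, e.g.
# del gpt_model, gd_model
# free_memory()
free_memory()
```

Whether this frees enough memory in practice depends on the runtime; restarting the session remains the more reliable option.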

In [None]:
# loading mistral model and tokenizer
mistral_model, mistral_tokenizer = load_mistral()

In [8]:
# loading llama model and tokenizer
llama_model, llama_tokenizer = load_llama()




## Captum Testing Code

### Simple Testing Code with GPT-2

In [None]:
# defining test input
test_input = "Harry is a lawyer on the east coast, his hobbies include"

In [None]:
# interpretability creation function for captum powered interpretability methods
# CREDIT: Adapted from Miglani, V., Yang, A., Markosyan, A.H., Garcia-Olano, D. and Kokhlikyan, N., 2023.
## Using Captum to Explain Generative Language Models. arXiv preprint arXiv:2312.05491.
from captum.attr import ShapleyValueSampling, TextTokenInput, LLMAttribution

# creating llm attribution instance
llm_attribution = LLMAttribution(ShapleyValueSampling(gpt_model), gpt_tokenizer)

# setting input, running attribution, printing attribution plot and values
attribution_input = TextTokenInput(test_input, gpt_tokenizer)
attribution_result = llm_attribution.attribute(attribution_input)
attribution_result.plot_token_attr()
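
For intuition about what `ShapleyValueSampling` estimates, the following is a toy sketch of permutation-based Shapley value sampling in plain Python. It is not Captum's implementation: `shapley_sampling` and the additive `toy_value` function are made up for illustration, and the features play the role that input tokens play above.

```python
import random


def shapley_sampling(value_fn, n_features, n_samples=200, seed=0):
    """Estimate Shapley values by averaging marginal contributions over random orderings."""
    rng = random.Random(seed)
    totals = [0.0] * n_features

    for _ in range(n_samples):
        # draw a random order in which features are added to the empty set
        order = list(range(n_features))
        rng.shuffle(order)

        present = set()
        previous = value_fn(present)
        for i in order:
            present.add(i)
            current = value_fn(present)
            # marginal contribution of feature i given the features added before it
            totals[i] += current - previous
            previous = current

    return [total / n_samples for total in totals]


# toy "model": the output is the sum of the weights of all present features
weights = [1.0, 2.0, 0.5]


def toy_value(present):
    return sum(weights[i] for i in present)


estimates = shapley_sampling(toy_value, n_features=len(weights))
print(estimates)
```

Because the toy model is additive, every ordering yields the same marginal contribution and the estimates equal the weights. For a language model, each call to `value_fn` roughly corresponds to a model evaluation on a perturbed prompt, which is why runtimes grow quickly with prompt length.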

### Extended Captum Testing

#### Utility Functions

In [9]:
# formatting functions for output text and tokens
import re
import time  # used by prettyfy_second below

# function to format the model response nicely
def format_output_text(output: list):
    # remove special tokens from list and drop tokens that end up empty
    formatted_output = [t for t in format_tokens(output) if t != ""]

    # return an empty string if nothing is left
    if not formatted_output:
        return ""

    # start string with the first non-empty list item
    # (starting with the second item and then looping from it would duplicate it)
    output_str = formatted_output[0]

    # add all other list items with a space in between
    for txt in formatted_output[1:]:
        # check if the token is a punctuation mark
        if txt in [".", ",", "!", "?"]:
            # add punctuation mark without space
            output_str += txt
        else:
            # add token with a leading space
            output_str += " " + txt

    # return the combined string with multiple spaces removed
    return re.sub(" +", " ", output_str)


# format the tokens by removing special tokens and special characters
def format_tokens(tokens: list):
    # define special tokens to remove and initialize empty list
    special_tokens = ["[CLS]", "[SEP]", "[PAD]", "[UNK]", "[MASK]", "▁", "Ġ", "</w>"]
    updated_tokens = []

    # loop through tokens
    for t in tokens:
        # remove special token from start of token if found
        if t.startswith("▁"):
            t = t.lstrip("▁")

        # loop through special tokens and remove them if found
        for s in special_tokens:
            t = t.replace(s, "")

        # add token to list
        updated_tokens.append(t)

    # return the list of tokens
    return updated_tokens


# function to remove orphan whitespaces in a list of texts
def remove_orphan_whitespaces(texts: list):
    # instantiating a new empty list
    cleaned_list = []

    # looping over the list, keeping entries that are neither empty nor a single space
    for text in texts:
        if text != " " and text != "":
            cleaned_list.append(text)

    # returning the cleaned list
    return cleaned_list


# function to prettyfy formatting of runtime seconds
def prettyfy_second(runtime_float):
    # formats seconds runtime in nice format
    runtime_str = time.strftime("%H:%M:%S", time.gmtime(runtime_float))
    return runtime_str
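
One caveat of the `time.strftime`/`time.gmtime` approach is that the hour field wraps around after 24 hours. Should an attribution run ever take that long, a variant based on `divmod` avoids the wrap. This is a sketch; `format_runtime` is a hypothetical name and not part of the notebook:

```python
def format_runtime(seconds: float) -> str:
    """Format a duration in seconds as HH:MM:SS without wrapping at 24 hours."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


print(format_runtime(705))    # 11 minutes and 45 seconds
print(format_runtime(90000))  # 25 hours
```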

#### Captum Runner Functions

In [10]:
# CREDIT: Adapted from Miglani, V., Yang, A., Markosyan, A.H., Garcia-Olano, D. and Kokhlikyan, N., 2023.
## Using Captum to Explain Generative Language Models. arXiv preprint arXiv:2312.05491.
# interpretability creation function for captum powered interpretability methods
from captum.attr import (
    ShapleyValueSampling,
    KernelShap,
    Lime,
    TextTokenInput,
    LLMAttribution,
    LLMGradientAttribution,
)

import time

# function for shapley value sampling attribution with captum
def shapley_attribution(test_input: str, model, tokenizer):

    # creating llm attribution class with Shapley Value Sampling and given Model, Tokenizer
    llm_attribution = LLMAttribution(ShapleyValueSampling(model), tokenizer)

    # generating attribution
    attribution_input = TextTokenInput(test_input, tokenizer)
    runtime = time.time()
    attribution_result = llm_attribution.attribute(attribution_input)
    print(f"runtime is: {prettyfy_second(time.time()-runtime)}")

    return attribution_result


# function for KernelSHAP attribution with captum
def kernel_attribution(test_input: str, model, tokenizer):

    # creating llm attribution class with KernelSHAP and given Model, Tokenizer
    llm_attribution = LLMAttribution(KernelShap(model), tokenizer)

    # generating attribution
    attribution_input = TextTokenInput(test_input, tokenizer)
    runtime = time.time()
    attribution_result = llm_attribution.attribute(attribution_input)
    print(f"runtime is: {prettyfy_second(time.time()-runtime)}")

    return attribution_result


# function to create explanation with LIME
def lime_attribution(test_input: str, model, tokenizer):

    # creating llm attribution class with LIME and given Model, Tokenizer
    llm_attribution = LLMAttribution(Lime(model), tokenizer)

    # generating attribution
    attribution_input = TextTokenInput(test_input, tokenizer)
    runtime = time.time()
    attribution_result = llm_attribution.attribute(attribution_input)
    print(f"runtime is: {prettyfy_second(time.time()-runtime)}")

    return attribution_result
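
The three runner functions share the same timing pattern. One way to factor it out is a small wrapper around any callable. The sketch below uses only the standard library; `timed` and `slow_square` are hypothetical names that are not part of the notebook:

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return its result together with the elapsed seconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# stand-in for a call such as llm_attribution.attribute(attribution_input)
def slow_square(x):
    time.sleep(0.01)
    return x * x


value, seconds = timed(slow_square, 4)
print(f"result: {value}, runtime: {seconds:.3f}s")
```

`time.perf_counter` is used here instead of `time.time` because it is a monotonic clock intended for measuring intervals.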

#### Testing with GODEL

In [None]:
# formatting function to format input for the model
# CREDIT: Adapted from official inference example on Huggingface
## see https://huggingface.co/microsoft/GODEL-v1_1-large-seq2seq
def gd_format_prompt(
    message: str = "Does money buy happiness?",
    system_prompt: str = "Given a dialog context, you need to respond empathically.",
    knowledge: str = "",
):

    # adds knowledge text if not empty
    if knowledge != "":
        knowledge = "[KNOWLEDGE] " + knowledge

    # adds the message to the prompt
    prompt = f" {message}"
    # combines the entire prompt
    full_prompt = f"{system_prompt} [CONTEXT] {prompt} {knowledge}"

    # returns the formatted prompt
    return full_prompt

In [None]:
# running godel with kernelSHAP attribution
gd_kernel_att = kernel_attribution(gd_format_prompt(), gd_model, gd_tokenizer)
gd_kernel_att.plot_seq_attr()

##### Comment
GODEL is a sequence-to-sequence (encoder-decoder) model, which Captum's LLM attribution methods do not currently support, since they expect a causal, decoder-only language model. This is also evident in the error message.
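
A guard along the following lines could skip unsupported models before an attribution run starts. It is a sketch that relies on the `is_encoder_decoder` flag of Hugging Face model configs; `supports_llm_attribution` is a hypothetical helper, and treating every encoder-decoder model as unsupported is an assumption based on the observation above. The `SimpleNamespace` objects only stand in for real models:

```python
from types import SimpleNamespace


def supports_llm_attribution(model) -> bool:
    """Return False for encoder-decoder (seq2seq) models, True otherwise."""
    return not getattr(model.config, "is_encoder_decoder", False)


# stand-ins for loaded models, carrying only the config flag used above
fake_gpt2 = SimpleNamespace(config=SimpleNamespace(is_encoder_decoder=False))
fake_godel = SimpleNamespace(config=SimpleNamespace(is_encoder_decoder=True))

print(supports_llm_attribution(fake_gpt2))
print(supports_llm_attribution(fake_godel))
```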

#### Mistral

In [None]:
# formatting function to format input for the model
# CREDIT: Inspired by official documentation and example on Huggingface
## see https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
def mistral_format_prompt(
    message: str = "Does money buy happiness?",
    system_prompt: str = "Given a dialog context, you need to respond empathically.",
):
    prompt = (
        f"<s>[INST] {system_prompt} [/INST] Hello, how can I assist you"
        f" today?</s>[INST] {message} [/INST]"
    )
    return prompt

In [None]:
# running mistral with kernelSHAP attribution and plotting results
mistral_kernel_att = kernel_attribution(
    mistral_format_prompt(), mistral_model, mistral_tokenizer
)
mistral_kernel_att.plot_seq_attr()

In [None]:
# running mistral with shapley values attribution and plotting results
mistral_shapley_att = shapley_attribution(
    mistral_format_prompt(), mistral_model, mistral_tokenizer
)
mistral_shapley_att.plot_token_attr()

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


In [None]:
# running mistral with lime attribution and plotting results
mistral_lime_att = lime_attribution(
    mistral_format_prompt(), mistral_model, mistral_tokenizer
)
mistral_lime_att.plot_seq_attr()

#### Llama 2

In [14]:
# formatting function to format input for the model
# CREDIT: Adapted from Philipp Schmid
## see https://www.philschmid.de/llama-2#how-to-prompt-llama-2-chat
def llama_format_prompt(
    message: str = "Does money buy happiness?",
    system_prompt: str = "Given a dialog context, you need to respond empathically.",
):
    prompt = f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{message} [/INST]"
    return prompt

In [12]:
# running llama with KernelSHAP attribution and plotting results
llama_kernel_att = kernel_attribution(
    llama_format_prompt(), llama_model, llama_tokenizer
)
llama_kernel_att.plot_seq_attr()



runtime is: 00:11:45


In [15]:
# running llama with shapley value attribution and plotting results
llama_shapley_att = shapley_attribution(
    llama_format_prompt(), llama_model, llama_tokenizer
)
llama_shapley_att.plot_token_attr()



KeyboardInterrupt: 

In [13]:
# running llama with lime attribution and plotting results
llama_lime_att = lime_attribution(llama_format_prompt(), llama_model, llama_tokenizer)
llama_lime_att.plot_seq_attr()



runtime is: 00:12:02
