<a href="https://colab.research.google.com/github/LennardZuendorf/thesis-files/blob/main/exploration%20shap.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Testing SHAP + Text Generation

This notebook tests SHAP with text generation models from the Huggingface Hub, loaded via Huggingface Transformers.

### Tested Models
- GPT-2
- GODEL
- Mistral 7B Instruct
- LlaMa 2 7B Chat (HF Version)

#### Tested Interpretability Implementation
Tests run with thesis-shap. Thesis SHAP is a fork of the shap package, updated with logs and a few fixes.

### Hardware Acceleration
This was run on a hardware-accelerated Google Colab notebook with 50GB of RAM. **Using less RAM will lead to issues.** Loading all models in the same session will also lead to crashes (e.g. Mistral Instruct alone takes up 30GB of memory).

Additionally, a GPU can be used, but shap does not use GPU acceleration and the models are reasonably fast on CPU alone.
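
The 30GB figure is roughly what the parameter count alone would suggest. As a back-of-the-envelope sketch (the helper name `estimate_weight_memory_gb` and the round figure of 7 billion parameters are assumptions for illustration; activations and other overhead are ignored), weights stored as 32-bit floats need about four bytes per parameter:

```python
def estimate_weight_memory_gb(n_params: float, bytes_per_param: int = 4) -> float:
    """Rough memory needed for the model weights alone, in GB (10^9 bytes)."""
    return n_params * bytes_per_param / 1e9


# assumption: a "7B" model has roughly 7e9 parameters, stored as float32
full_precision_gb = estimate_weight_memory_gb(7e9)

# the same weights at 2 bytes per parameter (e.g. float16)
half_precision_gb = estimate_weight_memory_gb(7e9, bytes_per_param=2)

print(full_precision_gb, half_precision_gb)
```

Under these assumptions, two 7B models in full precision would already exceed the 50GB available in this session, which matches the advice to load them one at a time.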

---

**Do not run this entire Jupyter Notebook all at once, as it will almost certainly crash. Instead, check the individual code blocks and run them one by one.**

---



## Installation, Imports and Setup

### Tokens for Downloads

Without a GitHub token, the forked variant of shap cannot be installed. Without a Huggingface token (HGF token), Llama cannot be loaded from the Huggingface Hub.

This is set up for Colab; alternatively, the commented-out string variant below can be used. To do so, replace the string with an actual token.

*   Github [Token Info](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens)
*   Huggingface [Token Info](https://huggingface.co/docs/hub/security-tokens)
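
Outside of Colab, one possible alternative to hard-coding tokens is to read them from environment variables. The following is a minimal sketch, not part of the original setup: the helper name `read_token` is hypothetical, and it assumes the tokens were exported under the same names used as Colab secrets.

```python
import os


def read_token(name: str):
    """Return a secret from Colab if available, else from the environment."""
    try:
        # this module is only importable inside Google Colab
        from google.colab import userdata
    except ImportError:
        # assumption: outside Colab the token was exported as an environment variable
        return os.environ.get(name)
    return userdata.get(name)
```

With this helper, `gh_token = read_token("GITHUB_TOKEN")` would behave the same in Colab and return `None` elsewhere if the variable is not set.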


In [None]:
# grabbing tokens for repository and model access
from google.colab import userdata

gh_token=userdata.get('GITHUB_TOKEN')
hgf_token=userdata.get('HGF_TOKEN')

#gh_token="TOKEN"
#hgf_token="TOKEN"

### Installs and Imports

In [None]:
# basic installs and additional utilities (usually not needed in colab)
!pip install matplotlib
!pip install numpy
!pip install pandas
!pip install ipywidgets
!pip install ipython

# model package installs
!pip install torch
!pip install transformers
!pip install huggingface_hub
!pip install accelerate

In [None]:
# installing shap package from GitHub repository
# {gh_token} is expanded by IPython from the Python variable defined above
!pip install git+https://{gh_token}@github.com/LennardZuendorf/thesis-shap.git

# alternatively shap can be installed from pip
# !pip install shap

In [None]:
# basic imports
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# model imports
import torch
import transformers

# interpretability import
import shap

### Setup Models

In [None]:
# setup gpt2 and godel model and tokenizer from huggingface
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoModelForSeq2SeqLM

# gpt and godel loading function so this can be run individually
def load_gd_gpt():

  # load tokenizer and model from huggingface
  gpt_tokenizer = AutoTokenizer.from_pretrained("gpt2",  use_fast=True)
  gpt_model = AutoModelForCausalLM.from_pretrained("gpt2")

  # manage setup based on available device
  device = torch.device("cpu")
  gpt_model.to(device)

  # update model config
  gpt_model.config.is_decoder = True
  gpt_model.config.max_new_tokens=50
  gpt_model.config.do_sample=True


  # load tokenizer and model from huggingface
  gd_tokenizer = AutoTokenizer.from_pretrained("microsoft/GODEL-v1_1-large-seq2seq")
  gd_model = AutoModelForSeq2SeqLM.from_pretrained("microsoft/GODEL-v1_1-large-seq2seq")

  # manage setup based on available device
  device = torch.device("cpu")
  gd_model.to(device)

  # update GODEL model config
  gd_model.config.max_new_tokens=50
  gd_model.config.do_sample=True

  return gpt_model, gpt_tokenizer, gd_model, gd_tokenizer

In [None]:
# setup mistral model and tokenizer
from transformers import AutoTokenizer, AutoModelForCausalLM

# mistral loading function so this can be run individually
def load_mistral():

  # load tokenizer and model from huggingface
  mistral_tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
  mistral_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")


  # manage setup based on available device
  device = torch.device("cpu")
  mistral_model.to(device)

  # update model config
  mistral_model.config.is_decoder=True
  mistral_model.config.max_length=50
  mistral_model.config.no_repeat_ngram_size=2
  mistral_model.config.do_sample=True

  return mistral_model, mistral_tokenizer

In [None]:
# setup llama model and tokenizer
from transformers import AutoTokenizer, AutoModelForCausalLM

# llama loading function so this can be run individually
def load_llama():

  # load tokenizer and model from huggingface
  llama_tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf", token=hgf_token)
  llama_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf", token=hgf_token)

  # manage setup based on available device
  device = torch.device("cpu")
  llama_model.to(device)

  # update model config
  llama_model.config.is_decoder=True
  llama_model.config.max_length=50
  llama_model.config.no_repeat_ngram_size=2
  llama_model.config.do_sample=True

  # update tokenizer config
  llama_tokenizer.pad_token = llama_tokenizer.eos_token

  return llama_model, llama_tokenizer

**(Loading all Models in Parallel will overload the 50GB RAM)**

-> load either GPT-2 + GODEL **or** Mistral **or** Llama2
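
When switching between model groups without restarting the session, the previously loaded objects have to be released first. The following is a small sketch under the assumption that the models are only referenced through the notebook's global variables; the helper name `release` is hypothetical, and memory is only reclaimed if no other references to the objects remain.

```python
import gc


def release(namespace: dict, *names: str) -> list:
    """Drop the given names from a namespace and trigger garbage collection."""
    # remove each name if present and remember which ones were actually dropped
    removed = [name for name in names if namespace.pop(name, None) is not None]
    # ask Python to reclaim objects that are no longer referenced
    gc.collect()
    return removed
```

In the notebook this could be called as `release(globals(), "gpt_model", "gpt_tokenizer", "gd_model", "gd_tokenizer")` before loading Mistral or Llama 2. Restarting the runtime remains the more reliable option.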

In [None]:
# loading gpt and godel model and tokenizer
gpt_model, gpt_tokenizer, gd_model, gd_tokenizer = load_gd_gpt()

In [None]:
# loading mistral model and tokenizer
mistral_model, mistral_tokenizer = load_mistral()

In [None]:
# loading llama model and tokenizer
# load_llama() returns the model first, then the tokenizer
llama_model, llama_tokenizer = load_llama()

## Running SHAP Example Code (GPT-2)

CREDIT: Copied and minimally changed from the official shap documentation

see [here](https://shap.readthedocs.io/en/latest/example_notebooks/text_examples/text_generation/Open%20Ended%20GPT2%20Text%20Generation%20Explanations.html)

In [None]:
# code to run an auto shap explainer (uses PartitionSHAP) with GPT2
from shap import Explainer, PartitionExplainer

# function that runs a basic shap example
def gpt_basic_shap(text:list):

  # create explainer and run it
  basic_explainer = Explainer(gpt_model, gpt_tokenizer)
  shap_values = basic_explainer(text)

  return shap_values

In [None]:
# code to run a teacher forcing shap example
from shap.models import TeacherForcing
from shap import maskers, Explainer

# text is a pair of lists: text[0] holds the inputs, text[1] the target outputs
def gpt_tch_shap(text:list[list[str]]):
  print(text[0], text[1])

  # wrap gpt model in teacher forcing wrapper
  tch_model = TeacherForcing(gpt_model, gpt_tokenizer)

  # setup masker using a single space as mask token instead of "..."
  masker = maskers.Text(gpt_tokenizer, mask_token=" ", collapse_mask_token=True)

  # setup explainer, generate explanation
  tch_explainer = Explainer(tch_model, masker)
  shap_values = tch_explainer(text[0], text[1])

  return shap_values

In [None]:
# defining test text from example

tch_test_text = [[
    "I know many people who are Russian.",
    "I know many people who are Greek.",
    "I know many people who are Australian.",
    "I know many people who are American.",
    "I know many people who are Italian.",
    "I know many people who are Spanish.",
    "I know many people who are German.",
    "I know many people who are Indian.",
],
[
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
]]

basic_test_text = ["I enjoy walking with my cute dog"]

In [None]:
# running tests
basic_gpt_test = gpt_basic_shap(basic_test_text)

2024-01-24 08:26:50.649 | DEBUG    | shap.utils.transformers:is_transformers_lm:97 - Model has been detected as a transformers model: (<class 'transformers.models.gpt2.modeling_gpt2.GPT2LMHeadModel'>)
2024-01-24 08:26:50.652 | DEBUG    | shap.utils.transformers:is_transformers_lm:97 - Model has been detected as a transformers model: (<class 'transformers.models.gpt2.modeling_gpt2.GPT2LMHeadModel'>)
2024-01-24 08:26:50.654 | DEBUG    | shap.models._teacher_forcing:__init__:50 - Initalized a TecherForcing Model.
2024-01-24 08:26:50.714 | DEBUG    | shap.utils.transformers:is_transformers_lm:95 - Model is not a transformers language model: <class 'shap.models._teacher_forcing.TeacherForcing'>
2024-01-24 08:26:50.715 | DEBUG

In [None]:
# plotting the values
from shap import plots

plots.text(basic_gpt_test)

## Testing Code

### Helper & SHAP Runner Functions

In [None]:
# formatting functions for output text and tokens
import re

# function to format the model response nicely
def format_output_text(output: list):
    # remove special tokens from list and drop empty entries
    formatted_output = [t for t in format_tokens(output) if t != ""]

    # return an empty string if nothing is left
    if not formatted_output:
        return ""

    # start string with the first non-empty list item
    output_str = formatted_output[0]

    # add all other list items with a space in between
    for txt in formatted_output[1:]:
        # check if the token is a punctuation mark
        if txt in [".", ",", "!", "?"]:
            # add punctuation mark without space
            output_str += txt
        else:
            # add token with a leading space
            output_str += " " + txt

    # return the combined string with multiple spaces removed
    return re.sub(" +", " ", output_str)


# format the tokens by removing special tokens and special characters
def format_tokens(tokens: list):
    # define special tokens to remove and initialize empty list
    special_tokens = ["[CLS]", "[SEP]", "[PAD]", "[UNK]", "[MASK]", "▁", "Ġ", "</w>"]
    updated_tokens = []

    # loop through tokens
    for t in tokens:
        # remove special token from start of token if found
        if t.startswith("▁"):
            t = t.lstrip("▁")

        # loop through special tokens and remove them if found
        for s in special_tokens:
            t = t.replace(s, "")

        # add token to list
        updated_tokens.append(t)

    # return the list of tokens
    return updated_tokens

# function to remove orphan whitespaces in a list of text
def remove_orphan_whitespaces(texts:list):
  # instantiating a new empty list
  cleaned_list = []

  # looping over list and keeping only entries that are not empty or a single space
  for text in texts:
    if(text != " " and text !=""):
      cleaned_list.append(text)

  # returning the cleaned list
  return cleaned_list

In [None]:
# shap runner functions (copied from above)
# CREDIT: Copied and minimally changed from the official shap documentation
# see https://shap.readthedocs.io/en/latest/example_notebooks/text_examples/text_generation/Open%20Ended%20GPT2%20Text%20Generation%20Explanations.html

# code to run an auto shap explainer (uses PartitionSHAP)
from shap import PartitionExplainer
from shap import maskers, Explainer
from shap.models import TeacherForcing
from scipy.cluster.hierarchy import dendrogram
import time

# function that runs a basic shap example
def basic_shap(text:list,model, tokenizer):

  masker = maskers.Text(tokenizer)
  masker.mask_token = ' '

  # create explainer and run it
  basic_explainer = PartitionExplainer(model, masker)

  runtime = time.time()
  shap_values = basic_explainer(text)
  print(time.time()-runtime)

  return shap_values

# code to run a teacher forcing shap example

# text is a pair of lists: text[0] holds the inputs, text[1] the target outputs
def tch_shap(text:list[list[str]], model, tokenizer):

  # wrap model in teacher forcing wrapper
  tch_model = TeacherForcing(model, tokenizer)

  # setup masker using a single space as mask token instead of "..."
  masker = maskers.Text(tokenizer, mask_token=" ", collapse_mask_token=True)

  # setup explainer, generate explanation
  tch_explainer = PartitionExplainer(tch_model, masker)
  shap_values = tch_explainer(text[0], text[1])

  return shap_values

# visualizing SHAP Text Masking
# CREDIT: Copied and minimally changed from Lilo Wagner, "Shap’s partition explainer for language models"
## see https://towardsdatascience.com/shaps-partition-explainer-for-language-models-ec2e7a6c1b77
def viz_masking(text:str, tokenizer):

  # creating a new masker
  masker = maskers.Text(tokenizer, mask_token=" ", collapse_mask_token=True)

  # clustering text using the masker
  clust_text_val = masker.clustering(text)
  clust_text_splits = masker.feature_names(text)[0]

  # creating a plot figure and plotting
  fig = plt.figure(figsize=(8, 4))
  dn = dendrogram(clust_text_val, labels=clust_text_splits)
  plt.xlabel("Input Texts")
  plt.ylabel("Partition Tree Levels")
  plt.title("Visualization of Clustering By SHAP Masker")
  plt.show()

### GODEL

In [None]:
# formatting function to format input for the model
# CREDIT: Adapted from official inference example on Huggingface
## see https://huggingface.co/microsoft/GODEL-v1_1-large-seq2seq
def gd_format_prompt(message: str, system_prompt: str, knowledge: str = ""):

    # adds knowledge text if not empty
    if knowledge != "":
        knowledge = "[KNOWLEDGE] " + knowledge

    # adds the message to the prompt
    prompt = f" {message}"
    # combines the entire prompt
    full_prompt = f"{system_prompt} [CONTEXT] {prompt} {knowledge}"

    # returns the formatted prompt
    return full_prompt

In [None]:
# running PartitionSHAP Explainer on GODEL
# imports
from shap import plots

# getting formatted prompt
gd_test_prompt = gd_format_prompt("Does money buy happiness?","Given a dialog context, you need to respond empathically.")

# visualizing clustering done by SHAP masker
viz_masking(gd_test_prompt, gd_tokenizer)

# running tests
basic_gd_test = basic_shap([gd_test_prompt], gd_model, gd_tokenizer)

# plotting the values
plots.text(basic_gd_test)

In [None]:
# running a teacher forced SHAP Explainer on GODEL
# imports
from shap import plots

# test text data
gd_tch_test_text = [
  [
    gd_test_prompt,
    gd_test_prompt,
    gd_test_prompt,
    gd_test_prompt,
    gd_test_prompt
  ],
  [
    "It's heartening to hear you view money as a means to happiness; do you find that this belief positively impacts your life?",
    "Your perspective on money and happiness is intriguing; have you always felt this way, or has your opinion evolved over time?",
    "I agree that money can be a tool for happiness; it's great you're thinking about this, what's your most fulfilling experience with it?",
    "Your view is quite insightful; do you think there's more to happiness than just money and possessions?",
    "That's an optimistic approach to wealth and happiness; what other factors do you believe play a role in achieving true contentment?"
  ]
]

# running test with the GODEL test data defined above
tch_gd_test = tch_shap(gd_tch_test_text, gd_model, gd_tokenizer)

# plotting the values
plots.text(tch_gd_test)
plots.bar(tch_gd_test[0, :, "empathically"])

### Mistral AI

In [None]:
# formatting function to format input for the model
# CREDIT: Inspired by official documentation and example on Huggingface
## see https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
def mistral_format_prompt(message: str, system_prompt: str):
    prompt = f"<s>[INST] {system_prompt} [/INST] Hello, how can I assist you today?</s>[INST] {message} [/INST]"
    return prompt

In [None]:
# basic generation function for the model
# CREDIT: Adapted from official documentation and example on Huggingface
## see https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
def mistral_generate(message: str="Does money buy happiness?", system_prompt: str="Given a dialog context, you need to respond empathically."):

  # selecting device based on availability and moving the model to it
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
  mistral_model.to(device)

  # formatting prompt
  prompt = mistral_format_prompt(message, system_prompt)

  # tokenizing inputs
  input_ids = mistral_tokenizer.encode(prompt, return_tensors="pt")
  model_inputs = input_ids.to(device)

  # generating response and decoding the batch of generated sequences
  generated_ids = mistral_model.generate(model_inputs)
  decoded = mistral_tokenizer.batch_decode(generated_ids)

  # return text of the first (and only) sequence
  return decoded[0]

In [None]:
print(mistral_generate(message="Does money buy happiness?", system_prompt="Given a dialog context, you need to respond empathically."))

In [None]:
# code to test shap TextGeneration, TeacherForcing models
# -> Explainer automatically wraps mistral model in these
from shap.models import TextGeneration, TeacherForcing

# create formatted prompt
mistral_text = mistral_format_prompt(message="Does money buy happiness?", system_prompt="Given a dialog context, you need to respond empathically.")

# update tokenizer config, create mistral teacher forcing model
mistral_tokenizer.pad_token = mistral_tokenizer.eos_token
mistral_tch_model = TeacherForcing(mistral_model, mistral_tokenizer)

#logits = mistral_tch_model.get_teacher_forced_logits(np.array([mistral_text]),np.array(["I understand that this is a common question with no definitive answer, as everyone's experiences and priorities are unique."]))
tch_output=mistral_tch_model(np.array([mistral_text,mistral_text, mistral_text]),np.array(["I understand that this is a common question with no definitive answer, as everyone's experiences and priorities are unique."]))

In [None]:
from shap import plots

# running basic shap with mistral
basic_mist_test = basic_shap([mistral_text], mistral_model, mistral_tokenizer)

# plotting the values
plots.text(basic_mist_test)

### Comment

It is evident that the calculation of SHAP values does not work correctly with the Mistral model. This is because the teacher-forced logits are not calculated correctly for the different nodes.

Even after several debugging and fixing attempts, it was not possible to resolve the issue.
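
For context on the quantity involved: under teacher forcing, the model is fed a fixed target output and scored on how strongly it predicts each target token. The following pure-Python sketch illustrates that idea with log odds per target token. It is an illustration under stated assumptions, not the thesis-shap implementation; the function names and the toy logits are hypothetical.

```python
import math


def softmax(logits):
    """Turn a list of raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def teacher_forced_log_odds(step_logits, target_ids):
    """Log odds of each forced target token, one score per output position.

    step_logits: one list of vocabulary logits per output position
    target_ids:  the vocabulary index of the forced token at each position
    """
    scores = []
    for logits, target in zip(step_logits, target_ids):
        p = softmax(logits)[target]
        # log odds: log(p / (1 - p)); assumes 0 < p < 1
        scores.append(math.log(p / (1.0 - p)))
    return scores


# toy example: a two-token vocabulary and two output positions
toy_scores = teacher_forced_log_odds([[0.0, 0.0], [math.log(3), 0.0]], [0, 0])
print(toy_scores)
```

If these per-token scores are wrong for the masked inputs at some nodes of the partition tree, every attribution derived from them is affected, which is consistent with the behaviour described above.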

### LlaMa 2

In [None]:
# formatting function to format input for the model
# CREDIT: Adapted from Philipp Schmid
## see https://www.philschmid.de/llama-2#how-to-prompt-llama-2-chat
def llama_format_prompt(message:str, system_prompt:str):
  prompt = f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{message} [/INST]"
  return prompt

In [None]:
from shap import plots

# create formatted prompt
llama_text = llama_format_prompt(message="Does money buy happiness?", system_prompt="Given a dialog context, you need to respond empathically.")

# running basic shap with llama
basic_llama_test = basic_shap([llama_text], llama_model, llama_tokenizer)

# plotting the values
plots.text(basic_llama_test)

### Comment

It is evident that the calculation of SHAP values does not work correctly with the LlaMa 2 model either. This is because the teacher-forced logits are not calculated correctly for the different nodes.

Even after several debugging and fixing attempts, it was not possible to resolve the issue.