
# **SHAP**
SHAP (SHapley Additive exPlanations) is a game-theoretic approach to explaining the output of any machine learning model.


# **Example 1: Text-Generation**
We will write a short prompt and let the model generate the text that follows. A SHAP text plot then shows how each input token influenced the model's prediction.



In [49]:
# install transformers and shap packages
!pip install transformers shap



In [50]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import shap

In [51]:
# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("gpt2", use_fast=True)
model = AutoModelForCausalLM.from_pretrained("gpt2")

In [52]:
# set model decoder to true
model.config.is_decoder = True
# set text-generation params under task_specific_params
model.config.task_specific_params["text-generation"] = {
    "do_sample": True,
    "max_length": 50,
    "temperature": 0.7,
    "top_k": 50,
    "no_repeat_ngram_size": 2,
}

In [53]:
# Input
sentence = ["Once upon a time, in a cozy little house,"]

In [54]:
# shap.Explainer falls back to the Partition explainer by default for a model/tokenizer pair
explainer = shap.Explainer(model, tokenizer)
# Apply input to generate shap values
shap_values_text_gen = explainer(sentence)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


  0%|          | 0/110 [00:00<?, ?it/s]

PartitionExplainer explainer: 2it [00:24, 24.06s/it]               


**SHAP values provide a way to interpret the output of machine learning models. For text models, SHAP values explain how each word or token in the input text contributes to the prediction of the model.**

In [55]:
# Generate plot
shap.plots.text(shap_values_text_gen)

For a text-generation model such as GPT-2, the plot shows which tokens (words or subword pieces) in the input text have the most influence on the model's generated text.









# **Plot Explanation**
**Base Value:**
It is the starting point of the prediction. It represents the average prediction of the model before the specific input tokens are considered.

**SHAP values:** They explain why the model gave a certain output by showing the impact of each input token on the prediction.

**Tokens Contribution:**
The plot displays the input tokens along with their corresponding SHAP values. A positive SHAP value pushes the prediction up from the base value, and a negative SHAP value pushes it down.

**Text Highlights:**
In the plot, tokens with positive SHAP values are highlighted in red to show that they push the prediction up, while blue marks a negative contribution. The intensity of the color shows the strength of the contribution in the respective direction.

Click on a token in the output text to see the influence of each input token on it.
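To make the relationship between the base value, the SHAP values, and the colors concrete, here is a minimal sketch. The numbers and the helper `describe` are hypothetical and chosen only for illustration; they are not taken from GPT-2 or from the plot above.

```python
# Hypothetical numbers for illustration only; they do not come from GPT-2.
base_value = -4.2  # assumed starting score before any input token is considered
token_contributions = {
    "Once": 0.3,
    " upon": 1.1,
    " a": -0.2,
    " time": 0.9,
}


def describe(value):
    """Map a SHAP value to the color family used in the text plot."""
    if value > 0:
        return "red (pushes the prediction up)"
    if value < 0:
        return "blue (pushes the prediction down)"
    return "neutral"


# Additivity: the base value plus all contributions gives the explained output.
explained_output = base_value + sum(token_contributions.values())

# Rank tokens by the size of their contribution, strongest first.
ranked = sorted(token_contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)

for token, value in ranked:
    print(f"{token!r:10} {value:+.2f}  {describe(value)}")
print(f"explained output: {explained_output:.2f}")
```

The tokens with the largest absolute values are the ones drawn with the most intense color, regardless of direction.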

# **Example 2: Emotion Classifier**
We will provide a text and the model will classify its emotion as one of
sadness, joy, love, anger, fear, or surprise, the emotions this model was trained on.



In [42]:
#Emotion Classifier
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
import shap

In [43]:
# load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("nateraw/bert-base-uncased-emotion", use_fast=True)
model = AutoModelForSequenceClassification.from_pretrained("nateraw/bert-base-uncased-emotion")

In [44]:
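# Create a pipeline to run the model
pred = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
    device=-1,  # Use -1 to indicate CPU
    top_k=None,  # return a score for every emotion, not only the top one (assumes a recent transformers release)
)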

In [45]:
# function to analyse the text using model and create explainer using SHAP
def analyze_text(texts):
    # Create SHAP explainer
    explainer = shap.Explainer(pred)
    # Generate SHAP values
    shap_values = explainer(texts)
    # Create and show the SHAP plot
    shap.plots.text(shap_values)

In [46]:
# Text input from the user
user_input = input("Enter the text you want to analyze: ")
analyze_text([user_input])

Enter the text you want to analyze: I'm furious that they canceled the event without any notice or explanation


  0%|          | 0/240 [00:00<?, ?it/s]

PartitionExplainer explainer: 2it [00:29, 29.14s/it]               
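Besides the interactive plot, the contributions can also be ranked numerically. The sketch below uses a hypothetical helper, `top_tokens`, with made-up tokens and values that stand in for one emotion class of one explanation; it does not call the model.

```python
def top_tokens(tokens, values, k=3):
    """Return the k (token, value) pairs with the largest absolute contribution."""
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    return [(tokens[i], float(values[i])) for i in order[:k]]


# Hypothetical tokens and SHAP values, not real model output.
demo_tokens = ["i", "'m", "furious", "that", "they", "canceled"]
demo_values = [0.01, 0.02, 0.85, -0.03, 0.00, 0.20]

print(top_tokens(demo_tokens, demo_values, k=2))
```

With a real run, the same helper could be applied to the `data` and `values` attributes of the `Explanation` object that the explainer returns, after selecting one emotion's column of values. This assumes you keep that object around, since `analyze_text` as written only plots it, and the exact array shapes are worth checking against your installed SHAP version.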


# **Plot Explanation**

**Base Value:** The base value is the average prediction score across the dataset or a baseline prediction. It represents the model’s default or expected prediction before any specific tokens are considered.

**Input Tokens:** The tokens are the individual words or subwords that the tokenizer produces from the input text.

**SHAP Values:** They indicate how each token affects the model's output prediction. These values are calculated based on how the model's prediction changes when each token is included or excluded.

Click on each input token to see its SHAP value and how it affects the emotion classification.
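The idea of including and excluding tokens can be sketched with an exact Shapley computation on a toy scorer. Everything here is an assumption made for illustration: `toy_anger_score` and `KEYWORD_WEIGHTS` are hypothetical stand-ins for the BERT classifier, and the brute-force enumeration is only practical for a handful of tokens. The SHAP library itself relies on a much cheaper approximation instead of enumerating every subset.

```python
from itertools import combinations
from math import factorial

# Hypothetical keyword weights; this is NOT the BERT emotion model.
KEYWORD_WEIGHTS = {"furious": 2.0, "canceled": 1.0, "notice": 0.5}


def toy_anger_score(tokens, present):
    """Score the text using only the tokens whose index is in `present`."""
    score = sum(KEYWORD_WEIGHTS.get(tokens[i], 0.0) for i in present)
    # Interaction: "furious" and "canceled" together add extra weight.
    words = {tokens[i] for i in present}
    if {"furious", "canceled"} <= words:
        score += 1.0
    return score


def exact_shapley(tokens, score_fn):
    """Exact Shapley values by enumerating every subset of the other tokens."""
    n = len(tokens)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = score_fn(tokens, set(subset) | {i})
                without_i = score_fn(tokens, set(subset))
                values[i] += weight * (with_i - without_i)
    return values


tokens = ["furious", "they", "canceled", "notice"]
base_value = toy_anger_score(tokens, set())  # every token excluded
full_score = toy_anger_score(tokens, set(range(len(tokens))))
shapley = exact_shapley(tokens, toy_anger_score)

for token, value in zip(tokens, shapley):
    print(f"{token:10} {value:+.2f}")
print(f"base + sum = {base_value + sum(shapley):.2f}, full score = {full_score:.2f}")
```

In this toy setup the interaction bonus is split evenly between "furious" and "canceled", and the base value plus the sum of all token values reproduces the full score.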