# Testing Inseq with Sequence-to-Sequence Text Generation

This notebook tests the Inseq package and its usage with GODEL.

### Tested Model
- GODEL

#### Tested Interpretability Implementation
Tests are run with Inseq. Inseq is an implementation on top of Captum and other interpretability methods, built specifically for sequence-based text generation models.


## Installation, Imports and Setup

### Installs and Imports

In [None]:
# basic installs and additional utilities (usually not needed in Colab)
!pip install matplotlib
!pip install seaborn
!pip install numpy
!pip install pandas
!pip install ipywidgets
!pip install ipython

# model package installs
!pip install torch
!pip install transformers
!pip install huggingface_hub
!pip install accelerate
!pip install scikit-learn

# install inseq
!pip install inseq

In [None]:
# basic imports
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# model imports
import torch
import transformers

# interpretability import
import inseq
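Before picking an attribution method, it can help to check which method identifiers the installed Inseq version accepts. The following is a minimal sketch; the guarded import and the `available_methods` name are illustrative additions, not part of the original notebook.

```python
# Guarded import so the cell still runs if inseq has not been installed yet.
try:
    import inseq

    # returns the identifiers that can be passed as the attribution method
    available_methods = inseq.list_feature_attribution_methods()
except ImportError:
    available_methods = []

# "gradient_shap" is the identifier used later in this notebook
print(sorted(available_methods))
```

If the list comes back empty, the install cell above most likely has not been run in the current kernel.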

## GODEL Testing Code

In [None]:
# formatting function that builds the input prompt for the model
# CREDIT: adapted from the official inference example on Hugging Face
## see https://huggingface.co/microsoft/GODEL-v1_1-large-seq2seq
def gd_format_prompt(message: str, system_prompt: str = "", knowledge: str = ""):

    # adds knowledge text if not empty
    if knowledge != "":
        knowledge = "[KNOWLEDGE] " + knowledge

    # adds the message to the prompt
    prompt = f" {message}"
    # combines the entire prompt
    full_prompt = f"{system_prompt} [CONTEXT] {prompt} {knowledge}"

    # returns the formatted prompt
    return full_prompt
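To see exactly what string the formatter hands to the model, the sketch below prints the raw prompt with `repr`. It repeats the function so that it runs on its own; `normalize_whitespace` is a hypothetical helper added only for illustration, and the knowledge example text is made up.

```python
def gd_format_prompt(message: str, system_prompt: str = "", knowledge: str = "") -> str:
    # same logic as the cell above, repeated so this sketch is self-contained
    if knowledge != "":
        knowledge = "[KNOWLEDGE] " + knowledge
    prompt = f" {message}"
    return f"{system_prompt} [CONTEXT] {prompt} {knowledge}"


def normalize_whitespace(text: str) -> str:
    # hypothetical helper: collapse runs of whitespace and trim both ends
    return " ".join(text.split())


raw = gd_format_prompt(
    "Does money buy happiness?",
    "Given a dialog context, you need to respond empathically.",
)
# repr makes the double space after [CONTEXT] and the trailing space visible
print(repr(raw))
print(repr(normalize_whitespace(raw)))

# illustrative call with a knowledge snippet
with_knowledge = gd_format_prompt("Hi", "Be brief.", "Paris is in France.")
print(repr(with_knowledge))
```

The raw prompt contains a double space after `[CONTEXT]` and, when no knowledge is given, a trailing space. The token rows in the attribution output further down show no separate whitespace tokens, which suggests the tokenizer absorbs them, but whether normalizing the prompt changes the model's behaviour is not tested here.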

In [11]:
# running inseq gradient shap explainer on GODEL
# CREDIT: copied and minimally changed from the official inseq documentation
# see https://inseq.org/en/latest/examples/quickstart.html#post-processing-attributions-with-aggregators

# setting up GODEL test text with prompt formatter
gd_test_text = gd_format_prompt(
    "Does money buy happiness?",
    "Given a dialog context, you need to respond empathically.",
)

# setting up special wrapped inseq model with gradient shap explainer
model = inseq.load_model("microsoft/GODEL-v1_1-large-seq2seq", "gradient_shap")

# generating model attribution with inseq
attribution = model.attribute(
    input_texts=gd_test_text,
    generation_args={"max_new_tokens": 50},
    attribute_target=True,
    step_scores=["probability"],
)

Attributing with gradient_shap...: 100%|██████████| 7/7 [00:12<00:00,  2.09s/it]


In [43]:
# using inseq show function to display attributions
attribution.show()

Source token,▁Yes,",",▁it,▁does,.,</s>
▁Given,0.045,0.035,0.033,0.025,0.029,0.021
▁,0.029,0.016,0.021,0.019,0.015,0.015
a,0.017,0.016,0.016,0.013,0.016,0.011
▁dialog,0.06,0.061,0.04,0.03,0.041,0.026
▁context,0.044,0.078,0.032,0.02,0.035,0.027
",",0.021,0.021,0.025,0.013,0.014,0.014
▁you,0.028,0.019,0.027,0.017,0.015,0.022
▁need,0.027,0.015,0.023,0.016,0.015,0.035
▁to,0.026,0.012,0.016,0.012,0.013,0.046
▁respond,0.043,0.027,0.034,0.025,0.027,0.042
▁emp,0.079,0.061,0.062,0.049,0.054,0.058
a,0.021,0.013,0.016,0.011,0.022,0.019
th,0.043,0.023,0.03,0.019,0.028,0.033
ically,0.031,0.033,0.028,0.019,0.024,0.07
.,0.032,0.023,0.028,0.022,0.021,0.062
▁[,0.035,0.025,0.031,0.026,0.024,0.04
CON,0.027,0.029,0.023,0.019,0.022,0.029
TE,0.028,0.044,0.025,0.02,0.034,0.022
X,0.017,0.045,0.015,0.013,0.037,0.015
T,0.021,0.026,0.017,0.017,0.032,0.02
],0.04,0.026,0.027,0.029,0.024,0.045
▁Does,0.062,0.032,0.042,0.081,0.031,0.045
▁money,0.046,0.043,0.085,0.071,0.043,0.028
▁buy,0.058,0.044,0.082,0.123,0.055,0.027
▁happiness,0.073,0.045,0.075,0.072,0.044,0.031
?,0.047,0.022,0.025,0.024,0.024,0.021
</s>,0.0,0.0,0.0,0.0,0.0,0.0
probability,0.176,0.522,0.204,0.453,0.722,0.161
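The source attribution table is easier to read when ranked per generated token. The sketch below does this in plain Python for a hand-copied subset of rows from the table above; the `top_sources` helper and the choice of rows are illustrative assumptions, not part of Inseq.

```python
# generated tokens (table columns), in generation order
generated = ["▁Yes", ",", "▁it", "▁does", ".", "</s>"]

# subset of source rows, values copied from the table above
source_scores = {
    "▁dialog":    [0.06, 0.061, 0.04, 0.03, 0.041, 0.026],
    "▁context":   [0.044, 0.078, 0.032, 0.02, 0.035, 0.027],
    "▁emp":       [0.079, 0.061, 0.062, 0.049, 0.054, 0.058],
    "ically":     [0.031, 0.033, 0.028, 0.019, 0.024, 0.07],
    "▁Does":      [0.062, 0.032, 0.042, 0.081, 0.031, 0.045],
    "▁money":     [0.046, 0.043, 0.085, 0.071, 0.043, 0.028],
    "▁buy":       [0.058, 0.044, 0.082, 0.123, 0.055, 0.027],
    "▁happiness": [0.073, 0.045, 0.075, 0.072, 0.044, 0.031],
}


def top_sources(column, k=3):
    # rank the source tokens by their attribution to one generated token
    idx = generated.index(column)
    ranked = sorted(source_scores.items(), key=lambda item: item[1][idx], reverse=True)
    return [(token, values[idx]) for token, values in ranked[:k]]


for column in generated:
    print(column, top_sources(column))
```

For the generated token `▁does`, for instance, the highest-ranked source token is `▁buy` (0.123), followed by `▁Does` and `▁happiness`, so the question itself contributes more to that step than the instruction prefix does.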

Target token,▁Yes,",",▁it,▁does,.,</s>
▁Yes,,0.168,0.078,0.087,0.103,0.045
",",,,0.045,0.045,0.039,0.024
▁it,,,,0.061,0.048,0.023
▁does,,,,,0.071,0.032
.,,,,,,0.053
</s>,,,,,,
probability,0.176,0.522,0.204,0.453,0.722,0.161
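The second table is triangular because each generated token can only receive attribution from tokens generated before it. As a rough sketch, the share of attribution that falls on the target side at each step can be computed by summing each column; the values are copied from the table above, and the `target_share` helper is a hypothetical name.

```python
# generated tokens (table columns), in generation order
generated = ["▁Yes", ",", "▁it", "▁does", ".", "</s>"]

# rows are previously generated tokens; None marks cells without attribution
target_rows = {
    "▁Yes":  [None, 0.168, 0.078, 0.087, 0.103, 0.045],
    ",":     [None, None, 0.045, 0.045, 0.039, 0.024],
    "▁it":   [None, None, None, 0.061, 0.048, 0.023],
    "▁does": [None, None, None, None, 0.071, 0.032],
    ".":     [None, None, None, None, None, 0.053],
}


def target_share(column):
    # sum the attribution that one generated token assigns to earlier target tokens
    idx = generated.index(column)
    values = [row[idx] for row in target_rows.values() if row[idx] is not None]
    return round(sum(values, 0.0), 3)


shares = {column: target_share(column) for column in generated}
print(shares)
```

For `▁does` the target side sums to 0.193, while the corresponding column of the source table sums to roughly 0.805. This suggests that the source and target scores of a step are normalized jointly, although that is an observation from this single output rather than a documented guarantee.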
