<a href="https://colab.research.google.com/github/danielhou13/cogs402longformer/blob/main/src/CaptumLongformerSequenceClassificationMultiembedding.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook adapts the Captum question-answering tutorial to sequence classification with the Longformer. Specifically, it uses the model's embedding layers to compute token attributions for individual examples of your choice, or for the entire dataset if needed.

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [3]:
import sys
sys.path.append('/content/drive/My Drive/{}'.format("cogs402longformer/"))

Import and install dependencies

In [4]:
pip install transformers --quiet


In [5]:
pip install captum --quiet

[?25l[K     |▎                               | 10 kB 29.1 MB/s eta 0:00:01[K     |▌                               | 20 kB 34.6 MB/s eta 0:00:01[K     |▊                               | 30 kB 13.7 MB/s eta 0:00:01[K     |█                               | 40 kB 6.5 MB/s eta 0:00:01[K     |█▏                              | 51 kB 6.7 MB/s eta 0:00:01[K     |█▍                              | 61 kB 7.9 MB/s eta 0:00:01[K     |█▋                              | 71 kB 8.3 MB/s eta 0:00:01[K     |█▉                              | 81 kB 7.8 MB/s eta 0:00:01[K     |██                              | 92 kB 8.7 MB/s eta 0:00:01[K     |██▎                             | 102 kB 7.2 MB/s eta 0:00:01[K     |██▌                             | 112 kB 7.2 MB/s eta 0:00:01[K     |██▊                             | 122 kB 7.2 MB/s eta 0:00:01[K     |███                             | 133 kB 7.2 MB/s eta 0:00:01[K     |███▏                            | 143 kB 7.2 MB/s eta 0:00:01[K  

In [6]:
pip install datasets --quiet

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.

In [7]:
import os
os.environ['CUDA_LAUNCH_BLOCKING'] = "1"

In [8]:
from transformers import BertTokenizer, BertForSequenceClassification, BertConfig

from captum.attr import visualization as viz
from captum.attr import IntegratedGradients, LayerConductance, LayerIntegratedGradients
from captum.attr import configure_interpretable_embedding_layer, remove_interpretable_embedding_layer

import torch

In [9]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

Load the fine-tuned model and the tokenizer from the Hugging Face Hub

In [10]:
from transformers import LongformerForSequenceClassification, LongformerTokenizer, LongformerConfig
# Hugging Face Hub ID (or local path) of the fine-tuned model; swap in your own if needed
model_path = 'danielhou13/longformer-finetuned_papers_v2'
#model_path = 'danielhou13/longformer-finetuned-new-cogs402'

# load model
model = LongformerForSequenceClassification.from_pretrained(model_path, num_labels = 2)
model.to(device)
model.eval()
model.zero_grad()

# load tokenizer
tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")


In [11]:
# model2 = BertForSequenceClassification.from_pretrained("bert-base-uncased")

In [12]:
print(tokenizer)

PreTrainedTokenizer(name_or_path='allenai/longformer-base-4096', vocab_size=50265, model_max_len=4096, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'bos_token': AddedToken("<s>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'eos_token': AddedToken("</s>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'unk_token': AddedToken("<unk>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'sep_token': AddedToken("</s>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'pad_token': AddedToken("<pad>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'cls_token': AddedToken("<s>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'mask_token': AddedToken("<mask>", rstrip=False, lstrip=True, single_word=False, normalized=True)})


Create helper functions that build the input ids, position ids, and attention mask for the text we want to examine, together with the reference (baseline) ids used by Integrated Gradients.

In [13]:
def predict(inputs, position_ids=None, attention_mask=None):
    output = model(inputs,
                   position_ids=position_ids,
                   attention_mask=attention_mask)
    return output.logits

In [14]:
ref_token_id = tokenizer.pad_token_id # Padding token, used to build the reference (baseline) input
sep_token_id = tokenizer.sep_token_id # Separator token (</s>), appended to the end of the text
cls_token_id = tokenizer.cls_token_id # Classification token (<s>), prepended to the text

In [15]:
max_length = 2046
def construct_input_ref_pair(text, ref_token_id, sep_token_id, cls_token_id):

    text_ids = tokenizer.encode(text, truncation = True, add_special_tokens=False, max_length = max_length)
    # construct input token ids
    input_ids = [cls_token_id] + text_ids + [sep_token_id]
    # construct reference token ids 
    ref_input_ids = [cls_token_id] + [ref_token_id] * len(text_ids) + [sep_token_id]

    return torch.tensor([input_ids], device=device), torch.tensor([ref_input_ids], device=device), len(text_ids)

def construct_input_ref_pos_id_pair(input_ids):
    seq_length = input_ids.size(1)
    position_ids = torch.arange(seq_length, dtype=torch.long, device=device)
    # we could potentially also use random permutation with `torch.randperm(seq_length, device=device)`
    ref_position_ids = torch.zeros(seq_length, dtype=torch.long, device=device)

    position_ids = position_ids.unsqueeze(0).expand_as(input_ids)
    ref_position_ids = ref_position_ids.unsqueeze(0).expand_as(input_ids)
    return position_ids, ref_position_ids

def construct_attention_mask(input_ids):
    return torch.ones_like(input_ids)
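
To make the input/reference construction above concrete, here is a minimal sketch in plain Python lists, without tensors or a tokenizer. The special-token ids (0, 1, 2) are an assumption based on the RoBERTa-style vocabulary; in the notebook they are read from the tokenizer. The function names and the toy token ids are hypothetical.

```python
# Assumed special-token ids (<s>=0, <pad>=1, </s>=2); read them from the
# tokenizer in real code rather than hard-coding them.
CLS_ID, PAD_ID, SEP_ID = 0, 1, 2

def build_input_and_reference(text_ids, max_length=2046):
    """Return (input_ids, ref_input_ids) as plain lists for one example."""
    text_ids = text_ids[:max_length]             # truncate the body only
    input_ids = [CLS_ID] + text_ids + [SEP_ID]   # real tokens, wrapped in <s> ... </s>
    ref_ids = [CLS_ID] + [PAD_ID] * len(text_ids) + [SEP_ID]  # all-padding baseline
    return input_ids, ref_ids

def build_position_ids(seq_length):
    """Return (position_ids, ref_position_ids): 0..n-1 and all zeros."""
    return list(range(seq_length)), [0] * seq_length

toy_text_ids = [713, 16, 10, 1296]  # hypothetical token ids
toy_input_ids, toy_ref_ids = build_input_and_reference(toy_text_ids)
toy_pos_ids, toy_ref_pos_ids = build_position_ids(len(toy_input_ids))

print(toy_input_ids)    # [0, 713, 16, 10, 1296, 2]
print(toy_ref_ids)      # [0, 1, 1, 1, 1, 2]
print(toy_pos_ids)      # [0, 1, 2, 3, 4, 5]
print(toy_ref_pos_ids)  # [0, 0, 0, 0, 0, 0]
```

The reference keeps the same length and the same special tokens as the input, so only the content tokens differ between the two ends of the integration path.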

Import dataset and take one example from it for testing purposes

In [16]:
from datasets import load_dataset
cogs402_ds = load_dataset("danielhou13/cogs402dataset")["test"]


In [17]:
testval = 976
text = cogs402_ds['text'][testval]
label = cogs402_ds['labels'][testval]
print(label)

1


In [18]:
# Return class probabilities; the class to attribute to is selected later through `target` in `attribute`
def custom_forward(inputs, position_ids=None, attention_mask=None):
    preds = predict(inputs,
                   position_ids=position_ids,
                   attention_mask=attention_mask)
    return torch.softmax(preds, dim = 1)
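
As a rough illustration of what the forward function hands to the attribution method, the sketch below applies a softmax to a pair of made-up logits and picks out the probability of one target class. The helper name and the logit values are hypothetical; the notebook itself uses `torch.softmax`.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)                           # subtract the max to avoid overflow
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [-1.2, 2.3]        # hypothetical logits for classes 0 and 1
toy_probs = softmax(toy_logits)
toy_target = 1                  # class whose probability is being explained
toy_score = toy_probs[toy_target]
print(toy_probs, toy_score)
```

Attributing to a probability rather than a raw logit means the scores explain changes in the model's confidence for the chosen class.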

In [19]:
input_ids, ref_input_ids, sep_id = construct_input_ref_pair(text, ref_token_id, sep_token_id, cls_token_id)
position_ids, ref_position_ids = construct_input_ref_pos_id_pair(input_ids)
attention_mask = construct_attention_mask(input_ids)

indices = input_ids[0].detach().tolist()
all_tokens = tokenizer.convert_ids_to_tokens(indices)

In [20]:
def summarize_attributions(attributions):
    attributions = attributions.sum(dim=-1).squeeze(0)
    attributions = attributions / torch.norm(attributions)
    return attributions
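
The summarization step can be checked by hand on a tiny example. This is a sketch in plain Python, assuming one attribution vector per token: sum over the embedding dimension, then divide by the L2 norm of the resulting per-token scores. The function name and the toy values are hypothetical.

```python
import math

def summarize(per_token_vectors):
    """Collapse each token's attribution vector to one score, then L2-normalize."""
    token_scores = [sum(vec) for vec in per_token_vectors]   # sum over embedding dim
    norm = math.sqrt(sum(s * s for s in token_scores))       # L2 norm across tokens
    return [s / norm for s in token_scores]                  # assumes norm > 0

toy_attributions = [
    [0.5, 0.5, 2.0],     # token 0 sums to 3.0
    [-1.0, -2.0, -1.0],  # token 1 sums to -4.0
    [0.0, 0.0, 0.0],     # token 2 sums to 0.0
]
toy_summary = summarize(toy_attributions)
print(toy_summary)  # [0.6, -0.8, 0.0]
```

Because of the normalization, scores are comparable within one example but not directly across examples of different lengths.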

Perform Layer Integrated Gradients using the Longformer's word and position embeddings. According to the Hugging Face documentation, the Longformer does not use token type ids, so no attributions are computed for them.
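
For intuition about what Integrated Gradients computes, here is a self-contained sketch on a two-variable toy function, not on the model. It averages the gradient along the straight path from the baseline to the input with a midpoint Riemann sum (a simplification chosen here; Captum offers several integration rules) and scales by the input-baseline difference. All names and values are hypothetical.

```python
def f(x):
    """Toy scalar function standing in for the model output."""
    return x[0] ** 2 + 3.0 * x[0] * x[1]

def grad_f(x):
    """Analytic gradient of f."""
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]

def integrated_gradients(x, baseline, grad_fn, n_steps=50):
    """Midpoint-rule approximation of Integrated Gradients."""
    totals = [0.0] * len(x)
    for k in range(n_steps):
        alpha = (k + 0.5) / n_steps   # midpoint of the k-th sub-interval of [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_fn(point)):
            totals[i] += g
    # average gradient along the path, scaled by (input - baseline)
    return [(xi - b) * t / n_steps for xi, b, t in zip(x, baseline, totals)]

toy_x = [1.0, 2.0]
toy_baseline = [0.0, 0.0]
toy_attr = integrated_gradients(toy_x, toy_baseline, grad_f)
print(toy_attr)                                    # approximately [4.0, 3.0]
print(sum(toy_attr), f(toy_x) - f(toy_baseline))   # both approximately 7.0
```

The attributions sum to the difference between the output at the input and at the baseline (the completeness property), which is a useful sanity check when choosing the number of steps.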

In [21]:
lig2 = LayerIntegratedGradients(custom_forward, \
                                [model.longformer.embeddings.word_embeddings, \
                                 model.longformer.embeddings.position_embeddings])

  "Multiple layers provided. Please ensure that each layer is"


In [22]:
attributions = lig2.attribute(inputs=(input_ids, position_ids),
                               baselines=(ref_input_ids, ref_position_ids),
                               target=label,
                               additional_forward_args=(attention_mask,),
                               n_steps=200,
                               internal_batch_size = 2)

In [23]:
attributions_word = summarize_attributions(attributions[0])
attributions_position = summarize_attributions(attributions[1])
print(len(attributions_word))

2048


See which tokens had the strongest (most positive and most negative) attributions. Change k, the number of tokens shown, as needed.

In [24]:
def get_topk_attributed_tokens(attrs, k=15):
    values, indices = torch.topk(attrs, k)
    top_tokens = [all_tokens[idx] for idx in indices]
    return top_tokens, values, indices

In [25]:
def get_botk_attributed_tokens(attrs, k=15):
    values, indices = torch.topk(attrs, k, largest=False)
    top_tokens = [all_tokens[idx] for idx in indices]
    return top_tokens, values, indices
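
The same top-k and bottom-k selection can be sketched without tensors, which makes the pairing of token, index, and score explicit. This version uses `heapq` from the standard library; the function name and toy values are hypothetical.

```python
import heapq

def top_and_bottom_k(tokens, scores, k=2):
    """Return the k highest and k lowest (token, index, score) triples."""
    indexed = list(enumerate(scores))
    top = heapq.nlargest(k, indexed, key=lambda pair: pair[1])      # highest first
    bottom = heapq.nsmallest(k, indexed, key=lambda pair: pair[1])  # lowest first
    as_triples = lambda pairs: [(tokens[i], i, s) for i, s in pairs]
    return as_triples(top), as_triples(bottom)

toy_tokens = ["<s>", "Ġtraining", "Ġto", "Ġbias", "</s>"]
toy_scores = [0.0, 0.3, -0.07, 0.08, -0.04]
toy_top, toy_bottom = top_and_bottom_k(toy_tokens, toy_scores)
print(toy_top)     # [('Ġtraining', 1, 0.3), ('Ġbias', 3, 0.08)]
print(toy_bottom)  # [('Ġto', 2, -0.07), ('</s>', 4, -0.04)]
```

Note that `torch.topk` requires k to be no larger than the number of tokens, whereas `heapq` simply returns fewer items.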

Convert the values, their indices, and the tokens into a pandas DataFrame for visualization. The highest attributions are sorted from highest to lowest; the lowest attributions are sorted from lowest to highest.



In [26]:
import pandas as pd
top_words_start, top_words_val_start, top_word_ind_start = get_topk_attributed_tokens(attributions_word)
bot_words_start, bot_words_val_start, bot_word_ind_start = get_botk_attributed_tokens(attributions_word)


top_pos_start, top_pos_val_start, pos_ind_start = get_topk_attributed_tokens(attributions_position)
bot_pos_start, bot_pos_val_start, pos_ind_start2 = get_botk_attributed_tokens(attributions_position)

df_high = pd.DataFrame({'Word(Index), Attribution': ["{} ({}), {}".format(word, pos, round(val.item(),2)) for word, pos, val in zip(top_words_start, top_word_ind_start, top_words_val_start)],
                   'Position(Index), Attribution': ["{} ({}), {}".format(position, pos, round(val.item(),2)) for position, pos, val in zip(top_pos_start, pos_ind_start, top_pos_val_start)]})

df_low = pd.DataFrame({'Word(Index), Attribution': ["{} ({}), {}".format(word, pos, round(val.item(),2)) for word, pos, val in zip(bot_words_start, bot_word_ind_start, bot_words_val_start)],
                   'Position(Index), Attribution': ["{} ({}), {}".format(position, pos, round(val.item(),2)) for position, pos, val in zip(bot_pos_start, pos_ind_start2, bot_pos_val_start)]})
# df_start.style.apply(['cell_ids: False'])

# ['{}({})'.format(token, str(i)) for i, token in enumerate(all_tokens)]

In [27]:
df_high

Unnamed: 0,"Word(Index), Attribution","Position(Index), Attribution"
0,"Ġtraining (1538), 0.3","Ġtraining (1538), 0.53"
1,"ing (1026), 0.24","Ġas (2), 0.22"
2,"Ġtraining (1544), 0.09","]. (1827), 0.13"
3,"Ġwork (1565), 0.09",". (1930), 0.09"
4,"Ġtraining (1593), 0.09",". (1584), 0.09"
5,"Ġtraining (1506), 0.09","Ġsystems (514), 0.08"
6,"Ġtraining (1791), 0.09",". (1072), 0.08"
7,"Ġtraining (1659), 0.09",". (1295), 0.07"
8,"Ġtraining (1687), 0.09",". (1586), 0.07"
9,"Ġbias (1670), 0.08",". (1307), 0.06"


In [28]:
df_low

Unnamed: 0,"Word(Index), Attribution","Position(Index), Attribution"
0,"ĠThis (152), -0.07","ing (1026), -0.45"
1,"Ġto (1485), -0.07","ĠThis (152), -0.05"
2,"Ġto (236), -0.07",") (15), -0.05"
3,"Ġto (156), -0.06","ĠWe (1548), -0.04"
4,"Ġto (1381), -0.05","</s> (2047), -0.04"
5,"Ġto (1365), -0.05","Ġbi (933), -0.04"
6,"Ġto (1436), -0.05",", (120), -0.04"
7,"Ġto (1416), -0.05","Ġroad (801), -0.04"
8,"Ġto (1542), -0.05","Ġachieve (1877), -0.04"
9,"Ġto (277), -0.05","Ġbias (1670), -0.04"


Many tokens repeat within an example at different positions. A token's position may carry important information, but we may also want to know which tokens, summed over all their occurrences, have the most impact (most positive and most negative) on the prediction.

In [29]:
d = {"tokens":all_tokens, "attribution":attributions_word[:len(all_tokens)].cpu()}
df_attrib = pd.DataFrame(d)
aggregation_functions = {'attribution': 'sum'}
df_new = df_attrib.groupby(df_attrib['tokens']).aggregate(aggregation_functions)
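
The group-by-and-sum step above is equivalent to accumulating scores in a dictionary keyed by token. A minimal sketch with hypothetical toy values:

```python
from collections import defaultdict

def sum_by_token(tokens, scores):
    """Sum attribution scores over all occurrences of each token, highest first."""
    totals = defaultdict(float)
    for token, score in zip(tokens, scores):
        totals[token] += score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

toy_tokens = ["Ġtraining", "Ġto", "Ġtraining", "Ġto", "Ġbias"]
toy_scores = [0.5, -0.25, 0.25, -0.25, 0.125]
toy_ranked = sum_by_token(toy_tokens, toy_scores)
print(toy_ranked)  # [('Ġtraining', 0.75), ('Ġbias', 0.125), ('Ġto', -0.5)]
```

Summing favors frequent tokens: a common token with many small attributions can outrank a rare token with one large attribution. Averaging per token would be one alternative if frequency should not dominate.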

In [30]:
highest_attrib_tokens = df_new.sort_values(by=['attribution'], ascending=False)
highest_attrib_tokens[:15]

Unnamed: 0_level_0,attribution
tokens,Unnamed: 1_level_1
Ġtraining,1.645096
.,1.223755
Ġof,0.972543
Ġcapt,0.870154
-,0.71621
Ġin,0.608891
Ġ[,0.590056
ing,0.575385
Ġon,0.430506
Ġimage,0.412368


In [31]:
lowest_attrib_tokens = df_new.sort_values(by=['attribution'])
lowest_attrib_tokens[:15]

Unnamed: 0_level_0,attribution
tokens,Unnamed: 1_level_1
Ġto,-1.081017
Ġthe,-0.177463
Ġcaption,-0.132096
gram,-0.128994
Ġis,-0.11268
Ġwhich,-0.094631
Ġgeneration,-0.094317
arial,-0.093841
Ġmachine,-0.075527
Ġgenerator,-0.070401


Using the notebook https://colab.research.google.com/drive/1lktilbL1IY4nBanlzCdP8TLsBNfUsl_U?usp=sharing, we can generate files containing the attributions over the entire dataset for both the positive and negative classes.

In [32]:
df_pos_word = pd.read_csv("/content/drive/MyDrive/cogs402longformer/results/papers/papers_attributions/word_emb_attrib_ones_papers.csv")
df_pos_posi = pd.read_csv("/content/drive/MyDrive/cogs402longformer/results/papers/papers_attributions/pos_emb_attrib_ones_papers.csv")
df_neg_word = pd.read_csv("/content/drive/MyDrive/cogs402longformer/results/papers/papers_attributions/word_emb_attrib_zero_papers.csv")
df_neg_posi = pd.read_csv("/content/drive/MyDrive/cogs402longformer/results/papers/papers_attributions/pos_emb_attrib_zero_papers.csv")

In [33]:
df_pos_word[:10]

Unnamed: 0,tokens,attribution
0,Ġof,1.337319
1,.,0.808798
2,Ġ,0.462594
3,Ġ(,0.440659
4,Ġin,0.410429
5,-,0.399644
6,Ġfor,0.252253
7,Ġ[,0.220462
8,Ġand,0.218836
9,Ġlearning,0.216856


In [34]:
df_pos_posi[:10]

Unnamed: 0,tokens,attribution
0,.,0.409151
1,-,0.125002
2,",",0.099744
3,Ġ(,0.056294
4,Ġthe,0.040966
5,:,0.034147
6,].,0.033753
7,),0.031152
8,Ġof,0.030003
9,).,0.028448


In [35]:
df_neg_word[:10]

Unnamed: 0,tokens,attribution
0,Ġto,0.686362
1,Ġthe,0.321515
2,Ġprogramming,0.273249
3,Ġcode,0.1783
4,Ġ.,0.123711
5,ĠThe,0.115306
6,Ġlanguages,0.107551
7,Ġcompiler,0.103048
8,ĠJava,0.096667
9,Ġlanguage,0.094121


In [36]:
df_neg_posi[:10]

Unnamed: 0,tokens,attribution
0,Ġ,0.16289
1,Ġa,0.155547
2,Ġto,0.119488
3,Ġis,0.087661
4,Ġand,0.068928
5,ĠThe,0.058366
6,Ġin,0.048117
7,Ġfor,0.04591
8,ĠIn,0.044377
9,Ġof,0.037033
