<a href="https://colab.research.google.com/github/danielhou13/cogs402longformer/blob/main/src/CaptumLongformerSequenceClassificationMultiembedding.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
import sys
sys.path.append('/content/drive/My Drive/{}'.format("cogs402longformer/"))

Import and install dependencies

In [3]:
pip install transformers --quiet

In [4]:
pip install captum --quiet

In [5]:
pip install datasets --quiet

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.

In [6]:
import os
os.environ['CUDA_LAUNCH_BLOCKING'] = "1"

In [7]:
from transformers import BertTokenizer, BertForSequenceClassification, BertConfig

from captum.attr import visualization as viz
from captum.attr import IntegratedGradients, LayerConductance, LayerIntegratedGradients
from captum.attr import configure_interpretable_embedding_layer, remove_interpretable_embedding_layer

import torch

In [8]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

Import model from Huggingface

In [9]:
from transformers import LongformerForSequenceClassification, LongformerTokenizer, LongformerConfig
# Hugging Face Hub id of the fine-tuned model (a local path to a saved model also works)
model_path = 'danielhou13/longformer-finetuned_papers_v2'
#model_path = 'danielhou13/longformer-finetuned-new-cogs402'

# load model
model = LongformerForSequenceClassification.from_pretrained(model_path, num_labels = 2)
model.to(device)
model.eval()
model.zero_grad()

# load tokenizer
tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")

In [10]:
# model2 = BertForSequenceClassification.from_pretrained("bert-base-uncased")

In [11]:
print(tokenizer)

PreTrainedTokenizer(name_or_path='allenai/longformer-base-4096', vocab_size=50265, model_max_len=4096, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'bos_token': AddedToken("<s>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'eos_token': AddedToken("</s>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'unk_token': AddedToken("<unk>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'sep_token': AddedToken("</s>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'pad_token': AddedToken("<pad>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'cls_token': AddedToken("<s>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'mask_token': AddedToken("<mask>", rstrip=False, lstrip=True, single_word=False, normalized=True)})


Create functions that give us the input ids and the position ids for the text we want to examine

In [12]:
def predict(inputs, position_ids=None, attention_mask=None):
    output = model(inputs,
                   position_ids=position_ids,
                   attention_mask=attention_mask)
    return output.logits

In [13]:
ref_token_id = tokenizer.pad_token_id # Token used to fill the baseline (reference) input
sep_token_id = tokenizer.sep_token_id # Separator token, appended to the end of the text
cls_token_id = tokenizer.cls_token_id # Token prepended to the start of the text

In [16]:
max_length = 2046
def construct_input_ref_pair(text, ref_token_id, sep_token_id, cls_token_id):

    text_ids = tokenizer.encode(text, truncation = True, add_special_tokens=False, max_length = max_length)
    # construct input token ids
    input_ids = [cls_token_id] + text_ids + [sep_token_id]
    # construct reference token ids 
    ref_input_ids = [cls_token_id] + [ref_token_id] * len(text_ids) + [sep_token_id]

    return torch.tensor([input_ids], device=device), torch.tensor([ref_input_ids], device=device), len(text_ids)

def construct_input_ref_pos_id_pair(input_ids):
    seq_length = input_ids.size(1)
    position_ids = torch.arange(seq_length, dtype=torch.long, device=device)
    # we could potentially also use random permutation with `torch.randperm(seq_length, device=device)`
    ref_position_ids = torch.zeros(seq_length, dtype=torch.long, device=device)

    position_ids = position_ids.unsqueeze(0).expand_as(input_ids)
    ref_position_ids = ref_position_ids.unsqueeze(0).expand_as(input_ids)
    return position_ids, ref_position_ids

def construct_attention_mask(input_ids):
    return torch.ones_like(input_ids)
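
To make the layout of the input and its baseline concrete, here is a minimal sketch on plain Python lists with made-up token ids, so it needs neither the tokenizer nor a GPU. `build_pair` is a hypothetical stand-in for `construct_input_ref_pair`, and the special-token ids 0, 1, and 2 are assumed RoBERTa-style values for `<s>`, `<pad>`, and `</s>`; check them against `tokenizer.cls_token_id`, `tokenizer.pad_token_id`, and `tokenizer.sep_token_id` before relying on them.

```python
# Toy illustration with made-up token ids (no tokenizer or GPU needed).
CLS_ID, PAD_ID, SEP_ID = 0, 1, 2  # assumed special-token ids, for illustration only

def build_pair(text_ids, max_length=2046):
    """Hypothetical stand-in for construct_input_ref_pair, on plain lists."""
    text_ids = text_ids[:max_length]                 # truncate to max_length tokens
    input_ids = [CLS_ID] + text_ids + [SEP_ID]       # real tokens wrapped in <s> ... </s>
    ref_ids = [CLS_ID] + [PAD_ID] * len(text_ids) + [SEP_ID]  # baseline: <pad> everywhere
    return input_ids, ref_ids

toy_input_ids, toy_ref_ids = build_pair([713, 16, 10, 1296])
toy_position_ids = list(range(len(toy_input_ids)))   # 0, 1, 2, ... for the real input
toy_ref_position_ids = [0] * len(toy_input_ids)      # all-zero positions for the baseline

print(toy_input_ids)
print(toy_ref_ids)
```

The input and the baseline always have the same length, which integrated gradients needs in order to interpolate between them. With `max_length = 2046`, a long document produces 2046 + 2 = 2048 ids once the two special tokens are added.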

Import dataset and take one example from it for testing purposes

In [15]:
from datasets import load_dataset
cogs402_ds = load_dataset("danielhou13/cogs402dataset")["test"]

Using custom data configuration danielhou13--cogs402dataset-144b958ac1a53abb


Downloading and preparing dataset None/None (download: 157.87 MiB, generated: 311.56 MiB, post-processed: Unknown size, total: 469.43 MiB) to /root/.cache/huggingface/datasets/danielhou13___parquet/danielhou13--cogs402dataset-144b958ac1a53abb/0.0.0/7328ef7ee03eaf3f86ae40594d46a1cec86161704e02dd19f232d81eee72ade8...


Dataset parquet downloaded and prepared to /root/.cache/huggingface/datasets/danielhou13___parquet/danielhou13--cogs402dataset-144b958ac1a53abb/0.0.0/7328ef7ee03eaf3f86ae40594d46a1cec86161704e02dd19f232d81eee72ade8. Subsequent calls will reuse this data.


In [17]:
testval = 976
text = cogs402_ds['text'][testval]
label = cogs402_ds['labels'][testval]
print(label)

1


In [18]:
# Return class probabilities; the class to explain is selected by the `target` argument of `attribute`
def custom_forward(inputs, position_ids=None, attention_mask=None):
    preds = predict(inputs,
                   position_ids=position_ids,
                   attention_mask=attention_mask)
    return torch.softmax(preds, dim = 1)
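
Because `custom_forward` returns softmax probabilities rather than raw logits, the attributions explain the probability of the target class. The sketch below shows the same transformation on a single row of made-up logits in plain Python; the logit values and the label are assumptions for illustration, not model output.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    m = max(logits)                            # subtract the max to avoid overflow
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [-1.2, 2.3]     # made-up logits for class 0 and class 1
toy_target = 1               # hypothetical label
toy_probs = softmax(toy_logits)
print(toy_probs[toy_target])  # the scalar that would be attributed for this target
```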

In [19]:
input_ids, ref_input_ids, sep_id = construct_input_ref_pair(text, ref_token_id, sep_token_id, cls_token_id)
position_ids, ref_position_ids = construct_input_ref_pos_id_pair(input_ids)
attention_mask = construct_attention_mask(input_ids)

indices = input_ids[0].detach().tolist()
all_tokens = tokenizer.convert_ids_to_tokens(indices)

In [20]:
def summarize_attributions(attributions):
    attributions = attributions.sum(dim=-1).squeeze(0)
    attributions = attributions / torch.norm(attributions)
    return attributions
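
As a rough illustration of what `summarize_attributions` computes, the sketch below repeats the two steps on nested Python lists: sum each token's attribution over the embedding dimension, then divide by the L2 norm of the resulting vector. The attribution values are made up, and `summarize` is a hypothetical list-based counterpart of the tensor version.

```python
import math

def summarize(attr_rows):
    """Sum each token's attribution over the embedding dimension, then L2-normalize."""
    per_token = [sum(row) for row in attr_rows]        # one score per token
    norm = math.sqrt(sum(v * v for v in per_token))    # L2 norm over all tokens
    return [v / norm for v in per_token]

# Made-up attributions for 3 tokens with a 2-dimensional embedding.
toy_attr = [[0.2, 0.1], [-0.5, 0.1], [0.0, 0.0]]
toy_scores = summarize(toy_attr)
print(toy_scores)
```

After normalization the squared scores of a document sum to 1, so scores are comparable within a document but their magnitude depends on how many tokens share the attribution mass. A document whose attributions are all zero would cause a division by zero in this sketch.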

Perform Layer Integrated Gradients using the Longformer's word and position embeddings. According to Hugging Face, the Longformer does not use token type ids.

In [21]:
lig2 = LayerIntegratedGradients(custom_forward, \
                                [model.longformer.embeddings.word_embeddings, \
                                 model.longformer.embeddings.position_embeddings])

  "Multiple layers provided. Please ensure that each layer is"


In [22]:
attributions = lig2.attribute(inputs=(input_ids, position_ids),
                               baselines=(ref_input_ids, ref_position_ids),
                               target=label,
                               additional_forward_args=(attention_mask,),
                               n_steps=200,
                               internal_batch_size = 2)
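
For intuition about what the `attribute` call approximates, here is a small sketch of integrated gradients on a toy function with a known gradient. It uses a midpoint Riemann sum along the straight path from baseline to input; this quadrature rule, the function `f`, and all values are assumptions chosen for illustration and are not necessarily what Captum uses internally.

```python
def f(x):
    """Toy differentiable 'model': a made-up scalar function of three inputs."""
    return x[0] * x[1] + x[2] ** 2

def grad_f(x):
    """Analytic gradient of f."""
    return [x[1], x[0], 2.0 * x[2]]

def integrated_gradients(x, baseline, n_steps=200):
    """Midpoint Riemann-sum approximation of integrated gradients."""
    diff = [xi - bi for xi, bi in zip(x, baseline)]
    avg_grad = [0.0] * len(x)
    for step in range(n_steps):
        alpha = (step + 0.5) / n_steps                 # position on the straight path
        point = [bi + alpha * d for bi, d in zip(baseline, diff)]
        for i, g in enumerate(grad_f(point)):
            avg_grad[i] += g / n_steps                 # running mean of the gradient
    return [d * g for d, g in zip(diff, avg_grad)]     # (input - baseline) * mean gradient

toy_x, toy_baseline = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]
toy_ig = integrated_gradients(toy_x, toy_baseline)
print(toy_ig, sum(toy_ig), f(toy_x) - f(toy_baseline))
```

The attributions sum to the difference between the function value at the input and at the baseline, which is the completeness property of integrated gradients. A larger `n_steps` gives a finer approximation of the path integral at a proportionally higher compute cost.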

In [23]:
attributions_word = summarize_attributions(attributions[0])
attributions_position = summarize_attributions(attributions[1])
print(len(attributions_word))

2048


See which words had the strongest (most positive and most negative) attributions. Adjust `k` to change how many tokens are returned.

In [27]:
def get_topk_attributed_tokens(attrs, k=15):
    values, indices = torch.topk(attrs, k)
    top_tokens = [all_tokens[idx] for idx in indices]
    return top_tokens, values, indices

In [25]:
def get_botk_attributed_tokens(attrs, k=15):
    values, indices = torch.topk(attrs, k, largest=False)
    top_tokens = [all_tokens[idx] for idx in indices]
    return top_tokens, values, indices
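
The selection logic of the two helpers can be sketched without tensors. The example below uses `heapq` as a plain-Python stand-in for `torch.topk`; `top_k`, the tokens, and the scores are hypothetical and only illustrate how each result keeps the token, its index, and its score together.

```python
import heapq

def top_k(scores, tokens, k=3, largest=True):
    """Return (token, index, score) triples for the k largest or smallest scores."""
    pick = heapq.nlargest if largest else heapq.nsmallest
    idx = pick(k, range(len(scores)), key=lambda i: scores[i])
    return [(tokens[i], i, scores[i]) for i in idx]

toy_tokens = ["<s>", "ĠThis", "Ġwork", ".", "</s>"]
toy_scores = [0.01, 0.30, 0.11, -0.09, -0.07]   # made-up attribution scores

toy_top = top_k(toy_scores, toy_tokens, k=2)                 # most positive first
toy_bot = top_k(toy_scores, toy_tokens, k=2, largest=False)  # most negative first
print(toy_top)
print(toy_bot)
```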

Convert the values, their indices, and the tokens into a pandas DataFrame for visualization. The highest attributions are sorted from highest to lowest; the lowest attributions are sorted from lowest to highest.



In [28]:
import pandas as pd
top_words_start, top_words_val_start, top_word_ind_start = get_topk_attributed_tokens(attributions_word)
bot_words_start, bot_words_val_start, bot_word_ind_start = get_botk_attributed_tokens(attributions_word)


top_pos_start, top_pos_val_start, pos_ind_start = get_topk_attributed_tokens(attributions_position)
bot_pos_start, bot_pos_val_start, pos_ind_start2 = get_botk_attributed_tokens(attributions_position)

df_high = pd.DataFrame({'Word(Index), Attribution': ["{} ({}), {}".format(word, pos, round(val.item(),2)) for word, pos, val in zip(top_words_start, top_word_ind_start, top_words_val_start)],
                   'Position(Index), Attribution': ["{} ({}), {}".format(position, pos, round(val.item(),2)) for position, pos, val in zip(top_pos_start, pos_ind_start, top_pos_val_start)]})

df_low = pd.DataFrame({'Word(Index), Attribution': ["{} ({}), {}".format(word, pos, round(val.item(),2)) for word, pos, val in zip(bot_words_start, bot_word_ind_start, bot_words_val_start)],
                   'Position(Index), Attribution': ["{} ({}), {}".format(position, pos, round(val.item(),2)) for position, pos, val in zip(bot_pos_start, pos_ind_start2, bot_pos_val_start)]})
# df_start.style.apply(['cell_ids: False'])

# ['{}({})'.format(token, str(i)) for i, token in enumerate(all_tokens)]

In [29]:
df_high

Unnamed: 0,"Word(Index), Attribution","Position(Index), Attribution"
0,"ĠThe (1931), 0.3","ĠThis (152), 0.31"
1,"ĠThis (152), 0.3",". (151), 0.22"
2,"Ġtraining (1538), 0.26","Ġas (2), 0.18"
3,"ing (1026), 0.21",". (1072), 0.16"
4,". (1072), 0.13","Ġfrom (1926), 0.16"
5,"Ġwork (1565), 0.11","Ġtraining (1538), 0.14"
6,". (151), 0.1",". (1930), 0.13"
7,". (1168), 0.09","]. (1827), 0.12"
8,"Ġfrom (1926), 0.09","ĠThe (1931), 0.12"
9,"Ġtraining (1544), 0.08",". (1584), 0.09"


In [30]:
df_low

Unnamed: 0,"Word(Index), Attribution","Position(Index), Attribution"
0,"Ġthe (1040), -0.09","ing (1026), -0.45"
1,"ative (1995), -0.08","Ġsystems (514), -0.23"
2,"Ġto (1485), -0.07","Ġthe (1040), -0.1"
3,"ĠIn (1116), -0.06","ĠWe (1548), -0.09"
4,"Ġto (236), -0.06","ĠIn (1116), -0.08"
5,"ĠWe (1548), -0.06",", (95), -0.07"
6,"ions (1920), -0.06","</s> (2047), -0.07"
7,"Ġto (1381), -0.05",", (1120), -0.06"
8,"Ġto (1365), -0.05","Ġand (529), -0.05"
9,"Ġto (1436), -0.05","Č (1844), -0.05"


We notice that many tokens repeat within each example at different positions. The position of a token may carry important information, but we may also want to know which tokens, summed over all of their occurrences, have the most impact (most positive and most negative) on the prediction.

In [33]:
d = {"tokens":all_tokens, "attribution":attributions_word[:len(all_tokens)].cpu()}
df_attrib = pd.DataFrame(d)
aggregation_functions = {'attribution': 'sum'}
df_new = df_attrib.groupby(df_attrib['tokens']).aggregate(aggregation_functions)
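
The group-by step is equivalent to summing the scores of every occurrence of a token. A minimal plain-Python sketch of that aggregation follows; `sum_by_token` is a hypothetical helper and the token scores are made up.

```python
from collections import defaultdict

def sum_by_token(tokens, scores):
    """Sum attribution scores over all occurrences of each token."""
    totals = defaultdict(float)
    for tok, score in zip(tokens, scores):
        totals[tok] += score
    return dict(totals)

toy_tokens = ["Ġto", "Ġtraining", "Ġto", ".", "Ġtraining"]
toy_scores = [-0.05, 0.26, -0.06, 0.13, 0.08]    # made-up per-position scores

toy_totals = sum_by_token(toy_tokens, toy_scores)
toy_ranked = sorted(toy_totals.items(), key=lambda kv: kv[1], reverse=True)
print(toy_ranked)
```

Because the scores are summed rather than averaged, a frequent token can reach a large total even when each individual occurrence contributes little.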

In [34]:
highest_attrib_tokens = df_new.sort_values(by=['attribution'], ascending=False)
highest_attrib_tokens[:15]

Unnamed: 0_level_0,attribution
tokens,Unnamed: 1_level_1
Ġtraining,1.433494
.,1.016832
Ġof,0.867203
Ġcapt,0.710953
-,0.588795
Ġ[,0.523759
Ġin,0.502248
ing,0.484259
Ġon,0.355791
Ġimage,0.341716


In [35]:
lowest_attrib_tokens = df_new.sort_values(by=['attribution'])
lowest_attrib_tokens[:15]

Unnamed: 0_level_0,attribution
tokens,Unnamed: 1_level_1
Ġto,-1.154728
Ġthe,-0.363547
gram,-0.109309
arial,-0.09991
Ġcaption,-0.092737
ĠIn,-0.085074
Ġwhich,-0.080924
Ġgeneration,-0.080496
ĠTo,-0.068841
Ġgenerator,-0.063941


We can also aggregate the totals for each token over the entire dataset to find which tokens contribute most to predicting the positive class and which contribute most to predicting the negative class, with respect to both the word embeddings and the position embeddings.

In [37]:
from tqdm import tqdm
aggregate_attrib_zero = []
aggregate_attrib_ones = []
aggregate_pos_zero = []
aggregate_pos_ones = []

aggregation_function = {'attribution': 'sum'}

for i in tqdm(range(len(cogs402_ds)), position = 0, leave = True):
  text = cogs402_ds[i]['text']
  label = cogs402_ds[i]['labels']
  input_ids, ref_input_ids, sep_id = construct_input_ref_pair(text, ref_token_id, sep_token_id, cls_token_id)
  position_ids, ref_position_ids = construct_input_ref_pos_id_pair(input_ids)
  attention_mask = construct_attention_mask(input_ids)

  indices = input_ids[0].detach().tolist()
  all_tokens = tokenizer.convert_ids_to_tokens(indices)

  attributions2 = lig2.attribute(inputs=(input_ids, position_ids),
                               baselines=(ref_input_ids, ref_position_ids),
                               target=label,
                               additional_forward_args=(attention_mask,),
                               n_steps=15,
                               internal_batch_size = 2)
  attributions_word = summarize_attributions(attributions2[0])
  attributions_position = summarize_attributions(attributions2[1])

  d = {"tokens":all_tokens, "attribution":attributions_word[:len(all_tokens)].cpu()}  
  d2 = {"tokens":all_tokens, "attribution":attributions_position[:len(all_tokens)].cpu()}  
  
  df_attrib = pd.DataFrame(d)
  df_attrib2 = pd.DataFrame(d2)

  df_attrib = df_attrib.groupby(df_attrib['tokens']).aggregate(aggregation_function)
  df_attrib2 = df_attrib2.groupby(df_attrib2['tokens']).aggregate(aggregation_function)

  if label == 0:
    aggregate_attrib_zero.append(df_attrib)
    aggregate_pos_zero.append(df_attrib2)
  else:
    aggregate_attrib_ones.append(df_attrib)
    aggregate_pos_ones.append(df_attrib2)

100%|██████████| 1070/1070 [2:25:44<00:00,  8.17s/it]


In [38]:
def combinedataframe(listframes, aggregation_func):
  df_attrib = pd.concat(listframes)
  df_attrib = df_attrib.reset_index(level=0)
  df_attrib = df_attrib.groupby(df_attrib['tokens']).aggregate(aggregation_func)
  df_attrib['attribution'] = df_attrib['attribution'].div(len(listframes))
  highest_attrib_tokens_all = df_attrib.sort_values(by=['attribution'], ascending=False)
  return highest_attrib_tokens_all
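
What `combinedataframe` computes can be sketched with dictionaries: add up the per-document totals for each token, then divide by the number of documents in the list. `combine` and the per-document totals below are hypothetical and stand in for the list of grouped DataFrames.

```python
from collections import defaultdict

def combine(per_doc_totals):
    """Average per-document token totals over the number of documents."""
    combined = defaultdict(float)
    for doc in per_doc_totals:
        for tok, total in doc.items():
            combined[tok] += total
    n_docs = len(per_doc_totals)
    averaged = {tok: v / n_docs for tok, v in combined.items()}
    # Sort from highest to lowest average attribution.
    return sorted(averaged.items(), key=lambda kv: kv[1], reverse=True)

# Made-up per-document totals for three documents of one class.
toy_docs = [{"Ġof": 0.9, "Ġto": -0.3}, {"Ġof": 0.5}, {"Ġto": -0.6, "Ġcode": 0.3}]
toy_combined = combine(toy_docs)
print(toy_combined)
```

The divisor is the number of documents in the class, not the number of documents that contain the token, so a token missing from a document effectively counts as zero there. This favours tokens that appear in most documents.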

In [39]:
df_attrib_zero = combinedataframe(aggregate_attrib_zero, aggregation_function)
df_attrib_ones = combinedataframe(aggregate_attrib_ones, aggregation_function)
df_pos_zero = combinedataframe(aggregate_pos_zero, aggregation_function)
df_pos_ones = combinedataframe(aggregate_pos_ones, aggregation_function)

Here we get the attributions for the negative class with respect to the word embeddings.

In [40]:
df_attrib_zero[:10]

Unnamed: 0_level_0,attribution
tokens,Unnamed: 1_level_1
Ġto,0.686362
Ġthe,0.321515
Ġprogramming,0.273249
Ġcode,0.1783
Ġ.,0.123711
ĠThe,0.115306
Ġlanguages,0.107551
Ġcompiler,0.103048
ĠJava,0.096667
Ġlanguage,0.094121


Here we get the highest attributions for the positive class with respect to word embeddings.

In [41]:
df_attrib_ones[:10]

Unnamed: 0_level_0,attribution
tokens,Unnamed: 1_level_1
Ġof,1.337319
.,0.808798
Ġ,0.462594
Ġ(,0.440659
Ġin,0.410429
-,0.399644
Ġfor,0.252253
Ġ[,0.220462
Ġand,0.218836
Ġlearning,0.216856


Here we have the highest attributions for the negative class with respect to positional embeddings.

In [42]:
df_pos_zero[:10]

Unnamed: 0_level_0,attribution
tokens,Unnamed: 1_level_1
Ġ,0.16289
Ġa,0.155547
Ġto,0.119488
Ġis,0.087661
Ġand,0.068928
ĠThe,0.058366
Ġin,0.048117
Ġfor,0.04591
ĠIn,0.044377
Ġof,0.037033


Here we have the highest attributions for the positive class with respect to positional embeddings.

In [43]:
df_pos_ones[:10]

Unnamed: 0_level_0,attribution
tokens,Unnamed: 1_level_1
.,0.409151
-,0.125002
",",0.099744
Ġ(,0.056294
Ġthe,0.040966
:,0.034147
].,0.033753
),0.031152
Ġof,0.030003
).,0.028448


Save each pandas DataFrame to a CSV file so the results can be reloaded later without rerunning the attribution over the entire dataset.

In [44]:
df_attrib_zero.to_csv('/content/drive/MyDrive/cogs402longformer/results/word_emb_attrib_zero_papers.csv')  
df_attrib_ones.to_csv('/content/drive/MyDrive/cogs402longformer/results/word_emb_attrib_ones_papers.csv')  
df_pos_zero.to_csv('/content/drive/MyDrive/cogs402longformer/results/pos_emb_attrib_zero_papers.csv')  
df_pos_ones.to_csv('/content/drive/MyDrive/cogs402longformer/results/pos_emb_attrib_ones_papers.csv')  