<a href="https://colab.research.google.com/github/danielhou13/cogs402longformer/blob/main/src/Attributions_Longformer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook adapts the [Captum tutorial for question answering](https://captum.ai/tutorials/Bert_SQUAD_Interpret) to a Longformer sequence classification task. It uses the model's embedding layer to compute token attributions for the examples of your choice, or for the entire dataset if needed. The attributions let us visualize which tokens most influence the model's prediction and find the k tokens that contribute most to correct and to incorrect predictions.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## Import dependencies

In [None]:
pip install transformers --quiet


In [None]:
pip install captum --quiet


In [None]:
pip install datasets --quiet


In [None]:
pip install nltk --quiet

In [None]:
import os
os.environ['CUDA_LAUNCH_BLOCKING'] = "1"

In [None]:
from captum.attr import visualization as viz
from captum.attr import IntegratedGradients, LayerConductance, LayerIntegratedGradients
from captum.attr import configure_interpretable_embedding_layer, remove_interpretable_embedding_layer

import torch
import pandas as pd

In [None]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

## Import model

Here we import the model and tokenizer and move the model to the GPU when one is available. Change the model path and tokenizer to whichever ones you wish to use.

In [None]:
from transformers import LongformerForSequenceClassification, LongformerTokenizer, LongformerConfig
# replace the model path below with the Hub ID or local path of your own fine-tuned model
model_path = 'danielhou13/longformer-finetuned_papers_v2'
#model_path = 'danielhou13/longformer-finetuned-news-cogs402'

# load model
model = LongformerForSequenceClassification.from_pretrained(model_path, num_labels = 2)
model.to(device)
model.eval()
model.zero_grad()

# load tokenizer
tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")

Some weights of the model checkpoint at danielhou13/longformer-finetuned_papers_v2 were not used when initializing LongformerForSequenceClassification: ['longformer.embeddings.position_ids']
- This IS expected if you are initializing LongformerForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LongformerForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).



Create functions that give us the input ids and the position ids for the text we want to examine along with the baselines for integrated gradients.

In [None]:
ref_token_id = tokenizer.pad_token_id # A token used for generating token reference
sep_token_id = tokenizer.sep_token_id # A token used as the end of the text.
cls_token_id = tokenizer.cls_token_id # A token used for prepending to the text

Please adjust `max_length` for your project. It should be the sequence length you want minus 2, since we add the CLS token at the beginning and the separator token at the end.

In [None]:
max_length = 2046
def construct_input_ref_pair(text, ref_token_id, sep_token_id, cls_token_id):

    text_ids = tokenizer.encode(text, truncation = True, add_special_tokens=False, max_length = max_length)
    # construct input token ids
    input_ids = [cls_token_id] + text_ids + [sep_token_id]
    # construct reference token ids 
    ref_input_ids = [cls_token_id] + [ref_token_id] * len(text_ids) + [sep_token_id]

    return torch.tensor([input_ids], device=device), torch.tensor([ref_input_ids], device=device), len(text_ids)

def construct_input_ref_pos_id_pair(input_ids):
    seq_length = input_ids.size(1)

    #taken from the longformer implementation
    mask = input_ids.ne(ref_token_id).int()
    incremental_indices = torch.cumsum(mask, dim=1).type_as(mask) * mask
    position_ids = incremental_indices.long().squeeze() + ref_token_id

    # we could potentially also use random permutation with `torch.randperm(seq_length, device=device)`
    ref_position_ids = torch.zeros(seq_length, dtype=torch.long, device=device)

    position_ids = position_ids.unsqueeze(0).expand_as(input_ids)
    position_ids = position_ids[:, :seq_length]
    ref_position_ids = ref_position_ids.unsqueeze(0).expand_as(input_ids)
    return position_ids, ref_position_ids
    
def construct_attention_mask(input_ids):
    return torch.ones_like(input_ids)
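
To make the position-id construction easier to follow, here is a dependency-free sketch of the same cumulative-sum idea for a single sequence. The function name `position_ids_from_input_ids`, the toy token ids, and the padding id of 1 are illustrative assumptions, not values read from the tokenizer above.

```python
def position_ids_from_input_ids(input_ids, padding_idx):
    """Pure-Python sketch of the cumulative-sum trick for one sequence.

    Non-padding tokens get positions padding_idx + 1, padding_idx + 2, ...
    Padding tokens keep padding_idx as their position.
    """
    position_ids = []
    running = 0  # number of non-padding tokens seen so far
    for token_id in input_ids:
        if token_id == padding_idx:
            position_ids.append(padding_idx)
        else:
            running += 1
            position_ids.append(running + padding_idx)
    return position_ids


PAD = 1  # assumed padding id (hypothetical, for illustration only)
toy_ids = [0, 713, 16, 2, PAD, PAD]  # hypothetical ids: start, two words, end, padding
toy_positions = position_ids_from_input_ids(toy_ids, PAD)
print(toy_positions)
```

In the notebook's own function the inputs contain no padding, so every token simply receives a consecutive position offset by the padding index, while the baseline uses position 0 everywhere.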

### Import Dataset

Here we import the papers dataset

In [None]:
from datasets import load_dataset
import numpy as np
cogs402_ds = load_dataset("danielhou13/cogs402dataset")["validation"]


Here we import the news dataset

In [None]:
# cogs402_ds = load_dataset("danielhou13/cogs402dataset2")["validation"]

## Getting the Attributions

We define a `predict` helper that returns the model's raw logits, and a custom forward function that returns the softmaxed logits, i.e. the class probabilities that the model uses for prediction.

In [None]:
def predict(inputs, position_ids=None, attention_mask=None):
    output = model(inputs,
                   position_ids=position_ids,
                   attention_mask=attention_mask)
    return output.logits

In [None]:
# returns class probabilities; the class to attribute is chosen later via `target` in lig.attribute (1 = positive class, 0 = negative class)
def custom_forward(inputs, position_ids=None, attention_mask=None):
    preds = predict(inputs,
                   position_ids=position_ids,
                   attention_mask=attention_mask
                   )
    return torch.softmax(preds, dim = 1)
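
As a small illustration of what the forward function hands to the attribution method, the sketch below applies a softmax to two hypothetical logits and picks out the probability of class 1. The logit values are made up for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, -1.0]      # hypothetical logits for classes 0 and 1
probs = softmax(toy_logits)   # class probabilities, summing to 1
target_prob = probs[1]        # the quantity attributed when target=1
print(probs, target_prob)
```

Attributing the probability rather than the raw logit means the scores explain changes in the model's confidence in the chosen class.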

A helper function to summarize the attributions for each token in the sequence. For a single example the attribution output has shape (1, seq_len, model_embedding_size), so this function sums over the embedding dimension, drops the batch dimension, and L2-normalizes the result into an array of shape (seq_len).

In [None]:
def summarize_attributions(attributions):
    attributions = attributions.sum(dim=-1).squeeze(0)
    attributions = attributions / torch.linalg.norm(attributions)
    return attributions
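
The same summarization can be sketched without tensors. The helper name `summarize_attributions_py` and the toy matrix are assumptions made for illustration; each inner list stands for one token's attribution over a 3-dimensional embedding.

```python
import math


def summarize_attributions_py(attributions):
    """Sum each token's row over the embedding dimension, then L2-normalize."""
    per_token = [sum(row) for row in attributions]
    norm = math.sqrt(sum(v * v for v in per_token))
    return [v / norm for v in per_token]


# hypothetical attributions: 3 tokens, embedding size 3
toy_attributions = [
    [0.5, 0.5, 1.0],   # sums to  2.0
    [-1.0, 0.0, 0.0],  # sums to -1.0
    [0.0, 0.0, 0.0],   # sums to  0.0
]
summary = summarize_attributions_py(toy_attributions)
print(summary)
```

Because of the normalization, scores are comparable within an example but not directly across examples. If every attribution were zero the norm would be zero as well, which this sketch does not guard against.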

Perform Layer Integrated Gradients using the longformer's embeddings.

In [None]:
lig = LayerIntegratedGradients(custom_forward, model.longformer.embeddings)
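
For intuition about what integrated gradients computes, here is a minimal sketch on a toy two-variable function with a hand-written gradient. The function `f`, the midpoint-rule approximation, and the step count are assumptions chosen for simplicity; Captum's own approximation method may differ.

```python
def f(x):
    """Toy differentiable function standing in for the model output."""
    return x[0] ** 2 + 3.0 * x[0] * x[1]


def grad_f(x):
    """Analytic gradient of f."""
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]


def integrated_gradients(x, baseline, n_steps=50):
    """Approximate integrated gradients with a midpoint Riemann sum."""
    dims = len(x)
    avg_grad = [0.0] * dims
    for step in range(n_steps):
        alpha = (step + 0.5) / n_steps  # midpoint of the current sub-interval
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = grad_f(point)
        for i in range(dims):
            avg_grad[i] += g[i] / n_steps
    # scale the averaged gradient by the input-minus-baseline difference
    return [(xi - b) * gi for xi, b, gi in zip(x, baseline, avg_grad)]


x = [1.0, 2.0]
baseline = [0.0, 0.0]
attrs = integrated_gradients(x, baseline)
# completeness check: attributions should sum to f(x) - f(baseline)
delta = sum(attrs) - (f(x) - f(baseline))
print(attrs, delta)
```

The `delta` here plays the same role as the convergence delta that Captum can return: a value close to zero suggests that the number of steps is sufficient for the attributions to account for the change in output between baseline and input.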

This function builds the example and baseline inputs needed for integrated gradients and adds the resulting attributions to our visualization tool. It also stores the attributions and tokens for each example in dictionaries so that we can examine the attribution scores further later on. The whitespace marker (`Ġ`) that the Longformer tokenizer adds is removed later, when we display the tokens. More information about the integrated gradients function can be found [here](https://captum.ai/api/layer.html#layer-integrated-gradients).

In [None]:
vis_data_records = []
all_attributions = {}
all_tokens = {}
all_deltas = {}

In [None]:
# Takes in dataset and example number
def get_token_attributions(dataset, example):
  text = dataset['text'][example]
  label = dataset['labels'][example]

  # get the inputs, position ids, attention mask, and the baselines
  input_ids, ref_input_ids, sep_id = construct_input_ref_pair(text, ref_token_id, sep_token_id, cls_token_id)
  position_ids, ref_position_ids = construct_input_ref_pos_id_pair(input_ids)
  attention_mask = construct_attention_mask(input_ids)

  #get the tokens
  indices = input_ids[0].detach().tolist()
  all_tokens_curr = tokenizer.convert_ids_to_tokens(indices)
  all_tokens[str(example)] = all_tokens_curr

  #perform integrated gradients
  attributions, delta = lig.attribute(inputs=input_ids,
                                    baselines=ref_input_ids,
                                    return_convergence_delta=True,
                                    additional_forward_args=(position_ids, attention_mask),
                                    target=1,
                                    n_steps=500,
                                    internal_batch_size = 2)

  # We want one value for every token.
  attributions_sum = summarize_attributions(attributions)

  # store the values in our dictionary
  all_attributions[str(example)] = attributions_sum
  all_deltas[str(example)] = delta

  # get the score for our visualization
  score = predict(input_ids, position_ids, attention_mask)

  # storing couple samples in an array for visualization purposes
  # requires array of attributions, prediction score, predicted class, true class 
  # the label you want your attributions to associate positive with, the attribution score
  # the tokens, and the delta if you have it.
  vis_data_records.append(viz.VisualizationDataRecord(
                        attributions_sum,
                        torch.softmax(score, dim = 1).max(),
                        torch.argmax(torch.softmax(score, dim = 1)),
                        label,
                        str(1),
                        attributions_sum.sum(),       
                        all_tokens_curr,
                        delta)
  )

Here we take some examples from the Papers dataset.

In [None]:
# get_token_attributions(cogs402_ds, 976)
get_token_attributions(cogs402_ds, 891)
# get_token_attributions(cogs402_ds, 589)
# get_token_attributions(cogs402_ds, 605)
# get_token_attributions(cogs402_ds, 148)

Here we take some examples from the News dataset.

In [None]:
# get_token_attributions(cogs402_ds, 102)
# get_token_attributions(cogs402_ds, 1168)
# # get_token_attributions(cogs402_ds, 2307)
# # get_token_attributions(cogs402_ds, 2359)

This cell displays our attributions in an easy-to-read form: each token is shaded according to its attribution. Green marks positive attributions (the token pushes the model toward predicting the positive class), while red marks negative attributions (the token pushes the model toward the negative class).

In [None]:
# # storing couple samples in an array for visualization purposes
# score_vis = viz.VisualizationDataRecord(
#                         attributions_sum,
#                         torch.softmax(score, dim = 1).max(),
#                         torch.argmax(torch.softmax(score, dim = 1)),
#                         label,
#                         str(1),
#                         attributions_sum.sum(),       
#                         all_tokens,
#                         delta)

print('\033[1m', 'Visualization For Score', '\033[0m')
_ = viz.visualize_text(vis_data_records)

Visualization For Score


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
1.0,0 (1.00),1.0,-0.1,"#s ar X iv : 17 05 . 0 39 16 v 1 [ cs . MA ] 10 May 2017 Under consideration for publication in Theory and Practice of Logic Programming 1 Sol ving Dist ributed Con str aint Optim ization Problems Using Logic Programming Tie p Le , Tr an Cao Son , En ric o P onte lli , William Ye oh Computer Science Department New Mexico State University Las Cru ces , NM , 8 800 1 , USA E - mail : { tile , t son , ep on tell , w ye oh } @ cs . n ms u . edu submitted 1 January 2003 ; revised 1 January 2003 ; accepted 1 January 2003 Abstract This paper explores the use of Answer Set Programming ( AS P ) in solving Dist ributed Con str aint Optim ization Problems ( DC OP s ). The paper provides the following novel contributions : ( 1 ) It shows how one can formulate DC OP s as logic programs ; ( 2 ) It introduces ASP - DP OP , the first DC OP algorithm that is based on logic programming ; ( 3 ) It experiment ally shows that ASP - DP OP can be up to two orders of magnitude faster than DP OP ( its imperative programming counterpart ) as well as solve some problems that DP OP fails to solve , due to memory limitations ; and ( 4 ) It demonstrates the applic ability of ASP in a wide array of multi - agent problems currently modeled as DC OP s . 1 Under consideration in Theory and Practice of Logic Programming ( T PL P ). KEY WOR DS : DC OP ; DP OP ; Logic Programming ; ASP 1 Introduction Dist ributed Con str aint Optim ization Problems ( DC OP s ) are optimization problems where agents need to coordinate the assignment of values to their âĢ ľ local âĢ Ŀ variables to maximize the overall sum of resulting constraint utilities ( Mod i et al . 2005 ; Pet cu and F alt ings 2005 a ; Ma iller and Less er 2004 ; Ye oh and Yok oo 2012 ). The process is subject to limitations on the communication capabilities of the agents ; in particular , each agent can only exchange information with neighboring agents within a given top ology . 
DC OP s are well - su ited for modeling multi - agent coordination and resource allocation problems , where the primary interactions are between local subs ets of agents . Researchers have used DC OP s to model various problems , such as the distributed scheduling of meetings ( Ma hes war an et al . 2004 ; Z ivan et al . 2014 ), distributed allocation of targets to sensors in a network ( Far inelli et al . 2008 ), distributed allocation of resources in disaster evacuation scenarios ( L ass et al . 2008 ), the distributed management of power distribution networks ( K umar et al . 2009 ; J ain et al . 2012 ), the distributed generation of coalition structures ( U eda et al . 2010 ) and the distributed coordination of logistics operations ( Le Ì ģ a ute Ì ģ and F alt ings 2011 ). 1 This article extends our previous conference paper ( Le et al . 2015 ) in the following manner : ( 1 ) It provides a more thorough description of the ASP - DP OP algorithm ; ( 2 ) It elabor ates on the algorithm âĢ Ļ s theoretical properties with complete proofs ; and ( 3 ) It includes additional experimental results . Č 2 Tie p Le , Tr an Cao Son , En ric o P onte lli , and William Ye oh The field has matured considerably over the past decade , since the seminal AD OP T paper ( Mod i et al . 2005 ), as researchers continue to develop more sophisticated solving algorithms . The majority of the DC OP resolution algorithms can be classified in one of three classes : ( 1 ) Search - based algorithms , like AD OP T ( Mod i et al . 2005 ) and its variants ( Ye oh et al . 2009 ; Ye oh et al . 2010 ; Gutierrez et al . 2011 ; Gutierrez et al . 2013 ), AFB ( G ers h man et al . 2009 ), and MGM ( Ma hes war an et al . 2004 ), where the agents enumer ate combinations of value assignments in a decentralized manner ; ( 2 ) In ference - based algorithms , like DP OP ( P etc u and F alt ings 2005 a ) and its variants ( P etc u and F alt ings 2005 b ; Pet cu and F alt ings 2007 ; Pet cu et al . 
2007 ; Pet cu et al . 2008 ), max - sum ( Far inelli et al . 2008 ), and Action G DL ( V iny als et al . 2011 ), where the agents use dynamic programming techniques to propagate aggreg ated information to other agents ; and ( 3 ) Sam pling - based algorithms , like D UCT ( Ott ens et al . 2012 ) and D - G ib bs ( N guyen et al . 2013 ; Fi ore tto et al . 2014 ), where the agents sample the search space in a decentralized manner . The existing algorithms have been designed and developed almost exclusively using imperative programming techniques , where the algorithms define a control flow , that is , a sequence of commands to be executed . In addition , the local sol ver employed by each agent is an âĢ ľ ad - h oc âĢ Ŀ implementation . In this paper , we are interested in investigating the benefits of using decl ar ative programming techniques to solve DC OP s , along with the use of a general constraint sol ver , used as a black box , as each agent âĢ Ļ s local constraint sol ver . Specifically , we propose an integration of Dist ributed Pse udo - tree Optim ization Procedure ( DP OP ) ( P etc u and F alt ings 2005 a ), a popular DC OP algorithm , with Answer Set Programming ( AS P ) ( N iem ela Ì Ī 1999 ; Mare k and Tr us z cz yn Ì ģ ski 1999 ) as the local constraint sol ver of each agent . This paper provides the first step in brid ging the areas of DC OP s and ASP ; in the process , we offer novel contributions to both the DC OP field as well as the ASP field . For the DC OP community , we demonstrate that the use of ASP as a local constraint sol ver provides a number of benefits , including the ability to capitalize on ( i ) the highly expressive ASP language to more concise ly define input instances ( e . g ., by representing constraint utilities as implicit functions instead of explicitly enumer ating their extensions ) and ( ii ) the highly optimized ASP sol vers to exploit problem structure ( e . g ., propag ating hard constraints to ensure consistency ). 
For the ASP community , the paper makes the equally important contribution of increasing the applic ability of ASP to model and solve a wide array of multi - agent coordination and resource allocation problems , currently modeled as DC OP s . Furthermore , it also demonstrates that general , off - the - she lf ASP sol vers , which are continuously hon ed and improved , can be coupled with distributed message passing protocols to outper form specialized imperative sol vers . The paper is organized as follows . In Section 2 , we review the basic definitions of DC OP s , the DP OP algorithm , and ASP . In Section 3 , we describe in detail the structure of the novel ASP - based DC OP sol ver , called ASP - DP OP , and its implementation . Section 4 provides an analysis of the properties of ASP - DP OP , including proofs of sound ness and comple teness of ASP - DP OP . Section 5 provides some experimental results , while Section 6 reviews related work . Finally , Section 7 provides conclusions and indications for future work . Č S olving Dist ributed Con str aint Optim ization Problems Using Logic Programming 3 2 Background In this section , we present an overview of DC OP s , we describe DP OP , a complete distributed algorithm to solve DC OP s , and provide some fundamental definitions of ASP . 2 . 1 Dist ributed Con str aint Optim ization Problems A Dist ributed Con str aint Optim ization Problem ( DC OP ) ( Mod i et al . 2005 ; Pet cu and F alt ings 2005 a ; Ma iller and Less er 2004 ; Ye oh and Yok oo 2012 ) can be described as a tuple M = h X , D , F , A , Î± i where : âĢ¢ X = { x 1 , . . . , x n } is a finite set of ( dec ision ) variables ; âĢ¢ D = { D 1 , . . . , D n } is a set of finite domains , where Di is the domain of the variable x i âĪ Ī X , for 1 âī¤ i âī¤ n ; âĢ¢ F = { f 1 , . . . , f m } is a finite set of constraints , where f j is a k j - ary function f j : Dj 1 ÃĹ Dj 2 ÃĹ . . . 
ÃĹ Dj kj 7 âĨĴ R âĪ ª { âĪĴ âĪ ŀ } that specifies the utility of each combination of values of variables in its scope ; the scope is den oted by sc p ( f j ) = { x j 1 , . . . , x j kj }; 2 âĢ¢ A = { a 1 , . . . , ap } is a finite set of agents ; and âĢ¢ Î± : X 7 âĨĴ A maps each variable to an agent . We say that a variable x is owned by an agent a if Î± ( x ) = a . We denote with Î± i the set of all variables that are owned by an agent a i , i . e ., Î± i = { x âĪ Ī X | Î± ( x ) = a i } . Each constraint in F can be either hard , indicating that some value combinations result in a utility of âĪĴ âĪ ŀ and must be avoided , or soft , indicating that all value combinations result in a finite utility and need not be avoided . A value assignment is a ( partial or complete ) function x that maps variables of X to values in D such that , if x ( xi ) is defined , then x ( xi ) âĪ Ī Di for i = 1 , . . . , n . For the sake of simplicity , and with a slight abuse of notation , we will often denote x ( xi ) simply with x i . Given a constraint f j and a complete value assignment x for all decision variables , we denote with x f j the projection of x to the variables in sc p ( f j ); we refer to this as a partial value assignment for f j . For a DC OP M , we denote with #/s"


## Further Examination of the Attributions

Next we might want to look in depth at the attribution scores for each token of an example. We saved the attributions for the examples we looked at above, so we can easily retrieve them. We also retrieve the tokens, because we want to know which token each attribution belongs to.

Both have length seq_len.

In [None]:
example = 891
attributions_sum = all_attributions[f"{example}"]
all_tokens2 = all_tokens[f"{example}"]

These functions return the tokens with the strongest (most positive and most negative) attributions. Change `k` to the number of tokens you wish to see. Each function takes the attributions and tokens we grabbed in the previous cell and returns three values: the top-k (or bottom-k) tokens, their attribution values, and their positions.

Note: the attributions are computed with respect to the positive class, so the tokens that most helped the model predict the negative class appear among the bottom-k attributed tokens.

In [None]:
def get_topk_attributed_tokens(attrs, all_tokens, k=20):
    values, indices = torch.topk(attrs, k)
    top_tokens = [all_tokens[idx] for idx in indices]
    return top_tokens, values, indices

In [None]:
def get_botk_attributed_tokens(attrs, all_tokens, k=20):
    values, indices = torch.topk(attrs, k, largest=False)
    top_tokens = [all_tokens[idx] for idx in indices]
    return top_tokens, values, indices
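
The two helpers above can be mirrored with plain lists, which may help when checking the results by hand. The helper name `top_and_bottom_k`, the tokens, and the scores below are hypothetical.

```python
def top_and_bottom_k(tokens, attrs, k=3):
    """Return the k most positive and k most negative (token, index, score) triples."""
    # indices sorted from highest to lowest attribution
    order = sorted(range(len(attrs)), key=lambda i: attrs[i], reverse=True)
    top = [(tokens[i], i, attrs[i]) for i in order[:k]]
    # walk the same ordering backwards to get the most negative first
    bottom = [(tokens[i], i, attrs[i]) for i in order[::-1][:k]]
    return top, bottom


toy_tokens = ["<s>", "Ġalgorithms", "Ġlogic", "Ġprogramming", "Ġsearch", "</s>"]
toy_attrs = [0.0, 0.6, -0.2, -0.7, 0.3, 0.05]
top, bottom = top_and_bottom_k(toy_tokens, toy_attrs, k=2)
print(top)
print(bottom)
```

With attributions computed for the positive class, `top` lists tokens pushing toward class 1 and `bottom` lists tokens pushing toward class 0.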

Convert the tokens, their attribution values, and their positions into pandas DataFrames for display. `df_high` is sorted from the highest attribution to the lowest; `df_low`, which holds the most negative attributions, goes from lowest to highest.

In [None]:
top_words_start, top_words_val_start, top_word_ind_start = get_topk_attributed_tokens(attributions_sum.cpu(), all_tokens2)
bot_words_start, bot_words_val_start, bot_word_ind_start = get_botk_attributed_tokens(attributions_sum.cpu(), all_tokens2)

df_high = pd.DataFrame({'Word': top_words_start, 'index':top_word_ind_start, 'attribution': top_words_val_start})

df_low = pd.DataFrame({'Word': bot_words_start, 'index':bot_word_ind_start, 'attribution': bot_words_val_start})
# df_start.style.apply(['cell_ids: False'])

# ['{}({})'.format(token, str(i)) for i, token in enumerate(all_tokens)]

Here we display our top k positively and negatively attributed tokens for our example.

In [None]:
df_high['Word'] = df_high['Word'].str.replace('Ġ', '')
df_high

Unnamed: 0,Word,index,attribution
0,algorithms,720,0.377717
1,algorithms,695,0.280976
2,algorithms,808,0.227178
3,algorithms,704,0.206855
4,search,948,0.16128
5,Search,717,0.153568
6,solving,694,0.147625
7,researchers,688,0.135312
8,algorithm,616,0.135013
9,algorithms,908,0.118958


In [None]:
df_low['Word'] = df_low['Word'].str.replace('Ġ', '')
df_low

Unnamed: 0,Word,index,attribution
0,Programming,32,-0.336017
1,Programming,48,-0.268119
2,programs,178,-0.198517
3,Programming,136,-0.176398
4,Computer,68,-0.167329
5,Programming,286,-0.150111
6,programming,229,-0.146878
7,programming,200,-0.125192
8,of,30,-0.117794
9,Logic,47,-0.112482


In [None]:
d = {"tokens":all_tokens2, "attribution":attributions_sum[:len(all_tokens2)].cpu()}

Many tokens repeat within an example at different positions. Position may well matter for the attributions, but if we want a view based strictly on the token itself, we can sum the attributions of all occurrences of each token to get an aggregate attribution per token type.

In [None]:
df_attrib = pd.DataFrame(d)
aggregation_functions = {'attribution': 'sum'}
df_new = df_attrib.groupby(df_attrib['tokens']).aggregate(aggregation_functions)
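
The group-and-sum step above is equivalent to accumulating scores in a dictionary keyed by token. A minimal sketch with made-up tokens and scores (the helper name `aggregate_by_token` is hypothetical):

```python
from collections import defaultdict


def aggregate_by_token(tokens, attrs):
    """Sum the attributions of all occurrences of each token string."""
    totals = defaultdict(float)
    for token, value in zip(tokens, attrs):
        totals[token] += value
    return dict(totals)


toy_tokens = ["Ġalgorithms", "Ġthe", "Ġalgorithms", "Ġprogramming", "Ġthe"]
toy_attrs = [0.4, 0.1, 0.3, -0.5, -0.05]
totals = aggregate_by_token(toy_tokens, toy_attrs)
# rank token types from most positive to most negative aggregate score
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

Grouping happens on the raw token string, so a word with and without the leading `Ġ` marker, or with different capitalization, counts as a separate token type.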

In [None]:
highest_attrib_tokens = df_new.sort_values(by=['attribution'], ascending=False).reset_index()
highest_attrib_tokens['tokens'] = highest_attrib_tokens['tokens'].str.replace('Ġ', '')
highest_attrib_tokens[:10]

Unnamed: 0,tokens,attribution
0,algorithms,1.350942
1,algorithm,0.269707
2,search,0.16128
3,Search,0.153568
4,researchers,0.135312
5,solving,0.13013
6,agents,0.109145
7,and,0.082687
8,the,0.071416
9,experimental,0.071381


In [None]:
lowest_attrib_tokens = df_new.sort_values(by=['attribution']).reset_index()
lowest_attrib_tokens['tokens'] = lowest_attrib_tokens['tokens'].str.replace('Ġ', '')
lowest_attrib_tokens[:10]

Unnamed: 0,tokens,attribution
0,Programming,-1.018337
1,programming,-0.341622
2,Logic,-0.293834
3,programs,-0.198517
4,Computer,-0.167329
5,of,-0.117236
6,Practice,-0.089079
7,1,-0.085102
8,;,-0.081129
9,counterpart,-0.071546


## Masking the stopwords and non-alpha tokens

There may be some stopwords or punctuation among our top attributed tokens. Now that we have the lists of the highest and lowest attributions, we can mask these tokens out to identify interesting keywords.

In [None]:
import nltk
from transformers import AutoTokenizer
nltk.download('stopwords')
tokenizer2 = AutoTokenizer.from_pretrained('allenai/longformer-base-4096', add_prefix_space=True)

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.



In [None]:
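from nltk.corpus import stopwords
all_stopwords = stopwords.words('english')
all_stopwords.append(" ")
# tokenize the stopwords so the set also contains their 'Ġ'-prefixed token forms
# note: this rebinds the name `stopwords` from the NLTK corpus reader to a plain set
stopwords = set(tokenizer2.tokenize(all_stopwords, is_split_into_words =True))
# also keep the plain-text forms of the stopwords
stopwords.update(all_stopwords)
print(stopwords)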

{'Ġand', 'do', 'Ġhim', "you've", 'Ġve', 'Ġy', 'Ġwon', 'in', 'Ġhad', 'here', 'all', 'a', 'nor', 'Ġits', 'Ġthen', 'from', 'Ġbecause', 'Ġhimself', 'don', "needn't", 'Ġinto', 'Ġby', 'when', 'there', "'s", 'Ġfrom', 'Ġor', 'his', 'Ġtheirs', 'Ġvery', 'Ġdidn', 'Ġwho', 'Ġcan', 'between', 'Ġdid', "that'll", 'Ġhasn', "she's", 'been', 'Ġi', 'are', 'Ġof', 'on', 'Ġwhom', 've', 'y', "weren't", 'did', 'about', 'no', 'Ġbeing', 'each', 'Ġas', 'Ġwhile', 'Ġit', 'Ġyour', 'Ġain', "'ll", 'Ġall', 'Ġdo', 'some', 'Ġhis', 'isn', 'Ġown', 'Ġin', 'Ġd', "isn't", 'Ġneed', 'now', 'Ġno', 'Ġnow', 'where', 'Ġan', 'most', "mightn't", 'Ġdon', 'Ġmy', 'Ġthe', 'an', 'aren', 'to', 'Ġso', 'Ġshould', 'didn', 'Ġthem', "hadn't", 'have', 'Ġshe', "haven't", 'Ġdown', 'Ġcouldn', 'Ġbetween', 'Ġthey', 'Ġunder', 'Ġwhat', 'out', "you'd", 'doing', 'Ġare', 'during', 'into', 'as', 'theirs', 'through', 'Ġif', 'our', 'their', 'both', 'Ġduring', 'Ġhers', 'Ġm', 'ours', 'ourselves', 'further', "won't", 'before', 'yourself', 'were', 'Ġyou', 'will'

In [None]:
highest_attrib_tokens[(highest_attrib_tokens['tokens'].str.isalpha()) & ~(highest_attrib_tokens['tokens'].isin(stopwords)) & ~(highest_attrib_tokens['tokens']==0)][:10].reset_index(drop=True)

In [None]:
lowest_attrib_tokens[(lowest_attrib_tokens['tokens'].str.isalpha()) & ~(lowest_attrib_tokens['tokens'].isin(stopwords)) & ~(lowest_attrib_tokens['tokens']==0)][:10].reset_index(drop=True)

Unnamed: 0,tokens,attribution
0,Programming,-1.018337
1,programming,-0.341622
2,Logic,-0.293834
3,programs,-0.198517
4,Computer,-0.167329
5,Practice,-0.089079
6,counterpart,-0.071546
7,Set,-0.067201
8,Problems,-0.065523
9,OP,-0.064579
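
The masking applied to the tables above boils down to two checks per token: it must be purely alphabetic, and it must not be in the stopword set. A small sketch, with a tiny hypothetical stopword set standing in for the NLTK list:

```python
def mask_tokens(rows, stopword_set):
    """Keep only (token, score) pairs whose token is alphabetic and not a stopword."""
    return [
        (token, score)
        for token, score in rows
        if token.isalpha() and token not in stopword_set
    ]


toy_stopwords = {"the", "of", "and"}  # hypothetical stand-in for the full list
toy_rows = [
    ("algorithms", 1.35),
    ("and", 0.08),
    ("the", 0.07),
    ("1", -0.08),
    (";", -0.08),
    ("Programming", -1.02),
]
kept = mask_tokens(toy_rows, toy_stopwords)
print(kept)
```

The membership test is case-sensitive, so a capitalized stopword such as "The" would pass through unless the set also contains that form.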


## Aggregate Token Attributions

Using this [notebook](https://colab.research.google.com/drive/1lktilbL1IY4nBanlzCdP8TLsBNfUsl_U?usp=sharing), we can produce the files needed to view the aggregated attributions over the entire dataset for both the positive and negative classes. For every token, the attributions of all its occurrences throughout the dataset are summed and averaged, regardless of whether the individual attributions are positive or negative.
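
One plausible reading of "summed and averaged" is a mean over every occurrence of a token across all examples. The sketch below implements that reading on made-up data; the exact averaging used in the linked notebook is not shown here, so treat this as an assumption rather than a description of it.

```python
from collections import defaultdict


def mean_attribution_per_token(examples):
    """Average each token's attribution over all its occurrences in all examples.

    `examples` is a list of (tokens, attributions) pairs, one pair per document.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for tokens, attrs in examples:
        for token, value in zip(tokens, attrs):
            sums[token] += value
            counts[token] += 1
    return {token: sums[token] / counts[token] for token in sums}


# two hypothetical documents with per-token attribution scores
toy_examples = [
    (["learning", "code", "learning"], [0.4, -0.2, 0.2]),
    (["code", "learning"], [-0.4, 0.3]),
]
means = mean_attribution_per_token(toy_examples)
print(means)
```

Averaging by occurrence keeps very frequent tokens from dominating purely through their count, which a plain sum would not.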

In [None]:
df_word = pd.read_csv("/content/drive/MyDrive/cogs402longformer/results/papers/papers_attributions/longformer_emb_papers.csv")

Here we see the highest attributions for the positive class, meaning that these tokens have the most influence when the model predicts the positive class. Apart from punctuation and stopwords, these words are all relevant to AI-related topics.

In [None]:
df_word[:15]

Unnamed: 0,tokens,attribution
0,learning,0.270925
1,neural,0.159887
2,data,0.12888
3,.,0.108019
4,",",0.092537
5,AI,0.081176
6,training,0.080792
7,dataset,0.071329
8,the,0.067022
9,algorithms,0.058013


Here is how the aggregate over the dataset looks without the stopwords and non-alphabetic tokens.

In [None]:
df_word[(df_word['tokens'].str.isalpha()) & ~(df_word['tokens'].isin(stopwords)) & ~(df_word['tokens']==0)][:10].reset_index(drop=True)

Unnamed: 0,tokens,attribution
0,learning,0.270925
1,neural,0.159887
2,data,0.12888
3,AI,0.081176
4,training,0.080792
5,dataset,0.071329
6,algorithms,0.058013
7,human,0.04878
8,intelligence,0.046269
9,datasets,0.042661


Here we see the most negative attributions, meaning that these tokens have the most influence when the model predicts the negative class.

In [None]:
df_word[:-15:-1].reset_index(drop=True)

Unnamed: 0,tokens,attribution
0,programming,-0.120735
1,program,-0.088943
2,programs,-0.082175
3,languages,-0.072262
4,language,-0.053761
5,code,-0.052443
6,software,-0.038568
7,compiler,-0.032076
8,Programming,-0.031495
9,syntax,-0.02902


Here is how the aggregate over the dataset looks without the stopwords and non-alphabetic tokens.

In [None]:
df_word[(df_word['tokens'].str.isalpha()) & ~(df_word['tokens'].isin(stopwords)) & ~(df_word['tokens']==0)][:-11:-1].reset_index(drop=True)

Unnamed: 0,tokens,attribution
0,programming,-0.120735
1,program,-0.088943
2,programs,-0.082175
3,languages,-0.072262
4,language,-0.053761
5,code,-0.052443
6,software,-0.038568
7,compiler,-0.032076
8,Programming,-0.031495
9,syntax,-0.02902
