<a href="https://colab.research.google.com/github/danielhou13/cogs402longformer/blob/main/src/CaptumLongformerSequenceClassification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook adapts the [Captum tutorial for question answering](https://captum.ai/tutorials/Bert_SQUAD_Interpret) to the Longformer sequence classification task. It uses the model's embedding layer to compute token attributions for the examples of your choice, or for the entire dataset if needed. This lets us visualize which tokens have the most influence on the model's prediction, and find the k tokens that contribute most to correct as well as incorrect predictions.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## Import dependencies

In [None]:
pip install transformers --quiet

In [None]:
pip install captum --quiet

In [None]:
pip install datasets --quiet

In [None]:
import os
os.environ['CUDA_LAUNCH_BLOCKING'] = "1"

In [None]:
from captum.attr import visualization as viz
from captum.attr import IntegratedGradients, LayerConductance, LayerIntegratedGradients
from captum.attr import configure_interpretable_embedding_layer, remove_interpretable_embedding_layer

import torch
import pandas as pd

In [None]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

## Import model

Here we load the model and tokenizer and move the model to the GPU if one is available. Change the model path and tokenizer to whichever ones you wish to use.

In [None]:
from transformers import LongformerForSequenceClassification, LongformerTokenizer, LongformerConfig
# change model_path to the Hub ID or local path of the model you want to explain
model_path = 'danielhou13/longformer-finetuned_papers_v2'
#model_path = 'danielhou13/longformer-finetuned-new-cogs402'

# load model
model = LongformerForSequenceClassification.from_pretrained(model_path, num_labels = 2)
model.to(device)
model.eval()
model.zero_grad()

# load tokenizer
tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")


Create functions that return the input ids and position ids for the text we want to examine, along with the corresponding baselines for integrated gradients.

In [None]:
ref_token_id = tokenizer.pad_token_id # token used to build the reference (baseline) input
sep_token_id = tokenizer.sep_token_id # separator token appended to the end of the text
cls_token_id = tokenizer.cls_token_id # classification token prepended to the start of the text

In [None]:
max_length = 2046  # 2046 text tokens + CLS + SEP = 2048 input tokens
def construct_input_ref_pair(text, ref_token_id, sep_token_id, cls_token_id):

    text_ids = tokenizer.encode(text, truncation = True, add_special_tokens=False, max_length = max_length)
    # construct input token ids
    input_ids = [cls_token_id] + text_ids + [sep_token_id]
    # construct reference token ids 
    ref_input_ids = [cls_token_id] + [ref_token_id] * len(text_ids) + [sep_token_id]

    return torch.tensor([input_ids], device=device), torch.tensor([ref_input_ids], device=device), len(text_ids)

def construct_input_ref_pos_id_pair(input_ids):
    seq_length = input_ids.size(1)

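    # padding-aware position ids, adapted from the Longformer implementation:
    # non-padding tokens are numbered consecutively, offset by the padding index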
    mask = input_ids.ne(ref_token_id).int()
    incremental_indices = torch.cumsum(mask, dim=1).type_as(mask) * mask
    position_ids = incremental_indices.long().squeeze() + ref_token_id

    # we could potentially also use random permutation with `torch.randperm(seq_length, device=device)`
    ref_position_ids = torch.zeros(seq_length, dtype=torch.long, device=device)

    position_ids = position_ids.unsqueeze(0).expand_as(input_ids)
    position_ids = position_ids[:, :seq_length]
    ref_position_ids = ref_position_ids.unsqueeze(0).expand_as(input_ids)
    return position_ids, ref_position_ids
    
def construct_attention_mask(input_ids):
    return torch.ones_like(input_ids)
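
To make the baseline and position-id arithmetic concrete, here is a minimal pure-Python sketch of the same two steps on a toy sequence. The token ids (`0` for CLS, `1` for padding, `2` for SEP) and the helper names `make_reference_ids` and `padding_aware_position_ids` are assumptions chosen for illustration; the notebook itself reads the ids from the tokenizer and works on tensors.

```python
def make_reference_ids(input_ids, pad_id):
    """Baseline: keep the first (CLS) and last (SEP) ids, pad everything between."""
    return [input_ids[0]] + [pad_id] * (len(input_ids) - 2) + [input_ids[-1]]


def padding_aware_position_ids(input_ids, pad_id):
    """Number non-padding tokens 1, 2, 3, ... and offset them by pad_id.

    Padding tokens keep position pad_id, mirroring the cumulative-sum
    trick used in the tensor version above.
    """
    position_ids, running = [], 0
    for tok in input_ids:
        is_real = int(tok != pad_id)
        running += is_real
        position_ids.append(running * is_real + pad_id)
    return position_ids


# Hypothetical ids: 0 = CLS, 1 = padding, 2 = SEP, anything else = text.
toy_input = [0, 713, 16, 10, 2]
toy_reference = make_reference_ids(toy_input, pad_id=1)
toy_positions = padding_aware_position_ids(toy_input, pad_id=1)

print(toy_reference)  # [0, 1, 1, 1, 2]
print(toy_positions)  # [2, 3, 4, 5, 6]
```

Note that the notebook does not derive the reference position ids this way; it uses an all-zero tensor for them.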

## Import dataset

Here we import the papers dataset

In [None]:
from datasets import load_dataset
import numpy as np
cogs402_ds = load_dataset("danielhou13/cogs402dataset")["test"]


Here we import the news dataset

In [None]:
# cogs402_ds = load_dataset("danielhou13/cogs402dataset2")["validation"]

A custom forward function that returns the softmaxed logits, which are the class probabilities that the model uses for prediction.

In [None]:
def predict(inputs, position_ids=None, attention_mask=None):
    output = model(inputs,
                   position_ids=position_ids,
                   attention_mask=attention_mask)
    return output.logits

In [None]:
# returns class probabilities; the class to explain is chosen later via the `target`
# argument of `lig.attribute` (1 for the positive class, 0 for the negative class)
def custom_forward(inputs, position_ids=None, attention_mask=None):
    preds = predict(inputs,
                   position_ids=position_ids,
                   attention_mask=attention_mask
                   )
    return torch.softmax(preds, dim = 1)
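
As a small illustration of what the forward function hands to the attribution method, the sketch below applies a softmax to a pair of made-up logits and picks out the probability of the target class. The logit values and the helper name `softmax` are assumptions for this example only; the notebook uses `torch.softmax` on the model's real logits.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits in the order (negative class, positive class).
toy_logits = [2.0, -1.0]
probs = softmax(toy_logits)

target = 1  # index of the class whose probability is attributed
print(probs)
print(probs[target])
```

With `target=1`, attributions explain how each token moves the positive-class probability, whichever class the model ends up predicting.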

A helper function to summarize attributions for each word token in the sequence.

In [None]:
def summarize_attributions(attributions):
    attributions = attributions.sum(dim=-1).squeeze(0)
    attributions = attributions / torch.linalg.norm(attributions)
    return attributions
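
The same two steps (sum over the embedding dimension, then divide by the L2 norm) can be sketched in plain Python on a toy 3-token example. The numbers and the name `summarize` are assumptions for illustration; the real attributions have one row per token and one column per embedding dimension.

```python
import math


def summarize(per_dim_attributions):
    """Collapse each token's per-dimension attributions and L2-normalize."""
    per_token = [sum(dims) for dims in per_dim_attributions]
    norm = math.sqrt(sum(v * v for v in per_token))
    return [v / norm for v in per_token]


# Three tokens, two embedding dimensions each (made-up values).
toy_attributions = [[1.0, 2.0], [-3.0, -1.0], [0.5, -0.5]]
print(summarize(toy_attributions))  # [0.6, -0.8, 0.0]
```

Because every example is rescaled to unit norm, the scores are best read as relative importance within one example rather than compared by magnitude across examples.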

Set up Layer Integrated Gradients on the Longformer's embedding layer.

In [None]:
lig = LayerIntegratedGradients(custom_forward, model.longformer.embeddings)
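
For intuition about what integrated gradients computes, here is a minimal sketch on a toy two-input function with a known gradient. It approximates the path integral with a simple midpoint Riemann sum, which is not necessarily the quadrature rule Captum uses, and the names `integrated_gradients`, `toy_model` and `toy_grad` are hypothetical. The final check mirrors the idea behind the convergence delta: the attributions should add up to the difference between the output at the input and at the baseline.

```python
def integrated_gradients(grad_fn, x, baseline, n_steps=50):
    """Approximate IG_i = (x_i - b_i) * integral of dF/dx_i along the straight path."""
    diffs = [xi - bi for xi, bi in zip(x, baseline)]
    totals = [0.0] * len(x)
    for step in range(n_steps):
        alpha = (step + 0.5) / n_steps  # midpoint of each sub-interval
        point = [bi + alpha * d for bi, d in zip(baseline, diffs)]
        totals = [t + g for t, g in zip(totals, grad_fn(point))]
    return [d * t / n_steps for d, t in zip(diffs, totals)]


def toy_model(x):
    return x[0] ** 2 + 3.0 * x[1]


def toy_grad(x):
    return [2.0 * x[0], 3.0]


x, baseline = [2.0, -1.0], [0.0, 0.0]
attributions = integrated_gradients(toy_grad, x, baseline)

# Completeness check: attributions should sum to F(x) - F(baseline).
delta = sum(attributions) - (toy_model(x) - toy_model(baseline))
print(attributions, delta)
```

In the notebook, the input and baseline are token-embedding sequences rather than two numbers, and `n_steps` controls how many points along the path are evaluated.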

This function builds the input and the baseline for an example, runs integrated gradients on them, and adds the resulting attributions to our visualization tool. It also stores the attributions and tokens for each example in dictionaries so we can examine the attribution scores in more detail later.

In [None]:
vis_data_records = []
all_attributions = {}
all_tokens = {}
all_deltas = {}

In [None]:
def get_token_attributions(dataset, example):
  # read from the dataset passed in rather than from the global variable
  text = dataset['text'][example]
  label = dataset['labels'][example]

  input_ids, ref_input_ids, text_len = construct_input_ref_pair(text, ref_token_id, sep_token_id, cls_token_id)
  position_ids, ref_position_ids = construct_input_ref_pos_id_pair(input_ids)
  attention_mask = construct_attention_mask(input_ids)

  indices = input_ids[0].detach().tolist()
  all_tokens_curr = tokenizer.convert_ids_to_tokens(indices)

  all_tokens[str(example)] = all_tokens_curr

  attributions, delta = lig.attribute(inputs=input_ids,
                                    baselines=ref_input_ids,
                                    return_convergence_delta=True,
                                    additional_forward_args=(position_ids, attention_mask),
                                    target=1,
                                    n_steps=250,
                                    internal_batch_size = 2)

  attributions_sum = summarize_attributions(attributions)

  all_attributions[str(example)] = attributions_sum
  all_deltas[str(example)] = delta  # store the convergence delta, not the attributions
  score = predict(input_ids, position_ids, attention_mask)

  # storing couple samples in an array for visualization purposes
  vis_data_records.append(viz.VisualizationDataRecord(
                        attributions_sum,
                        torch.softmax(score, dim = 1).max(),
                        torch.argmax(torch.softmax(score, dim = 1)),
                        label,
                        str(1),
                        attributions_sum.sum(),       
                        all_tokens_curr,
                        delta)
  )

Here we take some examples from the papers dataset.

In [None]:
# get_token_attributions(cogs402_ds, 976)
# get_token_attributions(cogs402_ds, 891)
get_token_attributions(cogs402_ds, 589)
# get_token_attributions(cogs402_ds, 605)
# get_token_attributions(cogs402_ds, 148)

Further examples, commented out by default. Uncomment the ones you want to attribute.

In [None]:
# get_token_attributions(cogs402_ds, 102)
# get_token_attributions(cogs402_ds, 1168)
# # get_token_attributions(cogs402_ds, 2307)
# # get_token_attributions(cogs402_ds, 2359)

This function displays our attributions in a way that is easy to read. Each token is shown with its attribution overlaid as a colour. Green represents positive attributions (i.e. the token pushes the model towards predicting the positive class), while red represents negative attributions.

In [None]:
# # storing couple samples in an array for visualization purposes
# score_vis = viz.VisualizationDataRecord(
#                         attributions_sum,
#                         torch.softmax(score, dim = 1).max(),
#                         torch.argmax(torch.softmax(score, dim = 1)),
#                         label,
#                         str(1),
#                         attributions_sum.sum(),       
#                         all_tokens,
#                         delta)

print('\033[1m', 'Visualization For Score', '\033[0m')
_ = viz.visualize_text(vis_data_records)

Visualization For Score


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
0.0,0 (1.00),1.0,-5.13,"#s RES EAR CH ĠREPORT ĠN Â° Ġ79 39 ĠApril Ġ2012 ĠProject - Te ams ĠCont raint es Ġ ĠIS RN ĠIN R IA / RR -- 79 39 -- FR + ENG Ġ ĠArm ando ĠGon Ã§ al ves Ġda ĠSilva ĠJunior , ĠPierre ĠDer ans art , ĠLuis ĠCarlos ĠM enez es , ĠMarcos ĠAur Ã© lio ĠAl me ida Ġda ĠSilva , ĠJacques ĠRobin Ġ ĠIS SN Ġ0 249 - 63 99 Ġ Ġar X iv : 12 04 . 52 80 v 1 Ġ[ cs . PL ] Ġ24 ĠApr Ġ2012 Ġ ĠTowards Ġa ĠGeneric ĠTrace Ġfor ĠRule ĠBased ĠCon str aint ĠReason ing Ġ Ġ Č Č T ow ards Ġa ĠGeneric ĠTrace Ġfor ĠRule ĠBased ĠCon str aint ĠReason ing ĠArm ando ĠGon Ã§ al ves Ġda ĠSilva ĠJunior âĪ Ĺ ĠâĢ ł Ġ, ĠPierre ĠDer ans art âĢ ¡ Ġ, ĠLuis ĠCarlos ĠM enez es Â§ Ġ, ĠMarcos ĠAur Ã© lio ĠAl me ida Ġda ĠSilva Â¶ Ġ, ĠJacques ĠRobin âĢ ĸ âĪ Ĺ ĠProject - Te ams ĠCont raint es ĠResearch ĠReport Ġn Â° Ġ79 39 ĠâĢĶ ĠApril Ġ2012 ĠâĢĶ Ġ39 Ġpages Ġ ĠAbstract : ĠCHR Ġis Ġa Ġvery Ġversatile Ġprogramming Ġlanguage Ġthat Ġallows Ġprogrammers Ġto Ġdecl ar atively Ġspecify Ġconstraint Ġsol vers . ĠAn Ġimportant Ġpart Ġof Ġthe Ġdevelopment Ġof Ġsuch Ġsol vers Ġis Ġin Ġtheir Ġtesting Ġand Ġdebugging Ġphases . ĠCurrent ĠCHR Ġimplementations Ġsupport Ġthose Ġphases Ġby Ġoffering Ġtracing Ġfacilities Ġwith Ġlimited Ġinformation . ĠIn Ġthis Ġreport , Ġwe Ġpropose Ġa Ġnew Ġtrace Ġfor ĠCHR Ġwhich Ġcontains Ġenough Ġinformation Ġto Ġanalyze Ġany Ġaspects Ġof ĠCHR âĪ ¨ Ġexecution Ġat Ġsome Ġuseful Ġabstract Ġlevel , Ġcommon Ġto Ġseveral Ġimplementations . ĠThis Ġapproach Ġis Ġbased Ġon Ġthe Ġidea Ġof Ġgeneric Ġtrace . ĠSuch Ġa Ġtrace Ġis Ġformally Ġdefined Ġas Ġan Ġextension Ġof Ġthe ĠÏ ī r âĪ ¨ Ġsemantics Ġof ĠCHR . ĠWe Ġshow Ġthat Ġit Ġcan Ġbe Ġderived Ġform Ġthe ĠSW I ĠPro log ĠCHR Ġtrace . ĠKey - words : ĠTrace , ĠCHR , ĠCHR âĪ ¨ Ġ, ĠTr acer , ĠGeneric ĠTrace , ĠAnalysis ĠTool , ĠObserv ational ĠSem antics , ĠDebug ging , ĠProgramming ĠEnvironment , ĠCon str aint ĠProgramming , ĠVal idation Ġ ĠâĪ Ĺ ĠâĢ ł ĠâĢ ¡ ĠÂ§ ĠÂ¶ ĠâĢ ĸ Ġ ĠFederal ĠUniversity Ġof ĠP ern amb u co , Ġag s j @ cin . uf pe . 
br ĠWork Ġdone Ġduring Ġinternship Ġof ĠArm ando ĠGon Ã§ al ves Ġda ĠSilva ĠJunior ĠIN R IA ĠParis - R oc qu enc ourt , ĠPierre . Der ans art @ in ria . fr ĠUniversity Ġof ĠP ern amb u co , ĠRec ife , Ġl c sm @ ec omp . p oli . br ĠUnivers itÃ© ĠPierre - Marie ĠCur ie , ĠParis , ĠFrance , Ġma ure lio 12 34 @ gmail . com ĠTh al Ã¨ s , ĠFrance , Ġj ac ques @ gmail . com Ġ ĠRES EAR CH ĠCENT RE ĠPAR IS ĠâĢĵ ĠR OC QU ENC OUR T Ġ ĠDom aine Ġde ĠVol uce au , Ġ- ĠRoc qu enc ourt ĠB . P . Ġ105 Ġ- Ġ78 153 ĠLe ĠChes n ay ĠCed ex Ġ Ġ Č Vers Ġune Ġtrace Ġg Ã©n Ã© rique Ġpour Ġdes Ġsolve urs Ġde Ġcontr aint es ĠÃł Ġbase Ġde Ġr Ã¨ g les ĠRÃ© sum Ã© Ġ: ĠCHR Ġ( Con str aint ĠHandling ĠRules ) Ġest Ġun Ġlang age Ġde Ġprogram m ation Ġadapt able Ġqui Ġper met Ġde Ġsp Ã© c ifier Ġtr Ã¨ s ĠdÃ© cl ar ative ment Ġdes Ġsolve urs Ġde Ġcontr aint es . ĠUn Ġaspect Ġimportant Ġde Ġle ur Ġm ise Ġau Ġpoint Ġconc er ne Ġle ur ĠdÃ© b og age . ĠLes Ġimplant ations Ġact ue ll es Ġde ĠCHR Ġoff rent Ġdes Ġposs ibl ilit Ã©s Ġde Ġtraces Ġa vec Ġrelative ment Ġpe u Ġd âĢ Ļ information . ĠD ans Ġce Ġrapport , Ġn ous Ġpropos ons Ġune Ġn ou vel le Ġtrace ĠCHR Ġqui Ġcont ient Ġsuff is am ment Ġd âĢ Ļ information Ġpour Ġanalys er Ġpotent iel lement Ġt ous Ġles Ġd Ã©t ails Ġd âĢ Ļ ex Ã© c ution Ġde ĠCHR âĪ ¨ Ġ, Ġcorrespond ant ĠÃł Ġun Ġn ive au Ġd âĢ Ļ analy se Ġab stra it Ġet Ġut ile , Ġcommun ĠÃł Ġdiff Ã© rent es Ġimpl Ã© ment ations . ĠC ette Ġappro che Ġest Ġfond Ã©e Ġsur Ġl âĢ Ļ id Ã©e Ġde Ġtrace Ġg Ã©n Ã© rique . ĠU ne Ġt elle Ġtrace Ġest ĠdÃ© fin ie Ġcomm e Ġune Ġextension Ġde Ġla Ġs Ã© m ant ique ĠÏ ī r âĪ ¨ Ġde ĠCHR . ĠOn Ġmont re Ġqu âĢ Ļ elle Ġpe ut ĠÃ ª tre ĠdÃ© riv Ã©e Ġde Ġla Ġtrace ĠCHR Ġde ĠSW I ĠPro log . 
ĠM ots - cl Ã©s Ġ: Ġtrace , ĠCHR , ĠCHR âĪ ¨ Ġ, Ġtrace ur , Ġtrace Ġg Ã©n Ã© rique , Ġanalyse ur , Ġs Ã© m ant ique Ġobservation nel le , ĠdÃ© b ogg age , Ġen viron n ement Ġde Ġprogram m ation , Ġprogram m ation Ġpar Ġcontr aint es , Ġvalidation Ġ Ġ Č T ow ards Ġa ĠGeneric ĠTrace Ġ Ġ1 Ġ Ġ3 Ġ ĠIntroduction Ġ ĠCHR Ġ( Con str aint ĠHandling ĠRules )[ 9 ] Ġis Ġa Ġuniquely Ġversatile Ġand Ġsem antically Ġwell - founded Ġprogramming Ġlanguage . ĠIt Ġallows Ġprogrammers Ġto Ġspecify Ġconstraint Ġsol vers Ġin Ġa Ġvery Ġdecl ar ative Ġway . ĠAn Ġimportant Ġpart Ġof Ġthe Ġdevelopment Ġof Ġsuch Ġsol vers Ġis Ġin Ġtheir Ġtesting Ġand Ġdebugging Ġphases . ĠCurrent ĠCHR Ġimplementations Ġsupport Ġthose Ġphases Ġby Ġoffering Ġtracing Ġfacilities Ġwith Ġlimited Ġinformation . ĠIn Ġthis Ġreport , Ġwe Ġpropose Ġa Ġnew Ġtrace Ġfor ĠCHR Ġwhich Ġcontains Ġenough Ġinformation , Ġincluding Ġsource Ġcode Ġones , Ġto Ġanalyze Ġany Ġaspects Ġof ĠCHR âĪ ¨ Ġexecution Ġat Ġsome Ġabstract Ġlevel , Ġgeneral Ġenough Ġto Ġcover Ġseveral Ġimplementations Ġand Ġsource Ġlevel Ġanalysis . ĠAlthough Ġthe Ġidea Ġof Ġformal Ġspecification Ġbased Ġtr acer Ġis Ġnot Ġnew Ġ( see Ġfor Ġexample Ġ[ 13 ]), Ġthe Ġmain Ġnovelty Ġlies Ġin Ġthe Ġgeneric Ġaspect Ġof Ġthe Ġtrace . ĠMost Ġof Ġthe Ġexisting Ġimplementations Ġof ĠCHR Ġlike Ġin Ġ[ 11 , Ġ12 , Ġ2 , Ġ19 ] Ġinclude Ġa Ġtr acer Ġwith Ġspecific ĠCHR Ġtrace Ġevents , Ġbut Ġwithout Ġformal Ġspecification , Ġnor Ġconsideration Ġwith Ġregards Ġto Ġdifferent Ġkind Ġof Ġus ages Ġother Ġthan Ġdebugging . ĠThe Ġnotion Ġof Ġgeneric Ġtrace Ġhas Ġbeen Ġinform ally Ġintroduced Ġand Ġused Ġfor Ġdefining Ġportable ĠCL P ( FD ) Ġtr acer Ġand Ġportable Ġapplications Ġ[ 1 , Ġ14 ]. ĠWe Ġpropose Ġhere Ġto Ġuse Ġthis Ġapproach Ġto Ġspecify Ġa Ġtr acer Ġfor Ġrule Ġbased Ġinference Ġengine Ġlike ĠCHR âĪ ¨ Ġ. 
ĠA Ġgeneric Ġtrace Ġhas Ġthree Ġmain Ġcharacteristics : Ġit Ġis ĠâĢ ľ high Ġlevel âĢ Ŀ Ġin Ġthe Ġsense Ġthat Ġit Ġis Ġindependent Ġfrom Ġparticular Ġimplementations Ġof ĠCHR , Ġit Ġhas Ġa Ġspecified Ġsemantics Ġ( Obs erv ational ĠSem antics ) Ġand Ġcan Ġbe Ġused Ġto Ġimplement Ġdebugging Ġtools Ġor Ġapplications . ĠAn Ġimportant Ġproperty Ġof Ġthe Ġproposed Ġgeneric Ġtrace Ġis Ġthat Ġit Ġcontains Ġas Ġmany Ġinformation Ġon Ġthe Ġsol ver Ġbehaviour Ġas Ġthe Ġone Ġcontained Ġin Ġthe Ġoperational Ġsemantics . ĠThis Ġproperty Ġis Ġcalled ĠâĢ ľ faith fulness âĢ Ŀ Ġof Ġthe Ġobservational Ġsemantics . ĠIn Ġthis Ġreport , Ġwe Ġpresent Ġa Ġgeneric Ġtrace Ġfor ĠCHR âĪ ¨ Ġbased Ġon Ġits Ġrefined Ġoperational Ġsemantics Ġ[ 5 ], Ġand Ġdescribe Ġa Ġfirst Ġprototype Ġdeveloped Ġfor ĠSW I - Pro log ĠCHR âĪ ¨ Ġengine . ĠThe Ġimplementation Ġconsists Ġof Ġcombining Ġthe Ġoriginal Ġtrace Ġof Ġthe ĠSW I Ġengine Ġwith Ġsource Ġcode Ġinformation Ġto Ġget Ġgeneric Ġtrace Ġevents , Ġand Ġthen , Ġallowing Ġthe Ġuser Ġto Ġfilter Ġthese Ġevents Ġusing Ġan ĠSQL - based Ġlanguage . ĠÏ ī r âĪ ¨ Ġ ĠThis Ġreport Ġis Ġorganized Ġas Ġfollows . ĠSection Ġ2 Ġgives Ġa Ġshort Ġintroduction Ġto Ġgeneric Ġtraces , Ġobservational Ġsemantics Ġand Ġfaith fulness . ĠSection Ġ3 Ġpresents ĠCHR âĪ ¨ Ġ, Ġthe Ġformal Ġspecification Ġof Ġits Ġoperational Ġsemantics , Ġbased Ġon Ġthe ĠÏ ī r âĪ ¨ Ġsemantics , Ġand Ġthe Ġrequirements Ġfor Ġthe Ġgeneric Ġtrace , Ġas Ġits Ġsyntax Ġas Ġwell . ĠSection Ġ4 Ġpresents Ġthe Ġobservational Ġsemantics Ġof ĠCHR âĪ ¨ Ġ, ĠOS - CHR âĪ ¨ Ġ, Ġdefining Ġformally Ġthe Ġgeneric Ġtrace , Ġand Ġshows Ġits Ġfaith fulness . ĠSection Ġ5 Ġintroduces Ġan Ġexecutable Ġoperational Ġsemantics Ġof ĠCHR âĪ ¨ Ġdefined Ġin ĠSW I ĠPro log Ġ( the Ġcode Ġis Ġin Ġthe Ġannex ) Ġand Ġused Ġto Ġtest Ġits Ġformal Ġsemantics . ĠSection Ġ6 Ġdescribes Ġthe ĠCHR - SW I - Pro log Ġbased Ġprototype Ġof Ġthe Ġgeneric Ġtrace . ĠSection Ġ7 Ġpresents Ġsome Ġexperimentation . 
ĠDiscussion Ġand Ġconclusions Ġare Ġin Ġthe Ġtwo Ġlast Ġsections . Ġ Ġ2 Ġ ĠGeneric ĠTrace , ĠObserv ational ĠSem antics Ġand ĠSub trace Ġ ĠThe Ġconcept Ġof Ġgeneric Ġtrace Ġhas Ġbeen Ġfirst Ġintroduced Ġin Ġ[ 14 ], Ġformally Ġdefined Ġin Ġ[ 6 , Ġ7 ], Ġand Ġa Ġfirst Ġapplication Ġto ĠCHR Ġpresented Ġin Ġ[ 17 ]. ĠA Ġgeneric Ġtrace Ġis Ġa Ġtrace Ġwith Ġa Ġspecification Ġbased Ġon Ġa Ġpartial Ġoperational Ġsemantics Ġapplicable Ġto Ġa Ġfamily Ġof Ġprocesses . ĠWe Ġgive Ġhere Ġits Ġmain Ġcharacteristics Ġand Ġthe Ġway Ġto Ġspecify Ġa Ġgeneric Ġtrace . Ġ Ġ2 . 1 Ġ ĠPrel im in aries Ġ ĠA Ġtrace Ġconsists Ġof Ġan Ġinitial Ġstate Ġs 0 Ġfollowed Ġby Ġan Ġordered Ġfinite Ġor Ġinfinite Ġsequence Ġof Ġtrace Ġevents , Ġden oted Ġ< Ġs 0 Ġ, Ġe Ġ> . ĠT Ġis Ġa Ġset Ġof Ġtraces Ġ( f inite Ġor Ġinfinite ). ĠA Ġprefix Ġ( f inite , Ġof Ġsize Ġt ) Ġof Ġa ĠRR Ġn Â° Ġ79 39 Ġ Ġ Č 4 Ġ ĠGon Ã§ al ves Ġ& ĠDer ans art Ġ& Ġothers Ġ ĠObs . ĠProcess Ġ ĠT ^ v Ġ ĠExtract or ĠE Ġ ĠT ^ w Ġ ĠRe builder Ġ ĠT ^ v Ġ ĠI Ġ ĠFigure Ġ1 : ĠExt raction , ĠReconstruction , ĠFaith fulness ĠProperty Ġtrace ĠT Ġ= < Ġs 0 Ġ, Ġen Ġ> Ġ( f inite Ġor Ġinfinite , Ġhere Ġof Ġsize Ġn Ġâī¥ Ġt ) Ġis Ġa Ġpartial Ġtrace ĠUt Ġ= < Ġs 0 Ġ, Ġet Ġ> Ġwhich Ġcorresponds Ġto Ġthe Ġt Ġfirst Ġevents Ġof ĠT Ġ, Ġwith Ġan Ġinitial Ġstate Ġat Ġthe Ġbeginning . ĠT Ġmay Ġcontain Ġany Ġprefix es Ġof Ġits Ġelements . ĠA Ġtrace Ġcan Ġbe Ġdecomp osed Ġinto Ġsegments Ġcontaining Ġtrace Ġevents Ġonly , Ġexcept Ġprefix es Ġwhich Ġstart Ġwith Ġa Ġstate . ĠAn Ġassoci ative Ġoperator Ġof Ġconc aten ation Ġwill Ġbe Ġused Ġto Ġdenote Ġsequences Ġconc aten ations Ġ( den oted Ġ++ ). ĠIt Ġwill Ġbe Ġomitted Ġif Ġthere Ġis Ġno Ġambiguity . ĠThe Ġneutral Ġelement Ġis Ġ[] Ġ( empty Ġsequence ). ĠA Ġsegment Ġ( or Ġprefix ) Ġof Ġsize Ġ0 Ġis Ġeither Ġan Ġempty Ġsequence Ġor Ġa Ġstate . ĠTr aces Ġare Ġused Ġto Ġrepresent Ġthe Ġevolution Ġof Ġsystems Ġby Ġdescribing Ġthe Ġevolution Ġof Ġtheir Ġstate . 
ĠA Ġstate Ġof Ġthe Ġsystem Ġis Ġdescribed Ġby Ġa Ġgiven Ġfinite Ġset Ġof Ġparameters Ġand Ġa Ġstate Ġcorresponds Ġto Ġa Ġset Ġof Ġvalues Ġof Ġparameters . ĠSuch Ġstates Ġwill Ġbe Ġsaid Ġvirtual Ġas Ġthey Ġcorrespond Ġto Ġstates Ġof Ġthe Ġobserved Ġsystem , Ġbut Ġthey Ġare Ġnot Ġactually Ġtraced . ĠWe Ġwill Ġthus Ġdistinguish Ġbetween Ġactual Ġand Ġvirtual Ġtraces . ĠâĢ¢ Ġthe Ġactual Ġtraces Ġ( T Ġw Ġ) Ġare Ġa Ġway Ġto Ġobserve Ġthe Ġevolution Ġof Ġa Ġsystem Ġby Ġgenerating Ġtraces . ĠThe Ġevents Ġof Ġan Ġactual Ġtrace Ġhave Ġthe Ġform Ġe Ġ= Ġ( a ) Ġwhere Ġa Ġis Ġan Ġactual Ġstate Ġdescribed Ġby Ġa Ġset Ġof Ġattributes Ġvalues . ĠAn Ġactual Ġstates Ġis Ġdescribed Ġby Ġa Ġfinite Ġset Ġof Ġattributes . ĠActual Ġtraces Ġcorresponds Ġto Ġsequences Ġof Ġevents Ġproduced Ġby Ġa Ġtr acer Ġof Ġan Ġobserved Ġsystem . ĠThey Ġusually Ġencode Ġvirtual Ġstates Ġchanges Ġin Ġa Ġsynthetic Ġmanner . ĠâĢ¢ Ġthe Ġvirtual Ġtraces Ġ( T Ġv Ġ) Ġcorresponds Ġto Ġthe Ġsequence Ġof Ġthe Ġvirtual Ġstates Ġsuch Ġthat Ġfor Ġeach Ġtransition Ġin Ġthe Ġsystem Ġbetween Ġtwo Ġvirtual Ġstates , Ġit Ġcorresponds Ġan Ġactual Ġtrace Ġevent . ĠThe Ġvirtual Ġtrace Ġevents Ġhave Ġthe Ġform Ġe Ġ= Ġ( r , Ġs ) Ġwhere Ġr Ġis Ġa Ġtype Ġof Ġaction Ġassociated Ġwith Ġa Ġstate Ġtransition Ġand Ġs , Ġcalled Ġvirtual Ġstate , #/s"


Next, we might want to look more closely at the attribution scores for each token of an example. We saved the attributions for the examples above, so we can retrieve them easily. We also retrieve the tokens so we know which token each attribution belongs to.

In [None]:
example = 589
attributions_sum = all_attributions[f"{example}"]
all_tokens2 = all_tokens[f"{example}"]

These functions return the tokens with the strongest (most positive and most negative) attributions. Change `k` to the number of tokens you wish to inspect. Each function takes the attributions and the tokens we retrieved in the previous cell and returns the top-k (or bottom-k) tokens, their attribution values, and their positions.

Note: The attributions are computed with respect to the positive class, so the tokens that most helped the model predict the negative class appear among the bottom-k attributed tokens.

In [None]:
def get_topk_attributed_tokens(attrs, all_tokens, k=20):
    values, indices = torch.topk(attrs, k)
    top_tokens = [all_tokens[idx] for idx in indices]
    return top_tokens, values, indices

In [None]:
def get_botk_attributed_tokens(attrs, all_tokens, k=20):
    values, indices = torch.topk(attrs, k, largest=False)
    top_tokens = [all_tokens[idx] for idx in indices]
    return top_tokens, values, indices
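
The selection logic of both helpers can be sketched without tensors. This is a plain-Python stand-in for `torch.topk`, with made-up tokens and scores; the function name `top_k` and its `largest` flag are chosen here only to echo the calls above.

```python
def top_k(scores, tokens, k, largest=True):
    """Return the k tokens with the highest (or lowest) scores, plus values and positions."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=largest)[:k]
    return [tokens[i] for i in order], [scores[i] for i in order], order


# Made-up example: one attribution score per token position.
toy_tokens = ["<s>", "Ġneural", "Ġthe", "Ġcompiler", "</s>"]
toy_scores = [0.0, 0.7, 0.1, -0.6, 0.05]

print(top_k(toy_scores, toy_tokens, k=2))
# (['Ġneural', 'Ġthe'], [0.7, 0.1], [1, 2])
print(top_k(toy_scores, toy_tokens, k=2, largest=False))
# (['Ġcompiler', '<s>'], [-0.6, 0.0], [3, 0])
```

Returning the positions alongside the tokens matters because the same token can occur many times in a document with very different attributions.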

Convert the values, their indices, and the tokens into pandas DataFrames for display. The table of highest attributions is sorted from highest to lowest value. The table of most negative attributions is sorted from lowest to highest.

In [None]:
top_words_start, top_words_val_start, top_word_ind_start = get_topk_attributed_tokens(attributions_sum, all_tokens2)
bot_words_start, bot_words_val_start, bot_word_ind_start = get_botk_attributed_tokens(attributions_sum, all_tokens2)

df_high = pd.DataFrame({'Word(Index), Attribution': ["{} ({}), {}".format(word, pos, round(val.item(),2)) for word, pos, val in zip(top_words_start, top_word_ind_start, top_words_val_start)]})

df_low = pd.DataFrame({'Word(Index), Attribution': ["{} ({}), {}".format(word, pos, round(val.item(),2)) for word, pos, val in zip(bot_words_start, bot_word_ind_start, bot_words_val_start)]})
# df_start.style.apply(['cell_ids: False'])

# ['{}({})'.format(token, str(i)) for i, token in enumerate(all_tokens)]

In [None]:
df_high

| | Word(Index), Attribution |
|---|---|
| 0 | . (1729), 0.18 |
| 1 | . (1633), 0.15 |
| 2 | . (916), 0.12 |
| 3 | . (486), 0.11 |
| 4 | . (994), 0.09 |
| 5 | ĠProcess (1634), 0.08 |
| 6 | . (89), 0.08 |
| 7 | . (600), 0.08 |
| 8 | . (278), 0.07 |
| 9 | . (1531), 0.06 |


In [None]:
df_low

| | Word(Index), Attribution |
|---|---|
| 0 | . (620), -0.79 |
| 1 | . (1299), -0.19 |
| 2 | ĠProgramming (355), -0.15 |
| 3 | . (1546), -0.11 |
| 4 | ĠLes (621), -0.1 |
| 5 | Ġprogrammers (202), -0.1 |
| 6 | Ġprogramming (898), -0.09 |
| 7 | Ġprogramming (198), -0.09 |
| 8 | Ġimplementations (277), -0.09 |
| 9 | Ġimplementations (232), -0.08 |


In [None]:
d = {"tokens":all_tokens2, "attribution":attributions_sum[:len(all_tokens2)].cpu()}

Many tokens occur repeatedly within an example, each time at a different position. Position may well affect the attributions, but if we only care about the token itself, we can add up the attributions of all its occurrences to get one aggregate attribution per token. We therefore aggregate the attributions by token.

In [None]:
df_attrib = pd.DataFrame(d)
aggregation_functions = {'attribution': 'sum'}
df_new = df_attrib.groupby(df_attrib['tokens']).aggregate(aggregation_functions)
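
The group-and-sum step can also be written with only the standard library, which may help when checking the pandas result by hand. The tokens and values below are made up, and `aggregate_by_token` is a hypothetical helper name.

```python
from collections import defaultdict


def aggregate_by_token(tokens, attributions):
    """Sum the attributions of every occurrence of the same token."""
    totals = defaultdict(float)
    for token, value in zip(tokens, attributions):
        totals[token] += value
    return dict(totals)


# Made-up example with repeated tokens at different positions.
toy_tokens = [".", "Ġlanguage", ".", "Ġlanguage", "Ġneural"]
toy_attributions = [0.25, -0.5, -0.75, -0.25, 0.5]

aggregated = aggregate_by_token(toy_tokens, toy_attributions)
print(aggregated)  # {'.': -0.5, 'Ġlanguage': -0.75, 'Ġneural': 0.5}
```

Summing lets positive and negative occurrences of the same token cancel, so a frequent token with mixed signs can end up near zero.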

In [None]:
highest_attrib_tokens = df_new.sort_values(by=['attribution'], ascending=False)
highest_attrib_tokens[:10]

| tokens | attribution |
|---|---|
| ĠProcess | 0.08302 |
| gmail | 0.031209 |
| ĠThis | 0.02868 |
| ĠCed | 0.027149 |
| ĠSection | 0.026751 |
| Ġvirtual | 0.023682 |
| Ġcan | 0.023172 |
| ). | 0.022913 |
| og | 0.01986 |
| raint | 0.01682 |


In [None]:
lowest_attrib_tokens = df_new.sort_values(by=['attribution'])
lowest_attrib_tokens[:10]

| tokens | attribution |
|---|---|
| . | -0.359076 |
| Ġimplementations | -0.30494 |
| ĠProgramming | -0.232389 |
| Ġdebugging | -0.198335 |
| Ġprogramming | -0.185789 |
| Ġlanguage | -0.163241 |
| Ġprogrammers | -0.159463 |
| Ġsemantics | -0.130746 |
| Ġa | -0.104461 |
| Ġto | -0.102399 |


Using this [notebook](https://colab.research.google.com/drive/1lktilbL1IY4nBanlzCdP8TLsBNfUsl_U?usp=sharing), we can generate the files needed to view the aggregated attributions over the entire dataset for both the positive and negative classes. There, the attributions for every occurrence of a given token across the whole dataset are summed and averaged, regardless of whether the individual attributions are positive or negative.

In [None]:
df_word = pd.read_csv("/content/drive/MyDrive/cogs402longformer/results/papers/papers_attributions/longformer_emb_papers.csv")
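
How the linked notebook builds this file is not shown here. As one plausible reading of "summed and averaged", the sketch below takes the mean attribution over all occurrences of each token across several examples and ranks tokens from most positive to most negative. The data and the helper name `mean_attribution_per_token` are assumptions for illustration only.

```python
from collections import defaultdict


def mean_attribution_per_token(examples):
    """Average each token's attribution over all its occurrences in all examples."""
    sums, counts = defaultdict(float), defaultdict(int)
    for tokens, attributions in examples:
        for token, value in zip(tokens, attributions):
            sums[token] += value
            counts[token] += 1
    means = {token: sums[token] / counts[token] for token in sums}
    # Highest mean attribution first, like the head of the table.
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Two made-up examples, each a (tokens, attributions) pair.
toy_examples = [
    (["Ġneural", "Ġcode", "."], [0.5, -0.25, 0.25]),
    (["Ġneural", "Ġcode"], [0.25, -0.75]),
]

ranked = mean_attribution_per_token(toy_examples)
print(ranked)  # [('Ġneural', 0.375), ('.', 0.25), ('Ġcode', -0.5)]
```

Under this reading, the first rows of the table favour the positive class and the last rows favour the negative class.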

Here we see the highest attributions for the positive class, meaning that these tokens have the most influence when the model predicts positive. Apart from punctuation and function words such as `.`, `,` and `the`, these words are relevant to AI-related topics.

In [None]:
df_word[:15]

| | tokens | attribution |
|---|---|---|
| 0 | Ġlearning | 0.163092 |
| 1 | . | 0.145281 |
| 2 | Ġneural | 0.110611 |
| 3 | Ġdata | 0.097347 |
| 4 | , | 0.077573 |
| 5 | Ġthe | 0.072926 |
| 6 | Ġtraining | 0.052609 |
| 7 | Ġdataset | 0.050907 |
| 8 | Ġalgorithms | 0.048352 |
| 9 | ĠAI | 0.045684 |


Here we see the largest attributions for the negative class, meaning that these tokens have the most influence when the model predicts negative.

In [None]:
df_word[:-15:-1]

| | tokens | attribution |
|---|---|---|
| 30061 | Ġprogramming | -0.121651 |
| 30060 | Ġprogram | -0.085085 |
| 30059 | Ġprograms | -0.078384 |
| 30058 | Ġlanguages | -0.070023 |
| 30057 | Ġlanguage | -0.054024 |
| 30056 | Ġ. | -0.053213 |
| 30055 | Ġcode | -0.049736 |
| 30054 | Ġsoftware | -0.037241 |
| 30053 | Ġcompiler | -0.030792 |
| 30052 | ĠProgramming | -0.029799 |
