# Trying out the transformers_interpret package

From https://towardsdatascience.com/introducing-transformers-interpret-explainable-ai-for-transformers-890a403a9470

and https://github.com/cdpierse/transformers-interpret/blob/master/notebooks/multiclass_classification_example.ipynb

## Imports etc.

In [1]:
# depends on your environment if you need to do this
#!pip install transformers_interpret

In [2]:
import pandas as pd
import json
import torch
from transformers import AutoTokenizer, BertConfig
from transformers import BertForSequenceClassification

## Reading the genre class labels

In [3]:
def read_labels_set(labels_set_path):
    with open(labels_set_path, "r") as f:
        labels2id = json.load(f)
    return labels2id

labels_set = read_labels_set("./data/labels_set.json")
id2main = {labels_set['main2id'][k] : k for k in labels_set['main2id']}
labels_num = len(id2main)
genre_list = [v for k, v in id2main.items()]

## Loading the ECCO-BERT-seq model

In [4]:
model_path = "./model_dir/ecco_genre_main_ecco_bert_100_epoches.pt"

model = BertForSequenceClassification.from_pretrained("TurkuNLP/eccobert-base-cased-v1", num_labels=labels_num, id2label=id2main,
label2id=labels_set['main2id'])                                                     
tokenizer = AutoTokenizer.from_pretrained("TurkuNLP/eccobert-base-cased-v1", truncation=True)

device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
checkpoint = torch.load(model_path, map_location=device)
model.load_state_dict(checkpoint['net'])

Some weights of the model checkpoint at TurkuNLP/eccobert-base-cased-v1 were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification wer

<All keys matched successfully>

## Generating explanations for the ECCO-BERT-seq model: a toy example

Here we generate word attributions for a very short text for demonstration purposes.

In [5]:
from transformers_interpret import SequenceClassificationExplainer
sample_text = """constitutional legitimacy; the procedures that 
are supposed to improve decisions; the right to a hearing; and the 
allocation of power between regulators and judges."""
multiclass_explainer =  SequenceClassificationExplainer(model=model, tokenizer=tokenizer)


In [6]:
word_attributions = multiclass_explainer(text=sample_text)
multiclass_explainer.predicted_class_name

'Law'

### A helper function to print words in order of attribution magnitude (or in original order)

In [7]:
def print_sorted_word_attributions(attrbtns, nmbr=20, order_by_value=True, desc=True):
    
    if order_by_value: # Sort by attribution magnitude    
        rev = True if desc else False # descending or ascending order
        print("Top", nmbr, "words and attribution scores in", "descending" if desc else "ascending", "order by attribution score")  
        for i, attr in enumerate(sorted(attrbtns, key=lambda x: x[1], reverse=rev)):
            print(f"{attr[0]:20} {round(attr[1], 2)}")
            if i >= nmbr and nmbr > 0: break
    else: # original word order
        print("First", nmbr, "words and their attribution scores of the text")  
        for i, attr in enumerate(attrbtns):
            print(f"{attr[0]:20} {round(attr[1], 2)}")
            if i >= nmbr and nmbr > 0: break
    print()
        
print_sorted_word_attributions(word_attributions)

print_sorted_word_attributions(word_attributions, order_by_value=False)


Top 20 words and attribution scores in descending order by attribution score
judges               0.44
##s                  0.33
and                  0.3
to                   0.27
to                   0.25
;                    0.22
of                   0.21
and                  0.2
##cy                 0.19
procedure            0.19
##ation              0.18
improve              0.17
between              0.15
regula               0.14
are                  0.14
decisions            0.13
;                    0.12
##tors               0.12
;                    0.12
the                  0.12
right                0.1

First 20 words and their attribution scores of the text
[CLS]                0.0
constitutional       0.06
legit                -0.01
##ima                0.08
##cy                 0.19
;                    0.22
the                  0.1
procedure            0.19
##s                  0.33
that                 0.07
are                  0.14
supposed             0.02
to          

In [13]:
html = multiclass_explainer.visualize("visualization1") # saves the viz as visualization1.html in the working dir

True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
5.0,Law (1.00),Law,4.72,[CLS] constitutional legit ##ima ##cy ; the procedure ##s that are supposed to improve decisions ; the right to a hearing ; and the alloc ##ation of power between regula ##tors and judges . [SEP]
,,,,


## Explanation for a class that was not predicted

In [14]:
word_attributions = multiclass_explainer(sample_text, class_name="Arts")
print_sorted_word_attributions(word_attributions)
html = multiclass_explainer.visualize()

('.', 0.34489730710743405)
('are', 0.2560707753308969)
('right', 0.1915353113352107)
('alloc', 0.15597880193287889)
('and', 0.1513442767781147)
('improve', 0.10795363395400095)
('that', 0.1077961412263607)
('hearing', 0.08734221821030504)
('supposed', 0.07923205118779657)
(';', 0.06708737047052309)
('regula', 0.04765547956402858)
('[CLS]', 0.0)
('[SEP]', 0.0)
('to', -0.02222835523773314)
('procedure', -0.07472547267044707)
('##ima', -0.07476006222629673)
('the', -0.07976866887568236)
('to', -0.08544080843238734)
('the', -0.09840353261668852)
('the', -0.09930399304257154)
('##tors', -0.10755627795462852)


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
0.0,Law (0.00),Arts,-1.85,[CLS] constitutional legit ##ima ##cy ; the procedure ##s that are supposed to improve decisions ; the right to a hearing ; and the alloc ##ation of power between regula ##tors and judges . [SEP]
,,,,


## Explanations for the first three docs of ECCO-BERT test dataset
### Some preliminary stuff

In [8]:
pd_test_chunks = pd.read_csv("data/ecco_bert_seq_test_set_first_chunks.csv", sep="\t")
def subwords_to_original(subword_text):
    out = ""
    for token in subword_text.split():
            if token in ["[CLS]", "[SEP]"]: continue
            if token.startswith("##"): 
                out += token.replace("##", "")
            else:
                out += " " + token
    return out.strip()

pd_test_chunks["text"] = pd_test_chunks["chunk_content"].apply(subwords_to_original)
chunk_texts = pd_test_chunks["text"].tolist()

chunk_texts2 = []
#for i, text in enumerate(chunk_texts):
for text in chunk_texts:
    chunk_texts2.append(" ".join([word for word in text.split() if len(word)>2 and not word.isnumeric() and word.upper() not in ["THE", "AND"]]))


### The explanations

In [9]:
attributions_list = []
for i, text_chunk in enumerate(chunk_texts2[:3]):
    word_attributions = multiclass_explainer(text=text_chunk, internal_batch_size=16)
    attributions_list.append(word_attributions)
    #print(multiclass_explainer.predicted_class_name)
    html = multiclass_explainer.visualize("eccobert_viz_"+str(i))


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
7.0,Politics (1.00),Politics,9.51,[CLS] COLLE ##CT ##ION PAR ##LIAM ##ENT ##ARY DE ##B AT ##E EN ##G LAND FROM YE ##AR LX ##VIII present III Printed Year DCC XXXIX CON ENT ##S Speech both houses November Page Inquiry Commons into abuses corruptions Report Committee appointed infpe Eaft India company books Speaker house Commons expelled for corruption Farther proceedings against bribe ##ry corruption Commons resolve impeach Duke Leeds State coin enquired into Parliament dissolved new one called King Speech both houses Nov Bill for regulating trials cafes high treason Farther debates regulating coin Debate concerning grants made Earl Port land King Speech address both houses dif covery assass ##ination plot Asso ##ciation bill for security his Majesty person King Speech Parliament Od ##lob ##er i6 ##96 Methods for farther remedy ##ing ill tR ##ate Coin restoring publick credit Sir John Fen ##wick attainted high treaf ##on Speeches for against bill Lords protest against bill King Speech both houses April Ditto December Resolutions Commons about disband ##ing army raising supplies for paying all arrears debt VOL III Abuses Abuses Exchequer bills enquired into Proceedings against immoral ##ity prophane ##nefs Proceedings relating Ea ##Jf India trade com pan ##is ##L Mol ##yne ##ux cafe Ireland censured Com mons Parliament dissolved new one meets Decem King Speech Commons resolve band army King ##paf ##fes ##thed ##ifb ##anding bill His Speech there Upon Address thanks from both houses army disband ##ed r2 ##3 King message Commons about his Dutch guards Their answer Parliament prorog ##ued meets Nove ##rs Kings Speech both houses ibid Commons address King answer DP ##bat ##es about forfeited estates Ireland Parliament difl ##olved Decem T6 ##99 new one meets Feb King Speech ibid Addresses from both houses [SEP]
,,,,


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
0.0,Arts (1.00),Arts,2.99,[CLS] SC ##Q ##OO ##L FOR SC ##A PER ##FO ##R ##ME ##D FOUR ##TH EDITION ID ##U PRIN ##teD YE ##AR DCC LXX ##km ##j RO ##LO trit ##ten ##l GAR ##RIC ##1 ##K Sct ool ##for Sca nal ell be ##Jc ##hr Nee ##ids there aJ ##Ih ##eo ##l thb ##is modi ##b art teach jou need lion nom ##w knowing think mii ##lt well taught eat drink Cau ##s dearth scandal jhould vapours Dii ##refs our fair ores let tbem read papers Tht powe ##tr ##ful mixture ##s ##fu ##cb disorders hit Cra ##ve what they will there quantum fiff ##icit Lord cries Lady Woo ##rm ##wood who loves talt ##le tius much salt pepper her prat ##tle 7y ##1i ris noon all night cards when tbr ##e ##jb ##ing Strong tea andf ##can ##dal blefs nme how refre hing Give papers Lif ##p how free fits Last night Lord sis was eau ##g ##Lt zv ##itb Lady For ach ##ing heads what charming fal volatile tip ##s Mrs will ##Jl ##ill continue flu ##rti ##ng hope draw und ##raw certain Fine satire pc ##z public all abuse But ourselves fiis our prai ##Je can reft ##ife Now Lis ##p read ##ly ##ou tbere far ies certain Lord had be ##I be ##zar ##e Iho lives not twenty miles from Gr ##cJ ##venor Square For boul ##d Lady find wii ##ng Woo ##rm ##wood bitter that vi ##Le ##ain Throw behind thefir ##e never more Let that vile paper come within door Thus our ##friend ##s laugh who feel thi dart reach our feelings ourselves mun ##f fina ##rt our jou ##ng bard foy ##oung think that Can flop thef ##ull spring tie calumny Know ##s world ##fo little its trade Alas h ##Y ##b devil sooner rais than laid fira ##s raif ##t mor ##fler there gag ##ging Cut Sea ##ndl ##e fiill tongue [SEP]
,,,,


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
3.0,Education (1.00),Education,5.9,[CLS] LETTERS WR ##ITT ##EN LAT ##E RIGHT HON ##OUR ##ABLE PHIL DOR ##M ST ##s J ##H ##OPE EARL CH ##ESTER ##FIELD II ##S SON PHILIP AN ##H ##OE ES ##Z LAT ##E EN ##VO ##Y EXT ##r ##A ##OR ##BI ##NA ##RY COURT DR ##ES ##DEN Together with SE ##VER ##AL OT ##HE ##R PI ##EC ##ES VA ##RI ##OUS SUB ##JEC ##TS FOUR VOLUME ##S PUB ##LI ##S IE ##D FR ##n ORI ##GIN ##ALS MRS EU ##GE ##NI ##A STA ##N ##H ##OPE ELE ##VEN ##TI EDITION which are ione ##rte ##d their proper places Several Letters that were wanting time first Publication VOL FIRST LONDON PRINTED FOR DO ##DS ##LEY PAL ##L MAL ##L DCC RIGHT HON ##OUR ##ABLE LORD NOR ##TH FIRST LORD COMM ##ISS ##ONE ##R TH ##F TRE ##AS ##URY CHA ##N ##CELL ##OR THI EX ##CHE ##QUE ##R CHA ##N ##CELL ##OR TI ##lE UN ##IVERS ##ITY OX ##FORD KN ##IG ##H ##T MI ##OST NO ##BLE ORDER GA ##iR ##TER LORD RES ##UM ##ING friendship with which your Lordship honoured earlier par our lives remembrance which shall ever retain with most lively real senti ments gratitude under fanc ##Eion your name beg leave introduce world following Letters hope your Lordship approbation work written late EARL CII ##ESTER ##FIELD important fubjeat Education will not fail secure that Public shall then feel myself happy allured merit us ##her ##ing world useful performance usual flyle Dc ##dic ##ations would confident unple ##afing your Lordship therefore therefore decline Merit conspicuous your requires panegyric only view dedi ##cat ##ing this work your Lord ##Ih ##ip that may lasting memorial how much how really charader Great Minister united that Virtuous Man refpea ##ded din ##fit ##eref ##led unprejud ##iced none more than LORD Your Lordship most obedient most humble Servant Golden Square March Ist EU ##GE ##NI ##A STA ##X ##H ##Or ##E ADV ##ER ADV ##ER ##TIS ##EMENT death late Earl CH ##ESTER ##FIELD recent his Family [SEP]
,,,,


### Print attribution scores 

In [10]:
for x in attributions_list:
    print_sorted_word_attributions(x, nmbr=10, order_by_value=False)
    print_sorted_word_attributions(x, nmbr=10)
    #print(sum([y[1] for y in x]))
    print()

First 10 words and their attribution scores of the text
[CLS]                0.0
COLLE                0.17
##CT                 0.08
##ION                0.13
PAR                  0.29
##LIAM               0.41
##ENT                0.2
##ARY                0.03
DE                   0.08
##B                  0.03
AT                   0.02

Top 10 words and attribution scores in descending order by attribution score
##LIAM               0.41
Commons              0.31
PAR                  0.29
bill                 0.27
Bill                 0.26
##ENT                0.2
bill                 0.18
COLLE                0.17
houses               0.16
Address              0.16
Commons              0.14


First 10 words and their attribution scores of the text
[CLS]                0.0
SC                   0.03
##Q                  0.01
##OO                 0.07
##L                  0.03
FOR                  -0.01
SC                   0.05
##A                  0.12
PER                  0.52
##FO 

### Multiple occurrences of a token get each an individual score

In [15]:
for a in attributions_list[0]: # The first text
    if a[0] == "Commons":
        print(a)


('Commons', 0.31470648441148935)
('Commons', 0.053644883696917195)
('Commons', 0.10049565962878419)
('Commons', 0.1410960342329182)
('Commons', 0.09881720486117523)
('Commons', 0.032593903464668915)
('Commons', 0.04972009496101735)
