In this notebook, we fine-tune the transformer model 'RoBERTa' on the AIS dataset and track its performance over 10 to 15 epochs. We also examine the performance of different interpretability techniques on RoBERTa. LIME is not included in the code.

In [None]:
#We first need to connect to our drive, in order to access the project's files and store results
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import sys
sys.path.append('/content/drive/MyDrive/Thesis')

In [None]:
#Now, it is time to install the appropriate version of the transformers library
!pip install transformers-interpret==0.5.2
!pip install transformers==4.15.0
!pip install lime==0.2.0.1 #this line is included in order for 'myExplainers.py' to load properly
!pip install transformers[torch]

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers-interpret==0.5.2
  Downloading transformers-interpret-0.5.2.tar.gz (29 kB)
  Preparing metadata (setup.py) ... done
Collecting transformers>=3.0.0 (from transformers-interpret==0.5.2)
  Downloading transformers-4.30.2-py3-none-any.whl (7.2 MB)
     7.2/7.2 MB 77.4 MB/s eta 0:00:00
Collecting captum>=0.3.1 (from transformers-interpret==0.5.2)
  Downloading captum-0.6.0-py3-none-any.whl (1.3 MB)
     1.3/1.3 MB 52.5 MB/s eta 0:00:00
Collecting huggingface-hub<1.0,>=0.14.1 (from transformers>=3.0.0->transformers-interpret==0.5.2)
  Downloading huggingface_hub-0.15.1-py3-none-any.whl (236 kB)
     236.8/236.8 kB 27.4 MB/s e

In [None]:
#Imports of libraries required for finetuning and explaining RoBERTa
from sklearn.metrics import confusion_matrix, f1_score, accuracy_score, precision_score, recall_score, average_precision_score
from sklearn.model_selection import train_test_split
from helper import print_results, print_results_ap
from sklearn.preprocessing import maxabs_scale
from myModel import MyModel, MyDataset
from myEvaluation import MyEvaluation
from myExplainers import MyExplainer
from scipy.special import softmax
from dataset import Dataset
import tensorflow as tf
from tqdm import tqdm
import pandas as pd
import numpy as np
import warnings
import datetime
import pickle
import torch
import time
import csv
import re

In [None]:
#defining the paths of the model and data
data_path = '/content/drive/MyDrive/Thesis/'
model_path = '/content/drive/MyDrive/Thesis/'
save_path = '/content/drive/MyDrive/Thesis/Results/'

Now, it is time to name the model and to define the parameters of 'MyModel'
class that loads transformer models.

In [None]:
model_name = 'roberta'
existing_rationales = False #no explanations
task = 'single_label' #single-label
sentence_level = False #token level
labels = 2 #two labels

Now, let us load the AIS dataset through the 'load_AIS' function of the 'Dataset' class in 'dataset.py'. Here, x holds the instances and y holds the labels.

In [None]:
ais = Dataset(path = data_path) #Dataset class is in 'dataset.py': parameters (path, x=None, y=None, rationales=None ,label_names=None)
x, y, label_names = ais.load_AIS() #function in Dataset class to load AIS dataset
label_names = ['class a', 'class b'] #the names for each of the two labels

In [None]:
train_texts, test_texts, train_labels, test_labels = train_test_split(x, y, test_size=.2, random_state=42)

size = (0.1 * len(y)) / len(train_labels)
train_texts, validation_texts, train_labels, validation_labels = train_test_split(list(train_texts), train_labels, test_size=size, random_state=42)
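
The two calls above give roughly a 70/10/20 train/validation/test split: the second `test_size` is rescaled so that the validation set is 10% of the full dataset rather than 10% of the remaining training portion. The sketch below reproduces that arithmetic in plain Python for a hypothetical dataset of 1000 instances. `split_sizes` is an illustrative helper that is not part of the project, and it assumes that a fractional `test_size` is rounded up to a whole number of instances.

```python
import math

def split_sizes(n_total, test_fraction=0.2, validation_fraction=0.1):
    """Return (train, validation, test) sizes for the two-step split."""
    # first split: hold out the test set (assumed to round up)
    n_test = math.ceil(test_fraction * n_total)
    n_remaining = n_total - n_test
    # second split: rescale so that validation is a fraction of the FULL dataset
    size = (validation_fraction * n_total) / n_remaining
    n_validation = math.ceil(size * n_remaining)
    n_train = n_remaining - n_validation
    return n_train, n_validation, n_test

# hypothetical dataset size, chosen only for illustration
print(split_sizes(1000))
```

Without the rescaling, a `test_size` of 0.1 in the second call would take 10% of the remaining 80%, that is only 8% of the full dataset.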

The dataset is not yet in a form that the transformer can process. We first need to define the model's tokenizer, which is then passed to the 'MyDataset' class in 'myModel.py'.

In [None]:
from transformers import RobertaTokenizerFast

#unlike BERT and DistilBERT, RoBERTa has no separate cased/uncased ('cs') variants
tokenizer = RobertaTokenizerFast.from_pretrained('roberta-base')

Downloading (…)olve/main/vocab.json:   0%|          | 0.00/899k [00:00<?, ?B/s]

Downloading (…)olve/main/merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

Downloading (…)/main/tokenizer.json:   0%|          | 0.00/1.36M [00:00<?, ?B/s]

Downloading (…)lve/main/config.json:   0%|          | 0.00/481 [00:00<?, ?B/s]

Now, it is time to transform the train, test and validation sets to the
appropriate form. We will use 'MyDataset' class from 'myModel.py'.

In [None]:
train_dataset = MyDataset(train_texts, train_labels, tokenizer)
validation_dataset = MyDataset(validation_texts, validation_labels, tokenizer)
#test_dataset = MyDataset(test_texts, test_labels, tokenizer)

But before using 'MyModel' class from 'myModel.py', RoBERTa should be finetuned!

In [None]:
from transformers import Trainer, TrainingArguments
from myTransformer import RobertaForSequenceClassification as transformer_model


#calling the base pretrained RoBERTa model
model = transformer_model.from_pretrained('roberta-base',num_labels = len(label_names) , output_attentions=True,
                              output_hidden_states=True)

#the training arguments that we will pass to the trainer of the transformers. 15 epochs were used for training
training_arguments = TrainingArguments(evaluation_strategy='epoch', save_strategy='epoch', logging_strategy='epoch',
                                                log_level='critical', output_dir='./results', num_train_epochs=15,
                                                per_device_train_batch_size=8, per_device_eval_batch_size=8,
                                                warmup_steps=200, weight_decay=0.01, logging_dir='./logs')

#passing to the trainer the model, the arguments and all train and validation instances
trainer = Trainer(model=model, args=training_arguments, train_dataset=train_dataset, eval_dataset=validation_dataset)

#Let's train the model!
trainer.train()

Downloading model.safetensors:   0%|          | 0.00/499M [00:00<?, ?B/s]

Some weights of the model checkpoint at roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.layer_norm.bias', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.dense.weight', 'lm_head.bias']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier.dense.bias', 'classifier.out_proj.weight', 'classifier.out_proj.bias', 'classifier.dense.weight']
You should pr

Epoch,Training Loss,Validation Loss
1,0.2631,0.149041
2,0.1717,0.167369
3,0.2136,0.161581
4,0.2395,0.160475
5,0.1207,0.161386
6,0.181,0.167841
7,0.2199,0.14746
8,0.1333,0.136501
9,0.093,0.123444
10,0.0833,0.123685


TrainOutput(global_step=3975, training_loss=0.13864070676407725, metrics={'train_runtime': 2766.9046, 'train_samples_per_second': 11.471, 'train_steps_per_second': 1.437, 'total_flos': 7372495104494400.0, 'train_loss': 0.13864070676407725, 'epoch': 15.0})
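
Because the validation loss is logged once per epoch, the table can be used to decide which epoch generalizes best. The sketch below uses only the rows visible in the output above (ten of the fifteen epochs) to find the epoch with the lowest validation loss, and derives the number of optimizer steps per epoch from `global_step`. The variable names are illustrative, and the upper bound on the training set size assumes a single device and no gradient accumulation.

```python
# (epoch, training loss, validation loss) rows copied from the log above
log_rows = [
    (1, 0.2631, 0.149041),
    (2, 0.1717, 0.167369),
    (3, 0.2136, 0.161581),
    (4, 0.2395, 0.160475),
    (5, 0.1207, 0.161386),
    (6, 0.181, 0.167841),
    (7, 0.2199, 0.14746),
    (8, 0.1333, 0.136501),
    (9, 0.093, 0.123444),
    (10, 0.0833, 0.123685),
]

# epoch with the lowest validation loss among the visible rows
best_epoch, _, best_validation_loss = min(log_rows, key=lambda row: row[2])
print('best epoch:', best_epoch, 'validation loss:', best_validation_loss)

# global_step=3975 was reached after num_train_epochs=15
steps_per_epoch = 3975 // 15
# with per_device_train_batch_size=8 (assumption: one device, no accumulation)
max_train_instances = steps_per_epoch * 8
print('steps per epoch:', steps_per_epoch, 'at most', max_train_instances, 'training instances')
```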

Now, it is time to save the model in the 'roberta_ais' directory.

In [None]:
trainer.model.save_pretrained('/content/drive/MyDrive/Thesis/roberta_ais')

Now, we can use 'MyModel' and then make predictions.

In [None]:
#new model
model = MyModel(model_path,'roberta_ais', model_name, task, labels, 'cased')

#the maximum number of tokens a single sentence can have, excluding special tokens (e.g. 510 for roberta-base)
max_sequence_len = model.tokenizer.max_len_single_sentence

#again the tokenizer is RobertaTokenizerFast, that is selected through 'MyModel' and '__load_model__' function
tokenizer = model.tokenizer

#move the model to the GPU when one is available
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model.trainer.model.to(device)

RobertaForSequenceClassification(
  (roberta): RobertaModel(
    (embeddings): RobertaEmbeddings(
      (word_embeddings): Embedding(50265, 768, padding_idx=1)
      (position_embeddings): Embedding(514, 768, padding_idx=1)
      (token_type_embeddings): Embedding(1, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): RobertaEncoder(
      (layer): ModuleList(
        (0-11): 12 x RobertaLayer(
          (attention): RobertaAttention(
            (self): RobertaSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): RobertaSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
             

Then, we measure the performance of the model using average precision score and f1 score (both macro).



In [None]:
predictions = []

#time for predictions
starting_prediction_time = time.time()

for test_text in test_texts:
    outputs = model.my_predict(test_text)
    predictions.append(outputs[0])

#printing the total time that predictions took
ending_prediction_time = time.time()
total_time = ending_prediction_time - starting_prediction_time
print('The total time for predictions is:' ,round(total_time,3),' seconds')

The total time for predictions is: 23.847  seconds


In [None]:
#labels of the predictions
pred_labels = []

for prediction in predictions:
    pred_labels.append(np.argmax(softmax(prediction)))

def average_precision_wrapper(y, y_pred, view):
    return average_precision_score(y, y_pred.toarray(), average=view)

#macro scores; both values are fractions in [0, 1], not percentages, despite the '%' in the labels
p_s = f"Average precision score: {round(average_precision_score(test_labels, pred_labels, average='macro'),4)} %"
f1 = f"f1 score score: {round(f1_score(test_labels, pred_labels, average='macro'),4)} %"

#printing results
print(p_s)
print(f1)

Average precision score: 0.8748 %
f1 score score: 0.9605 %
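
Note that the cell above passes the argmax labels (`pred_labels`) to `average_precision_score`. Average precision is a ranking metric, so it is usually computed from continuous scores such as the softmax probability of the positive class. The sketch below illustrates the difference on made-up logits. It uses a small pure-Python stand-in, `average_precision`, rather than the scikit-learn function, and all data and names are assumptions for illustration only.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def average_precision(y_true, scores):
    """Sum over score thresholds of (recall gain) * precision; label 1 is positive."""
    n_pos = sum(y_true)
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    ap, true_pos, prev_recall, i = 0.0, 0, 0.0, 0
    while i < len(ranked):
        j = i
        # instances tied at the same score share one threshold
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            true_pos += ranked[j][1]
            j += 1
        recall = true_pos / n_pos
        ap += (recall - prev_recall) * (true_pos / j)
        prev_recall, i = recall, j
    return ap

# made-up labels and logits for four instances
y_true = [1, 0, 1, 0]
logits = [[0.2, 1.5], [0.4, 0.9], [0.1, 0.3], [2.0, -1.0]]

hard_labels = [max(range(2), key=lambda c: row[c]) for row in logits]
positive_probs = [softmax(row)[1] for row in logits]

ap_hard = average_precision(y_true, hard_labels)     # ranking collapsed to 0/1
ap_soft = average_precision(y_true, positive_probs)  # ranking by probability
print(round(ap_hard, 4), round(ap_soft, 4))
```

With hard labels every predicted positive is tied at the same score, so the ranking information inside that group is lost. The two values can therefore differ even though the underlying predictions are the same.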


We could also tune the training hyperparameters, but the performance of RoBERTa is already satisfactory, so the focus shifts to the interpretations. Let's store the results in the 'Results' folder.

In [None]:
#the data to write in the file
data = (p_s, f1)
now = datetime.datetime.now()
file_name = save_path + 'ROBERTA_AIS'+str(now.day) + '_' + str(now.month) + '_' + str(now.year)

#results in files
with open(file_name+ 'PERFORMANCE_ON_AIS.pickle', 'wb') as handle:
    pickle.dump(data, handle, protocol=pickle.HIGHEST_PROTOCOL) #data
    #pickle.dump(f1, handle, protocol=pickle.HIGHEST_PROTOCOL)

with open(file_name+'TIME_ON_AIS.pickle', 'wb') as handle:
    pickle.dump(total_time, handle, protocol=pickle.HIGHEST_PROTOCOL)

Let's ensure that the results load properly from the files in which we stored them.

In [None]:
with open(file_name+'PERFORMANCE_ON_AIS.pickle', 'rb') as handle:
     performance = pickle.load(handle)
     for score in performance:
         print(score)

with open(file_name+'TIME_ON_AIS.pickle', 'rb') as handle:
     #use a separate name so that the 'time' module is not shadowed
     stored_time = pickle.load(handle)
     print('The total time for predictions is:' ,round(stored_time,3),' seconds')

Average precision score: 0.8748 %
f1 score score: 0.9605 %
The total time for predictions is: 23.847  seconds


Now, let us initialize the explainers and the evaluation module, and define the metrics that will be used. The metric abbreviations are:
* F=Faithfulness
* FTP=RFT (Ranked Faithful Truthfulness)
* NZW=Complexity

In [None]:
#layers are 12 this time
my_explainers = MyExplainer(label_names, model, layers=12)

#complexity, faithfulness, RFT
my_evaluators = MyEvaluation(label_names, model.my_predict, False, True, tokenizer=tokenizer) #parameters: (label_names, predict, sentence_level, evaluation_level_all=True)
my_evaluatorsP = MyEvaluation(label_names, model.my_predict, False, False, tokenizer=tokenizer)

evaluation =  {'F':my_evaluators.faithfulness, 'FTP': my_evaluators.faithful_truthfulness_penalty,
          'NZW': my_evaluators.nzw}
evaluationP = {'F':my_evaluatorsP.faithfulness, 'FTP': my_evaluatorsP.faithful_truthfulness_penalty,
          'NZW': my_evaluatorsP.nzw}

We will now measure the performance of IG.

In [None]:
import time
with warnings.catch_warnings():

    #ignore the warnings
    warnings.simplefilter("ignore", category=RuntimeWarning)

    #date
    now = datetime.datetime.now()

    #saving results
    file_name = save_path + 'AIS_ROBERTA_IG_'+str(now.day) + '_' + str(now.month) + '_' + str(now.year)

    #metrics
    metrics = {'F':[], 'FTP':[], 'NZW':[]}
    metricsP = {'F':[], 'FTP':[], 'NZW':[]}

    #time_r = [[],[]]: sublists for each technique
    time_r = [ [] ] #now only ig is present

    #neighbours
    #my_explainers.neighbours = 2000

    #ig
    techniques = [my_explainers.ig]

    #for each test instance
    for ind in tqdm(range(len(test_texts))): #progress bar

        #to not run out of memory
        torch.cuda.empty_cache()

        #the instance of test set
        instance = test_texts[ind]

        #resetting the state memory
        my_evaluators.clear_states()
        my_evaluatorsP.clear_states()

        #prediction, attention matrix and hidden states. Here we care about predictions
        prediction, _, _ = model.my_predict(instance)

        #RobertaTokenizerFast
        enc = model.tokenizer([instance,instance], truncation=True, padding=True)[0] #Encoding object of the first sequence in the batch

        #real tokens or padding: extracting the mask
        mask = enc.attention_mask

        #token strings, including the special tokens
        tokens = enc.tokens

        interpretations = []
        kk = 0

        #only IG for now; the loop is kept so that other techniques can be added later
        for technique in techniques:
            ts = time.time()

            #returns interpretations
            temp = technique(instance, prediction, tokens, enc.ids, _, _) #no attention and hidden states

            #normalization in interpretations
            interpretations.append([np.array(i)/np.max(abs(np.array(i))) for i in temp])

            #append the time it took
            time_r[kk].append(time.time()-ts)
            kk = kk + 1

        #'F','FTP','NZW'
        for metric in metrics.keys():
            evaluated = []
            for interpretation in interpretations:

                #all parameters: interpretation, tweaked_interpretation, instance, prediction, tokens, hidden_states, t_hidden_states, rationales
                evaluated.append(evaluation[metric](interpretation, _, instance, prediction, tokens, _, _, _))

            #save evaluations in dict
            metrics[metric].append(evaluated)

        #copy of saved state
        my_evaluatorsP.saved_state = my_evaluators.saved_state.copy()

        #clear again all states
        my_evaluators.clear_states()

        for metric in metrics.keys():
            evaluatedP = []
            for interpretation in interpretations:

                #in a similar way as 'evaluation'
                evaluatedP.append(evaluationP[metric](interpretation, _, instance, prediction, tokens, _, _, _))

            #save evaluations
            metricsP[metric].append(evaluatedP)

        #write results to files
        with open(file_name+'(A).pickle', 'wb') as handle:
            pickle.dump(metrics, handle, protocol=pickle.HIGHEST_PROTOCOL)
        with open(file_name+'(P).pickle', 'wb') as handle:
            pickle.dump(metricsP, handle, protocol=pickle.HIGHEST_PROTOCOL)
        with open(file_name+'_TIME.pickle', 'wb') as handle:
            pickle.dump(time_r, handle, protocol=pickle.HIGHEST_PROTOCOL)

time_r = np.array(time_r)
time_r.mean(axis=1)

100%|██████████| 605/605 [1:00:43<00:00,  6.02s/it]


array([1.19026279])
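
The IG cell normalizes each interpretation by its largest absolute weight, so that all weights fall in [-1, 1]. If an interpretation were all zeros, that division would be 0/0, which in NumPy yields NaN together with a RuntimeWarning; this may be one reason the cell silences that warning category. The sketch below shows the same scaling in plain Python with an explicit guard. `max_abs_normalize` is a hypothetical helper, and returning the weights unchanged in the all-zero case is an assumption, not the notebook's behaviour.

```python
def max_abs_normalize(weights):
    """Scale weights into [-1, 1] by their largest absolute value."""
    largest = max(abs(w) for w in weights)
    if largest == 0:
        # nothing to scale: dividing here would be 0/0
        return list(weights)
    return [w / largest for w in weights]

print(max_abs_normalize([0.5, -2.0, 1.0]))
print(max_abs_normalize([0.0, 0.0]))
```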

In [None]:
print(time_r)
print(time_r.mean(axis=1))

[[0.78159523 0.49816847 0.51699853 0.83777356 0.92657185 1.71087122
  2.58008814 1.08097816 1.1284318  1.25613499 1.12848997 2.09711313
  1.77950954 1.60280442 0.85066175 0.51620555 0.59091997 0.85279918
  0.86865735 3.19513083 1.59137106 1.32690096 0.73097229 2.67304301
  2.34632516 0.57950306 0.53350472 0.47746515 0.39296961 1.27008009
  0.88113761 0.50737739 0.69517303 0.63658786 1.77142572 0.5267024
  5.85346293 0.50705767 1.58502507 0.89132977 0.69068074 0.61295199
  1.59618998 0.51980233 2.41342187 0.70619917 0.57441783 1.41328502
  0.69363499 1.12785196 1.55663276 1.33428097 0.51152921 0.87528586
  0.69900227 0.51734018 0.51792431 0.98744965 0.50767899 0.60384536
  1.35306907 0.89589739 1.69043708 3.369138   0.51810765 1.25248051
  3.24055529 1.76160383 0.50752401 1.17659068 0.59715676 1.85795522
  0.78193307 0.57913446 1.75578499 0.86573219 0.61603332 4.00064564
  1.5807085  0.51495433 2.83167267 0.63396716 0.85435891 0.61470056
  0.5096612  0.51472521 0.70613289 0.47670889 0.6

Now, let us print the results for IG.

In [None]:
print_results(file_name+'(A)', [' IG  '], metrics, label_names)

F
 IG    0.006949999835342169 | -0.00012 0.01402
FTP
 IG    0.01573 | 0.01573 0.01573
NZW
 IG    1.0 | 1.0 1.0




In [None]:
print_results(file_name+'(P)', [' IG  '], metricsP, label_names)

F
 IG    0.04995 | 0.0091 0.0908
FTP
 IG    0.03399 | 0.00762 0.06035
NZW
 IG    1.0 | 1.0 1.0


We will now experiment on various attention setups.

In [None]:
conf = []
#'Mean', 'Multi', 0, 1, ..., 11
for ci in ['Mean', 'Multi'] + list(range(12)):

    #'Mean', 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
    for ce in ['Mean'] + list(range(12)):

        # Matrix: From, To, MeanColumns, MeanRows, MaxColumns, MaxRows (rows?)
        for cp in ['From', 'To', 'MeanColumns', 'MaxColumns']:

            # Selection: True: select layers per head, False: do not
            for cl in [False]:
                conf.append([ci, ce, cp, cl])

len(conf) #14*13*4*1 = 728

728
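
The nested loops above enumerate a Cartesian product of the option lists. As a cross-check of the count, the sketch below builds the same grid with `itertools.product`; the `*_options` names mirror the loop variables `ci`, `ce`, `cp` and `cl` and are otherwise illustrative.

```python
from itertools import product

ci_options = ['Mean', 'Multi'] + list(range(12))              # 14 options
ce_options = ['Mean'] + list(range(12))                       # 13 options
cp_options = ['From', 'To', 'MeanColumns', 'MaxColumns']      # 4 options
cl_options = [False]                                          # 1 option

# product() varies the last argument fastest, like the innermost loop
conf_sketch = [list(c) for c in product(ci_options, ce_options, cp_options, cl_options)]
print(len(conf_sketch))

# dropping 'Multi' from the first list leaves 13 * 13 * 4 setups
conf_no_multi = [c for c in conf_sketch if c[0] != 'Multi']
print(len(conf_no_multi))
```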

In [None]:
import time
with warnings.catch_warnings():

    #ignore the warnings
    warnings.simplefilter("ignore", category=RuntimeWarning)

    #date
    now = datetime.datetime.now()

    #saving results
    file_name = save_path + 'AIS_ROBERTA_ATTENTION_'+str(now.day) + '_' + str(now.month) + '_' + str(now.year)

    #metrics
    metrics = {'FTP':[], 'F':[], 'NZW':[]}
    metricsP = {'FTP':[], 'F':[], 'NZW':[]}

    #times
    time_r = []
    time_b = []
    time_b2 = []

    #attentions setups
    for con in conf:
        time_r.append([])

    #for each test instance
    for ind in tqdm(range(len(test_texts))):

        #to not run out of memory
        torch.cuda.empty_cache()

        #one instance
        instance = test_texts[ind]

        #clear states of evaluators
        my_evaluators.clear_states()
        my_evaluatorsP.clear_states()

        #reset the cache of calculated configurations
        my_explainers.save_states = {}

        #prediction, attention matrix and hidden states. Here we care about predictions and attention.
        prediction, attention, _ = model.my_predict(instance)

        #RobertaTokenizerFast
        enc = model.tokenizer([instance,instance], truncation=True, padding=True)[0]

        #real tokens or padding: extracting the mask
        mask = enc.attention_mask

        #extract special tokens
        tokens = enc.tokens

        interpretations = []
        kk = 0
        for con in conf:

            #time
            ts = time.time()

            #set configuration
            my_explainers.config = con

            #returns interpretations
            temp = my_explainers.my_attention(instance, prediction, tokens, mask, attention, _) #no hidden states

            #scaling interpretations
            interpretations.append([maxabs_scale(i) for i in temp])

            #append time
            time_r[kk].append(time.time()-ts)
            kk = kk + 1

        #'F','FTP','NZW'
        for metric in metrics.keys():
            evaluated = []
            k = 0

            for interpretation in interpretations:
                tt = time.time()

                #all parameters: interpretation, tweaked_interpretation, instance, prediction, tokens, hidden_states, t_hidden_states, rationales
                evaluated.append(evaluation[metric](interpretation, _, instance, prediction, tokens, _, _, _))
                k = k + (time.time()-tt) #time
            if metric == 'FTP':
                time_b.append(k)
            metrics[metric].append(evaluated)

        my_evaluatorsP.saved_state = my_evaluators.saved_state.copy()

        for metricP in metricsP.keys():
            evaluated = []
            k = 0

            for interpretation in interpretations:
                tt = time.time()

                #all parameters: interpretation, tweaked_interpretation, instance, prediction, tokens, hidden_states, t_hidden_states, rationales
                evaluated.append(evaluationP[metricP](interpretation, _, instance, prediction, tokens, _, _, _))
                k = k + (time.time()-tt)

            if metricP == 'FTP':
                time_b2.append(k)
            metricsP[metricP].append(evaluated)

        if(ind != 0):
            with open(file_name+' (A).pickle', 'rb') as handle:
                old_metrics = pickle.load(handle)
            with open(file_name+' (P).pickle', 'rb') as handle:
                old_metricsP = pickle.load(handle)

            #append new results
            for key in metrics.keys():
                old_metrics[key].append(metrics[key][0])
                old_metricsP[key].append(metricsP[key][0])
        else:
            old_metrics = metrics
            old_metricsP = metricsP

        #save metrics as below
        with open(file_name+' (A).pickle', 'wb') as handle:
            pickle.dump(old_metrics, handle, protocol=pickle.HIGHEST_PROTOCOL)
        with open(file_name+' (P).pickle', 'wb') as handle:
            pickle.dump(old_metricsP, handle, protocol=pickle.HIGHEST_PROTOCOL)
        with open(file_name+'_TIME.pickle', 'wb') as handle:
            pickle.dump(time_r, handle, protocol=pickle.HIGHEST_PROTOCOL)

        del old_metrics,old_metricsP
        metrics = {'FTP':[], 'F':[], 'NZW':[]}
        metricsP = {'FTP':[], 'F':[], 'NZW':[]}

#times
time_r = np.array(time_r)
time_r.mean(axis=1).min(),time_r.mean(axis=1).max(), time_r.mean(axis=1).mean(), time_r.sum(axis=1).mean(), np.mean(time_b), np.mean(time_b2)

100%|██████████| 605/605 [6:02:58<00:00, 36.00s/it]


(0.0018939014308708758,
 0.006336707517135242,
 0.0023449649982713954,
 1.4187038239541945,
 17.65660125834883,
 6.502975096190271)
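
The six values in the tuple above are, in order: the minimum, maximum and mean per-call time of an attention setup, the mean total time per setup over all instances, and the mean per-instance FTP evaluation time for the two evaluators. The sketch below checks that these numbers are mutually consistent and compares the cost of producing the interpretations with the cost of evaluating them. It only reuses values printed above; the variable names are illustrative.

```python
n_instances = 605   # from the progress bar
n_configs = 728     # from len(conf)

mean_time_per_call = 0.0023449649982713954      # third value of the tuple
mean_total_per_config = 1.4187038239541945      # fourth value of the tuple
ftp_eval_all = 17.65660125834883                # np.mean(time_b)
ftp_eval_p = 6.502975096190271                  # np.mean(time_b2)

# each setup is called once per instance, so mean call time * 605 = mean total
reconstructed_total = mean_time_per_call * n_instances
print(reconstructed_total)

# producing all 728 attention interpretations for one instance
explain_time_per_instance = mean_time_per_call * n_configs
# evaluating FTP for all of them, with both evaluators
ftp_time_per_instance = ftp_eval_all + ftp_eval_p
print(round(explain_time_per_instance, 2), round(ftp_time_per_instance, 2))
```

Under this reading, FTP evaluation takes about 24 of the roughly 36 seconds per instance, while extracting the attention-based interpretations takes under 2 seconds.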

In [None]:
print(time_r)
print(time_r.mean(axis=1).min())
time_r.mean(axis=1).max()
time_r.sum(axis=1).mean()
print(time_b)
np.mean(time_b)
print(time_b2)
np.mean(time_b2)

[[0.00175571 0.00143337 0.00200272 ... 0.00522447 0.00221515 0.00157142]
 [0.00107288 0.00098825 0.00141764 ... 0.002563   0.0013926  0.00088835]
 [0.00103211 0.00089455 0.00139713 ... 0.00242925 0.00138664 0.00091553]
 ...
 [0.00077558 0.00069785 0.00081992 ... 0.0028944  0.00122213 0.00064015]
 [0.00078058 0.00078249 0.00071907 ... 0.00273681 0.00110555 0.00072479]
 [0.00072169 0.00070405 0.00077486 ... 0.00229907 0.00101161 0.00069737]]
0.0018939014308708758
[7.987764120101929, 6.540454864501953, 6.010181665420532, 8.181306838989258, 11.642895936965942, 30.464173793792725, 58.58109664916992, 13.80558967590332, 14.400312662124634, 16.43560266494751, 14.037477970123291, 37.34131455421448, 32.230727672576904, 23.11101794242859, 7.975375413894653, 3.7676632404327393, 4.6281492710113525, 8.091620922088623, 8.615951538085938, 72.5359537601471, 24.046855211257935, 16.94494652748108, 7.151939630508423, 58.36054301261902, 44.167465686798096, 3.9738316535949707, 3.6704206466674805, 2.75102877

6.502975096190271

In [None]:
#print_results(file_name+' (A)', conf, metrics, label_names)

with open(file_name+' (A).pickle', 'rb') as handle:
    metrics = pickle.load(handle)

In [None]:
#print_results(file_name+' (P)', conf, metricsP, label_names)

with open(file_name+' (P).pickle', 'rb') as handle:
    metricsP = pickle.load(handle)

We calculate the best attention setup using Optimus variations (we do not use the Optimus implementation at this step).

In [None]:
print_results_ap(metrics, label_names, conf)



Baseline: -1.5827614384920918e-08  and NZW: 1.0
Max Across: -3.571144974224042e-09  and NZW: 1.0
Per Label Per Instance: 0.022535362451564796  and NZW:  0.9960016206138744
Per Instance: 2.5943074867182392e-08  and NZW:  0.9953173985173064


In [None]:
print_results_ap(metricsP, label_names, conf)

Baseline: 0.09533634569648497  and NZW: 1.0
Max Across: 0.09877549364675645  and NZW: 1.0
Per Label Per Instance: 0.11612263414100682  and NZW:  0.9978710198795792
Per Instance: 0.11612263414100682  and NZW:  0.9978710198795792
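
The implementation of `print_results_ap` is not shown here, so the following is only one plausible reading of the labels: 'Baseline' scores a single fixed setup, 'Max Across' picks the one setup with the best average score, and 'Per Instance' picks the best setup separately for each instance. Under that assumption the three values can only increase in that order, which matches the outputs above. The sketch uses made-up scores and ignores the per-label variant.

```python
# made-up scores[instance][setup]; higher is better
scores = [
    [0.10, 0.30, 0.05],
    [0.20, 0.10, 0.40],
    [0.15, 0.25, 0.10],
]
n_setups = len(scores[0])

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# assumption: the baseline is the first setup
baseline = mean(row[0] for row in scores)
# best single setup, judged by its average over all instances
max_across = max(mean(row[c] for row in scores) for c in range(n_setups))
# best setup chosen independently for every instance
per_instance = mean(max(row) for row in scores)

print(round(baseline, 4), round(max_across, 4), round(per_instance, 4))
```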


We repeat the process with the raw attention scores (A*), which can take negative values because the softmax function is skipped. In the attention setups, we exclude the multiplication option ('Multi') for heads and layers, as a few combinations reach +/-inf.
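
To make the difference between A and A* concrete, the sketch below computes scaled dot-product scores for a toy pair of query and key matrices, with and without the softmax. It is plain Python on made-up 2-dimensional vectors, not the notebook's code; `raw_attention_scores` and `softmax_rows` are hypothetical helper names.

```python
import math

def raw_attention_scores(queries, keys):
    """Scaled dot products q.k / sqrt(d) for every query/key pair (no softmax)."""
    d = len(queries[0])
    return [[sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
            for q in queries]

def softmax_rows(scores):
    """Row-wise softmax, turning each row of scores into attention weights."""
    rows = []
    for row in scores:
        peak = max(row)
        exps = [math.exp(s - peak) for s in row]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows

# made-up queries and keys for two tokens
queries = [[1.0, 0.0], [0.0, -1.0]]
keys = [[1.0, 1.0], [-1.0, 2.0]]

a_star = raw_attention_scores(queries, keys)  # may contain negative values
a = softmax_rows(a_star)                      # strictly positive, rows sum to 1
print(a_star)
print(a)
```

The raw scores are unbounded and signed, whereas the softmax output is always positive and normalized per row.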

In [None]:
conf = []
for ci in ['Mean'] + list(range(12)):
    for ce in ['Mean'] + list(range(12)):
        for cp in ['From', 'To', 'MeanColumns', 'MaxColumns']: # Matrix: From, To, MeanColumns, MeanRows, MaxColumns, MaxRows
            for cl in [False]: # Selection: True: select layers per head, False: do not
                conf.append([ci, ce, cp, cl])
len(conf)

676

In [None]:
import time
import math
with warnings.catch_warnings():

    warnings.simplefilter("ignore", category=RuntimeWarning)

    now = datetime.datetime.now()

    file_name = save_path + 'AIS_ROBERTA_A_ATTENTION_NO_SOFTMAX_'+str(now.day) + '_' + str(now.month) + '_' + str(now.year)

    metrics = {'FTP':[], 'F':[], 'NZW':[]}
    metricsP = {'FTP':[], 'F':[], 'NZW':[]}

    time_r = []
    time_b = []
    time_b2 = []

    for con in conf:
        time_r.append([])

    for ind in tqdm(range(len(test_texts))):
        torch.cuda.empty_cache()

        instance = test_texts[ind]

        my_evaluators.clear_states()
        my_evaluatorsP.clear_states()

        my_explainers.save_states = {}

        prediction, _, hidden_states = model.my_predict(instance)

        enc = model.tokenizer([instance,instance], truncation=True, padding=True)[0]

        mask = enc.attention_mask

        tokens = enc.tokens

        attention = []

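        for la in range(12):
            #raw (pre-softmax) attention scores for every head of layer 'la'
            layer_scores = []
            attention_module = model.trainer.model.roberta.encoder.layer[la].attention
            #hidden_states[la] is used as the input to layer 'la'
            layer_input = torch.tensor(hidden_states[la]).to('cuda')
            keys = attention_module.self.key(layer_input)
            queries = attention_module.self.query(layer_input)
            for he in range(12):
                #each of the 12 heads uses a 64-dimensional slice (768 / 12) of the projections
                head_scores = torch.matmul(queries[:,he*64:(he+1)*64], keys[:,he*64:(he+1)*64].transpose(-1, -2))
                #scale by the square root of the head dimension
                head_scores = head_scores / math.sqrt(64)
                layer_scores.append(head_scores.cpu().detach().numpy())
            attention.append(layer_scores)
        attention = np.array(attention)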

        interpretations = []
        kk = 0
        for con in conf:
            ts = time.time()
            my_explainers.config = con
            temp = my_explainers.my_attention(instance, prediction, tokens, mask, attention, _)
            interpretations.append([maxabs_scale(i) for i in temp])
            time_r[kk].append(time.time()-ts)
            kk = kk + 1
        for metric in metrics.keys():
            evaluated = []
            k = 0
            for interpretation in interpretations:
                tt = time.time()
                evaluated.append(evaluation[metric](interpretation, _, instance, prediction, tokens, _, _, _))
                k = k + (time.time()-tt)
            if metric == 'FTP':
                time_b.append(k)
            metrics[metric].append(evaluated)
        my_evaluatorsP.saved_state = my_evaluators.saved_state.copy()
        for metric in metrics.keys():
            evaluated = []
            k = 0
            for interpretation in interpretations:
                tt = time.time()
                evaluated.append(evaluationP[metric](interpretation, _, instance, prediction, tokens, _, _, _))
                k = k + (time.time()-tt)
            if metric == 'FTP':
                time_b2.append(k)
            metricsP[metric].append(evaluated)
        with open(file_name+' (A).pickle', 'wb') as handle:
            pickle.dump(metrics, handle, protocol=pickle.HIGHEST_PROTOCOL)
        with open(file_name+' (P).pickle', 'wb') as handle:
            pickle.dump(metricsP, handle, protocol=pickle.HIGHEST_PROTOCOL)
        with open(file_name+'_TIME.pickle', 'wb') as handle:
            pickle.dump(time_r, handle, protocol=pickle.HIGHEST_PROTOCOL)
time_r = np.array(time_r)
time_r.mean(axis=1).min(),time_r.mean(axis=1).max(), time_r.mean(axis=1).mean(), time_r.sum(axis=1).mean(), np.mean(time_b), np.mean(time_b2)

100%|██████████| 605/605 [4:45:54<00:00, 28.35s/it]


(0.002224014219173715,
 0.004559555526607292,
 0.0023184191044446417,
 1.4026435581890084,
 15.354257265595365,
 5.426142300456023)

In [None]:
print_results(file_name+' (A)', conf, metrics, label_names)

FTP
['Mean', 'Mean', 'From', False]  -0.0 | -0.01272 0.01272
['Mean', 'Mean', 'To', False]  -0.0 | -0.0016 0.0016
['Mean', 'Mean', 'MeanColumns', False]  -0.0 | -0.01345 0.01345
['Mean', 'Mean', 'MaxColumns', False]  -0.0 | -0.00818 0.00818
['Mean', 0, 'From', False]  -0.0 | -0.01991 0.01991
['Mean', 0, 'To', False]  -0.0 | -0.00143 0.00143
['Mean', 0, 'MeanColumns', False]  -0.0 | -0.01814 0.01814
['Mean', 0, 'MaxColumns', False]  -0.0 | -0.00719 0.00719
['Mean', 1, 'From', False]  -0.0 | -0.00607 0.00607
['Mean', 1, 'To', False]  -0.0 | 0.00185 -0.00185
['Mean', 1, 'MeanColumns', False]  -0.0 | -0.01405 0.01405
['Mean', 1, 'MaxColumns', False]  -0.0 | -0.01394 0.01394
['Mean', 2, 'From', False]  -0.0 | -0.01014 0.01014
['Mean', 2, 'To', False]  -0.0 | -0.0012 0.0012
['Mean', 2, 'MeanColumns', False]  -0.0 | -0.01596 0.01596
['Mean', 2, 'MaxColumns', False]  -0.0 | -0.00231 0.00231
['Mean', 3, 'From', False]  -0.0 | -0.01024 0.01024
['Mean', 3, 'To', False]  -0.0 | -0.00164 0.00164
['Mean', 3, 'MeanColumns', False]  -0.0 | -0.00854 0.00854
['Mean', 3, 'MaxColumns', False]  -0.0 | -0.01024 0.01024
['Mean', 4, 'From', False]  -0.0 | -0.00869 0.00869
['Mean', 4, 'To', False]  -0.0 | -0.00027 0.00027
['Mean', 4, 'MeanColumns', False]  -0.0 | -0.01161 0.01161
['Mean', 4, 'MaxColumns', False]  -0.0 | -0.0173 0.0173
['Mean', 5, 'From', False]  -0.0 | -0.01258 0.01258
['Mean', 5, 'To', False]  -0.0 | -0.00521 0.00521
['Mean', 5, 'MeanColumns', False]  -0.0 | -0.00674 0.00674
['Mean', 5, 'MaxColumns', False]  -0.0 | -0.00406 0.00406
['Mean', 6, 'From', False]  -0.0 | -0.01008 0.01008
['Mean', 6, 'To', False]  -0.0 | -0.00115 0.00115
['Mean', 6, 'MeanColumns', False]  -0.0 | -0.01378 0.01378
['Mean', 6, 'MaxColumns', False]  -0.0 | -0.01191 0.01191
['Mean', 

In [None]:
print_results(file_name+' (P)', conf, metricsP, label_names)

FTP
['Mean', 'Mean', 'From', False]  0.09108 | 0.01528 0.16689
['Mean', 'Mean', 'To', False]  0.03121 | 0.008 0.05442
['Mean', 'Mean', 'MeanColumns', False]  0.09269 | 0.01504 0.17034
['Mean', 'Mean', 'MaxColumns', False]  0.08151 | 0.01688 0.14614
['Mean', 0, 'From', False]  0.09386 | 0.00895 0.17877
['Mean', 0, 'To', False]  0.02681 | 0.00681 0.04681
['Mean', 0, 'MeanColumns', False]  0.08364 | 0.00757 0.15971
['Mean', 0, 'MaxColumns', False]  0.05398 | 0.00941 0.09855
['Mean', 1, 'From', False]  0.05963 | 0.01226 0.107
['Mean', 1, 'To', False]  0.02773 | 0.01037 0.04508
['Mean', 1, 'MeanColumns', False]  0.08752 | 0.01286 0.16217
['Mean', 1, 'MaxColumns', False]  0.06963 | 0.00746 0.1318
['Mean', 2, 'From', False]  0.05297 | 0.00615 0.09978
['Mean', 2, 'To', False]  0.01866 | 0.00454 0.03279
['Mean', 2, 'MeanColumns', False]  0.07183 | 0.00613 0.13754
['Mean', 2, 'MaxColumns', False]  0.03087 | 0.00718 0.05455
['Mean', 3, 'From', False]  0.08348 | 0.01543 0.15154
['Mean', 3, 'To', F

We calculate the best attention setup using Optimus variations.

In [None]:
print_results_ap(metrics, label_names, conf)

Baseline: -1.4386499090981997e-08  and NZW: 1.0
Max Across: 1.144073020695191e-08  and NZW: 1.0
Per Label Per Instance: 0.04489818362654563  and NZW:  1.0
Per Instance: 7.275825034992744e-08  and NZW:  1.0


In [None]:
print_results_ap(metricsP, label_names, conf)

Baseline: 0.09108471210156674  and NZW: 1.0
Max Across: 0.09523445617607847  and NZW: 1.0
Per Label Per Instance: 0.11587942421468539  and NZW:  1.0
Per Instance: 0.11587942421468539  and NZW:  1.0
