# Interpretation of BertForSequenceClassification in Captum

In [1]:
from captum.attr import visualization as viz
from captum.attr import IntegratedGradients, LayerConductance, LayerIntegratedGradients
from captum.attr import configure_interpretable_embedding_layer, remove_interpretable_embedding_layer

from detector import Detector
import torch

In [2]:
# import sys
#
# print(sys.executable)
# model2 = BertForSequenceClassification.from_pretrained('../model/')


In [3]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

In [4]:

# load model
model = Detector("./base.pt")
model.model.to(device)
model.model.eval()
model.model.zero_grad()

# load tokenizer
tokenizer = model.tokenizer

Initializing Detector...


In [5]:
def predict(inputs):
    return model.model(inputs)[0]

In [6]:
ref_token_id = tokenizer.pad_token_id # Padding token, used to build the reference (baseline) input
sep_token_id = tokenizer.sep_token_id # Separator token, appended to the end of the text
cls_token_id = tokenizer.cls_token_id # Classification token, prepended to the text

In [7]:
def construct_input_ref_pair(text, ref_token_id, sep_token_id, cls_token_id):

    text_ids = tokenizer.encode(text, add_special_tokens=False)
    # construct input token ids
    input_ids = [cls_token_id] + text_ids + [sep_token_id]
    # construct reference token ids 
    ref_input_ids = [cls_token_id] + [ref_token_id] * len(text_ids) + [sep_token_id]

    return torch.tensor([input_ids], device=device), torch.tensor([ref_input_ids], device=device), len(text_ids)

def construct_input_ref_token_type_pair(input_ids, sep_ind=0):
    seq_len = input_ids.size(1)
    token_type_ids = torch.tensor([[0 if i <= sep_ind else 1 for i in range(seq_len)]], device=device)
    ref_token_type_ids = torch.zeros_like(token_type_ids)  # the reference uses segment 0 at every position
    return token_type_ids, ref_token_type_ids

def construct_input_ref_pos_id_pair(input_ids):
    seq_length = input_ids.size(1)
    position_ids = torch.arange(seq_length, dtype=torch.long, device=device)
    # we could potentially also use random permutation with `torch.randperm(seq_length, device=device)`
    ref_position_ids = torch.zeros(seq_length, dtype=torch.long, device=device)

    position_ids = position_ids.unsqueeze(0).expand_as(input_ids)
    ref_position_ids = ref_position_ids.unsqueeze(0).expand_as(input_ids)
    return position_ids, ref_position_ids
    
def construct_attention_mask(input_ids):
    return torch.ones_like(input_ids)

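def custom_forward(inputs):
    # attribution target: probability of class 0 for the first example, as a 1-element tensor
    preds = predict(inputs)
    return torch.softmax(preds, dim=1)[0][0].unsqueeze(-1)

def summarize_attributions(attributions):
    # sum over the embedding dimension to get one score per token, then scale to unit L2 norm
    attributions = attributions.sum(dim=-1).squeeze(0)
    attributions = attributions / torch.norm(attributions)
    return attributions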

lig = LayerIntegratedGradients(custom_forward, model.model.roberta.embeddings)
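
`LayerIntegratedGradients` attributes the output of `custom_forward` to the embedding layer by integrating gradients along a straight path from the baseline to the input. The sketch below illustrates that idea on a toy function in plain Python. It is not Captum's implementation: the function names, the toy model, the midpoint integration rule, and the sign convention for `delta` are all assumptions made for illustration. It also shows what a convergence delta measures, namely how far the attributions are from summing to the difference in output between input and baseline.

```python
def integrated_gradients(f, grad_f, x, baseline, n_steps=50):
    """Midpoint Riemann-sum approximation of Integrated Gradients for one input."""
    total_grad = [0.0] * len(x)
    for k in range(n_steps):
        alpha = (k + 0.5) / n_steps  # midpoint of the k-th sub-interval of [0, 1]
        # point on the straight line between baseline and input
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        total_grad = [t + g for t, g in zip(total_grad, grad_f(point))]
    # attribution_i = (x_i - baseline_i) * average gradient along the path
    attributions = [(xi - b) * t / n_steps for xi, b, t in zip(x, baseline, total_grad)]
    # completeness check: attributions should sum to f(x) - f(baseline)
    delta = sum(attributions) - (f(x) - f(baseline))
    return attributions, delta


# Hypothetical toy model: f(x) = x0^2 + 3 * x1, with its analytic gradient
def toy_f(x):
    return x[0] ** 2 + 3.0 * x[1]


def toy_grad(x):
    return [2.0 * x[0], 3.0]


attrs, delta = integrated_gradients(toy_f, toy_grad, x=[2.0, 1.0], baseline=[0.0, 0.0])
print(attrs, delta)
```

For this toy function the attributions come out close to 4 and 3, and `delta` is close to zero. A large delta from `lig.attribute` would suggest that the integral was approximated too coarsely.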

In [13]:
# model.model(input_ids)

In [14]:
# predict(input_ids)

tensor([0.0032], device='cuda:0', grad_fn=<UnsqueezeBackward0>)

In [8]:
NUM_OF_TEXT = 10
MUTATION, REAL, SYNTHETIC = 0, 1, 2
FILE_TYPE = SYNTHETIC
# DATA_FILE = './data/Test_WikiHumanQuarterSet.json'
# DATA_FILE = './data/Test_WikiMutationFullReplaceAntonyms.json'
# DATA_FILE = './data/Test_WikiMutationFullReplaceRandomWords.json'
# DATA_FILE = './data/Test_WikiMutationFullReplaceSynonyms.json'
# DATA_FILE = './data/Test_WikiMutationFullSet.json'
# DATA_FILE = './data/Test_WikiMutationFullSetDeleteArticles.json'
# DATA_FILE = './data/Test_WikiMutationFullSetMisspellings.json'
# DATA_FILE = './data/Test_WikiMutationFullSetReplaceAE.json'
# DATA_FILE = './data/Test_WikiMutationQuarterSet.json'
# DATA_FILE = './data/Test_WikiSyntheticFullSet.json'
DATA_FILE = './data/Test_WikiSyntheticQuarterSet.json'

In [4]:
from utils2 import load_standard_json
import random
text_list = load_standard_json(DATA_FILE, True)
print(text_list[0])
random.shuffle(text_list)

Guinea has an economy that is mainly based on agriculture and its large population AIDS-Free Life expectancy at birth is currently among the lowest in the world. 


In [9]:
text_list = ["Text generation has been an area of research that has gained significant attention in recent years due to the advancement of natural language processing and machine learning techniques. Text generation refers to the automated process of generating written content, such as articles, stories, or even chatbot responses, without the need for human intervention. The potential applications of text generation are vast, ranging from aiding in content creation for businesses to developing chatbots that can simulate human-like conversations.", #write me a thesis introduction on text generation
"Despite the significant progress made in text generation, challenges remain in creating text that is not only grammatically correct but also semantically meaningful and contextually appropriate. Furthermore, ethical considerations surrounding the potential misuse of text generation, such as the creation of fake news and impersonation, have brought attention to the need for responsible use and development of text generation models.",
"This thesis aims to explore the current state of text generation research, including the techniques used, evaluation metrics, and potential applications. The study will also examine the ethical implications of text generation and provide recommendations for responsible use and development of text generation models. By providing insights into the current state and future direction of text generation research, this thesis aims to contribute to the advancement of this field and promote the development of responsible and ethical text generation applications",
"Title: A Systematic Literature Review on Deepfake Detection Techniques\nIntroduction:\nDeepfakes are a type of synthetic media created using artificial intelligence (AI) algorithms that generate realistic images, videos, or audio that manipulate the perception of reality. Deepfake technology has raised serious concerns over its potential use in spreading disinformation, defamation, and propaganda. Consequently, researchers and experts have focused on developing effective deepfake detection techniques to combat these issues. This systematic literature review (SLR) aims to provide an overview of the current state-of-the-art deepfake detection techniques and their performance metrics.", #Write me an SLR on deep fake detection
"Methodology:\nThis SLR followed a systematic approach, including the identification of relevant literature, selection of primary studies, data extraction, and synthesis of results. We used a combination of search terms related to deepfake detection, including \"deepfake,\" \"deep learning,\" \"fake media,\" \"image manipulation,\" and \"video manipulation,\" to search the databases. We included peer-reviewed articles, conference papers, and preprints published between 2017 and 2022. We excluded articles that did not focus on deepfake detection or did not propose any detection techniques."]
text_list = [text[:512] for text in text_list]
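
Note that `text[:512]` keeps the first 512 characters of each text, not the first 512 tokens, which is why the texts in the output end mid-word. If the goal were instead to cap the token sequence at the model's input length, the truncation could be applied to the token ids. The helper below is a hypothetical sketch of that alternative. It assumes the model accepts at most `max_len` tokens including the two special tokens, and the ids used in the example are made up.

```python
def build_truncated_pair(text_ids, cls_id, sep_id, pad_id, max_len=512):
    """Return (input_ids, ref_input_ids), each at most max_len tokens long."""
    body = text_ids[: max_len - 2]  # reserve two slots for the CLS and SEP tokens
    input_ids = [cls_id] + body + [sep_id]
    # the reference keeps the special tokens and pads everything in between
    ref_input_ids = [cls_id] + [pad_id] * len(body) + [sep_id]
    return input_ids, ref_input_ids


# Made-up ids for illustration: 0 = CLS, 2 = SEP, 1 = PAD
ids, ref = build_truncated_pair(list(range(10, 1010)), cls_id=0, sep_id=2, pad_id=1)
print(len(ids), len(ref))
```

In the notebook, `text_ids` would come from `tokenizer.encode(text, add_special_tokens=False)`, as in `construct_input_ref_pair`.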

In [10]:
# Green marks tokens with positive attribution toward the target (class 0 probability); red marks negative attribution
print('\033[1m', 'Visualization For Score', '\033[0m')
for text in text_list[:NUM_OF_TEXT]:
    input_ids, ref_input_ids, sep_id = construct_input_ref_pair(text, ref_token_id, sep_token_id, cls_token_id)
    token_type_ids, ref_token_type_ids = construct_input_ref_token_type_pair(input_ids, sep_id)
    position_ids, ref_position_ids = construct_input_ref_pos_id_pair(input_ids)
    attention_mask = construct_attention_mask(input_ids)

    indices = input_ids[0].detach().tolist()
    all_tokens = tokenizer.convert_ids_to_tokens(indices)

    attributions, delta = lig.attribute(inputs=input_ids,
                                        baselines=ref_input_ids,
                                        return_convergence_delta=True,
                                        internal_batch_size=1)

    score = predict(input_ids)

    attributions_sum = summarize_attributions(attributions)

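    # build one visualization record per text; strip the "Ġ" word-boundary marker from tokens for readability
    all_tokens = [token.replace("Ġ", "") for token in all_tokens]
    probs = torch.softmax(score, dim=1)[0]
    score_vis = viz.VisualizationDataRecord(
                            attributions_sum,        # per-token attributions
                            probs[0],                # probability of class 0 (the attribution target), not of the predicted class
                            torch.argmax(probs),     # predicted class
                            FILE_TYPE,               # true class
                            text,                    # shown in the "Attribution Label" column
                            attributions_sum.sum(),  # total attribution score
                            all_tokens,
                            delta)                   # convergence delta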

    viz.visualize_text([score_vis])

Visualization For Score


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
2.0,2 (0.00),"Text generation has been an area of research that has gained significant attention in recent years due to the advancement of natural language processing and machine learning techniques. Text generation refers to the automated process of generating written content, such as articles, stories, or even chatbot responses, without the need for human intervention. The potential applications of text generation are vast, ranging from aiding in content creation for businesses to developing chatbots that can simulate",1.37,"#s Text generation has been an area of research that has gained significant attention in recent years due to the advancement of natural language processing and machine learning techniques . Text generation refers to the automated process of generating written content , such as articles , stories , or even chat bot responses , without the need for human intervention . The potential applications of text generation are vast , ranging from aiding in content creation for businesses to developing chat bots that can simulate #/s"


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
2.0,1 (0.00),"Despite the significant progress made in text generation, challenges remain in creating text that is not only grammatically correct but also semantically meaningful and contextually appropriate. Furthermore, ethical considerations surrounding the potential misuse of text generation, such as the creation of fake news and impersonation, have brought attention to the need for responsible use and development of text generation models.",-0.04,"#s Despite the significant progress made in text generation , challenges remain in creating text that is not only gram matically correct but also sem antically meaningful and context ually appropriate . Furthermore , ethical considerations surrounding the potential misuse of text generation , such as the creation of fake news and imperson ation , have brought attention to the need for responsible use and development of text generation models . #/s"


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
2.0,2 (0.00),"This thesis aims to explore the current state of text generation research, including the techniques used, evaluation metrics, and potential applications. The study will also examine the ethical implications of text generation and provide recommendations for responsible use and development of text generation models. By providing insights into the current state and future direction of text generation research, this thesis aims to contribute to the advancement of this field and promote the development of respo",-1.65,"#s This thesis aims to explore the current state of text generation research , including the techniques used , evaluation metrics , and potential applications . The study will also examine the ethical implications of text generation and provide recommendations for responsible use and development of text generation models . By providing insights into the current state and future direction of text generation research , this thesis aims to contribute to the advancement of this field and promote the development of resp o #/s"


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
2.0,2 (0.00),"Title: A Systematic Literature Review on Deepfake Detection Techniques Introduction: Deepfakes are a type of synthetic media created using artificial intelligence (AI) algorithms that generate realistic images, videos, or audio that manipulate the perception of reality. Deepfake technology has raised serious concerns over its potential use in spreading disinformation, defamation, and propaganda. Consequently, researchers and experts have focused on developing effective deepfake detection techniques to comba",-0.23,"#s Title : A System atic Literature Review on Deep fake Detection Techniques Ċ Introduction : Ċ Deep f akes are a type of synthetic media created using artificial intelligence ( AI ) algorithms that generate realistic images , videos , or audio that manipulate the perception of reality . Deep fake technology has raised serious concerns over its potential use in spreading disinformation , defamation , and propaganda . Consequently , researchers and experts have focused on developing effective deep fake detection techniques to comb a #/s"


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
2.0,2 (0.00),"Methodology: This SLR followed a systematic approach, including the identification of relevant literature, selection of primary studies, data extraction, and synthesis of results. We used a combination of search terms related to deepfake detection, including ""deepfake,"" ""deep learning,"" ""fake media,"" ""image manipulation,"" and ""video manipulation,"" to search the databases. We included peer-reviewed articles, conference papers, and preprints published between 2017 and 2022. We excluded articles that did not f",1.49,"#s Method ology : Ċ This SL R followed a systematic approach , including the identification of relevant literature , selection of primary studies , data extraction , and synthesis of results . We used a combination of search terms related to deep fake detection , including "" deep fake ,"" "" deep learning ,"" "" fake media ,"" "" image manipulation ,"" and "" video manipulation ,"" to search the databases . We included peer - reviewed articles , conference papers , and pre prints published between 2017 and 2022 . We excluded articles that did not f #/s"
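
The color coding of `viz.visualize_text` is lost when the output is exported as text, so the rows above no longer show which tokens carried the most attribution. One way to recover that information is to rank tokens by the magnitude of their attribution score. The helper below is a hypothetical sketch (`top_tokens` is not part of Captum), and the tokens and scores in the example are made up.

```python
def top_tokens(tokens, scores, k=5):
    """Return the k (token, score) pairs with the largest absolute attribution."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    ranked = sorted(zip(tokens, scores), key=lambda pair: abs(pair[1]), reverse=True)
    return ranked[:k]


# Made-up tokens and scores, for illustration only
example_tokens = ["<s>", "Text", "generation", "has", "</s>"]
example_scores = [0.0, 0.41, -0.73, 0.05, 0.0]
top = top_tokens(example_tokens, example_scores, k=2)
print(top)
```

Inside the loop of cell 10 this could be called as `top_tokens(all_tokens, attributions_sum.tolist())`, since `attributions_sum` holds one score per token.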
