Please remember to run **pip3 install textattack[tensorflow]** in your notebook environment before running the code below.

# Explain Attacking BERT models using Captum

Captum is a PyTorch library for explaining neural networks.
Here we show a minimal example that uses Captum to explain BERT models from TextAttack.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/QData/TextAttack/blob/master/docs/2notebook/Example_5_Explain_BERT.ipynb)

[![View Source on GitHub](https://img.shields.io/badge/github-view%20source-black.svg)](https://github.com/QData/TextAttack/blob/master/docs/2notebook/Example_5_Explain_BERT.ipynb)

In [1]:
import torch
import pandas as pd
from copy import deepcopy

# Only normalizeTweet is used below, so import it explicitly
from BERTweet.TweetNormalizer import normalizeTweet

In [2]:
from textattack.datasets import Dataset
from textattack.models.wrappers import HuggingFaceModelWrapper
from textattack.models.wrappers import ModelWrapper
from transformers import AutoModelForSequenceClassification, AutoTokenizer

In [3]:
from captum.attr import IntegratedGradients, LayerConductance, LayerIntegratedGradients, LayerDeepLiftShap, InternalInfluence, LayerGradientXActivation
from captum.attr import visualization as viz

In [4]:
if torch.cuda.is_available():
    device = torch.device("cuda:0")
else:
    device = torch.device("cpu")

print(device)

cuda:0


In [14]:
df = pd.read_csv("test_generalization.csv")
# normalizeTweet works on a single string, so apply it row by row
df["tweet"] = df["tweet"].apply(normalizeTweet)

In [6]:
dataset = Dataset(df[["tweet", "label"]].values)
original_model = AutoModelForSequenceClassification.from_pretrained("results/model_28/")
original_tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-large", use_fast=False)
model = HuggingFaceModelWrapper(original_model,original_tokenizer)

In [8]:
def get_text(tokenizer, input_ids, token_type_ids, attention_mask):
    # Decode each row of token ids back into text, dropping special tokens.
    # token_type_ids and attention_mask are not needed for decoding.
    list_of_text = []
    for i in range(input_ids.size(0)):
        ids = input_ids[i].cpu().numpy()
        list_of_text.append(tokenizer.decode(ids, skip_special_tokens=True))
    return list_of_text

sel = 2
batch_encoded = model.tokenizer([dataset[i][0]['text'] for i in range(sel)], padding=True, return_tensors="pt")
batch_encoded.to(device)

labels = [dataset[i][1] for i in range(sel)]

clone = deepcopy(model)
clone.model.to(device)

def calculate(input_ids, attention_mask):
    # Forward function for Captum: return the classification logits
    return clone.model(input_ids, attention_mask=attention_mask)[0]

# x = calculate(**batch_encoded)

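# Attribute with respect to the embedding layer (this checkpoint is RoBERTa-based)
lig = LayerIntegratedGradients(calculate, clone.model.roberta.embeddings)
# Alternative attribution methods:
# lig = InternalInfluence(calculate, clone.model.roberta.embeddings)
# lig = LayerGradientXActivation(calculate, clone.model.roberta.embeddings)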

# Baseline input: all-zero token ids with the same shape, dtype and device as the inputs
bsl = torch.zeros_like(batch_encoded['input_ids'])
labels = torch.tensor(labels).to(device)

attributions, delta = lig.attribute(
    inputs=batch_encoded['input_ids'],
    baselines=bsl,
    additional_forward_args=(batch_encoded['attention_mask'],),
    n_steps=10,
    target=labels,
    return_convergence_delta=True,
)
# Sum over the embedding dimension to get one score per token, then normalize
atts = attributions.sum(dim=-1).squeeze(0)
atts = atts / torch.norm(atts)
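
To make the roles of `n_steps` and the convergence delta in the cell above more concrete, here is a minimal pure-Python sketch of integrated gradients on a toy function. The toy model `forward`, its hand-written `gradient`, and the midpoint integration rule are all assumptions chosen for illustration; Captum works on real PyTorch modules and may use a different integration rule by default.

```python
# Toy "model" (hypothetical, for illustration only): F(x) = x0^2 + 3 * x1
def forward(x):
    return x[0] ** 2 + 3.0 * x[1]


# Analytic gradient of the toy model with respect to each input
def gradient(x):
    return [2.0 * x[0], 3.0]


def integrated_gradients(x, baseline, n_steps=10):
    """Approximate integrated gradients with a midpoint Riemann sum."""
    grad_sums = [0.0] * len(x)
    for k in range(n_steps):
        # Point on the straight path from the baseline to the input
        alpha = (k + 0.5) / n_steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(gradient(point)):
            grad_sums[i] += g
    # Average gradient along the path, scaled by (input - baseline)
    return [(xi - b) * s / n_steps for xi, b, s in zip(x, baseline, grad_sums)]


x = [2.0, 1.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(x, baseline, n_steps=10)

# Completeness check: attributions should sum to F(x) - F(baseline).
# The gap between the two plays the role of a convergence delta.
delta = sum(attributions) - (forward(x) - forward(baseline))
print(attributions, delta)
```

A delta close to zero suggests the path integral is well approximated; a large delta is usually a sign that more steps are needed.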

In [10]:
from textattack.attack_recipes import PWWSRen2019
attack = PWWSRen2019.build(model)

[nltk_data] Downloading package omw-1.4 to /home/alexlu/nltk_data...
textattack: Unknown if model of class <class 'transformers.models.roberta.modeling_roberta.RobertaForSequenceClassification'> compatible with goal function <class 'textattack.goal_functions.classification.untargeted_classification.UntargetedClassification'>.


In [11]:
from textattack import Attacker, AttackArgs

attack_args = AttackArgs(
    num_examples=-1,
)


attacker = Attacker(attack, dataset, attack_args)
attacker.attack_dataset()

Attack(
  (search_method): GreedyWordSwapWIR(
    (wir_method):  weighted-saliency
  )
  (goal_function):  UntargetedClassification
  (transformation):  WordSwapWordNet
  (constraints): 
    (0): RepeatModification
    (1): StopwordModification
  (is_black_box):  True
) 



[Succeeded / Failed / Skipped / Total] 0 / 1 / 0 / 1:  10%|█         | 1/10 [00:11<01:45, 11.76s/it]

--------------------------------------------- Result 1 ---------------------------------------------

@USER My step-dad had Alzheimer 's for 14 year . Biden talks and acts just like him . They are very similar in character and he got very mean and short tempered when he recognized his inability to speak clearly , say the words he was trying to remember and when he was embarrassed .




[Succeeded / Failed / Skipped / Total] 0 / 2 / 0 / 2:  20%|██        | 2/10 [00:22<01:29, 11.24s/it]

--------------------------------------------- Result 2 ---------------------------------------------

What others are saying about the course : " " I 'm very glad I found this program . It has made my attitude toward care giving more positive and hopeful . It has created a more pleasant environment for me and my husband . " " #dementia #DementiaCare thedawnmethod.com/training/ HTTPURL




[Succeeded / Failed / Skipped / Total] 0 / 3 / 0 / 3:  30%|███       | 3/10 [00:26<01:01,  8.75s/it]

--------------------------------------------- Result 3 ---------------------------------------------

@USER @USER Dementia and senility is the equivalent of returning to a child like state so his comment is in keeping with his acuity .




[Succeeded / Failed / Skipped / Total] 0 / 3 / 0 / 3:  40%|████      | 4/10 [00:32<00:48,  8.13s/it]

--------------------------------------------- Result 4 ---------------------------------------------


[Succeeded / Failed / Skipped / Total] 1 / 3 / 0 / 4:  40%|████      | 4/10 [00:32<00:49,  8.21s/it]


@USER I live in the US where there are SO many Spanish speakers . I initially chose Japanese , but quickly changed course because : 1 . Spanish makes most sense career wise 2 . My Main goal was to just be bilingual ( I do n't [[want]] alzheimer 's ) 3 . I want my daughter to speak Spanish

@USER I live in the US where there are SO many Spanish speakers . I initially chose Japanese , but quickly changed course because : 1 . Spanish makes most sense career wise 2 . My Main goal was to just be bilingual ( I do n't [[privation]] alzheimer 's ) 3 . I want my daughter to speak Spanish




[Succeeded / Failed / Skipped / Total] 1 / 4 / 0 / 5:  50%|█████     | 5/10 [00:37<00:37,  7.49s/it]

--------------------------------------------- Result 5 ---------------------------------------------

@USER @USER @USER Please let 's be serious the poor man has inherited his father 's gen " Alzheimer 's " He is next walking in his underwear and talking to the man in the mirror




[Succeeded / Failed / Skipped / Total] 1 / 5 / 0 / 6:  60%|██████    | 6/10 [00:43<00:29,  7.28s/it]

--------------------------------------------- Result 6 ---------------------------------------------

@USER Glenn Campbell . I met he and his daughter at an awards banquet in Minnesota . It was the last show he did with his daughter because of his Alzheimer 's . They sang 5 beautiful songs of his . Tears everywhere .




[Succeeded / Failed / Skipped / Total] 2 / 5 / 0 / 7:  70%|███████   | 7/10 [00:52<00:22,  7.50s/it]

--------------------------------------------- Result 7 ---------------------------------------------

@USER @USER @USER 1 / My mum has late [[stage]] Vascular-Dem . That ep broke me but also it “ did ” [[dementia]] without the saccharine tranquillisers usually served with it in media / fiction . Bojack is a sublime show cos of shit like this . It gives you takes on * existence * as it * is * . No three act resolution

@USER @USER @USER 1 / My mum has late [[represent]] Vascular-Dem . That ep broke me but also it “ did ” [[dementedness]] without the saccharine tranquillisers usually served with it in media / fiction . Bojack is a sublime show cos of shit like this . It gives you takes on * existence * as it * is * . No three act resolution




[Succeeded / Failed / Skipped / Total] 3 / 5 / 0 / 8:  80%|████████  | 8/10 [00:57<00:14,  7.13s/it]

--------------------------------------------- Result 8 ---------------------------------------------

[[Dad]] I 'm fighting for the support we should have received all along . No [[one]] should face dementia alone but sadly we did . I wo n't [[let]] this happen again . #alzheimers #dementia #fairdementia

[[dada]] I 'm fighting for the support we should have received all along . No [[unrivaled]] should face dementia alone but sadly we did . I wo n't [[Lashkar-e-Toiba]] this happen again . #alzheimers #dementia #fairdementia




[Succeeded / Failed / Skipped / Total] 3 / 6 / 0 / 9:  90%|█████████ | 9/10 [01:08<00:07,  7.57s/it]

--------------------------------------------- Result 9 ---------------------------------------------

@USER @USER Thats y every one in this earth hates your kind coz you fucker are alz trying to impose yourselves onto others . By the way did your dad tell tou that you all too were Hindus ' who converted in a fear of death . You guys are funny ... :pig_face: Have this it will help you get well soon




[Succeeded / Failed / Skipped / Total] 4 / 6 / 0 / 10: 100%|██████████| 10/10 [01:16<00:00,  7.66s/it]

--------------------------------------------- Result 10 ---------------------------------------------

@USER Sorry to hear that it 's getting tough for you Janice . My [[dad]] was [[blessed]] in that sense as he did n't [[get]] to this point of his #Dementia . It was only after he had been in hospital after they had been on a lockdown that he deteriorated , could n't swallow , walk etc etc . Take care xx

@USER Sorry to hear that it 's getting tough for you Janice . My [[pa]] was [[blasted]] in that sense as he did n't [[suffer]] to this point of his #Dementia . It was only after he had been in hospital after they had been on a lockdown that he deteriorated , could n't swallow , walk etc etc . Take care xx



+-------------------------------+--------+
| Attack Results                |        |
+-------------------------------+--------+
| Number of successful attacks: | 4      |
| Number of failed attacks:     | 6      |
| Number of skipped attacks:    | 0      |
| Original accuracy:   




[<textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7fc6d45b3d60>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7fc7c85b85b0>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7fc6ebac0910>,
 <textattack.attack_results.successful_attack_result.SuccessfulAttackResult at 0x7fc6dce73f10>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7fc6ebbec220>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7fc6ebd12be0>,
 <textattack.attack_results.successful_attack_result.SuccessfulAttackResult at 0x7fc6ebc7d970>,
 <textattack.attack_results.successful_attack_result.SuccessfulAttackResult at 0x7fc6ebc7d8b0>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7fc6ebacd220>,
 <textattack.attack_results.successful_attack_result.SuccessfulAttackResult at 0x7fc6eb8796d0>]
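
The list returned by `attack_dataset()` can be tallied by result type. The sketch below uses empty stand-in classes named after the TextAttack result types printed above, so that it runs without TextAttack installed; the `summarize` helper and the success-rate formula (successes divided by non-skipped examples) are assumptions for illustration rather than TextAttack's own reporting code.

```python
from collections import Counter


# Hypothetical stand-ins for the TextAttack result classes shown above
class SuccessfulAttackResult:
    pass


class FailedAttackResult:
    pass


class SkippedAttackResult:
    pass


def summarize(results):
    """Count results by class name and compute a success rate in percent."""
    counts = Counter(type(r).__name__ for r in results)
    succeeded = counts["SuccessfulAttackResult"]
    failed = counts["FailedAttackResult"]
    skipped = counts["SkippedAttackResult"]
    attempted = succeeded + failed
    # Assumed definition: successes over examples that were actually attacked
    rate = 100.0 * succeeded / attempted if attempted else 0.0
    return {
        "succeeded": succeeded,
        "failed": failed,
        "skipped": skipped,
        "success_rate_pct": rate,
    }


# Same order as the list printed above
F, S = FailedAttackResult, SuccessfulAttackResult
results = [F(), F(), F(), S(), F(), F(), S(), S(), F(), S()]
summary = summarize(results)
print(summary)
```

With the real library, the same tally could be done on the value returned by `attacker.attack_dataset()`, replacing the stand-in classes with the ones from `textattack.attack_results`.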