# Integrated Gradient
https://towardsdatascience.com/interpreting-the-prediction-of-bert-model-for-text-classification-5ab09f8ef074

In [1]:
import torch
import torch.nn as nn
import pandas as pd
from transformers import AutoTokenizer,BertModel
from transformers import EarlyStoppingCallback

In [2]:
model_version = None
tokenizer = AutoTokenizer.from_pretrained(model_version)
bert = BertModel.from_pretrained(model_version)
df = pd.read_csv('../data/BERT/articles_2015_2019_nd_sample.csv')

Some weights of the model checkpoint at CVs/full/bert-base-uncased/fold_full-bert-base-uncased/ were not used when initializing BertModel: ['classifier.bias', 'classifier.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [3]:
# Class of model architecture
class BertClassifier(nn.Module):

    def __init__(self):

        super(BertClassifier, self).__init__()

        self.bert = bert
        self.linear = nn.Linear(768, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, input_id, mask = None):

        _, pooled_output = self.bert(input_ids= input_id, attention_mask=mask,return_dict=False)
        # dropout_output = self.dropout(pooled_output)
        linear_output = self.linear(pooled_output)
        final_layer = self.sigmoid(linear_output)

        return final_layer

### Captum

In [4]:
from captum.attr import LayerIntegratedGradients
from captum.attr import visualization as viz

model = BertClassifier()

# Define model output
def model_output(inputs):
    return model(inputs)[0]

# Define model input
model_input = model.bert.embeddings
lig = LayerIntegratedGradients(model_output, model_input)

In [5]:
def construct_input_and_baseline(text):

    max_length = 510
    baseline_token_id = tokenizer.pad_token_id 
    sep_token_id = tokenizer.sep_token_id 
    cls_token_id = tokenizer.cls_token_id 

    text_ids = tokenizer.encode(text, max_length=max_length, truncation=True, add_special_tokens=False)
   
    input_ids = [cls_token_id] + text_ids + [sep_token_id]
    token_list = tokenizer.convert_ids_to_tokens(input_ids)

    baseline_input_ids = [cls_token_id] + [baseline_token_id] * len(text_ids) + [sep_token_id]
    return torch.tensor([input_ids], device='cpu'), torch.tensor([baseline_input_ids], device='cpu'), token_list

def summarize_attributions(attributions):

    attributions = attributions.sum(dim=-1).squeeze(0)
    attributions = attributions / torch.norm(attributions)
    
    return attributions

def interpret_text(text, true_class):
    input_ids, baseline_input_ids, all_tokens = construct_input_and_baseline(text)
    attributions, delta = lig.attribute(
                                        inputs= input_ids,
                                        baselines= baseline_input_ids,
                                        return_convergence_delta=True,
                                        internal_batch_size=1,
                                        n_steps=50
                                    )
    
    attributions_sum = summarize_attributions(attributions)
    score_vis = viz.VisualizationDataRecord(
                                        word_attributions = attributions_sum,
                                        pred_prob = torch.max(model(input_ids)[0]),
                                        pred_class = torch.argmax(model(input_ids)[0]).numpy(),
                                        true_class = true_class,
                                        attr_class = text,
                                        attr_score = attributions_sum.sum(),       
                                        raw_input_ids = all_tokens,
                                        convergence_score = delta)

    viz.visualize_text([score_vis])
    
for text, sentiment in df.values[:10]:
    true_class = 0
    if sentiment == 'positive':
        true_class = 1
    interpret_text(text, true_class)

True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
1.0,0 (0.44),"As prime minister Narendra Modiponders his New Year’s resolutions, one stands out: redoubling India’s efforts to rival China by turning into a global manufacturing hub. Few goals are as important to his country’s future. But few are likely to be as difficult to achieve. @ India’s manufacturing frailty is well documented. At just 15 per cent of gross domestic product, the sector is less than half the size of China’s. No poor Asian country has risen to middle-income status with such feeble figures — hence the urgency behind Mr Modi’s “Make in India” drive, launched amid much hoopla in September. @ India excels at some high-tech manufacturing. The likes of Ford and Hyundai run world-class local factories, packed with whirring robots. Many global carmakers see India as a crucial export base. But lower skilled, labour-intensive industries such as clothes manufacturing and electronics do less well, causing alarm in a nation that must create 12m new jobs a year until 2030 to meet a looming demographic bulge.",-0.05,"[CLS] As prime minister Na ##ren ##dra Mo ##di ##po ##nders his New Year ’ s resolutions , one stands out : red ##ou ##bling India ’ s efforts to rival China by turning into a global manufacturing hub . Few goals are as important to his country ’ s future . But few are likely to be as difficult to achieve . @ India ’ s manufacturing f ##rail ##ty is well documented . At just 15 per cent of gross domestic product , the sector is less than half the size of China ’ s . No poor Asian country has risen to middle - income status with such fee ##ble figures — hence the urgency behind Mr Mo ##di ’ s “ Make in India ” drive , launched amid much ho ##op ##la in September . @ India ex ##cel ##s at some high - tech manufacturing . The likes of Ford and H ##yun ##dai run world - class local factories , packed with w ##hir ##ring robots . Many global car ##makers see India as a crucial export base . But lower skilled , labour - intensive industries such as clothes manufacturing and electronics do less well , causing alarm in a nation that must create 12 ##m new jobs a year until 203 ##0 to meet a looming demographic b ##ulge . [SEP]"
,,,,


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
0.0,0 (0.61),"Oil prices are on course for their largest annual slide since 2008, capping another dire year for commodities, as crude fell again yesterday to hover at close to half its level of six months ago. @ Brent crude’s 49 per cent plummet since June — alongside a near halving of iron ore prices and sharp drops in coal and copper — has helped drag the Bloomberg Commodity index down 15.6 per cent in 2014 to a five-year low. @ While the international benchmark’s price plunge could prove a major boon for the global economy, it has thrown big oil exporters such as Russia and Venezuela into disarray and forced oil companies to re-examine their investment plans and look for ways to reduce costs.",1.93,"[CLS] Oil prices are on course for their largest annual slide since 2008 , cap ##ping another dire year for commodities , as crude fell again yesterday to ho ##ver at close to half its level of six months ago . @ Brent crude ’ s 49 per cent p ##lum ##met since June — alongside a near ha ##lving of iron ore prices and sharp drops in coal and copper — has helped drag the Bloomberg Co ##mm ##od ##ity index down 15 . 6 per cent in 2014 to a five - year low . @ While the international bench ##mark ’ s price p ##lung ##e could prove a major b ##oon for the global economy , it has thrown big oil export ##ers such as Russia and Venezuela into di ##sar ##ray and forced oil companies to re - examine their investment plans and look for ways to reduce costs . [SEP]"
,,,,


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
0.0,0 (0.61),"One of China’s richest cities has issued restrictions on new car sales, highlighting another constraint on growth for an industry already feeling the effects of the country’s slowing economy. @ Shenzhen, in southern Guangdong province, was the last of China’s four “tier-one” cities — large urban clusters with high per capita gross domestic product figures — to limit the issuance of new licence plates, following moves by Beijing, Shanghai and Guangzhou. @ Under the rules, announced on Monday and effective immediately, only 100,000 licence plates will be issued annually in Shenzhen via auctions and lotteries. About 3.1m private cars are registered in the city of 15m people.",1.78,"[CLS] One of China ’ s richest cities has issued restrictions on new car sales , highlighting another con ##stra ##int on growth for an industry already feeling the effects of the country ’ s slowing economy . @ Shen ##zhen , in southern Guangdong province , was the last of China ’ s four “ tier - one ” cities — large urban clusters with high per capita gross domestic product figures — to limit the is ##su ##ance of new licence plates , following moves by Beijing , Shanghai and Guangzhou . @ Under the rules , announced on Monday and effective immediately , only 100 , 000 licence plates will be issued annually in Shen ##zhen via auction ##s and lot ##ter ##ies . About 3 . 1 ##m private cars are registered in the city of 15 ##m people . [SEP]"
,,,,


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
1.0,0 (0.61),"The euro crisis is back. An election in Greece next month and the probable victory of Syriza, a far-left party, will frighten politicians and investors. Once again they will be engaged in a grim discussion of a connected series of possible horrors: debt-default, bank runs, bailouts, social unrest and the possible ejection of Greece from the eurozone. @ It is somehow fitting that this crisis should break out at the very end of a year in which markets were lulled into believing that the euro crisis was essentially over. The cost of borrowing of debtor nations in Europe had fallen sharply, reflecting the widespread belief that the European Central Bank’s famous pledge to do “whatever it takes” to save the single currency has removed the risk of euro collapse. @ That idea was always naive, as events in Greece are now illustrating. The weak link in the theory was European politics – and, specifically, the risk that voters would revolt against economic austerity and cast their ballots for “anti-system” parties that reject the European consensus on how to keep the single currency together.",1.31,"[CLS] The euro crisis is back . An election in Greece next month and the probable victory of S ##yr ##iza , a far - left party , will f ##right ##en politicians and investors . Once again they will be engaged in a grim discussion of a connected series of possible horror ##s : debt - default , bank runs , bail ##outs , social unrest and the possible e ##jection of Greece from the euro ##zone . @ It is somehow fitting that this crisis should break out at the very end of a year in which markets were l ##ull ##ed into believing that the euro crisis was essentially over . The cost of borrow ##ing of debt ##or nations in Europe had fallen sharply , reflecting the widespread belief that the European Central Bank ’ s famous pledge to do “ whatever it takes ” to save the single currency has removed the risk of euro collapse . @ That idea was always naive , as events in Greece are now ill ##ust ##rating . The weak link in the theory was European politics – and , specifically , the risk that voters would revolt against economic au ##ster ##ity and cast their ballots for “ anti - system ” parties that reject the European consensus on how to keep the single currency together . [SEP]"
,,,,


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
0.0,0 (0.62),"At the start of his premiership, Li Keqiang drew on an ancient Chinese proverb to explain the task ahead. A Chinese warrior, having been bitten by a snake, cuts off his hand in order to save his body. China’s reform process will be “very painful and even feel like cutting one’s wrist”, Li warned. @ Pain has certainly been part of 2014 for the Chinese economy. To a large extent, it has been self-inflicted. Measures to deleverage the shadow financing system, for example, led to a sharp slowdown in credit, which in turn contributed to a drop in home sales. This has resulted in slower growth in industrial output, as well as weaker consumer purchases of cars and white goods. Meanwhile, anti-corruption campaigns have hit spending on luxury goods and services and led to delays in the approval of new projects by local officials. @ If 2014 was a year of painful surgery for the Chinese economy, will 2015 be a year of rejuvenation? On balance, our analysis suggests that this is unlikely.",2.31,"[CLS] At the start of his premiership , Li Ke ##qi ##ang drew on an ancient Chinese prove ##rb to explain the task ahead . A Chinese warrior , having been bitten by a snake , cuts off his hand in order to save his body . China ’ s reform process will be “ very painful and even feel like cutting one ’ s wrist ” , Li warned . @ Pain has certainly been part of 2014 for the Chinese economy . To a large extent , it has been self - inflicted . Me ##asures to del ##ever ##age the shadow financing system , for example , led to a sharp slow ##down in credit , which in turn contributed to a drop in home sales . This has resulted in slower growth in industrial output , as well as weaker consumer purchases of cars and white goods . Meanwhile , anti - corruption campaigns have hit spending on luxury goods and services and led to delays in the approval of new projects by local officials . @ If 2014 was a year of painful surgery for the Chinese economy , will 2015 be a year of re ##ju ##ven ##ation ? On balance , our analysis suggests that this is unlikely . [SEP]"
,,,,


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
0.0,0 (0.61),"Not That Healthy : Poor governance, marked by rising income inequality, corruption and lack of accountability, additionally challenges the sustainability of Cuba's universal health-care scheme. Havana ranks 75th in the annually-published Global Peace Index, a metric that attempts to quantify effective governance in a variety of sectors, putting it on par with troubled countries like Djibouti, Nepal and Malawi.",0.99,"[CLS] Not That Health ##y : Poor governance , marked by rising income inequality , corruption and lack of account ##ability , additionally challenges the sustainability of Cuba ' s universal health - care scheme . Havana ranks 75 ##th in the annually - published Global Peace Index , a metric that attempts to q ##uant ##ify effective governance in a variety of sectors , putting it on par with troubled countries like D ##ji ##bo ##uti , Nepal and Malawi . [SEP]"
,,,,


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
1.0,0 (0.27),A New Chapter in U.S.-Cuba Relations : It is time to reject the failed policies of the past and chart a new course that will better serve America's interests and improve the lives of Cubans and their families. And President Obama's decision to normalize relations with Cuba is the right approach.,-4.15,[CLS] A New Chapter in U . S . - Cuba Relations : It is time to reject the failed policies of the past and chart a new course that will better serve America ' s interests and improve the lives of Cuban ##s and their families . And President Obama ' s decision to normal ##ize relations with Cuba is the right approach . [SEP]
,,,,


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
1.0,0 (0.25),"President Right to Normalize Relations With Cuba: Best Hope for Ending Repressive Regime : Normalizing both economic and diplomatic relations with Havana should be seen not as a victory for the Castro government, but for the people of Cuba. Liberty will come to that land. The only question is when. Expanding relations should help speed the process.",-4.03,"[CLS] President Right to Normal ##ize Relations With Cuba : Best Hope for End ##ing Rep ##ress ##ive Reg ##ime : Normal ##izing both economic and diplomatic relations with Havana should be seen not as a victory for the Castro government , but for the people of Cuba . Liberty will come to that land . The only question is when . Ex ##pan ##ding relations should help speed the process . [SEP]"
,,,,


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
0.0,0 (0.59),"Russian police briefly detained opposition leader Alexei Navalny and arrested dozens of his supporters on Tuesday in a crackdown that highlights jitters in the Kremlin that the slumping economy could undermine the rule of President Vladimir Putin. @ The arrests came hours after a Moscow court gave Mr Navalny a suspended prison sentence of three years and six months but sentenced his brother Oleg to the same time in prison. The brothers were found guilty of defrauding Yves Rocher, the French cosmetics firm, charges which their supporters and civil rights activists say are politically motivated. @ The brothers’ lawyers said they would appeal against the verdict.",1.48,"[CLS] Russian police briefly detained opposition leader Alexei Naval ##ny and arrested dozens of his supporters on Tuesday in a crack ##down that highlights ji ##tters in the K ##rem ##lin that the s ##lump ##ing economy could under ##mine the rule of President Vladimir Putin . @ The arrests came hours after a Moscow court gave Mr Naval ##ny a suspended prison sentence of three years and six months but sentenced his brother Ole ##g to the same time in prison . The brothers were found guilty of def ##ra ##uding Yves Roche ##r , the French co ##sm ##etics firm , charges which their supporters and civil rights activists say are politically motivated . @ The brothers ’ lawyers said they would appeal against the verdict . [SEP]"
,,,,


True Label,Predicted Label,Attribution Label,Attribution Score,Word Importance
1.0,0 (0.26),"6 Things That Barack Obama Did for P-20 Education in 2014 : Given the nature and sheer number of challenges, his administration has done a great deal to foster positive change and progress. In a bid to build upon his already stellar record on education Obama implemented a number of education reform initiatives in 2014.",-4.65,"[CLS] 6 Things That Barack Obama Did for P - 20 Education in 2014 : Given the nature and sheer number of challenges , his administration has done a great deal to foster positive change and progress . In a bid to build upon his already stellar record on education Obama implemented a number of education reform initiatives in 2014 . [SEP]"
,,,,
