# Random approach

This notebook uses the random approach to generate templates, focusing on positive and negative templates. One possible application is testing the *Vocabulary* linguistic capability with an **MFT** (Minimum Functionality Test).

The steps of this approach are:

1. Split the instances into sentences
2. Randomly select a sample of *K* sentences
3. Rank the words of each sentence
4. Predict the label of each sentence using the *Oracle*
5. Replace the relevant words with masks
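
As a rough illustration of steps 1, 2 and 5, the sketch below splits texts into sentences, samples *K* of them, and masks a given set of relevant words. The helpers `split_sentences` and `mask_words` are hypothetical stand-ins written for this example (a naive regex splitter and whitespace tokenization); they are not the functions used by `PosNegTemplateGeneratorRandom`, and the set of relevant words is supplied by hand rather than ranked.

```python
import random
import re

def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    parts = re.split(r'(?<=[.!?])\s+', text.strip())
    return [p for p in parts if p]

def mask_words(sentence, relevant_words, n_masks=2):
    # Replace up to n_masks tokens found in relevant_words with "{mask}".
    masked, used = [], 0
    for token in sentence.split():
        if used < n_masks and token in relevant_words:
            masked.append("{mask}")
            used += 1
        else:
            masked.append(token)
    return " ".join(masked)

instances = [
    "a moving and not infrequently breathtaking film .",
    "there's a great deal of corny dialogue and preposterous moments . and yet , it still works .",
]

# Step 1: split every instance into sentences
sentences = [s for text in instances for s in split_sentences(text)]

# Step 2: sample K sentences with a seeded generator for reproducibility
rng = random.Random(220)
sampled = rng.sample(sentences, k=2)

# Step 5: mask the words considered relevant (hand-picked here)
masked = mask_words(sentences[0], {"moving", "breathtaking"})
print(masked)
```

With hand-picked relevant words, the first sentence becomes `a {mask} and not infrequently {mask} film .`, which has the same shape as the `masked_text` column shown later in this notebook.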

In [1]:
%config Completer.use_jedi = False
import sys
import random

sys.path.append('../')
random.seed(220)

## Loading the dataset, the target model, and the auxiliary models

In [2]:
import pandas as pd
from datasets import load_dataset

pd.set_option('display.max_colwidth', None)

dataset = load_dataset("rotten_tomatoes")
dataset.set_format("pandas")
df = dataset["test"].shuffle(seed=42)[:100]
df

Unnamed: 0,text,label
0,"unpretentious , charming , quirky , original",1
1,"a film really has to be exceptional to justify a three hour running time , and this isn't .",0
2,working from a surprisingly sensitive script co-written by gianni romoli . . . ozpetek avoids most of the pitfalls you'd expect in such a potentially sudsy set-up .,1
3,"it may not be particularly innovative , but the film's crisp , unaffected style and air of gentle longing make it unexpectedly rewarding .",1
4,"such a premise is ripe for all manner of lunacy , but kaufman and gondry rarely seem sure of where it should go .",0
...,...,...
95,"ice age is the first computer-generated feature cartoon to feel like other movies , and that makes for some glacial pacing early on .",0
96,there's no denying that burns is a filmmaker with a bright future ahead of him .,1
97,it collapses when mr . taylor tries to shift the tone to a thriller's rush .,0
98,"there's a great deal of corny dialogue and preposterous moments . and yet , it still works .",1


In [3]:
import re
import numpy as np
from torch.nn.functional import softmax
from transformers import AutoTokenizer, AutoModelForSequenceClassification

def pre_proccess(text):
    # Lowercase, then replace quotes, digits and most ASCII punctuation with spaces
    text = text.lower()
    text = re.sub(r'["\',!-.:-@0-9/]', ' ', text)
    return text

# Wrapper to adapt output format
class SentimentAnalisysModelWrapper:
    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer
        
    def __predict(self, text_input):
        text_preprocessed = pre_proccess(text_input)
        tokenized = self.tokenizer(text_preprocessed, padding=True, truncation=True, max_length=512, 
                                    add_special_tokens = True, return_tensors="pt")
        
        tensor_logits = self.model(**tokenized)
        # Pass dim explicitly: the implicit-dim form of softmax is deprecated and emits a warning
        prob = softmax(tensor_logits[0], dim=-1).detach().numpy()
        pred = np.argmax(prob)
        
        return pred, prob
    
    def predict_label(self, text_inputs):
        return self.predict(text_inputs)[0]
        
    def predict_proba(self, text_inputs):
        return self.predict(text_inputs)[1]
        
    def predict(self, text_inputs):
        if isinstance(text_inputs, str):
            text_inputs = [text_inputs]
        
        preds = []
        probs = []

        for text_input in text_inputs:
            pred, prob = self.__predict(text_input)
            preds.append(pred)
            probs.append(prob[0])

        return np.array(preds), np.array(probs) # ([0, 1], [[0.99, 0.01], [0.03, 0.97]])

# Auxiliary function to load and wrap a model from Hugging Face
def load_model(model_name):
    print(f'Loading model {model_name}...')
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    
    return SentimentAnalisysModelWrapper(model, tokenizer)

# Hugging Face hosted model names 
rotten_tomatoes_models = {
    'bert': 'textattack/bert-base-uncased-rotten-tomatoes', 
    'albert': 'textattack/albert-base-v2-rotten-tomatoes', 
    'distilbert': 'textattack/distilbert-base-uncased-rotten-tomatoes', 
    'roberta': 'textattack/roberta-base-rotten-tomatoes', 
    'xlnet': 'textattack/xlnet-base-cased-rotten-tomatoes', 
    
}

In [4]:
m1 = load_model(rotten_tomatoes_models['albert'])
m2 = load_model(rotten_tomatoes_models['distilbert'])
m3 = load_model(rotten_tomatoes_models['roberta'])
m4 = load_model(rotten_tomatoes_models['xlnet'])

# Models to be used as oracle
models = [m1, m2, m3, m4]
# Target model
model = load_model(rotten_tomatoes_models['bert'])

Loading model textattack/albert-base-v2-rotten-tomatoes...
Loading model textattack/distilbert-base-uncased-rotten-tomatoes...
Loading model textattack/roberta-base-rotten-tomatoes...


Some weights of the model checkpoint at textattack/roberta-base-rotten-tomatoes were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Loading model textattack/xlnet-base-cased-rotten-tomatoes...
Loading model textattack/bert-base-uncased-rotten-tomatoes...
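
The notebook does not show how the four auxiliary models are combined into the *Oracle*. One plausible scheme, sketched below purely as an assumption, is a majority vote that abstains on ties. The function `oracle_label` and the toy predictors are hypothetical and exist only for this example; labels follow the dataset convention (0 = negative, 1 = positive).

```python
from collections import Counter

def oracle_label(sentence, predictors):
    # Each predictor maps a sentence to a label (0 = negative, 1 = positive).
    votes = Counter(predict(sentence) for predict in predictors)
    (top_label, top_count), *rest = votes.most_common()
    # Abstain (return None) when the two most voted labels are tied
    if rest and rest[0][1] == top_count:
        return None
    return top_label

# Toy stand-ins for the wrapped models (assumptions, not real classifiers)
def always_positive(sentence):
    return 1

def always_negative(sentence):
    return 0

def keyword_model(sentence):
    return 0 if "bad" in sentence else 1

print(oracle_label("too bad none of it is funny .",
                   [always_positive, always_negative, keyword_model]))
print(oracle_label("it still works .", [always_positive, always_negative]))
```

Abstaining on ties is a design choice that would discard sentences the auxiliary models disagree on; whether the actual generator does this is not stated here.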


# Generating the templates
The word-ranking method used in PosNegTemplateGenerator is the Replace-1 Score.
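
The sketch below illustrates the idea behind a Replace-1 style ranking, assuming the usual definition: the score of a word is the change in the model's output when that word alone is replaced by an unknown token. The function `replace_1_scores`, the `[UNK]` placeholder, and the toy scoring function are assumptions made for this example; the library's own implementation may differ.

```python
def replace_1_scores(tokens, score_fn, unk="[UNK]"):
    # Score of token i = score(original) - score(original with token i replaced)
    base = score_fn(tokens)
    scores = []
    for i in range(len(tokens)):
        perturbed = tokens[:i] + [unk] + tokens[i + 1:]
        scores.append(base - score_fn(perturbed))
    return scores

def toy_positive_score(tokens):
    # Toy stand-in for a classifier's positive-class score (assumption)
    weights = {"moving": 0.3, "breathtaking": 0.5}
    return 0.1 + sum(weights.get(t, 0.0) for t in tokens)

tokens = "a moving and not infrequently breathtaking film .".split()
scores = replace_1_scores(tokens, toy_positive_score)

# Rank words by the magnitude of their effect on the score
ranked = sorted(zip(tokens, scores), key=lambda pair: abs(pair[1]), reverse=True)
print([word for word, _ in ranked[:2]])
```

The two highest-ranked words are the ones that would be replaced by masks when `n_masks=2`.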

In [5]:
from template_generator_static_lex.tasks.sentiment_analisys import PosNegTemplateGeneratorRandom

tg = PosNegTemplateGeneratorRandom(model, models)

### Initial number of instances: 5

In [6]:
# Sampling instances
np.random.seed(220)
n_instances = 5
df_sampled = df.sample(n_instances)

instances = [x for x in df_sampled['text'].values]

In [7]:
templates = tg.generate_templates(instances, n_masks=2, k_templates=1)

Converting texts to sentences...
:: 5 sentences were generated.
Ranking words using Replace-1 Score...


:: Word ranking done.
Predicting inputs...
:: Sentence predictions done.


#### Execution time for 5 instances: 0m 3.1s

In [8]:
tg.to_dataframe()


Unnamed: 0,label,original_text,masked_text,template_text
0,0,"cherry orchard is badly edited , often awkwardly directed and suffers from the addition of a wholly unnecessary pre-credit sequence designed to give some of the characters a 'back story . '","cherry orchard is badly edited , often awkwardly {mask} and {mask} from the addition of a wholly unnecessary pre - credit sequence designed to give some of the characters a ' back story . '","cherry orchard is badly edited , often awkwardly {neg_verb} and {neg_verb} from the addition of a wholly unnecessary pre - credit sequence designed to give some of the characters a ' back story . '"


In [9]:
tg.lexicons

{'pos_verb': ['enjoy', 'carries', 'evoked', 'loved', 'shares', 'celebrating'],
 'neg_verb': ['suffers',
  'dismissed',
  'avoids',
  'squanders',
  'undermines',
  'removed',
  'directed',
  'withered',
  'simpering'],
 'pos_adj': ['timeless',
  'wonderful',
  'tender',
  'powerful',
  'engrossing',
  'brilliant',
  'funny',
  'terrific',
  'breathtaking',
  'successful',
  'passionate',
  'phenomenal',
  'notch',
  'worth'],
 'neg_adj': ['bad',
  'undeterminable',
  'drunk',
  'cumbersome',
  'nasty',
  'dumb',
  'psychopathic',
  'manipulative',
  'off',
  'preposterous',
  'tiresome',
  'stupid',
  'sensitive',
  'unpretentious']}

### Initial number of instances: 100

In [10]:
# Using all 100 instances
instances = [x for x in df['text'].values]

In [11]:
%%time
# 35.4s
tg = PosNegTemplateGeneratorRandom(model, models)
templates = tg.generate_templates(instances, n_masks=2, k_templates=18)

Converting texts to sentences...
:: 103 sentences were generated.
Ranking words using Replace-1 Score...


:: Word ranking done.
Predicting inputs...
:: Sentence predictions done.
CPU times: user 3min 41s, sys: 601 ms, total: 3min 41s
Wall time: 23.5 s


#### Execution time for 100 instances: 11.4s

In [12]:
tg.to_dataframe()


Unnamed: 0,label,original_text,masked_text,template_text
0,0,leaves viewers out in the cold and undermines some phenomenal performances .,leaves viewers out in the cold and {mask} some {mask} performances .,leaves viewers out in the cold and {neg_verb} some {pos_adj} performances .
1,0,one gets the impression the creators of don't ask don't tell laughed a hell of a lot at their own jokes . too bad none of it is funny .,one {mask} the impression the creators of do n't ask do n't tell laughed a hell of a lot at their own jokes . too {mask} none of it is funny .,one {pos_verb} the impression the creators of do n't ask do n't tell laughed a hell of a lot at their own jokes . too {neg_adj} none of it is funny .
2,1,connoisseurs of chinese film will be pleased to discover that tian's meticulous talent has not withered during his enforced hiatus .,connoisseurs of chinese film will be pleased to {mask} that tian 's meticulous talent has not {mask} during his enforced hiatus .,connoisseurs of chinese film will be pleased to {pos_verb} that tian 's meticulous talent has not {neg_verb} during his enforced hiatus .
3,1,a moving and not infrequently breathtaking film .,a {mask} and not infrequently {mask} film .,a {pos_adj} and not infrequently {pos_adj} film .
4,1,caine makes us watch as his character awakens to the notion that to be human is eventually to have to choose .,caine makes us watch as his character {mask} to the notion that to be human is eventually to {mask} to choose .,caine makes us watch as his character {pos_verb} to the notion that to be human is eventually to {neg_verb} to choose .
5,0,nothing but an episode of smackdown ! in period costume and with a bigger budget .,nothing but an episode of smackdown ! in period costume and with a {mask} budget .,nothing but an episode of smackdown ! in period costume and with a {pos_adj} budget .
6,0,"yes they can swim , the title is merely anne-sophie birot's off-handed way of saying girls find adolescence difficult to wade through .","yes they can swim , the title is merely anne - sophie birot 's {mask} - {mask} way of saying girls find adolescence difficult to wade through .","yes they can swim , the title is merely anne - sophie birot 's {neg_adj} - {neg_adj} way of saying girls find adolescence difficult to wade through ."
7,0,"a rip-off twice removed , modeled after [seagal's] earlier copycat under siege , sometimes referred to as die hard on a boat .","a rip - off twice {mask} , {mask} after [ seagal 's ] earlier copycat under siege , sometimes referred to as die hard on a boat .","a rip - off twice {neg_verb} , {neg_verb} after [ seagal 's ] earlier copycat under siege , sometimes referred to as die hard on a boat ."
8,1,"jolie gives it that extra little something that makes it worth checking out at theaters , especially if you're in the mood for something more comfortable than challenging .","jolie {mask} it that extra little something that makes it worth checking out at theaters , especially if you 're in the mood for something more comfortable than {mask} .","jolie {pos_verb} it that extra little something that makes it worth checking out at theaters , especially if you 're in the mood for something more comfortable than {pos_adj} ."
9,1,"a dreadful day in irish history is given passionate , if somewhat flawed , treatment .","a dreadful day in irish history is {mask} {mask} , if somewhat flawed , treatment .","a dreadful day in irish history is {neg_verb} {pos_adj} , if somewhat flawed , treatment ."


In [13]:
print(tg.lexicons)

{'pos_verb': ['enjoy', 'coming', 'carries', 'evoked', 'knows', 'loved', 'gives', 'shares', 'awakens', 'discover', 'celebrating', 'gets'], 'neg_verb': ['modeled', 'have', 'dismissed', 'tells', 'avoids', 'squanders', 'given', 'undermines', 'removed', 'withered', 'simpering'], 'pos_adj': ['powerful', 'engrossing', 'brilliant', 'funny', 'challenging', 'terrific', 'successful', 'notch', 'worth', 'moving', 'bigger', 'interpersonal', 'wonderful', 'passionate', 'timeless', 'delicate', 'tender', 'breathtaking', 'phenomenal'], 'neg_adj': ['entire', 'bad', 'undeterminable', 'drunk', 'cumbersome', 'nasty', 'dumb', 'psychopathic', 'manipulative', 'off', 'preposterous', 'handed', 'tiresome', 'stupid', 'sensitive', 'unpretentious']}


# Using the templates generated by the TemplateGenerator in CheckList

In [14]:
from checklist.editor import Editor
from checklist.test_suite import TestSuite
from checklist.test_types import MFT

In [15]:
lexicons = tg.lexicons
templates = tg.template_texts
masked = tg.masked_texts
labels = [sent.prediction.label for sent in tg.sentences]

editor = Editor()
editor.add_lexicon('pos_verb', lexicons['pos_verb'])
editor.add_lexicon('neg_verb', lexicons['neg_verb'])
editor.add_lexicon('pos_adj', lexicons['pos_adj'])
editor.add_lexicon('neg_adj', lexicons['neg_adj'])

suite = TestSuite()

In [16]:
for i, (template, label) in enumerate(zip(templates, labels), start=1):
    label = int(label)  # cast to a plain Python int before handing it to CheckList
    t = editor.template(template, remove_duplicates=True, labels=label)

    suite.add(MFT(
        data=t.data,
        labels=label,
        capability="Vocabulary",
        name=f"Test: MFT with vocabullary - template{i}",
        description="Checking if the model can handle vocabulary"))

In [17]:
suite.run(model.predict, overwrite=True)
suite.save('./suites/posneg-random.suite')

Running Test: MFT with vocabullary - template1
Predicting 209 examples


Running Test: MFT with vocabullary - template2
Predicting 192 examples
Running Test: MFT with vocabullary - template3
Predicting 132 examples
Running Test: MFT with vocabullary - template4
Predicting 19 examples
Running Test: MFT with vocabullary - template5
Predicting 132 examples
Running Test: MFT with vocabullary - template6
Predicting 19 examples
Running Test: MFT with vocabullary - template7
Predicting 16 examples
Running Test: MFT with vocabullary - template8
Predicting 11 examples
Running Test: MFT with vocabullary - template9
Predicting 228 examples
Running Test: MFT with vocabullary - template10
Predicting 209 examples
Running Test: MFT with vocabullary - template11
Predicting 176 examples
Running Test: MFT with vocabullary - template12
Predicting 16 examples
Running Test: MFT with vocabullary - template13
Predicting 19 examples
Running Test: MFT with vocabullary - template14
Predicting 19 examples
Running Test: MFT with vocabullary - template15
Predicting 192 examples
Running
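
The example counts in the log above are consistent with each template being filled with every combination of its lexicon entries, with a repeated placeholder receiving the same word in all of its positions. The sketch below reproduces that counting in plain Python using the lexicons printed earlier. `expand_template` is a hypothetical helper written for this example; it is not CheckList's implementation.

```python
from itertools import product
import re

def expand_template(template, lexicons):
    # Distinct placeholder names, in order of first appearance
    keys = list(dict.fromkeys(re.findall(r"\{(\w+)\}", template)))
    # Cartesian product over the lexicons of the distinct placeholders
    for values in product(*(lexicons[k] for k in keys)):
        yield template.format(**dict(zip(keys, values)))

lexicons = {
    "neg_verb": ["modeled", "have", "dismissed", "tells", "avoids", "squanders",
                 "given", "undermines", "removed", "withered", "simpering"],
    "pos_adj": ["powerful", "engrossing", "brilliant", "funny", "challenging",
                "terrific", "successful", "notch", "worth", "moving", "bigger",
                "interpersonal", "wonderful", "passionate", "timeless",
                "delicate", "tender", "breathtaking", "phenomenal"],
}

template = "leaves viewers out in the cold and {neg_verb} some {pos_adj} performances ."
examples = list(expand_template(template, lexicons))
print(len(examples))  # 11 neg_verb entries x 19 pos_adj entries
print(examples[0])
```

For the first template this gives 11 × 19 = 209 sentences, matching the 209 examples reported for `template1`, while a template that repeats a single placeholder such as `{pos_adj}` yields only 19.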

# Loading the test suite

In [18]:
from checklist.test_suite import TestSuite
suite = TestSuite.from_file('./suites/posneg-random.suite')

# suite.visual_summary_table()

In [19]:
passed = 0
failed = 0
for test_name in suite.tests:
    table = suite.visual_summary_by_test(test_name)
    
    failed += table.stats['nfailed']    
    passed += table.stats['npassed']
    assert table.stats['nfailed'] + table.stats['npassed'] == len(table.filtered_testcases)

print(f"{failed = } ({(failed/(passed+failed))*100:.2f}%)")
print(f"{passed = } ({(passed/(passed+failed))*100:.2f}%)")
print(f"total = {passed+failed}")
print("templates:", len(suite.tests))

failed = 337 (18.84%)
passed = 1452 (81.16%)
total = 1789
templates: 18


In [20]:
table = suite.visual_summary_by_test('Test: MFT with vocabullary - template2')

for item in table.candidate_testcases:
    print(item['examples'][0]['new']['text'])