# Random approach

This notebook uses the random approach to generate templates, focusing on positive and negative templates. One possible application is testing the *Vocabulary* linguistic capability with an **MFT** (Minimum Functionality Test).

The steps of this approach are:

1. Split the instances into sentences
2. Select a random sample of *K* sentences
3. Rank the words of each sentence
4. Predict the label of each sentence using the *Oracle*
5. Replace the relevant words with masks
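
As a rough illustration of steps 1, 2, and 5, the sketch below splits texts into sentences, draws a random sample of K sentences, and replaces a given set of words with a `{mask}` placeholder. The helper names (`split_sentences`, `sample_sentences`, `mask_words`) and the punctuation-based splitter are assumptions made for this example. They are not the `template_generator` API, which may tokenize and sample differently.

```python
import random
import re

def split_sentences(text):
    """Naively split after '.', '!' or '?' followed by whitespace (assumed splitter)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def sample_sentences(sentences, k, seed=220):
    """Draw up to k sentences uniformly at random, reproducibly."""
    rng = random.Random(seed)
    return rng.sample(sentences, min(k, len(sentences)))

def mask_words(sentence, relevant_words, mask="{mask}"):
    """Replace every whitespace-separated token found in relevant_words."""
    return " ".join(mask if tok in relevant_words else tok for tok in sentence.split())

instances = [
    "there's a great deal of corny dialogue and preposterous moments . and yet , it still works .",
    "unpretentious , charming , quirky , original",
]
sentences = [s for text in instances for s in split_sentences(text)]
sampled = sample_sentences(sentences, k=2)
masked = mask_words("and yet , it still works .", {"works"})
```

A splitter this simple would also break on abbreviations such as `mr .`, so a real pipeline would likely rely on a proper sentence tokenizer.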

In [1]:
%config Completer.use_jedi = False
import sys
import random

sys.path.append('../')
random.seed(220)

## Loading the dataset, the target model, and the auxiliary models

In [2]:
import pandas as pd
from datasets import load_dataset

pd.set_option('display.max_colwidth', None)

dataset = load_dataset("rotten_tomatoes")
dataset.set_format("pandas")
df = dataset["test"].shuffle(seed=42)[:100]
df

Unnamed: 0,text,label
0,"unpretentious , charming , quirky , original",1
1,"a film really has to be exceptional to justify a three hour running time , and this isn't .",0
2,working from a surprisingly sensitive script co-written by gianni romoli . . . ozpetek avoids most of the pitfalls you'd expect in such a potentially sudsy set-up .,1
3,"it may not be particularly innovative , but the film's crisp , unaffected style and air of gentle longing make it unexpectedly rewarding .",1
4,"such a premise is ripe for all manner of lunacy , but kaufman and gondry rarely seem sure of where it should go .",0
...,...,...
95,"ice age is the first computer-generated feature cartoon to feel like other movies , and that makes for some glacial pacing early on .",0
96,there's no denying that burns is a filmmaker with a bright future ahead of him .,1
97,it collapses when mr . taylor tries to shift the tone to a thriller's rush .,0
98,"there's a great deal of corny dialogue and preposterous moments . and yet , it still works .",1


In [3]:
import re
import numpy as np
from torch.nn.functional import softmax
from transformers import AutoTokenizer, AutoModelForSequenceClassification

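def pre_proccess(text):
    # Lowercase, then replace digits and ASCII punctuation in the ranges
    # '!'..'/' and ':'..'@' with spaces.
    text = text.lower()
    text = re.sub(r'[!-/0-9:-@]', ' ', text)
    return text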

# Wrapper to adapt output format
class SentimentAnalisysModelWrapper:
    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer
        
    def __predict(self, text_input):
        text_preprocessed = pre_proccess(text_input)
        tokenized = self.tokenizer(text_preprocessed, padding=True, truncation=True, max_length=512, 
                                    add_special_tokens = True, return_tensors="pt")
        
        tensor_logits = self.model(**tokenized)
        prob = softmax(tensor_logits[0], dim=-1).detach().numpy()
        pred = np.argmax(prob)
        
        return pred, prob
    
    def predict_label(self, text_inputs):
        return self.predict(text_inputs)[0]
        
    def predict_proba(self, text_inputs):
        return self.predict(text_inputs)[1]
        
    def predict(self, text_inputs):
        if isinstance(text_inputs, str):
            text_inputs = [text_inputs]
        
        preds = []
        probs = []

        for text_input in text_inputs:
            pred, prob = self.__predict(text_input)
            preds.append(pred)
            probs.append(prob[0])

        return np.array(preds), np.array(probs) # ([0, 1], [[0.99, 0.01], [0.03, 0.97]])

# Helper function to load and wrap a model from Hugging Face
def load_model(model_name):
    print(f'Loading model {model_name}...')
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    
    return SentimentAnalisysModelWrapper(model, tokenizer)

# Hugging Face hosted model names 
rotten_tomatoes_models = {
    'bert': 'textattack/bert-base-uncased-rotten-tomatoes', 
    'albert': 'textattack/albert-base-v2-rotten-tomatoes', 
    'distilbert': 'textattack/distilbert-base-uncased-rotten-tomatoes', 
    'roberta': 'textattack/roberta-base-rotten-tomatoes', 
    'xlnet': 'textattack/xlnet-base-cased-rotten-tomatoes', 
    
}

In [4]:
m1 = load_model(rotten_tomatoes_models['albert'])
m2 = load_model(rotten_tomatoes_models['distilbert'])
m3 = load_model(rotten_tomatoes_models['roberta'])
m4 = load_model(rotten_tomatoes_models['xlnet'])

# Models to be used as oracle
models = [m1, m2, m3, m4]
# Target model
model = load_model(rotten_tomatoes_models['bert'])

Loading model textattack/albert-base-v2-rotten-tomatoes...
Loading model textattack/distilbert-base-uncased-rotten-tomatoes...
Loading model textattack/roberta-base-rotten-tomatoes...


Some weights of the model checkpoint at textattack/roberta-base-rotten-tomatoes were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Loading model textattack/xlnet-base-cased-rotten-tomatoes...
Loading model textattack/bert-base-uncased-rotten-tomatoes...
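
This notebook does not show how the four auxiliary models are combined into a single oracle prediction. The sketch below assumes one plausible rule, averaging the class probabilities of all models and taking the most likely class. `oracle_predict` and `StubModel` are hypothetical names used only for illustration, and the stub returns a flat probability list rather than the 2-D array produced by the wrapper above.

```python
def oracle_predict(models, text):
    """Average class probabilities over all models and return (label, mean_probs)."""
    all_probs = [m.predict_proba(text) for m in models]
    n_classes = len(all_probs[0])
    mean = [sum(p[c] for p in all_probs) / len(all_probs) for c in range(n_classes)]
    label = max(range(n_classes), key=mean.__getitem__)
    return label, mean

class StubModel:
    """Hypothetical stand-in that always returns fixed probabilities [p_neg, p_pos]."""
    def __init__(self, probs):
        self.probs = probs

    def predict_proba(self, text):
        return self.probs

stubs = [StubModel([0.2, 0.8]), StubModel([0.6, 0.4]), StubModel([0.1, 0.9])]
label, mean = oracle_predict(stubs, "it still works .")
```

Majority voting over the predicted labels would be an equally reasonable assumption. Averaging was chosen here only because it also yields a confidence value.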


# Generating the templates
The word-ranking method used by `PosNegTemplateGenerator` is the Replace-1 Score.
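
Replace-1 Score is commonly described as the drop in a model's score for a class when a single word is swapped for a placeholder (unknown) token. The sketch below illustrates that idea under stated assumptions: `predict_proba` is any callable returning class probabilities for a text, and `toy_predict_proba`, the `<unk>` placeholder, and the function names are hypothetical. The implementation inside `template_generator` may differ.

```python
def replace_1_scores(tokens, predict_proba, label, unk="<unk>"):
    """Score each token by how much replacing it with `unk` lowers P(label)."""
    base = predict_proba(" ".join(tokens))[label]
    scores = []
    for i, tok in enumerate(tokens):
        perturbed = tokens[:i] + [unk] + tokens[i + 1:]
        scores.append((tok, base - predict_proba(" ".join(perturbed))[label]))
    # Highest score first: these are the words the prediction depends on most.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

def toy_predict_proba(text):
    """Hypothetical classifier: counts cue words and returns [p_neg, p_pos]."""
    words = text.split()
    pos = sum(w in {"charming", "rewarding"} for w in words)
    neg = sum(w in {"dull", "corny"} for w in words)
    p_pos = (1 + pos) / (2 + pos + neg)
    return [1 - p_pos, p_pos]

ranking = replace_1_scores("a charming but corny film".split(), toy_predict_proba, label=1)
```

With this toy scorer, `charming` ranks first for the positive class and `corny` ranks last, since removing it raises the positive probability.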

In [5]:
from template_generator.tasks.sentiment_analisys import PosNegTemplateGeneratorRandom

tg = PosNegTemplateGeneratorRandom(model, models)

### Initial number of instances: 5

In [6]:
# Sampling instances
np.random.seed(220)
n_instances = 5
df_sampled = df.sample(n_instances)

instances = df_sampled['text'].tolist()

In [7]:
templates = tg.generate_templates(instances, n_masks=2, k_templates=1)

Converting texts to sentences...
:: 6 sentences were generated.
Ranking words using Replace-1 Score...


:: Word ranking done.
Predicting inputs...
:: Sentence predictions done.


#### Execution time for 5 instances: 0m 3.1s

In [8]:
tg.to_dataframe()


Unnamed: 0,label,original_text,masked_text,template_text
0,1,"it may not be particularly innovative , but the film's crisp , unaffected style and air of gentle longing make it unexpectedly rewarding .","it {mask} not be particularly innovative , but the film 's crisp , unaffected style and air of gentle longing make it unexpectedly {mask} .","it {neg_verb} not be particularly innovative , but the film 's crisp , unaffected style and air of gentle longing make it unexpectedly {pos_verb} ."


In [9]:
tg.lexicons

{'pos_verb': ['rewarding'], 'neg_verb': ['may'], 'pos_adj': [], 'neg_adj': []}

### Initial number of instances: 100

In [10]:
# Using all 100 instances
instances = df['text'].tolist()

In [11]:
%%time

tg = PosNegTemplateGeneratorRandom(model, models)
templates = tg.generate_templates(instances, n_masks=2, k_templates=18)

Converting texts to sentences...
:: 138 sentences were generated.
Ranking words using Replace-1 Score...


:: Word ranking done.
Predicting inputs...
:: Sentence predictions done.
CPU times: user 1min 45s, sys: 0 ns, total: 1min 45s
Wall time: 10.8 s


#### Execution time for 100 instances: 11.4s

In [12]:
tg.to_dataframe()


Unnamed: 0,label,original_text,masked_text,template_text
0,1,an odd drama set in the world of lingerie models and bar dancers in the midwest that held my interest precisely because it didn't try to .,an odd drama set in the world of lingerie models and bar dancers in the midwest that {mask} my interest precisely because it did n't {mask} to .,an odd drama set in the world of lingerie models and bar dancers in the midwest that {neg_verb} my interest precisely because it did n't {neg_verb} to .
1,1,"twohy knows how to inflate the mundane into the scarifying , and gets full mileage out of the rolling of a stray barrel or the unexpected blast of a phonograph record .","twohy {mask} how to inflate the mundane into the scarifying , and {mask} full mileage out of the rolling of a stray barrel or the unexpected blast of a phonograph record .","twohy {pos_verb} how to inflate the mundane into the scarifying , and {pos_verb} full mileage out of the rolling of a stray barrel or the unexpected blast of a phonograph record ."
2,0,.,.,.
3,0,""" what really happened ? """,`` what really {mask} ? ``,`` what really {neg_verb} ? ``
4,1,"yes , soar .","yes , {mask} .","yes , {neg_verb} ."
5,0,you can practically hear george orwell turning over .,you can practically {mask} george orwell {mask} over .,you can practically {pos_verb} george orwell {neg_verb} over .
6,0,when the twist endings were actually surprising ?,when the twist endings {mask} actually {mask} ?,when the twist endings {neg_verb} actually {pos_adj} ?
7,1,"a dreadful day in irish history is given passionate , if somewhat flawed , treatment .","a dreadful day in irish history {mask} {mask} passionate , if somewhat flawed , treatment .","a dreadful day in irish history {pos_verb} {neg_verb} passionate , if somewhat flawed , treatment ."
8,0,charly comes off as emotionally manipulative and sadly imitative of innumerable past love story derisions .,charly comes off as emotionally {mask} and sadly {mask} of innumerable past love story derisions .,charly comes off as emotionally {neg_adj} and sadly {pos_adj} of innumerable past love story derisions .
9,0,the rules of attraction gets us too drunk on the party favors to sober us up with the transparent attempts at moralizing .,the rules of attraction {mask} us too {mask} on the party favors to sober us up with the transparent attempts at moralizing .,the rules of attraction {pos_verb} us too {neg_verb} on the party favors to sober us up with the transparent attempts at moralizing .


In [13]:
print(tg.lexicons)

{'pos_verb': ['leaps', 'hear', 'breathtaking', 'is', 'knows', 'gets', 'rewarding'], 'neg_verb': ['drunk', 'held', 'happened', 'try', 'hoping', 'may', 'were', 'was', 'soar', 'given', 'turning'], 'pos_adj': ['uncanny', 'surprising', 'imitative'], 'neg_adj': ['hotter-two-years-ago', 'national', 'manipulative', 'few', 'laudable', 'faux-urban', 'most']}


# Using the templates generated by TemplateGenerator in CheckList

In [14]:
from checklist.editor import Editor
from checklist.test_suite import TestSuite
from checklist.test_types import MFT

In [15]:
lexicons = tg.lexicons
templates = tg.template_texts
masked = tg.masked_texts
labels = [sent.prediction.label for sent in tg.sentences]

editor = Editor()
editor.add_lexicon('pos_verb', lexicons['pos_verb'])
editor.add_lexicon('neg_verb', lexicons['neg_verb'])
editor.add_lexicon('pos_adj', lexicons['pos_adj'])
editor.add_lexicon('neg_adj', lexicons['neg_adj'])

suite = TestSuite()

In [16]:
# Build one MFT per template; test names are numbered from 1.
for i, (template, label) in enumerate(zip(templates, labels), start=1):
    t = editor.template(template, remove_duplicates=True, labels=int(label))

    suite.add(MFT(
        data=t.data,
        labels=int(label),
        capability="Vocabullary", 
        name=f"Test: MFT with vocabullary - template{i}",
        description="Checking if the model can handle vocabullary"))
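
To see why a template yields a given number of test cases, the expansion can be approximated in plain Python: each distinct placeholder is filled with every word of its lexicon, and a placeholder that appears twice receives the same word in both positions. `expand_template` is a hypothetical helper written for this sketch, not part of CheckList, and it does not model the `remove_duplicates` option. The second template is a shortened version of one generated above.

```python
import itertools
import re

def expand_template(template, lexicons):
    """Fill every distinct {placeholder} with all combinations of lexicon words."""
    # dict.fromkeys keeps first-appearance order while dropping repeated names.
    keys = list(dict.fromkeys(re.findall(r"\{(\w+)\}", template)))
    if not keys:
        return [template]
    combos = itertools.product(*(lexicons[key] for key in keys))
    return [template.format(**dict(zip(keys, combo))) for combo in combos]

lexicons = {
    "pos_verb": ["leaps", "hear", "breathtaking", "is", "knows", "gets", "rewarding"],
    "neg_verb": ["drunk", "held", "happened", "try", "hoping", "may", "were", "was",
                 "soar", "given", "turning"],
}
two_slots = expand_template(
    "you can practically {pos_verb} george orwell {neg_verb} over .", lexicons)
one_slot = expand_template(
    "twohy {pos_verb} how to inflate the mundane , and {pos_verb} full mileage .", lexicons)
print(len(two_slots), len(one_slot))  # prints "77 7"
```

These counts (7 × 11 = 77 for two different placeholders, 7 for a repeated one) are consistent with the example counts reported when the suite is run.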

In [17]:
suite.run(model.predict, overwrite=True)
suite.save('./suites/posneg-random.suite')

Running Test: MFT with vocabullary - template1
Predicting 11 examples


Running Test: MFT with vocabullary - template2
Predicting 7 examples
Running Test: MFT with vocabullary - template3
Predicting 1 examples
Running Test: MFT with vocabullary - template4
Predicting 11 examples
Running Test: MFT with vocabullary - template5
Predicting 11 examples
Running Test: MFT with vocabullary - template6
Predicting 77 examples
Running Test: MFT with vocabullary - template7
Predicting 33 examples
Running Test: MFT with vocabullary - template8
Predicting 77 examples
Running Test: MFT with vocabullary - template9
Predicting 21 examples
Running Test: MFT with vocabullary - template10
Predicting 77 examples
Running Test: MFT with vocabullary - template11
Predicting 77 examples
Running Test: MFT with vocabullary - template12
Predicting 49 examples
Running Test: MFT with vocabullary - template13
Predicting 21 examples
Running Test: MFT with vocabullary - template14
Predicting 11 examples
Running Test: MFT with vocabullary - template15
Predicting 7 examples
Running Test: MFT

In [18]:
suite.visual_summary_table()

Please wait as we prepare the table data...


SuiteSummarizer(stats={'npassed': 0, 'nfailed': 0, 'nfiltered': 0}, test_infos=[{'name': 'Test: MFT with vocab…

# Loading the test suite

In [19]:
from checklist.test_suite import TestSuite
suite = TestSuite.from_file('./suites/posneg-random.suite')

# suite.visual_summary_table()

In [20]:
passed = 0
failed = 0
for test_name in suite.tests:
    table = suite.visual_summary_by_test(test_name)
    
    failed += table.stats['nfailed']    
    passed += table.stats['npassed']
    assert table.stats['nfailed'] + table.stats['npassed'] == len(table.filtered_testcases)

print(f"{failed = } ({(failed/(passed+failed))*100:.2f}%)")
print(f"{passed = } ({(passed/(passed+failed))*100:.2f}%)")
print(f"total = {passed+failed}")
print("templates:", len(suite.tests))

failed = 51 (8.17%)
passed = 573 (91.83%)
total = 624
templates: 18


In [21]:
table = suite.visual_summary_by_test('Test: MFT with vocabullary - template2')

for item in table.candidate_testcases:
    print(item['examples'][0]['new']['text'])

` ensemble massacres will throughout .
` ensemble massacres shown throughout .
` ensemble massacres justify throughout .
` ensemble massacres captured throughout .
` ensemble massacres check throughout .
` ensemble massacres heard throughout .
` ensemble massacres be throughout .
` ensemble massacres does throughout .
` ensemble massacres seen throughout .
` matrix'-style massacres will throughout .
` matrix'-style massacres captured throughout .
` fascinating massacres labored throughout .
` fascinating massacres will throughout .
` fascinating massacres erupt throughout .
` fascinating massacres shown throughout .
` fascinating massacres hammer throughout .
` fascinating massacres justify throughout .
` fascinating massacres captured throughout .
` fascinating massacres check throughout .
` fascinating massacres heard throughout .
` fascinating massacres be throughout .
` fascinating massacres does throughout .
` fascinating massacres seen throughout .
` fascinating massacres tends t

In [22]:
passed = 0
failed = 0
for i in range(len(suite.tests)):
    table = suite.visual_summary_by_test(f'Test: MFT with vocabullary - template{i+1}')
    failed = failed + len(table.candidate_testcases)    
    passed = passed + len(table.filtered_testcases)

print(f"{failed=}", f"{passed=}", f"{passed+failed=}", sep="\n")

failed=111
passed=602
passed+failed=713
