# Random approach

This notebook uses the random approach to generate templates, focusing on positive and negative templates. One possible application is testing the *Vocabulary* linguistic capability with the **MFT** (Minimum Functionality Test) test type.

The steps of this approach are:

1. Split the instances into sentences
2. Select a random sample of *K* sentences
3. Rank the words of each sentence
4. Predict the label of each sentence using the *Oracle*
5. Replace the relevant words with masks
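
The model-free steps above (1, 2, and 5) can be sketched in a few lines. This is only an illustration under simple assumptions: `split_sentences` and `mask_words` are hypothetical helpers with a naive regex splitter, not the functions used by `template_generator`, and the relevant words are given by hand because steps 3 and 4 require a model.

```python
import random
import re

def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]

def mask_words(sentence, relevant_words, mask="{mask}"):
    # Replace every whitespace-separated token found in relevant_words with the mask.
    return " ".join(mask if tok in relevant_words else tok for tok in sentence.split())

instances = [
    "This set is a box of useless trash. I returned it.",
    "The product works fine. My husband likes the presser.",
]

# Step 1: break the instances into sentences
sentences = [s for text in instances for s in split_sentences(text)]

# Step 2: select K sentences at random (K = 2 here)
rng = random.Random(220)
sampled = rng.sample(sentences, k=2)

# Step 5: mask the relevant words (chosen by hand in this sketch)
masked = mask_words("This set is a box of useless trash .", {"is", "useless"})
print(masked)  # This set {mask} a box of {mask} trash .
```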

In [1]:
%config Completer.use_jedi = False
import sys
import random

sys.path.append('../')
random.seed(220)

## Loading the dataset, the target model, and the auxiliary models

In [27]:
import pandas as pd
pd.set_option('display.max_colwidth', None)
from datasets import load_dataset

dataset = load_dataset("amazon_polarity")
dataset.set_format("pandas")
df = dataset["test"].shuffle(seed=42)[:100]
df = df.rename(columns={"content": "text"}).drop(columns=["title"])
df

Unnamed: 0,label,text
0,1,"The product works fine. I ordered the more exprensive one after I read reviews from others on Amazon. My husband likes the presser. It does a good job pressing his pants. However, it was damaged in the box when we received it. We decided it was too much trouble to send it back. The box was torn and the presser had a chuck knocked out of it."
1,0,"This book is so useless that I feel compelled to write a review to warn others to stay away from this book. A good tutorial should inspire the user on what he/she can do with the product. This book leads you to believe that without talent, the only thing you can do with Illustrator is to draw circles and squares. The book is a disservice to both the reader and to Adobe Illustrator."
2,0,The authors attempt an ambitious goal of covering many SOA topics - but their resulting text come across as scattered - vague - and lacking a coherent and practical application.Thomas Erl's books are much better written - and have a coherent approch to buliding a solid body of knowledge.For a manager / salesperson wanting a broad overview of SOA - they might be better served by reading Service Oriented Architecture For DummiesService Oriented Architecture For Dummies (For Dummies (Computer/Tech))
3,0,I ordered this product and did recieve then a couple months later it broke. Now Ive done everything I was told to do by by shipping back for a replacement and nothing. They wont return Emails i havent received the replacement part.
4,0,"I hated this movie. It was so silly. The girl made the cult look more stupid than they already were. Come on? She was from the future??? I can't stop laughing. Maybe, I missed something. I don't think I did. When it first started, I said to myself: What am I watching this for? I thought it was stupid, stupid and then more stupid. I kept watching, trying to make sense of it, but to no avail. I didn't want to waste my $1.00 rental fee."
...,...,...
95,1,"What a gloriously funny book! Even the recipies were funny, and well, how funny did you think a recipie could be?! I ""discovered"" this book en route to Jamaica back in May--the stranger next to me read it all the way there. Well, the cover just grabbed me and I HAD to have it. It was a quick, light read that had a very wise and uplifting last chapter. Oh, and for those who are clueless like me in the beginning, this is not a fiction novel, but a wacky manual about life, love and other good stuff that we should all follow to the hilt!"
96,0,"If you want Harman Kardon receivers it's ok. Even most of the DVD's. I own a 22 and a 31 and I also got this one which is really annoying.Issues:- it does not save caption settings- it does not save video settings; even after I set it up to be 16:9 1080i default it always reverted to 720p.- after a period of time the DVD unit itself refused to read discsI returned to HK, got a replacement and I'm testing it to see if there are any improvements, but... I think this is unacceptable for HK. After all I did not buy an 80$ Sony, and if I bought HK I bought it for the name which supposedley means quality."
97,0,"Same problems as everybody else. 14 months after purchase it ate the card. Tried 2 different cards, no dice for either. From love to hate. Dang. Also Canon's support website/acknowledgement of this problem is non-existent. It was hard enough to navigate their site, but it's impossible to find anything relevant."
98,0,I can be tough on safety glasses so it may be no fault of the mfg but IMO the lenses scuffed and scratched rather quickly.


In [28]:
df.insert(1, "word_count", df["text"].map(lambda x: len(x.split(" "))) )

df.to_clipboard()

In [3]:
import re
import numpy as np
from torch.nn.functional import softmax
from transformers import AutoTokenizer, AutoModelForSequenceClassification

def pre_proccess(text):
    text = text.lower()
    # Replace digits and ASCII punctuation (the ranges '!'-'.' and ':'-'@', plus '/') with spaces
    text = re.sub('["\',!-.:-@0-9/]', ' ', text)
    return text

# Wrapper to adapt output format
class SentimentAnalisysModelWrapper:
    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer
        
    def __predict(self, text_input):
        text_preprocessed = pre_proccess(text_input)
        tokenized = self.tokenizer(text_preprocessed, padding=True, truncation=True, max_length=512, 
                                    add_special_tokens = True, return_tensors="pt")
        
        tensor_logits = self.model(**tokenized)
        # Pass dim explicitly: calling softmax without it is deprecated and emits a warning
        prob = softmax(tensor_logits[0], dim=-1).detach().numpy()
        pred = np.argmax(prob)
        
        return pred, prob
    
    def predict_label(self, text_inputs):
        return self.predict(text_inputs)[0]
        
    def predict_proba(self, text_inputs):
        return self.predict(text_inputs)[1]
        
    def predict(self, text_inputs):
        if isinstance(text_inputs, str):
            text_inputs = [text_inputs]
        
        preds = []
        probs = []

        for text_input in text_inputs:
            pred, prob = self.__predict(text_input)
            preds.append(pred)
            probs.append(prob[0])

        return np.array(preds), np.array(probs) # ([0, 1], [[0.99, 0.01], [0.03, 0.97]])

# Helper function to load and wrap a model from Hugging Face
def load_model(model_name):
    print(f'Loading model {model_name}...')
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    
    return SentimentAnalisysModelWrapper(model, tokenizer)

# Hugging Face hosted model names (fine-tuned on Amazon Polarity, despite the variable name)
rotten_tomatoes_models = {
    'bert': 'pig4431/amazonPolarity_BERT_5E', 
    'distilbert': 'pig4431/amazonPolarity_DistilBERT_5E', 
    'roberta': 'pig4431/amazonPolarity_roBERTa_5E', 
    'albert': 'pig4431/amazonPolarity_ALBERT_5E',
    'xlnet': 'pig4431/amazonPolarity_XLNET_5E', 
}

In [4]:
m1 = load_model(rotten_tomatoes_models['albert'])
m2 = load_model(rotten_tomatoes_models['distilbert'])
m3 = load_model(rotten_tomatoes_models['roberta'])
m4 = load_model(rotten_tomatoes_models['xlnet'])

# Models to be used as oracle
models = [m1, m2, m3, m4]
# Target model
model = load_model(rotten_tomatoes_models['bert'])

Loading model pig4431/amazonPolarity_ALBERT_5E...
Loading model pig4431/amazonPolarity_DistilBERT_5E...
Loading model pig4431/amazonPolarity_roBERTa_5E...
Loading model pig4431/amazonPolarity_XLNET_5E...
Loading model pig4431/amazonPolarity_BERT_5E...
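
The notebook does not show how the oracle combines the predictions of the auxiliary models. One plausible rule, assumed here purely for illustration, is to trust a label only when all auxiliary models agree on it. `unanimous_labels` is a hypothetical helper and is not part of `template_generator`.

```python
import numpy as np

def unanimous_labels(predictions):
    """predictions: array-like of shape (n_models, n_sentences) with integer labels.

    Returns the labels of the first model and a boolean mask that is True
    where every model predicted the same label (assumed agreement rule).
    """
    predictions = np.asarray(predictions)
    first = predictions[0]
    agreed = (predictions == first).all(axis=0)
    return first, agreed

# Toy predictions from four models for three sentences (made-up values)
toy_predictions = [
    [0, 1, 1],
    [0, 1, 0],
    [0, 1, 1],
    [0, 1, 1],
]
oracle_labels, oracle_agreed = unanimous_labels(toy_predictions)
print(oracle_labels.tolist())  # [0, 1, 1]
print(oracle_agreed.tolist())  # [True, True, False]
```

With the wrappers loaded above, the input matrix could be built with something like `np.stack([m.predict_label(sentences) for m in models])`, where `sentences` is a list of strings.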


# Generating the templates
The word-ranking method used by PosNegTemplateGenerator is the Replace-1 Score.
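
As the Replace-1 Score is usually described, the score of a word is the change in the model's output when that word alone is replaced by a placeholder token. The sketch below illustrates the idea with a toy scoring function. `toy_positive_prob`, `replace_1_scores`, and the `[UNK]` placeholder are illustrative assumptions and may differ from what `PosNegTemplateGenerator` does internally.

```python
def toy_positive_prob(text):
    # Hypothetical stand-in for a classifier: NOT the notebook's model.
    words = text.lower().split()
    score = 0.5 + 0.2 * words.count("good") - 0.2 * words.count("useless")
    return min(max(score, 0.0), 1.0)

def replace_1_scores(sentence, predict, placeholder="[UNK]"):
    # Score each token by how much the prediction moves when only that token is replaced.
    tokens = sentence.split()
    base = predict(sentence)
    scores = []
    for i, tok in enumerate(tokens):
        replaced = tokens[:i] + [placeholder] + tokens[i + 1:]
        scores.append((tok, abs(base - predict(" ".join(replaced)))))
    # Highest score first: the most relevant words are candidates for masking.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

ranking = replace_1_scores("This set is a box of useless trash", toy_positive_prob)
print(ranking[0][0])  # useless
```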

In [5]:
from template_generator.tasks.sentiment_analisys import PosNegTemplateGeneratorRandom

tg = PosNegTemplateGeneratorRandom(model, models)

### Initial number of instances: 5

In [6]:
# Sampling instances
np.random.seed(220)
n_instances = 5
df_sampled = df.sample(n_instances)

instances = [x for x in df_sampled['text'].values]

In [7]:
templates = tg.generate_templates(instances, n_masks=2, k_templates=1)

Converting texts to sentences...
:: 22 sentences were generated.
Ranking words using Replace-1 Score...




:: Word ranking done.
Predicting inputs...
:: Sentence predictions done.


#### Execution time for 5 instances: 0m 3.1s

In [8]:
tg.to_dataframe()


Unnamed: 0,label,original_text,masked_text,template_text
0,0,This hand-vac is a definite inconvenience instead of connivence.,This hand-vac {mask} a {mask} inconvenience instead of connivence .,This hand-vac {neg_verb} a {neg_adj} inconvenience instead of connivence .


In [9]:
tg.lexicons

{'pos_verb': [], 'neg_verb': ['is'], 'pos_adj': [], 'neg_adj': ['definite']}

### Initial number of instances: 100

In [10]:
# Using all 100 instances
instances = [x for x in df['text'].values]

In [11]:
%%time

tg = PosNegTemplateGeneratorRandom(model, models)
templates = tg.generate_templates(instances, n_masks=2, k_templates=18)

Converting texts to sentences...
:: 467 sentences were generated.
Ranking words using Replace-1 Score...




:: Word ranking done.
Predicting inputs...
:: Sentence predictions done.
CPU times: user 1min 56s, sys: 0 ns, total: 1min 56s
Wall time: 13 s


#### Execution time for 100 instances: 11.4s

In [12]:
tg.to_dataframe()


Unnamed: 0,label,original_text,masked_text,template_text
0,0,I am running windows 7 ultimate and a microsoft product is incompatible with it.,I am running windows 7 {mask} and a microsoft product is {mask} with it .,I am running windows 7 {pos_adj} and a microsoft product is {neg_adj} with it .
1,1,"If you think the image is too bright, you can always turn down the brightness in the game.","If you think the image is too {mask} , you can always {mask} down the brightness in the game .","If you think the image is too {pos_adj} , you can always {neg_verb} down the brightness in the game ."
2,1,I ordered a book on March 14 for a birthday present.,I {mask} a book on March 14 for a {mask} present .,I {pos_verb} a book on March 14 for a {neg_adj} present .
3,1,"I think it would have been helpful to show how some of the earrings are actually worn, either on a mannequin head or a real person.","I think it {mask} {mask} been helpful to show how some of the earrings are actually worn , either on a mannequin head or a real person .","I think it {neg_verb} {pos_verb} been helpful to show how some of the earrings are actually worn , either on a mannequin head or a real person ."
4,1,Grooves are solid on here and Instant Funk is a Band that is slept on.,Grooves are {mask} on here and Instant Funk is a Band that {mask} slept on .,Grooves are {pos_adj} on here and Instant Funk is a Band that {neg_verb} slept on .
5,0,"Unfortunately, it hasn't been worth the purchase AT ALL!","Unfortunately , it {mask} n't been {mask} the purchase AT ALL !","Unfortunately , it {pos_verb} n't been {pos_adj} the purchase AT ALL !"
6,0,I really do not feel that this tub deserves the one star.,I really {mask} not feel that this tub {mask} the one star .,I really {neg_verb} not feel that this tub {pos_verb} the one star .
7,1,I like how you can use multiple switches to make even more that 10 different effects.,I like how you {mask} use multiple switches to make even more that 10 {mask} effects .,I like how you {pos_verb} use multiple switches to make even more that 10 {pos_adj} effects .
8,1,"Simple, you say?","Simple , you {mask} ?","Simple , you {neg_verb} ?"
9,0,This set is a box of useless trash.,This set {mask} a box of {mask} trash .,This set {neg_verb} a box of {neg_adj} trash .


In [13]:
print(tg.lexicons)

{'pos_verb': ['has', 'ordered', 'have', 'can', 'deserves', 'teach'], 'neg_verb': ['prove', 'is', 'say', 'hate', 'turn', 'wrote', 'been', 'would', 'was', 'do'], 'pos_adj': ['worth', 'ultimate', 'different', 'third', 'educational', 'bright', 'sure', 'solid', 'good'], 'neg_adj': ['disapointed', 'birthday', 'useless', 'incompatible', 'late']}


# Using the templates generated by the TemplateGenerator in CheckList

In [14]:
from checklist.editor import Editor
from checklist.test_suite import TestSuite
from checklist.test_types import MFT

In [15]:
lexicons = tg.lexicons
templates = tg.template_texts
masked = tg.masked_texts
labels = [sent.prediction.label for sent in tg.sentences]

editor = Editor()
editor.add_lexicon('pos_verb', lexicons['pos_verb'])
editor.add_lexicon('neg_verb', lexicons['neg_verb'])
editor.add_lexicon('pos_adj', lexicons['pos_adj'])
editor.add_lexicon('neg_adj', lexicons['neg_adj'])

suite = TestSuite()

In [16]:
for i, (template, label) in enumerate(zip(templates, labels), start=1):
    # Fill the template with the lexicon entries and attach the expected label
    t = editor.template(template, remove_duplicates=True, labels=int(label))

    # Register one MFT per template
    suite.add(MFT(
        data=t.data,
        labels=label,
        capability="Vocabullary", 
        name=f"Test: MFT with vocabullary - template{i}",
        description="Checking if the model can handle vocabullary"))
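
The number of examples produced for each template is consistent with filling the placeholders with every combination of the corresponding lexicon entries. The sketch below reproduces that count for the first template using only the standard library; it is an illustration of the combinatorics, not a reimplementation of CheckList's `Editor.template`.

```python
from itertools import product

# Lexicons copied from the tg.lexicons output shown earlier
lexicons = {
    "pos_adj": ["worth", "ultimate", "different", "third", "educational",
                "bright", "sure", "solid", "good"],
    "neg_adj": ["disapointed", "birthday", "useless", "incompatible", "late"],
}
template = "I am running windows 7 {pos_adj} and a microsoft product is {neg_adj} with it ."

# One example per (pos_adj, neg_adj) pair: 9 x 5 combinations
examples = [
    template.format(pos_adj=p, neg_adj=n)
    for p, n in product(lexicons["pos_adj"], lexicons["neg_adj"])
]
print(len(examples))  # 45
print(examples[0])
```

The 45 combinations match the "Predicting 45 examples" line reported for template1 when the suite runs.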

In [17]:
suite.run(model.predict, overwrite=True)
suite.save('./suites/posneg-random.suite')

Running Test: MFT with vocabullary - template1
Predicting 45 examples




Running Test: MFT with vocabullary - template2
Predicting 90 examples
Running Test: MFT with vocabullary - template3
Predicting 30 examples
Running Test: MFT with vocabullary - template4
Predicting 60 examples
Running Test: MFT with vocabullary - template5
Predicting 90 examples
Running Test: MFT with vocabullary - template6
Predicting 54 examples
Running Test: MFT with vocabullary - template7
Predicting 60 examples
Running Test: MFT with vocabullary - template8
Predicting 54 examples
Running Test: MFT with vocabullary - template9
Predicting 10 examples
Running Test: MFT with vocabullary - template10
Predicting 50 examples
Running Test: MFT with vocabullary - template11
Predicting 90 examples
Running Test: MFT with vocabullary - template12
Predicting 90 examples
Running Test: MFT with vocabullary - template13
Predicting 10 examples
Running Test: MFT with vocabullary - template14
Predicting 90 examples
Running Test: MFT with vocabullary - template15
Predicting 50 examples
Running Test: 

In [18]:
suite.visual_summary_table()

Please wait as we prepare the table data...


SuiteSummarizer(stats={'npassed': 0, 'nfailed': 0, 'nfiltered': 0}, test_infos=[{'name': 'Test: MFT with vocab…

# Loading the test suite

In [19]:
from checklist.test_suite import TestSuite
suite = TestSuite.from_file('./suites/posneg-random.suite')

# suite.visual_summary_table()

In [20]:
passed = 0
failed = 0
for test_name in suite.tests:
    table = suite.visual_summary_by_test(test_name)
    
    failed += table.stats['nfailed']    
    passed += table.stats['npassed']
    assert table.stats['nfailed'] + table.stats['npassed'] == len(table.filtered_testcases)

print(f"{failed = } ({(failed/(passed+failed))*100:.2f}%)")
print(f"{passed = } ({(passed/(passed+failed))*100:.2f}%)")
print(f"total = {passed+failed}")
print("templates:", len(suite.tests))

failed = 211 (19.76%)
passed = 857 (80.24%)
total = 1068
templates: 18


In [21]:
table = suite.visual_summary_by_test('Test: MFT with vocabullary - template2')

for item in table.candidate_testcases:
    print(item['examples'][0]['new']['text'])

` ensemble massacres will throughout .
` ensemble massacres shown throughout .
` ensemble massacres justify throughout .
` ensemble massacres captured throughout .
` ensemble massacres check throughout .
` ensemble massacres heard throughout .
` ensemble massacres be throughout .
` ensemble massacres does throughout .
` ensemble massacres seen throughout .
` matrix'-style massacres will throughout .
` matrix'-style massacres captured throughout .
` fascinating massacres labored throughout .
` fascinating massacres will throughout .
` fascinating massacres erupt throughout .
` fascinating massacres shown throughout .
` fascinating massacres hammer throughout .
` fascinating massacres justify throughout .
` fascinating massacres captured throughout .
` fascinating massacres check throughout .
` fascinating massacres heard throughout .
` fascinating massacres be throughout .
` fascinating massacres does throughout .
` fascinating massacres seen throughout .
` fascinating massacres tends t