# Random approach

This notebook uses the random approach to generate templates, focusing on positive and negative templates. One possible application is testing the *Vocabulary* linguistic capability with an **MFT** (Minimum Functionality Test).

The steps of this approach are:

1. Split the instances into sentences
2. Randomly select a sample of *K* sentences
3. Rank the words of each sentence
4. Predict the label of each sentence using the *Oracle*
5. Replace the relevant words with masks
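
The steps above can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the `PosNegTemplateGeneratorRandom` implementation: the helper names, the naive regex sentence splitter, and the `rank_words` and `oracle` callables are all hypothetical stand-ins for steps 3 and 4.

```python
import random
import re


def split_sentences(text):
    """Step 1: naive split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sample_sentences(sentences, k, seed=220):
    """Step 2: draw up to k sentences uniformly at random, reproducibly."""
    rng = random.Random(seed)
    return rng.sample(sentences, min(k, len(sentences)))


def mask_top_words(sentence, scores, n_masks=2, mask="{mask}"):
    """Step 5: replace the n_masks highest-scoring words with a mask token."""
    words = sentence.split()
    ranked = sorted(range(len(words)), key=lambda i: scores[i], reverse=True)
    to_mask = set(ranked[:n_masks])
    return " ".join(mask if i in to_mask else w for i, w in enumerate(words))


def generate_masked(instances, rank_words, oracle, k, n_masks=2, seed=220):
    """Run steps 1 to 5; rank_words and oracle are caller-supplied (assumed) callables."""
    sentences = [s for text in instances for s in split_sentences(text)]
    sampled = sample_sentences(sentences, k, seed)
    # Step 3 (rank_words) returns one score per whitespace-separated word;
    # step 4 (oracle) returns a label for the original sentence.
    return [(mask_top_words(s, rank_words(s), n_masks), oracle(s)) for s in sampled]
```

Passing the ranking and the oracle as callables keeps the skeleton independent of any particular model.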

In [1]:
%config Completer.use_jedi = False
import sys
import random

sys.path.append('../')
random.seed(220)

## Loading the dataset, the target model, and the auxiliary models

In [2]:
import pandas as pd
pd.set_option('display.max_colwidth', None)

imdb_df = pd.read_csv('./data/imdb_sampled/data-100samples.csv')
imdb_df.head(5)

Unnamed: 0,label,text,words
0,1,"Christian Duguay directed this tidy little espionage thriller early in his career. It plays on TV pretty regularly, albeit with some terrific scenes of violence and sex unfortunately trimmed. I finally got around to seeing the theatrical version on a $3 tape from the local video store. Naval officer Aidan Quinn is recruited to impersonate the notorious Carlos the Jackal, and gets a little too caught up in the role. Donald Sutherland Ben Kingsley play Quinn's superiors, with Sutherland a true zealot and Kingsley as the more level-headed one. The first half of this fun flick shows Quinn being trained and indoctrinated. The second half has him out in the field, making love to the Jackal's woman and shooting it out with sundry enemies. The idea is to make the Jackal look like a turncoat to the Russians, and let them take care of the world's most notorious assassin. Things don't exactly play out as planned. At times, I almost expected the cast to break out laughing at some of the corny dialogue, but they all play it very straight. In the end, this is one terrific little thriller that deserves your attention. The Jackal's former mistress teaching the highly proper and very married Quinn to rough her up, lick blood from her face, and then go down on her, alone is worth the price of admission.",227
1,1,"New Yorkers contemporaneous with this film will recall how reflective of its time it is and how well cast and crew captured America, New York City of that era.<br /><br />Norman Wexler's script delineates the different worlds the various sub groupings live in and Avildsen's direction brings out phenomenal performances all around. Peter Boyle's prodigious talent is on display as never before nor since. Clearly it is the best character portrayal the always likable Dennis Patrick ever accomplished.<br /><br />What I will always remember about JOE is the feeling of having been in a virtual state of shock coming out of the theater. Knowing that what the screen portrayed was seething under the surface in neighborhoods throughout the five boroughs of the City of New York.<br /><br />This film needs to be remembered.",133
2,0,"I love oddball animation, I love a lot of Asian films, but I didn't love this particular product of Japan. The Fuccons are supposedly an American family (they're all mannequins) who have moved to Japan, and they're somewhat a 50's sitcom type family, with slightly more modern sensibilities at times. The DVD features several very short episodes (like less than 5 minutes each?) and I did not find it to be either funny or entertaining, not even in a weird way. I'm not sure what the appeal is of this. I did pick up on some satire here and there, gosh, who wouldn't, but satire is usually somewhat humorous, isn't it? And nothing I saw or heard rated even a little smirk. I picked this up used and it certainly SOUNDED appealing, but I guess either I'm missing the point or it's just plain LAME. The box even says it's Fuccon hilarious, right there on the front, but I beg to differ. 2 out of 10.",166
3,1,"I have seen this film probably a dozen times since it was originally released theatrically. Anyone who calls this movie trash or horrible just doesn't understand action films or recognize a good one. Perhaps to some the incidents and outcomes may seem far fetched, but in my opinion screenwriter Shane Black ( Lethal Weapon/ Kiss Kiss Bang Bang) crafted one of the most well thought out action adventures you will ever come across. Over the top or not this film flows like clockwork and the action just keeps coming. The final action sequence is one of the best I have ever seen in any film. The cast in this film crackles. Genna Davis gave a tremendous performance and its a damn shame there was never a ""LKG"" sequel. Samuel L. Jackson is hilarious as her sidekick Mitch a down on his luck private eye trying to help her discover her lost past and make a few bucks. If Baffles me how anyone could not like this film. It packs so many thrills and its so funny. The wisecracks in this film still make me laugh just as hard 10 years later. In my mind the first Matrix film and the Long Kiss Goodnight were easily 2 of the best and most original action flicks of the 90's. Incidentally Shane Black made a fortune when he sold this script. At the time it was the highest selling screenplay and its worth every penny. It's so sad that audiences never gave this movie a chance, cause they would have witnessed Renny Harlins best film and Genna Davis like you have never seen her before. Long live ""The Long Kiss Goodnight""!!",278
4,1,"Don't mind what this socially retarded person above says, this show is hilarious. It shows how a lot of single men are in a bar atmosphere, and also shows that women are not as gullible as men think they are. <br /><br />The contest aspect of the how is really cool and original. Its not the standard reality show that we are all used to now a days.<br /><br />Give it a chance everyone, we are only one episode in, we finally have some Canadian programming that isn't absolute crap. As Canadians what do we normally get, Bon Cop, Bad Cop, or Corner Gas. Come on people show that we are all not as prudish as the previous reviewer.<br /><br />Way to go Comedy Network, giving a new show a chance. The panel is funny and the contestants so far are pretty good.",143


In [3]:
import re
import numpy as np
from torch.nn.functional import softmax
from transformers import AutoTokenizer, AutoModelForSequenceClassification

def pre_proccess(text):
    # Lowercase, then replace ASCII punctuation, symbols and digits ('!' through '@') with spaces
    text = text.lower()
    text = re.sub(r'[!-/0-9:-@]', ' ', text)
    return text

# Wrapper to adapt output format
class SentimentAnalisysModelWrapper:
    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer
        
    def __predict(self, text_input):
        text_preprocessed = pre_proccess(text_input)
        tokenized = self.tokenizer(text_preprocessed, padding=True, truncation=True, max_length=512, 
                                    add_special_tokens = True, return_tensors="pt")
        
        tensor_logits = self.model(**tokenized)
        # Pass dim explicitly: relying on the implicit dimension is deprecated and emits a warning
        prob = softmax(tensor_logits[0], dim=-1).detach().numpy()
        pred = np.argmax(prob)
        
        return pred, prob
    
    def predict_label(self, text_inputs):
        return self.predict(text_inputs)[0]
        
    def predict_proba(self, text_inputs):
        return self.predict(text_inputs)[1]
        
    def predict(self, text_inputs):
        if isinstance(text_inputs, str):
            text_inputs = [text_inputs]
        
        preds = []
        probs = []

        for text_input in text_inputs:
            pred, prob = self.__predict(text_input)
            preds.append(pred)
            probs.append(prob[0])

        return np.array(preds), np.array(probs) # ([0, 1], [[0.99, 0.01], [0.03, 0.97]])

# Helper function to load and wrap a model from Hugging Face
def load_model(model_name):
    print(f'Loading model {model_name}...')
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    
    return SentimentAnalisysModelWrapper(model, tokenizer)

# Hugging Face hosted model names 
imdb_models = {
    'bert': 'textattack/bert-base-uncased-imdb', 
    'albert': 'textattack/albert-base-v2-imdb', 
    'distilbert': 'textattack/distilbert-base-uncased-imdb', 
    'roberta': 'textattack/roberta-base-imdb', 
    'xlnet': 'textattack/xlnet-base-cased-imdb'
}
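
The wrapper turns raw logits into a `(pred, prob)` pair. As a minimal sketch of that conversion, written with the standard library instead of PyTorch and NumPy, the hypothetical helper `logits_to_prediction` below applies a numerically stable softmax and takes the arg-max. It is not part of the notebook's code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_prediction(logits):
    """Return (predicted class index, class probabilities)."""
    probs = softmax(logits)
    pred = max(range(len(probs)), key=probs.__getitem__)
    return pred, probs


pred, probs = logits_to_prediction([-2.0, 3.0])
print(pred, [round(p, 4) for p in probs])
```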

In [4]:
m1 = load_model(imdb_models['albert'])
m2 = load_model(imdb_models['distilbert'])
m3 = load_model(imdb_models['roberta'])
m4 = load_model(imdb_models['xlnet'])

# Models to be used as oracle
models = [m1, m2, m3, m4]
# Target model
model = load_model(imdb_models['bert'])

Loading model textattack/albert-base-v2-imdb...
Loading model textattack/distilbert-base-uncased-imdb...
Loading model textattack/roberta-base-imdb...


Some weights of the model checkpoint at textattack/roberta-base-imdb were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Loading model textattack/xlnet-base-cased-imdb...
Loading model textattack/bert-base-uncased-imdb...
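
The notebook does not show how the four auxiliary models are combined into an oracle. One common choice, assumed here purely for illustration, is a majority vote that abstains on ties. The function names and the "callable returning a label" interface are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label, or None when the two top labels tie."""
    counts = Counter(predictions).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def oracle_labels(models, sentences):
    """models: callables mapping a sentence to a label (assumed interface)."""
    return [majority_vote([m(s) for m in models]) for s in sentences]
```

Abstaining on ties is a design choice: with an even number of auxiliary models, a 2-2 split carries no usable signal, so such sentences could simply be discarded.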


# Generating the templates
The word-ranking method used by PosNegTemplateGenerator is the Replace-1 Score.
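
The notebook does not define the Replace-1 Score. The sketch below follows one plausible reading, which is an assumption: the score of a word is the drop in the probability of the originally predicted class when that single word is replaced by a neutral token. The `[UNK]` replacement token, the `predict_proba` interface, and the toy classifier are all hypothetical.

```python
import math

POSITIVE = {"cool", "great"}
NEGATIVE = {"bad", "worst"}


def toy_predict_proba(text):
    """Toy stand-in for a sentiment model: returns [p_negative, p_positive]."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    p_pos = 1 / (1 + math.exp(-score))
    return [1 - p_pos, p_pos]


def replace_one_scores(sentence, predict_proba, replacement="[UNK]"):
    """Score each word by how much replacing it lowers the predicted-class probability."""
    words = sentence.split()
    base = predict_proba(sentence)
    target = max(range(len(base)), key=base.__getitem__)
    scores = []
    for i in range(len(words)):
        perturbed = words[:i] + [replacement] + words[i + 1:]
        prob = predict_proba(" ".join(perturbed))
        scores.append(base[target] - prob[target])
    return list(zip(words, scores))


for word, score in replace_one_scores("a cool movie", toy_predict_proba):
    print(f"{word}: {score:.3f}")
```

Under this reading, words with the highest scores are the ones the model relies on most, which makes them natural candidates for masking.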

In [5]:
from template_generator_static_lex.tasks.sentiment_analisys import PosNegTemplateGeneratorRandom

tg = PosNegTemplateGeneratorRandom(model, models)

### Initial number of instances: 5

In [6]:
# Sampling instances
np.random.seed(220)
n_instances = 5
df_sampled = imdb_df.sample(n_instances)

instances = df_sampled['text'].tolist()

In [7]:
templates = tg.generate_templates(instances, n_masks=2, k_templates=1)

Converting texts to sentences...
:: 40 sentences were generated.
Ranking words using Replace-1 Score...


:: Word ranking done.
Predicting inputs...
:: Sentence predictions done.


#### Execution time for 5 instances: 0m 3.1s

In [8]:
df = tg.to_dataframe()
df

Unnamed: 0,label,original_text,masked_text,template_text
0,1,Lady and the Tramp II: Scamp's Adventure is a cool movie that many kids today can really relate to.,Lady and the Tramp II : Scamp 's Adventure is a {mask} movie that {mask} kids today can really relate to .,Lady and the Tramp II : Scamp 's Adventure is a {pos_adj} movie that {neg_adj} kids today can really relate to .


In [9]:
tg.lexicons

{'pos_verb': [], 'neg_verb': [], 'pos_adj': ['cool'], 'neg_adj': ['many']}

### Initial number of instances: 100

In [10]:
# Using all 100 instances
instances = imdb_df['text'].tolist()

In [11]:
%%time
# 2min 1.1s
tg = PosNegTemplateGeneratorRandom(model, models)
templates = tg.generate_templates(instances, n_masks=2, k_templates=18)

Converting texts to sentences...
:: 852 sentences were generated.
Ranking words using Replace-1 Score...


:: Word ranking done.
Predicting inputs...
:: Sentence predictions done.
CPU times: user 18min 12s, sys: 1.64 s, total: 18min 14s
Wall time: 2min 1s


#### Execution time for 100 instances: 2m 1s (wall time)

In [12]:
df_templates = tg.to_dataframe()
df_templates.insert(0, "template_index", df_templates.index + 1)
df_templates

Unnamed: 0,template_index,label,original_text,masked_text,template_text
0,1,1,/><br />'Mildred Roper' (Yootha Joyce) is keen to ascertain whether or not her slovenly husband 'George' (Brian Murphy) has remembered their wedding anniversary.,/><br />'Mildred Roper ' ( Yootha Joyce ) is keen to ascertain whether or not her {mask} husband ' George ' ( Brian Murphy ) has {mask} their wedding anniversary .,/><br />'Mildred Roper ' ( Yootha Joyce ) is keen to ascertain whether or not her {neg_adj} husband ' George ' ( Brian Murphy ) has {pos_verb} their wedding anniversary .
1,2,0,This movie purports to be a character study of perversion.,This movie {mask} to be a character study of perversion .,This movie {neg_verb} to be a character study of perversion .
2,3,0,"This is the worst horror movie I've seen in a long time, and I've watched a lot of horror movies.","This is the {mask} horror movie I 've {mask} in a long time , and I 've watched a lot of horror movies .","This is the {neg_adj} horror movie I 've {neg_verb} in a long time , and I 've watched a lot of horror movies ."
3,4,0,"Lisa looks like she'd rather be anywhere else, and since she wasn't any talent, I wonder why they kept her.","Lisa {mask} like she 'd rather be anywhere else , and since she was n't any talent , I {mask} why they kept her .","Lisa {neg_verb} like she 'd rather be anywhere else , and since she was n't any talent , I {neg_verb} why they kept her ."
4,5,0,"Didn't this guy, this director, if you can call him that, realize that the first Problem Child was bad enough?","Did n't this guy , this director , if you can call him that , {mask} that the first Problem Child was {mask} enough ?","Did n't this guy , this director , if you can call him that , {pos_verb} that the first Problem Child was {neg_adj} enough ?"
5,6,0,How pathetic....,How {mask} ....,How {neg_adj} ....
6,7,1,"To a degree this evoked Michael Moore's recent work (although Nossiter operates in a more subtle way), but probably the roots of the film go back to Marcel Ophuls' ""The Sorrow and the Pity"", both in the way the film is constructed and in the emergence of 'salt of the earth' French peasants as the stars.","To a degree this {mask} Michael Moore 's recent work ( although Nossiter operates in a more subtle way ) , but probably the roots of the film go back to Marcel Ophuls ' "" The Sorrow and the Pity "" , both in the way the film is constructed and in the emergence of ' salt of the earth ' {mask} peasants as the stars .","To a degree this {pos_verb} Michael Moore 's recent work ( although Nossiter operates in a more subtle way ) , but probably the roots of the film go back to Marcel Ophuls ' "" The Sorrow and the Pity "" , both in the way the film is constructed and in the emergence of ' salt of the earth ' {pos_adj} peasants as the stars ."
7,8,1,"But it's pure fun, it looks great, and remains light without mocking itself.","But it 's {mask} fun , it looks great , and remains light without {mask} itself .","But it 's {pos_adj} fun , it looks great , and remains light without {neg_verb} itself ."
8,9,0,/>The simple fact is that Saw is riddled with plot holes.,/>The {mask} fact is that Saw is {mask} with plot holes .,/>The {pos_adj} fact is that Saw is {neg_verb} with plot holes .
9,10,1,... a recommendation!,... a recommendation !,... a recommendation !


In [13]:
df_templates.to_csv("generated_templates/generated_templates_random.csv")

In [14]:
print(tg.lexicons)

{'pos_verb': ['ended', 'remembered', 'evoked', 'realize'], 'neg_verb': ['change', 'looks', 'wonder', 'have', 'seen', 'mocking', 'see', "wasn't.<br", 'purports', 'think', 'are', 'appeared', 'riddled'], 'pos_adj': ['pure', 'simple', 'rare', 'best', 'French'], 'neg_adj': ['worst', 'bad', 'racial', 'pathetic', 'disappointing', 'funny', 'slovenly', 'many']}


# Using the templates generated by TemplateGenerator in CheckList

In [5]:
from checklist.editor import Editor
from checklist.test_suite import TestSuite
from checklist.test_types import MFT

In [6]:
lexicons = {'pos_verb': ['evoked', 'celebrating', 'loved', 'shares', 'carries', 'enjoy'],
 'neg_verb': ['withered',
  'simpering',
  'squanders',
  'dismissed',
  'removed',
  'undermines',
  'avoids'],
 'pos_adj': ['wonderful',
  'notch',
  'timeless',
  'phenomenal',
  'powerful',
  'terrific',
  'funny',
  'engrossing',
  'passionate',
  'tender',
  'worth',
  'breathtaking',
  'successful',
  'brilliant'],
 'neg_adj': ['undeterminable',
  'preposterous',
  'drunk',
  'tiresome',
  'bad',
  'stupid',
  'psychopathic',
  'unpretentious',
  'dumb',
  'cumbersome',
  'manipulative',
  'nasty',
  'sensitive',
  'off']}

In [7]:
df = pd.read_csv("generated_templates/generated_templates_random.csv")

templates = df["template_text"].to_list()
masked = df["masked_text"].to_list()
labels = df["label"].to_list()

editor = Editor()
editor.add_lexicon('pos_verb', lexicons['pos_verb'])
editor.add_lexicon('neg_verb', lexicons['neg_verb'])
editor.add_lexicon('pos_adj', lexicons['pos_adj'])
editor.add_lexicon('neg_adj', lexicons['neg_adj'])

suite = TestSuite()

In [8]:
# Build one MFT per template; test names are numbered from 1
for i, (template, label) in enumerate(zip(templates, labels), start=1):
    t = editor.template(template, remove_duplicates=True, labels=int(label))

    suite.add(MFT(
        data=t.data,
        labels=int(label),
        capability="Vocabulary", 
        name=f"Test: MFT with vocabullary - template{i}",
        description="Checking if the model can handle vocabulary"))

In [9]:
suite.run(model.predict, overwrite=True)
suite.save('./suites/posneg-random.suite')

Running Test: MFT with vocabullary - template1
Predicting 84 examples


Running Test: MFT with vocabullary - template2
Predicting 7 examples
Running Test: MFT with vocabullary - template3
Predicting 98 examples
Running Test: MFT with vocabullary - template4
Predicting 7 examples
Running Test: MFT with vocabullary - template5
Predicting 84 examples
Running Test: MFT with vocabullary - template6
Predicting 14 examples
Running Test: MFT with vocabullary - template7
Predicting 84 examples
Running Test: MFT with vocabullary - template8
Predicting 98 examples
Running Test: MFT with vocabullary - template9
Predicting 98 examples
Running Test: MFT with vocabullary - template10
Predicting 1 examples
Running Test: MFT with vocabullary - template11
Predicting 14 examples
Running Test: MFT with vocabullary - template12
Predicting 196 examples
Running Test: MFT with vocabullary - template13
Predicting 7 examples
Running Test: MFT with vocabullary - template14
Predicting 14 examples
Running Test: MFT with vocabullary - template15
Predicting 7 examples
Running Test: MFT 
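
The example counts in the log are consistent with the lexicon sizes defined earlier (6 `pos_verb`, 7 `neg_verb`, 14 `pos_adj`, 14 `neg_adj`): a template is expanded over the Cartesian product of its distinct placeholders, and a placeholder that appears twice is filled with the same word in both slots. The sketch below reproduces that arithmetic only. `count_examples` is a hypothetical helper, not CheckList's implementation, and it ignores any reduction caused by `remove_duplicates=True`.

```python
import re
from math import prod

# Sizes of the lexicons registered with the Editor in this notebook
LEXICON_SIZES = {"pos_verb": 6, "neg_verb": 7, "pos_adj": 14, "neg_adj": 14}


def count_examples(template, sizes=LEXICON_SIZES):
    """Number of fill-ins: product of lexicon sizes over distinct placeholders."""
    placeholders = set(re.findall(r"\{(\w+)\}", template))
    return prod(sizes[p] for p in placeholders)  # empty product is 1


print(count_examples("her {neg_adj} husband has {pos_verb} their anniversary"))  # 14 * 6
print(count_examples("This movie {neg_verb} to be a character study"))           # 7
print(count_examples("... a recommendation !"))                                  # no placeholders
```

A template without placeholders therefore yields a single example, which only checks the model on the original sentence.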

# Loading the test suite

In [10]:
from checklist.test_suite import TestSuite
suite = TestSuite.from_file('./suites/posneg-random.suite')

suite.visual_summary_table()

Please wait as we prepare the table data...


SuiteSummarizer(stats={'npassed': 0, 'nfailed': 0, 'nfiltered': 0}, test_infos=[{'name': 'Test: MFT with vocab…

In [11]:
passed = 0
failed = 0
for test_name in suite.tests:
    table = suite.visual_summary_by_test(test_name)
    
    failed += table.stats['nfailed']    
    passed += table.stats['npassed']
    assert table.stats['nfailed'] + table.stats['npassed'] == len(table.filtered_testcases)

print(f"{failed = } ({(failed/(passed+failed))*100:.2f}%)")
print(f"{passed = } ({(passed/(passed+failed))*100:.2f}%)")
print(f"total = {passed+failed}")
print("templates:", len(suite.tests))

failed = 149 (15.52%)
passed = 811 (84.48%)
total = 960
templates: 18
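
The percentages above are plain ratios over the pooled test cases. As a minimal sketch, the hypothetical helper `failure_rate` below does the same aggregation over a list of per-test stats dictionaries with the `npassed` and `nfailed` keys used in the loop, and guards against an empty suite, which would otherwise raise `ZeroDivisionError`.

```python
def failure_rate(stats_list):
    """Pool per-test stats and return (failed, total, failure percentage)."""
    passed = sum(s["npassed"] for s in stats_list)
    failed = sum(s["nfailed"] for s in stats_list)
    total = passed + failed
    rate = 100 * failed / total if total else 0.0
    return failed, total, rate


# Pooled counts taken from the output above
failed, total, rate = failure_rate([{"npassed": 811, "nfailed": 149}])
print(f"failed = {failed} of {total} ({rate:.2f}%)")
```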


In [21]:
passed = 0
failed = 0
for test_name in suite.tests:
    table = suite.visual_summary_by_test(test_name)
    
    failed += table.stats['nfailed']    
    passed += table.stats['npassed']
    assert table.stats['nfailed'] + table.stats['npassed'] == len(table.filtered_testcases)

print(f"{failed = } ({(failed/(passed+failed))*100:.2f}%)")
print(f"{passed = } ({(passed/(passed+failed))*100:.2f}%)")
print(f"total = {passed+failed}")
print("templates:", len(suite.tests))

failed = 128 (21.30%)
passed = 473 (78.70%)
total = 601
templates: 18
