# Approach 2

This notebook uses approach 2 to generate templates, focusing on positive and negative templates. One possible application is testing the *Vocabulary* linguistic capability with an **MFT** (Minimum Functionality Test).

The steps of this approach are:

1. Rank the words of the full instances
2. Split the instances into sentences
3. Keep the sentences that contain at least one of the top-ranked words from step 1
4. Rank the words of each sentence
5. Keep the sentences with relevant words (adjectives or verbs)
6. Classify the sentences using the *Oracle*
7. Replace the relevant words with masks
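
The sentence-level steps (2, 3 and 7) can be sketched in plain Python. This is only an illustration under simplifying assumptions: the helper names `split_sentences`, `filter_by_words` and `mask_words` are hypothetical, the sentence splitter is a naive regex, and the part-of-speech filtering of step 5 is left out. It is not the code used by the template generator.

```python
import re

def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace
    return [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]

def filter_by_words(sentences, ranked_words):
    # Keep only the sentences that contain at least one top-ranked word
    wanted = {w.lower() for w in ranked_words}
    return [s for s in sentences
            if wanted & set(re.findall(r"[\w'-]+", s.lower()))]

def mask_words(sentence, relevant_words, mask="{mask}"):
    # Tokenize into words and punctuation, then swap relevant words for the mask
    wanted = {w.lower() for w in relevant_words}
    tokens = re.findall(r"[\w'-]+|[^\w\s]", sentence)
    return " ".join(mask if t.lower() in wanted else t for t in tokens)

review = "The cast is great. The plot drags a bit. I loved the ending!"
top_words = ["great", "loved"]  # hypothetical output of the word-ranking step

sentences = split_sentences(review)
kept = filter_by_words(sentences, top_words)
masked = [mask_words(s, top_words) for s in kept]
print(masked)
```

The space-separated tokens in the masked output mirror the `masked_text` column shown later in this notebook.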

In [1]:
%config Completer.use_jedi = False
import sys
sys.path.append('../')

## Loading the dataset, the target model, and the auxiliary models

In [2]:
import pandas as pd
pd.set_option('display.max_colwidth', None)

imdb_df = pd.read_csv('./data/imdb_sampled/data-100samples.csv')
imdb_df.head(5)

Unnamed: 0,label,text,words
0,1,"Christian Duguay directed this tidy little espionage thriller early in his career. It plays on TV pretty regularly, albeit with some terrific scenes of violence and sex unfortunately trimmed. I finally got around to seeing the theatrical version on a $3 tape from the local video store. Naval officer Aidan Quinn is recruited to impersonate the notorious Carlos the Jackal, and gets a little too caught up in the role. Donald Sutherland Ben Kingsley play Quinn's superiors, with Sutherland a true zealot and Kingsley as the more level-headed one. The first half of this fun flick shows Quinn being trained and indoctrinated. The second half has him out in the field, making love to the Jackal's woman and shooting it out with sundry enemies. The idea is to make the Jackal look like a turncoat to the Russians, and let them take care of the world's most notorious assassin. Things don't exactly play out as planned. At times, I almost expected the cast to break out laughing at some of the corny dialogue, but they all play it very straight. In the end, this is one terrific little thriller that deserves your attention. The Jackal's former mistress teaching the highly proper and very married Quinn to rough her up, lick blood from her face, and then go down on her, alone is worth the price of admission.",227
1,1,"New Yorkers contemporaneous with this film will recall how reflective of its time it is and how well cast and crew captured America, New York City of that era.<br /><br />Norman Wexler's script delineates the different worlds the various sub groupings live in and Avildsen's direction brings out phenomenal performances all around. Peter Boyle's prodigious talent is on display as never before nor since. Clearly it is the best character portrayal the always likable Dennis Patrick ever accomplished.<br /><br />What I will always remember about JOE is the feeling of having been in a virtual state of shock coming out of the theater. Knowing that what the screen portrayed was seething under the surface in neighborhoods throughout the five boroughs of the City of New York.<br /><br />This film needs to be remembered.",133
2,0,"I love oddball animation, I love a lot of Asian films, but I didn't love this particular product of Japan. The Fuccons are supposedly an American family (they're all mannequins) who have moved to Japan, and they're somewhat a 50's sitcom type family, with slightly more modern sensibilities at times. The DVD features several very short episodes (like less than 5 minutes each?) and I did not find it to be either funny or entertaining, not even in a weird way. I'm not sure what the appeal is of this. I did pick up on some satire here and there, gosh, who wouldn't, but satire is usually somewhat humorous, isn't it? And nothing I saw or heard rated even a little smirk. I picked this up used and it certainly SOUNDED appealing, but I guess either I'm missing the point or it's just plain LAME. The box even says it's Fuccon hilarious, right there on the front, but I beg to differ. 2 out of 10.",166
3,1,"I have seen this film probably a dozen times since it was originally released theatrically. Anyone who calls this movie trash or horrible just doesn't understand action films or recognize a good one. Perhaps to some the incidents and outcomes may seem far fetched, but in my opinion screenwriter Shane Black ( Lethal Weapon/ Kiss Kiss Bang Bang) crafted one of the most well thought out action adventures you will ever come across. Over the top or not this film flows like clockwork and the action just keeps coming. The final action sequence is one of the best I have ever seen in any film. The cast in this film crackles. Genna Davis gave a tremendous performance and its a damn shame there was never a ""LKG"" sequel. Samuel L. Jackson is hilarious as her sidekick Mitch a down on his luck private eye trying to help her discover her lost past and make a few bucks. If Baffles me how anyone could not like this film. It packs so many thrills and its so funny. The wisecracks in this film still make me laugh just as hard 10 years later. In my mind the first Matrix film and the Long Kiss Goodnight were easily 2 of the best and most original action flicks of the 90's. Incidentally Shane Black made a fortune when he sold this script. At the time it was the highest selling screenplay and its worth every penny. It's so sad that audiences never gave this movie a chance, cause they would have witnessed Renny Harlins best film and Genna Davis like you have never seen her before. Long live ""The Long Kiss Goodnight""!!",278
4,1,"Don't mind what this socially retarded person above says, this show is hilarious. It shows how a lot of single men are in a bar atmosphere, and also shows that women are not as gullible as men think they are. <br /><br />The contest aspect of the how is really cool and original. Its not the standard reality show that we are all used to now a days.<br /><br />Give it a chance everyone, we are only one episode in, we finally have some Canadian programming that isn't absolute crap. As Canadians what do we normally get, Bon Cop, Bad Cop, or Corner Gas. Come on people show that we are all not as prudish as the previous reviewer.<br /><br />Way to go Comedy Network, giving a new show a chance. The panel is funny and the contestants so far are pretty good.",143


In [3]:
import re
import numpy as np
from torch.nn.functional import softmax
from transformers import AutoTokenizer, AutoModelForSequenceClassification

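def pre_proccess(text):
    # Lowercase, then replace punctuation and digits with spaces
    text = text.lower()
    # The ranges !-. and :-@ already cover parentheses, so no extra group is needed
    text = re.sub('["\',!-.:-@0-9/]', ' ', text)
    return text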

# Wrapper to adapt output format
class SentimentAnalisysModelWrapper:
    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer
        
    def __predict(self, text_input):
        text_preprocessed = pre_proccess(text_input)
        tokenized = self.tokenizer(text_preprocessed, padding=True, truncation=True, max_length=512, 
                                    add_special_tokens = True, return_tensors="pt")
        
        tensor_logits = self.model(**tokenized)
        # Pass dim explicitly: softmax over the class axis (avoids the implicit-dim warning)
        prob = softmax(tensor_logits[0], dim=-1).detach().numpy()
        pred = np.argmax(prob)
        
        return pred, prob
    
    def predict_label(self, text_inputs):
        return self.predict(text_inputs)[0]
        
    def predict_proba(self, text_inputs):
        return self.predict(text_inputs)[1]
        
    def predict(self, text_inputs):
        if isinstance(text_inputs, str):
            text_inputs = [text_inputs]
        
        preds = []
        probs = []

        for text_input in text_inputs:
            pred, prob = self.__predict(text_input)
            preds.append(pred)
            probs.append(prob[0])

        return np.array(preds), np.array(probs) # ([0, 1], [[0.99, 0.01], [0.03, 0.97]])

# Auxiliary function to load and wrap a model from Hugging Face
def load_model(model_name):
    print(f'Loading model {model_name}...')
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    
    return SentimentAnalisysModelWrapper(model, tokenizer)

# Hugging Face hosted model names 
imdb_models = {
    'bert': 'textattack/bert-base-uncased-imdb', 
    'albert': 'textattack/albert-base-v2-imdb', 
    'distilbert': 'textattack/distilbert-base-uncased-imdb', 
    'roberta': 'textattack/roberta-base-imdb', 
    'xlnet': 'textattack/xlnet-base-cased-imdb'
}

In [4]:
m1 = load_model(imdb_models['albert'])
m2 = load_model(imdb_models['distilbert'])
m3 = load_model(imdb_models['roberta'])
m4 = load_model(imdb_models['xlnet'])

# Models to be used as oracle
models = [m1, m2, m3, m4]
# Target model
model = load_model(imdb_models['bert'])

Loading model textattack/albert-base-v2-imdb...
Loading model textattack/distilbert-base-uncased-imdb...
Loading model textattack/roberta-base-imdb...


Some weights of the model checkpoint at textattack/roberta-base-imdb were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.weight', 'roberta.pooler.dense.bias']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Loading model textattack/xlnet-base-cased-imdb...
Loading model textattack/bert-base-uncased-imdb...
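
The four auxiliary models act as the oracle that labels sentences in step 6. How the generator combines their predictions is not shown in this notebook, so the following is only a sketch that assumes a majority vote, with ties left unlabeled. `StubModel` and `oracle_labels` are hypothetical names. The stub only mimics the `predict_label` interface of the wrapper defined above.

```python
import numpy as np

class StubModel:
    """Hypothetical stand-in exposing the same predict_label interface as the wrapper."""
    def __init__(self, negative_words):
        self.negative_words = set(negative_words)

    def predict_label(self, text_inputs):
        if isinstance(text_inputs, str):
            text_inputs = [text_inputs]
        # Label 0 (negative) if any known negative word occurs, else 1 (positive)
        return np.array([0 if self.negative_words & set(t.lower().split()) else 1
                         for t in text_inputs])

def oracle_labels(models, sentences):
    """Return the majority label per sentence, or None when the vote is tied."""
    votes = np.stack([m.predict_label(sentences) for m in models])  # (n_models, n_sentences)
    labels = []
    for column in votes.T:
        counts = np.bincount(column, minlength=2)
        labels.append(None if counts[0] == counts[1] else int(np.argmax(counts)))
    return labels

stubs = [StubModel(["lame"]), StubModel(["lame", "dull"]), StubModel(["lame"])]
print(oracle_labels(stubs, ["this is lame", "a dull story", "great fun"]))
```

With an even number of oracle models, as in this notebook, ties are possible. Returning `None` makes it easy to discard sentences the oracle cannot agree on.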


# Generating the templates
The word-ranking method used by `PosNegTemplateGeneratorApp2` is the Replace-1 Score.
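
As a rough illustration of the idea, the sketch below assumes the usual definition of the Replace-1 Score: a word's score is the drop in the probability of the originally predicted class when that word alone is replaced by an out-of-vocabulary placeholder. The function `replace1_scores`, the `[UNK]` placeholder and the toy classifier are assumptions made for this example. The generator's actual implementation may differ.

```python
import numpy as np

def replace1_scores(text, predict_proba, placeholder="[UNK]"):
    """Rank words by how much replacing each one lowers the predicted-class probability."""
    words = text.split()
    base = np.asarray(predict_proba(text))
    label = int(np.argmax(base))  # class predicted for the unmodified text
    scores = []
    for i, word in enumerate(words):
        replaced = words[:i] + [placeholder] + words[i + 1:]
        prob = np.asarray(predict_proba(" ".join(replaced)))
        scores.append((word, float(base[label] - prob[label])))
    # Highest score first: the most influential words lead the ranking
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

def toy_predict_proba(text):
    # Hypothetical classifier: confident only when the word "great" is present
    pos = 0.9 if "great" in text.lower().split() else 0.4
    return np.array([1.0 - pos, pos])

ranking = replace1_scores("a great little movie", toy_predict_proba)
print(ranking[0][0])
```

Each word costs one extra forward pass, which is consistent with the long running times reported below for 100 instances.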

In [5]:
from template_generator.tasks.sentiment_analisys import PosNegTemplateGeneratorApp2

tg = PosNegTemplateGeneratorApp2(model, models)

### Initial number of instances: 5

In [6]:
# Sampling instances
np.random.seed(220)
n_instances = 5
df_sampled = imdb_df.sample(n_instances)

instances = [x for x in df_sampled['text'].values]

In [7]:
templates = tg.generate_templates(instances)

Ranking words using Replace-1 Score...


Converting texts to sentences...
:: 37 sentences were generated.
Filtering instances by contaning ranked words...
:: 16 sentences remaining.
Ranking words using Replace-1 Score...
:: Word ranking done.
Filtering instances by relevant words...
:: 4 sentences remaining.
Predicting inputs...
:: Sentence predictions done.


#### Execution time for 5 instances: 6m 19.8s
filipe: 2m 39.8s


In [8]:
df = tg.to_dataframe()
df

Unnamed: 0,label,original_text,masked_text,template_text
0,1,"Anyhow, this is a great study of a fascinating musician, woefully underknown, full of great stories, greater music, and it could have been 3 hours longer and I'd have loved it even more.","Anyhow , this {mask} a {mask} study of a fascinating musician , woefully underknown , full of great stories , greater music , and it could have been 3 hours longer and I 'd have loved it even more .","Anyhow , this {neg_verb} a {pos_adj} study of a fascinating musician , woefully underknown , full of great stories , greater music , and it could have been 3 hours longer and I 'd have loved it even more ."
1,1,"Saw it at the American Cinemateque Mods & Rockers Festival at the Aero Theatre in Santa Monica, where it played to a packed house.","Saw it at the {mask} Cinemateque Mods & Rockers Festival at the Aero Theatre in Santa Monica , where it played to a {mask} house .","Saw it at the {pos_adj} Cinemateque Mods & Rockers Festival at the Aero Theatre in Santa Monica , where it played to a {pos_verb} house ."
2,1,"Just forget the itty-bitty disappointments, like the fact that there were only adults in this movie based on a pup's point of view, because that's just 0.5% or less of the movie's wonderful effect on the viewer.","Just {mask} the itty-bitty disappointments , like the fact that there were only adults in this movie based on a pup 's point of view , because that 's just 0.5 % or less of the movie 's {mask} effect on the viewer .","Just {neg_verb} the itty-bitty disappointments , like the fact that there were only adults in this movie based on a pup 's point of view , because that 's just 0.5 % or less of the movie 's {pos_adj} effect on the viewer ."
3,1,Anyone who calls this movie trash or horrible just doesn't understand action films or recognize a good one.,Anyone who calls this movie trash or {mask} just does n't understand action films or {mask} a good one .,Anyone who calls this movie trash or {neg_adj} just does n't understand action films or {neg_verb} a good one .


In [9]:
tg.lexicons

{'pos_verb': ['packed'],
 'neg_verb': ['recognize', 'is', 'forget'],
 'pos_adj': ['great', 'American', 'wonderful'],
 'neg_adj': ['horrible']}

### Initial number of instances: 100

In [10]:
# Using all 100 instances
instances = [x for x in imdb_df['text'].values]

In [11]:
templates = tg.generate_templates(instances)

Ranking words using Replace-1 Score...


Converting texts to sentences...
:: 795 sentences were generated.
Filtering instances by contaning ranked words...
:: 346 sentences remaining.
Ranking words using Replace-1 Score...
:: Word ranking done.
Filtering instances by relevant words...
:: 35 sentences remaining.
Predicting inputs...
:: Sentence predictions done.


#### Execution time for 100 instances: 132m 12.5s
61m 2.8s

In [12]:
df = tg.to_dataframe()
df

Unnamed: 0,label,original_text,masked_text,template_text
0,1,"New Yorkers contemporaneous with this film will recall how reflective of its time it is and how well cast and crew captured America, New York City of that era.<br /><br />Norman Wexler's script delineates the different worlds the various sub groupings live in and Avildsen's direction brings out phenomenal performances all around.","New Yorkers contemporaneous with this film will recall how reflective of its time it is and how well cast and crew captured America , New York City of that era. < br / > < br / > Norman Wexler 's script delineates the different {mask} the various sub groupings live in and Avildsen 's direction brings out {mask} performances all around .","New Yorkers contemporaneous with this film will recall how reflective of its time it is and how well cast and crew captured America , New York City of that era. < br / > < br / > Norman Wexler 's script delineates the different {pos_verb} the various sub groupings live in and Avildsen 's direction brings out {pos_adj} performances all around ."
1,1,Anyone who calls this movie trash or horrible just doesn't understand action films or recognize a good one.,Anyone who calls this movie trash or {mask} just does n't understand action films or {mask} a good one .,Anyone who calls this movie trash or {neg_adj} just does n't understand action films or {neg_verb} a good one .
2,1,"Don't mind what this socially retarded person above says, this show is hilarious.","Do n't mind what this socially retarded person above {mask} , this show is {mask} .","Do n't mind what this socially retarded person above {neg_verb} , this show is {pos_adj} ."
3,0,"This is a like a championship sports team fielding all substitutes except one.<br /><br />Brynner is good, once again: fun to watch, fun to hear with that distinctive deep voice of his, but the story, not just the rest of the crew, is lame.","This is a like a championship sports team fielding all substitutes except one. < br / > < br / > Brynner is good , once again : fun to watch , fun to hear with that distinctive deep voice of his , but the story , not just the rest of the crew , {mask} {mask} .","This is a like a championship sports team fielding all substitutes except one. < br / > < br / > Brynner is good , once again : fun to watch , fun to hear with that distinctive deep voice of his , but the story , not just the rest of the crew , {neg_verb} {neg_adj} ."
4,1,"Naturally, they find no gold down there but one very hungry monster that slithers along in search of prey.<br /><br />While I have to be honest and admit I found it dull at first (I personally prefer the thematically similar ""The Boogens""), it actually grew on me as it went along.","Naturally , they find no gold down there but one very hungry monster that slithers along in search of prey. < br / > < br / > While I have to be honest and admit I {mask} it {mask} at first ( I personally prefer the thematically similar `` The Boogens '' ) , it actually grew on me as it went along .","Naturally , they find no gold down there but one very hungry monster that slithers along in search of prey. < br / > < br / > While I have to be honest and admit I {pos_verb} it {neg_verb} at first ( I personally prefer the thematically similar `` The Boogens '' ) , it actually grew on me as it went along ."
5,0,"The writer played by effects man Mark Sawicki wears thin quickly.<br /><br />It begins in a comfortably predictable enough way, with a nighttime set piece in which two victims are claimed to get things off to an acceptable start.","The writer played by effects man Mark Sawicki wears thin quickly. < br / > < br / > It {mask} in a comfortably {mask} enough way , with a nighttime set piece in which two victims are claimed to get things off to an acceptable start .","The writer played by effects man Mark Sawicki wears thin quickly. < br / > < br / > It {neg_verb} in a comfortably {neg_adj} enough way , with a nighttime set piece in which two victims are claimed to get things off to an acceptable start ."
6,1,"Just forget the itty-bitty disappointments, like the fact that there were only adults in this movie based on a pup's point of view, because that's just 0.5% or less of the movie's wonderful effect on the viewer.","Just {mask} the itty-bitty disappointments , like the fact that there were only adults in this movie based on a pup 's point of view , because that 's just 0.5 % or less of the movie 's {mask} effect on the viewer .","Just {neg_verb} the itty-bitty disappointments , like the fact that there were only adults in this movie based on a pup 's point of view , because that 's just 0.5 % or less of the movie 's {pos_adj} effect on the viewer ."
7,0,"Not everything goes to plan, and the movie is about them winging it.","Not everything goes to {mask} , and the movie is about them {mask} it .","Not everything goes to {neg_verb} , and the movie is about them {neg_verb} it ."
8,1,Perhaps a great white?,Perhaps a {mask} {mask} ?,Perhaps a {pos_adj} {neg_adj} ?
9,1,"It might as well been called ""National Lampoon's Sexy-N-Loose.""",It might as well been {mask} `` National Lampoon 's {mask} . '',It might as well been {neg_verb} `` National Lampoon 's {pos_adj} . ''


In [13]:
tg.lexicons

{'pos_verb': ['believe',
  'finds',
  'loved',
  'kept',
  'watched',
  'felt',
  'packed',
  'drawing',
  'found',
  'worlds',
  'fascinating'],
 'neg_verb': ['was',
  'wastes',
  'reading',
  'winging',
  'Wish',
  'plan',
  'is',
  'forget',
  'consists',
  'can',
  'saw',
  'have',
  'looking',
  'says',
  'wonder',
  'dull',
  'must',
  'recognize',
  'called',
  'could',
  'begins',
  'saying',
  'overwrought'],
 'pos_adj': ['magnificent',
  'hilarious',
  'American',
  'phenomenal',
  'great',
  'tale',
  'documentary',
  'goofy',
  'sharp',
  'beautiful',
  'Sexy-N-Loose',
  'wonderful'],
 'neg_adj': ['white',
  'unfunny',
  'least',
  'stereotypic',
  'predictable',
  'horrible',
  'ridiculous',
  'low',
  'incoherent',
  'worse',
  'lame']}

# Using the templates generated by TemplateGenerator in CheckList

In [14]:
from checklist.editor import Editor
from checklist.test_suite import TestSuite
from checklist.test_types import MFT

In [15]:
lexicons = tg.lexicons
templates = tg.template_texts
masked = tg.masked_texts
labels = [sent.prediction.label for sent in tg.sentences]

editor = Editor()
editor.add_lexicon('pos_verb', lexicons['pos_verb'])
editor.add_lexicon('neg_verb', lexicons['neg_verb'])
editor.add_lexicon('pos_adj', lexicons['pos_adj'])
editor.add_lexicon('neg_adj', lexicons['neg_adj'])

suite = TestSuite()

In [16]:
# Build one MFT per template; every filled example inherits the label predicted for its sentence
for i, (template, label) in enumerate(zip(templates, labels), start=1):
    t = editor.template(template, remove_duplicates=True, labels=int(label))

    suite.add(MFT(
        data=t.data,
        labels=label,
        capability="Vocabullary", 
        name=f"Test: MFT with vocabullary - template{i}",
        description="Checking if the model can handle vocabullary"))
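
The number of examples each test produces follows from the lexicon sizes: a template is filled with every combination of words for its distinct placeholders. With the lexicons above (11 `pos_verb`, 23 `neg_verb`, 12 `pos_adj`, 11 `neg_adj`), a template using `{neg_verb}` and `{pos_adj}` yields 23 × 12 = 276 examples. The sketch below reproduces that expansion without CheckList, using a hypothetical helper `expand_template` and toy lexicons. It illustrates the counting, not how `Editor.template` is implemented.

```python
from itertools import product
from string import Formatter

def expand_template(template, lexicons):
    # Distinct placeholder names, in order of first appearance
    keys = list(dict.fromkeys(
        name for _, name, _, _ in Formatter().parse(template) if name))
    # Cartesian product over the lexicons of the placeholders used
    combos = product(*(lexicons[k] for k in keys))
    return [template.format(**dict(zip(keys, combo))) for combo in combos]

toy_lexicons = {
    'neg_verb': ['is', 'forget'],
    'pos_adj': ['great', 'wonderful', 'sharp'],
}
examples = expand_template('this {neg_verb} a {pos_adj} study', toy_lexicons)
print(len(examples))  # 2 * 3 = 6
print(examples[0])
```

A placeholder that appears twice in the same template counts once, since both occurrences receive the same word.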

In [17]:
suite.run(model.predict, overwrite=True)
suite.save('./suites/posneg-approach2.suite')

Running Test: MFT with vocabullary - template1
Predicting 132 examples


Running Test: MFT with vocabullary - template2
Predicting 253 examples
Running Test: MFT with vocabullary - template3
Predicting 276 examples
Running Test: MFT with vocabullary - template4
Predicting 253 examples
Running Test: MFT with vocabullary - template5
Predicting 253 examples
Running Test: MFT with vocabullary - template6
Predicting 253 examples
Running Test: MFT with vocabullary - template7
Predicting 276 examples
Running Test: MFT with vocabullary - template8
Predicting 23 examples
Running Test: MFT with vocabullary - template9
Predicting 132 examples
Running Test: MFT with vocabullary - template10
Predicting 276 examples
Running Test: MFT with vocabullary - template11
Predicting 253 examples
Running Test: MFT with vocabullary - template12
Predicting 253 examples
Running Test: MFT with vocabullary - template13
Predicting 12 examples
Running Test: MFT with vocabullary - template14
Predicting 253 examples
Running Test: MFT with vocabullary - template15
Predicting 23 examples
Run

# Loading the test suite

In [18]:
from checklist.test_suite import TestSuite
suite = TestSuite.from_file('./suites/posneg-approach2.suite')

# suite.visual_summary_table()

In [19]:
table = suite.visual_summary_by_test('Test: MFT with vocabullary - template1')

for item in table.candidate_testcases:
    print(item['examples'][0]['new']['text'])

In [20]:
passed = 0
failed = 0
for i in range(1, 36):
    table = suite.visual_summary_by_test(f'Test: MFT with vocabullary - template{i}')
    failed = failed + len(table.candidate_testcases)    
    passed = passed + len(table.filtered_testcases)

In [21]:
print(failed, passed, passed + failed)

2880 6197 9077
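
Taking the counts printed above at face value, the overall failure rate follows directly. A short worked calculation:

```python
failed, passed = 2880, 6197  # counts printed by the previous cell
total = failed + passed
failure_rate = failed / total
print(f"{failure_rate:.2%} of {total} examples failed")
```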
