# Approach 1

This notebook uses Approach 1 to generate templates, focusing on positive and negative templates. One possible application is testing the *Vocabulary* linguistic capability with an **MFT** (Minimum Functionality Test).

The steps of this approach are:

1. Rank the words of the complete instances
2. Split the instances into sentences
3. Keep the sentences that contain at least one of the top-ranked words from step 1
4. Keep the sentences that contain relevant words (adjectives or verbs)
5. Classify the sentences using the *Oracle*
6. Keep the sentences that were classified unanimously
7. Replace the relevant words with masks
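
To make the sentence-level steps concrete, here is a minimal sketch of steps 2, 3 and 7 using only the standard library. It is not the `template_generator` implementation: the helper names (`split_sentences`, `contains_ranked_word`, `mask_words`), the naive splitting rule, and the hard-coded word lists are all assumptions made for illustration. Step 4 (part-of-speech filtering) is skipped, so the relevant words are simply passed in.

```python
import re

def split_sentences(text):
    """Step 2 (hypothetical helper): naive split after '.', '!' or '?'."""
    parts = re.split(r'(?<=[.!?])\s+', text.strip())
    return [p for p in parts if p]

def contains_ranked_word(sentence, ranked_words):
    """Step 3 (hypothetical helper): true if any top-ranked word occurs."""
    tokens = set(re.findall(r"[a-z']+", sentence.lower()))
    return any(w in tokens for w in ranked_words)

def mask_words(sentence, relevant_words, mask='{mask}'):
    """Step 7 (hypothetical helper): replace each relevant word with a mask."""
    if not relevant_words:
        return sentence
    pattern = r'\b(' + '|'.join(re.escape(w) for w in relevant_words) + r')\b'
    return re.sub(pattern, mask, sentence, flags=re.IGNORECASE)

review = "The soundtrack is great. It never lets you rest. Just one thing after another."
ranked_words = ['great', 'rest']   # assumed output of step 1
relevant_words = ['is', 'great']   # assumed output of step 4

sentences = split_sentences(review)
kept = [s for s in sentences if contains_ranked_word(s, ranked_words)]
masked = [mask_words(s, relevant_words) for s in kept]
print(masked)
```

The third sentence is dropped because it contains no ranked word, and only the first kept sentence changes because it is the only one containing the relevant words.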

In [1]:
%config Completer.use_jedi = False
import sys
sys.path.append('../')

## Loading the dataset, the target model, and the auxiliary models

In [2]:
import pandas as pd
pd.set_option('display.max_colwidth', None)

imdb_df = pd.read_csv('./data/imdb_sampled/data-1000samples.csv')
imdb_df.head(5)

Unnamed: 0,label,text,words
0,0,"Here's example number 87,358 of Hollywood's anti-Biblical bias, so typical of them.<br /><br />Early on, Ray Liotta's wife has did and women are being interviewed for the position of housekeeper. The first interviewee is an old-fashioned-looking (dress, mannerisms, speech) who immediately lays down here strict rules, stating that ""there will be two hours of Bible study ever day.""<br /><br />This is said, of course, to make it sound like reading the Bible is the worse punishment you could ever inflict on someone, especially a kid. Once again, the Bible is equated with stuffy, mean-spirited people. That woman, of course, is dismissed immediately.<br /><br />Naturally, the liberal black woman (Whoopi Goldberg - who else?) is the one who is hired and, voilà, saves the day! <br /><br />Yawn.",127
1,0,"I don't know about the real Cobb but I got the distinct impression that the filmmakers' aim was to try to soften his jagged edges and reputation, not give us a true portrait of the man himself. In the movie, besides a few racist remarks, he's shown to be just another hard-nosed, cantakerous old coot (he's so full of life!) with a heart of gold(more or less). This is also the worst acting I've seen T.L.Jones do(he brings nothing new or subtle to his stereotyped character). He just doesn't flesh out Cobb in a way that pulls me into the movie. Not for one minute did I forget that it was Tommy Lee Jones on the screen pretending to be Ty Cobb. Robert Wuhl didnt impress either. The ""comedic"" elements in this movie were just distracting and didnt ring true at all. A bloody waste of time, it is",149
2,0,"Reba is a very dumb show. You can predict pretty much anything that's about to happen. Barbra Jean is just too stupid. It's like she's not even a character. A show like this should at least have SOMEONE who resembles a real-life person. I guess Barbra Jean represents a retarded person. Keira or whatever her name is, Reba, Brock, they're all stupid! Keira is like the smartest person on the show, and she's still stupid. EVERYONE IS STUPID! That's my opinion on Reba. Since I have said all I can say about this show, I'll just take up the next few lines of text by saying what I am currently saying right now and do it until there's 10 lines. There. Reba gets 2/10.",124
3,1,"""One Crazy Summer"" is the funniest, craziest (not necessarily the best), movie I have ever seen.<br /><br />Just when one crazy scene is done, another emerges. It never lets you rest. Just one thing after another. The soundtrack is great. The songs are the right ones for the scenes.<br /><br />It is also a clean movie. Little that is dirty in it.<br /><br />Of course, it has the story of the guys you wouldn't trust with your lunch money, taking up a challenge, and winning over people with more resources. Who'd want to see it if they failed? There is a serious side, in that parents and children do not live up to each others' dreams. One should always have an open mind, and weigh all the options. This applies both to parents and children. In ""One Crazy Summer"", the parents are wrong. This is not always the case.",149
4,0,"""A young man, recently engaged to be married, is the victim of a traffic accident and dies as a result of his injuries. His father, desperate to revive his son, agrees to let a scientist friend try his experimental soul transmigration process to save him. After the young man returns to life, the father and fiancée notice a dark and violent change in the young man's behavior, leading them to believe something went horribly wrong in the revival process,"" according to the DVD sleeve's synopsis.<br /><br />At one point, Edward Norris (as Philip Bennett) is asked, ""What do you think this is, Boys Town?"" Mr. Norris should know, since he was in ""Boys Town"". ""The Man with Two Lives "" is more like ""Black Friday"" minus Karloff and Lugosi. You do the math. This film might have been a contender, with a re-worked script; it does feature an intriguing final act. After a tepid ""shoot out"", hang in for the drama to pick up with a well-played scene between star Norris and pursuing detective Addison Richards (as George Bradley).<br /><br />**** The Man with Two Lives (1942) Phil Rosen ~ Edward Norris, Eleanor Lawson, Addison Richards",196


In [3]:
import re
import numpy as np
from torch.nn.functional import softmax
from transformers import AutoTokenizer, AutoModelForSequenceClassification

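def pre_proccess(text):
    # Lowercase and replace punctuation and digits with spaces
    text = text.lower()
    # The stray empty group "()" after the character class was a no-op and is dropped;
    # parentheses are already covered by the "!-." range
    text = re.sub('["\',!-.:-@0-9/]', ' ', text)
    return text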

# Wrapper to adapt output format
class SentimentAnalisysModelWrapper:
    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer
        
    def __predict(self, text_input):
        text_preprocessed = pre_proccess(text_input)
        tokenized = self.tokenizer(text_preprocessed, padding=True, truncation=True, max_length=512, 
                                    add_special_tokens = True, return_tensors="pt")
        
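        tensor_logits = self.model(**tokenized)
        # Pass dim explicitly: the implicit-dimension form of softmax is deprecated and emits a warning
        prob = softmax(tensor_logits[0], dim=-1).detach().numpy()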
        pred = np.argmax(prob)
        
        return pred, prob
    
    def predict_label(self, text_inputs):
        return self.predict(text_inputs)[0]
        
    def predict_proba(self, text_inputs):
        return self.predict(text_inputs)[1]
        
    def predict(self, text_inputs):
        if isinstance(text_inputs, str):
            text_inputs = [text_inputs]
        
        preds = []
        probs = []

        for text_input in text_inputs:
            pred, prob = self.__predict(text_input)
            preds.append(pred)
            probs.append(prob[0])

        return np.array(preds), np.array(probs) # ([0, 1], [[0.99, 0.01], [0.03, 0.97]])

# Helper function to load and wrap a model from Hugging Face
def load_model(model_name):
    print(f'Loading model {model_name}...')
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    
    return SentimentAnalisysModelWrapper(model, tokenizer)

# Hugging Face hosted model names 
imdb_models = {
    'bert': 'textattack/bert-base-uncased-imdb', 
    'albert': 'textattack/albert-base-v2-imdb', 
    'distilbert': 'textattack/distilbert-base-uncased-imdb', 
    'roberta': 'textattack/roberta-base-imdb', 
    'xlnet': 'textattack/xlnet-base-cased-imdb'
}

In [4]:
m1 = load_model(imdb_models['albert'])
m2 = load_model(imdb_models['distilbert'])
m3 = load_model(imdb_models['roberta'])
m4 = load_model(imdb_models['xlnet'])

# Models to be used as oracle
models = [m1, m2, m3, m4]
# Target model
model = load_model(imdb_models['bert'])

Loading model textattack/albert-base-v2-imdb...
Loading model textattack/distilbert-base-uncased-imdb...
Loading model textattack/roberta-base-imdb...


Some weights of the model checkpoint at textattack/roberta-base-imdb were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.weight', 'roberta.pooler.dense.bias']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Loading model textattack/xlnet-base-cased-imdb...
Loading model textattack/bert-base-uncased-imdb...
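
The four auxiliary models act as the oracle: a sentence is kept only when all of them assign it the same label. The sketch below illustrates that unanimity rule on plain lists. The function name `unanimous_label` and the example predictions are hypothetical, and this is not necessarily how `PosNegTemplateGeneratorApp1` implements the check internally.

```python
def unanimous_label(predictions):
    """Return the shared label if every oracle model agrees, else None."""
    labels = set(int(p) for p in predictions)
    return labels.pop() if len(labels) == 1 else None

# Hypothetical predictions: one row per sentence, one column per oracle model
oracle_preds = [
    [1, 1, 1, 1],   # all four models agree on label 1
    [0, 1, 0, 0],   # one model disagrees, so the sentence is discarded
    [0, 0, 0, 0],   # all four models agree on label 0
]

kept = []
for index, row in enumerate(oracle_preds):
    label = unanimous_label(row)
    if label is not None:
        kept.append((index, label))

print(kept)  # [(0, 1), (2, 0)]
```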


# Generating the templates
The word-ranking method used in PosNegTemplateGenerator is the Replace-1 Score.
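
The notebook does not define Replace-1 Score. One plausible reading, assumed here, is that each word is scored by how much the model's output probability changes when that single word is replaced by a placeholder. The sketch below follows that reading with a toy stand-in classifier. `toy_positive_prob`, `replace_one_scores`, and the `[UNK]` placeholder are all hypothetical, and the real implementation in `template_generator` may differ.

```python
def toy_positive_prob(text):
    """Stand-in classifier (hypothetical): P(positive) from a tiny word list."""
    pos, neg = {'great', 'funniest'}, {'worst', 'stupid'}
    tokens = text.lower().split()
    score = sum(t in pos for t in tokens) - sum(t in neg for t in tokens)
    # Clamp the score to [-2, 2] so the result stays within [0, 1]
    return 0.5 + 0.25 * max(-2, min(2, score))

def replace_one_scores(text, predict_prob, placeholder='[UNK]'):
    """Score each word by the probability shift caused by replacing it alone."""
    tokens = text.split()
    base = predict_prob(text)
    scores = {}
    for i, tok in enumerate(tokens):
        perturbed = tokens[:i] + [placeholder] + tokens[i + 1:]
        # A repeated word keeps the score of its last occurrence in this sketch
        scores[tok] = abs(base - predict_prob(' '.join(perturbed)))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranking = replace_one_scores("the soundtrack is great", toy_positive_prob)
print(ranking[0])  # ('great', 0.25)
```

In practice `predict_prob` would wrap the target model, for example the positive-class column returned by `predict_proba`, which makes the ranking cost one forward pass per word.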

In [5]:
from template_generator.tasks.sentiment_analisys import PosNegTemplateGeneratorApp1

tg = PosNegTemplateGeneratorApp1(model, models)

### Initial number of instances: 5

In [6]:
# Sampling instances
np.random.seed(220)
n_instances = 5
df_sampled = imdb_df.sample(n_instances)

instances = [x for x in df_sampled['text'].values]

In [7]:
templates = tg.generate_templates(instances)

Ranking words using Replace-1 Score...


Converting texts to sentences...
:: 31 sentences were generated.
Filtering instances by contaning ranked words...
:: 13 sentences remaining.
Filtering instances by relevant words...
:: 1 sentences remaining.
Predicting inputs...
:: Sentence predictions done.


#### Execution time for 5 instances: 6m 8.6s

In [8]:
df = tg.to_dataframe()
df

Unnamed: 0,label,original_text,masked_text,template_text
0,1,"This story of a young white teacher who takes a position teaching poor black kids on an island in the Carolina's is a great advertisement for teaching , and for simply helping each other .Set in the early 60s , with the civil rights issues , Viet Nam and all that came with the 60s ,it is forgotten that the Peace Corps and many young people struck out to make a difference helping the unprivileged .Conrack with his open style of teaching is interested in these kids as people , and encourages an honest interaction in his class that scares the power's that be .The greatest part was that Jon Voight said they had a 20 year reunion and 18 of those kids became teachers !!","This story of a young white teacher who takes a position teaching poor black kids on an island in the Carolina 's {mask} a {mask} advertisement for teaching , and for simply helping each other .Set in the early 60s , with the civil rights issues , Viet Nam and all that came with the 60s , it is forgotten that the Peace Corps and many young people struck out to make a difference helping the unprivileged .Conrack with his open style of teaching is interested in these kids as people , and encourages an honest interaction in his class that scares the power 's that be .The greatest part was that Jon Voight said they had a 20 year reunion and 18 of those kids became teachers ! 
!","This story of a young white teacher who takes a position teaching poor black kids on an island in the Carolina 's {neg_verb} a {pos_adj} advertisement for teaching , and for simply helping each other .Set in the early 60s , with the civil rights issues , Viet Nam and all that came with the 60s , it is forgotten that the Peace Corps and many young people struck out to make a difference helping the unprivileged .Conrack with his open style of teaching is interested in these kids as people , and encourages an honest interaction in his class that scares the power 's that be .The greatest part was that Jon Voight said they had a 20 year reunion and 18 of those kids became teachers ! !"


In [9]:
tg.lexicons

{'pos_verb': [], 'neg_verb': ['is'], 'pos_adj': ['great'], 'neg_adj': []}
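
Each lexicon key matches a placeholder name in the template text, so a template expands into one sentence per combination of lexicon words. The sketch below illustrates that expansion with the standard library only. `expand_template` is a hypothetical helper, not part of CheckList or `template_generator`, and the second word in each lexicon is added purely for illustration. A placeholder that appears more than once is filled with the same word each time.

```python
import re
from itertools import product

def expand_template(template, lexicons):
    """Fill every distinct {slot} with every combination of lexicon words."""
    # Distinct slot names, in order of first appearance
    slots = list(dict.fromkeys(re.findall(r'\{(\w+)\}', template)))
    choices = [lexicons[slot] for slot in slots]
    return [template.format(**dict(zip(slots, combo)))
            for combo in product(*choices)]

# Illustrative lexicons: 'is' and 'great' come from the output above,
# the other two words are assumptions added to show the combinations
lexicons = {'neg_verb': ['is', 'lacks'], 'pos_adj': ['great', 'outstanding']}
expanded = expand_template('The soundtrack {neg_verb} {pos_adj} .', lexicons)

for sentence in expanded:
    print(sentence)
```

With two words per lexicon this yields 2 x 2 = 4 sentences, which is why a template with several placeholders and large lexicons can produce thousands of test cases.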

### Initial number of instances: 100

In [10]:
# Using all 100 instances
instances = [x for x in imdb_df['text'].values]

In [11]:
tg = PosNegTemplateGeneratorApp1(model, models)
templates = tg.generate_templates(instances)

Ranking words using Replace-1 Score...


Converting texts to sentences...
:: 7973 sentences were generated.
Filtering instances by contaning ranked words...
:: 3497 sentences remaining.
Filtering instances by relevant words...
:: 366 sentences remaining.
Predicting inputs...
:: Sentence predictions done.


#### Execution time for 100 instances: 123m 6.7s

In [12]:
df = tg.to_dataframe()
df

Unnamed: 0,label,original_text,masked_text,template_text
0,0,This is also the worst acting I've seen T.L.Jones do(he brings nothing new or subtle to his stereotyped character).,This is also the {mask} {mask} I 've seen T.L.Jones do ( he brings nothing new or subtle to his stereotyped character ) .,This is also the {neg_adj} {neg_verb} I 've seen T.L.Jones do ( he brings nothing new or subtle to his stereotyped character ) .
1,0,I guess Barbra Jean represents a retarded person.,I {mask} Barbra Jean represents a {mask} person .,I {neg_verb} Barbra Jean represents a {neg_adj} person .
2,1,The soundtrack is great.,The soundtrack {mask} {mask} .,The soundtrack {neg_verb} {pos_adj} .
3,0,That horrible last film was about gangsters killing a kid and then being haunted by a voodoo clown.,That {mask} last film was about gangsters killing a kid and then being haunted by a {mask} clown .,That {neg_adj} last film was about gangsters killing a kid and then being haunted by a {neg_adj} clown .
4,0,It would basically just be a B-movie that likes to show boobs.,It {mask} basically just {mask} a B-movie that likes to show boobs .,It {neg_verb} basically just {neg_verb} a B-movie that likes to show boobs .
...,...,...,...,...
361,1,Danson is outstanding as the title character and edward fox makes a wonderful villain.,Danson {mask} {mask} as the title character and edward fox makes a wonderful villain .,Danson {neg_verb} {pos_adj} as the title character and edward fox makes a wonderful villain .
362,1,It is different for everyone but for most of us it is love.,It {mask} {mask} for everyone but for most of us it is love .,It {neg_verb} {pos_adj} for everyone but for most of us it is love .
363,1,"The second live action outing for Asterix is far better than the glued together elements of ten different stories that was called the first film, instead staying fairly close to the original comic.<br /><br />In a nutshell, Queen Cleopatra has made a bet with Caeser to build a palace in Egypt to show that the Egyptians are a great people.","The second live action {mask} for Asterix {mask} far better than the glued together elements of ten different stories that was called the first film , instead staying fairly close to the original comic. < br / > < br / > In a nutshell , Queen Cleopatra has made a bet with Caeser to build a palace in Egypt to show that the Egyptians are a great people .","The second live action {neg_verb} for Asterix {neg_verb} far better than the glued together elements of ten different stories that was called the first film , instead staying fairly close to the original comic. < br / > < br / > In a nutshell , Queen Cleopatra has made a bet with Caeser to build a palace in Egypt to show that the Egyptians are a great people ."
364,0,"will this comment contain any spoilers?<br /><br />no, because i just did not understand this movie.","{mask} this comment contain any spoilers ? < br / > < br / > no , because i just did not {mask} this movie .","{pos_verb} this comment contain any spoilers ? < br / > < br / > no , because i just did not {neg_verb} this movie ."


In [13]:
tg.lexicons

{'pos_verb': ['animating',
  'felt',
  'present',
  'helps',
  'lives',
  'discuss',
  'find',
  'dancing',
  'sit',
  'watch',
  "'ll",
  'praying',
  'thru',
  'kept',
  'loved',
  'written',
  'editing',
  'resorts',
  'fans',
  'focuses',
  'realised',
  'noticed',
  'sets',
  'plays',
  'restrained',
  'am',
  'Combined',
  'mind',
  'makes',
  'feeling',
  'will',
  'moving.Watching',
  'enjoy',
  'delighted',
  'convincing',
  'balancing',
  'pleasing',
  'bears',
  'take',
  'realizes',
  'coming',
  'appreciate',
  'entertaining',
  'enjoyed',
  'laughing',
  'becoming',
  'being',
  'need',
  'may',
  'promoted',
  'surprised',
  'believe',
  'revolve',
  'deal',
  'Surprised',
  'know',
  'depicts',
  'inspire',
  'captured',
  'bind',
  'listing',
  'reckoned',
  'cleanse',
  "'m",
  'watched',
  'exploring',
  'taking',
  'recommend',
  'deserves'],
 'neg_verb': ['relegated',
  'goes',
  'convoluted',
  'Truly',
  'tries',
  'requires',
  'lacks',
  'dealt',
  'shown',
  '

# Using the templates generated by TemplateGenerator in CheckList

In [14]:
from checklist.editor import Editor
from checklist.test_suite import TestSuite
from checklist.test_types import MFT

In [15]:
lexicons = tg.lexicons
templates = tg.template_texts
masked = tg.masked_texts
labels = [sent.prediction.label for sent in tg.sentences]

editor = Editor()
editor.add_lexicon('pos_verb', lexicons['pos_verb'])
editor.add_lexicon('neg_verb', lexicons['neg_verb'])
editor.add_lexicon('pos_adj', lexicons['pos_adj'])
editor.add_lexicon('neg_adj', lexicons['neg_adj'])

suite = TestSuite()

In [16]:
# Build one MFT per generated template, numbering the tests from 1
for i, (template, label) in enumerate(zip(templates, labels), start=1):
    t = editor.template(template, remove_duplicates=True, labels=int(label))

    suite.add(MFT(
        data=t.data,
        labels=label,
        capability="Vocabullary",
        name=f"Test: MFT with vocabullary - template{i}",
        description="Checking if the model can handle vocabullary"))

In [17]:
suite.run(model.predict, overwrite=True)
suite.save('./suites/posneg-approach1.suite')

Running Test: MFT with vocabullary - template1
Predicting 17108 examples


Running Test: MFT with vocabullary - template2
Predicting 17108 examples
Running Test: MFT with vocabullary - template3
Predicting 16171 examples
Running Test: MFT with vocabullary - template4
Predicting 109 examples
Running Test: MFT with vocabullary - template5
Predicting 157 examples
Running Test: MFT with vocabullary - template6
Predicting 17108 examples
Running Test: MFT with vocabullary - template7
Predicting 10833 examples
Running Test: MFT with vocabullary - template8
Predicting 17108 examples
Running Test: MFT with vocabullary - template9
Predicting 157 examples
Running Test: MFT with vocabullary - template10
Predicting 17108 examples
Running Test: MFT with vocabullary - template11
Predicting 16171 examples
Running Test: MFT with vocabullary - template12
Predicting 157 examples
Running Test: MFT with vocabullary - template13
Predicting 17108 examples
Running Test: MFT with vocabullary - template14
Predicting 157 examples
Running Test: MFT with vocabullary - template15
Predicti

# Loading the test suite

In [18]:
from checklist.test_suite import TestSuite
suite = TestSuite.from_file('./suites/posneg-approach1.suite')

suite.visual_summary_table()

Please wait as we prepare the table data...


SuiteSummarizer(stats={'npassed': 0, 'nfailed': 0, 'nfiltered': 0}, test_infos=[{'name': 'Test: MFT with vocab…

In [19]:
table = suite.visual_summary_by_test('Test: MFT with vocabullary - template5')

failed = table.candidate_testcases
tests = table.filtered_testcases

# Print the first example of every test case that is not among the failed ones
for item in tests:
    if item not in failed:
        print(item['examples'][0]['new']['text'])

It relegated basically just relegated a B-movie that likes to show boobs .
It goes basically just goes a B-movie that likes to show boobs .
It convoluted basically just convoluted a B-movie that likes to show boobs .
It Truly basically just Truly a B-movie that likes to show boobs .
It tries basically just tries a B-movie that likes to show boobs .
It requires basically just requires a B-movie that likes to show boobs .
It lacks basically just lacks a B-movie that likes to show boobs .
It dealt basically just dealt a B-movie that likes to show boobs .
It shown basically just shown a B-movie that likes to show boobs .
It all. basically just all. a B-movie that likes to show boobs .
It tread basically just tread a B-movie that likes to show boobs .
It went basically just went a B-movie that likes to show boobs .
It teeter basically just teeter a B-movie that likes to show boobs .
It throws basically just throws a B-movie that likes to show boobs .
It thought basically just thought a B-mo