# Approach 4

This notebook uses Approach 4 to generate templates, focusing on positive and negative templates. One possible application is testing the "Vocabulary" linguistic capability with an MFT (Minimum Functionality Test).

The steps of this approach are:

1. Classify the instances using one or more models
2. Keep only the instances classified unanimously
3. Split each instance into sentences
4. Classify the sentences using one or more models to help label them
5. Keep only the sentences classified unanimously
6. Keep only the sentences with high prediction confidence
7. Rank the words of each sentence
8. Keep only the sentences with relevant words
9. Replace the relevant words with masks
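The two filtering ideas in the steps above (unanimous agreement and high confidence) can be sketched with NumPy. This is only an illustration: the helper names `filter_unanimous` and `filter_confident` are hypothetical, and requiring every model to pass the confidence threshold is an assumption, not necessarily what the template generator does internally.

```python
import numpy as np

def filter_unanimous(labels_per_model):
    """Return indices of items on which every model predicts the same label.

    labels_per_model: array-like of shape (n_models, n_items).
    """
    labels = np.asarray(labels_per_model)
    agree = (labels == labels[0]).all(axis=0)
    return np.flatnonzero(agree)

def filter_confident(probs_per_model, min_score=0.80):
    """Return indices of items whose top-class probability reaches
    min_score for every model (assumption: all models must be confident).

    probs_per_model: array-like of shape (n_models, n_items, n_classes).
    """
    probs = np.asarray(probs_per_model)
    top = probs.max(axis=2)  # shape (n_models, n_items)
    return np.flatnonzero((top >= min_score).all(axis=0))

# Toy labels from three hypothetical models on four texts
labels = [[1, 0, 1, 0],
          [1, 0, 0, 0],
          [1, 0, 1, 0]]
unanimous = filter_unanimous(labels)

# Toy probabilities from two hypothetical models on two texts
probs = [[[0.10, 0.90], [0.95, 0.05]],
         [[0.30, 0.70], [0.85, 0.15]]]
confident = filter_confident(probs, min_score=0.80)

print(unanimous)  # [0 1 3]
print(confident)  # [1]
```

The third text is dropped by the first filter because one model disagrees, and the first text is dropped by the second filter because one model's top probability (0.70) is below the threshold.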

In [5]:
%config Completer.use_jedi = False
import sys
sys.path.append('../')

## Loading the dataset, the target model, and the auxiliary models

In [6]:
import pandas as pd
pd.set_option('display.max_colwidth', None)

movie_reviews_rt_df = pd.read_csv('./data/data-rt-100samples.csv')
movie_reviews_rt_df.head(5)

| Unnamed: 0 | label | text | words |
|---|---|---|---|
| 0 | 1 | allen's underestimated charm delivers more goodies than lumps of coal . | 11 |
| 1 | 0 | skip the film and buy the philip glass soundtrack cd . | 11 |
| 2 | 0 | involving at times but lapses quite casually into the absurd . | 11 |
| 3 | 0 | while hoffman's performance is great the subject matter goes nowhere . | 11 |
| 4 | 1 | a flick about our infantilized culture that isn't entirely infantile . | 11 |


In [7]:
import re
import numpy as np
from torch.nn.functional import softmax
from transformers import AutoTokenizer, AutoModelForSequenceClassification

def pre_proccess(text):
    text = text.lower()
    # Replace punctuation and digits with spaces (the stray empty group "()" was removed)
    text = re.sub('["\',!-.:-@0-9/]', ' ', text)
    return text

# Wrapper to adapt output format
class SentimentAnalisysModelWrapper:
    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer
        
    def __predict(self, text_input):
        text_preprocessed = pre_proccess(text_input)
        tokenized = self.tokenizer(text_preprocessed, padding=True, truncation=True, max_length=512, 
                                    add_special_tokens = True, return_tensors="pt")
        
        tensor_logits = self.model(**tokenized)
        # Pass dim explicitly: the implicit dimension choice for softmax is deprecated
        prob = softmax(tensor_logits[0], dim=-1).detach().numpy()
        pred = np.argmax(prob)
        
        return pred, prob
    
    def predict_label(self, text_inputs):
        return self.predict(text_inputs)[0]
        
    def predict_proba(self, text_inputs):
        return self.predict(text_inputs)[1]
        
    def predict(self, text_inputs):
        if isinstance(text_inputs, str):
            text_inputs = [text_inputs]
        
        preds = []
        probs = []

        for text_input in text_inputs:
            pred, prob = self.__predict(text_input)
            preds.append(pred)
            probs.append(prob[0])

        return np.array(preds), np.array(probs) # ([0, 1], [[0.99, 0.01], [0.03, 0.97]])

# Helper function to load and wrap a model from Hugging Face
def load_model(model_name):
    print(f'Loading model {model_name}...')
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    
    return SentimentAnalisysModelWrapper(model, tokenizer)

# Hugging Face hosted model names 
movie_reviews_models = {
    'bert': 'textattack/bert-base-uncased-rotten-tomatoes', 
    'albert': 'textattack/albert-base-v2-rotten-tomatoes', 
    'distilbert': 'textattack/distilbert-base-uncased-rotten-tomatoes', 
    'roberta': 'textattack/roberta-base-rotten-tomatoes', 
    'xlnet': 'textattack/xlnet-base-cased-rotten-tomatoes'
}
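
The wrapper's `predict` method returns a pair of arrays: predicted labels and per-class probabilities. A minimal NumPy sketch of that conversion from logits is shown here, using made-up logit values and a hypothetical helper `softmax_np` that is not part of the notebook:

```python
import numpy as np

def softmax_np(logits):
    """Numerically stable softmax along the last axis."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Made-up logits for two texts and two classes (negative, positive)
logits = np.array([[2.0, -1.0],
                   [-0.5, 3.0]])
probs = softmax_np(logits)
preds = probs.argmax(axis=-1)

print(preds)  # [0 1]
```

Each row of `probs` sums to 1, and the predicted label is the index of the largest probability in that row.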

In [9]:
m1 = load_model(movie_reviews_models['albert'])
m2 = load_model(movie_reviews_models['distilbert'])
m3 = load_model(movie_reviews_models['roberta'])
m4 = load_model(movie_reviews_models['xlnet'])

# Models to be used as oracle
models = [m1, m2, m3, m4]
# Target model
model = load_model(movie_reviews_models['bert'])

Loading model textattack/albert-base-v2-rotten-tomatoes...
Loading model textattack/distilbert-base-uncased-rotten-tomatoes...
Loading model textattack/roberta-base-rotten-tomatoes...


Some weights of the model checkpoint at textattack/roberta-base-rotten-tomatoes were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.weight', 'roberta.pooler.dense.bias']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Loading model textattack/xlnet-base-cased-rotten-tomatoes...
Loading model textattack/bert-base-uncased-rotten-tomatoes...


# Generating the templates
The word-ranking method used by PosNegTemplateGenerator is the Replace-1 Score.
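
The notebook does not show how the Replace-1 Score is computed. One plausible reading, sketched here purely as an assumption, is that each word is replaced in turn by a neutral placeholder and scored by how much the probability of the originally predicted class drops. The function `replace_one_scores`, the `[UNK]` placeholder, and the toy classifier are all hypothetical.

```python
import numpy as np

def replace_one_scores(text, predict_proba, placeholder="[UNK]"):
    """Rank words by the drop in predicted-class probability when
    each word alone is replaced by the placeholder (assumed definition)."""
    words = text.split()
    base = predict_proba(text)
    target = int(np.argmax(base))
    scores = []
    for i, word in enumerate(words):
        perturbed = words[:i] + [placeholder] + words[i + 1:]
        prob = predict_proba(" ".join(perturbed))
        scores.append((word, float(base[target] - prob[target])))
    # Largest drop first: these are the most relevant words
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

def toy_predict_proba(text):
    """Hypothetical stand-in for a classifier: returns [P(neg), P(pos)]."""
    weights = {"dull": -2.0, "bad": -1.5, "charming": 2.0}
    logit = sum(weights.get(w, 0.0) for w in text.split())
    pos = 1.0 / (1.0 + np.exp(-logit))
    return np.array([1.0 - pos, pos])

ranking = replace_one_scores("a long dull procession", toy_predict_proba)
print(ranking[0][0])  # dull
```

With the toy classifier only "dull" carries sentiment, so replacing it is the only change that moves the prediction, and it ranks first.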

In [10]:
from template_generator.tasks.sentiment_analisys import PosNegTemplateGeneratorApp4

tg = PosNegTemplateGeneratorApp4(model, models)

### Initial number of instances: 5

In [11]:
# Sampling instances
np.random.seed(220)
n_instances = 5
df_sampled = movie_reviews_rt_df.sample(n_instances)

instances = [x for x in df_sampled['text'].values]
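
The cell above seeds NumPy's global generator before sampling. An alternative sketch, assuming a toy dataframe in place of `movie_reviews_rt_df`, passes the seed through the `random_state` argument of `DataFrame.sample`, so the result does not depend on global state. It is not claimed to select the same rows as the original cell.

```python
import pandas as pd

# Hypothetical stand-in for the reviews dataframe
toy_df = pd.DataFrame({"text": [f"review {i}" for i in range(20)]})

# The same random_state yields the same sample on every call
first = toy_df.sample(5, random_state=220)["text"].tolist()
second = toy_df.sample(5, random_state=220)["text"].tolist()

print(first == second)  # True
```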

In [12]:
templates = tg.generate_templates(instances, n_masks=2, range_words=5, min_classification_score=0.80)

Predicting inputs...
:: Instance predictions done.
Filtering instances classified unanimously...
:: 2 instances remaining.
Converting texts to sentences...
:: 2 sentences were generated.
Predicting inputs...
:: Sentence predictions done.
Filtering instances classified unanimously...
:: 2 sentences remaining.
Filtering instances by classification score greater than 0.8
:: 2 sentences remaining.
Ranking words using Replace-1 Score...
:: Word ranking done.
Filtering instances by relevant words...
:: 1 sentences remaining.
Filtering instances by relevant words classification score greater than 0.8
:: 0 sentences remaining.


#### Execution time for 5 instances: 9.3s

In [13]:
df = tg.to_dataframe()
df

| Unnamed: 0 | label | original_text | masked_text | template_text |
|---|---|---|---|---|


In [14]:
tg.lexicons

{'pos_verb': [], 'neg_verb': [], 'pos_adj': [], 'neg_adj': []}

### Initial number of instances: 100

In [15]:
# Using all 100 instances
instances = [x for x in movie_reviews_rt_df['text'].values]

In [16]:
templates = tg.generate_templates(instances, n_masks=2, range_words=5, min_classification_score=0.80)

Predicting inputs...
:: Instance predictions done.
Filtering instances classified unanimously...
:: 84 instances remaining.
Converting texts to sentences...
:: 109 sentences were generated.
Predicting inputs...
:: Sentence predictions done.
Filtering instances classified unanimously...
:: 90 sentences remaining.
Filtering instances by classification score greater than 0.8
:: 83 sentences remaining.
Ranking words using Replace-1 Score...
:: Word ranking done.
Filtering instances by relevant words...
:: 46 sentences remaining.
Filtering instances by relevant words classification score greater than 0.8
:: 34 sentences remaining.
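
To see which filter removes the most sentences, the counts printed in the log above can be turned into per-stage retention rates. This is a small arithmetic sketch; the stage names are shorthand chosen here, and only the numbers come from the log.

```python
# Sentence counts copied from the log above, in order
sentence_stages = [
    ("sentences generated", 109),
    ("classified unanimously", 90),
    ("classification score > 0.8", 83),
    ("contain relevant words", 46),
    ("relevant-word score > 0.8", 34),
]

retention = []
for (_, before), (name, after) in zip(sentence_stages, sentence_stages[1:]):
    retention.append((name, after / before))
    print(f"{name}: {after}/{before} kept ({after / before:.1%})")

overall = sentence_stages[-1][1] / sentence_stages[0][1]
print(f"overall: {overall:.1%}")  # overall: 31.2%
```

The relevant-words filter is the most selective stage in this run, keeping 46 of 83 sentences.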


#### Execution time for 100 instances: 5min 0.1s

In [17]:
df = tg.to_dataframe()
df

| Unnamed: 0 | label | original_text | masked_text | template_text |
|---|---|---|---|---|
| 0 | 1 | intelligent caustic take on a great writer and dubious human being . | {mask} caustic take on a great writer and dubious {mask} being . | {pos_adj} caustic take on a great writer and dubious {neg_adj} being . |
| 1 | 0 | it's a bad sign in a thriller when you instantly know whodunit . | it 's a {mask} sign in a thriller when you instantly {mask} whodunit . | it 's a {neg_adj} sign in a thriller when you instantly {pos_verb} whodunit . |
| 2 | 0 | falsehoods pile up undermining the movie's reality and stifling its creator's comic voice . | falsehoods {mask} up undermining the movie 's reality and stifling its creator 's {mask} voice . | falsehoods {neg_verb} up undermining the movie 's reality and stifling its creator 's {neg_adj} voice . |
| 3 | 1 | this charming thought-provoking new york fest of life and love has its rewards . | this {mask} {mask} new york fest of life and love has its rewards . | this {pos_verb} {pos_adj} new york fest of life and love has its rewards . |
| 4 | 0 | a long dull procession of despair set to cello music culled from a minimalist funeral . | a long {mask} procession of despair set to cello music culled from a {mask} funeral . | a long {neg_adj} procession of despair set to cello music culled from a {neg_adj} funeral . |
| 5 | 1 | awesome creatures breathtaking scenery and epic battle scenes add up to another 'spectacular spectacle . ' | {mask} creatures {mask} scenery and epic battle scenes add up to another 'spectacular spectacle . ' | {pos_adj} creatures {pos_verb} scenery and epic battle scenes add up to another 'spectacular spectacle . ' |
| 6 | 1 | a fascinating dark thriller that keeps you hooked on the delicious pulpiness of its lurid fiction . | a fascinating dark thriller that {mask} you hooked on the delicious pulpiness of its {mask} fiction . | a fascinating dark thriller that {pos_verb} you hooked on the delicious pulpiness of its {neg_adj} fiction . |
| 7 | 0 | could the country bears really be as bad as its trailers ? | {mask} the country bears really be as {mask} as its trailers ? | {neg_verb} the country bears really be as {neg_adj} as its trailers ? |
| 8 | 0 | the movie has generic virtues and despite a lot of involved talent seems done by the numbers . | the movie has {mask} virtues and despite a lot of involved talent {mask} done by the numbers . | the movie has {neg_adj} virtues and despite a lot of involved talent {neg_verb} done by the numbers . |
| 9 | 0 | doesn't get the job done running off the limited chemistry created by ralph fiennes and jennifer lopez . | does n't get the job done {mask} off the {mask} chemistry created by ralph fiennes and jennifer lopez . | does n't get the job done {neg_verb} off the {pos_adj} chemistry created by ralph fiennes and jennifer lopez . |
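
The relationship between the `masked_text` and `template_text` columns can be illustrated with a short sketch. The helper `build_templates` is hypothetical, and the mapping from token position to lexicon key is supplied by hand here; in the notebook it comes from the template generator.

```python
def build_templates(tokens, relevant):
    """Build the masked and template strings for one sentence.

    relevant maps a token index to a lexicon key such as 'neg_adj'.
    """
    masked = ["{mask}" if i in relevant else tok
              for i, tok in enumerate(tokens)]
    template = ["{" + relevant[i] + "}" if i in relevant else tok
                for i, tok in enumerate(tokens)]
    return " ".join(masked), " ".join(template)

tokens = "could the country bears really be as bad as its trailers ?".split()
masked, template = build_templates(tokens, {0: "neg_verb", 7: "neg_adj"})

print(masked)    # {mask} the country bears really be as {mask} as its trailers ?
print(template)  # {neg_verb} the country bears really be as {neg_adj} as its trailers ?
```

The two strings reproduce row 7 of the table: the same positions are masked in both, and the template version records which lexicon each position draws from.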


In [18]:
tg.lexicons

{'pos_verb': ['saved',
  'eat',
  'keeps',
  'breathtaking',
  'know',
  'moviemaking',
  'heartbreaking',
  'mesmerize',
  'inspiring',
  'explore',
  'looking',
  'charming',
  'is'],
 'neg_verb': ['otherwise',
  'pile',
  'seeks',
  'shows',
  'should',
  'have',
  'has',
  'thinks',
  'cliched',
  'does',
  'seems',
  'could',
  'chokes',
  'lost',
  'running',
  'depends'],
 'pos_adj': ['riveting',
  'in-depth',
  'unflinching',
  'nincompoop',
  'unselfconscious',
  'astonish',
  'grand-scale',
  'thought-provoking',
  'much',
  'powerful',
  'intelligent',
  'deceptively',
  'limited',
  'awesome',
  'pleasant',
  'gorgeous'],
 'neg_adj': ['little',
  'minimalist',
  'pessimistic',
  'unbearable',
  'drab',
  'bad',
  'comic',
  'human',
  'difficult',
  'self-indulgent',
  'vapid',
  'undone',
  'lulling',
  'consuming',
  'lurid',
  'dull',
  'generic',
  'ridiculous',
  'pompous']}

## Checklist

In [19]:
import checklist
from checklist.editor import Editor
from checklist.test_suite import TestSuite
from checklist.test_types import MFT

In [20]:
lexicons = tg.lexicons
templates = tg.template_texts
labels = [sent.prediction.label for sent in tg.sentences]

editor = Editor()
editor.add_lexicon('pos_verb', lexicons['pos_verb'])
editor.add_lexicon('neg_verb', lexicons['neg_verb'])
editor.add_lexicon('pos_adj', lexicons['pos_adj'])
editor.add_lexicon('neg_adj', lexicons['neg_adj'])

suite = TestSuite()
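
Once the lexicons are registered, each template is expanded into concrete sentences by filling its placeholders with lexicon words. The sketch below mimics that idea with the standard library only. It is not CheckList's implementation: `expand_template` and the toy lexicons are hypothetical, and letting a repeated placeholder share one value is an assumption that has not been checked against CheckList here.

```python
import itertools
import re

def expand_template(template, lexicons):
    """Fill every distinct {key} placeholder with every combination of words."""
    # Distinct placeholder names, in order of first appearance
    keys = list(dict.fromkeys(re.findall(r"\{(\w+)\}", template)))
    filled = []
    for combo in itertools.product(*(lexicons[k] for k in keys)):
        filled.append(template.format(**dict(zip(keys, combo))))
    return filled

toy_lexicons = {
    "neg_adj": ["dull", "drab"],
    "pos_adj": ["riveting", "gorgeous", "awesome"],
}
examples = expand_template(
    "a long {neg_adj} procession with {pos_adj} scenery .", toy_lexicons
)

print(len(examples))  # 6
print(examples[0])    # a long dull procession with riveting scenery .
```

The number of generated sentences is the product of the lexicon sizes involved, which is why a few dozen templates can yield thousands of test cases.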

In [21]:
data = []
lbl = []
for template, label in zip(templates, labels):
    t = editor.template(template, remove_duplicates=True, labels=int(label))
    data.extend(t.data)
    lbl.extend(t.labels)

suite.add(MFT(
    data=data,
    labels=lbl,
    capability="Vocabulary",
    name="Template Generator - Vocabulary in MFT",
    description="Testing the model for vocabulary capability"
))

In [22]:
suite.run(model.predict, overwrite=True)

Running Template Generator - Vocabulary in MFT
Predicting 6655 examples

In [23]:
suite.summary()

Vocabulary

Template Generator - Vocabulary in MFT
Test cases:      6655
Fails (rate):    615 (9.2%)

Example fails:
0.3 this is a movie that is what it is : a limited distraction a friday night diversion an excuse to eat popcorn .
----
1.0 it 's a consuming sign in a thriller when you instantly inspiring whodunit .
----
1.0 does n't get the job done have off the gorgeous chemistry created by ralph fiennes and jennifer lopez .
----
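
The failure rate in the summary is the share of test cases whose predicted label differs from the expected one. A minimal sketch of that arithmetic follows, assuming that definition of a failed MFT case; `failure_rate` is a hypothetical helper, and the toy labels are made up.

```python
import numpy as np

def failure_rate(predicted, expected):
    """Fraction of cases where the prediction differs from the expected label."""
    predicted = np.asarray(predicted)
    expected = np.asarray(expected)
    return float((predicted != expected).mean())

# Toy example: one miss out of four cases
toy_rate = failure_rate([1, 0, 0, 1], [1, 0, 1, 1])
print(f"{toy_rate:.1%}")  # 25.0%

# Counts reported in the summary above
reported_rate = 615 / 6655
print(f"{reported_rate:.1%}")  # 9.2%
```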




