# Sentimental dataset 1
This notebook creates test cases for sentiment analysis. After all tests have been added to the suite, it saves them to `sentimental_suite_dt1.pkl`. Some words in the original author's templates were changed to fit the Amazon review dataset; for example, `This was a well aircraft` became `This was a well silent movie.`. The tests cover the following capabilities:
- Capability: Vocabulary
- Capability: Temporal Awareness
- Capability: Negation
- Capability: SRL

Note:
- MFT (Minimum Functionality Test): checks whether a model has a basic capability, using simple examples with a known expected label.
- DIR (Directional Expectation test): checks whether a model's prediction changes in the expected direction after the input is modified.
- INV (Invariance test): checks whether a model's prediction stays the same under input changes that should not affect the label.
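
The three test types differ only in what they compare. The sketch below uses a toy keyword classifier to show the pass/fail rule each type applies. `toy_predict` and the three `*_passes` helpers are hypothetical names made up for illustration; they are not part of CheckList.

```python
# Toy stand-in for a sentiment model: returns [p_negative, p_neutral, p_positive].
# Hypothetical helper for illustration only, not a CheckList API.
def toy_predict(text):
    words = text.lower().rstrip('.').split()
    pos = sum(w in {'good', 'great', 'love'} for w in words)
    neg = sum(w in {'bad', 'awful', 'hate'} for w in words)
    p = min(0.7 + 0.1 * words.count('very'), 0.95)
    if pos > neg:
        return [(1 - p) / 2, (1 - p) / 2, p]
    if neg > pos:
        return [p, (1 - p) / 2, (1 - p) / 2]
    return [0.1, 0.8, 0.1]


def label_of(probs):
    """Index of the highest probability."""
    return max(range(len(probs)), key=probs.__getitem__)


# MFT: the prediction must equal a fixed expected label.
def mft_passes(text, expected_label):
    return label_of(toy_predict(text)) == expected_label


# INV: a label-preserving perturbation must not change the prediction.
def inv_passes(original, perturbed):
    return label_of(toy_predict(original)) == label_of(toy_predict(perturbed))


# DIR: the confidence of the original label must move in the expected
# direction (here: it may not drop by more than `tolerance`).
def dir_passes(original, perturbed, tolerance=0.1):
    orig = toy_predict(original)
    label = label_of(orig)
    return toy_predict(perturbed)[label] - orig[label] >= -tolerance


print(mft_passes('This is a good book.', 2))
print(inv_passes('This is a good book.', 'This is a good book'))
print(dir_passes('This is a good book.', 'This is a very good book.'))
```

A real run replaces `toy_predict` with the model under test; the comparison logic stays the same.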



ref:
- https://www.godeltech.com/how-to-automate-the-testing-process-for-machine-learning-systems/

Whether a test case passes or fails depends on the expected `labels` value passed when the test is created, for example:
```test = MFT(**t, labels=0, name=name, capability = 'Vocabulary',description=desc)```
Note: the arguments change depending on which test type (MFT, DIR, or INV) is used.

For the sentiment scenario, the labels are encoded as:
- 'Negative': 0
- 'Positive': 2
- 'Neutral': 1
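
As a small illustration of this encoding, the sketch below converts a three-class probability vector into a label name. It assumes the model returns probabilities ordered as [negative, neutral, positive], which matches the ids above; `decode` is a hypothetical helper.

```python
# Label encoding used throughout this notebook.
LABEL_TO_ID = {'Negative': 0, 'Neutral': 1, 'Positive': 2}
ID_TO_LABEL = {i: name for name, i in LABEL_TO_ID.items()}


def decode(probs):
    """Return the label name with the highest probability.

    `probs` is assumed to be ordered [negative, neutral, positive].
    """
    if len(probs) != 3:
        raise ValueError('expected three probabilities')
    best = max(range(3), key=lambda i: probs[i])
    return ID_TO_LABEL[best]


print(decode([0.05, 0.15, 0.80]))
```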

In [1]:
%load_ext autoreload
%autoreload 2

import checklist
import spacy
import itertools

import checklist.editor
import checklist.text_generation
from checklist.test_types import MFT, INV, DIR
from checklist.expect import Expect
import numpy as np
from checklist.test_suite import TestSuite
from checklist.perturb import Perturb
from transformers import pipeline

In [2]:
editor = checklist.editor.Editor()
editor.tg

<checklist.text_generation.TextGenerator at 0x1fa08a7e510>

In [4]:
import csv

# Lists that collect one entry per review
labels = []
scores = []
titles = []
tdata = []

# Read the Amazon review file; the context manager closes it when done
with open('amazon_review.csv', newline='') as f:
    for row in csv.DictReader(f):
        labels.append(row['sentiment'])  # sentiment label (string)
        scores.append(row['score'])      # review score
        titles.append(row['title'])      # review title
        tdata.append(row['text'])        # review text

# Map the string labels to integer class ids
mapping = {'Negative': 0, 'Positive': 2, 'Neutral': 1}
labels = np.array([mapping[x] for x in labels]).astype(int)
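
The loading step expects a CSV with at least the columns `sentiment`, `score`, `title`, and `text`. The sketch below runs the same logic on two made-up rows held in memory, so it can be tried without the real file; the row contents and the `sample_*` names are invented for illustration.

```python
import csv
import io

# Two made-up rows with the columns this notebook reads; the real file may
# contain additional columns.
sample_file = io.StringIO(
    'sentiment,score,title,text\n'
    'Positive,5,Great read,I loved this book.\n'
    'Negative,1,Broke fast,The headphone stopped working.\n'
)

sample_mapping = {'Negative': 0, 'Neutral': 1, 'Positive': 2}
sample_rows = list(csv.DictReader(sample_file))
sample_labels = [sample_mapping[row['sentiment']] for row in sample_rows]
sample_texts = [row['text'] for row in sample_rows]

print(sample_labels)
print(sample_texts[0])
```

A label that is missing from the mapping raises a `KeyError`, which is a quick way to spot unexpected values in the `sentiment` column.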

In [5]:
# Load spaCy's small English model; it is used below to parse the review texts
nlp = spacy.load('en_core_web_sm')

In [6]:
# Run every review through the spaCy pipeline and keep the parsed documents in `parsed_data`
sentences = tdata
parsed_data = list(nlp.pipe(sentences))

In [7]:
# A TestSuite is a container that collects the individual tests
suite = TestSuite()

## Capability: Vocabulary

### MFTs

In [8]:
# Nouns that appear in Amazon reviews
amazon_noun = ['western movie', 'action movie', 'action movie', 'adventure movie', 'animation', 'cartoon', 'comedy', 'coming of age movie', 'crime movie', 'documentary',
              'drama', 'entertainment', 'family', 'fantasy', 'horror movie', 'musical', 'mystery', 'realistic', 'remale', 'romance', 'sci-fi',
              'short film', 'silent movie', 'thriller movie', 'war', 'genre', 'book', 'single', 'headphone', 'gift', 'series', 'product', 'track', 'shoes',
              'novel', 'E-book', 'tablet', 'smartphone', 'laptop', 'computer', 'smartwatch', 'CD' ]

# Register the nouns as a lexicon so that templates can reference them as {amazon_noun}
editor.add_lexicon('amazon_noun', amazon_noun, overwrite=True)

In [10]:
# define positive, negative, and neutral adjectives
pos_adj = ['greatest','well','good', 'great', 'excellent', 'amazing', 'extraordinary', 'beautiful', 'fantastic', 'nice', 'incredible', 'exceptional', 'awesome', 'perfect', 'fun', 'happy', 'adorable', 'brilliant', 'exciting', 'sweet', 'wonderful']
neg_adj = ['awful', 'bad', 'horrible', 'weird', 'rough', 'lousy', 'unhappy', 'average', 'difficult', 'poor', 'sad', 'frustrating', 'hard', 'lame', 'nasty', 'annoying', 'boring', 'creepy', 'dreadful', 'ridiculous', 'terrible', 'ugly', 'unpleasant']
neutral_adj = ['American', 'international',  'commercial', 'British', 'private', 'Italian', 'Indian', 'Australian', 'Israeli', ]

# register the adjective lists as lexicons on the editor
editor.add_lexicon('pos_adj', pos_adj, overwrite=True)
editor.add_lexicon('neg_adj', neg_adj, overwrite=True )
editor.add_lexicon('neutral_adj', neutral_adj, overwrite=True)

In [11]:
print(', '.join(editor.suggest('I really {mask} the {amazon_noun}.')[:200]))

enjoyed, liked, like, enjoy, loved, love, appreciate, appreciated, missed, dig, miss, wanted, got, prefer, hate, want, needed, dislike, admire, admired, dug, respect, found, understand, saw, recommend, get, felt, see, did, remember, preferred, understood, likes, think, feel, disliked, mean, believe, hated, bought, need, adore, do, followed, hope, follow, noticed, read, took, considered, LOVE, watched, enjoying, thought, played, trust, tried, enjoys, was, chose, lost, welcome, supported, favor, made, Love, watch, heard, picked, underestimated, caught, had, LIKE, used, heart, loving, fancy, finished, liking, treasure, support, have, respected, value, experienced, meant, believed, mind, received, cherish, recommended, embraced, remembered, cherished, anticipated, ordered, praised, welcomed, play, buy, Like, minded, valued, helped, use, expected, find, studied, sold, just, skipped, choose, started, hit, embrace, know, crave, left, consider, wrote, brought, applaud, inspired, leave, joined,

In [12]:
# define positive, negative, and neutral verbs (present tense)
pos_verb_present = ['like', 'enjoy', 'appreciate', 'love',  'recommend', 'admire', 'value', 'welcome']
neg_verb_present = ['hate', 'dislike', 'regret',  'abhor', 'dread', 'despise' ]
neutral_verb_present = ['see', 'find']

# define the past-tense forms of the same verbs
pos_verb_past = ['liked', 'enjoyed', 'appreciated', 'loved', 'admired', 'valued', 'welcomed']
neg_verb_past = ['hated', 'disliked', 'regretted',  'abhorred', 'dreaded', 'despised']
neutral_verb_past = ['saw', 'found']

# register the verb lists as lexicons on the editor
editor.add_lexicon('pos_verb_present', pos_verb_present, overwrite=True)
editor.add_lexicon('neg_verb_present', neg_verb_present, overwrite=True)
editor.add_lexicon('neutral_verb_present', neutral_verb_present, overwrite=True)
editor.add_lexicon('pos_verb_past', pos_verb_past, overwrite=True)
editor.add_lexicon('neg_verb_past', neg_verb_past, overwrite=True)
editor.add_lexicon('neutral_verb_past', neutral_verb_past, overwrite=True)

# also register the combined lexicons `pos_verb`, `neg_verb`, `neutral_verb` (present + past)
editor.add_lexicon('pos_verb', pos_verb_present+ pos_verb_past, overwrite=True)
editor.add_lexicon('neg_verb', neg_verb_present + neg_verb_past, overwrite=True)
editor.add_lexicon('neutral_verb', neutral_verb_present + neutral_verb_past, overwrite=True)

Individual words

In [13]:
# Add individual word test: positive adjectives and verbs
test = MFT(pos_adj + pos_verb_present + pos_verb_past, labels=2)
suite.add(test, 'single positive words', 'Vocabulary', '')

In [14]:
# Add individual word test: negative adjectives and verbs
test = MFT(neg_adj + neg_verb_present + neg_verb_past, labels=0)
suite.add(test, 'single negative words', 'Vocabulary', '')

In [15]:
# Add individual word test: neutral adjectives and verbs
test = MFT(neutral_adj + neutral_verb_present + neutral_verb_past, labels=1)
suite.add(test, 'single neutral words', 'Vocabulary', 'TODO_DESCRIPTION')

Words in context

In [16]:
# Build the test data; each template puts the lexicon words in a different context.
# Label 2 means positive and label 0 means negative.
t = editor.template('{it} {amazon_noun} {be} {pos_adj}.', it=['The', 'This', 'That'], be=['is', 'was'], labels=2, save=True)
t += editor.template('{it} {be} {a:pos_adj} {amazon_noun}.', it=['It', 'This', 'That'], be=['is', 'was'], labels=2, save=True)
t += editor.template('{i} {pos_verb} {the} {amazon_noun}.', i=['I', 'We'], the=['this', 'that', 'the'], labels=2, save=True)
t += editor.template('{it} {amazon_noun} {be} {neg_adj}.', it=['That', 'This', 'The'], be=['is', 'was'], labels=0, save=True)
t += editor.template('{it} {be} {a:neg_adj} {amazon_noun}.', it=['It', 'This', 'That'], be=['is', 'was'], labels=0, save=True)
t += editor.template('{i} {neg_verb} {the} {amazon_noun}.', i=['I', 'We'], the=['this', 'that', 'the'], labels=0, save=True)

# **t unpacks the fields of the template result into keyword arguments of MFT
test = MFT(**t)
suite.add(test, 'Sentiment-laden words in context', 'Vocabulary', 'Use positive and negative verbs and adjectives with amazon nouns such as silent movie, thriller movie, genre, book, single, headphone, gift, series, etc. E.g. "This was a fantastic movie"')
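
Conceptually, a template with several placeholders expands into the Cartesian product of the word lists behind those placeholders. The sketch below reproduces that idea with the standard library only. `expand_template` is a hypothetical helper, not CheckList code, and it ignores CheckList features such as `{a:...}`, `{mask}`, and `nsamples`.

```python
import itertools
import string


def expand_template(template, **lexicons):
    """Fill every placeholder with every combination of the given word lists."""
    # Collect placeholder names in order of first appearance.
    names = []
    for _, field, _, _ in string.Formatter().parse(template):
        if field and field not in names:
            names.append(field)
    combos = itertools.product(*(lexicons[n] for n in names))
    return [template.format(**dict(zip(names, combo))) for combo in combos]


expanded = expand_template(
    '{it} {noun} {be} {adj}.',
    it=['This', 'That'],
    noun=['book'],
    be=['is', 'was'],
    adj=['good', 'great'],
)
print(len(expanded))
print(expanded[0])
```

The number of generated sentences is the product of the list lengths, which is why the larger templates in this notebook pass `nsamples` to cap the output.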


In [17]:
editor.lexicons['neutral_verb']

['see', 'find', 'saw', 'found']

In [18]:
# Create `t` with the first template, then use `+=` to append the others
t = editor.template('{it} {amazon_noun} {be} {neutral_adj}.', it=['That', 'This', 'The'], be=['is', 'was'], save=True)
t += editor.template('{it} {be} {a:neutral_adj} {amazon_noun}.', it=['It', 'This', 'That'], be=['is', 'was'], save=True)
t += editor.template('{i} {neutral_verb} {the} {amazon_noun}.', i=['I', 'We'], the=['this', 'that', 'the'], save=True)
test = MFT(t.data, labels=1, templates=t.templates)
suite.add(test, 'neutral words in context', 'Vocabulary', 'Use neutral verbs and adjectives with amazon nouns such as silent movie, thriller movie, genre, book, single, headphone, gift, series, etc. E.g. "The thriller movie is American"')
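
The `{a:...}` prefix in the templates asks for an indefinite article that agrees with the word that follows. A minimal sketch of that idea is shown below, assuming a simple first-letter heuristic; `with_article` is a hypothetical helper and CheckList's own rule may differ (the heuristic gets words such as "hour" or "unique" wrong).

```python
def with_article(word):
    """Prefix `word` with 'a' or 'an' using a first-letter heuristic."""
    article = 'an' if word[0].lower() in 'aeiou' else 'a'
    return f'{article} {word}'


for adj in ['American', 'British', 'Italian']:
    print(f'It is {with_article(adj)} book.')
```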

### Intensifiers and reducers

In [19]:
# Ask the language model for words that fit the mask; in this context it suggests mostly adverbs
print(' , '.join(editor.suggest('{it} {be} {a:mask} {pos_adj} {amazon_noun}.', it=['It', 'This', 'That'], be=['is', 'was'])[:50]))

really , truly , absolutely , very , quite , most , pretty , just , simply , historically , genuinely , totally , incredibly , completely , fucking , extremely , utterly , seriously , an , visually , unbelievably , absolute , amazingly , actually , frankly , especially , damn , rather , particularly , equally , exceptionally , insanely , extraordinarily , obviously , overall , undeniably , almost , always , freaking , unexpectedly , amazing , real , entirely , tremendously , altogether , enormously , incredible , otherwise , overwhelmingly , honestly


In [20]:
# intensifier adverbs that modify adjectives
intens_adj = ['very', 'really', 'absolutely', 'truly', 'extremely', 'quite', 'incredibly', 'amazingly', 'especially', 'exceptionally', 'unbelievably', 'utterly', 'exceedingly', 'rather', 'totally', 'particularly']

In [21]:
print(', '.join(editor.suggest('{i} {mask} {pos_verb} {the} {amazon_noun}.', i=['I', 'We'], the=['this', 'that', 'the'])[:100]))

really, always, also, just, definitely, greatly, actually, absolutely, truly, certainly, especially, still, particularly, both, thoroughly, so, highly, quite, personally, totally, all, never, simply, rather, obviously, very, strongly, much, generally, most, even, too, honestly, deeply, clearly, genuinely, completely, already, have, only, sure, usually, seriously, immediately, sincerely, mostly, do, often, did, REALLY, probably, ultimately, dearly, had, specifically, naturally, fully, kinda, secretly, initially, desperately, we, almost, immensely, hugely, tremendously, first, surely, ever, again, quickly, ALL, extremely, finally, therefore, then, originally, instantly, normally, would, each, once, mainly, heavily, basically, vastly, equally, frankly, largely, long, guys, profoundly, fucking, forever, can, literally, should, constantly, will, collectively


In [23]:
# intensifier adverbs that modify verbs
intens_verb = [ 'really', 'absolutely', 'truly', 'extremely',  'especially',  'utterly',  'totally', 'particularly', 'highly', 'definitely', 'certainly', 'genuinely', 'honestly', 'strongly', 'sure', 'sincerely']

In [24]:
# Expect the confidence of the original prediction not to decrease (a drop of up to 0.1 is tolerated)
monotonic_label = Expect.monotonic(increasing=True, tolerance=0.1)
# True when the original prediction is not neutral (label 1); used to skip pairs whose original prediction is neutral
non_neutral_pred = lambda pred, *args, **kwargs: pred != 1
# Apply the monotonic expectation only to the pairs selected by the filter
monotonic_label = Expect.slice_pairwise(monotonic_label, non_neutral_pred)
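
The rule encoded above can be read as: take the label predicted for the original sentence, skip the pair if that label is neutral, and otherwise require that the confidence for that label does not move the wrong way by more than the tolerance. The sketch below is a simplified plain-Python reading of that rule. `monotonic_ok` is a hypothetical helper, and the library's implementation may differ in details such as return values.

```python
def monotonic_ok(orig_conf, new_conf, increasing=True, tolerance=0.1, skip_label=1):
    """Check one (x1, x2) pair.

    orig_conf / new_conf: class probabilities for x1 and x2.
    Returns None when x1 is predicted as `skip_label` (the pair is not counted).
    """
    label = max(range(len(orig_conf)), key=lambda i: orig_conf[i])
    if label == skip_label:
        return None
    diff = new_conf[label] - orig_conf[label]
    return diff >= -tolerance if increasing else diff <= tolerance


print(monotonic_ok([0.1, 0.2, 0.7], [0.1, 0.1, 0.8]))
print(monotonic_ok([0.1, 0.2, 0.7], [0.3, 0.2, 0.5]))
print(monotonic_ok([0.1, 0.8, 0.1], [0.1, 0.1, 0.8]))
```

Passing `increasing=False` gives the mirror-image check, where the confidence may not rise by more than the tolerance.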

In [25]:
# Preview the intensifier pairs: each item is [sentence, the same sentence with an intensifier]
# Create `t` with the first template list, then use `+=` to append the others
t = editor.template(['{it} {be} {a:pos_adj} {amazon_noun}.', '{it} {be} {a:intens} {pos_adj} {amazon_noun}.'] , intens=intens_adj, it=['It', 'This', 'That'], be=['is', 'was'], nsamples=500, save=True)
t += editor.template(['{i} {pos_verb} {the} {amazon_noun}.', '{i} {intens} {pos_verb} {the} {amazon_noun}.'], intens=intens_verb, i=['I', 'We'], the=['this', 'that', 'the'], nsamples=500, save=True)
t += editor.template(['{it} {be} {a:neg_adj} {amazon_noun}.', '{it} {be} {a:intens} {neg_adj} {amazon_noun}.'] , intens=intens_adj, it=['It', 'This', 'That'], be=['is', 'was'], nsamples=500, save=True)
t += editor.template(['{i} {neg_verb} {the} {amazon_noun}.', '{i} {intens} {neg_verb} {the} {amazon_noun}.'], intens=intens_verb, i=['I', 'We'], the=['this', 'that', 'the'], nsamples=500, save=True)
t.data[:5]  # show the first 5 pairs


[['This was a well silent movie.',
  'This was a particularly well silent movie.'],
 ['This was a greatest musical.', 'This was a quite greatest musical.'],
 ['That is a brilliant coming of age movie.',
  'That is a really brilliant coming of age movie.'],
 ['It was a fantastic headphone.', 'It was a very fantastic headphone.'],
 ['It was a perfect romance.', 'It was a quite perfect romance.']]

In [26]:
# Build the same intensifier pairs again and wrap them in a DIR test
# Create `t` with the first template list, then use `+=` to append the others
t = editor.template(['{it} {be} {a:pos_adj} {amazon_noun}.', '{it} {be} {a:intens} {pos_adj} {amazon_noun}.'] , intens=intens_adj, it=['It', 'This', 'That'], be=['is', 'was'], nsamples=500, save=True)
t += editor.template(['{i} {pos_verb} {the} {amazon_noun}.', '{i} {intens} {pos_verb} {the} {amazon_noun}.'], intens=intens_verb, i=['I', 'We'], the=['this', 'that', 'the'], nsamples=500, save=True)
t += editor.template(['{it} {be} {a:neg_adj} {amazon_noun}.', '{it} {be} {a:intens} {neg_adj} {amazon_noun}.'] , intens=intens_adj, it=['It', 'This', 'That'], be=['is', 'was'], nsamples=500, save=True)
t += editor.template(['{i} {neg_verb} {the} {amazon_noun}.', '{i} {intens} {neg_verb} {the} {amazon_noun}.'], intens=intens_verb, i=['I', 'We'], the=['this', 'that', 'the'], nsamples=500, save=True)
test = DIR(t.data, monotonic_label, templates=t.templates)
description = '''Test is composed of pairs of sentences (x1, x2), where we add an intensifier
such as "really", or "very" to x2 and expect the confidence to NOT go down (with tolerance=0.1). e.g.:
x1 = "That was a good movie"
x2 = "That was a very good movie"
We disregard cases where the prediction of x1 is neutral.
'''

# add the test to the suite
suite.add(test, 'intensifiers', 'Vocabulary', description)


In [27]:
# define reducers, i.e. adverbs that weaken the word they modify
reducer_adj = ['somewhat', 'kinda', 'mostly', 'probably', 'generally', 'reasonably', 'a little', 'a bit', 'slightly']

In [28]:
monotonic_label_down = Expect.monotonic(increasing=False, tolerance=0.1)  # expect the confidence NOT to increase (tolerance 0.1)
monotonic_label_down = Expect.slice_pairwise(monotonic_label_down, non_neutral_pred)  # skip pairs whose original prediction is neutral

In [29]:
# Preview the reducer pairs: each item is [sentence, the same sentence with a reducer]
t = editor.template(['{it} {amazon_noun} {be} {pos_adj}.', '{it} {amazon_noun} {be} {red} {pos_adj}.'] , red=reducer_adj, it=['The', 'This', 'That'], be=['is', 'was'], nsamples=1000, save=True)
t += editor.template(['{it} {amazon_noun} {be} {neg_adj}.', '{it} {amazon_noun} {be} {red} {neg_adj}.'] , red=reducer_adj, it=['The', 'This', 'That'], be=['is', 'was'], nsamples=1000, save=True)
t.data[:50]

[['This novel was excellent.', 'This novel was reasonably excellent.'],
 ['This drama was nice.', 'This drama was slightly nice.'],
 ['That animation was fantastic.', 'That animation was mostly fantastic.'],
 ['The musical is perfect.', 'The musical is generally perfect.'],
 ['This adventure movie was awesome.',
  'This adventure movie was slightly awesome.'],
 ['That war was great.', 'That war was reasonably great.'],
 ['This product is great.', 'This product is generally great.'],
 ['The shoes is fantastic.', 'The shoes is generally fantastic.'],
 ['The thriller movie was perfect.',
  'The thriller movie was somewhat perfect.'],
 ['This thriller movie is extraordinary.',
  'This thriller movie is mostly extraordinary.'],
 ['This entertainment is brilliant.',
  'This entertainment is probably brilliant.'],
 ['The shoes was incredible.', 'The shoes was mostly incredible.'],
 ['The horror movie is happy.', 'The horror movie is probably happy.'],
 ['The short film was well.', 'The short 

In [30]:
# Build the same reducer pairs again and wrap them in a DIR test
# Create `t` with the first template list, then use `+=` to append the other
t = editor.template(['{it} {amazon_noun} {be} {pos_adj}.', '{it} {amazon_noun} {be} {red} {pos_adj}.'] , red=reducer_adj, it=['The', 'This', 'That'], be=['is', 'was'], nsamples=1000, save=True)
t += editor.template(['{it} {amazon_noun} {be} {neg_adj}.', '{it} {amazon_noun} {be} {red} {neg_adj}.'] , red=reducer_adj, it=['The', 'This', 'That'], be=['is', 'was'], nsamples=1000, save=True)
test = DIR(t.data, monotonic_label_down, templates=t.templates)
description = '''Test is composed of pairs of sentences (x1, x2), where we add a reducer
such as "somewhat", or "kinda" to x2 and expect the confidence to NOT go up (with tolerance=0.1). e.g.:
x1 = "The smartphone was good."
x2 = "The smartphone was somewhat good."
We disregard cases where the prediction of x1 is neutral.
'''

# add the test to the suite; a reducer should make the sentence more neutral
suite.add(test, 'reducers', 'Vocabulary', description)


### INVariance: change neutral words

In [31]:
# words treated as neutral: replacing them should not change the sentiment
neutral_words = set(
    ['.', 'the', 'The', ',', 'a', 'A', 'and', 'of', 'to', 'it', 'that', 'in',
     'this', 'for',  'you', 'there', 'or', 'an', 'by', 'about', 'flight', 'my',
     'in', 'of', 'have', 'with', 'was', 'at', 'it', 'get', 'from', 'this', 'Flight', 'plane'
    ])
forbidden = set(['No', 'no', 'Not', 'not', 'Nothing', 'nothing', 'without', 'but'] + pos_adj + neg_adj + pos_verb_present + pos_verb_past + neg_verb_present + neg_verb_past)

# Replace neutral words in `d` with context-appropriate alternatives and return up to 10 perturbed sentences
def change_neutral(d):
    examples = []
    subs = []
    words_in = [x for x in d.capitalize().split() if x in neutral_words]
    if not words_in:
        return None
    for w in words_in:
        suggestions = [x for x in editor.suggest_replace(d, w, beam_size=5, words_and_sentences=True) if x[0] not in forbidden]
        examples.extend([x[1] for x in suggestions])
        subs.extend(['%s -> %s' % (w, x[0]) for x in suggestions])
    if examples:
        idxs = np.random.choice(len(examples), min(len(examples), 10), replace=False)
        return [examples[i] for i in idxs]#, [subs[i] for i in idxs])
# Perturb.perturb(parsed_data[:5], perturb)

In [32]:
t = Perturb.perturb(sentences, change_neutral, nsamples=20)
test = INV(t.data) #test type is Invariance

# "BERT" in the description refers to the masked language model that `editor.suggest_replace` uses inside `change_neutral`
description = 'Change a set of neutral words with other context-appropriate neutral words (using BERT).'
suite.add(test, 'change neutral words with BERT', 'Vocabulary', description)
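
The core of this perturbation is: find a neutral word, ask for replacement candidates, and discard any candidate that could change the sentiment. The sketch below shows that filtering step on plain strings. The `CANDIDATES` table is made up and stands in for the language-model suggestions; `swap_neutral` is a hypothetical helper.

```python
# Hypothetical candidate replacements; in the notebook these come from
# editor.suggest_replace, which queries a masked language model.
CANDIDATES = {
    'the': ['this', 'that', 'no'],
    'a': ['the', 'one', 'not'],
}
NEUTRAL = {'the', 'a'}
FORBIDDEN = {'no', 'not', 'but'}


def swap_neutral(sentence):
    """Return copies of `sentence` with one neutral word replaced."""
    tokens = sentence.split()
    variants = []
    for i, tok in enumerate(tokens):
        if tok not in NEUTRAL:
            continue
        for cand in CANDIDATES.get(tok, []):
            if cand in FORBIDDEN:
                continue  # would change the sentiment, so skip it
            variants.append(' '.join(tokens[:i] + [cand] + tokens[i + 1:]))
    return variants


for variant in swap_neutral('I bought the book as a gift'):
    print(variant)
```

Without the forbidden-word filter, a suggestion such as "no" or "not" would flip the meaning and make the invariance expectation invalid.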

### Add negative phrases

In [35]:
# phrases with clearly positive sentiment
positive = editor.template('I {pos_verb_present} product.').data
positive += editor.template('Products are {pos_adj}.').data
positive += ['I would purchase product again.']

# phrases with clearly negative sentiment
negative = editor.template('I {neg_verb_present} product.').data
negative += editor.template('Products are {neg_adj}.').data
negative += ['Never purchasing this product again.']

# Build a perturbation that strips trailing punctuation from a parsed sentence and appends 10 randomly chosen phrases
def add_phrase_function(phrases):
    def pert(d):
        while d[-1].pos_ == 'PUNCT':
            d = d[:-1]
        d = d.text
        ret = [d + '. ' + x for x in phrases]
        idx = np.random.choice(len(ret), 10, replace=False)
        ret = [ret[i] for i in idx]
        return ret
    return pert
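
The perturbation above works on spaCy documents. The sketch below shows the same idea on plain strings, using only the standard library. `append_phrases` and `demo_phrases` are hypothetical names. One difference is deliberate: the notebook samples exactly 10 phrases without replacement, which requires at least 10 phrases to be available, whereas this sketch caps the sample size at the number of phrases.

```python
import random


def append_phrases(sentence, phrases, k=3, seed=0):
    """Strip trailing punctuation, then append up to `k` randomly chosen phrases."""
    base = sentence.rstrip('.!? ')
    rng = random.Random(seed)
    chosen = rng.sample(phrases, min(k, len(phrases)))
    return [f'{base}. {phrase}' for phrase in chosen]


demo_phrases = [
    'I love this product.',
    'Products are great.',
    'I would purchase this product again.',
]
for perturbed in append_phrases('The sound quality is fine!', demo_phrases, k=2):
    print(perturbed)
```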


In [36]:
# Net shift toward positive sentiment after the perturbation:
# (drop in negative probability) + (rise in positive probability)
def positive_change(orig_conf, conf):
    softmax = isinstance(orig_conf, np.ndarray)
    if not softmax or orig_conf.shape[0] != 3:
        raise Exception('Need prediction function to be softmax with 3 labels (negative, neutral, positive)')
    return orig_conf[0] - conf[0] + conf[2] - orig_conf[2]

# Passes (returns True) when the shift toward positive is at least -tolerance,
# i.e. positive sentiment did not drop by more than 0.1.
# Otherwise returns change + tolerance, a negative number that measures how far the pair is from passing.
def diff_up(orig_pred, pred, orig_conf, conf, labels=None, meta=None):
    tolerance = 0.1
    change = positive_change(orig_conf, conf)
    if change + tolerance >= 0:
        return True
    else:
        return change + tolerance

# Passes (returns True) when the shift toward positive is at most +tolerance,
# i.e. positive sentiment did not rise by more than 0.1.
# Otherwise returns -(change - tolerance), again a negative number.
def diff_down(orig_pred, pred, orig_conf, conf, labels=None, meta=None):
    tolerance = 0.1
    change = positive_change(orig_conf, conf)
    if change - tolerance <= 0:
        return True
    else:
        return -(change - tolerance)
goes_up = Expect.pairwise(diff_up)  # used when adding positive phrases
goes_down = Expect.pairwise(diff_down)  # used when adding negative phrases
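
A worked example may help with the arithmetic. The sketch below recomputes the same score on made-up probability vectors ordered [negative, neutral, positive]. `positive_shift` is a hypothetical plain-list version of the function above, and all numbers are invented.

```python
def positive_shift(orig_conf, conf):
    """(drop in P(negative)) + (rise in P(positive)); inputs are [neg, neu, pos]."""
    return (orig_conf[0] - conf[0]) + (conf[2] - orig_conf[2])


before = [0.50, 0.30, 0.20]
after_pos = [0.30, 0.30, 0.40]  # made-up result after appending a positive phrase
after_neg = [0.70, 0.20, 0.10]  # made-up result after appending a negative phrase

demo_tolerance = 0.1
up = positive_shift(before, after_pos)
down = positive_shift(before, after_neg)
print(round(up, 2), up >= -demo_tolerance)    # "goes up" check
print(round(down, 2), down <= demo_tolerance)  # "goes down" check
```

Because the score adds two probability differences, it ranges from -2 to 2, so the tolerance of 0.1 is small relative to the possible swing.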
    

In [37]:
# Adding a very positive phrase: the probability of positive sentiment should not go down
t = Perturb.perturb(parsed_data, add_phrase_function(positive), nsamples=500)
test = DIR(t.data, goes_up)
description = 'Add very positive phrases (e.g. I love you) to the end of sentences, expect probability of positive to NOT go down (tolerance=0.1)'
suite.add(test, 'add positive phrases', 'Vocabulary', description)


In [38]:
# Adding a very negative phrase: the probability of positive sentiment should not go up
t = Perturb.perturb(parsed_data, add_phrase_function(negative), nsamples=500)
test = DIR(t.data, goes_down)
description = 'Add very negative phrases (e.g. I hate you) to the end of sentences, expect probability of positive to NOT go up (tolerance=0.1)'
suite.add(test, 'add negative phrases', 'Vocabulary', description)


### punctuation, contractions, typos

In [39]:
# Punctuation test: stripping or adding final punctuation should not change the prediction
t = Perturb.perturb(parsed_data, Perturb.punctuation, nsamples=500)
test = INV(t.data)
suite.add(test, 'punctuation', 'Robustness', 'strip punctuation and / or add "."')


In [40]:
# Typo test: introduce one typo per sentence by swapping two adjacent characters
t = Perturb.perturb(sentences, Perturb.add_typos, nsamples=500, typos=1)
test = INV(t.data)
suite.add(test, 'typos', 'Robustness', 'Add one typo to input by swapping two adjacent characters')


In [41]:
# Same as above, with two typos per sentence
t = Perturb.perturb(sentences, Perturb.add_typos, nsamples=500, typos=2)
test = INV(t.data)
suite.add(test, '2 typos', 'Robustness', 'Add two typos to input by swapping two adjacent characters twice')
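
The typo perturbation described in the two tests above swaps adjacent characters. The sketch below approximates that behavior with the standard library. `swap_adjacent` is a hypothetical helper, and the position sampling is an assumption rather than a copy of `Perturb.add_typos`.

```python
import random


def swap_adjacent(text, typos=1, seed=0):
    """Introduce `typos` typos by swapping pairs of adjacent characters."""
    chars = list(text)
    if len(chars) < 2:
        return text  # nothing to swap
    rng = random.Random(seed)
    # Pick distinct swap positions; position i swaps chars[i] and chars[i + 1].
    positions = rng.sample(range(len(chars) - 1), min(typos, len(chars) - 1))
    for i in positions:
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return ''.join(chars)


print(swap_adjacent('great movie', typos=1))
print(swap_adjacent('great movie', typos=2))
```

The perturbed text keeps the same characters and length, so a robust model would be expected to predict the same label.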


In [42]:
# Contraction test: contract or expand contractions, e.g. do not -> don't
t = Perturb.perturb(sentences, Perturb.contractions, nsamples=1000)
test = INV(t.data)
suite.add(test, 'contractions', 'Robustness', 'Contract or expand contractions, e.g. What is -> What\'s')
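
To make the contraction perturbation concrete, the sketch below contracts and expands a few phrases using a tiny hand-written table. The table and the helper names are assumptions for illustration; `Perturb.contractions` relies on its own, larger list.

```python
import re

# A tiny hand-written table for illustration only.
CONTRACTIONS = {'do not': "don't", 'is not': "isn't", 'what is': "what's"}
EXPANSIONS = {short: full for full, short in CONTRACTIONS.items()}


def _replace(text, table):
    # The lookarounds keep "this not" from matching "is not".
    pattern = re.compile(
        '|'.join(r'(?<!\w)' + re.escape(key) + r'(?!\w)' for key in table),
        flags=re.IGNORECASE,
    )
    return pattern.sub(lambda m: table[m.group(0).lower()], text)


def contract(text):
    return _replace(text, CONTRACTIONS)


def expand_contractions(text):
    return _replace(text, EXPANSIONS)


print(contract('I do not like it.'))
print(expand_contractions("It isn't bad."))
```

This sketch lowercases whatever it replaces, so a sentence-initial "What is" would lose its capital letter; a fuller version would restore the original casing.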

## Capability: temporal awareness

In [43]:
# check what the `neg_verb_present` lexicon contains
editor.template('{neg_verb_present}').data

['hate', 'dislike', 'regret', 'abhor', 'dread', 'despise']

In [44]:
# connectives that join the past and the present clause ('' means no connective)
change = ['but', 'even though', 'although', '']

# Create `t` with the first template list, then use `+=` to append the others
t = editor.template(['I used to think this movie was {neg_adj}, {change} now I think it is {pos_adj}.',
                                 'I think this movie is {pos_adj}, {change} I used to think it was {neg_adj}.',
                                 'In the past I thought this movie was {neg_adj}, {change} now I think it is {pos_adj}.',
                                 'I think this movie is {pos_adj}, {change} in the past I thought it was {neg_adj}.',
                                ] ,
                                 change=change, unroll=True, nsamples=500, save=True, labels=2)
t += editor.template(['I used to {neg_verb_present} this movie, {change} now I {pos_verb_present} it.',
                                 'I {pos_verb_present} this movie, {change} I used to {neg_verb_present} it.',
                                 'In the past I would {neg_verb_present} this movie, {change} now I {pos_verb_present} it.',
                                 'I {pos_verb_present} this movie, {change} in the past I would {neg_verb_present} it.',
                                ] ,
                                change=change, unroll=True, nsamples=500, save=True, labels=2)

t += editor.template(['I used to think this movie was {pos_adj}, {change} now I think it is {neg_adj}.',
                                 'I think this movie is {neg_adj}, {change} I used to think it was {pos_adj}.',
                                 'In the past I thought this movie was {pos_adj}, {change} now I think it is {neg_adj}.',
                                 'I think this movie is {neg_adj}, {change} in the past I thought it was {pos_adj}.',
                                ] ,
                                 change=change, unroll=True, nsamples=500, save=True, labels=0)
t += editor.template(['I used to {pos_verb_present} this movie, {change} now I {neg_verb_present} it.',
                                 'I {neg_verb_present} this movie, {change} I used to {pos_verb_present} it.',
                                 'In the past I would {pos_verb_present} this movie, {change} now I {neg_verb_present} it.',
                                 'I {neg_verb_present} this movie, {change} in the past I would {pos_verb_present} it.',
                                ] ,
                                change=change, unroll=True, nsamples=500, save=True, labels=0)
test = MFT(**t) # basic function test
description = '''Have two conflicting statements, one about the past and one about the present.
Expect the present to carry the sentiment. Examples:
I used to love this movie, now I hate it -> should be negative
I love this movie, although I used to hate it -> should be positive
'''
suite.add(test, 'used to, but now', 'Temporal', description)
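
The labeling rule in this test is that the clause about the present decides the expected label. The sketch below generates a handful of such sentences with the standard library and attaches labels by that rule. The word lists are shortened and the `demo_*` names are hypothetical.

```python
import itertools

demo_pos = ['good', 'great']
demo_neg = ['bad', 'awful']
demo_connectives = ['but', 'although']

cases = []
for p, n, c in itertools.product(demo_pos, demo_neg, demo_connectives):
    # Past opinion negative, present opinion positive -> label 2 (positive).
    cases.append((f'I used to think this movie was {n}, {c} now I think it is {p}.', 2))
    # Past opinion positive, present opinion negative -> label 0 (negative).
    cases.append((f'I used to think this movie was {p}, {c} now I think it is {n}.', 0))

print(len(cases))
print(cases[0])
```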



"used to" should reduce confidence

In [45]:
# Preview the pairs: each item is [statement, the same statement with "used to"], which should weaken the sentiment
t = editor.template(['{it} {be} {a:adj} {amazon_noun}.', 'I used to think {it} {be} {a:adj} {amazon_noun}.'], it=['it', 'this', 'that'], be=['is', 'was'], adj=editor.lexicons['pos_adj'] + editor.lexicons['neg_adj'], save=True)
t += editor.template(['{i} {verb} {the} {amazon_noun}.', '{i} used to {verb} {the} {amazon_noun}.'], i=['I', 'We'], the=['this', 'that', 'the'], verb=editor.lexicons['pos_verb_present'] + editor.lexicons['neg_verb_present'], save=True)
t.data[:5]


[['it is a greatest western movie.',
  'I used to think it is a greatest western movie.'],
 ['this is a greatest western movie.',
  'I used to think this is a greatest western movie.'],
 ['that is a greatest western movie.',
  'I used to think that is a greatest western movie.'],
 ['it is a well western movie.',
  'I used to think it is a well western movie.'],
 ['this is a well western movie.',
  'I used to think this is a well western movie.']]

In [46]:
# Add the test case: adding "used to" should reduce confidence
t = editor.template(['{it} {be} {a:adj} {amazon_noun}.', 'I used to think {it} {be} {a:adj} {amazon_noun}.'], it=['it', 'this', 'that'], be=['is', 'was'], adj=editor.lexicons['pos_adj'] + editor.lexicons['neg_adj'], save=True)
t += editor.template(['{i} {verb} {the} {amazon_noun}.', '{i} used to {verb} {the} {amazon_noun}.'], i=['I', 'We'], the=['this', 'that', 'the'], verb=editor.lexicons['pos_verb_present'] + editor.lexicons['neg_verb_present'], save=True)
test = DIR(t.data, monotonic_label_down, templates=t.templates)
suite.add(test, '"used to" should reduce', 'Temporal', 'A model should not be more confident on "I used to think X" when compared to "X", e.g. "I used to love this product" should have less confidence than "I love this product"')



## Capability: Negation

Simple templates:

In [57]:
# add the simple negation test case
t = editor.template('{it} {amazon_noun} {nt} {pos_adj}.', it=['This', 'That', 'The'], nt=['is not', 'isn\'t'], save=True)
t += editor.template('{it} {benot} {a:pos_adj} {amazon_noun}.', it=['It', 'This', 'That'], benot=['is not',  'isn\'t', 'was not', 'wasn\'t'], save=True)
neg = ['I can\'t say I', 'I don\'t', 'I would never say I', 'I don\'t think I', 'I didn\'t' ]
t += editor.template('{neg} {pos_verb_present} {the} {amazon_noun}.', neg=neg, the=['this', 'that', 'the'], save=True)
t += editor.template('No one {pos_verb_present}s {the} {amazon_noun}.', neg=neg, the=['this', 'that', 'the'], save=True)
test = MFT(t.data, labels=0, templates=t.templates)
suite.add(test, 'simple negations: negative', 'Negation', 'Very simple negations of positive statements')


In [58]:
# add the simple negation test case.
t = editor.template('{it} {amazon_noun} {nt} {neg_adj}.', it=['This', 'That', 'The'], nt=['is not', 'isn\'t'], save=True)
t += editor.template('{it} {benot} {a:neg_adj} {amazon_noun}.', it=['It', 'This', 'That'], benot=['is not',  'isn\'t', 'was not', 'wasn\'t'], save=True)
neg = ['I can\'t say I', 'I don\'t', 'I would never say I', 'I don\'t think I', 'I didn\'t' ]
t += editor.template('{neg} {neg_verb_present} {the} {amazon_noun}.', neg=neg, the=['this', 'that', 'the'], save=True)
t += editor.template('No one {neg_verb_present}s {the} {amazon_noun}.', neg=neg, the=['this', 'that', 'the'], save=True)
# expectation: prediction is not 0
is_not_0 = lambda x, pred, *args: pred != 0
test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)
suite.add(test, 'simple negations: not negative', 'Negation', 'Very simple negations of negative statements. Expectation requires prediction to NOT be negative (i.e. neutral or positive)')
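
The expectation above passes any prediction other than label 0. A minimal sketch of that rule and of the resulting failure rate is shown below, using made-up predictions; `is_not_negative` and the `demo_*` names are hypothetical.

```python
def is_not_negative(pred):
    """Pass when the predicted label is neutral (1) or positive (2)."""
    return pred != 0


# Made-up predicted labels for four negated negative sentences.
demo_preds = [2, 1, 0, 2]
demo_results = [is_not_negative(p) for p in demo_preds]
failure_rate = demo_results.count(False) / len(demo_results)
print(demo_results)
print(failure_rate)
```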


In [59]:
# negating a neutral statement should still be neutral
# e.g., This book is not American.
t = editor.template('{it} {amazon_noun} {nt} {neutral_adj}.', it=['This', 'That', 'The'], nt=['is not', 'isn\'t'], save=True)
t += editor.template('{it} {benot} {a:neutral_adj} {amazon_noun}.', it=['It', 'This', 'That'], benot=['is not',  'isn\'t', 'was not', 'wasn\'t'], save=True)
neg = ['I can\'t say I', 'I don\'t', 'I would never say I', 'I don\'t think I', 'I didn\'t' ]
t += editor.template('{neg} {neutral_verb_present} {the} {amazon_noun}.', neg=neg, the=['this', 'that', 'the'], save=True)
test = MFT(t.data, labels=1, templates=t.templates)
suite.add(test, 'simple negations: not neutral is still neutral', 'Negation', 'Negating neutral statements should still result in neutral predictions')


Different templates:

In [60]:
amazon_noun_it = [x for x in editor.lexicons['amazon_noun'] if x != 'book']  # nouns for the templates below, excluding 'book'
t = editor.template('I thought {it} {amazon_noun} would be {pos_adj}, but it {neg}.', amazon_noun=amazon_noun_it, neg=['was not', 'wasn\'t'], it=['this', 'that', 'the'], nt=['is not', 'isn\'t'], save=True)
t += editor.template('I thought I would {pos_verb_present} {the} {amazon_noun}, but I {neg}.', neg=['did not', 'didn\'t'], the=['this', 'that', 'the'], save=True)
test = MFT(t.data, labels=0, templates=t.templates) # expect to be negative
suite.add(test, 'simple negations: I thought x was positive, but it was not (should be negative)', 'Negation', '', overwrite=True)


In [61]:
t = editor.template('I thought {it} {amazon_noun} would be {neg_adj}, but it {neg}.', amazon_noun=amazon_noun_it, neg=['was not', 'wasn\'t'], it=['this', 'that', 'the'], nt=['is not', 'isn\'t'], save=True)
t += editor.template('I thought I would {neg_verb_present} {the} {amazon_noun}, but I {neg}.', neg=['did not', 'didn\'t'], the=['this', 'that', 'the'], save=True)
# expectation: prediction is not 0 (negative)
test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)
suite.add(test, 'simple negations: I thought x was negative, but it was not (should be neutral or positive)', 'Negation', '')


In [62]:
t = editor.template('I thought {it} {amazon_noun} would be {neutral_adj}, but it {neg}.', amazon_noun=amazon_noun_it, neg=['was not', 'wasn\'t'], it=['this', 'that', 'the'], nt=['is not', 'isn\'t'], save=True)
t += editor.template('I thought I would {neutral_verb_present} {the} {amazon_noun}, but I {neg}.', neg=['did not', 'didn\'t'], the=['this', 'that', 'the'], save=True)
# expectation: prediction is 1 (neutral)
test = MFT(t.data, labels=1, templates=t.templates)
suite.add(test, 'simple negations: but it was not (neutral) should still be neutral', 'Negation', '')


Harder: negation with neutral in the middle

In [63]:
# add the test case: negation with neutral in the middle.
# expect to be negative
# e.g., I don't think, given Its better then a normal bed, that the remale is awesome.
# neutral filler clauses, adapted to the Amazon review data
neutral =['Its better then a normal bed', 'I just moved out of my moms house', ' Its been 2 months since I moved out', 'I\'m still sleeping on it.', ' its like sleeping on a cloud.']
t = editor.template('{neg}, given {neutral}, that {it} {amazon_noun} {be} {pos_adj}.', neutral=neutral, neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], it=['this', 'that', 'the'], be=['is', 'was'], save=True)
t += editor.template('{neg}, given {neutral}, that {it} {be} {a:pos_adj} {amazon_noun}.',neutral=neutral,  neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], it=['this', 'that', 'the'], be=['is', 'was'], save=True)
t += editor.template('{neg}, given {neutral}, that {i} {pos_verb_present} {the} {amazon_noun}.',neutral=neutral,  neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], i=['I', 'we'], the=['this', 'that', 'the'], save=True)
t.data = list(np.random.choice(t.data, 1000, replace=False))
test = MFT(t.data, labels=0, templates=t.templates)
suite.add(test, 'Hard: Negation of positive with neutral stuff in the middle (should be negative)', 'Negation', '')
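
The cell above draws 1000 sentences with `np.random.choice(..., replace=False)`. That call raises a `ValueError` when fewer than 1000 sentences were generated, and without a seed it returns a different subset on every run. A hedged sketch of a seeded, size-safe alternative follows. It uses the standard-library `random` module as a swap-in, not what the notebook uses, and `sample_cases` and its parameters are hypothetical names.

```python
import random

def sample_cases(data, k=1000, seed=0):
    """Return up to k items from data, sampled without replacement, reproducibly."""
    rng = random.Random(seed)   # local generator, so global random state is untouched
    k = min(k, len(data))       # avoid an error when data has fewer than k items
    return rng.sample(list(data), k)

# Made-up stand-in for t.data.
data = [f"sentence {i}" for i in range(50)]
subset = sample_cases(data, k=10, seed=42)
print(len(subset), len(set(subset)))
```

Calling `sample_cases(t.data)` in place of the `np.random.choice` line would keep the generated test cases stable between runs, assuming a fixed seed is acceptable for this suite.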


In [64]:
# add the test case: negation with neutral in the middle.
# e.g., I can't say, given I just moved out of my moms house, that the fantasy is weird.
# expect to be positive or neutral
neutral =['Its better then a normal bed', 'I just moved out of my moms house', ' Its been 2 months since I moved out', 'I\'m still sleeping on it.', ' its like sleeping on a cloud.']
t = editor.template('{neg}, given {neutral}, that {it} {amazon_noun} {be} {neg_adj}.', neutral=neutral, neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], it=['this', 'that', 'the'], be=['is', 'was'], save=True)
t += editor.template('{neg}, given {neutral}, that {it} {be} {a:neg_adj} {amazon_noun}.',neutral=neutral,  neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], it=['this', 'that', 'the'], be=['is', 'was'], save=True)
t += editor.template('{neg}, given {neutral}, that {i} {neg_verb_present} {the} {amazon_noun}.',neutral=neutral,  neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], i=['I', 'we'], the=['this', 'that', 'the'], save=True)
t.data = list(np.random.choice(t.data, 1000, replace=False))
test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)
suite.add(test, 'Hard: Negation of negative with neutral stuff in the middle (should be positive or neutral)', 'Negation', '')


In [65]:
# add the test case: negation with neutral in the middle.
# e.g., I can't say, given  Its been 2 months since I moved out, that this is a private E-book.
# expect to be neutral
neutral =['Its better then a normal bed', 'I just moved out of my moms house', ' Its been 2 months since I moved out', 'I\'m still sleeping on it.', ' its like sleeping on a cloud.']
t = editor.template('{neg}, given {neutral}, that {it} {amazon_noun} {be} {neutral_adj}.', neutral=neutral, neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], it=['this', 'that', 'the'], be=['is', 'was'], save=True)
t += editor.template('{neg}, given {neutral}, that {it} {be} {a:neutral_adj} {amazon_noun}.',neutral=neutral,  neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], it=['this', 'that', 'the'], be=['is', 'was'], save=True)
t += editor.template('{neg}, given {neutral}, that {i} {neutral_verb_present} {the} {amazon_noun}.',neutral=neutral,  neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], i=['I', 'we'], the=['this', 'that', 'the'], save=True)
t.data = list(np.random.choice(t.data, 1000, replace=False))
test = MFT(t.data, labels=1, templates=t.templates)
suite.add(test, 'negation of neutral with neutral in the middle, should still be neutral', 'Negation', '')



## Capability: SRL

My opinion matters more than others' (the subject is changed to `product`)

In [66]:
# add the test case: my opinion is more important than others'
change = [' but', '']
templates = ['Some people think product are {neg_adj},{change} I think product are {pos_adj}.',
             'I think product are {pos_adj},{change} some people think product are {neg_adj}.',
             'I had heard product were {neg_adj},{change} I think product are {pos_adj}.',
             'I think product are {pos_adj},{change} I had heard product were {neg_adj}.',
             ]
t = editor.template(templates, change=change, unroll=True, labels=2, save=True)
templates = ['{others} {neg_verb_present} product,{change} I {pos_verb_present} product.',
             'I {pos_verb_present} product,{change} {others} {neg_verb_present} product.',
            ]
others = ['some people', 'my parents', 'my friends', 'people']
t += editor.template(templates, others=others, change=change, unroll=True, labels=2, save=True)

change = [' but', '']
templates = ['Some people think product are {pos_adj},{change} I think product are {neg_adj}.',
             'I think product are {neg_adj},{change} some people think product are {pos_adj}.',
             'I had heard product were {pos_adj},{change} I think product are {neg_adj}.',
             'I think product are {neg_adj},{change} I had heard product were {pos_adj}.',
             ]
t += editor.template(templates, change=change, unroll=True, labels=0, save=True)
templates = ['{others} {pos_verb_present} product,{change} I {neg_verb_present} product.',
             'I {neg_verb_present} product,{change} {others} {pos_verb_present} product.',
            ]
others = ['some people', 'my parents', 'my friends', 'people']
t += editor.template(templates, others=others, change=change, unroll=True, labels=0, save=True)
test = MFT(**t)
description = '''Have conflicting statements where the author has an opinion and a third party has a contrary opinion.
Expect sentiment to be the author's. Example:
"Some people think product are great, but I think product are terrible" -> should be negative
'''
suite.add(test, 'my opinion is what matters', 'SRL', description)
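
Each template in the cell above is filled with every combination of its placeholder values, so the `{change}` slot produces one contrastive variant (with ` but`) and one plain variant of every sentence. The sketch below is a simplified stand-in for that expansion built on `itertools.product`. It is not CheckList's implementation, `expand` is a hypothetical helper, and the one-word lexicons are made up.

```python
import itertools

def expand(template, **slots):
    """Fill a str.format-style template with every combination of slot values."""
    names = list(slots)
    combos = itertools.product(*(slots[name] for name in names))
    return [template.format(**dict(zip(names, combo))) for combo in combos]

change = [' but', '']
sentences = expand(
    'Some people think product are {neg_adj},{change} I think product are {pos_adj}.',
    neg_adj=['bad'],     # made-up lexicon entry
    pos_adj=['great'],   # made-up lexicon entry
    change=change,
)
for sentence in sentences:
    print(sentence)
# Some people think product are bad, but I think product are great.
# Some people think product are bad, I think product are great.
```

Both variants carry the positive label, since the author's own opinion comes last and is the one that should determine the sentiment.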


Q & A form: yes

In [68]:
# add the q and a test case
# e.g., Do I think that was a commercial animation? Yes

# label positive
t = editor.template('Do I think {it} {amazon_noun} {be} {pos_adj}? Yes', it=['that', 'this', 'the'], be=['is', 'was'], save=True, labels=2)
t += editor.template('Do I think {it} {be} {a:pos_adj} {amazon_noun}? Yes', it=['it', 'this', 'that'], be=['is', 'was'], save=True, labels=2)
t += editor.template('Did {i} {pos_verb_present} {the} {amazon_noun}? Yes', i=['I', 'we'], the=['this', 'that', 'the'], save=True, labels=2)

# label negative
t += editor.template('Do I think {it} {amazon_noun} {be} {neg_adj}? Yes', it=['that', 'this', 'the'], be=['is', 'was'], save=True, labels=0)
t += editor.template('Do I think {it} {be} {a:neg_adj} {amazon_noun}? Yes', it=['it', 'this', 'that'], be=['is', 'was'], save=True, labels=0)
t += editor.template('Did {i} {neg_verb_present} {the} {amazon_noun}? Yes', i=['I', 'we'], the=['this', 'that', 'the'], save=True, labels=0)
test = MFT(**t)
suite.add(test, 'Q & A: yes', 'SRL', 'TODO_DESCRIPTION')


In [69]:
# add more Q & A test cases; these are expected to be neutral
t = editor.template('Do I think {it} {amazon_noun} {be} {neutral_adj}? Yes', it=['that', 'this', 'the'], be=['is', 'was'], save=True)
t += editor.template('Do I think {it} {be} {a:neutral_adj} {amazon_noun}? Yes', it=['it', 'this', 'that'], be=['is', 'was'], save=True)
t += editor.template('Did {i} {neutral_verb_present} {the} {amazon_noun}? Yes', i=['I', 'we'], the=['this', 'that', 'the'], save=True)
test = MFT(t.data, labels=1, templates=t.templates)
suite.add(test, 'Q & A: yes (neutral)', 'SRL', 'TODO_DESCRIPTION')


In [72]:
# label negative
t = editor.template('Do I think {it} {amazon_noun} {be} {pos_adj}? No', it=['that', 'this', 'the'], be=['is', 'was'], save=True, labels=0)
t += editor.template('Do I think {it} {be} {a:pos_adj} {amazon_noun}? No', it=['it', 'this', 'that'], be=['is', 'was'], save=True, labels=0)
t += editor.template('Did {i} {pos_verb_present} {the} {amazon_noun}? No', i=['I', 'we'], the=['this', 'that', 'the'], save=True, labels=0)

# label to be neutral
t += editor.template('Do I think {it} {amazon_noun} {be} {neg_adj}? No', it=['that', 'this', 'the'], be=['is', 'was'], save=True, labels=1)
t += editor.template('Do I think {it} {be} {a:neg_adj} {amazon_noun}? No', it=['it', 'this', 'that'], be=['is', 'was'], save=True, labels=1)
t += editor.template('Did {i} {neg_verb_present} {the} {amazon_noun}? No', i=['I', 'we'], the=['this', 'that', 'the'], save=True, labels=1)

# if the label is 1 (neutral), accept any prediction that is not negative; otherwise the prediction must equal the label
allow_for_neutral = lambda x, pred, _, label, _2 : pred != 0 if label == 1 else pred == label
test = MFT(t.data, Expect.single(allow_for_neutral), labels=t.labels, templates=t.templates)
suite.add(test, 'Q & A: no', 'SRL', 'TODO_DESCRIPTION', overwrite=True)
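
The `allow_for_neutral` expectation above is stricter for negative labels than for neutral ones. A small sketch in plain Python that runs the same lambda over made-up (label, prediction) pairs, with `None` standing in for the two unused arguments:

```python
# Same expectation as in the cell above; the two underscore arguments are unused.
allow_for_neutral = lambda x, pred, _, label, _2: pred != 0 if label == 1 else pred == label

# (label, prediction) pairs; the values are made up for illustration.
cases = [
    (0, 0),  # negative expected, negative predicted -> pass
    (0, 2),  # negative expected, positive predicted -> fail
    (1, 2),  # neutral expected, positive predicted  -> pass (anything but negative)
    (1, 0),  # neutral expected, negative predicted  -> fail
]
outcomes = [allow_for_neutral("text", pred, None, label, None) for label, pred in cases]
print(outcomes)  # [True, False, True, False]
```

Answering "No" to a negative question therefore passes with either a neutral or a positive prediction, while answering "No" to a positive question must be predicted as negative.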


In [73]:
# test cases with a `No` answer to a neutral question
t = editor.template('Do I think {it} {amazon_noun} {be} {neutral_adj}? No', it=['that', 'this', 'the'], be=['is', 'was'], save=True)
t += editor.template('Do I think {it} {be} {a:neutral_adj} {amazon_noun}? No', it=['it', 'this', 'that'], be=['is', 'was'], save=True)
t += editor.template('Did {i} {neutral_verb_present} {the} {amazon_noun}? No', i=['I', 'we'], the=['this', 'that', 'the'], save=True)

# expect to be neutral
test = MFT(t.data, labels=1, templates=t.templates)
suite.add(test, 'Q & A: no (neutral)', 'SRL', 'TODO_DESCRIPTION')


In [74]:
# copy each test's name, description, and capability from the suite metadata onto the test object
for test in suite.tests:
    suite.tests[test].name = test
    suite.tests[test].description = suite.info[test]['description']
    suite.tests[test].capability = suite.info[test]['capability']

In [75]:
path = 'sentiment_suite_dt1.pkl' # define path
suite.save(path) # save suite (test case) to the path