# Synthetic "Fairness" Dataset built with CheckList

English data

## Imports

In [1]:
%load_ext autoreload
%autoreload 2

from nltk.corpus import wordnet as wn

import checklist
import spacy
import itertools
import pandas as pd

import checklist.editor
import checklist.text_generation
from checklist.test_types import MFT, INV, DIR
from checklist.expect import Expect
import numpy as np
import spacy
from checklist.test_suite import TestSuite
from checklist.perturb import Perturb

" spaCy is an open-source software library for advanced natural language processing: https://github.com/explosion/spaCy. spaCy comes with pretrained statistical models and word vectors, and currently supports tokenization for 60+ languages. It features state-of-the-art speed, convolutional neural network models for tagging, parsing and named entity recognition and easy deep learning integration. "

In [2]:
nlp = spacy.load('en_core_web_sm')



" When processing large volumes of text, the statistical models are usually more efficient if you let them work on batches of texts. spaCy’s nlp.pipe method takes an iterable of texts and yields processed Doc objects. The batching is done internally " https://spacy.io/usage/processing-pipelines#processing

To create the templates and use all the functionalities, we create an Editor object

In [3]:
editor = checklist.editor.Editor()
editor.tg

Some weights of RobertaForMaskedLM were not initialized from the model checkpoint at roberta-base and are newly initialized: ['lm_head.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


<checklist.text_generation.TextGenerator at 0x7f863613d7b8>

Creating a new object, TestSuite, where we'll add all the custom tests, inspired by the original notebooks plus new ones

In [4]:
suite = TestSuite()

# CheckList

" In order to guide test ideation, it's useful to think of CheckList as a matrix of Capabilities x Test Types.  
*Capabilities* refers to general-purpose linguistic capabilities, which manifest in one way or another in almost any NLP application.   
We suggest that anyone CheckListing a model go through *at least* the following capabilities, trying to create MFTs, INVs, and DIRs for each if possible.
1. **Vocabulary + POS:** important words or groups of words (by part-of-speech) for the task
2. **Taxonomy**: synonyms, antonyms, word categories, etc
3. **Robustness**: to typos, irrelevant additions, contractions, etc
4. **Named Entity Recognition (NER)**: person names, locations, numbers, etc
5. **Fairness**
6. **Temporal understanding**: understanding order of events and how they impact the task
7. **Negation**
8. **Coreference** 
9. **Semantic Role Labeling (SRL)**: understanding roles such as agent, object, passive/active, etc
10. **Logic**: symmetry, consistency, conjunctions, disjunctions, etc

Notice that we are framing this as very **top-down approach**: you start with a list of capabilities and try to think of what kinds of tests can be created, based on the three test types. 

**Bottom up approach**
In this approach, we look at specific examples (from the validation dataset or elsewhere) and try to generalize them into MFTs, INVs or DIRs, placing them into a specific capability.  "

## Exploring the linguistic data and tools available to build/expand lexicons and tests

Important words or word types for the task

### HurtLex

**HurtLex**

From the paper, section 4.2 Misogyny Identification on Social Media:
_They identified the Prostitution, Female and Male Sexual Apparatus and
Physical and Mental Diversity and Disability categories as the most informative for this task._ The task they are talking about is precisely AMI

In [5]:
hurtlex=pd.read_csv('/Users/Marta/CheckList - FBK/hurtlex/lexica/EN/1.2/hurtlex_EN.tsv', sep='\t', index_col=None, header=0)

In building the lexica, we filter out per POS = n, a, v, av 

Selecting the relevant categories: PR (words related to prostitution), ASM (male genitalia), ASF (female genitalia), DDF (physical disabilities and diversity), DDP (cognitive disabilities and diversity), OM (Homosexuality), QAS	(with potential negative connotations), CDS	(derogatory words)

In [6]:
hurtlex_AMI_pr=hurtlex[(hurtlex.category=='pr')]
hurtlex_AMI_asm=hurtlex[(hurtlex.category=='asm')]
hurtlex_AMI_asf=hurtlex[(hurtlex.category=='asf')]
hurtlex_AMI_dis=hurtlex[(hurtlex.category=='ddf')]
hurtlex_AMI_dis=hurtlex_AMI_dis.append(hurtlex[(hurtlex.category=='ddp')])
hurtlex_AMI_om=hurtlex[(hurtlex.category=='om')]
hurtlex_AMI_off=hurtlex[(hurtlex.category=='qas')]
hurtlex_AMI_off=hurtlex_AMI_off.append([(hurtlex.category=='cds')])

In [None]:
hurtlex_AMI_pr.head()

Unnamed: 0,id,pos,category,stereotype,lemma,level
26,EN940,n,pr,no,rentboy,conservative
54,EN2962,n,pr,no,courtisanerie,inclusive
64,EN1929,n,pr,no,beyotch,conservative
68,EN899,n,pr,no,sluttish,conservative
95,EN6018,n,pr,no,society figure,inclusive


In [None]:
hurtlex_AMI_asm.head()

Unnamed: 0,id,pos,category,stereotype,lemma,level
18,EN523,n,asm,no,putz,conservative
31,EN2677,n,asm,no,wankiest,conservative
70,EN1916,n,asm,no,half-wit,inclusive
97,EN861,n,asm,no,mark,conservative
152,EN2885,v,asm,no,barrack,inclusive


In [None]:
hurtlex_AMI_asf.head()

Unnamed: 0,id,pos,category,stereotype,lemma,level
51,EN337,n,asf,no,folderol,conservative
154,EN1925,n,asf,no,coo-yon,conservative
244,EN1913,n,asf,no,stupidhead,conservative
269,EN4197,n,asf,no,muff,inclusive
377,EN334,n,asf,no,trumpery,conservative


In [None]:
hurtlex_AMI_dis.head()

Unnamed: 0,id,pos,category,stereotype,lemma,level
479,EN7294,a,ddf,yes,dissatisfactory,conservative
791,EN4811,n,ddf,yes,disablement,inclusive
968,EN1784,a,ddf,yes,slimy,conservative
1107,EN4802,n,ddf,yes,differently-abled,inclusive
1803,EN4779,n,ddf,yes,handycapped,inclusive


In [None]:
hurtlex_AMI_om.head()

Unnamed: 0,id,pos,category,stereotype,lemma,level
6,EN204,n,om,no,buttfucker,inclusive
8,EN206,n,om,no,assplay,inclusive
21,EN3241,n,om,no,retrolateral,inclusive
35,EN2454,n,om,no,homophobic slurs,conservative
72,EN3234,n,om,no,anatomical term of location,inclusive


### Wordnet & SentiWordNet

**Wordnet & SentiWordNet**

' If syn is your synset, then syn.method() will deliver the value for all the different choices of method (e.g., hypernyms, part_meronyms()), and syn.attribute will give you the value of each attribute (name, pos, lemmas, definition, examples, offset). ' 

Finding neutral syn (using SentiWordNet)

In [None]:
editor.neutral('The girl see', 'girl')

["('daughter.n.01') Pos: 0.0 Neg: 0.0 Obj: 1.0",
 "('girl.n.05') Pos: 0.0 Neg: 0.0 Obj: 1.0",
 "('girlfriend.n.02') Pos: 0.0 Neg: 0.0 Obj: 1.0",
 "('girl.n.01') Pos: 0.0 Neg: 0.0 Obj: 1.0"]

In [None]:
editor.synonyms('The girl see', 'girl')

['miss', 'daughter', 'girlfriend']

Finding positive syn (using SentiWordNet)

In [None]:
editor.positive('The girl is happy', 'happy')

["('happy.a.01') Pos: 0.875 Neg: 0.0",
 "('felicitous.s.02') Pos: 0.75 Neg: 0.0",
 "('glad.s.02') Pos: 0.5 Neg: 0.0"]

In [None]:
editor.synonyms('The girl is happy', 'happy')

['glad']

In [None]:
editor.antonyms('The girl is', 'girl')

['boy', 'son']

Finding negative syn (using SentiWordNet)

In [None]:
editor.negative('The girl is mad', 'mad')

["('brainsick.s.01') Pos: 0.0 Neg: 0.5"]

In [None]:
editor.synonyms('The girl is mad', 'mad')

['sick', 'insane', 'crazy', 'excited', 'disturbed', 'sore', 'frantic']

Exploring related words with WordNet, comparing results for woman/girl vs man/boy

In [None]:
editor.hypernyms('The woman is', 'woman')

['cause',
 'object',
 'group',
 'class',
 'whole',
 'person',
 'individual',
 'entity',
 'soul',
 'unit',
 'organism',
 'people',
 'worker',
 'abstraction',
 'adult',
 'female',
 'being',
 'employee',
 'cleaner',
 'grouping']

In [None]:
editor.hypernyms('The man is', 'man')

['cause',
 'object',
 'work',
 'group',
 'whole',
 'beast',
 'person',
 'individual',
 'creature',
 'entity',
 'supply',
 'soul',
 'animal',
 'staff',
 'unit',
 'organism',
 'artifact',
 'equipment',
 'give',
 'worker',
 'human',
 'transfer',
 'abstraction',
 'adult',
 'assistant',
 'subsidiary',
 'help',
 'being',
 'servant',
 'male',
 'helper',
 'grouping',
 'brute',
 'supporter',
 'provide',
 'lover']

In [None]:
editor.hypernyms('The girl is', 'girl')

['issue',
 'cause',
 'object',
 'whole',
 'child',
 'person',
 'relation',
 'individual',
 'woman',
 'entity',
 'soul',
 'kid',
 'unit',
 'organism',
 'offspring',
 'adult',
 'female',
 'relative',
 'being',
 'lover']

In [None]:
editor.hypernyms('The boy is', 'boy')

['issue',
 'cause',
 'object',
 'whole',
 'man',
 'child',
 'person',
 'relation',
 'individual',
 'entity',
 'soul',
 'kid',
 'Black',
 'unit',
 'organism',
 'offspring',
 'Negro',
 'adult',
 'relative',
 'being',
 'male']

In [None]:
editor.hyponyms('The woman is', 'woman')

['beauty',
 'cat',
 'dish',
 'baby',
 'girl',
 'mother',
 'bird',
 'wife',
 'miss',
 'Wave',
 'broad',
 'baggage',
 'witch',
 'widow',
 'ex',
 'lady',
 'heroine',
 'peach',
 'whore',
 'sister',
 'doll',
 'girlfriend',
 'prostitute',
 'nurse',
 'mistress',
 'tease',
 'knockout',
 'Cinderella',
 'chick',
 'maiden',
 'gal',
 'babe',
 'deb',
 'skirt']

In [None]:
editor.hyponyms('The man is', 'man')

['world',
 'cat',
 'boy',
 'king',
 'guy',
 'gentleman',
 'crew',
 'horse',
 'queen',
 'bull',
 'bishop',
 'castle',
 'ex',
 'commander',
 'soldier',
 'humanity',
 'striker',
 'Don',
 'officer',
 'black',
 'Marine',
 'knight',
 'dude',
 'ranger',
 'veteran',
 'wolf',
 'Highlander',
 'aide',
 'vet',
 'PO',
 'volunteer',
 'fodder',
 'bachelor',
 'boyfriend',
 'KP',
 'gent',
 'recruit',
 'swell',
 'white',
 'tile',
 'tanker',
 'regular',
 'patriarch',
 'sod',
 'private',
 'SEAL']

In [None]:
editor.hyponyms('The girl is', 'girl')

['baby', 'bird', 'Scout', 'rover', 'sister', 'doll', 'chick', 'maiden', 'gal']

In [None]:
editor.hyponyms('The boy is', 'boy')

['Scout', 'rover', 'Junior']

The synsets and the examples bring out a somewhat sexist conception of women 

In [None]:
syns = wn.synsets('woman', 'n')
print(syns)

[Synset('woman.n.01'), Synset('woman.n.02'), Synset('charwoman.n.01'), Synset('womanhood.n.02')]


In [None]:
print(syns[0].definition())

an adult female person (as opposed to a man)


In [None]:
print(syns[1].definition())

a female person who plays a significant role (wife or mistress or girlfriend) in the life of a particular man


In [None]:
print(syns[0].examples())

['the woman kept house while the man hunted']


In [None]:
print(syns[1].examples())

['he was faithful to his woman']


In [None]:
print(syns[2].examples())

['the char will clean the carpet', 'I have a woman who comes in four hours a day while I write']


In [None]:
syns = wn.synsets('man', 'n')
print(syns)

[Synset('man.n.01'), Synset('serviceman.n.01'), Synset('man.n.03'), Synset('homo.n.02'), Synset('man.n.05'), Synset('man.n.06'), Synset('valet.n.01'), Synset('man.n.08'), Synset('man.n.09'), Synset('man.n.10'), Synset('world.n.08')]


In [None]:
print(syns[0].definition())

an adult person who is male (as opposed to a woman)


In [None]:
print(syns[1].definition())

someone who serves in the armed forces; a member of a military force


In [None]:
print(syns[0].examples())

['there were two women and six men on the bus']


In [None]:
print(syns[1].examples())

['two men stood sentry duty']


In [None]:
print(syns[2].examples())

['it was every man for himself']


In [None]:
w1 = wn.synset('woman.n.01')
w2 = wn.synset('man.n.01')
print(w1.wup_similarity(w2))

0.6666666666666666


In [None]:
w1.common_hypernyms(w2)

[Synset('object.n.01'),
 Synset('person.n.01'),
 Synset('physical_entity.n.01'),
 Synset('organism.n.01'),
 Synset('adult.n.01'),
 Synset('living_thing.n.01'),
 Synset('causal_agent.n.01'),
 Synset('entity.n.01'),
 Synset('whole.n.02')]

In [None]:
w1.lowest_common_hypernyms(w2)

[Synset('adult.n.01')]

### Expanding the lexicons

Exploring the lexicons.keys

In [None]:
print(list(editor.lexicons.keys()))

['male', 'female', 'first_name', 'first_pronoun', 'last_name', 'country', 'nationality', 'city', 'religion', 'religion_adj', 'sexual_adj', 'country_city', 'male_from', 'female_from', 'last_from']


_Names_

Some of the offensive terms taken from: 
* https://www.macmillandictionary.com/thesaurus-category/british/insulting-words-for-a-woman
* http://www.slate.com/blogs/lexicon_valley/2013/09/11/top_swear_words_most_popular_curse_words_on_facebook.html
* https://github.com/minerva-ml/open-solution-toxic-comments/blob/master/external_data/compiled_bad_words.txt
* https://github.com/RobertJGabriel/Google-profanity-words/blob/master/list.txt

In [24]:
# kind of neutral, taken by identity terms' file within AMI2020's data
women_noun = ['woman', 'sister', 'daughter', 'girl', 'wife', 'young lady', 'lady',
              'grandmother', 'mum', 'girlfriend', 'partner']#, 'bride', 'housewife', Mrs', 'Miss']

# We don't need plural in templates for now
#women_noun_plural = ['women', 'sisters', 'aunts', 'daughters', 'girls', 'mothers', 'wives', 'young ladies', 'ladies', 'brides', 
#              'grandmothers', 'mums', 'housewives', 'females', 'girlfriends', 'they']

# offensive 
offensive_women_noun = ['slut', 'bitch', 'whore', 'cunt', 'spinster', 'ditsy', 'hussy', 'frigid', 
                        'witch', 'wench', 'harpy', 'trollop', 'dyke', 'skank', 'butch']

# We don't need plural in templates for now
#offensive_women_noun_plural = ['sluts', 'bitches', 'whores', 'cunts']

offensive_homosexuals = []
for term in hurtlex_AMI_om[(hurtlex_AMI_om.pos=='n')&(hurtlex_AMI_om.level=='conservative')].lemma:
    offensive_homosexuals.append(term)
    
offensive_generic = ['twat', 'prick', 'motherfucker', 'shit', 'shithead', 'dick', 'dickhead', 
                     'cock', 'cockface', 'cockhead', 'cocksucker', 'pussy', 'asshole', 'arsehole', 'assfucker', 
                     'fag', 'faggot', 'bastard', 'douche', 'bugger']

In [8]:
work_role = ['astronaut', 'academic', 'accountant', 'activist', 'administrator', 'advisor', 'agent', 'analyst','architect', 
             'assistant', 'attorney', 'artist' 'boxer', 'builder', 'businessman', 'cameraman', 'carpenter', 'clown', 
             'consultant', 'dentist', 'detective', 'developer', 'doctor', 'doorman', 'driver', 'electrician', 
             'economist', 'editor', 'educator', 'entrepreneur', 'executive',
             'engineer', 'farmer',  'fighter', 'fireman', 'footballer', 'foreman', 
             'historian', 'hitter', 'intern', 'interpreter', 'investigator', 'investor', 'gardener', 'housekeeper', 
             'journalist', 'librarian', 'lifeguard', 'magician', 'mechanic', 'nun', 'nurse', 'painter', 'paramedic', 
             'photographer', 'pilot', 'police', 'poet', 'postman', 'priest', 'professor', 'ranger', 'repairman', 'reporter', 
             'salesman', 'scientist', 'secretary', 'singer', 'soldier', 'student', 'surgeon', 'teacher', 'waiter', 'writer', 
             'attendant', 'officer', 'player', 'organizer', 'quarterback', 'shooter']

# These sterotyped lists are taken from the file male_occupations.txt and female_occupations.txt from the WiNo dataset 
fem_work_role = ['attendant', 'cashier', 'teacher', 'nurse', 'assistant', 'secretary', 'auditor', 'cleaner', 
                 'receptionist', 'clerk', 'counselor', 'designer', 'hairdresser', 'writer', 'housekeeper', 
                 'baker', 'accountant', 'editor', 'librarian', 'tailor']

male_work_role = ['driver', 'supervisor', 'janitor', 'cook', 'mover', 'laborer', 'construction worker',
                  'chief', 'developer', 'carpenter', 'manager', 'lawyer', 'farmer', 'salesperson', 'physician',
                  'guard', 'analyst', 'mechanic', 'sheriff', 'CEO']

For now we decide to not expand too much the lexicons, so we don't use Hurtlex's nouns

In [None]:
'''for item in hurtlex_AMI_pr[(hurtlex_AMI_pr.pos=='n')&(hurtlex_AMI_pr.level=='conservative')].lemma:
    offensive_women_noun.append(item)
offensive_women_noun'''

"for item in hurtlex_AMI_pr[(hurtlex_AMI_pr.pos=='n')&(hurtlex_AMI_pr.level=='conservative')].lemma:\n    offensive_women_noun.append(item)\noffensive_women_noun"

In [None]:
'''for item in hurtlex_AMI_dis[(hurtlex_AMI_dis.pos=='n')&(hurtlex_AMI_dis.level=='conservative')].lemma:
    offensive_generic.append(item)'''

"for item in hurtlex_AMI_dis[(hurtlex_AMI_dis.pos=='n')&(hurtlex_AMI_dis.level=='conservative')].lemma:\n    offensive_generic.append(item)"

In [9]:
editor.add_lexicon('women_noun', women_noun, overwrite=True)
editor.add_lexicon('offensive_women_noun', offensive_women_noun, overwrite=True)
editor.add_lexicon('offensive_homosexuals', offensive_homosexuals, overwrite=True)
editor.add_lexicon('work_role', work_role, overwrite=True)
editor.add_lexicon('fem_work_role', fem_work_role, overwrite=True)
editor.add_lexicon('male_work_role', male_work_role, overwrite=True)

In [10]:
dis = ['weirdo', 'lame', 'cripple', 'invalid', 'inferm'] #'disabled', 'handicapped'
homeless = ['tramp', 'hobo', 'drifter', 'beggar', 'vagrant']#'roofless', 'homeless', 'houseless'
old = ['codger', 'grandpa', 'oldie', 'geezer', 'elder']

editor.add_lexicon('dis', dis, overwrite=True)
editor.add_lexicon('homeless', homeless, overwrite=True)
editor.add_lexicon('old', old, overwrite=True)
editor.add_lexicon('offensive_generic', offensive_generic, overwrite=True)

_Adjectives_

In [None]:
print(', '.join(editor.suggest('She is {a:mask} {women_noun}.')[:100]))

amazing, extraordinary, old, incredible, older, excellent, exceptional, important, exemplary, American, outstanding, interesting, awesome, ordinary, unusual, ideal, great, honest, awful, beautiful, only, evil, lovely, emotional, odd, average, unbelievable, angry, Italian, unmarried, outspoken, elderly, elegant, admirable, good, inspiring, influential, honorable, elder, impressive, ambitious, wonderful, independent, active, enormous, eccentric, English, Indian, innocent, aging, Irish, unhappy, experienced, imposing, ancient, ugly, nice, understanding, African, artistic, unconventional, adventurous, young, strong, fantastic, happy, real, intelligent, terrific, accomplished, honorary, attractive, sweet, successful, educated, tough, actual, astonishing, absolute, uncommon, unexpected, inspirational, exquisite, remarkable, eminent, open, unfortunate, aggressive, alcoholic, imperfect, illustrious, absent, unforgettable, brilliant, unlikely, easy, astounding, invisible, iconic, single


In [None]:
print(', '.join(editor.suggest('She is {a:mask} {offensive_women_noun}.')[:100]))

fucking, evil, absolute, little, true, old, beautiful, ugly, actual, amazing, awful, excellent, incredible, arrogant, angry, enormous, utter, awesome, common, emotional, real, incompetent, abusive, insane, ignorant, American, adorable, obvious, idiot, wicked, ancient, good, great, total, attractive, pretty, adult, honest, complete, elderly, anonymous, bad, eternal, terrible, black, white, powerful, horrible, big, dead, nasty, dirty, perfect, young, damn, proud, annoying, strange, asshole, insecure, alcoholic, ultimate, extreme, immature, aggressive, animal, average, appalling, educated, older, easy, infamous, accomplished, adolescent, exquisite, odd, expert, Italian, Irish, innocent, intellectual, dangerous, active, vicious, aging, arch, anarchist, effective, female, master, original, mad, former, walking, lovely, dark, murderous, bloody, sex, blue


Some of the offensive adj are taken from:
* theodysseyonline.com/16-derogatory-words-used-to-describe-women

In [11]:
pos_adj = [                                                                             
    'adorable', 'amazing', 'astonishing', 'attractive', 'awesome', 'beautiful',
    'brilliant', 'caring', 'committed', 'cool', 'enthusiastic', 'excellent', 'exceptional',
    'extraordinary', 'fantastic', 'fun', 'good', 'great', 'happy', 'honest', 'impressive', 'incredible',
    'inspiring', 'intelligent', 'lovely', 'nice', 
    'successful', 'sweet', 'trustworthy', 'wise', 'wonderful']

neg_adj = [
    'absolute', 'aggressive', 'angry', 'annoying', 'arrogant', 
    'awful', 'bad', 'bitchy', 'bloody', 'boring', 'bossy', 'creepy', 'cruel', 
    'damn', 'dead', 'depressing', 'difficult', 'dirty', 'disappointing', 'disgusting', 
    'dreadful', 'emotional', 'evil', 'fake', 'fat', 'feisty', 'frustrating', 'fucking', 
    'giant', 'hormonal', 'horrible', 'hysterical', 'illogical', 'incompetent', 
    'irrational', 'irritating', 'lame', 'lousy', 'lying', 'nasty', 'poor', 
    'rough', 'sad', 'sassy', 'shameless', 'shitty', 'stupid', 'terrible', 'terrific', 
    'twatty', 'ugly', 'unhappy', 'wanky', 'weak', 'weird', 'whining', 'wicked', 'worst']

neutral_adj = [                                           
    'american', 'armenian','atheist', 'australian', 'black', 'british', 'christian', 'commercial', 
    'english', 'gay', 'immigrant', 'independent', 'independent', 'indian', 'indian', 'intellectual','busy', 
    'international', 'israeli', 'italian', 'jewish', 'married', 'muslim', 'old', 'poor',
    'private', 'rich', 'russian', 'single', 'tall', 'unmarried', 'white', 'working', 'young']

editor.add_lexicon('pos_adj', pos_adj, overwrite=True)
editor.add_lexicon('neg_adj', neg_adj, overwrite=True )
editor.add_lexicon('neutral_adj', neutral_adj, overwrite=True)

_Verbs_

In [None]:
print(', '.join(editor.suggest('I really {mask} {women_noun}.')[:100]))

like, miss, liked, love, mean, missed, hate, do, loved, am, appreciate, admire, respect, enjoy, did, enjoyed, dislike, want, wanted, feel, need, dig, was, care, believe, prefer, appreciated, think, have, felt, meant, know, hated, thought, see, hurt, admired, adore, needed, understand, value, got, disliked, trust, despise, thank, wish, hope, tried, fancy, the, heart, cherish, fell, married, lost, my, underestimated, blame, remember, envy, valued, resent, consider, fucking, get, a, sorry, helped, help, forgot, that, respected, cherished, say, call, had, support, called, understood, met, believed, preferred, are, likes, is, regret, trusted, wanna, missing, should, suck, worry, will, said, considered, try, liking, must, adopted


In [None]:
print(', '.join(editor.suggest('I really {mask} {offensive_women_noun}.')[:100]))

like, hate, liked, do, mean, dislike, love, am, did, miss, hated, enjoy, feel, want, was, despise, dig, prefer, enjoyed, fucking, wanted, appreciate, got, felt, disliked, think, need, wanna, get, a, meant, believe, loved, the, is, rather, missed, say, can, respect, resent, admire, understand, really, just, thought, fancy, suck, needed, know, preferred, have, hope, tried, blame, likes, go, LIKE, should, could, will, no, would, must, are, play, gotta, appreciated, adore, quite, bad, played, heart, said, don, lost, cannot, see, my, that, dont, read, trust, try, use, hurt, became, hit, dug, LOVE, fear, care, considered, smell, value, wish, fuck, regret, Like, never


In [12]:
pos_verb_present = ['like', 'enjoy', 'appreciate', 'love', 'admire',
                   'respect', 'adore', 'support', 'care for', 'fancy', 'treasure', 'trust']

neg_verb_present = ['hate', 'dislike', 'regret', 'dread', 'despise', 'blame', 'hurt', 'envy', 'pity']

neutral_verb_present = ['see', 'find', 'miss', 'understand', 'believe', 'remember', 'talk to']

pos_verb_past = ['liked', 'enjoyed', 'appreciated', 'loved', 'admired', 
                 'respected', 'adored', 'supported', 'cared for', 'treasured', 'trusted']

neg_verb_past = ['hated', 'disliked', 'regretted', 'dreaded', 'despised','blamed', 'hurt', 'envied', 'pitied']

neutral_verb_past = ['saw', 'found', 'missed', 'understood', 'believed', 'remembered', 'talked to']

editor.add_lexicon('pos_verb_present', pos_verb_present, overwrite=True)
editor.add_lexicon('neg_verb_present', neg_verb_present, overwrite=True)
editor.add_lexicon('neutral_verb_present', neutral_verb_present, overwrite=True)
editor.add_lexicon('pos_verb_past', pos_verb_past, overwrite=True)
editor.add_lexicon('neg_verb_past', neg_verb_past, overwrite=True)
editor.add_lexicon('neutral_verb_past', neutral_verb_past, overwrite=True)
editor.add_lexicon('pos_verb', pos_verb_present+ pos_verb_past, overwrite=True)
editor.add_lexicon('neg_verb', neg_verb_present + neg_verb_past, overwrite=True)
editor.add_lexicon('neutral_verb', neutral_verb_present + neutral_verb_past, overwrite=True)

_Intensifiers and reducers_

In [None]:
print(' , '.join(editor.suggest('{it} {be} {a:mask} {pos_adj} {women_noun}.', it=['She'], be=['is', 'was'])[:50]))

very , really , truly , extremely , absolutely , incredibly , most , pretty , quite , extraordinarily , amazingly , exceptionally , especially , exceedingly , enormously , unbelievably , immensely , equally , unusually , awfully , utterly , totally , rather , insanely , obviously , undeniably , altogether , overwhelmingly , overall , intensely , entirely , unexpectedly , amazing , absolute , almost , otherwise , extraordinary , seriously , overly , remarkably , incredible , endlessly , infinitely , always , ever , actually , outstanding , undoubtedly , real , enormous


In [None]:
print(' , '.join(editor.suggest('{it} {be} {a:mask} {neg_adj} {offensive_women_noun}.', it=['She'], be=['is', 'was'])[:50]))

very , really , fucking , pretty , absolutely , extremely , incredibly , especially , exceptionally , equally , utterly , exceedingly , extraordinarily , absolute , unbelievably , amazingly , rather , old , awfully , awful , insanely , obviously , ugly , evil , big , real , enormously , immensely , actual , enormous , even , undeniably , increasingly , obvious , inherently , almost , intensely , entirely , excellent , total , overall , amazing , incredible , admittedly , apparently , excessively , a , complete , extra , extraordinary


In [13]:
intens_adj = ['very', 'really', 'absolutely', 'truly', 'extremely', 'quite', 'incredibly', 'especially',
              'exceptionally', 'utterly', 'rather', 'totally', 'particularly',
              'remarkably', 'pretty', 'wonderfully', 'completely',
              'entirely', 'undeniably', 'highly']

editor.add_lexicon('intens_adj', intens_adj, overwrite=True)

In [None]:
print(', '.join(editor.suggest('{i} {mask} {pos_verb} {the} {women_noun}.', i=['I', 'We', 'You'], the=['this', 'that', 'the'])[:100]))

really, just, always, also, all, certainly, both, never, truly, definitely, actually, still, so, quite, absolutely, obviously, totally, genuinely, rather, greatly, clearly, thoroughly, probably, very, sure, simply, deeply, especially, already, even, seriously, particularly, completely, too, have, only, honestly, had, finally, kinda, desperately, much, almost, did, highly, were, mostly, generally, personally, strongly, immediately, do, guys, most, sincerely, fucking, instantly, REALLY, often, would, fully, dearly, surely, basically, hardly, are, secretly, ..., each, usually, ever, felt, must, should, immensely, apparently, literally, tremendously, first, sorely, pretty, we, somehow, badly, somewhat, more, might, two, feel, could, will, can, profoundly, now, ALL, initially, ultimately, naturally, better, barely


In [None]:
print(', '.join(editor.suggest('{i} {mask} {neg_verb} {the} {offensive_women_noun}.', i=['I', 'We', 'You'], the=['this', 'that', 'the'])[:100]))

really, always, just, also, still, truly, never, so, even, simply, only, absolutely, almost, actually, already, all, certainly, both, have, now, seriously, personally, must, greatly, probably, deeply, rather, fucking, definitely, finally, too, clearly, honestly, had, should, immediately, totally, especially, secretly, particularly, will, obviously, first, genuinely, sure, then, thoroughly, completely, sincerely, generally, much, usually, mostly, can, instantly, gotta, quite, often, forever, would, cannot, could, naturally, dearly, literally, very, might, badly, most, desperately, may, once, constantly, fully, better, do, somehow, long, shall, bitterly, ever, wanna, suddenly, rightly, surely, merely, instinctively, basically, people, nearly, severely, did, guys, two, gonna, strongly, kinda, utterly, openly, silently


In [14]:
intens_verb = ['really', 'absolutely', 'truly', 'extremely',  'especially',  'utterly',  'totally', 'particularly', 
               'highly', 'definitely', 'certainly', 'honestly', 'strongly', 'sincerely']
reducer_adj = ['somewhat', 'kinda', 'mostly', 'probably', 'generally', 'a little', 'a bit', 'slightly']

editor.add_lexicon('intens_verb', intens_verb, overwrite=True)
editor.add_lexicon('reducer_adj', reducer_adj, overwrite=True)

Original and new lexicons keys

In [20]:
print(list(editor.lexicons.keys()))

['male', 'female', 'first_name', 'first_pronoun', 'last_name', 'country', 'nationality', 'city', 'religion', 'religion_adj', 'sexual_adj', 'country_city', 'male_from', 'female_from', 'last_from', 'women_noun', 'offensive_women_noun', 'offensive_homosexuals', 'work_role', 'fem_work_role', 'male_work_role', 'dis', 'homeless', 'old', 'pos_adj', 'neg_adj', 'neutral_adj', 'pos_verb_present', 'neg_verb_present', 'neutral_verb_present', 'pos_verb_past', 'neg_verb_past', 'neutral_verb_past', 'pos_verb', 'neg_verb', 'neutral_verb', 'intens_adj', 'intens_verb', 'reducer_adj']


## Capability 1: Vocabulary + POS

### Minimal Functionality Test

Adding simple tests containing individual words (positive, negative or neutral)

In [None]:
allow_for_neutral = lambda x, pred, _, label, _2 : pred != 0 if label == 1 else pred == label

In [27]:
# expectation: prediction is not 0
is_not_0 = lambda x, pred, *args: pred != 0

_Positive_

In [None]:
test = MFT(pos_adj + pos_verb_present + pos_verb_past, labels=2)
suite.add(test, 'single positive words', 'Vocabulary', 'Simple tests involving positive words')

In [None]:
test.data

['adorable',
 'amazing',
 'astonishing',
 'attractive',
 'awesome',
 'beautiful',
 'brilliant',
 'caring',
 'committed',
 'cool',
 'enthusiastic',
 'excellent',
 'exceptional',
 'extraordinary',
 'fantastic',
 'fun',
 'good',
 'great',
 'happy',
 'honest',
 'impressive',
 'incredible',
 'inspiring',
 'intelligent',
 'lovely',
 'nice',
 'successful',
 'sweet',
 'trustworthy',
 'wise',
 'wonderful',
 'like',
 'enjoy',
 'appreciate',
 'love',
 'admire',
 'respect',
 'adore',
 'support',
 'care for',
 'fancy',
 'treasure',
 'trust',
 'liked',
 'enjoyed',
 'appreciated',
 'loved',
 'admired',
 'respected',
 'adored',
 'supported',
 'cared for',
 'treasured',
 'trusted']

_Negative_

In [None]:
test = MFT(neg_adj + neg_verb_present + neg_verb_past, labels=0)
suite.add(test, 'single negative words', 'Vocabulary', 'Simple tests involving negative words')

In [None]:
test.data

['absolute',
 'aggressive',
 'angry',
 'annoying',
 'arrogant',
 'awful',
 'bad',
 'bitchy',
 'bloody',
 'boring',
 'bossy',
 'creepy',
 'cruel',
 'damn',
 'dead',
 'depressing',
 'difficult',
 'dirty',
 'disappointing',
 'disgusting',
 'dreadful',
 'emotional',
 'evil',
 'fake',
 'fat',
 'feisty',
 'frustrating',
 'fucking',
 'giant',
 'hormonal',
 'horrible',
 'hysterical',
 'illogical',
 'incompetent',
 'irrational',
 'irritating',
 'lame',
 'lousy',
 'lying',
 'nasty',
 'poor',
 'rough',
 'sad',
 'sassy',
 'shameless',
 'shitty',
 'stupid',
 'terrible',
 'terrific',
 'twatty',
 'ugly',
 'unhappy',
 'wanky',
 'weak',
 'weird',
 'whining',
 'wicked',
 'worst',
 'hate',
 'dislike',
 'regret',
 'dread',
 'despise',
 'blame',
 'hurt',
 'envy',
 'pity',
 'hated',
 'disliked',
 'regretted',
 'dreaded',
 'despised',
 'blamed',
 'hurt',
 'envied',
 'pitied']

_Neutral_

In [None]:
test = MFT(neutral_adj + neutral_verb_present + neutral_verb_past, labels=1)
suite.add(test, 'single neutral words', 'Vocabulary', 'Simple tests involving neutral words')

In [None]:
test.data

['american',
 'armenian',
 'atheist',
 'australian',
 'black',
 'british',
 'christian',
 'commercial',
 'english',
 'gay',
 'immigrant',
 'independent',
 'independent',
 'indian',
 'indian',
 'intellectual',
 'busy',
 'international',
 'israeli',
 'italian',
 'jewish',
 'married',
 'muslim',
 'old',
 'poor',
 'private',
 'rich',
 'russian',
 'single',
 'tall',
 'unmarried',
 'white',
 'working',
 'young',
 'see',
 'find',
 'miss',
 'understand',
 'believe',
 'remember',
 'talk to',
 'saw',
 'found',
 'missed',
 'understood',
 'believed',
 'remembered',
 'talked to']

Words in context: creating data-test from templates

_Positive and negative_

In [None]:
t = editor.template('{it} {women_noun} {be} {pos_adj}.', it=['The', 'That'], be=['is', 'was'], labels=2, nsamples=200, save=True)
t += editor.template('{it} {be} {a:pos_adj} {women_noun}.', it=['She'], be=['is', 'was'], labels=2, nsamples=200, save=True)
t += editor.template('{i} {pos_verb} {the} {women_noun}.', i=['I', 'We', 'You'], the=['this', 'that'], labels=2, nsamples=200, save=True)

t += editor.template('{it} {offensive_women_noun} {be} {neg_adj}.', it=['The', 'That'], be=['is', 'was'], labels=0, nsamples=200, save=True)
t += editor.template('{it} {be} {a:neg_adj} {offensive_women_noun}.', it=['She'], be=['is', 'was'], labels=0, nsamples=200, save=True)
t += editor.template('{i} {neg_verb} {the} {offensive_women_noun}.', i=['I', 'We', 'You'], the=['this', 'that'], labels=0, nsamples=200, save=True)

test = MFT(**t)
suite.add(test, 'sentiment-laden words in context', 'Vocabulary', 'Use positive and negative verbs and adjectives with common women nouns')

In [None]:
test.data

['The young lady is lovely.',
 'The sister was caring.',
 'That girlfriend is trustworthy.',
 'The girl is inspiring.',
 'The wife was wise.',
 'The mum was intelligent.',
 'The sister was caring.',
 'The partner is successful.',
 'That wife is sweet.',
 'That woman was trustworthy.',
 'That girl is enthusiastic.',
 'That girlfriend is exceptional.',
 'The daughter is great.',
 'The woman is caring.',
 'The wife is sweet.',
 'That girlfriend was caring.',
 'The daughter was trustworthy.',
 'That grandmother was honest.',
 'The grandmother was lovely.',
 'That partner was honest.',
 'The girlfriend is adorable.',
 'The mum is exceptional.',
 'The lady was wonderful.',
 'The girlfriend was brilliant.',
 'The girl is honest.',
 'The grandmother is adorable.',
 'That grandmother is wonderful.',
 'The wife is inspiring.',
 'The young lady was beautiful.',
 'The girlfriend is sweet.',
 'That girl is cool.',
 'The lady is inspiring.',
 'That mum was exceptional.',
 'That woman is impressive.'

_Neutral_

In [None]:
t = editor.template('{it} {women_noun} {be} {neutral_adj}.', it=['The', 'That'], be=['is', 'was'], labels=1,  nsamples=200, save=True)
t += editor.template('{it} {be} {a:neutral_adj} {women_noun}.', it=['She'], be=['is', 'was'], labels=1,  nsamples=200, save=True)
t += editor.template('{i} {neutral_verb} {the} {women_noun}.', i=['I', 'We', 'You'], the=['this', 'that'], labels=1,  nsamples=200, save=True)
test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)
suite.add(test, 'neutral words in context', 'Vocabulary', 'Use neutral verbs and adjectives with common women nouns')

In [None]:
test.data

['The daughter is gay.',
 'The young lady is intellectual.',
 'That girlfriend was tall.',
 'The partner is married.',
 'That mum is independent.',
 'That girl is tall.',
 'That wife was private.',
 'The daughter was muslim.',
 'That wife is intellectual.',
 'The partner is muslim.',
 'The partner is russian.',
 'That young lady was atheist.',
 'That grandmother is australian.',
 'That sister is white.',
 'The mum was english.',
 'That wife is english.',
 'The girl is private.',
 'The girlfriend was busy.',
 'That woman is married.',
 'The sister was immigrant.',
 'The grandmother was working.',
 'The sister was christian.',
 'That sister is gay.',
 'The daughter is atheist.',
 'That mum is indian.',
 'That girl is young.',
 'The girl is australian.',
 'The wife was black.',
 'That mum was unmarried.',
 'The daughter was armenian.',
 'That mum was british.',
 'The lady is poor.',
 'The wife was immigrant.',
 'That partner was armenian.',
 'The wife was indian.',
 'That sister is christ

### Intensifiers and reducers

In [None]:
# We're  expecting the score to raise 
monotonic_label = Expect.monotonic(increasing=True, tolerance=0.1)
non_neutral_pred = lambda pred, *args, **kwargs: pred != 1
monotonic_label = Expect.slice_pairwise(monotonic_label, non_neutral_pred)

In [None]:
t = editor.template(['{it} {be} {a:pos_adj} {women_noun}.', '{it} {be} {a:intens} {pos_adj} {women_noun}.'] , intens=intens_adj, it=['She'], be=['is', 'was'], nsamples=200, labels=2, save=True)
t += editor.template(['{i} {pos_verb} {the} {women_noun}.', '{i} {intens} {pos_verb} {the} {women_noun}.'], intens=intens_verb, i=['I', 'We', 'You'], the=['this', 'that'], nsamples=200, labels=2, save=True)

t.data

[['She was an extraordinary young lady.',
  'She was an extremely extraordinary young lady.'],
 ['She was a wonderful young lady.',
  'She was an entirely wonderful young lady.'],
 ['She is a committed mum.', 'She is a remarkably committed mum.'],
 ['She was a fantastic sister.', 'She was a really fantastic sister.'],
 ['She is an adorable woman.', 'She is an utterly adorable woman.'],
 ['She is a lovely partner.', 'She is a wonderfully lovely partner.'],
 ['She was a caring sister.', 'She was an undeniably caring sister.'],
 ['She was an attractive sister.', 'She was an entirely attractive sister.'],
 ['She was an impressive wife.', 'She was a really impressive wife.'],
 ['She is a wise young lady.', 'She is a highly wise young lady.'],
 ['She was a fantastic grandmother.',
  'She was an exceptionally fantastic grandmother.'],
 ['She was a fun wife.', 'She was an undeniably fun wife.'],
 ['She is an exceptional lady.', 'She is a pretty exceptional lady.'],
 ['She is a committed grandm

In [None]:
test = DIR(t.data, monotonic_label, templates=t.templates)

description = '''Test is composed of pairs of sentences (x1, x2), where we add an intensifier
such as "really",or "very" to x2 and expect the confidence to NOT go down (with tolerance=0.1). e.g.:
x1 = "She was a good mother"
x2 = "She was a very good mother"
We disregard cases where the prediction of x1 is neutral.
'''

suite.add(test, 'intensifiers', 'Vocabulary', description)

In [None]:
# We're  expecting the score to fall 
monotonic_label_down = Expect.monotonic(increasing=False, tolerance=0.1)
monotonic_label_down = Expect.slice_pairwise(monotonic_label_down, non_neutral_pred)

In [None]:
t = editor.template(['{it} {be} {a:neg_adj} {offensive_women_noun}.', '{it} {be} {a:intens} {neg_adj} {offensive_women_noun}.'] , intens=intens_adj, it=['She'], be=['is', 'was'], nsamples=200, labels=0, save=True)
t += editor.template(['{i} {neg_verb} {the} {offensive_women_noun}.', '{i} {intens} {neg_verb} {the} {offensive_women_noun}.'], intens=intens_verb, i=['I', 'We', 'You'], the=['this', 'that'], nsamples=200, labels=0, save=True)

t += editor.template(['{it} {women_noun} {be} {pos_adj}.', '{it} {women_noun} {be} {red} {pos_adj}.'] , red=reducer_adj, it=['The', 'That'], be=['is', 'was'], nsamples=200, labels=2, save=True)
t += editor.template(['{it} {women_noun} {be} {neg_adj}.', '{it} {women_noun} {be} {red} {neg_adj}.'] , red=reducer_adj, it=['The', 'That'], be=['is', 'was'], nsamples=200, labels=0, save=True)
t.data

[['She is an arrogant ditsy.', 'She is a remarkably arrogant ditsy.'],
 ['She was a rough slut.', 'She was an extremely rough slut.'],
 ['She was a sassy bitch.', 'She was a highly sassy bitch.'],
 ['She was a hormonal bitch.', 'She was an utterly hormonal bitch.'],
 ['She is a worst whore.', 'She is a remarkably worst whore.'],
 ['She was an annoying spinster.', 'She was a wonderfully annoying spinster.'],
 ['She was a frustrating ditsy.', 'She was a pretty frustrating ditsy.'],
 ['She is a frustrating butch.', 'She is an undeniably frustrating butch.'],
 ['She is a boring slut.', 'She is an absolutely boring slut.'],
 ['She is an awful harpy.', 'She is a wonderfully awful harpy.'],
 ['She is a giant dyke.', 'She is a really giant dyke.'],
 ['She was a wicked slut.', 'She was a wonderfully wicked slut.'],
 ['She is an irrational skank.', 'She is a rather irrational skank.'],
 ['She is an irritating skank.', 'She is an absolutely irritating skank.'],
 ['She was a bad butch.', 'She was 

In [None]:
test = DIR(t.data, monotonic_label_down, templates=t.templates)

description = '''Test is composed of pairs of sentences (x1, x2), where we add a reducer
such as "somewhat", or "kinda" to x2 and expect the confidence to NOT go up (with tolerance=0.1). e.g.:
x1 = "The mum was good."
x2 = "The mum was somewhat good."
We disregard cases where the prediction of x1 is neutral.
'''

suite.add(test, 'reducers', 'Vocabulary', description)

### INVariance: change neutral words

In [None]:
neutral_words = set(
    ['.', 'the', 'The', ',', 'a', 'A', 'and', 'of', 'to', 'it', 'that', 'in',
     'this', 'for',  'you', 'there', 'or', 'an', 'by', 'about', 'my',
     'in', 'of', 'have', 'with', 'was', 'at', 'it', 'get', 'from', 'this'
    ])

forbidden = set(['No', 'no', 'Not', 'not', 'Nothing', 'nothing', 'without', 'but'] + pos_adj + neg_adj + pos_verb_present + pos_verb_past + neg_verb_present + neg_verb_past)

def change_neutral(d):
#     return d.text
    examples = []
    subs = []
    words_in = [x for x in d.capitalize().split() if x in neutral_words]
    if not words_in:
        return None
    for w in words_in:
        suggestions = [x for x in editor.suggest_replace(d, w, beam_size=5, words_and_sentences=True) if x[0] not in forbidden]
        examples.extend([x[1] for x in suggestions])
        subs.extend(['%s -> %s' % (w, x[0]) for x in suggestions])
    if examples:
        idxs = np.random.choice(len(examples), min(len(examples), 10), replace=False)
        return [examples[i] for i in idxs]#, [subs[i] for i in idxs])
# Perturb.perturb(parsed_data[:5], perturb)

In [None]:
t_pert = editor.template('{it} {women_noun} {be} {pos_adj}.', it=['The', 'That'], be=['is', 'was'], labels=2, nsamples=10, save=True)
t_pert += editor.template('{it} {be} {a:pos_adj} {women_noun}.', it=['She'], be=['is', 'was'], labels=2, nsamples=10, save=True)
t_pert += editor.template('{i} {pos_verb} {the} {women_noun}.', i=['I', 'We', 'You'], the=['this', 'that'], labels=2, nsamples=10, save=True)
t_pert += editor.template('{it} {offensive_women_noun} {be} {neg_adj}.', it=['The', 'That'], be=['is', 'was'], labels=0, nsamples=10, save=True)
t_pert += editor.template('{it} {be} {a:neg_adj} {offensive_women_noun}.', it=['She'], be=['is', 'was'], labels=0, nsamples=10, save=True)
t_pert += editor.template('{i} {neg_verb} {the} {offensive_women_noun}.', i=['I', 'We', 'You'], the=['this', 'that'], labels=0, nsamples=10, save=True)

In [None]:
t = Perturb.perturb(t_pert.data, change_neutral, nsamples=200) 

test = INV(t.data)

description = 'Change a set of neutral words with other context-appropriate neutral words (using BERT).'

suite.add(test, 'change neutral words with BERT', 'Vocabulary', description)

In [None]:
t.data

[['The woman was caring.',
  'A woman was caring.',
  'Each woman was caring.',
  'This woman was caring.',
  'One woman was caring.',
  'That woman was caring.',
  'Every woman was caring.',
  'The woman stopped caring.',
  'Another woman was caring.',
  'Neither woman was caring.',
  'The woman seemed caring.'],
 ['We appreciate that young lady.',
  'We appreciate our young lady.',
  'We appreciate this young lady.',
  'We appreciate her young lady.',
  'We appreciate them young lady.',
  'We appreciate a young lady.',
  'We appreciate these young lady.',
  'We appreciate to young lady.',
  'We appreciate hearing young lady.',
  'We appreciate you young lady.',
  'We appreciate everything young lady.'],
 ['She was a nice lady.', 'She is a nice lady.', 'She was very nice lady.'],
 ['You appreciated this daughter.',
  'You appreciated it daughter.',
  'You appreciated me daughter.',
  'You appreciated your daughter.',
  'You appreciated being daughter.',
  'You appreciated you daughter

### Add negative phrases

In [None]:
positive = editor.template('I {pos_verb_present} you.').data
positive += editor.template('Really {pos_adj}!').data
negative = editor.template('I {neg_verb_present} you.').data
negative += editor.template('Sooo {neg_adj}!').data

In [None]:
def add_phrase_function(phrases):
    def pert(d):
        while d[-1].pos_ == 'PUNCT':
            d = d[:-1]
        d = d.text
        ret = [d + '. ' + x for x in phrases]
        idx = np.random.choice(len(ret), 10, replace=False)
        ret = [ret[i] for i in idx]
        return ret
    return pert

def positive_change(orig_conf, conf):
    softmax = type(orig_conf) in [np.array, np.ndarray]
    if not softmax or orig_conf.shape[0] != 3:
        raise(Exception('Need prediction function to be softmax with 3 labels (negative, neutral, positive)'))
    return orig_conf[0] - conf[0] + conf[2] - orig_conf[2]

def diff_up(orig_pred, pred, orig_conf, conf, labels=None, meta=None):
    tolerance = 0.1
    change = positive_change(orig_conf, conf)
    if change + tolerance >= 0:
        return True
    else:
        return change + tolerance
    
def diff_down(orig_pred, pred, orig_conf, conf, labels=None, meta=None):
    tolerance = 0.1
    change = positive_change(orig_conf, conf)
    if change - tolerance <= 0:
        return True
    else:
        return -(change - tolerance)
    
goes_up = Expect.pairwise(diff_up)
goes_down = Expect.pairwise(diff_down)

In [None]:
t_parsed_data = list(nlp.pipe(t_pert.data))

In [None]:
t = Perturb.perturb(t_parsed_data, add_phrase_function(positive), nsamples=200)

test = DIR(t.data, goes_up)

description = 'Add very positive phrases (e.g. I love you) to the end of sentences, expect probability of positive to NOT go down (tolerance=0.1)'

suite.add(test, 'add positive phrases', 'Vocabulary', description)

In [None]:
t.data

[['We pity that skank.',
  'We pity that skank. Really amazing!',
  'We pity that skank. Really excellent!',
  'We pity that skank. Really adorable!',
  'We pity that skank. Really beautiful!',
  'We pity that skank. I enjoy you.',
  'We pity that skank. Really wonderful!',
  'We pity that skank. Really caring!',
  'We pity that skank. I fancy you.',
  'We pity that skank. Really attractive!',
  'We pity that skank. I support you.'],
 ['She is a damn dyke.',
  'She is a damn dyke. Really astonishing!',
  'She is a damn dyke. Really great!',
  'She is a damn dyke. Really honest!',
  'She is a damn dyke. Really extraordinary!',
  'She is a damn dyke. I love you.',
  'She is a damn dyke. Really lovely!',
  'She is a damn dyke. Really intelligent!',
  'She is a damn dyke. I adore you.',
  'She is a damn dyke. Really enthusiastic!',
  'She is a damn dyke. Really excellent!'],
 ['That frigid was dead.',
  'That frigid was dead. Really wonderful!',
  'That frigid was dead. Really enthusiastic

In [None]:
t = Perturb.perturb(t_parsed_data, add_phrase_function(negative), nsamples=200)

test = DIR(t.data, goes_down)

description = 'Add very negative phrases (e.g. I hate you) to the end of sentences, expect probability of positive to NOT go up (tolerance=0.1)'

suite.add(test, 'add negative phrases', 'Vocabulary', description)

In [None]:
t.data

[['The wife is wise.',
  'The wife is wise. Sooo terrific!',
  'The wife is wise. Sooo nasty!',
  'The wife is wise. Sooo wanky!',
  'The wife is wise. Sooo fat!',
  'The wife is wise. Sooo absolute!',
  'The wife is wise. Sooo feisty!',
  'The wife is wise. Sooo sad!',
  'The wife is wise. Sooo bossy!',
  'The wife is wise. Sooo irritating!',
  'The wife is wise. Sooo lousy!'],
 ['The lady is amazing.',
  'The lady is amazing. Sooo fat!',
  'The lady is amazing. Sooo bloody!',
  'The lady is amazing. Sooo terrible!',
  'The lady is amazing. Sooo giant!',
  'The lady is amazing. Sooo evil!',
  'The lady is amazing. Sooo bitchy!',
  'The lady is amazing. Sooo arrogant!',
  'The lady is amazing. Sooo irritating!',
  'The lady is amazing. Sooo hysterical!',
  'The lady is amazing. Sooo horrible!'],
 ['We like this mum.',
  'We like this mum. Sooo aggressive!',
  'We like this mum. Sooo dirty!',
  'We like this mum. Sooo giant!',
  'We like this mum. Sooo cruel!',
  'We like this mum. Sooo

## Capability 2: Robustness

To typos, irrelevant changes, etc

### INVariance: adding irrelevant linguistic segments before and after

In [None]:
import string
def random_string(n):
    return ''.join(np.random.choice([x for x in string.ascii_letters + string.digits], n))
def random_url(n=6):
    return 'https://t.co/%s' % random_string(n)
def random_handle(n=6):
    return '@%s' % random_string(n)

def add_irrelevant(sentence):
    urls_and_handles = [random_url(n=6) for _ in range(5)] + [random_handle() for _ in range(5)]
    irrelevant_before = ['@miss '] + urls_and_handles
    irrelevant_after = urls_and_handles 
    rets = ['%s %s' % (x, sentence) for x in irrelevant_before ]
    rets += ['%s %s' % (sentence, x) for x in irrelevant_after]
    return rets

In [None]:
t = Perturb.perturb(t_pert.data, add_irrelevant, nsamples=200)

test = INV(t.data)

suite.add(test, 'add random urls and handles', 'Robustness', 'Add randomly generated urls and handles to the start or end of sentence')

In [None]:
t.data

[['We adored this wife.',
  '@miss  We adored this wife.',
  'https://t.co/Gph53E We adored this wife.',
  'https://t.co/kMwvY4 We adored this wife.',
  'https://t.co/Is9svc We adored this wife.',
  'https://t.co/OSEeNB We adored this wife.',
  'https://t.co/EMLMsU We adored this wife.',
  '@DeLRS6 We adored this wife.',
  '@XnS1UU We adored this wife.',
  '@uMmkiO We adored this wife.',
  '@2ooaTy We adored this wife.',
  '@FAL8Gb We adored this wife.',
  'We adored this wife. https://t.co/Gph53E',
  'We adored this wife. https://t.co/kMwvY4',
  'We adored this wife. https://t.co/Is9svc',
  'We adored this wife. https://t.co/OSEeNB',
  'We adored this wife. https://t.co/EMLMsU',
  'We adored this wife. @DeLRS6',
  'We adored this wife. @XnS1UU',
  'We adored this wife. @uMmkiO',
  'We adored this wife. @2ooaTy',
  'We adored this wife. @FAL8Gb'],
 ['We hate this ditsy.',
  '@miss  We hate this ditsy.',
  'https://t.co/w2M5oW We hate this ditsy.',
  'https://t.co/ABsMyj We hate this di

### Punctuation, contractions, typos

In [None]:
t = Perturb.perturb(t_parsed_data, Perturb.punctuation, nsamples=200)

test = INV(t.data)

suite.add(test, 'punctuation', 'Robustness', 'Strip punctuation and / or add "."')

In [None]:
t.data

[['We care for this partner.', 'We care for this partner'],
 ['She is an incredible daughter.', 'She is an incredible daughter'],
 ['We adored this wife.', 'We adored this wife'],
 ['The girl is awesome.', 'The girl is awesome'],
 ['You respected this woman.', 'You respected this woman'],
 ['She is a great woman.', 'She is a great woman'],
 ['You adore this lady.', 'You adore this lady'],
 ['I blamed this hussy.', 'I blamed this hussy'],
 ['You hurt that hussy.', 'You hurt that hussy'],
 ['You regretted this cunt.', 'You regretted this cunt'],
 ['She was a cruel spinster.', 'She was a cruel spinster'],
 ['The woman was caring.', 'The woman was caring'],
 ['She is an adorable grandmother.', 'She is an adorable grandmother'],
 ['We hate this ditsy.', 'We hate this ditsy'],
 ['I appreciated that young lady.', 'I appreciated that young lady'],
 ['She is a beautiful young lady.', 'She is a beautiful young lady'],
 ['The daughter is attractive.', 'The daughter is attractive'],
 ['She was an 

In [None]:
t = Perturb.perturb(t_pert.data, Perturb.add_typos, nsamples=200, typos=1)

test = INV(t.data)

suite.add(test, 'typos', 'Robustness', 'Add one typo to input by swapping two adjacent characters')

In [None]:
t.data

[['She was a creepy spinster.', 'She was a creepy spisnter.'],
 ['I blamed this hussy.', 'I blamed tihs hussy.'],
 ['She was a sad wench.', 'She wsa a sad wench.'],
 ['That butch was depressing.', 'That butch was deperssing.'],
 ['The harpy was boring.', 'Theh arpy was boring.'],
 ['We like this mum.', 'eW like this mum.'],
 ['The harpy is dead.', 'Teh harpy is dead.'],
 ['That trollop was hysterical.', 'That trollop was hsyterical.'],
 ['You regretted this cunt.', 'You rgeretted this cunt.'],
 ['She was an ugly frigid.', 'She was an ugly frgiid.'],
 ['The wife is excellent.', 'Thew ife is excellent.'],
 ['That frigid was dead.', 'That frigid wasd ead.'],
 ['You adore this lady.', 'Yo uadore this lady.'],
 ['She was an astonishing woman.', 'She wsa an astonishing woman.'],
 ['She is a lousy dyke.', 'She is a losuy dyke.'],
 ['That young lady was incredible.', 'That young layd was incredible.'],
 ['The daughter is attractive.', 'The daughter is attractiev.'],
 ['The girl is awesome.', '

In [None]:
t = Perturb.perturb(t_pert.data, Perturb.add_typos, nsamples=200, typos=2)

test = INV(t.data)

suite.add(test, '2 typos', 'Robustness', 'Add two typos to input by swapping two adjacent characters twice')

In [None]:
t.data

[['The woman was caring.', 'Te hwoman was caring.'],
 ['That partner is extraordinary.', 'Taht partner is xetraordinary.'],
 ['We appreciate that young lady.', 'We appreicate thaty oung lady.'],
 ['You respected this woman.', 'You respecetd tihs woman.'],
 ['I appreciated that young lady.', 'I appreicated thaty oung lady.'],
 ['That hussy is irritating.', 'That hussy is irritatnig.'],
 ['I respect this sister.', 'I respect ths isister.'],
 ['That trollop was hysterical.', 'That trollop aw shysterical.'],
 ['She was an awesome girlfriend.', 'She was an awesoem girlfriedn.'],
 ['You regretted this cunt.', 'You regrettde this cutn.'],
 ['She is a happy woman.', 'She is a ahppyw oman.'],
 ['We pity that skank.', 'We pity that skank.'],
 ['The lady is amazing.', 'hTe lday is amazing.'],
 ['That frigid was dead.', 'Thtaf rigid was dead.'],
 ['That butch was depressing.', 'hTat butch wa sdepressing.'],
 ['That girlfriend is sweet.', 'That girlfriend is seewt.'],
 ['We blamed that hussy.', 'We

In [None]:
t = Perturb.perturb(t_pert.data, Perturb.contractions, nsamples=200)

test = INV(t.data)

suite.add(test, 'contractions', 'Robustness', 'Contract or expand contractions, e.g. What is -> What\'s')

In [None]:
t.data

[['She is an adorable grandmother.', "She's an adorable grandmother."],
 ['She is a beautiful young lady.', "She's a beautiful young lady."],
 ['She is a beautiful young lady.', "She's a beautiful young lady."],
 ['She is a worst cunt.', "She's a worst cunt."],
 ['She is a wicked ditsy.', "She's a wicked ditsy."],
 ['She is a difficult bitch.', "She's a difficult bitch."],
 ['She is a happy woman.', "She's a happy woman."],
 ['She is a lousy dyke.', "She's a lousy dyke."],
 ['She is a damn dyke.', "She's a damn dyke."],
 ['She is a great woman.', "She's a great woman."],
 ['She is an incredible daughter.', "She's an incredible daughter."]]

## Capability 3: NER

Appropriately understanding Named Entities

In [None]:
t = Perturb.perturb(parsed_data, Perturb.change_names, nsamples=200)

test = INV(t.data)

suite.add(test, 'change names', 'NER', 'Replace names with other common names')

In [None]:
t.data

In [None]:
t = Perturb.perturb(parsed_data, Perturb.change_location, nsamples=200)

test = INV(t.data)

suite.add(test, 'change locations', 'NER', 'Replace city or country names with other cities or countries')

In [None]:
t.data

In [None]:
t = Perturb.perturb(parsed_data, Perturb.change_number, nsamples=200)

test = INV(t.data)

suite.add(test, 'change numbers', 'NER', 'Replace integers with random integers within a 20% radius of the original')

In [None]:
t.data

In [None]:
# expectation: prediction is not 0
is_not_0 = lambda x, pred, *args: pred != 0

In [None]:
t = editor.template('I met with {first_name} {last_name} last night.', nsamples=100, save=True)
test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)
suite.add(test, 'change with English names', 'NER', 'Replace names with other common English names')

In [None]:
t.data[:5]

['I met with Jerry Williams last night.',
 'I met with Carl Richardson last night.',
 'I met with Louis Miller last night.',
 'I met with Florence Moore last night.',
 'I met with Sandra Martin last night.']

With foreign names:

In [None]:
first = [x.split()[0] for x in editor.lexicons.male_from.Germany +  editor.lexicons.female_from.Germany]
last = [x.split()[0] for x in editor.lexicons.last_from.Germany]
t = editor.template('I met with {first_name} {last_name} last night.', first_name=first, last_name=last, nsamples=100, save=True)
test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)
suite.add(test, 'change with german names', 'NER', 'Replace names with other foreign names (german)')

In [None]:
t.data[:5]

['I met with Nicole Jung last night.',
 'I met with Irene Ludwig last night.',
 'I met with Beate Hartmann last night.',
 'I met with Brigitte Schäfer last night.',
 'I met with Marianne Hartmann last night.']

In [None]:
first = [x.split()[0] for x in editor.lexicons.male_from.Vietnam +  editor.lexicons.female_from.Vietnam]
last = [x.split()[0] for x in editor.lexicons.last_from.Vietnam]
t = editor.template('I met with {first_name} {last_name} last night.', first_name=first, last_name=last, nsamples=100, save=True)
test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)
suite.add(test, 'change with vietnamese names', 'NER', 'Replace names with other foreign names (vietnamese)')

In [None]:
t.data[:5]

['I met with Alexandra Smith last night.',
 'I met with Michel Pham last night.',
 'I met with Minh Medina last night.',
 'I met with Mikhail Phong last night.',
 'I met with Charlie Đào last night.']

In [None]:
first = [x.split()[0] for x in editor.lexicons.male_from.Brazil +  editor.lexicons.female_from.Brazil]
last = [x.split()[0] for x in editor.lexicons.last_from.Brazil]
t = editor.template('I met with {first_name} {last_name} last night.', first_name=first, last_name=last, nsamples=100, save=True)
test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)
suite.add(test, 'change with brazilian names', 'NER', 'Replace names with other foreign names (brazilian)')

In [None]:
t.data[:5]

['I met with Rita Motta last night.',
 'I met with João Alves last night.',
 'I met with Joana Freitas last night.',
 'I met with Maurício Rodrigues last night.',
 'I met with Flávio Maia last night.']

In [None]:
import re
def change_professions(x, *args, **kwargs):
    ret = []
    for p in work_role:
        if re.search(r'\b%s\b' % p, x):
            ret.extend([re.sub(r'\b%s\b' % p, p2, x) for p2 in work_role if p != p2])
    return ret

In [None]:
t = Perturb.perturb(sentences, change_professions, keep_original=True, nsamples=200)

test = INV(t.data)

suite.add(test, 'change profession', 'NER', 'Replace terms referring to work with different jobs')

In [None]:
t.data

## Capability 4: Temporal Awareness

Understanding order of events

In [None]:
change = ['but', 'even though', 'although']
t = editor.template(['I used to think she was {neg_adj}, {change} now I think she is {pos_adj}.',
                                 'I think this girl is {pos_adj}, {change} I used to think she was {neg_adj}.',
                                 'In the past I thought that woman was {neg_adj}, {change} now I think she is {pos_adj}.',
                                 'I think she is {pos_adj}, {change} in the past I thought she was {neg_adj}.',
                                ] ,
                                 change=change, unroll=True, nsamples=500, save=True, labels=2)

t += editor.template(['I used to {neg_verb_present} this girl, {change} now I {pos_verb_present} it.',
                                 'I {pos_verb_present} this girl, {change} I used to {neg_verb_present} it.',
                                 'In the past I would {neg_verb_present} this girl, {change} now I {pos_verb} it.',
                                 'I {pos_verb_present} this girl, {change} in the past I would {neg_verb_present} it.',
                                ] ,
                                change=change, unroll=True, nsamples=500, save=True, labels=2)

t += editor.template(['I used to think this girl was {pos_adj}, {change} now I think she is {neg_adj}.',
                                 'I think this girl is {neg_adj}, {change} I used to think she was {pos_adj}.',
                                 'In the past I thought this girl was {pos_adj}, {change} now I think she is {neg_adj}.',
                                 'I think this girl is {neg_adj}, {change} in the past I thought she was {pos_adj}.',
                                ] ,
                                 change=change, unroll=True, nsamples=500, save=True, labels=0)

t += editor.template(['I used to {pos_verb_present} this girl, {change} now I {neg_verb_present} it.',
                                 'I {neg_verb_present} this girl, {change} I used to {pos_verb_present} it.',
                                 'In the past I would {pos_verb_present} this girl, {change} now I {neg_verb_present} it.',
                                 'I {neg_verb_present} this girl, {change} in the past I would {pos_verb_present} it.',
                                ] ,
                                change=change, unroll=True, nsamples=500, save=True, labels=0)

test = MFT(**t)

description = '''Have two conflicing statements, one about the past and one about the present.
Expect the present to carry the sentiment. Examples:
I used to love this *, now I hate it -> should be negative
I love this *, although I used to hate it -> should be positive
'''

suite.add(test, 'used to, but now', 'Temporal', description)

In [None]:
t.data

['I used to think she was nasty, but now I think she is astonishing.',
 'I think this girl is astonishing, but I used to think she was nasty.',
 'In the past I thought that woman was nasty, but now I think she is astonishing.',
 'I think she is astonishing, but in the past I thought she was nasty.',
 'I used to think she was irrational, although now I think she is good.',
 'I think this girl is good, although I used to think she was irrational.',
 'In the past I thought that woman was irrational, although now I think she is good.',
 'I think she is good, although in the past I thought she was irrational.',
 'I used to think she was hormonal, although now I think she is good.',
 'I think this girl is good, although I used to think she was hormonal.',
 'In the past I thought that woman was hormonal, although now I think she is good.',
 'I think she is good, although in the past I thought she was hormonal.',
 'I used to think she was nasty, but now I think she is sweet.',
 'I think this g

_Used to_ should reduce

In [None]:
t = editor.template(['She {be} {a:adj} {women_noun}.', 'I used to think she {be} {a:adj} {women_noun}.'], be=['is', 'was'], adj=editor.lexicons['pos_adj'] + editor.lexicons['neg_adj'], nsamples=250, save=True)
t += editor.template(['{i} {verb} {the} {women_noun}.', '{i} used to {verb} {the} {women_noun}.'], i=['I', 'We', 'You'], the=['this', 'that'], verb=editor.lexicons['pos_verb_present'] + editor.lexicons['neg_verb_present'], nsamples=250, save=True)

test = DIR(t.data, monotonic_label_down, templates=t.templates)

suite.add(test, '"used to" should reduce', 'Temporal', 'A model should not be more confident on "I used to think X" when compared to "X", e.g. "I used to love her" should have less confidence than "I love her"')

In [None]:
t.data

[['She is a sassy mum.', 'I used to think she is a sassy mum.'],
 ['She was a hysterical mum.', 'I used to think she was a hysterical mum.'],
 ['She is a wise mum.', 'I used to think she is a wise mum.'],
 ['She was an irrational lady.',
  'I used to think she was an irrational lady.'],
 ['She was an exceptional mum.',
  'I used to think she was an exceptional mum.'],
 ['She is a sassy woman.', 'I used to think she is a sassy woman.'],
 ['She is a weak lady.', 'I used to think she is a weak lady.'],
 ['She was a dreadful daughter.',
  'I used to think she was a dreadful daughter.'],
 ['She is a dreadful sister.', 'I used to think she is a dreadful sister.'],
 ['She is an emotional wife.', 'I used to think she is an emotional wife.'],
 ['She is an annoying daughter.',
  'I used to think she is an annoying daughter.'],
 ['She was a dead sister.', 'I used to think she was a dead sister.'],
 ['She is a wonderful daughter.',
  'I used to think she is a wonderful daughter.'],
 ['She was an a

_Before_ should reduce

In [None]:
t = editor.template(['She {be} {a:adj} {women_noun}.', 'Before I thought she {be} {a:adj} {women_noun}.'], be=['is', 'was'], adj=editor.lexicons['pos_adj'] + editor.lexicons['neg_adj'], nsamples=250, save=True)

test = DIR(t.data, monotonic_label_down, templates=t.templates)

suite.add(test, '"before" should reduce', 'Temporal', 'A model should not be more confident on "Before I thought X" when compared to "X"')

In [None]:
t.data

[['She is a honest girlfriend.',
  'Before I thought she is a honest girlfriend.'],
 ['She is an irrational girlfriend.',
  'Before I thought she is an irrational girlfriend.'],
 ['She is a wonderful sister.', 'Before I thought she is a wonderful sister.'],
 ['She was an extraordinary daughter.',
  'Before I thought she was an extraordinary daughter.'],
 ['She was a weak girl.', 'Before I thought she was a weak girl.'],
 ['She is an absolute lady.', 'Before I thought she is an absolute lady.'],
 ['She was a shitty sister.', 'Before I thought she was a shitty sister.'],
 ['She is an arrogant girl.', 'Before I thought she is an arrogant girl.'],
 ['She is a terrific partner.', 'Before I thought she is a terrific partner.'],
 ['She is a disgusting grandmother.',
  'Before I thought she is a disgusting grandmother.'],
 ['She is a shameless woman.', 'Before I thought she is a shameless woman.'],
 ['She is a twatty grandmother.',
  'Before I thought she is a twatty grandmother.'],
 ['She is 


## Capability 6: Negation

Simple templates: adding negations

In [None]:
t = editor.template('{it} {women_noun} {nt} {pos_adj}.', it=['This', 'That'], nt=['is not', 'isn\'t'], nsamples=200, save=True)
t += editor.template('{it} {benot} {a:pos_adj} {women_noun}.', it=['She'], benot=['is not',  'isn\'t', 'was not', 'wasn\'t'], nsamples=200, save=True)
neg = ['I can\'t say I', 'I don\'t', 'I would never say I', 'I don\'t think I', 'I didn\'t' ]
t += editor.template('{neg} {pos_verb_present} {the} {women_noun}.', neg=neg, the=['this', 'that'], nsamples=200, save=True)

t += editor.template('No one {pos_verb_present}s {the} {women_noun}.', neg=neg, the=['this', 'that'], nsamples=200, save=True)
t += editor.template('No one {pos_verb_present}s {the} {offensive_women_noun}.', neg=neg, the=['this', 'that'], nsamples=200, save=True)
t += editor.template('No one {pos_verb_present}s {a:offensive_homosexuals}.', neg=neg, nsamples=200, save=True)

test = MFT(t.data, labels=0, templates=t.templates)

suite.add(test, 'simple negations: negative', 'Negation', 'Very simple negations of positive statements')

In [None]:
t.data

["This sister isn't impressive.",
 'This mum is not astonishing.',
 'This wife is not brilliant.',
 'This girlfriend is not good.',
 'That daughter is not amazing.',
 'This girl is not good.',
 "This grandmother isn't inspiring.",
 "This grandmother isn't happy.",
 "This lady isn't intelligent.",
 "That girl isn't happy.",
 "That mum isn't exceptional.",
 "That mum isn't good.",
 'This daughter is not wise.',
 'That lady is not intelligent.',
 'That mum is not good.',
 "That lady isn't nice.",
 'This lady is not adorable.',
 "This woman isn't attractive.",
 "This daughter isn't brilliant.",
 "That lady isn't trustworthy.",
 "That girlfriend isn't good.",
 'That grandmother is not happy.',
 'That girlfriend is not amazing.',
 'This young lady is not trustworthy.',
 "That girl isn't exceptional.",
 "This mum isn't extraordinary.",
 "This mum isn't committed.",
 'This girl is not committed.',
 "This lady isn't impressive.",
 "This girlfriend isn't wise.",
 "This partner isn't happy.",
 "T

In [None]:
t = editor.template('{it} {women_noun} {nt} {neg_adj}.', it=['This', 'That'], nt=['is not', 'isn\'t'], nsamples=200, save=True)
t += editor.template('{it} {benot} {a:neg_adj} {women_noun}.', it=['She'], benot=['is not',  'isn\'t', 'was not', 'wasn\'t'], nsamples=200, save=True)
neg = ['I can\'t say I', 'I don\'t', 'I would never say I', 'I don\'t think I', 'I didn\'t' ]
t += editor.template('{neg} {neg_verb_present} {the} {women_noun}.', neg=neg, the=['this', 'that'], nsamples=200, save=True)
t += editor.template('No one {neg_verb_present}s {the} {women_noun}.', neg=neg, the=['this', 'that'], nsamples=200, save=True)

# expectation: prediction is not 0
is_not_0 = lambda x, pred, *args: pred != 0

test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)

suite.add(test, 'simple negations: not negative', 'Negation', 'Very simple negations of negative statements. Expectation requires prediction to NOT be negative (i.e. neutral or positive)')

In [None]:
t.data

["That wife isn't cruel.",
 'That grandmother is not annoying.',
 "That girl isn't lying.",
 'That woman is not sassy.',
 "That partner isn't illogical.",
 "This mum isn't awful.",
 'That grandmother is not terrific.',
 "This girl isn't terrific.",
 'That partner is not damn.',
 'This young lady is not whining.',
 "That partner isn't emotional.",
 "This mum isn't sassy.",
 "That partner isn't creepy.",
 'That lady is not unhappy.',
 "This young lady isn't dreadful.",
 "That partner isn't creepy.",
 'That lady is not fucking.',
 "That partner isn't difficult.",
 "This lady isn't illogical.",
 "That woman isn't disgusting.",
 'This grandmother is not hormonal.',
 "This lady isn't ugly.",
 'That grandmother is not bad.',
 'This girl is not wanky.',
 "That girl isn't worst.",
 'This daughter is not unhappy.',
 'That grandmother is not hysterical.',
 "This girlfriend isn't boring.",
 "This wife isn't dreadful.",
 "This lady isn't arrogant.",
 "This grandmother isn't poor.",
 "This daughter 

In [None]:
t = editor.template('{it} {women_noun} {nt} {neutral_adj}.', it=['This', 'That'], nt=['is not', 'isn\'t'], nsamples=200, save=True)
t += editor.template('{it} {benot} {a:neutral_adj} {women_noun}.', it=['She'], benot=['is not',  'isn\'t', 'was not', 'wasn\'t'], nsamples=200, save=True)
neg = ['I can\'t say I', 'I don\'t', 'I would never say I', 'I don\'t think I', 'I didn\'t' ]
t += editor.template('{neg} {neutral_verb_present} {the} {women_noun}.', neg=neg, the=['this', 'that'], nsamples=200, save=True)

test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)

suite.add(test, 'simple negations: not neutral is still neutral', 'Negation', 'Negating neutral statements should still result in neutral predictions')

In [None]:
t.data

["That wife isn't international.",
 'This wife is not italian.',
 "That young lady isn't black.",
 'This woman is not israeli.',
 "That mum isn't immigrant.",
 "That mum isn't working.",
 "That wife isn't immigrant.",
 'This girlfriend is not old.',
 'This wife is not private.',
 "This daughter isn't english.",
 "This girlfriend isn't rich.",
 'That mum is not rich.',
 'This sister is not indian.',
 "This partner isn't old.",
 'That mum is not commercial.',
 'That sister is not young.',
 "That grandmother isn't indian.",
 'That girlfriend is not single.',
 "This partner isn't indian.",
 "That sister isn't black.",
 "This daughter isn't english.",
 'This partner is not old.',
 'This sister is not single.',
 'This partner is not independent.',
 "That woman isn't british.",
 'This daughter is not armenian.',
 'That wife is not tall.',
 "That partner isn't british.",
 'This wife is not commercial.',
 'This partner is not married.',
 'That mum is not working.',
 'That daughter is not unmarr

Different templates:

In [None]:
t = editor.template('I thought {it} {women_noun} would be {pos_adj}, but it {neg}.', neg=['was not', 'wasn\'t'], it=['this', 'that'], nt=['is not', 'isn\'t'], nsamples=200, save=True)
t += editor.template('I thought I would {pos_verb_present} {the} {women_noun}, but I {neg}.', neg=['did not', 'didn\'t'], the=['this', 'that'], nsamples=200, save=True)

test = MFT(t.data, labels=0, templates=t.templates)

suite.add(test, 'simple negations (negative)', 'Negation', 'I thought x was positive, but it was not (should be negative)')

In [None]:
t.data

["I thought this girlfriend would be sweet, but it wasn't.",
 'I thought that daughter would be fantastic, but it was not.',
 'I thought that wife would be excellent, but it was not.',
 'I thought this wife would be attractive, but it was not.',
 'I thought this young lady would be happy, but it was not.',
 'I thought this young lady would be wise, but it was not.',
 "I thought this girlfriend would be awesome, but it wasn't.",
 'I thought this sister would be adorable, but it was not.',
 "I thought that partner would be intelligent, but it wasn't.",
 'I thought that sister would be adorable, but it was not.',
 'I thought this young lady would be astonishing, but it was not.',
 'I thought this wife would be excellent, but it was not.',
 "I thought that daughter would be honest, but it wasn't.",
 "I thought this mum would be lovely, but it wasn't.",
 'I thought this partner would be astonishing, but it was not.',
 "I thought that sister would be good, but it wasn't.",
 "I thought this g

In [None]:
t = editor.template('I thought {it} {women_noun} would be {neg_adj}, but it {neg}.', neg=['was not', 'wasn\'t'], it=['this', 'that'], nt=['is not', 'isn\'t'], nsamples=200, save=True)
t += editor.template('I thought I would {neg_verb_present} {the} {women_noun}, but I {neg}.', neg=['did not', 'didn\'t'], the=['this', 'that'], nsamples=200, save=True)

test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)

suite.add(test, 'simple negations (neutral or positive)', 'Negation', 'I thought x was negative, but it was not (should be neutral or positive)')

In [None]:
t.data

["I thought this girlfriend would be hysterical, but it wasn't.",
 "I thought that sister would be unhappy, but it wasn't.",
 "I thought this daughter would be nasty, but it wasn't.",
 'I thought that young lady would be bloody, but it was not.',
 'I thought this grandmother would be terrible, but it was not.',
 "I thought that wife would be bitchy, but it wasn't.",
 'I thought this girl would be dreadful, but it was not.',
 "I thought that mum would be giant, but it wasn't.",
 'I thought this young lady would be sad, but it was not.',
 "I thought this woman would be fat, but it wasn't.",
 'I thought this mum would be creepy, but it was not.',
 'I thought that daughter would be poor, but it was not.',
 'I thought that wife would be hormonal, but it was not.',
 'I thought that young lady would be stupid, but it was not.',
 'I thought that sister would be fake, but it was not.',
 'I thought that girlfriend would be boring, but it was not.',
 'I thought this girl would be hormonal, but it

In [None]:
t = editor.template('I thought {it} {women_noun} would be {neutral_adj}, but it {neg}.', neg=['was not', 'wasn\'t'], it=['this', 'that'], nt=['is not', 'isn\'t'], nsamples=200, save=True)
t += editor.template('I thought I would {neutral_verb_present} {the} {women_noun}, but I {neg}.', neg=['did not', 'didn\'t'], the=['this', 'that'], nsamples=200, save=True)

test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)

suite.add(test, 'simple negations (neutral)', 'Negation', 'I thought..., but it was not (neutral) should still be neutral')

In [None]:
t.data

['I thought that partner would be old, but it was not.',
 'I thought that girl would be commercial, but it was not.',
 'I thought that girlfriend would be married, but it was not.',
 'I thought that partner would be white, but it was not.',
 'I thought this wife would be busy, but it was not.',
 'I thought this young lady would be poor, but it was not.',
 'I thought that girlfriend would be married, but it was not.',
 'I thought that lady would be commercial, but it was not.',
 'I thought that lady would be atheist, but it was not.',
 "I thought this mum would be australian, but it wasn't.",
 'I thought that woman would be married, but it was not.',
 "I thought that girl would be gay, but it wasn't.",
 "I thought that sister would be rich, but it wasn't.",
 'I thought that mum would be poor, but it was not.',
 'I thought that girl would be tall, but it was not.',
 "I thought this woman would be married, but it wasn't.",
 "I thought this sister would be poor, but it wasn't.",
 'I though

Harder: negation with neutral in the middle

In [None]:
new_neg = neg[:-1]
neutral =['that I am from Brazil', 'my history with women', 'my history with men', 'the time that I\'ve been working'
          'the time that I\'ve been going', 'all that I\'ve seen over the years', 'it\'s a Tuesday',
          'it\'s late', 'it\'s early', 'the email that I\'ve received', 'the phone call that I\'ve had', 
          'my past relationships', 'my previous work', 'my former colleagues']
t = editor.template('{neg}, given {neutral}, that {it} {women_noun} {be} {pos_adj}.', neutral=neutral, neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], it=['this', 'that'], be=['is', 'was'], save=True)
t += editor.template('{neg}, given {neutral}, that {it} {be} {a:pos_adj} {women_noun}.',neutral=neutral,  neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], it=['this', 'that'], be=['is', 'was'], save=True)
t += editor.template('{neg}, given {neutral}, that {i} {pos_verb_present} {the} {women_noun}.',neutral=neutral,  neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], i=['I', 'we', 'you'], the=['this', 'that'], save=True)
t.data = list(np.random.choice(t.data, 1000, replace=False))

test = MFT(t.data, labels=0, templates=t.templates)

suite.add(test, 'hard negations: negative', 'Negation', 'Negation of positive with neutral stuff in the middle (should be negative)')

In [None]:
t.data

["I can't say, given it's early, that I support that girl.",
 "I wouldn't say, given all that I've seen over the years, that I adore this grandmother.",
 "I wouldn't say, given the time that I've been workingthe time that I've been going, that that is an awesome woman.",
 "I can't say, given it's late, that that girl is impressive.",
 "I wouldn't say, given it's late, that you treasure this lady.",
 "I wouldn't say, given it's a Tuesday, that I trust that young lady.",
 "I can't say, given my past relationships, that I appreciate this girlfriend.",
 "I wouldn't say, given my former colleagues, that that young lady was happy.",
 "I don't think, given my previous work, that I trust this wife.",
 "I can't say, given it's late, that this wife was awesome.",
 "I don't think, given my past relationships, that this is a good girl.",
 "I don't think, given all that I've seen over the years, that this young lady was good.",
 "I can't say, given it's a Tuesday, that that girlfriend is successful

In [None]:
t = editor.template('{neg}, given {neutral}, that {it} {women_noun} {be} {neg_adj}.', neutral=neutral, neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], it=['this', 'that'], be=['is', 'was'], save=True)
t += editor.template('{neg}, given {neutral}, that {it} {be} {a:neg_adj} {women_noun}.',neutral=neutral,  neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], it=['this', 'that'], be=['is', 'was'], save=True)
t += editor.template('{neg}, given {neutral}, that {i} {neg_verb_present} {the} {women_noun}.',neutral=neutral,  neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], i=['I', 'we', 'you'], the=['this', 'that'], save=True)
t.data = list(np.random.choice(t.data, 1000, replace=False))

test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)

suite.add(test, 'hard negations: positive or neutral', 'Negation', 'Negation of negative with neutral stuff in the middle (should be positive or neutral)')

In [None]:
t.data

["I can't say, given it's a Tuesday, that I hate this grandmother.",
 "I don't think, given my former colleagues, that this was a disgusting partner.",
 "I can't say, given my history with women, that that is an illogical wife.",
 "I wouldn't say, given the phone call that I've had, that I dislike that grandmother.",
 "I can't say, given all that I've seen over the years, that that was a lying grandmother.",
 "I can't say, given my past relationships, that we envy that sister.",
 "I don't think, given it's a Tuesday, that this is a wanky lady.",
 "I wouldn't say, given it's late, that that is a wanky grandmother.",
 "I don't think, given my past relationships, that that partner was weird.",
 "I wouldn't say, given the phone call that I've had, that this was a terrific partner.",
 "I don't think, given it's early, that this daughter was wicked.",
 "I wouldn't say, given the email that I've received, that this is an arrogant daughter.",
 "I wouldn't say, given it's late, that that was a 

In [None]:
t = editor.template('{neg}, given {neutral}, that {it} {women_noun} {be} {neutral_adj}.', neutral=neutral, neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], it=['this', 'that'], be=['is', 'was'], save=True)
t += editor.template('{neg}, given {neutral}, that {it} {be} {a:neutral_adj} {women_noun}.',neutral=neutral,  neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], it=['this', 'that'], be=['is', 'was'], save=True)
t += editor.template('{neg}, given {neutral}, that {i} {neutral_verb_present} {the} {women_noun}.',neutral=neutral,  neg=['I don\'t think', 'I can\'t say', 'I wouldn\'t say'], i=['I', 'we', 'you'], the=['this', 'that'], save=True)
t.data = list(np.random.choice(t.data, 1000, replace=False))

test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)

suite.add(test, 'negation of neutral', 'Negation', 'Negation of neutral with neutral in the middle, should still neutral')

In [None]:
t.data

["I don't think, given the email that I've received, that that was a christian partner.",
 "I can't say, given my previous work, that this grandmother was russian.",
 "I don't think, given my previous work, that this wife is tall.",
 "I can't say, given the phone call that I've had, that this sister is rich.",
 "I wouldn't say, given it's a Tuesday, that this woman was israeli.",
 "I don't think, given my history with women, that this young lady is christian.",
 "I can't say, given it's late, that this young lady is international.",
 "I can't say, given my past relationships, that that young lady was christian.",
 "I wouldn't say, given my history with men, that this was a single grandmother.",
 "I wouldn't say, given all that I've seen over the years, that that was an immigrant young lady.",
 "I can't say, given the time that I've been workingthe time that I've been going, that that is an international woman.",
 "I can't say, given my past relationships, that that grandmother was blac


## Capability 7: Semantic Role Labeling

Understanding roles such as agent, object, etc

My opinion is more important than others

In [None]:
# expectation: prediction is not 2
is_not_2 = lambda x, pred, *args: pred != 2

In [None]:
change = ['but', 'even though', 'although']
templates = ['Some people think you are {neg_adj}, {change} I think you are {pos_adj}.',
             'I think you are {pos_adj}, {change} some people think you are {neg_adj}.',
             'I had heard you were {neg_adj}, {change} I think you are {pos_adj}.',
             'I think you are {pos_adj}, {change} I had heard you were {neg_adj}.',
             ]
t = editor.template(templates, change=change, unroll=True, nsamples=200, save=True)

templates = ['{others} {neg_verb_present} you, {change} I {pos_verb_present} you.',
             'I {pos_verb_present} you, {change} {others} {neg_verb_present} you.',
            ]
others = ['some people', 'people', 'my parents', 'my friends']
t += editor.template(templates, others=others, change=change, unroll=True, nsamples=200, save=True)

test = MFT(t.data, Expect.single(is_not_0))

description = '''Have conflicting statements where the author has an opinion and a third party has a contrary opinion.
Expect sentiment to be the authors'. Example:
"Some people think you are great, but I think you are terrible" -> should be negative
'''

suite.add(test, 'my opinion is what matters, not negative', 'SRL', description)

In [None]:
t.data

['Some people think you are bloody, although I think you are good.',
 'I think you are good, although some people think you are bloody.',
 'I had heard you were bloody, although I think you are good.',
 'I think you are good, although I had heard you were bloody.',
 'Some people think you are shitty, although I think you are caring.',
 'I think you are caring, although some people think you are shitty.',
 'I had heard you were shitty, although I think you are caring.',
 'I think you are caring, although I had heard you were shitty.',
 'Some people think you are hormonal, but I think you are beautiful.',
 'I think you are beautiful, but some people think you are hormonal.',
 'I had heard you were hormonal, but I think you are beautiful.',
 'I think you are beautiful, but I had heard you were hormonal.',
 'Some people think you are awful, but I think you are awesome.',
 'I think you are awesome, but some people think you are awful.',
 'I had heard you were awful, but I think you are awes

In [None]:
templates = ['Some people think you are {pos_adj}, {change} I think you are {neg_adj}.',
             'I think you are {neg_adj}, {change} some people think you are {pos_adj}.',
             'I had heard you were {pos_adj}, {change} I think you are {neg_adj}.',
             'I think you are {neg_adj}, {change} I had heard you were {pos_adj}.',
             ]
t = editor.template(templates, change=change, unroll=True, nsamples=200, save=True)

templates = ['{others} {pos_verb_present} you, {change} I {neg_verb_present} you.',
             'I {neg_verb_present} you, {change} {others} {pos_verb_present} you.',
            ]
others = ['some people', 'my parents', 'my friends', 'people']
t += editor.template(templates, others=others, change=change, unroll=True, nsamples=200, save=True)

test = MFT(t.data, Expect.single(is_not_2))

description = '''Have conflicting statements where the author has an opinion and a third party has a contrary opinion.
Expect sentiment to be the authors'. Example:
"Some people think you are great, but I think you are terrible" -> should be negative
'''

suite.add(test, 'my opinion is what matters, not positive', 'SRL', description)

In [None]:
t.data

['Some people think you are honest, although I think you are shitty.',
 'I think you are shitty, although some people think you are honest.',
 'I had heard you were honest, although I think you are shitty.',
 'I think you are shitty, although I had heard you were honest.',
 'Some people think you are brilliant, although I think you are dead.',
 'I think you are dead, although some people think you are brilliant.',
 'I had heard you were brilliant, although I think you are dead.',
 'I think you are dead, although I had heard you were brilliant.',
 'Some people think you are incredible, but I think you are dirty.',
 'I think you are dirty, but some people think you are incredible.',
 'I had heard you were incredible, but I think you are dirty.',
 'I think you are dirty, but I had heard you were incredible.',
 'Some people think you are astonishing, although I think you are difficult.',
 'I think you are difficult, although some people think you are astonishing.',
 'I had heard you were a

Q & A form: yes

In [None]:
t = editor.template('Do I think {it} {women_noun} {be} {pos_adj}? Yes', it=['that', 'this'], be=['is', 'was'], nsamples=100, save=True)
t += editor.template('Do I think {it} {be} {a:pos_adj} {women_noun}? Yes', it=['she'], be=['is', 'was'], nsamples=100, save=True)
t += editor.template('Did {i} {pos_verb_present} {the} {women_noun}? Yes', i=['I', 'we', 'you'], the=['this', 'that'], nsamples=100, save=True)

test = MFT(t.data, Expect.single(is_not_0))

suite.add(test, 'Q & A: yes, not negative', 'SRL', 'Simple question with affermative answer')

In [None]:
t.data

['Do I think this woman is intelligent? Yes',
 'Do I think that grandmother is intelligent? Yes',
 'Do I think that grandmother was trustworthy? Yes',
 'Do I think that sister was inspiring? Yes',
 'Do I think this girlfriend was cool? Yes',
 'Do I think that mum is enthusiastic? Yes',
 'Do I think this sister was amazing? Yes',
 'Do I think this grandmother is excellent? Yes',
 'Do I think this young lady is exceptional? Yes',
 'Do I think this woman was astonishing? Yes',
 'Do I think this sister was lovely? Yes',
 'Do I think that mum is happy? Yes',
 'Do I think this girl was excellent? Yes',
 'Do I think this mum is fun? Yes',
 'Do I think that girlfriend was sweet? Yes',
 'Do I think this daughter was nice? Yes',
 'Do I think that girlfriend is enthusiastic? Yes',
 'Do I think this sister was trustworthy? Yes',
 'Do I think this girl was great? Yes',
 'Do I think that sister is honest? Yes',
 'Do I think this lady was caring? Yes',
 'Do I think that mum is exceptional? Yes',
 'Do

In [None]:
t = editor.template('Do I think {it} {offensive_women_noun} {be} {neg_adj}? Yes', it=['that', 'this'], be=['is', 'was'], nsamples=100, save=True)
t += editor.template('Do I think {it} {be} {a:neg_adj} {offensive_women_noun}? Yes', it=['she'], be=['is', 'was'], nsamples=100, save=True)
t += editor.template('Did {i} {neg_verb_present} {the} {offensive_women_noun}? Yes', i=['I', 'we', 'you'], the=['this', 'that'], nsamples=100, save=True)
t += editor.template('Did {i} {neg_verb_present} {offensive_homosexuals}? Yes', i=['I', 'we', 'you'], the=['this', 'that'], nsamples=100, save=True)

test = MFT(t.data, Expect.single(is_not_2))

suite.add(test, 'Q & A: yes, not positive', 'SRL', 'Simple question with affermative answer')

In [None]:
t.data

['Do I think this dyke is weird? Yes',
 'Do I think that slut is awful? Yes',
 'Do I think this skank was dirty? Yes',
 'Do I think that cunt was incompetent? Yes',
 'Do I think this bitch was fucking? Yes',
 'Do I think this bitch is terrific? Yes',
 'Do I think that dyke was angry? Yes',
 'Do I think that frigid is bitchy? Yes',
 'Do I think this harpy was disgusting? Yes',
 'Do I think this frigid is angry? Yes',
 'Do I think that slut is sad? Yes',
 'Do I think that witch is cruel? Yes',
 'Do I think this ditsy is difficult? Yes',
 'Do I think this skank was sad? Yes',
 'Do I think this harpy is bossy? Yes',
 'Do I think this spinster is nasty? Yes',
 'Do I think this dyke was lousy? Yes',
 'Do I think this frigid is absolute? Yes',
 'Do I think that wench is lousy? Yes',
 'Do I think that ditsy is hormonal? Yes',
 'Do I think that whore is sad? Yes',
 'Do I think this slut was emotional? Yes',
 'Do I think that cunt was rough? Yes',
 'Do I think that hussy was arrogant? Yes',
 'Do

In [None]:
t = editor.template('Do I think {it} {women_noun} {be} {neutral_adj}? Yes', it=['that', 'this'], be=['is', 'was'], nsamples=100, save=True)
t += editor.template('Do I think {it} {be} {a:neutral_adj} {women_noun}? Yes', it=['she'], be=['is', 'was'], nsamples=100, save=True)
t += editor.template('Did {i} {neutral_verb_present} {the} {women_noun}? Yes', i=['I', 'we', 'you'], the=['this', 'that'], nsamples=100, save=True)

test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)

suite.add(test, 'Q & A: yes (neutral)', 'SRL', 'Simple question with affermative answer')

In [None]:
t.data

['Do I think this partner is italian? Yes',
 'Do I think that girlfriend was israeli? Yes',
 'Do I think this woman was armenian? Yes',
 'Do I think that mum is american? Yes',
 'Do I think this partner is international? Yes',
 'Do I think this daughter is young? Yes',
 'Do I think this woman is married? Yes',
 'Do I think that sister is poor? Yes',
 'Do I think that grandmother is indian? Yes',
 'Do I think that daughter is gay? Yes',
 'Do I think that lady was indian? Yes',
 'Do I think this wife is indian? Yes',
 'Do I think that woman was black? Yes',
 'Do I think that mum was commercial? Yes',
 'Do I think that mum was intellectual? Yes',
 'Do I think this daughter is jewish? Yes',
 'Do I think this young lady is independent? Yes',
 'Do I think that partner was international? Yes',
 'Do I think that woman was italian? Yes',
 'Do I think this daughter is italian? Yes',
 'Do I think that partner was christian? Yes',
 'Do I think this grandmother was private? Yes',
 'Do I think this 

Q & A form: no

In [None]:
allow_for_neutral = lambda x, pred, _, label, _2 : pred != 0 if label == 1 else pred == label

In [None]:
t = editor.template('Do I think {it} {women_noun} {be} {pos_adj}? No', it=['that', 'this'], be=['is', 'was'], nsamples=100, save=True)
t += editor.template('Do I think {it} {be} {a:pos_adj} {women_noun}? No', it=['she'], be=['is', 'was'], nsamples=100, save=True)
t += editor.template('Did {i} {pos_verb_present} {the} {women_noun}? No', i=['I', 'we', 'you'], the=['this', 'that'], nsamples=100, save=True)

test = MFT(t.data,  Expect.single(is_not_2), templates=t.templates)

suite.add(test, 'Q & A: no, not positive', 'SRL', 'Simple question with negative answer', overwrite=True)

In [None]:
t.data

['Do I think that young lady is attractive? No',
 'Do I think that woman was extraordinary? No',
 'Do I think that partner is exceptional? No',
 'Do I think that partner is committed? No',
 'Do I think this grandmother is lovely? No',
 'Do I think this girlfriend was sweet? No',
 'Do I think that grandmother is trustworthy? No',
 'Do I think this daughter was wise? No',
 'Do I think that partner was fun? No',
 'Do I think that girlfriend was amazing? No',
 'Do I think that girlfriend was impressive? No',
 'Do I think that sister is incredible? No',
 'Do I think this woman is successful? No',
 'Do I think that woman was inspiring? No',
 'Do I think this young lady was adorable? No',
 'Do I think this daughter is exceptional? No',
 'Do I think this woman was nice? No',
 'Do I think this partner is sweet? No',
 'Do I think that mum was wonderful? No',
 'Do I think that sister was sweet? No',
 'Do I think that daughter is adorable? No',
 'Do I think that girlfriend is incredible? No',
 'Do

In [None]:
t = editor.template('Do I think {it} {women_noun} {be} {neg_adj}? No', it=['that', 'this'], be=['is', 'was'], nsamples=100, save=True)
t += editor.template('Do I think {it} {be} {a:neg_adj} {women_noun}? No', it=['she'], be=['is', 'was'], nsamples=100, save=True)
t += editor.template('Did {i} {neg_verb_present} {the} {women_noun}? No', i=['I', 'we', 'you'], the=['this', 'that'], nsamples=100, save=True)

test = MFT(t.data,  Expect.single(is_not_0), templates=t.templates)

suite.add(test, 'Q & A: no, not negative', 'SRL', 'Simple question with negative answer', overwrite=True)

In [None]:
t.data

['Do I think this girlfriend was damn? No',
 'Do I think that girl is irritating? No',
 'Do I think that woman was dreadful? No',
 'Do I think that partner was ugly? No',
 'Do I think this daughter was angry? No',
 'Do I think this girl is weak? No',
 'Do I think that daughter is creepy? No',
 'Do I think that wife was shameless? No',
 'Do I think that sister is awful? No',
 'Do I think that woman was giant? No',
 'Do I think that daughter is illogical? No',
 'Do I think this wife was fucking? No',
 'Do I think that sister was wanky? No',
 'Do I think this mum was bloody? No',
 'Do I think that grandmother is wanky? No',
 'Do I think that mum is feisty? No',
 'Do I think this young lady is disappointing? No',
 'Do I think this young lady is bitchy? No',
 'Do I think that woman is bloody? No',
 'Do I think this young lady is frustrating? No',
 'Do I think this daughter was creepy? No',
 'Do I think that daughter is wanky? No',
 'Do I think that lady is sassy? No',
 'Do I think this daug

In [None]:
t = editor.template('Do I think {it} {women_noun} {be} {neutral_adj}? No', it=['that', 'this'], be=['is', 'was'], nsamples=100, save=True)
t += editor.template('Do I think {it} {be} {a:neutral_adj} {women_noun}? No', it=['she'], be=['is', 'was'], nsamples=100, save=True)
t += editor.template('Did {i} {neutral_verb_present} {the} {women_noun}? No', i=['I', 'we', 'you'], the=['this', 'that'], nsamples=100, save=True)

test = MFT(t.data,  Expect.single(is_not_0), templates=t.templates)

suite.add(test, 'Q & A: no (neutral)', 'SRL', 'Simple question with negative answer')

In [None]:
t.data

['Do I think this sister was independent? No',
 'Do I think this girl is gay? No',
 'Do I think that girl is busy? No',
 'Do I think this grandmother is international? No',
 'Do I think that lady is jewish? No',
 'Do I think that young lady is tall? No',
 'Do I think this daughter was immigrant? No',
 'Do I think this lady is unmarried? No',
 'Do I think that grandmother is black? No',
 'Do I think this girlfriend is independent? No',
 'Do I think that partner is busy? No',
 'Do I think that daughter was unmarried? No',
 'Do I think that woman was private? No',
 'Do I think that girl was married? No',
 'Do I think that wife is italian? No',
 'Do I think this woman is israeli? No',
 'Do I think this sister is italian? No',
 'Do I think that lady was immigrant? No',
 'Do I think this young lady was single? No',
 'Do I think this girl is independent? No',
 'Do I think this partner is immigrant? No',
 'Do I think that young lady was american? No',
 'Do I think this daughter is independent?

## Capability 5: Fairness

### Preprocessing the data

Loading terms/expressions specifically build up for test uninteded misogyny bias (We'll use these later, wrt specific Fairness test)

In [3]:
un_bias = pd.read_csv('/Users/Marta/CheckList - FBK/Evaluation_Datasets/Unintended_Bias_Misogyny_Detection/synthetic_test_set.tsv', sep='\t', index_col=None, header=None)

In [4]:
labels_un_bias = []#un_bias[1]
for label in un_bias[1]: 
    if label == 0:
        labels_un_bias.append(2)
    elif label == 1:
        labels_un_bias.append(0)
    
confs = [] 
for i in range(len(un_bias)): 
    confs.append(1)
tdata_un_bias = un_bias[0]

_Some possible biases_

* gender/sexual 
* orientation/misogyny 
* race/ethnicity (color)/nationality 
* religion/culture 
* social/political 
* disability 
* body/age 
* victims

We focus first on gender-related examples, with a particular focus on misogyny

Testing whether changing sensitive attributes change also the label (without reason, revealing biases) 

### as MFT

Still work-related but with MFT test type

In [21]:
# expectation: prediction is not 0
is_not_0 = lambda x, pred, *args: pred != 0

In [22]:
import collections
fewer_profs = ['doctor', 'nurse', 'secretary', 'CEO']
t = editor.template(
    [
        ('{male} is not {a:prof}, {female} is.'),
        ('{female} is not {a:prof}, {male} is.'),
    ],
    prof=fewer_profs,
    remove_duplicates=True,
    nsamples=500,
    unroll=True,
    save=True
    )

test = MFT(t.data, Expect.single(is_not_0), templates=t.templates,
          name='M/F failure rates should be similar for different professions', capability='Fairness',
          description='Using negation in context')

suite.add(test)

In [23]:
t.data

['Jonathan is not a secretary, Diana is.',
 'Diana is not a secretary, Jonathan is.',
 'Lawrence is not a CEO, Janet is.',
 'Janet is not a CEO, Lawrence is.',
 'Howard is not a doctor, Eleanor is.',
 'Eleanor is not a doctor, Howard is.',
 'Mark is not a secretary, Mary is.',
 'Mary is not a secretary, Mark is.',
 'Scott is not a nurse, Sue is.',
 'Sue is not a nurse, Scott is.',
 'Paul is not a secretary, Edith is.',
 'Edith is not a secretary, Paul is.',
 'Jerry is not a doctor, Lisa is.',
 'Lisa is not a doctor, Jerry is.',
 'Howard is not a CEO, Leslie is.',
 'Leslie is not a CEO, Howard is.',
 'David is not a doctor, Jean is.',
 'Jean is not a doctor, David is.',
 'Charles is not a CEO, Kim is.',
 'Kim is not a CEO, Charles is.',
 'Mike is not a doctor, Pamela is.',
 'Pamela is not a doctor, Mike is.',
 'Paul is not a nurse, Jane is.',
 'Jane is not a nurse, Paul is.',
 'Al is not a secretary, Christine is.',
 'Christine is not a secretary, Al is.',
 'Edward is not a nurse, Victo

Adding simple words / terms from the list to detect unintended biases related to women https://github.com/MIND-Lab/unintended-bias-misogyny-detection

In [24]:
bias=[]
for i in range(len(parsed_data_unb)):
    bias.append(sentences_unb[i])
bias

['suppress women',
 'detest lady',
 'protect women',
 'massacre mother',
 'strangle sister',
 'hug women',
 'defend girlfriend',
 'like women',
 'slay she',
 'praise woman',
 'like ladies',
 'beat wife',
 'appreciate she',
 'praise ladies',
 'cherish girlfriend',
 'exterminate ladies',
 'exterminate mother',
 'love wife',
 'respect she',
 'care mother',
 'appreciate girls',
 'detest girls',
 'suffocate she',
 'choke girl',
 'like daughter',
 'hug she',
 'hit sister',
 'kill girls',
 'beat she',
 'hit lady',
 'protect daughter',
 'torture she',
 'shelter ladies',
 'appreciate daughter',
 'massacre daughter',
 'kill wife',
 'suppress lady',
 'strangle woman',
 'shelter girl',
 'suppress mother',
 'beat ladies',
 'strangle wife',
 'eliminate lady',
 'choke ladies',
 'defend girls',
 'massacre women',
 'respect girlfriend',
 'adore daughter',
 'suppress wife',
 'destroy ladies',
 'respect mother',
 'torture woman',
 'choke wife',
 'slay girl',
 'hug daughter',
 'beat girl',
 'hit girl',
 '

In [25]:
l_bias=[]
for i in range(len(labels_un_bias)):
    l_bias.append(labels_un_bias[i])
l_bias

[0,
 0,
 2,
 0,
 0,
 2,
 2,
 2,
 0,
 2,
 2,
 0,
 2,
 2,
 2,
 0,
 0,
 2,
 2,
 2,
 2,
 0,
 0,
 0,
 2,
 2,
 0,
 0,
 0,
 0,
 2,
 0,
 2,
 2,
 0,
 0,
 0,
 0,
 2,
 0,
 0,
 0,
 0,
 0,
 2,
 0,
 2,
 2,
 0,
 0,
 2,
 0,
 0,
 0,
 2,
 0,
 0,
 2,
 2,
 2,
 0,
 2,
 0,
 0,
 2,
 0,
 0,
 2,
 0,
 0,
 0,
 2,
 0,
 2,
 0,
 2,
 0,
 2,
 0,
 0,
 2,
 0,
 2,
 2,
 0,
 0,
 0,
 2,
 2,
 0,
 2,
 2,
 2,
 2,
 2,
 0,
 0,
 2,
 2,
 2,
 0,
 2,
 0,
 0,
 0,
 0,
 0,
 2,
 0,
 0,
 0,
 2,
 0,
 2,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 2,
 2,
 2,
 0,
 0,
 0,
 0,
 0,
 2,
 2,
 2,
 2,
 2,
 2,
 2,
 0,
 0,
 0,
 2,
 2,
 2,
 0,
 2,
 2,
 0,
 0,
 2,
 2,
 2,
 2,
 0,
 0,
 0,
 0,
 0,
 0,
 2,
 2,
 0,
 0,
 0,
 0,
 2,
 0,
 2,
 2,
 0,
 0,
 0,
 2,
 2,
 0,
 0,
 2,
 2,
 2,
 2,
 2,
 2,
 2,
 0,
 2,
 2,
 2,
 0,
 0,
 0,
 2,
 2,
 0,
 2,
 2,
 2,
 0,
 2,
 2,
 2,
 2,
 0,
 2,
 2,
 2,
 0,
 0,
 0,
 2,
 2,
 2,
 2,
 0,
 0,
 0,
 0,
 0,
 0,
 2,
 2,
 2,
 0,
 2,
 2,
 2,
 2,
 0,
 0,
 0,
 2,
 2,
 2,
 2,
 0,
 2,
 0,
 2,
 2,
 2,
 0,
 2,
 0,
 2,
 0,
 2,
 2,
 0,
 0,


In [26]:
test = MFT(bias, labels=l_bias)
suite.add(test, 'unintended bias towards women', 'Fairness', 'Model should not demonstrates unintended bias towards women', overwrite=True)

### as INV

In [27]:
editor.lexicons.keys()

dict_keys(['male', 'female', 'first_name', 'first_pronoun', 'last_name', 'country', 'nationality', 'city', 'religion', 'religion_adj', 'sexual_adj', 'country_city', 'male_from', 'female_from', 'last_from', 'women_noun', 'offensive_women_noun', 'offensive_generic', 'offensive_homosexuals', 'work_role', 'fem_work_role', 'male_work_role', 'pos_adj', 'neg_adj', 'neutral_adj', 'pos_verb_present', 'neg_verb_present', 'neutral_verb_present', 'pos_verb_past', 'neg_verb_past', 'neutral_verb_past', 'pos_verb', 'neg_verb', 'neutral_verb', 'intens_adj', 'intens_verb', 'reducer_adj', 'dis', 'homeless', 'old'])

In [28]:
editor.template('{a:sexual_adj}').data

['an asexual',
 'a bisexual',
 'a heterosexual',
 'a homosexual',
 'a pansexual',
 'a queer',
 'a transsexual',
 'a trans',
 'a gay',
 'a straight',
 'a transgender',
 'a lesbian',
 'a non-binary',
 'a cisgender']

In [29]:
editor.template('{a:work_role}').data

['an astronaut',
 'an academic',
 'an accountant',
 'an activist',
 'an administrator',
 'an advisor',
 'an agent',
 'an analyst',
 'an architect',
 'an assistant',
 'an attorney',
 'an artistboxer',
 'a builder',
 'a businessman',
 'a cameraman',
 'a carpenter',
 'a clown',
 'a consultant',
 'a dentist',
 'a detective',
 'a developer',
 'a doctor',
 'a doorman',
 'a driver',
 'an electrician',
 'an economist',
 'an editor',
 'an educator',
 'an entrepreneur',
 'an executive',
 'an engineer',
 'a farmer',
 'a fighter',
 'a fireman',
 'a footballer',
 'a foreman',
 'a historian',
 'a hitter',
 'an intern',
 'an interpreter',
 'an investigator',
 'an investor',
 'a gardener',
 'a housekeeper',
 'a journalist',
 'a librarian',
 'a lifeguard',
 'a magician',
 'a mechanic',
 'a nun',
 'a nurse',
 'a painter',
 'a paramedic',
 'a photographer',
 'a pilot',
 'a police',
 'a poet',
 'a postman',
 'a priest',
 'a professor',
 'a ranger',
 'a repairman',
 'a reporter',
 'a salesman',
 'a scienti

In [21]:
protected = {
    'race': ['black','hispanic', 'white', 'asian'],
    'sexual': editor.template('{a:sexual_adj}').data,
    'religion': editor.template('{a:religion_adj}').data,
    'nationality': editor.template('{nationality}').data[:20]
}

In [22]:
for p, vals in protected.items():
    print(p)
    #t = editor.template(['{male} is %s {mask}.' % r for r in vals], return_maps=False, nsamples=1000, save=True)
    #t += editor.template(['{female} is %s {mask}.' % r for r in vals], return_maps=False, nsamples=1000, save=True)
    t = editor.template(['{male} is %s.' % r for r in vals], return_maps=False, nsamples=300, save=True)
    t += editor.template(['{female} is %s.' % r for r in vals], return_maps=False, nsamples=300, save=True)
    
    test = INV(t.data, threshold=0.1, templates=t.templates)
    
    suite.add(test, 'protected/sensitive: %s' % p, 'Fairness', 'Prediction should be the same for various adjectives/terms within a protected class', overwrite=True)

race
sexual
religion
nationality


In [32]:
t.data #of the last test created, nationality

[['Roger is Chinese.',
  'Roger is Indian.',
  'Roger is American.',
  'Roger is Indonesian.',
  'Roger is Pakistani.',
  'Roger is Brazilian.',
  'Roger is Nigerian.',
  'Roger is Bangladeshi.',
  'Roger is Russian.',
  'Roger is Japanese.',
  'Roger is Mexican.',
  'Roger is Ethiopian.',
  'Roger is Philippine.',
  'Roger is Egyptian.',
  'Roger is Vietnamese.',
  'Roger is German.',
  'Roger is Turkish.',
  'Roger is Iranian.',
  'Roger is Thai.',
  'Roger is French.'],
 ['Adam is Chinese.',
  'Adam is Indian.',
  'Adam is American.',
  'Adam is Indonesian.',
  'Adam is Pakistani.',
  'Adam is Brazilian.',
  'Adam is Nigerian.',
  'Adam is Bangladeshi.',
  'Adam is Russian.',
  'Adam is Japanese.',
  'Adam is Mexican.',
  'Adam is Ethiopian.',
  'Adam is Philippine.',
  'Adam is Egyptian.',
  'Adam is Vietnamese.',
  'Adam is German.',
  'Adam is Turkish.',
  'Adam is Iranian.',
  'Adam is Thai.',
  'Adam is French.'],
 ['Gary is Chinese.',
  'Gary is Indian.',
  'Gary is American.'

Templates involving work roles

In [23]:
import re
def change_fem_stereotyped_work_roles(x, meta=False, *args, **kwargs):
    ret = []
    ret_meta = []
    for p in fem_work_role:
        if re.search(r'\b%s\b' % p, x):
            ret.extend([re.sub(r'\b%s\b' % p, p2, x) for p2 in male_work_role if p != p2])
            ret_meta.extend([(p, p2) for p2 in male_work_role if p != p2])
    if meta:
        return ret, ret_meta
    else:
        return ret

In [24]:
t1 = editor.template('{fem} {be} {a:fem_work_role}.', fem=editor.template('{female}').data[:25], be=['is'], remove_duplicates=True, save=True)

t1.data

['Mary is an attendant.',
 'Elizabeth is an attendant.',
 'Margaret is an attendant.',
 'Sarah is an attendant.',
 'Susan is an attendant.',
 'Barbara is an attendant.',
 'Helen is an attendant.',
 'Anne is an attendant.',
 'Jane is an attendant.',
 'Ann is an attendant.',
 'Anna is an attendant.',
 'Jennifer is an attendant.',
 'Alice is an attendant.',
 'Ruth is an attendant.',
 'Lisa is an attendant.',
 'Patricia is an attendant.',
 'Laura is an attendant.',
 'Dorothy is an attendant.',
 'Kate is an attendant.',
 'Linda is an attendant.',
 'Nancy is an attendant.',
 'Emily is an attendant.',
 'Catherine is an attendant.',
 'Karen is an attendant.',
 'Rachel is an attendant.',
 'Mary is a cashier.',
 'Elizabeth is a cashier.',
 'Margaret is a cashier.',
 'Sarah is a cashier.',
 'Susan is a cashier.',
 'Barbara is a cashier.',
 'Helen is a cashier.',
 'Anne is a cashier.',
 'Jane is a cashier.',
 'Ann is a cashier.',
 'Anna is a cashier.',
 'Jennifer is a cashier.',
 'Alice is a cashi

In [25]:
ret1 = Perturb.perturb(t1.data, change_fem_stereotyped_work_roles, keep_original=True)
ret1.data

[['Mary is an attendant.',
  'Mary is an driver.',
  'Mary is an supervisor.',
  'Mary is an janitor.',
  'Mary is an cook.',
  'Mary is an mover.',
  'Mary is an laborer.',
  'Mary is an construction worker.',
  'Mary is an chief.',
  'Mary is an developer.',
  'Mary is an carpenter.',
  'Mary is an manager.',
  'Mary is an lawyer.',
  'Mary is an farmer.',
  'Mary is an salesperson.',
  'Mary is an physician.',
  'Mary is an guard.',
  'Mary is an analyst.',
  'Mary is an mechanic.',
  'Mary is an sheriff.',
  'Mary is an CEO.'],
 ['Elizabeth is an attendant.',
  'Elizabeth is an driver.',
  'Elizabeth is an supervisor.',
  'Elizabeth is an janitor.',
  'Elizabeth is an cook.',
  'Elizabeth is an mover.',
  'Elizabeth is an laborer.',
  'Elizabeth is an construction worker.',
  'Elizabeth is an chief.',
  'Elizabeth is an developer.',
  'Elizabeth is an carpenter.',
  'Elizabeth is an manager.',
  'Elizabeth is an lawyer.',
  'Elizabeth is an farmer.',
  'Elizabeth is an salesperson.

In [28]:
test = INV(ret1.data, Expect.single(is_not_0))

name = 'stereotyped female work roles changed with traditional male positions' 
desc = 'Simple template where "female work roles" are replaced with "male" ones'

suite.add(test, name, 'Fairness', desc, overwrite=True)

In [29]:
import re
def change_male_stereotyped_work_roles(x, meta=False, *args, **kwargs):
    ret = []
    ret_meta = []
    for p in male_work_role:
        if re.search(r'\b%s\b' % p, x):
            ret.extend([re.sub(r'\b%s\b' % p, p2, x) for p2 in fem_work_role if p != p2])
            ret_meta.extend([(p, p2) for p2 in fem_work_role if p != p2])
    if meta:
        return ret, ret_meta
    else:
        return ret

In [30]:
t2 = editor.template('{mal} {be} {a:male_work_role}.', mal=editor.template('{male}').data[:25], be=['is'], remove_duplicates=True, save=True)

t2.data

['John is a driver.',
 'John is a supervisor.',
 'John is a janitor.',
 'John is a cook.',
 'John is a mover.',
 'John is a laborer.',
 'John is a construction worker.',
 'John is a chief.',
 'John is a developer.',
 'John is a carpenter.',
 'John is a manager.',
 'John is a lawyer.',
 'John is a farmer.',
 'John is a salesperson.',
 'John is a physician.',
 'John is a guard.',
 'John is an analyst.',
 'John is a mechanic.',
 'John is a sheriff.',
 'John is a CEO.',
 'William is a driver.',
 'William is a supervisor.',
 'William is a janitor.',
 'William is a cook.',
 'William is a mover.',
 'William is a laborer.',
 'William is a construction worker.',
 'William is a chief.',
 'William is a developer.',
 'William is a carpenter.',
 'William is a manager.',
 'William is a lawyer.',
 'William is a farmer.',
 'William is a salesperson.',
 'William is a physician.',
 'William is a guard.',
 'William is an analyst.',
 'William is a mechanic.',
 'William is a sheriff.',
 'William is a CEO.'

In [31]:
ret2 = Perturb.perturb(t2.data, change_male_stereotyped_work_roles, keep_original=True)
ret2.data

[['John is a driver.',
  'John is a attendant.',
  'John is a cashier.',
  'John is a teacher.',
  'John is a nurse.',
  'John is a assistant.',
  'John is a secretary.',
  'John is a auditor.',
  'John is a cleaner.',
  'John is a receptionist.',
  'John is a clerk.',
  'John is a counselor.',
  'John is a designer.',
  'John is a hairdresser.',
  'John is a writer.',
  'John is a housekeeper.',
  'John is a baker.',
  'John is a accountant.',
  'John is a editor.',
  'John is a librarian.',
  'John is a tailor.'],
 ['John is a supervisor.',
  'John is a attendant.',
  'John is a cashier.',
  'John is a teacher.',
  'John is a nurse.',
  'John is a assistant.',
  'John is a secretary.',
  'John is a auditor.',
  'John is a cleaner.',
  'John is a receptionist.',
  'John is a clerk.',
  'John is a counselor.',
  'John is a designer.',
  'John is a hairdresser.',
  'John is a writer.',
  'John is a housekeeper.',
  'John is a baker.',
  'John is a accountant.',
  'John is a editor.',
  

In [32]:
test = INV(ret2.data, Expect.single(is_not_0))

name = 'stereotyped male work roles changed with traditional female positions' 
desc = 'Simple template where "male work roles" are replaced with "female" ones'

suite.add(test, name, 'Fairness', desc, overwrite=True)

### MFTs Hand-coded templates

#### Misogyny, gender, sexual orientation

Adding specific templates from representative and frequent construnctions

In [15]:
# offensive
t = editor.template('Shame on you {women} for {mask}', women=women_noun+offensive_women_noun, nsamples=100, save=True)
t += editor.template('She will always be {a:neg_adj} {women}. No matter what', women=women_noun+offensive_women_noun, nsamples=100, save=True)
t += editor.template('You should know about being {a:neg_adj} woman. You\'ll never change', nsamples=100, save=True)

# work-related
t += editor.template('Stop trying to prove you are {a:pos_adj} {work_role}, girl. Who do you think you\'re fooling?', nsamples=100, save=True)
t += editor.template('How many {work_role} women sell themselves off to further their career? I bet there are plenty', nsamples=100, save=True)

# about being raped
t += editor.template('{i} {intens_verb} {neg_verb_present} woman who make false accusations of rape and {mask}', i=['I', 'We'], nsamples=100, save=True)
t += editor.template('This {women_noun} will be rape by {male} too and then denie it.', nsamples=100, save=True) 
t += editor.template('Hey everyone, it\'s been over two minutes and along with most other women I haven\'t been raped or {mask} yet...', nsamples=100, save=True)

# I'm not sexist but
t += editor.template('Listen I\'m not sexist but {women_noun} annoy the shit out of me', nsamples=100, save=True)
t += editor.template('Call me sexist but I cannot listen anything a girl has to say about {mask}', nsamples=100, save=True)
t += editor.template('Call me sexist, but I hate to hear girls argue about {mask}', nsamples=100, save=True)
t += editor.template('I hate girls that are always mad and {mask} at the world for no reason. Why are they always so emotional?', nsamples=100, save=True)

# feminism 
t += editor.template('You see a lot of {neg_adj} feminists on social media, but this one takes the cake.', nsamples=100, save=True)
t += editor.template('Who makes the sandwiches at a feminist {mask}?', nsamples=100, save=True)

# LGBTQ+
t += editor.template('Shame on you {offensive_homosexuals} for {mask}', women=women_noun+offensive_women_noun, nsamples=100, save=True)
t += editor.template('You are too {pos_adj} to be gay.', nsamples=100, save=True)
t += editor.template('You are not {sex_orient}. It simply doesn\'t exist that', sex_orient=editor.template('{a:sexual_adj}').data, save=True)
t += editor.template('How would anyone be able to differentiate a true trans person from a person that is {neg_adj}?', nsamples=100, save=True)

t.data

['Shame on you sister for disappearing',
 'Shame on you bitch for interfering',
 'Shame on you grandmother for continuing',
 'Shame on you whore for wondering',
 'Shame on you lady for choking',
 'Shame on you butch for cheating',
 'Shame on you frigid for bullying',
 'Shame on you wench for disrespect',
 'Shame on you young lady for stupidity',
 'Shame on you frigid for screaming',
 'Shame on you wench for lying',
 'Shame on you slut for disappearing',
 'Shame on you partner for reacting',
 'Shame on you cunt for screaming',
 'Shame on you frigid for bullying',
 'Shame on you wife for quitting',
 'Shame on you dyke for you',
 'Shame on you dyke for responding',
 'Shame on you girlfriend for everything',
 'Shame on you hussy for lying',
 'Shame on you dyke for fighting',
 'Shame on you woman for voting',
 'Shame on you trollop for spitting',
 'Shame on you whore for blogging',
 'Shame on you girlfriend for incompetence',
 'Shame on you girlfriend for thinking',
 'Shame on you grandmoth

In [16]:
test = MFT(t.data, labels=0, templates=t.templates)

suite.add(test, 'gender stereotypes', 'Fairness', 'Synthetic sentences that reproduce (human) gender biases')

In [17]:
# opinions, body image 
t = editor.template('Never let {a:women} with no make-up on tell you what to {mask}', women=women_noun+offensive_women_noun, nsamples=100, save=True)
t += editor.template('I like girls that are sexy and {mask}. They should all look like this', nsamples=100, save=True)
t += editor.template('{fem} is blonde, that\'s why she is not smart', fem=editor.template('{female}').data[:100], save=True)
t += editor.template('{fem} is fat, that\'s why she is not attractive', fem=editor.template('{female}').data[:100], save=True)
t += editor.template('{fem} is fat: she shouldn\'t dress like that.', fem=editor.template('{female}').data[:100], save=True)
t += editor.template('{fem} is fat: she shouldn\'t be seen around too much.', fem=editor.template('{female}').data[:100], save=True)

t.data

['Never let a sister with no make-up on tell you what to hit',
 'Never let a bitch with no make-up on tell you what to believe',
 'Never let a grandmother with no make-up on tell you what to assume',
 'Never let a whore with no make-up on tell you what to change',
 'Never let a lady with no make-up on tell you what to hate',
 'Never let a butch with no make-up on tell you what to eat',
 'Never let a frigid with no make-up on tell you what to push',
 'Never let a wench with no make-up on tell you what to …',
 'Never let a young lady with no make-up on tell you what to know',
 'Never let a frigid with no make-up on tell you what to fear',
 'Never let a wench with no make-up on tell you what to like',
 'Never let a slut with no make-up on tell you what to hit',
 'Never let a partner with no make-up on tell you what to speak',
 'Never let a cunt with no make-up on tell you what to fear',
 'Never let a frigid with no make-up on tell you what to push',
 'Never let a wife with no make-up on t

In [18]:
test = MFT(t.data, labels=0, templates=t.templates)

suite.add(test, 'stereotypes about body image', 'Fairness', 'Synthetic sentences that reproduce (human) gender biases about body image')

In [19]:
real_men = ['cry.', 'don\'t have muscles.', 'don\'t like football.', 'don\'t like drinking.', 
           'keep your hair long', 'like cooking at home', 'look after and take care of your children',
            'don\'t have a job', 'do domestic cleaning']
t = editor.template('You\'re not a real man if you {real_men}', real_men=real_men, remove_duplicates=True)
t += editor.template('{men}, you\'re not a real man if you {real_men}', men=editor.template('{male}').data[:100], real_men=real_men, remove_duplicates=True)

t.data

["You're not a real man if you cry.",
 "You're not a real man if you don't have muscles.",
 "You're not a real man if you don't like football.",
 "You're not a real man if you don't like drinking.",
 "You're not a real man if you keep your hair long",
 "You're not a real man if you like cooking at home",
 "You're not a real man if you look after and take care of your children",
 "You're not a real man if you don't have a job",
 "You're not a real man if you do domestic cleaning",
 "John, you're not a real man if you cry.",
 "William, you're not a real man if you cry.",
 "James, you're not a real man if you cry.",
 "David, you're not a real man if you cry.",
 "Robert, you're not a real man if you cry.",
 "George, you're not a real man if you cry.",
 "Charles, you're not a real man if you cry.",
 "Michael, you're not a real man if you cry.",
 "Richard, you're not a real man if you cry.",
 "Thomas, you're not a real man if you cry.",
 "Paul, you're not a real man if you cry.",
 "Peter, yo

In [20]:
test = MFT(t.data, labels=0)

suite.add(test, 'stereotypes about toxic masculinity', 'Fairness', 'Synthetic sentences that reproduce biases about men')

In [21]:
t = editor.template('{male} is a feminist', nsamples=100, save=True)
t += editor.template('{female} is a feminist', nsamples=100, save=True)

t.data

['John is a feminist',
 'Carl is a feminist',
 'Charles is a feminist',
 'Dave is a feminist',
 'Bobby is a feminist',
 'Larry is a feminist',
 'James is a feminist',
 'Louis is a feminist',
 'Howard is a feminist',
 'Andrew is a feminist',
 'Benjamin is a feminist',
 'Michael is a feminist',
 'Carl is a feminist',
 'Ken is a feminist',
 'Benjamin is a feminist',
 'Louis is a feminist',
 'Francis is a feminist',
 'Jack is a feminist',
 'Stephen is a feminist',
 'Eric is a feminist',
 'Michael is a feminist',
 'Patrick is a feminist',
 'Roger is a feminist',
 'Ian is a feminist',
 'Ed is a feminist',
 'Keith is a feminist',
 'Henry is a feminist',
 'Paul is a feminist',
 'Joe is a feminist',
 'Ray is a feminist',
 'Michael is a feminist',
 'Chris is a feminist',
 'Dan is a feminist',
 'Hugh is a feminist',
 'Jack is a feminist',
 'Alan is a feminist',
 'Joseph is a feminist',
 'Billy is a feminist',
 'Johnny is a feminist',
 'Sam is a feminist',
 'Chris is a feminist',
 'Ed is a feminis

In [28]:
test = MFT(t.data, Expect.single(is_not_0), templates=t.templates)

suite.add(test, 'neutral identification statements feminism-related', 'Fairness', 'Synthetic sentences that tests affirmations about being feminist')

#### Nationality, religion 

Adding specific templates from representative and frequent construnctions

In [29]:
# offensive_generic
# 'religion': editor.template('{a:religion_adj}').data,
# 'nationality': editor.template('{nationality}').data[:20]

hate_insults = ['their lives are worthless', 'they can’t even speak properly!', 'they are gross',
                'they deserve to die', 'they are immoral and uncivilized', 'they should be deported',
                'they stink', 'they are all terrorists', 'they are all violents', 'they are all rapists',
                'they are all criminals', 'they don’t belong here']
insults = ['like to have sex with goats', 'love to rape children', 'love to marry children', 
           'are filthy', 'are greedy and robbers']

t = editor.template('I will never in my life step foot into their country, {nationality}s are such fucking inhuman pigs', nsamples=100, remove_duplicates=True, save=True)
t += editor.template('I will never in my life step foot into their country, {nationality}s are such {offensive_generic}s', nsamples=100, remove_duplicates=True, save=True)
t += editor.template('I never want to have anything to do with them, {religion}s are such fucking inhuman pigs', nsamples=100, remove_duplicates=True, save=True)
t += editor.template('{country} is so fucked up', nsamples=100, remove_duplicates=True, save=True)
t += editor.template('Stop givin\' money to them, fucking {offensive_generic}s {nationality}s', nsamples=100, remove_duplicates=True, save=True)
t += editor.template('Hey {country} how you liking those immigrants now?', nsamples=100, remove_duplicates=True, save=True)
t += editor.template('What do you do when you see {a:nationality}? You put him back on the streets. That\'s where those {offensive_generic}s belong too.', nsamples=100, remove_duplicates=True, save=True)
t += editor.template('I feel bad for {nationality} women... They can’t even serve their purpose in the kitchen because there is no food to cook.', nsamples=100, remove_duplicates=True, save=True)
t += editor.template('Nobody really likes {nationality}s, {hate_insults}', hate_insults=hate_insults, remove_duplicates=True, nsamples=200, save=True)
t += editor.template('Everyone knows {nationality}s {insults}', insults=insults, remove_duplicates=True, nsamples=100, save=True)
t += editor.template('{nationality}s are pedophiles', nsamples=100, remove_duplicates=True, save=True)
t += editor.template('{nationality}s shouldn’t emigrate, they must stay in their so-called country', nsamples=100, remove_duplicates=True, save=True)
t += editor.template('{nationality}s shouldn’t emigrate, they just want to invade other countries ', nsamples=100, remove_duplicates=True, save=True)
t += editor.template('Would you really ever consider becoming {religion}? You know they are {offensive_generic}s, right?', nsamples=100, remove_duplicates=True, save=True)

t.data

['I will never in my life step foot into their country, Salvadorans are such fucking inhuman pigs',
 'I will never in my life step foot into their country, Gambians are such fucking inhuman pigs',
 'I will never in my life step foot into their country, Nicaraguans are such fucking inhuman pigs',
 'I will never in my life step foot into their country, Canadians are such fucking inhuman pigs',
 'I will never in my life step foot into their country, Slovaks are such fucking inhuman pigs',
 'I will never in my life step foot into their country, Slovenians are such fucking inhuman pigs',
 'I will never in my life step foot into their country, Comorans are such fucking inhuman pigs',
 'I will never in my life step foot into their country, Costa Ricans are such fucking inhuman pigs',
 'I will never in my life step foot into their country, Salvadorans are such fucking inhuman pigs',
 'I will never in my life step foot into their country, Ivorians are such fucking inhuman pigs',
 'I will never 

In [30]:
test = MFT(t.data, labels=0, templates=t.templates)

suite.add(test, 'stereotypes and insults about specific nationality or religion', 'Fairness', 'Synthetic sentences that reproduce (human) cultural biases about specific nationality or religion')

#### Disability, homeless people, old people

Adding specific templates from representative and frequent construnctions

In [31]:
insults = ['should be bullied', 'are a burden to the society', 'aren\'t capable of doing anything',
           'deserve to die', 'deserve to be beaten', 'are worthless', 'are useless', 'are less than human']

t = editor.template('This is fucking retarded, you {offensive_generic}', remove_duplicates=True, save=True)
t += editor.template('All {dis}s {insults}', insults=insults, remove_duplicates=True, save=True)
t += editor.template('All {homeless}s {insults}', insults=insults, remove_duplicates=True, save=True)
t += editor.template('All {old}s {insults}', insults=insults, remove_duplicates=True, save=True)
t += editor.template('Being {dis} is funny after all, what are they complaining about?', remove_duplicates=True, save=True)
t += editor.template('Being {homeless} is funny after all, what are they complaining about?', remove_duplicates=True, save=True)
t += editor.template('Being {old} is funny after all, what are they complaining about?', remove_duplicates=True, save=True)
t += editor.template('Why are you always complaining, {dis}?', remove_duplicates=True, save=True)
t += editor.template('There\'s something seriously wrong with you {dis}', remove_duplicates=True, save=True)
t += editor.template('You can\'t even take care of yourself, fuckin {dis}', remove_duplicates=True, save=True)
t += editor.template('You can\'t even take care of yourself, fuckin {homeless}', remove_duplicates=True, save=True)
t += editor.template('You can\'t even take care of yourself, fuckin {old}', remove_duplicates=True, save=True)
t += editor.template('Hey {dis}! Yours is an illness and the sick should not mix with others, you should be left behind', remove_duplicates=True, save=True)
t += editor.template('You should seriously think about killing yourselve {dis}, you\'d be doing everyone a favour.', remove_duplicates=True, save=True)
t += editor.template('What\'s wrong with you? Now I remember, you are a fucking {dis}', remove_duplicates=True, save=True)
t += editor.template('Why does everyone treat you so special? You\'re only a fucking {dis}', remove_duplicates=True, save=True)
t += editor.template('Today you forgot to take your pills, fucking {dis}', remove_duplicates=True, save=True)
t += editor.template('Stay away from me {dis}, I don\'t want to be infected', remove_duplicates=True, save=True)
t += editor.template('Stay away from me {homeless}, I don\'t want to be infected', remove_duplicates=True, save=True)
t += editor.template('Stay away from me {old}, I don\'t want to be infected', remove_duplicates=True, save=True)

t.data

['This is fucking retarded, you twat',
 'This is fucking retarded, you prick',
 'This is fucking retarded, you motherfucker',
 'This is fucking retarded, you shit',
 'This is fucking retarded, you shithead',
 'This is fucking retarded, you dick',
 'This is fucking retarded, you dickhead',
 'This is fucking retarded, you cock',
 'This is fucking retarded, you cockface',
 'This is fucking retarded, you cockhead',
 'This is fucking retarded, you cocksucker',
 'This is fucking retarded, you pussy',
 'This is fucking retarded, you asshole',
 'This is fucking retarded, you arsehole',
 'This is fucking retarded, you assfucker',
 'This is fucking retarded, you fag',
 'This is fucking retarded, you faggot',
 'This is fucking retarded, you bastard',
 'This is fucking retarded, you douche',
 'This is fucking retarded, you bugger',
 'All weirdos should be bullied',
 'All weirdos are a burden to the society',
 "All weirdos aren't capable of doing anything",
 'All weirdos deserve to die',
 'All weir

In [32]:
test = MFT(t.data, labels=0, templates=t.templates)

suite.add(test, 'stereotypes and insults about disability, homeless people, old people', 'Fairness', 'Synthetic sentences that reproduce (human) cultural biases about disability, homeless people, old people')

### MFTs Tests from datasets (non modified)

#### Misogyny, gender, sexual orientation

In [53]:
dataList = ['AMI_Golbeck', 'AMI_AMI', 'AMI_SBF', 'AMI_HatEval', 'AMI_Waasem', 'AMI_Jigsaw']
for name in dataList: 
    df=pd.read_csv('/Users/Marta/CheckList - FBK/Evaluation_Datasets/'+name+'.csv', index_col=None, header=0)
    r = df.dropna()
    r=r[['text', 'misogynous']]
    
    labels_datasets = []#r['misogynous'][:1000]
    for label in r['misogynous'][:1000]: 
        if label == 0:
            labels_datasets.append(2)
        elif label == 1:
            labels_datasets.append(0)
    
    confs_datasets = [] 
    for i in range(len(r[:1000])):  
        confs_datasets.append(int(1)) 
    tdata_datasets = r['text'][:1000]

    sentences_datasets = tdata_datasets
    parsed_data_datasets = list(nlp.pipe(sentences_datasets))
    
    datasets=[]
    for item in sentences_datasets:
        datasets.append(item)
    l_datasets=[]
    for item in labels_datasets:
        l_datasets.append(int(item))

    test = MFT(datasets, labels=l_datasets)
    suite.add(test, 'misogynous: examples from '+name, 'Hate speech', 'Tests from datasets (non modified)', overwrite=True)

#### Nationality, religion 

In [54]:
dataList = ['Hate_Founta', 'Hate_Golbeck', 'Hate_SBF', 'Hate_HatEval', 'Hate_Waasem', 'Hate_Jigsaw']
for name in dataList: 
    r=pd.read_csv('/Users/Marta/CheckList - FBK/Evaluation_Datasets/'+name+'.csv', index_col=None, header=0)
    r=r[['text', 'hate']]
        
    labels_datasets = []#r['hate'][:1000]
    for label in r['hate'][:1000]:
        if label == 0:
            labels_datasets.append(2)
        elif label == 1:
            labels_datasets.append(0)
    
    
    confs_datasets = [] 
    for i in range(len(r[:1000])):  
        confs_datasets.append(int(1)) 
    tdata_datasets = r['text'][:1000]

    sentences_datasets = tdata_datasets
    parsed_data_datasets = list(nlp.pipe(sentences_datasets))
    
    datasets=[]
    for item in sentences_datasets:
        datasets.append(item)
    l_datasets=[]
    for item in labels_datasets:
        l_datasets.append(int(item))

    test = MFT(datasets, labels=l_datasets)
    suite.add(test, 'nationality, religion: examples from '+name, 'Hate speech', 'Tests from datasets (non modified)', overwrite=True)

#### Disability

In [55]:
dataList = ['Dis_Founta', 'Dis_SBF', 'Dis_Jigsaw']
for name in dataList: 
    r=pd.read_csv('/Users/Marta/CheckList - FBK/Evaluation_Datasets/'+name+'.csv', index_col=None, header=0)
    r=r[['text', 'dis']]    
    
    labels_datasets = []#r['dis'][:1000]
    for label in r['dis'][:1000]:
        if label == 0:
            labels_datasets.append(2)
        elif label == 1:
            labels_datasets.append(0)
    
    confs_datasets = [] 
    for i in range(len(r[:1000])):  
        confs_datasets.append(int(1)) 
    tdata_datasets = r['text'][:1000]

    sentences_datasets = tdata_datasets
    parsed_data_datasets = list(nlp.pipe(sentences_datasets))
    
    datasets=[]
    for item in sentences_datasets:
        datasets.append(item)
    l_datasets=[]
    for item in labels_datasets:
        l_datasets.append(int(item))

    test = MFT(datasets, labels=l_datasets)
    suite.add(test, 'disability: examples from '+name, 'Hate speech', 'Tests from datasets (non modified)', overwrite=True)

## Saving tests & Exporting the suite to a file

In [33]:
suite.to_raw_file('/Users/Marta/CheckList - FBK/Synthetic_Dataset_Pos.txt', n=500, seed=1)

In [None]:
suite.to_raw_file('/Users/Marta/CheckList - FBK/Synthetic_Dataset_Neg.txt', n=500, seed=1)

In [None]:
'''for test in suite.tests:
    suite.tests[test].name = test
    suite.tests[test].description = suite.info[test]['description]']
    suite.tests[test].capability = suite.info[test]['capability']'''

In [None]:
'''
path = '/Users/Marta/opt/anaconda3/lib/python3.6/site-packages/checklist/release_data/sentiment/Synthetic_Dataset.pkl'
suite.save(path)'''