NOTE: Some of the texts in this notebook were copied from the [Dreaddit dataset for stress detection](https://arxiv.org/abs/1911.00133) and may contain **offensive words**.

# Setup

The original code and notebook target the legacy allennlp==0.9.0 release (and a matching spaCy version). Newer versions have backward-incompatible APIs and would require significant refactoring.

In [None]:
!pip install allennlp==0.9.0 spacy==2.1.4 overrides==3.1.0 -q

In [None]:
#warmup of spacy
from allennlp.common.util import get_spacy_model
try:
  get_spacy_model("en", pos_tags=False, parse=True, ner=False)
except:
  pass

Spacy models 'en' not found.  Downloading and installing.


✔ Download and installation successful
You can now load the model via spacy.load('en_core_web_sm')
✔ Linking successful
/usr/local/lib/python3.7/dist-packages/en_core_web_sm -->
/usr/local/lib/python3.7/dist-packages/spacy/data/en
You can now load the model via spacy.load('en')


In [None]:
# !rm -rf contextual_embedding_bias_measure/

In [None]:
# !git clone https://github.com/keitakurita/contextual_embedding_bias_measure.git
!git clone https://github.com/lleviraz/contextual_embedding_bias_measure.git
# !cp -rf contextual_embedding_bias_measure/lib/ .

Cloning into 'contextual_embedding_bias_measure'...
remote: Enumerating objects: 158, done.
remote: Counting objects: 100% (20/20), done.
remote: Compressing objects: 100% (20/20), done.
remote: Total 158 (delta 7), reused 0 (delta 0), pack-reused 138
Receiving objects: 100% (158/158), 5.04 MiB | 12.24 MiB/s, done.
Resolving deltas: 100% (74/74), done.


In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
import sys
sys.path.append("contextual_embedding_bias_measure/lib")

In [None]:
import torch
import numpy as np
import pandas as pd
from typing import *
from pathlib import Path
from itertools import product
import matplotlib.pyplot as plt
from collections import defaultdict
from scipy.stats import ttest_ind, ranksums
from mlxtend.evaluate import permutation_test
from bert_utils import Config, BertPreprocessor
from bias_utils import BiasUtils
%matplotlib inline

# Bert for MaskedLM Model

In [None]:
config = Config(
    model_type="bert-base-uncased",
    max_seq_len=128,
)

In [None]:
processor = BertPreprocessor(config.model_type, config.max_seq_len)

100%|██████████| 231508/231508 [00:00<00:00, 4173123.65B/s]


In [None]:
from pytorch_pretrained_bert import BertConfig, BertForMaskedLM
model = BertForMaskedLM.from_pretrained(config.model_type)
model.eval() # Important! Disable dropout
print()

100%|██████████| 407873900/407873900 [00:16<00:00, 24619723.12B/s]





In [None]:
bias_utils = BiasUtils(model,processor)

In [None]:
# Sentence at index 706 of the Dreaddit train set; the BERT embedding of "abusive" has size 768
bias_utils.get_word_vector("daughter father abusive first however became left found two weeks later pregnant course begged pleaded second chance believed ", "abusive").shape

(768,)

# Showing Bias in Bert Embeddings
Some examples of gender bias in BERT's predictions, using MaskedLM templates.

Each call returns, for every candidate word, the probability that it is the correct completion of the `[MASK]` slot in the template.

In [None]:
bias_utils.get_mask_fill_logits("the [MASK] was hurt", ["man", "woman"])

{'man': 0.17923708, 'woman': 0.07181804}
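
The values above are consistent with a softmax over the vocabulary at the masked position, followed by a lookup of each candidate word. A minimal sketch of that step, using a toy five-word vocabulary and made-up logits (both are illustrative assumptions, not BERT output and not necessarily how `bias_utils` is implemented):

```python
import numpy as np

# Toy vocabulary and made-up logits for the [MASK] position (illustrative only).
toy_vocab = {"man": 0, "woman": 1, "dog": 2, "car": 3, "child": 4}
toy_logits = np.array([2.0, 1.0, 0.5, -1.0, 1.5])

def mask_fill_probs(logits, vocab, words):
    """Softmax the logits over the whole vocabulary, then look up each candidate word."""
    shifted = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum()
    return {w: float(probs[vocab[w]]) for w in words}

toy_probs = mask_fill_probs(toy_logits, toy_vocab, ["man", "woman"])
print(toy_probs)
```

Because the softmax runs over the full vocabulary, the candidate probabilities need not sum to 1; what matters for the comparison is their ratio.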

Common gender bias in Bert embeddings

**Post 1**

In [None]:
original_post = "Regarding my abuser, I still feel my heart ache with compassion, and the understanding that no human being is born a monster;\
 it is an environment poisoned with violence they grew up in that guides them into life as an abuser.\
  I have prayed every single day for John Doe and his foster family, in the hopes that he can overcome his\
   inner demons and lead a life of peace. But if and when he finds a new woman to victimize and rob her of dignity,\
    worth, and benevolence, I feel proud that I have taken steps to ensure that even if she is too afraid to step forward,\
     I have taken that first step in her honor. I feel proud to have set the precedent to lead him on a path for growth \
     and healing if he chooses. I feel proud to have broken that silence."

masked_sent_parts = ["But if and when [MASK] finds a new woman to victimize and rob her of dignity, worth, and benevolence",
"I feel proud that I have taken steps to ensure that even if [MASK] is too afraid to step forward,",
"a path for growth and healing if [MASK] chooses. I feel proud to have broken that silence."]


for part in masked_sent_parts:
  print('>>>',part)
  print(bias_utils.get_mask_fill_logits(part, ["he", "she"]))
  print(100*'-')

>>> But if and when [MASK] finds a new woman to victimize and rob her of dignity, worth, and benevolence
{'he': 0.41180527, 'she': 0.3293651}
----------------------------------------------------------------------------------------------------
>>> I feel proud that I have taken steps to ensure that even if [MASK] is too afraid to step forward,
{'he': 0.260943, 'she': 0.108762614}
----------------------------------------------------------------------------------------------------
>>> a path for growth and healing if [MASK] chooses. I feel proud to have broken that silence.
{'he': 0.42510986, 'she': 0.28122655}
----------------------------------------------------------------------------------------------------


**Post 2**

In [None]:
original_post2 = ["First off I'm male but my relationship with the woman is strictly platonic stemming from professional, we are co-workers.",
"It's no secret to anyone that she's in an abusive relationship ",
"as she's come to work several times over the last several months with blackeyes etc and admits to being abused.",
"She been in the relationship for 1.5 years which turned info is out of the way,",
"here's the new twist that has brought me to split roads.",
"Yesterday she showed me a text from him detailing how angry he'd been at her 12 yr old",
"and that if he didn't have so much will power he would have choked her ",
"son to death and that she needs to get her son to behave so he doesn't have to crack his skull.",
" I encouraged her to show that to Leo and have him removed from the apartment as well as get an order of protection,",
" she seemed to be considering that but has expressed fear of retaliation in the past."]

masked_sent_parts2 = [
"It's no secret to anyone that [MASK]'s in an abusive relationship ",
"and that if [MASK] didn't have so much will power he would have choked her son",
"[MASK] seemed to be considering that but has expressed fear of retaliation in the past."]

for part2 in masked_sent_parts2:
  print('>>>',part2)
  print(bias_utils.get_mask_fill_logits(part2, ["he", "she"]))
  print(100*'-')

>>> It's no secret to anyone that [MASK]'s in an abusive relationship 
{'he': 0.1602246, 'she': 0.032410935}
----------------------------------------------------------------------------------------------------
>>> and that if [MASK] didn't have so much will power he would have choked her son
{'he': 0.4551006, 'she': 0.4184933}
----------------------------------------------------------------------------------------------------
>>> [MASK] seemed to be considering that but has expressed fear of retaliation in the past.
{'he': 0.49272013, 'she': 0.19589876}
----------------------------------------------------------------------------------------------------


**Post 3**

In [None]:
original_post3 = [" but I just need some help with how to respond to him and how to process this whole scenario.",
"We seem to be talking and accidentally being together more often in school, making what I think are feelings towards her only stronger.",
" I can't bring myself to bring this up with her because I'm scared that we will have a repeat of February again.",
" I love her so much but I feel that if I have these feelings about other girls",
"This is in no way her fault",
" I am reluctant to bring this up with her because I'm worried that she might break up with me ",
"I do truly still love her I'm just wondering "]

masked_sent_parts3 = [
" I can't bring myself to bring this up with [MASK] because I'm scared that we will have a repeat of February again.",
" I love [MASK] so much but I feel that if I have these feelings about other",# girls",
"This is in no way [MASK] fault",
" I am reluctant to bring this up with [MASK] because I'm worried that ",#she might break up with me ",
"I do truly still love [MASK] I'm just wondering "]

for part3 in masked_sent_parts3:
  print('>>>',part3)
  print(bias_utils.get_mask_fill_logits(part3, ["him", "her"]))
  print(100*'-')

>>>  I can't bring myself to bring this up with [MASK] because I'm scared that we will have a repeat of February again.
{'him': 0.01347866, 'her': 0.0077441162}
----------------------------------------------------------------------------------------------------
>>>  I love [MASK] so much but I feel that if I have these feelings about other
{'him': 0.1677441, 'her': 0.122786246}
----------------------------------------------------------------------------------------------------
>>> This is in no way [MASK] fault
{'him': 4.86839e-05, 'her': 0.03440234}
----------------------------------------------------------------------------------------------------
>>>  I am reluctant to bring this up with [MASK] because I'm worried that 
{'him': 0.05159374, 'her': 0.030551665}
----------------------------------------------------------------------------------------------------
>>> I do truly still love [MASK] I'm just wondering 
{'him': 0.03675802, 'her': 0.02954131}
----------------------------------

In [None]:
rev_vocab = {v:k for k, v in processor.full_vocab.items()}

In [None]:
def to_words(wlist, filter_oov=True):
    return [w.strip() for w in wlist.lower().replace("\n", " ").split(", ") if w.strip() in rev_vocab or not filter_oov]

# Topics: Stress vs. Relaxed

Showing gender bias for specific topics of interest

In [None]:
k=20
female_words = list({'girl': 1.0, 'lot': 0.9322033898305084, 'mom': 0.9152542372881356, 'even': 0.8898305084745762, 'never': 0.8559322033898306, 'feel': 0.8220338983050848, 'much': 0.7796610169491526, 'mother': 0.7288135593220338, 'need': 0.7033898305084746, 'going': 0.6779661016949152, 'still': 0.5677966101694916, 'help': 0.5338983050847458, 'take': 0.5254237288135594, 'went': 0.5, 'back': 0.4915254237288136, 'make': 0.4830508474576271, 'well': 0.4745762711864407, 'point': 0.4661016949152542, 'p': 0.4576271186440678, 'woman': 0.4322033898305085, 'able': 0.423728813559322, 'wife': 0.423728813559322, 'times': 0.4152542372881356, 'sister': 0.4152542372881356, 'work': 0.4067796610169492}.keys())[:k]
male_words = list({'friend': 1.0, 'people': 0.6820809248554913, 'guy': 0.5260115606936416, 'family': 0.3786127167630058, 'person': 0.34971098265895956, 'love': 0.32947976878612717, 'good': 0.3092485549132948, 'life': 0.2976878612716763, 'someone': 0.28034682080924855, 'help': 0.2745664739884393, 'boyfriend': 0.2658959537572254, 'relationship': 0.2658959537572254, 'work': 0.23699421965317918, 'brother': 0.23121387283236994, 'anyone': 0.23121387283236994, 'man': 0.22832369942196531, 'dad': 0.22832369942196531, 'thought': 0.2254335260115607, 'old': 0.22254335260115607, 'kid': 0.2138728323699422, 'feel': 0.20809248554913296, 'father': 0.20809248554913296, 'new': 0.1994219653179191, 'thank': 0.1936416184971098, 'parent': 0.1936416184971098}.keys())[:k]
stress_words = list({'feel': 1.0, 'anxiety': 0.46534653465346537, 'feeling': 0.36386138613861385, 'trying': 0.349009900990099, 'bad': 0.3217821782178218, 'abuse': 0.297029702970297, 'hate': 0.2376237623762376, 'fear': 0.21534653465346534, 'need': 0.19306930693069307, 'someone': 0.1905940594059406, 'fucking': 0.1782178217821782, 'scared': 0.17574257425742573, 'us': 0.17574257425742573, 'panic': 0.17326732673267325, 'boyfriend': 0.16831683168316833, 'think': 0.16831683168316833, 'problem': 0.16584158415841585, 'attack': 0.16336633663366337, 'worse': 0.15594059405940594, 'anyone': 0.1485148514851485, 'angry': 0.14603960396039603, 'afraid': 0.14603960396039603, 'wrong': 0.14356435643564355, 'abusive': 0.14108910891089108, 'pain': 0.13861386138613863}.keys())[:k]
relaxed_words = list({'help': 1.0, 'good': 0.9748743718592965, 'started': 0.7135678391959799, 'well': 0.6834170854271356, 'able': 0.5376884422110553, 'made': 0.5276381909547738, 'together': 0.49246231155778897, 'thank': 0.48743718592964824, 'went': 0.4723618090452261, 'make': 0.3869346733668342, 'found': 0.3768844221105528, 'best': 0.37185929648241206, 'relationship': 0.35175879396984927, 'work': 0.34673366834170855, 'friend': 0.33668341708542715, 'used': 0.32663316582914576, 'first': 0.32160804020100503, 'another': 0.32160804020100503, 'helped': 0.32160804020100503, 'life': 0.3065326633165829, 'new': 0.3015075376884422, 'told': 0.3015075376884422, 'took': 0.2864321608040201, 'decided': 0.2814070351758794, 'though': 0.2814070351758794}.keys())[:k]

In [None]:
# def list_2_str(list):
#   return [','.join(list)]

for l in [female_words,male_words,stress_words,relaxed_words]:
  # print(list_2_str(l))
  print(l)


['girl', 'lot', 'mom', 'even', 'never', 'feel', 'much', 'mother', 'need', 'going', 'still', 'help', 'take', 'went', 'back', 'make', 'well', 'point', 'p', 'woman']
['friend', 'people', 'guy', 'family', 'person', 'love', 'good', 'life', 'someone', 'help', 'boyfriend', 'relationship', 'work', 'brother', 'anyone', 'man', 'dad', 'thought', 'old', 'kid']
['feel', 'anxiety', 'feeling', 'trying', 'bad', 'abuse', 'hate', 'fear', 'need', 'someone', 'fucking', 'scared', 'us', 'panic', 'boyfriend', 'think', 'problem', 'attack', 'worse', 'anyone']
['help', 'good', 'started', 'well', 'able', 'made', 'together', 'thank', 'went', 'make', 'found', 'best', 'relationship', 'work', 'friend', 'used', 'first', 'another', 'helped', 'life']


In [None]:
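# Note: this overwrites the male_words / female_words lists defined above
# with small hand-picked lists of gendered words (filtered to BERT's vocabulary).
male_words = to_words("he, man, father, boy, brother")
female_words = to_words("she, woman, mother, girl, sister")
male_plural_words = to_words("boys, men, fathers, brothers")
female_plural_words = to_words("girls, women, mothers, sisters")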


Since male words are simply more likely than female words a priori, `bias_score` corrects for this: it additionally masks the topic word (`XXX`) and measures the prior probabilities of the gendered words (`GGG`).
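
As a rough illustration of the correction, the sketch below computes the score from four probabilities. The function name `prior_corrected_bias` and the probability values are hypothetical. The decomposition (log-ratio on the filled template minus log-ratio on the template with the topic word also masked) is an assumption based on the `bias`, `prior_correction`, and `bias_prior_corrected` columns in the outputs further down, where the third column equals the first minus the second.

```python
import math

def prior_corrected_bias(p_male_tgt, p_female_tgt, p_male_prior, p_female_prior):
    """Return (bias, prior_correction, bias_prior_corrected) as log-probability ratios.

    *_tgt:   P([MASK] = gendered word) in e.g. "[MASK] likes anxiety."
    *_prior: P([MASK] = gendered word) in e.g. "[MASK] likes [MASK]." (topic word masked too)
    """
    bias = math.log(p_male_tgt) - math.log(p_female_tgt)
    prior_correction = math.log(p_male_prior) - math.log(p_female_prior)
    return bias, prior_correction, bias - prior_correction

# Hypothetical probabilities: the male word is more likely in both templates,
# but less so in the filled template than the prior alone would predict.
bias, prior, corrected = prior_corrected_bias(0.30, 0.25, 0.40, 0.30)
print(round(bias, 4), round(prior, 4), round(corrected, 4))
```

Under this reading, a positive corrected score means the topic word shifts the model toward the male words beyond its baseline preference, and a negative score means it shifts toward the female words.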

In [None]:
def calc_bias_for_topic(topic_words_list):
  df1 = pd.concat([
      pd.DataFrame([bias_utils.bias_score("GGG likes XXX.", [male_words, female_words], w) for w in topic_words_list]),
      pd.DataFrame([bias_utils.bias_score("GGG like XXX.", [male_plural_words, female_plural_words], w) for w in topic_words_list]),
      # pd.DataFrame([bias_utils.bias_score("GGG is interested in XXX.", [male_words, female_words], w) for w in topic_words_list])
      pd.DataFrame([bias_utils.bias_score("GGG is feeling XXX.", [male_words, female_words], w) for w in topic_words_list])
  ])
  return df1

In [None]:
df1 = calc_bias_for_topic(stress_words)
df1[-k:]

| | stimulus | bias | prior_correction | bias_prior_corrected |
|---|---|---|---|---|
| 0 | feel | 0.470788 | 0.294753 | 0.176036 |
| 1 | anxiety | 0.021616 | 0.294753 | -0.273137 |
| 2 | feeling | 0.614186 | 0.294753 | 0.319434 |
| 3 | trying | 0.065343 | 0.294753 | -0.22941 |
| 4 | bad | -0.168169 | 0.294753 | -0.462921 |
| 5 | abuse | -0.294708 | 0.294753 | -0.58946 |
| 6 | hate | 0.268701 | 0.294753 | -0.026052 |
| 7 | fear | 0.201475 | 0.294753 | -0.093278 |
| 8 | need | 0.182734 | 0.294753 | -0.112018 |
| 9 | someone | 0.074385 | 0.294753 | -0.220367 |


In [None]:
df1["bias_prior_corrected"].mean()

0.040313516890428856

In [None]:
df2 = calc_bias_for_topic(relaxed_words)
df2[-k:]

| | stimulus | bias | prior_correction | bias_prior_corrected |
|---|---|---|---|---|
| 0 | help | -0.319322 | 0.294753 | -0.614074 |
| 1 | good | -0.159975 | 0.294753 | -0.454728 |
| 2 | started | 0.22632 | 0.294753 | -0.068433 |
| 3 | well | -0.191389 | 0.294753 | -0.486142 |
| 4 | able | 0.090212 | 0.294753 | -0.20454 |
| 5 | made | -0.326633 | 0.294753 | -0.621386 |
| 6 | together | -0.193437 | 0.294753 | -0.48819 |
| 7 | thank | 0.052191 | 0.294753 | -0.242562 |
| 8 | went | 0.105725 | 0.294753 | -0.189028 |
| 9 | make | -0.040059 | 0.294753 | -0.334812 |


In [None]:
df2["bias_prior_corrected"].mean()

-0.15399659529319856

In [None]:
bias_utils.get_effect_size(df1, df2)

0.41601162434616906

In [None]:
ttest_ind(df1["bias_prior_corrected"], df2["bias_prior_corrected"])

Ttest_indResult(statistic=2.320174525439556, pvalue=0.022049762070651367)

In [None]:
ranksums(df1["bias_prior_corrected"], df2["bias_prior_corrected"])

RanksumsResult(statistic=3.1754264805429417, pvalue=0.0014961642897455493)

In [None]:
bias_utils.exact_mc_perm_test(df1["bias_prior_corrected"], df2["bias_prior_corrected"])

0.02184
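
The value above is a p-value from a permutation test. A minimal sketch of a two-sided Monte Carlo permutation test on the difference of means follows. The function `mc_perm_test`, its `n_resamples` and `seed` parameters, and the synthetic data are assumptions for illustration and may differ from what `bias_utils.exact_mc_perm_test` does internally.

```python
import numpy as np

def mc_perm_test(xs, ys, n_resamples=10_000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference of means."""
    rng = np.random.default_rng(seed)
    xs, ys = np.asarray(xs, dtype=float), np.asarray(ys, dtype=float)
    observed = abs(xs.mean() - ys.mean())
    pooled = np.concatenate([xs, ys])
    n = len(xs)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random relabelling of the pooled sample
        diff = abs(pooled[:n].mean() - pooled[n:].mean())
        hits += diff >= observed  # count relabellings at least as extreme as observed
    return hits / n_resamples

# Synthetic data (illustrative only): two groups with clearly different means.
rng = np.random.default_rng(1)
group_a = rng.normal(0.0, 1.0, size=60)
group_b = rng.normal(1.0, 1.0, size=60)
p_shifted = mc_perm_test(group_a, group_b)
p_same = mc_perm_test(group_a, group_a.copy())
print(p_shifted, p_same)
```

The p-value is the fraction of random relabellings whose mean difference is at least as large as the observed one, so it makes no normality assumption, unlike the t-test above.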

Calculating the WEAT (Word Embedding Association Test) metric for comparison
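
For orientation, a WEAT-style effect size for a single attribute vector compares its mean cosine similarity to two sets of target vectors and normalizes by the standard deviation of the similarities over both sets. A small sketch with random vectors standing in for contextual embeddings (the function names and data are illustrative assumptions, not part of `bias_utils`):

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def weat_style_effect_size(attribute_vec, targets_x, targets_y):
    """(mean sim to X - mean sim to Y) / std of the similarities over X and Y together."""
    sims_x = [cosine_similarity(attribute_vec, t) for t in targets_x]
    sims_y = [cosine_similarity(attribute_vec, t) for t in targets_y]
    return (np.mean(sims_x) - np.mean(sims_y)) / np.std(sims_x + sims_y)

# Random stand-ins for word vectors (illustrative only).
rng = np.random.default_rng(0)
attribute = rng.normal(size=16)
near = [attribute + rng.normal(scale=0.5, size=16) for _ in range(10)]  # similar to attribute
far = [rng.normal(size=16) for _ in range(10)]  # unrelated to attribute
effect = weat_style_effect_size(attribute, near, far)
print(effect)
```

A positive value means the attribute vector is, on average, closer to the first target set than to the second; swapping the two sets flips the sign.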

In [None]:
wvs1 = [
    bias_utils.get_word_vector(f"[MASK] like {x}", x) for x in stress_words
] + [
    bias_utils.get_word_vector(f"[MASK] likes {x}", x) for x in stress_words
] + [
    bias_utils.get_word_vector(f"[MASK] is feeling {x}", x) for x in stress_words
]
wvs2 = [
    bias_utils.get_word_vector(f"[MASK] like {x}", x) for x in relaxed_words
] + [
    bias_utils.get_word_vector(f"[MASK] likes {x}", x) for x in relaxed_words    
] + [
    bias_utils.get_word_vector(f"[MASK] is feeling {x}", x) for x in relaxed_words
]

In [None]:
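wv_fm = bias_utils.get_word_vector("women like [MASK]", "women")
wv_fm2 = bias_utils.get_word_vector("she likes [MASK]", "she")
# Cosine similarity of each female vector to the stress (wvs1) and relaxed (wvs2) vectors
sims_fm1 = [bias_utils.cosine_similarity(wv_fm, wv) for wv in wvs1] +\
           [bias_utils.cosine_similarity(wv_fm2, wv) for wv in wvs1]
sims_fm2 = [bias_utils.cosine_similarity(wv_fm, wv) for wv in wvs2] +\
           [bias_utils.cosine_similarity(wv_fm2, wv) for wv in wvs2]
mean_diff = np.mean(sims_fm1) - np.mean(sims_fm2)
# NOTE: sims_fm1 is concatenated with itself here, so std_ ignores sims_fm2;
# a WEAT-style effect size would use np.std(sims_fm1 + sims_fm2).
# The output below reflects the line as written.
std_ = np.std(sims_fm1 + sims_fm1)
effect_sz_female_stress_relaxed = mean_diff / std_
effect_sz_female_stress_relaxed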

0.6244386

In [None]:
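wv_m = bias_utils.get_word_vector("men like [MASK]", "men")
wv_m2 = bias_utils.get_word_vector("he likes [MASK]", "he")
# Cosine similarity of each male vector to the stress (wvs1) and relaxed (wvs2) vectors
sims_m1 = [bias_utils.cosine_similarity(wv_m, wv) for wv in wvs1] +\
          [bias_utils.cosine_similarity(wv_m2, wv) for wv in wvs1]
sims_m2 = [bias_utils.cosine_similarity(wv_m, wv) for wv in wvs2] +\
          [bias_utils.cosine_similarity(wv_m2, wv) for wv in wvs2]
mean_diff = np.mean(sims_m1) - np.mean(sims_m2)
# NOTE: sims_m1 is concatenated with itself here, so std_ ignores sims_m2;
# a WEAT-style effect size would use np.std(sims_m1 + sims_m2).
# The output below reflects the line as written.
std_ = np.std(sims_m1 + sims_m1)
effect_sz_male_stress_relaxed = mean_diff / std_
effect_sz_male_stress_relaxed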

-0.072254725

In [None]:
bias_utils.exact_mc_perm_test(sims_fm1, sims_m1)

2e-05

In [None]:
bias_utils.exact_mc_perm_test(sims_fm2, sims_m2)

0.69994

In [None]:
ttest_ind(sims_fm1, sims_m1)

Ttest_indResult(statistic=4.391573642296149, pvalue=1.6940074842264283e-05)

In [None]:
ranksums(sims_fm1, sims_m1)

RanksumsResult(statistic=4.235986578825402, pvalue=2.2755036903632495e-05)

In [None]:
ttest_ind(sims_fm2, sims_m2)

Ttest_indResult(statistic=-0.38583028260432833, pvalue=0.6999670210228992)

In [None]:
ranksums(sims_fm2, sims_m2)

RanksumsResult(statistic=-0.4834752021486412, pvalue=0.6287583630536298)