<a href="https://colab.research.google.com/github/wendyku/gender-neutral-captioning/blob/master/bias_amplification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Evaluating mean bias amplification

Definition: bias amplification measures how much a bias already present in the training set is amplified in the model's output on the evaluation/test set.

Bias on training set : $b^{*}(o, g)$ \
Bias on test set :  $\tilde b(o, g)$

If $o$ is positively correlated with $g$ (i.e.,
$b^{*}(o, g) > 1/||G||$) and $\tilde b(o, g)$ is larger than
$b^{*}(o, g)$, we say the bias has been amplified. For
example, if $b^{*}(\text{cooking}, \text{woman}) = 0.66$ and $\tilde b(\text{cooking}, \text{woman}) = 0.84$, then the bias of woman toward cooking has been amplified.


<b>Mean bias amplification</b> =
$$\frac{1}{|O|}\sum\limits_{g}\ \sum\limits_{o\in\{o\in O \,\mid\, b^{*}(o,g)>1/||G||\}}\left(\tilde b(o,g) - b^{*}(o,g)\right)$$


This score estimates the average magnitude of bias
amplification for pairs of $o$ and $g$ which exhibited
bias.

Since we treat gender as binary, $G = \{\text{man}, \text{woman}\}$ and $||G|| = 2$, so the threshold $1/||G||$ is $0.5$.
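
The definition above can be sketched in plain Python. This is a minimal illustration, not the `bias_amplification` function from `amp_utils`: the function name `mean_bias_amplification`, the `(object, gender)` dictionary layout, and the `man` scores for cooking (taken as the complements of the `woman` scores) are assumptions made for the example.

```python
def mean_bias_amplification(train_bias, test_bias, num_genders=2):
    """Mean of (test bias - training bias) over pairs biased in training.

    Both arguments map (object, gender) -> bias score in [0, 1].
    """
    threshold = 1.0 / num_genders             # 1 / ||G||
    objects = {obj for obj, _ in train_bias}  # the object set O
    total = 0.0
    for pair, b_train in train_bias.items():
        if b_train > threshold:               # keep only positively correlated pairs
            total += test_bias[pair] - b_train
    return total / len(objects)


# Hypothetical scores built from the cooking example in the text.
train_bias = {("cooking", "woman"): 0.66, ("cooking", "man"): 0.34}
test_bias = {("cooking", "woman"): 0.84, ("cooking", "man"): 0.16}

amplification = mean_bias_amplification(train_bias, test_bias)
print(round(amplification, 2))  # 0.18
```

A positive result means the test-set bias exceeds the training bias on average. A negative result means the biased pairs are less pronounced in the test set than in training.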

In [2]:
from amp_utils import *
from bias_analysis import *
from pprint import pprint
import glob


### Calculating bias amplification for the initial unbalanced dataset with sample test captions

The training bias dictionaries $b^{*}(o,g)$ are loaded from the `training bias/` folder.

In [2]:
test_captions = [
    'a group of men playing a game of baseball .',
    'a man holding a tennis racquet on a tennis court .',
    'a baseball player holding a bat near home plate .',
]

In [3]:
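# Load the training bias scores for nouns, one dictionary per gender.
train_man_dict, train_woman_dict = get_bias_dict('nouns')
pprint(train_man_dict)
pprint(train_woman_dict)

# Compare the sample test captions against the training bias.
mean_bias_amp = bias_amplification(
    [],
    test_captions=test_captions,
    train_dict_man=train_man_dict,
    train_dict_woman=train_woman_dict,
)
# print(mean_bias_amp)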

./training bias\female_nouns.txt
./training bias\male_nouns.txt
{'bags': 0.8333,
 'beach': 0.8529,
 'beer': 0.8889,
 'carriage': 0.8889,
 'dirt': 0.8947,
 'grass': 0.8889,
 'kite': 0.875,
 'road': 0.84,
 'skateboard': 0.8919,
 'sunglasses': 0.8333,
 'surfboard': 0.8333,
 'tennis': 0.9231,
 'wagon': 0.8333}
{'bed': 0.9048,
 'bridle': 0.8333,
 'curb': 0.8,
 'device': 0.8,
 'dress': 0.9286,
 'face': 0.7778,
 'fire': 0.8182,
 'flowers': 0.875,
 'lap': 0.9091,
 'leash': 0.7778,
 'mouth': 0.8571,
 'teeth': 0.8}


In [33]:
train_man_dict, train_woman_dict = get_bias_dict('nouns')
mean_bias_amp = bias_amplification(
    [],
    test_captions=test_captions,
    train_dict_man=train_man_dict,
    train_dict_woman=train_woman_dict,
)
print(mean_bias_amp)

./training bias\female_nouns.txt
./training bias\male_nouns.txt
-0.8135791666666666


The mean bias amplification is strongly negative here because only three sample test captions are used. The metric assumes that the test set follows the same distribution as the training set, and such a small sample does not satisfy that assumption.
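
One way to see the effect of the small sample is to check how many of the training-biased nouns printed above occur in the test captions at all. The sketch below is a standalone check, not part of `amp_utils`. It assumes whitespace tokenisation is adequate for these pre-tokenised captions. How `bias_amplification` scores a noun that never appears in the test captions is not shown in this notebook; if such a noun is given a test bias of 0 (an assumption), each missing noun contributes $-b^{*}(o,g)$ to the sum.

```python
test_captions = [
    'a group of men playing a game of baseball .',
    'a man holding a tennis racquet on a tennis court .',
    'a baseball player holding a bat near home plate .',
]

# Keys of the two training bias dictionaries printed above.
man_nouns = ['bags', 'beach', 'beer', 'carriage', 'dirt', 'grass', 'kite',
             'road', 'skateboard', 'sunglasses', 'surfboard', 'tennis', 'wagon']
woman_nouns = ['bed', 'bridle', 'curb', 'device', 'dress', 'face', 'fire',
               'flowers', 'lap', 'leash', 'mouth', 'teeth']
biased_nouns = man_nouns + woman_nouns

# Collect every distinct token that occurs in the test captions.
caption_tokens = {token for caption in test_captions for token in caption.split()}

covered = [noun for noun in biased_nouns if noun in caption_tokens]
missing = [noun for noun in biased_nouns if noun not in caption_tokens]

print(covered)                               # ['tennis']
print(len(missing), 'of', len(biased_nouns)) # 24 of 25
```

Only one of the 25 biased nouns is covered by the sample captions, so the score mostly reflects missing coverage and says little about the model's behaviour.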