# Replications of WEAT Experiments

Reference:

> Knoche, M., Popović, R., Lemmerich, F., & Strohmaier, M. (2019, September). Identifying biases in politically biased wikis through word embeddings. In Proceedings of the 30th ACM conference on hypertext and social media (pp. 253-257).

In [1]:
import os
import pickle

import numpy as np

from typing import Dict, List

## Load the embeddings
Each wiki folder contains eight embeddings trained separately on the same corpus. Because small differences in random initialization can yield different embeddings from the same data set and algorithm (Antoniak and Mimno, 2018), the effect sizes are averaged across these embeddings.

In [9]:
combined_embeddings = {}
for wiki in os.listdir('embeddings'):
    combined_embeddings[wiki] = []
    for pkl in os.listdir(f'embeddings/{wiki}'):
        if pkl.endswith('.pkl'):
            with open(f'embeddings/{wiki}/{pkl}', 'rb') as f:
                embeddings = pickle.load(f)
            combined_embeddings[wiki].append(
                {word: np.array(vec) for word, vec in embeddings}
            )
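
Because the effect sizes are averaged over the separately trained embeddings, it can help to look at the run-to-run spread alongside the mean. The sketch below is illustrative only: `summarize_runs` is a hypothetical helper that is not part of this notebook, and the input values are made-up numbers, not results from these embeddings.

```python
import numpy as np

def summarize_runs(effect_sizes):
    """Return (mean, sample standard deviation) of per-run effect sizes.

    Hypothetical helper, not part of the original notebook.
    """
    values = np.asarray(effect_sizes, dtype=float)
    # ddof=1 gives the sample standard deviation across runs;
    # a single run has no spread, so report 0.0 in that case
    spread = values.std(ddof=1) if values.size > 1 else 0.0
    return float(values.mean()), float(spread)

# Made-up effect sizes from eight hypothetical training runs
example_runs = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 1.3, 0.7]
mean, spread = summarize_runs(example_runs)
print(f"mean={mean:.3f}, std={spread:.3f}")
```

A large spread relative to the mean would suggest that a single averaged number hides meaningful variation between training runs.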

## Define Attribute and Target Sets

In [10]:
attributes = {'male_names': ['john', 'paul', 'mike', 'kevin',
                             'steve', 'greg', 'jeff', 'bill'],
              'female_names': ['amy', 'joan', 'lisa', 'sarah',
                                'diana', 'kate', 'ann', 'donna'],
              'male_terms': ['male', 'man', 'boy', 'brother',
                             'he', 'him', 'his', 'son', 'father',
                             'uncle', 'grandfather'],
              'female_terms': ['female', 'woman', 'girl', 'sister',
                                'she', 'her', 'hers', 'daughter',
                                'mother', 'aunt', 'grandmother'],
              'white_names': ['adam', 'chip', 'harry', 'josh', 'roger', 'alan',
                             'frank', 'ian', 'justin', 'ryan', 'andrew', 'fred',
                             'jack', 'matthew', 'stephen', 'brad', 'greg', 'jed',
                             'paul', 'todd', 'brandon', 'hank', 'jonathan',
                             'peter', 'wilbur', 'amanda', 'courtney', 'heather',
                             'melanie', 'sara', 'amber', 'crystal', 'katie',
                             'meredith', 'shannon', 'betsy', 'donna', 'kristin',
                             'nancy', 'stephanie', 'bobbie-sue', 'ellen', 'lauren',
                             'peggy', 'sue-ellen', 'colleen', 'emily', 'megan',
                             'rachel', 'wendy', 'brendan', 'geoffrey', 'brett',
                             'jay', 'neil', 'anne', 'carrie', 'jill', 'laurie',
                             'kristen', 'sarah'],
              'black_names': ['alonzo', 'jamel', 'lerone', 'percell', 'theo',
                             'alphonse', 'jerome', 'leroy', 'rasaan', 'torrance',
                             'darnell', 'lamar', 'lionel', 'rashaun', 'tyree',
                             'deion', 'lamont', 'malik', 'terrence', 'tyrone',
                             'everol', 'lavon', 'marcellus', 'terryl', 'wardell',
                             'aiesha', 'lashelle', 'nichelle', 'shereen', 'temeka',
                             'ebony', 'latisha', 'shaniqua', 'tameisha', 'teretha',
                             'jasmine', 'latonya', 'shanise', 'tanisha', 'tia',
                             'lakisha', 'latoya', 'sharise', 'tashika', 'yolanda',
                             'lashandra', 'malika', 'shavonn', 'tawanda', 'yvette',
                             'hakim', 'jermaine', 'kareem', 'jamal', 'rasheed',
                             'aisha', 'keisha', 'kenya', 'tamika'],
              'christianity_words': ['baptism', 'messiah', 'catholicism', 'resurrection',                       
                                     'christianity', 'salvation', 'protestant', 'gospel',                        
                                     'trinity', 'jesus', 'christ', 'christian', 'cross',                         
                                     'catholic', 'church'],
              'islam_words': ['allah', 'ramadan', 'turban', 'emir', 'salaam', 'sunni',                   
                              'koran', 'imam', 'sultan', 'prophet', 'veil', 'ayatollah',                 
                              'shiite', 'mosque', 'islam', 'sheik', 'muslim', 'muhammad'],
              'atheism_words': ['atheism', 'atheist', 'atheistic', 'heliocentric',                         
                                'evolution', 'darwin', 'galilei', 'agnostic',                              
                                'agnosticism', 'pagan', 'science', 'disbelief',                            
                                'scepticism', 'philosophy', 'university', 'kopernikus']}

attributes['male'] = attributes['male_terms'] + attributes['male_names']
attributes['female'] = attributes['female_terms'] + attributes['female_names']

targets = {'career': ['executive', 'management', 'professional', 'corporation',
                      'salary', 'office', 'business', 'career'],
           'family': ['home', 'parents', 'children', 'family', 'cousins',
                      'marriage', 'wedding', 'relatives'],
           'pleasant': ['caress', 'freedom', 'health', 'love', 'peace', 'cheer', 'friend', 'heaven', 'loyal',
                        'pleasure', 'diamond', 'gentle', 'honest', 'lucky', 'rainbow', 'diploma', 'gift', 'honor',
                        'miracle', 'sunrise', 'family', 'happy', 'laughter', 'paradise', 'vacation', 'joy',
                        'wonderful'],
           'unpleasant': ['abuse', 'crash', 'filth', 'murder', 'sickness', 'accident', 'death', 'grief', 'poison',
                          'stink', 'assault', 'disaster', 'hatred', 'pollute', 'tragedy', 'divorce', 'jail', 'poverty',
                          'ugly', 'cancer', 'kill', 'rotten', 'vomit', 'agony', 'prison', 'terrible', 'horrible',
                          'nasty', 'evil', 'war', 'awful', 'failure'],
           'science': ['math', 'algebra', 'geometry', 'calculus', 'equations', 'computation', 'numbers', 'addition',
                       'science', 'technology', 'physics', 'chemistry', 'einstein', 'nasa', 'experiment', 'astronomy'],
           'art': ['poetry', 'art', 'dance', 'literature', 'novel', 'symphony', 'drama', 'sculpture', 'shakespeare'],
           'intellectual_words': ['precocious', 'resourceful', 'inquisitive', 'sagacious', 'inventive', 'astute',
                                  'adaptable', 'reflective', 'discerning', 'intuitive', 'inquiring', 'judicious',
                                  'analytical', 'luminous', 'venerable', 'imaginative', 'shrewd', 'thoughtful', 'sage',
                                  'smart', 'ingenious', 'clever', 'brilliant', 'logical', 'intelligent', 'apt',
                                  'genius', 'wise', 'stupid', 'dumb', 'dull', 'clumsy', 'foolish', 'naive', 'unintelligent', 'trivial', 'unwise', 'idiotic'],
           'appearance_words': ['alluring', 'voluptuous', 'blushing', 'homely', 'plump', 'sensual', 'gorgeous', 'slim',
                                'bald', 'athletic', 'fashionable', 'stout', 'ugly', 'muscular', 'slender', 'feeble',
                                'handsome', 'healthy', 'attractive', 'fat', 'weak', 'thin', 'pretty', 'beautiful',
                                'strong']}

In [11]:
target_order = [
    ("pleasant", "unpleasant"),
    ("science", "art"),
    ("intellectual_words", "appearance_words"),
    ("career", "family"),
]

attribute_order = [
    ("male", "female"),
    ("white_names", "black_names"),
    ("christianity_words", "islam_words"),
    ("christianity_words", "atheism_words"),
]

In [19]:
from sklearn.metrics.pairwise import cosine_similarity

def calculate_mcs(w: np.ndarray, A: List[np.ndarray], B: List[np.ndarray]) -> np.float64:
    """
    Calculate the difference between the mean cosine similarity of the target
    word w to attribute set A and its mean cosine similarity to attribute set B
    w: Target word embedding
    A: Attribute set 1 (list of embeddings)
    B: Attribute set 2 (list of embeddings)
    Returns: np.float64
    """
    return np.mean(cosine_similarity([w], A)) - \
        np.mean(cosine_similarity([w], B))

def calculate_weat_effect_size(X: List[np.ndarray], Y: List[np.ndarray],
                               A: List[np.ndarray], B: List[np.ndarray]) -> np.float64:
    """
    Calculate the WEAT effect size from the provided embeddings
    X: Target set 1 (list of embeddings)
    Y: Target set 2 (list of embeddings)
    A: Attribute set 1 (list of embeddings)
    B: Attribute set 2 (list of embeddings)
    Returns: np.float64
    """
    x_mean = np.mean([calculate_mcs(x, A, B) for x in X])
    y_mean = np.mean([calculate_mcs(y, A, B) for y in Y])
    # Standard deviation over all target words in X and Y (np.std defaults to ddof=0)
    std_dev = np.std([calculate_mcs(w, A, B) for w in X + Y])
    return (x_mean - y_mean) / std_dev
    
def to_e(W, embedding_map):
    """Map each word in W to its embedding vector."""
    return [embedding_map[w] for w in W]
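
As a sanity check on the effect size logic, the sketch below restates the same computation with NumPy only and runs it on made-up 2-D vectors. The function names `cosine`, `association`, and `effect_size` are illustrative and not part of this notebook. With X aligned to A and Y aligned to B, the effect size should be positive, and swapping the target sets should flip its sign.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    """Mean cosine similarity of w to A minus mean cosine similarity of w to B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def effect_size(X, Y, A, B):
    """Difference of the mean associations of X and Y, divided by the
    population standard deviation of the associations over X and Y together."""
    s_x = [association(x, A, B) for x in X]
    s_y = [association(y, A, B) for y in Y]
    return (np.mean(s_x) - np.mean(s_y)) / np.std(s_x + s_y)

# Made-up 2-D vectors: X points toward A, Y points toward B
A = [np.array([1.0, 0.0])]
B = [np.array([0.0, 1.0])]
X = [np.array([1.0, 0.1]), np.array([1.0, 0.2])]
Y = [np.array([0.1, 1.0]), np.array([0.2, 1.0])]

d = effect_size(X, Y, A, B)
print(f"toy effect size: {d:.3f}")
print(f"with target sets swapped: {effect_size(Y, X, A, B):.3f}")
```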

### Filter attributes and targets to the terms for which we have an embedding

In [13]:
for wiki in combined_embeddings:
    for target_pair in target_order:
        for i in (0,1):
            revised_terms = []
            for term in targets[target_pair[i]]:
                if term in combined_embeddings[wiki][0]:
                    revised_terms.append(term)
            targets[target_pair[i]] = revised_terms

    for attribute_pair in attribute_order:
        for i in (0,1):
            revised_terms = []
            for term in attributes[attribute_pair[i]]:
                if term in combined_embeddings[wiki][0]:
                    revised_terms.append(term)
            attributes[attribute_pair[i]] = revised_terms
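
The loops above test membership only against the first embedding of each wiki. If the separately trained runs could differ in vocabulary (an assumption; they may well share one), a stricter variant keeps a term only when every run contains it. `filter_terms` below is a hypothetical helper, shown on made-up toy data rather than the real embeddings.

```python
from typing import Dict, Iterable, List

def filter_terms(terms: Iterable[str],
                 embedding_runs: List[Dict[str, object]]) -> List[str]:
    """Keep, in their original order, the terms present in every embedding run.

    Hypothetical helper, not part of the original notebook.
    """
    return [t for t in terms if all(t in run for run in embedding_runs)]

# Toy stand-ins for two training runs of one wiki
runs = [
    {"home": [0.1], "family": [0.2], "cousins": [0.3]},
    {"home": [0.4], "family": [0.5]},
]
kept = filter_terms(["home", "parents", "family", "cousins"], runs)
print(kept)
```

Returning a new list instead of overwriting `targets` and `attributes` in place would also leave the original word lists available for inspection.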

In [21]:
from tabulate import tabulate

headers = [
    "attribute pair", "male/female",
    "white/black", "christian/islam",
    "christian/atheist"
]

for wiki in combined_embeddings:
    print(wiki)
    print()
    data = []
    for target_pair in target_order:
        row = [f"{target_pair[0]}/{target_pair[1]}"]
        for attribute_pair in attribute_order:
            total = 0.0
            for embedding_map in combined_embeddings[wiki]:
                result = calculate_weat_effect_size(
                    to_e(targets[target_pair[0]], embedding_map),
                    to_e(targets[target_pair[1]], embedding_map),
                    to_e(attributes[attribute_pair[0]], embedding_map),
                    to_e(attributes[attribute_pair[1]], embedding_map)
                )
                total += result
            row.append(f"{total / len(combined_embeddings[wiki]):.3f}")
        data.append(row)

    print(tabulate(data, headers=headers))
    print()

conservapedia

attribute pair                         male/female    white/black    christian/islam    christian/atheist
-----------------------------------  -------------  -------------  -----------------  -------------------
pleasant/unpleasant                          0.1            0.704              0.665                1.03
science/art                                  1.535          0.337             -0.043               -1.557
intellectual_words/appearance_words          0.891         -0.375              0.447               -0.491
career/family                                1.682         -0.294             -1.418               -1.703

wikipedia

attribute pair                         male/female    white/black    christian/islam    christian/atheist
-----------------------------------  -------------  -------------  -----------------  -------------------
pleasant/unpleasant                         -0.11           0.388              0.418                0.528
science/art         

## Confirm the effect size calculations using the WEFE library (Badilla et al., 2020)

Details can be found at https://wefe.readthedocs.io/en/latest/index.html.

In [16]:
from gensim.models import KeyedVectors
from wefe import WordEmbeddingModel, Query, WEAT

for wiki in combined_embeddings:
    print(wiki)
    print()
    data = []
    for target_pair in target_order:
        row = [f"{target_pair[0]}/{target_pair[1]}"]
        for attribute_pair in attribute_order:
            total = 0.0
            for embedding_map in combined_embeddings[wiki]:
                # Build a gensim KeyedVectors (vector_size=168) from the word -> vector map
                kv = KeyedVectors(168)
                keys = list(embedding_map.keys())
                values = [np.array(vec) for vec in embedding_map.values()]
                kv.add_vectors(keys=keys, weights=values)
                model = WordEmbeddingModel(kv)
                query = Query(
                    [targets[target_pair[0]], 
                     targets[target_pair[1]]], 
                    [attributes[attribute_pair[0]], 
                     attributes[attribute_pair[1]]])
                weat = WEAT()
                results = weat.run_query(
                    query, model, 
                    return_effect_size=True, 
                    calculate_p_value=False,
                    normalize=True
                )
                total += results["effect_size"]
            row.append(f"{total / len(combined_embeddings[wiki]):.3f}")
        data.append(row)

    print(tabulate(data, headers=headers))
    print()

conservapedia

attribute                              male/female    white/black    christian/islam    christian/atheist
-----------------------------------  -------------  -------------  -----------------  -------------------
pleasant/unpleasant                          0.1            0.704              0.665                1.03
science/art                                  1.535          0.337             -0.043               -1.557
intellectual_words/appearance_words          0.891         -0.375              0.447               -0.491
career/family                                1.682         -0.294             -1.418               -1.703

wikipedia

attribute                              male/female    white/black    christian/islam    christian/atheist
-----------------------------------  -------------  -------------  -----------------  -------------------
pleasant/unpleasant                         -0.11           0.388              0.418                0.528
science/art         

## Compare these results with Table 1 on p. 256 of Knoche et al.

The values there are reported as Cohen's d (the WEAT effect size).
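
For reference, the WEAT effect size defined by Caliskan et al. (2017) can be written as below. The symbols follow the argument names of `calculate_weat_effect_size(X, Y, A, B)` in this notebook: $X$ and $Y$ are the two target sets, $A$ and $B$ the two attribute sets. The quantity has the form of Cohen's d, a difference of means divided by a standard deviation.

```latex
s(w, A, B) = \operatorname{mean}_{a \in A} \cos(\vec{w}, \vec{a})
           - \operatorname{mean}_{b \in B} \cos(\vec{w}, \vec{b})

d = \frac{\operatorname{mean}_{x \in X} s(x, A, B) - \operatorname{mean}_{y \in Y} s(y, A, B)}
         {\operatorname{std\text{-}dev}_{w \in X \cup Y} \, s(w, A, B)}
```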

In [30]:
print("conservapedia")
print()

data = [
    ["pleasant/unpleasant", 0.012, 0.727, 0.547, 1.025],
    ["science/art", 1.626, 0.304, 0.128, -2.123],
    ["intellectual_words/appearance_words", 0.882, -0.282, 0.545, -0.338],
    ["career/family", 2.420, -0.220, -1.095, -1.320]
]

print(tabulate(data, headers=headers))
print()

print("wikipedia")
print()

data = [
    ["pleasant/unpleasant", -0.114, 0.305, 0.273, 0.419],
    ["science/art", 1.324, -0.043, -0.610, -2.197],
    ["intellectual_words/appearance_words", 0.334, 0.096, 0.072, -1.219],
    ["career/family", 2.432, 0.010, -0.199, -1.087]
]

print(tabulate(data, headers=headers))
print()


print("rationalwiki")
print()

data = [
    ["pleasant/unpleasant", -0.112, 0.257, 0.271, 0.321],
    ["science/art", 0.732, -0.516, 0.516, -1.837],
    ["intellectual_words/appearance_words", 1.560, -0.150, 0.360, -1.258],
    ["career/family", 1.734, 0.002, -0.848, -1.291]
]

print(tabulate(data, headers=headers))


conservapedia

attribute pair                         male/female    white/black    christian/islam    christian/atheist
-----------------------------------  -------------  -------------  -----------------  -------------------
pleasant/unpleasant                          0.012          0.727              0.547                1.025
science/art                                  1.626          0.304              0.128               -2.123
intellectual_words/appearance_words          0.882         -0.282              0.545               -0.338
career/family                                2.42          -0.22              -1.095               -1.32

wikipedia

attribute pair                         male/female    white/black    christian/islam    christian/atheist
-----------------------------------  -------------  -------------  -----------------  -------------------
pleasant/unpleasant                         -0.114          0.305              0.273                0.419
science/art         
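
To compare the tables numerically rather than by eye, the conservapedia values can be placed in arrays. This sketch only restates numbers already shown in this notebook (the replicated effect sizes and the Table 1 values entered above); the variable names are illustrative.

```python
import numpy as np

# Replicated conservapedia effect sizes
# (rows: target pairs, columns: attribute pairs, same order as the tables)
replicated = np.array([
    [0.100,  0.704,  0.665,  1.030],
    [1.535,  0.337, -0.043, -1.557],
    [0.891, -0.375,  0.447, -0.491],
    [1.682, -0.294, -1.418, -1.703],
])

# Conservapedia values from Table 1 of Knoche et al., as entered in this notebook
published = np.array([
    [0.012,  0.727,  0.547,  1.025],
    [1.626,  0.304,  0.128, -2.123],
    [0.882, -0.282,  0.545, -0.338],
    [2.420, -0.220, -1.095, -1.320],
])

abs_diff = np.abs(replicated - published)
same_sign = np.sign(replicated) == np.sign(published)

print(f"mean absolute difference: {abs_diff.mean():.3f}")
print(f"largest absolute difference: {abs_diff.max():.3f}")
print(f"cells with matching sign: {same_sign.sum()} of {same_sign.size}")
```

The same comparison could be repeated for the other wikis once their full tables are available.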

## References

<a id="1">[1]</a> Antoniak, M., & Mimno, D. (2018). Evaluating the stability of embedding-based word similarities. Transactions of the Association for Computational Linguistics, 6, 107-119.
<p>
<a id="2">[2]</a> Badilla, P., Bravo-Marquez, F., & Pérez, J. (2020, January). WEFE: The Word Embeddings Fairness Evaluation Framework. In IJCAI (pp. 430-436).
<p>
<a id="3">[3]</a> Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183-186.