# Word2Vec4Analogies


In this notebook we:

- generate batches for the skip-gram model
- implement two loss functions to train word embeddings
- tune the hyperparameters for the word embeddings
- apply the best learned word embeddings to a word analogy task
- calculate bias scores for the best models
- create a new task on which to run the WEAT


## Setting up the data and needed libraries


In [None]:
# datafile
!wget http://mattmahoney.net/dc/text8.zip
!unzip text8.zip
!rm text8.zip

# imports
import collections
import json

import numpy as np

import torch
import torch.nn as nn

import math

from tqdm import tqdm

import os
import pickle

# seeds for repeatable experiments
np.random.seed(1234)
torch.manual_seed(1234)

hw = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
from importlib import util
if util.find_spec("torch_directml"):
    # directml support
    import torch_directml
    hw = torch_directml.device()

## Generating the Data


To train word vectors, we generate training instances from the given data in batches. For the skip-gram model, we slide a window and sample training instances from the data inside the window.

<b>For example:</b>

Suppose that we have a text: "The quick brown fox jumps over the lazy dog."
and batch_size = 8, window_size = 3

"<font color = red>[The quick brown]</font> fox jumps over the lazy dog"

The context word would be 'quick' and the words to predict are 'The' and 'brown'.

This will generate training examples of the form context(x), predicted_word(y) like:

<ul>
      <li>(quick    ,       The)
      <li>(quick    ,     brown)
</ul>
And then move the sliding window.

"The <font color = red>[quick brown fox]</font> jumps over the lazy dog"

In the same way, we have two more examples:

<ul>
    <li>(brown, quick)
    <li>(brown, fox)
</ul>

Moving the window again:

"The quick <font color = red>[brown fox jumps]</font> over the lazy dog"

We get,

<ul>
    <li>(fox, brown)
    <li>(fox, jumps)
</ul>

Finally we get two more instances from the moved window,

"The quick brown <font color = red>[fox jumps over]</font> the lazy dog"

<ul>
    <li>(jumps, fox)
    <li>(jumps, over)
</ul>

Since we now have 8 training instances, which is the batch size, we stop generating this batch and return the batch data.
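The walkthrough above can be reproduced with a short, self-contained sketch. The helper name `skipgram_pairs` and its `max_pairs` argument are illustrative assumptions and not part of the notebook's `Dataset` class; the sketch simply enumerates every neighbor inside the window, which may differ from how `Dataset.generate_batch` traverses the data.

```python
def skipgram_pairs(tokens, skip_window=1, max_pairs=None):
    """Return (context, predicted) pairs from a sliding window.

    skip_window is the number of neighbors on each side, so the
    window size is 2 * skip_window + 1.
    """
    pairs = []
    # only positions with a full window on both sides are used
    for i in range(skip_window, len(tokens) - skip_window):
        for offset in range(-skip_window, skip_window + 1):
            if offset == 0:
                continue  # skip the word in the middle of the window
            pairs.append((tokens[i], tokens[i + offset]))
            if max_pairs is not None and len(pairs) == max_pairs:
                return pairs  # the batch is full
    return pairs


tokens = "The quick brown fox jumps over the lazy dog".split()
batch = skipgram_pairs(tokens, skip_window=1, max_pairs=8)
for context, predicted in batch:
    print(context, "->", predicted)
```

With `skip_window=1` (window size 3) and `max_pairs=8`, the sketch stops after the window centered on "jumps", matching the eight instances listed above.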


In [None]:
# Read the data into a list of strings.
def read_data(filename):
    with open(filename) as file:
        text = file.read()
        data = [token.lower() for token in text.strip().split(" ")]
    return data


def build_dataset(words, vocab_size):
    count = [["UNK", -1]]
    count.extend(collections.Counter(words).most_common(vocab_size - 1))
    # token_to_id dictionary, id_to_token reverse dictionary
    vocab_token_to_id = dict()
    for word, _ in count:
        vocab_token_to_id[word] = len(vocab_token_to_id)
    data = list()
    unk_count = 0
    for word in words:
        if word in vocab_token_to_id:
            index = vocab_token_to_id[word]
        else:
            index = 0  # dictionary['UNK']
            unk_count += 1
        data.append(index)
    count[0][1] = unk_count
    vocab_id_to_token = dict(zip(vocab_token_to_id.values(), vocab_token_to_id.keys()))
    return data, count, vocab_token_to_id, vocab_id_to_token


class Dataset:
    def __init__(self, data, batch_size=128, num_skips=8, skip_window=4):
        """
        @data_index: the index of a word. You can access a word using data[data_index]
        @batch_size: the number of instances in one batch
        @num_skips: the number of samples you want to draw in a window
                (In the example above, it was 2)
        @skip_window: decides how many words to consider left and right from a context word.
                    (So, skip_window*2+1 = window_size)
        """

        self.data_index = 0
        self.data = data
        assert batch_size % num_skips == 0
        assert num_skips <= 2 * skip_window

        self.batch_size = batch_size
        self.num_skips = num_skips
        self.skip_window = skip_window

    def reset_index(self, idx=0):
        self.data_index = idx

    def generate_batch(self):
        """
        Collect the center and context words for each batch. Then return them as a tuple of tensors.

        The first tensor holds word ids for the center words (called context words in the example above). Dimension is [batch_size].
        The second tensor holds word ids for the words to predict. Dimension is [batch_size].
        """

        center_word = np.ndarray(shape=(self.batch_size), dtype=np.int32)
        context_word = np.ndarray(shape=(self.batch_size), dtype=np.int32)

        # stride: for the rolling window
        stride = 1

        # for each instance in the batch
        for i in range(self.batch_size):
            # get the center and context words; wrap around at the end of the data
            center_word[i] = self.data[self.data_index]
            context_word[i] = self.data[(self.data_index + stride) % len(self.data)]
            # update the stride and data_index
            stride += 1
            if stride > self.skip_window:
                stride = 1
            self.data_index = (self.data_index + 1) % len(self.data)

        return torch.LongTensor(center_word).to(hw), torch.LongTensor(context_word).to(
            hw
        )

## Building the Model


<b>Negative Log Likelihood (NLL): </b>
This is the negative of the log-likelihood function. It serves as a loss function because it measures how far the current model is from the expected behavior. Please refer to Stanford's CS224n [Lecture Notes](http://web.stanford.edu/class/cs224n/readings/cs224n-2019-notes01-wordvecs1.pdf) for more details.

<b>Negative Sampling (NEG): </b>
Negative sampling formulates a slightly different classification task and a corresponding loss. [This paper](https://papers.nips.cc/paper/2013/file/9aa42b31882ec039965f3c4923ce901b-Paper.pdf) describes the method in detail. The idea is to build a classifier that gives high probabilities to words that are correct target words and low probabilities to words that are incorrect target words. As with the negative log likelihood loss, we define the classifier using a function that uses the word vectors of the context and target as free parameters. The key difference is that instead of using the entire vocabulary, we sample a set of k negative words for each instance and create an augmented instance, which is a collection of the true target word and the k negative words. The vectors are then trained to maximize the probability of this augmented instance. You may again refer to the Lecture Notes linked above for more details.
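To make the two objectives concrete, here is a minimal pure-Python sketch of both losses for a single (center, target) pair. The function names and the tiny matrix of context vectors are illustrative assumptions; the notebook's `WordVec` class implements batched versions in PyTorch with additional clamping.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def log_sigmoid(x):
    # numerically stable log(1 / (1 + exp(-x)))
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def nll_loss(center, target_id, context_matrix):
    """-log softmax: log-sum-exp over the whole vocabulary minus the target score."""
    scores = [dot(center, w) for w in context_matrix]
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[target_id]


def neg_loss(center, target_id, negative_ids, context_matrix):
    """Negative sampling: one true target plus k sampled negative words."""
    positive = log_sigmoid(dot(center, context_matrix[target_id]))
    negative = sum(
        log_sigmoid(-dot(center, context_matrix[k])) for k in negative_ids
    )
    return -(positive + negative)


# hypothetical toy vocabulary of 4 words with 2-dimensional vectors
context_matrix = [[0.5, 0.1], [-0.3, 0.8], [0.2, -0.4], [0.0, 0.6]]
center = [0.4, 0.7]
print(nll_loss(center, 1, context_matrix))
print(neg_loss(center, 1, [0, 2], context_matrix))
```

Note that `nll_loss` touches every row of `context_matrix`, while `neg_loss` touches only 1 + k rows, which is the practical appeal of negative sampling for large vocabularies.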


In [None]:
# clip x to avoid overflow and underflow problems
sigmoid = lambda x: 1 / (1 + torch.exp(-x.clamp(-10, 10)))


class WordVec(nn.Module):
    def __init__(
        self, V, embedding_dim, loss_func, counts, num_neg_samples_per_center=1
    ):
        super(WordVec, self).__init__()
        self.center_embeddings = nn.Embedding(
            num_embeddings=V, embedding_dim=embedding_dim
        )
        self.center_embeddings.weight.data.normal_(
            mean=0, std=1 / math.sqrt(embedding_dim)
        )
        self.center_embeddings.weight.data[self.center_embeddings.weight.data < -1] = -1
        self.center_embeddings.weight.data[self.center_embeddings.weight.data > 1] = 1

        self.context_embeddings = nn.Embedding(
            num_embeddings=V, embedding_dim=embedding_dim
        )
        self.context_embeddings.weight.data.normal_(
            mean=0, std=1 / math.sqrt(embedding_dim)
        )
        self.context_embeddings.weight.data[
            self.context_embeddings.weight.data < -1
        ] = (-1 + 1e-10)
        self.context_embeddings.weight.data[self.context_embeddings.weight.data > 1] = (
            1 - 1e-10
        )

        self.loss_func = loss_func
        self.counts = counts

        self.num_neg_samples_per_center = num_neg_samples_per_center

    def forward(self, center_word, context_word):
        if self.loss_func == "nll":
            return self.negative_log_likelihood_loss(center_word, context_word)
        elif self.loss_func == "neg":
            return self.negative_sampling(center_word, context_word)
        else:
            raise Exception("No implementation found for %s" % (self.loss_func))

    def negative_log_likelihood_loss(self, center_word, context_word):
        center_word_embeddings = self.center_embeddings(center_word)  # batches, dims
        context_word_embeddings = self.context_embeddings(context_word)  # batches, dims

        a = torch.sum(
            torch.mul(center_word_embeddings, context_word_embeddings), dim=1
        )  # batches
        # clamp @ to avoid overflow and underflow problems
        b = torch.log(
            torch.sum(
                torch.exp(
                    torch.clamp(
                        torch.mm(
                            center_word_embeddings, self.context_embeddings.weight.t()
                        ),
                        -10,
                        10,
                    )
                ),
                dim=1,
            )
        )
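        # b - a is the per-example negative log likelihood; the loss below applies a
        # softplus on top of it, so it stays positive even when clamping makes b - a negative
        loss = torch.mean(torch.log(1 + torch.exp(b - a)))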

        return loss

    def negative_sampling(self, center_word, context_word):
        # control the number of negative samples for every positive sample
        num_neg_samples_per_center = self.num_neg_samples_per_center

        center_word_embeddings = self.center_embeddings(center_word)
        context_word_embeddings = self.context_embeddings(context_word)

        # get the number of batches and dimensions
        num_batches = center_word_embeddings.shape[0]
        num_dims = center_word_embeddings.shape[1]
        num_neg_samples = num_batches * num_neg_samples_per_center

        # sample negative samples
        neg_samples = torch.multinomial(
            torch.Tensor(self.counts).to(hw),
            num_samples=num_neg_samples,
            replacement=True,
        )
        neg_sample_embeddings = self.context_embeddings(neg_samples)

        # reshape embeddings
        center_word_embeddings = center_word_embeddings.view(num_batches, 1, num_dims)
        context_word_embeddings = context_word_embeddings.view(num_batches, 1, num_dims)
        neg_sample_embeddings = neg_sample_embeddings.view(
            num_batches, num_neg_samples_per_center, num_dims
        )

        # dot product
        center_context_dot_product = torch.sum(
            torch.mul(center_word_embeddings, context_word_embeddings), dim=2
        )
        center_neg_sample_dot_product = torch.sum(
            torch.mul(center_word_embeddings, neg_sample_embeddings), dim=2
        )

        # log sigmoid
        log_sigmoid_center_context = torch.log(sigmoid(center_context_dot_product))
        log_sigmoid_center_neg_sample = torch.log(
            sigmoid(-center_neg_sample_dot_product)
        )

        # loss
        loss = -torch.mean(
            log_sigmoid_center_context + torch.sum(log_sigmoid_center_neg_sample, dim=1)
        )

        return loss

    def print_closest(self, validation_words, reverse_dictionary, top_k=8):
        print("Printing closest words")
        embeddings = torch.zeros(self.center_embeddings.weight.shape).copy_(
            self.center_embeddings.weight
        )
        embeddings = embeddings.data.cpu().numpy()

        validation_ids = validation_words
        norm = np.sqrt(np.sum(np.square(embeddings), axis=1, keepdims=True))
        normalized_embeddings = embeddings / norm
        validation_embeddings = normalized_embeddings[validation_ids]
        similarity = np.matmul(validation_embeddings, normalized_embeddings.T)
        for i in range(len(validation_ids)):
            word = reverse_dictionary[validation_words[i]]
            nearest = (-similarity[i, :]).argsort()[1 : top_k + 1]
            print(word, [reverse_dictionary[nearest[k]] for k in range(top_k)])

## Training and Data Loading Loops


The code below uses the models and losses built above and runs the actual training process.


In [None]:
class Trainer:
    def __init__(self, model, ckpt_save_path, reverse_dictionary):
        self.model = model
        self.ckpt_save_path = ckpt_save_path
        self.reverse_dictionary = reverse_dictionary

    def training_step(self, center_word, context_word):
        loss = self.model(center_word, context_word)
        return loss

    def train(
        self, dataset, max_training_steps, ckpt_steps, validation_words, device=hw, lr=1
    ):
        optim = torch.optim.SGD(self.model.parameters(), lr=lr, weight_decay=1e-6)
        self.model.to(device)
        self.model.train()
        self.losses = []

        # trace errors
        torch.autograd.set_detect_anomaly(True)

        # register hook to clip gradients during backprop
        for p in self.model.parameters():
            p.register_hook(lambda grad: torch.clamp(grad, -1e6, 1e6))

        t = tqdm(range(max_training_steps))
        for curr_step in t:
            optim.zero_grad()
            center_word, context_word = dataset.generate_batch()
            loss = self.training_step(center_word.to(device), context_word.to(device))
            loss.backward()
            optim.step()
            self.losses.append(loss.item())
            if curr_step:
                t.set_description(
                    "Avg loss: %s"
                    % (round(sum(self.losses[-2000:]) / len(self.losses[-2000:]), 3))
                )
            if curr_step % 10000 == 0:
                self.model.print_closest(validation_words, self.reverse_dictionary)
            if curr_step % ckpt_steps == 0 and curr_step > 0:
                self.save_ckpt(curr_step)

    def save_ckpt(self, curr_step):
        torch.save(self.model, "%s/%s.pt" % (self.ckpt_save_path, str(curr_step)))

## Training


The following run_training function will train a model.


In [None]:
def create_path(path):
    if not os.path.exists(path):
        os.mkdir(path)
        print("Created a path: %s" % (path))


def run_training(
    model_type="nll",  # loss function: 'nll' or 'neg'
    lr=1.0,
    num_neg_samples_per_center=1,  # negative samples per center word
    checkpoint_model_path="./checkpoints",
    final_model_path="./final_model",
    skip_window=1,  # skip window size
    vocab_size=int(1e5),
    num_skips=2,  # num samples to be drawn from a window
    batch_size=64,  # number of x,y pairs in a batch
    embedding_size=128,  # embedding vector size
    checkpoint_step=50000,  # steps between saving checkpoints
    max_num_steps=200001,  # max number of steps to train
):
    # if checkpoint path exists, continue training from the checkpoint
    point = 0
    if os.path.exists(checkpoint_model_path):
        point = 1
    checkpoint_model_path = f"{checkpoint_model_path}_{model_type}/"
    create_path(checkpoint_model_path)

    # Read data
    words = read_data("./text8")
    data, count, vocab_token_to_id, vocab_id_to_token = build_dataset(words, vocab_size)

    print("Data size", len(words))
    print("Most common words (+UNK)", count[:5])
    print("Sample data", data[:10], [vocab_id_to_token[i] for i in data[:10]])

    # Calculate the probability of unigrams
    # unigram_cnt = [c for w, c in count]
    count_dict = dict(count)
    unigram_cnt = [
        count_dict[vocab_id_to_token[i]]
        for i in sorted(list(vocab_token_to_id.values()))
    ]
    unigram_cnt = np.array(unigram_cnt) ** (3 / 4)
    unigram_cnt = unigram_cnt / np.sum(unigram_cnt)

    dataset = Dataset(
        data, batch_size=batch_size, num_skips=num_skips, skip_window=skip_window
    )
    center, context = dataset.generate_batch()
    for i in range(8):
        print(
            center[i].item(),
            vocab_id_to_token[center[i].item()],
            "->",
            context[i].item(),
            vocab_id_to_token[context[i].item()],
        )
    dataset.reset_index()

    valid_size = 16  # random set of words to evaluate similarity on
    valid_window = 100  # only pick dev samples in the head of the distribution
    valid_examples = np.random.choice(valid_window, valid_size, replace=False)

    embedding_size = embedding_size
    model = WordVec(
        V=vocab_size,
        embedding_dim=embedding_size,
        loss_func=model_type,
        counts=np.array(unigram_cnt),
        num_neg_samples_per_center=num_neg_samples_per_center,
    )
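    # resume from the most recent checkpoint in the checkpoint directory, if there is one
    ckpt_files = [f for f in os.listdir(checkpoint_model_path) if f.endswith(".pt")]
    if ckpt_files:
        # checkpoints are saved by the Trainer as "<step>.pt"
        latest = max(ckpt_files, key=lambda f: int(os.path.splitext(f)[0]))
        model = torch.load(os.path.join(checkpoint_model_path, latest))
        print("Loaded model from checkpoint")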
    trainer = Trainer(model, checkpoint_model_path, vocab_id_to_token)
    device = hw
    print(f"Device: {device}")
    trainer.train(
        dataset, max_num_steps, checkpoint_step, valid_examples, device, lr=lr
    )
    model_path = final_model_path
    create_path(model_path)
    model_filepath = os.path.join(model_path, "word2vec_%s.model" % (model_type))
    pickle.dump(
        [vocab_token_to_id, model.center_embeddings.weight.detach().cpu().numpy()],
        open(model_filepath, "wb"),
    )

When choosing hyperparameters, one must be careful. If the learning rate is too big, gradient descent may overshoot and fail to reach a minimum of the loss. If it is too small, training will take a long time to reach the minimum. Likewise, if the window size or the number of negative samples is too big, the model will take a long time to train. If they are too small, the model will not be able to learn the context of the words.
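The effect of the learning rate can be illustrated on a toy problem. The sketch below minimizes f(x) = x² with plain gradient descent; it is an assumption-laden stand-in for intuition only and says nothing about which learning rate suits the word2vec losses in this notebook.

```python
def gradient_descent(lr, steps=20, x0=1.0):
    """Minimize f(x) = x**2 with plain gradient descent and return the final x."""
    x = x0
    for _ in range(steps):
        grad = 2.0 * x  # derivative of x**2
        x = x - lr * grad
    return x


too_big = gradient_descent(lr=1.1)      # each step multiplies x by -1.2: diverges
too_small = gradient_descent(lr=0.001)  # each step multiplies x by 0.998: barely moves
reasonable = gradient_descent(lr=0.3)   # each step multiplies x by 0.4: converges quickly

print(too_big, too_small, reasonable)
```

On this quadratic, any learning rate above 1.0 makes the iterate grow in magnitude at every step, while a very small one leaves it close to the starting point after 20 steps.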


In [None]:
run_training(
    model_type="neg",
    checkpoint_model_path="./neg_checkpoints",
    final_model_path="./final_neg_model",
    num_neg_samples_per_center=5,
    lr=6.4,
    skip_window=7,
)

run_training(
    model_type="nll",
    checkpoint_model_path="./nll_checkpoints",
    final_model_path="./final_nll_model",
    num_neg_samples_per_center=5,
    lr=6.4,
    skip_window=7,
)

## Testing: Analogies with Word Embeddings


The word vectors can now be used for word analogy tasks. Each task provides example pairs of a certain relation, and the goal is to find the most and least illustrative word pairs of that relation among a set of choices. One simple method is to measure the similarity of difference vectors: if (a, b) and (c, d) are analogous pairs, then the transformation from a to b (i.e., the vector x that gives b when added to a: a + x = b) should be highly similar to the transformation from c to d (i.e., the vector y that gives d when added to c: c + y = d). In other words, the difference vector (b - a) should be similar to the difference vector (d - c). This difference vector can be thought of as representing the relation between the two words.

Due to the noisy annotation data, the expected accuracy is not high.

Each task is in the following form:

    Consider the following word pairs that share the same relation, R:

    pilgrim:shrine, hunter:quarry, assassin:victim, climber:peak

    Among these word pairs,

    (1) pig:mud
    (2) politician:votes
    (3) dog:bone
    (4) bird:worm

    Q1. Which word pair is the MOST illustrative (most similar) example of the relation R?
    Q2. Which word pair is the LEAST illustrative (least similar) example of the relation R?
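The difference-vector method described above can be sketched on toy data. The 2-dimensional embeddings and the placeholder tokens (`a1`, `b1`, ...) are made up for illustration; averaging the example pairs' difference vectors into a single relation vector is one simple design choice among several.

```python
import math


def sub(u, v):
    return [x - y for x, y in zip(u, v)]


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


# hypothetical embeddings for placeholder tokens
emb = {
    "a1": [0.0, 0.0], "b1": [1.0, 0.0],
    "a2": [0.0, 1.0], "b2": [1.0, 1.2],
    "c1": [2.0, 2.0], "d1": [3.0, 2.1],
    "c2": [0.0, 0.0], "d2": [0.0, 1.0],
    "c3": [1.0, 1.0], "d3": [0.0, 1.0],
}

examples = [("a1", "b1"), ("a2", "b2")]              # pairs sharing relation R
choices = [("c1", "d1"), ("c2", "d2"), ("c3", "d3")]  # pairs to rank

# average difference vector (b - a) over the example pairs
diffs = [sub(emb[b], emb[a]) for a, b in examples]
relation = [sum(column) / len(diffs) for column in zip(*diffs)]

# similarity of each choice's difference vector to the relation vector
sims = [cosine(sub(emb[d], emb[c]), relation) for c, d in choices]
most = choices[sims.index(max(sims))]
least = choices[sims.index(min(sims))]
print("most illustrative:", most, "least illustrative:", least)
```

Here `c1:d1` points in the same direction as the averaged relation vector and `c3:d3` points in nearly the opposite direction, so they are picked as the most and least illustrative pairs.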


In [None]:
def read_data_split(file_path):
    with open(file_path, "r") as f:
        data = f.readlines()

    candidate, test = [], []
    for line in data:
        a, b = line.strip().split("||")
        a = [i[1:-1].split(":") for i in a.split(",")]
        b = [i[1:-1].split(":") for i in b.split(",")]
        candidate.append(a)
        test.append(b)

    return candidate, test


def get_embeddings(examples, embeddings, dictionary):
    """
    For the word pairs in the 'examples' array, fetch embeddings and return.
    """

    norm = np.sqrt(np.sum(np.square(embeddings), axis=1, keepdims=True))
    normalized_embeddings = embeddings / norm

    embs = []
    for line in examples:
        temp = []
        for pairs in line:
            temp.append(
                [
                    normalized_embeddings[dictionary[pairs[0]]],
                    normalized_embeddings[dictionary[pairs[1]]],
                ]
            )
        embs.append(temp)

    result = embs

    return result


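def evaluate_pairs(candidate_embs, test_embs):
    """
    Find the best and worst pairs and return their indices.

    The indices point into the test pairs, because write_solution uses them
    to index the lines of `test`.
    """

    best_pairs = []
    worst_pairs = []

    for i, line in enumerate(test_embs):
        # average difference vector (b - a) over the candidate pairs of this task
        relation = np.mean([pair[1] - pair[0] for pair in candidate_embs[i]], axis=0)
        # cosine similarity between each test pair's difference vector and the relation
        sims = []
        for pairs in line:
            diff = pairs[1] - pairs[0]
            denom = np.linalg.norm(diff) * np.linalg.norm(relation)
            sims.append(np.dot(diff, relation) / denom if denom > 0 else 0.0)
        best_pairs.append(np.argmax(sims))
        worst_pairs.append(np.argmin(sims))

    return best_pairs, worst_pairs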


def write_solution(best_pairs, worst_pairs, test, path):
    """
    Write best and worst pairs to a file, that can be evaluated by score_maxdiff.pl
    """

    ans = []
    for i, line in enumerate(test):
        temp = [f'"{pairs[0]}:{pairs[1]}"' for pairs in line]
        # handle the case where the best or worst pair is not in the candidate set
        try:
            temp.append(f'"{line[worst_pairs[i]][0]}:{line[worst_pairs[i]][1]}"')
            temp.append(f'"{line[best_pairs[i]][0]}:{line[best_pairs[i]][1]}"')
            ans.append(" ".join(temp))
        except:
            pass

    with open(path, "w") as f:
        f.write("\n".join(ans))


def run_word_analogy_eval(
    model_path="./final_model",  # path to the model
    input_filepath="word_analogy_dev.txt",  # word analogy file to evaluate on
    output_filepath="word_analogy_results.txt",  # predicted results
    model_type="nll",  # loss: NLL or NEG
):
    print(f"Model file: {model_path}/word2vec_{model_type}.model")
    model_filepath = os.path.join(model_path, "word2vec_%s.model" % (model_type))

    dictionary, embeddings = pickle.load(open(model_filepath, "rb"))

    candidate, test = read_data_split(input_filepath)

    candidate_embs = get_embeddings(candidate, embeddings, dictionary)
    test_embs = get_embeddings(test, embeddings, dictionary)

    best_pairs, worst_pairs = evaluate_pairs(candidate_embs, test_embs)

    out_filepath = output_filepath
    print(f"Output file: {out_filepath}")
    write_solution(best_pairs, worst_pairs, test, out_filepath)

Now we generate the results of the word analogy task and convert them into numeric scores.


In [None]:
run_word_analogy_eval(
    model_path="./final_neg_model",
    input_filepath="word_analogy_dev.txt",
    output_filepath="word_analogy_dev_results_neg.txt",
    model_type="neg",
)

run_word_analogy_eval(
    model_path="./final_neg_model",
    input_filepath="word_analogy_test.txt",
    output_filepath="word_analogy_test_results_neg.txt",
    model_type="neg",
)

run_word_analogy_eval(
    model_path="./final_nll_model",
    input_filepath="word_analogy_dev.txt",
    output_filepath="word_analogy_dev_results_nll.txt",
    model_type="nll",
)

run_word_analogy_eval(
    model_path="./final_nll_model",
    input_filepath="word_analogy_test.txt",
    output_filepath="word_analogy_test_results_nll.txt",
    model_type="nll",
)

!chmod 777 score_maxdiff.pl
!./score_maxdiff.pl word_analogy_dev_mturk_answers.txt word_analogy_dev_results_neg.txt score_neg.txt
!./score_maxdiff.pl word_analogy_dev_mturk_answers.txt word_analogy_dev_results_nll.txt score_nll.txt
!./score_maxdiff.pl word_analogy_dev_mturk_answers.txt word_analogy_test_results_neg.txt score_neg.txt
!./score_maxdiff.pl word_analogy_dev_mturk_answers.txt word_analogy_test_results_nll.txt score_nll.txt

## WEAT Test


While word embeddings can help us learn analogies, they can also be biased. The Word Embedding Association Test (WEAT) provides a way to measure this bias empirically. [This paper](https://arxiv.org/pdf/1810.03611.pdf) describes the method in detail. It measures the degree to which a model associates sets of target words (e.g., African American names, European American names, flowers, insects) with sets of attribute words (e.g., "stable", "pleasant" or "unpleasant"). The association between two given words is defined as the cosine similarity between the embedding vectors for the words.
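As a rough illustration of the score, the sketch below computes a WEAT-style effect size in pure Python on made-up 2-dimensional vectors. The function names and toy vectors are assumptions; the population standard deviation is used to mirror the `np.std` call in this notebook's `weat_score`.

```python
import math
import statistics


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


def association(w, A, B):
    """Mean similarity of w to attribute set A minus mean similarity to B."""
    mean_a = sum(cosine(w, a) for a in A) / len(A)
    mean_b = sum(cosine(w, b) for b in B) / len(B)
    return mean_a - mean_b


def weat_effect_size(X, Y, A, B):
    """Difference of mean associations of the target sets X and Y,
    divided by the standard deviation over all target words."""
    x_assoc = [association(x, A, B) for x in X]
    y_assoc = [association(y, A, B) for y in Y]
    numerator = sum(x_assoc) / len(x_assoc) - sum(y_assoc) / len(y_assoc)
    return numerator / statistics.pstdev(x_assoc + y_assoc)


# hypothetical toy vectors: X leans towards A, Y leans towards B
X = [[1.0, 0.0], [0.9, 0.1]]
Y = [[0.0, 1.0], [0.1, 0.9]]
A = [[1.0, 0.0]]
B = [[0.0, 1.0]]

score = weat_effect_size(X, Y, A, B)
print(score)
```

A positive score means X is more associated with A (and Y with B); swapping X and Y flips the sign.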


In [None]:
def str2bool(v):
    if isinstance(v, bool):
        return v
    if v.lower() in ("yes", "true", "t", "y", "1"):
        return True
    elif v.lower() in ("no", "false", "f", "n", "0"):
        return False


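def unit_vector(vec):
    # normalize along the last axis so that each row of a matrix becomes a unit vector
    return vec / np.linalg.norm(vec, axis=-1, keepdims=True)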


def cos_sim(v1, v2):
    """
    Cosine Similarity between the 2 vectors
    """

    v1_u = unit_vector(v1)
    v2_u = unit_vector(v2)
    return np.clip(np.tensordot(v1_u, v2_u, axes=(-1, -1)), -1.0, 1.0)


def weat_association(W, A, B):
    """
    Compute Weat score for given target words W, along the attributes A & B.
    """

    return np.mean(cos_sim(W, A), axis=-1) - np.mean(cos_sim(W, B), axis=-1)


def weat_score(X, Y, A, B):
    """
    Compute differential weat score across the given target words X & Y along the attributes A & B.
    """

    x_association = weat_association(X, A, B)
    y_association = weat_association(Y, A, B)

    tmp1 = np.mean(x_association, axis=-1) - np.mean(y_association, axis=-1)
    tmp2 = np.std(np.concatenate((x_association, y_association), axis=0))

    return tmp1 / tmp2


def balance_word_vectors(vec1, vec2):
    diff = len(vec1) - len(vec2)

    if diff > 0:
        vec1 = np.delete(vec1, np.random.choice(len(vec1), diff, 0), axis=0)
    else:
        vec2 = np.delete(vec2, np.random.choice(len(vec2), -diff, 0), axis=0)

    return (vec1, vec2)


def get_word_vectors(words, model, vocab_token_to_id):
    """
    Return list of word embeddings for the given words using the passed model and tokeniser
    """

    output = []

    for word in words:
        try:
            output.append(model[vocab_token_to_id[word]])
        except:
            pass

    return np.array(output)


def compute_weat(weat_path, model, vocab_token_to_id):
    """
    Compute WEAT score for the task as defined in the file at `weat_path`, and generating word embeddings from the passed model and tokeniser.
    """

    with open(weat_path) as f:
        weat_dict = json.load(f)

    all_scores = {}

    for data_name, data_dict in weat_dict.items():
        # Target
        X_key = data_dict["X_key"]
        Y_key = data_dict["Y_key"]

        # Attributes
        A_key = data_dict["A_key"]
        B_key = data_dict["B_key"]

        X = get_word_vectors(data_dict[X_key], model, vocab_token_to_id)
        Y = get_word_vectors(data_dict[Y_key], model, vocab_token_to_id)
        A = get_word_vectors(data_dict[A_key], model, vocab_token_to_id)
        B = get_word_vectors(data_dict[B_key], model, vocab_token_to_id)

        if len(X) == 0 or len(Y) == 0:
            print("Not enough matching words in dictionary")
            continue

        X, Y = balance_word_vectors(X, Y)
        A, B = balance_word_vectors(A, B)

        score = weat_score(X, Y, A, B)
        all_scores[data_name] = str(score)

    return all_scores


def dump_dict(obj, output_path):
    with open(output_path, "w") as file:
        json.dump(obj, file)


def run_bias_eval(
    weat_file_path="weat.json",  # weat file where the tasks are defined
    out_file="weat_results.json",  # output JSON file where the output is stored
    model_path="/content/final_model/word2vec_nll.model",  # Full model path (including filename) to load from
):
    vocab_token_to_id, model = pickle.load(open(model_path, "rb"))

    bias_score = compute_weat(weat_file_path, model, vocab_token_to_id)

    print("Final Bias Scores")
    print(json.dumps(bias_score, indent=4))

    dump_dict(bias_score, out_file)

Since the models were trained and tested on the same data and not debiased, bias is expected. Further, the bias is expected to be in favor of the majority group and to follow societal stereotypes.


In [None]:
run_bias_eval(
    weat_file_path="weat.json",
    out_file="neg_bias_output.json",
    model_path="./final_neg_model/word2vec_neg.model",
)

run_bias_eval(
    weat_file_path="weat.json",
    out_file="nll_bias_output.json",
    model_path="./final_nll_model/word2vec_nll.model",
)

After running the WEAT tests, we can see that the NLL model's bias scores are all only slightly negative or positive, while the NEG model's are more extreme. They seem mostly in line with our society's cultural biases; the deviations are likely due to the limited size of the training data. There are many ways to reduce bias, such as training on a more diverse set of data, applying a debiasing conceptor, data augmentation, and neutralization. However, it is important to note that bias cannot be completely removed, and that the methods used to remove bias may have unintended consequences (like introducing new biases).
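Of the methods listed, neutralization is the simplest to sketch: it removes the component of a word vector that lies along a chosen bias direction. The vectors and the bias direction below are made up for illustration; in practice the direction would have to be estimated from the trained embeddings, and this notebook does not do so.

```python
import math


def neutralize(vec, direction):
    """Remove the component of vec that lies along direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    projection = sum(v * u for v, u in zip(vec, unit))
    return [v - projection * u for v, u in zip(vec, unit)]


# hypothetical word vector and bias direction
word_vec = [0.5, 0.2, -0.3]
bias_direction = [2.0, 0.0, 0.0]

debiased = neutralize(word_vec, bias_direction)
print(debiased)
```

After neutralization the vector is orthogonal to the bias direction, so its cosine similarity with that direction is zero; other forms of bias may remain.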


## Credits


This code was written with the help of Heeyoung Kwon, Jun Kang, Mohaddeseh Bastan, Harsh Trivedi, Matthew Matero, Nikita Soni, Sharvil Katariya, Yash Kumar Lal, Adithya V. Ganesan, Sounak Mondal, Saqib Hasan, Jasdeep Grover, and others. It is not subject to the license of this repository.
