# Word Embedding Models Bias Assessment with Lens
Word embedding models generate real-valued vector representations of text and are widely used in AI systems that process natural language. However, they have been reported to exhibit a range of human-like social biases ([Garg et al. 2018](https://www.pnas.org/content/115/16/E3635)). This demo illustrates how Lens can be used to assess them for such biases.

### Find the code
This notebook can be found on [GitHub](https://github.com/credo-ai/credoai_lens/blob/develop/docs/notebooks/module_demos/fairness_nlp.ipynb).

In [None]:
from credoai.assessment.model_modules.fairness_nlp import NLPEmbeddingAnalyzer
from credoai.utils.nlp_constants import OCCUPATIONS, ISLAM, CHRISTIAN
from pytorch_transformers import BertTokenizer, BertModel
import pandas as pd
import torch

### Set up BERT

In [None]:
# Load pre-trained model (weights) and return the hidden states of every layer
model = BertModel.from_pretrained('bert-base-uncased', output_hidden_states=True)
# Load pre-trained model tokenizer (vocabulary)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

In [None]:
# from https://towardsdatascience.com/3-types-of-contextualized-word-embeddings-from-bert-using-transfer-learning-81fcefe3fe6d
def bert_text_preparation(text, tokenizer):
    """Prepare the input for BERT.

    Takes a string argument and performs pre-processing:
    adding special tokens, tokenizing, and converting
    tokens to ids and segment ids. All tokens are mapped
    to segment id = 1.

    Args:
        text (str): Text to be converted
        tokenizer (obj): Tokenizer object to convert text
            into BERT-readable tokens and ids

    Returns:
        list: List of BERT-readable tokens
        obj: Torch tensor with token ids
        obj: Torch tensor with segment ids
    """
    marked_text = "[CLS] " + text + " [SEP]"
    tokenized_text = tokenizer.tokenize(marked_text)
    indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)
    segments_ids = [1]*len(indexed_tokens)

    # Convert inputs to PyTorch tensors
    tokens_tensor = torch.tensor([indexed_tokens])
    segments_tensors = torch.tensor([segments_ids])

    return tokenized_text, tokens_tensor, segments_tensors

def get_bert_embeddings(tokens_tensor, segments_tensors, model):
    """Get embeddings from an embedding model.

    Args:
        tokens_tensor (obj): Torch tensor of size [1, n_tokens]
            with token ids for each token in text
        segments_tensors (obj): Torch tensor of size [1, n_tokens]
            with segment ids for each token in text
        model (obj): Embedding model to generate embeddings
            from token and segment ids

    Returns:
        numpy.ndarray: Array of shape
            [n_tokens, n_embedding_dimensions]
            containing embeddings for each token
    """
    
    # Gradient calculation is disabled
    # Model is in inference mode
    with torch.no_grad():
        outputs = model(tokens_tensor, segments_tensors)
        # Removing the first hidden state
        # The first state is the input state
        hidden_states = outputs[2][1:]

    # Getting embeddings from the final BERT layer
    token_embeddings = hidden_states[-1]
    
    # Collapse the batch dimension and convert to numpy
    return token_embeddings.squeeze().numpy()

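def get_bert_embedding(word):
    """Return the BERT embedding of the first token of `word`."""
    tokenized_text, tokens_tensor, segments_tensors = bert_text_preparation(word, tokenizer)
    # Row 0 is [CLS]; row 1 is the first token of `word`. If the tokenizer
    # splits `word` into several tokens, only the first one is used.
    return get_bert_embeddings(tokens_tensor, segments_tensors, model)[1, :]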
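
The `[1, :]` index in `get_bert_embedding` selects the row right after `[CLS]`, so a word that the tokenizer splits into several wordpieces is represented by its first piece only. The sketch below illustrates this with a small made-up array standing in for the output of `get_bert_embeddings`. The token split and the numbers are invented for illustration, and the mean-pooling alternative is shown only for comparison; it is not what this notebook does.

```python
import numpy as np

# Hypothetical stand-in for the [n_tokens, n_dims] array returned by
# get_bert_embeddings (real bert-base-uncased rows have 768 dimensions).
# The wordpiece split below is assumed, not taken from the real tokenizer.
tokens = ["[CLS]", "wonder", "##woman", "[SEP]"]
token_embeddings = np.array([
    [0.0, 0.0, 0.0],  # [CLS]
    [1.0, 2.0, 3.0],  # first wordpiece
    [3.0, 2.0, 1.0],  # second wordpiece
    [9.0, 9.0, 9.0],  # [SEP]
])

# What get_bert_embedding does: keep only the first wordpiece.
first_piece = token_embeddings[1, :]

# A common alternative (not used in this notebook): average all
# wordpieces between [CLS] and [SEP].
mean_pooled = token_embeddings[1:-1].mean(axis=0)

print(first_piece)
print(mean_pooled)
```

For single-token words the two approaches give the same vector; they differ only when a word or phrase spans several tokens.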

### Run Assessment

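This module evaluates how a model's word embeddings associate with two groups. `NLPEmbeddingAnalyzer` takes a function that maps a word to its embedding vector, and `run` takes the names of the two groups to compare.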

In [None]:
nlp_assessment = NLPEmbeddingAnalyzer(get_bert_embedding)
nlp_assessment.run('male', 'female')

Custom comparison categories can also be supplied. A category is defined by a set of words that represent it.

In [None]:
superheroes = {'superheroes': ['batman', 'superman', 'marvel', 'dc', 'wonderwoman', 'justice league']}
nlp_assessment.set_comparison_categories(include_default=False, custom_categories=superheroes)
nlp_assessment.run('male', 'female')

Custom categories can be single words. Below we evaluate the association between the male/female axis and a number of occupation labels.

In [None]:
nlp_assessment.set_comparison_categories(custom_categories={k:k for k in OCCUPATIONS})
pd.Series(nlp_assessment.run('male', 'female')).sort_values()

The group categories can also be changed. Each group category is associated with a set of words, which is used to define the average *group embedding vector*. The default is male/female, but other groups can be created.

In [None]:
nlp_assessment.set_comparison_categories()
nlp_assessment.set_group_embeddings({'islam': ISLAM, 
                                  'christian': CHRISTIAN})

In [None]:
nlp_assessment.run('islam', 'christian')
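
The idea of an average *group embedding vector* can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the Lens implementation: the two-dimensional vectors are invented, the helper names (`group_embedding`, `cosine_similarity`, `association_difference`) are hypothetical, and the score (cosine similarity to one group minus cosine similarity to the other) is one plausible way to quantify association, which may differ from what `NLPEmbeddingAnalyzer` computes.

```python
import numpy as np

# Hypothetical 2-D word vectors, invented purely for illustration.
TOY_VECTORS = {
    "he": np.array([1.0, 0.0]),
    "man": np.array([0.8, 0.2]),
    "she": np.array([0.0, 1.0]),
    "woman": np.array([0.2, 0.8]),
    "engineer": np.array([0.9, 0.1]),
}


def embed(word):
    """Toy embedding function: maps a word to a 1-D vector."""
    return TOY_VECTORS[word]


def group_embedding(words, embed_fn):
    """Average the embeddings of the words that define a group."""
    return np.mean([embed_fn(w) for w in words], axis=0)


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def association_difference(word, group_a, group_b, embed_fn):
    """Assumed score: similarity to group A minus similarity to group B."""
    vec = embed_fn(word)
    sim_a = cosine_similarity(vec, group_embedding(group_a, embed_fn))
    sim_b = cosine_similarity(vec, group_embedding(group_b, embed_fn))
    return sim_a - sim_b


male_words = ["he", "man"]
female_words = ["she", "woman"]
score = association_difference("engineer", male_words, female_words, embed)
print(score)  # positive: the toy "engineer" vector lies closer to the male group
```

In this sketch a positive score means the word lies closer to the first group and a negative score means it lies closer to the second. With a real model, `embed` would be replaced by a function such as `get_bert_embedding`.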