# 👩‍💻 Examining Occupational Gender Stereotypes in Sentiment Analysis with Rubrix

This brief tutorial uses Rubrix to reproduce and extend the analysis presented in [Good Secretaries, Bad Truck Drivers? Occupational Gender Stereotypes in Sentiment Analysis](https://aclanthology.org/W19-3809/), a research paper by Jayadev Bhaskaran and Isha Bhallamudi (*Proceedings of the First Workshop on Gender Bias in Natural Language Processing, ACL 2019*).

<video width="100%" controls><source src="https://github.com/recognai/rubrix-materials/raw/main/tutorials/videos/stereotypes.mp4" type="video/mp4"></video>

## TL;DR


## Why it's important


## Brief summary of the original paper

In the paper, the authors investigate the presence of occupational gender stereotypes in sentiment analysis models. For this research, they built and released a gender-balanced dataset of 800 sentences with specific professions. Their research approach is summarized in the following figure (extracted from the paper):

<img src="img/gender-bias-paper.png" alt="Occupational Gender Stereotypes in Sentiment Analysis" width="500"/>

In the paper, they evaluate three models (a *logistic regression baseline model*, an *LSTM-based model*, and a *pre-trained BERT model*), all trained or fine-tuned on SST-2, a widely used sentiment analysis dataset. Their main findings are:

1. The pre-trained BERT model shows a **higher predicted positive class probability for sentences with male nouns**, and the difference is statistically significant.

2. The other two models show **higher predicted positive class probabilities for sentences with female nouns**, which is in line with the distribution of positive examples with female nouns in the training set (SST-2).

3. Given the above, the authors hypothesize that: (1) for pre-trained models biases might propagate from the pretraining phase (i.e., the large corpus used for language modeling pre-training), and (2) "shallower" models might propagate biases more directly from the training set.
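
Because the dataset pairs sentences that differ only in the gendered noun phrase, differences like the ones above can be checked with a paired comparison. The snippet below is a minimal sketch that computes a paired t statistic with the standard library. Whether this matches the authors' exact testing procedure is an assumption, and the probabilities are made up for illustration.

```python
import math
import statistics

# Made-up positive-class probabilities for sentence pairs that differ only in
# the gendered noun phrase; these are NOT results from the paper.
male_probs = [0.80, 0.70, 0.60, 0.90]
female_probs = [0.70, 0.60, 0.60, 0.80]

# One difference per sentence pair (male minus female).
differences = [m - f for m, f in zip(male_probs, female_probs)]
mean_diff = statistics.mean(differences)

# Paired t statistic: mean difference divided by its standard error.
std_err = statistics.stdev(differences) / math.sqrt(len(differences))
t_statistic = mean_diff / std_err

print(f"mean difference = {mean_diff:.3f}, t = {t_statistic:.2f}")
```

A positive mean difference means the male sentences score higher on average. Turning the statistic into a p-value requires the t distribution with `len(differences) - 1` degrees of freedom, which is left out here.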


## Setup Rubrix

If you are new to Rubrix, check out the ⭐ [GitHub repository](https://github.com/recognai/rubrix).

If you have not installed and launched Rubrix, check the [Setup and Installation guide](../getting_started/setup&installation.rst).

Once installed, you only need to import Rubrix:

In [1]:
import rubrix as rb

## Load the Gendered sentiment dataset


The dataset from the original paper is available at https://github.com/jayadevbhaskaran/gendered-sentiment. Let's load it into Pandas:

In [2]:
import pandas as pd

In [3]:
df = pd.read_csv("https://raw.githubusercontent.com/jayadevbhaskaran/gendered-sentiment/master/data/gender_corpus.tsv", sep="\t")

In [4]:
df.head()

| | id | sentence | gender | occupation | noun phrase |
|---|---|---|---|---|---|
| 0 | 0 | He is a doctor. | male | doctor | He |
| 1 | 1 | This boy is a doctor. | male | doctor | This boy |
| 2 | 2 | This man is a doctor. | male | doctor | This man |
| 3 | 3 | My father is a doctor. | male | doctor | My father |
| 4 | 4 | My son is a doctor. | male | doctor | My son |


## Extend the Gendered sentiment dataset with programmers

As today is `#ProgrammersDay` and triggered by tweets like this one:

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">I am not a &quot;Female Developer&quot;<br><br>I am not a &quot;Girl who can code&quot;<br><br>I am a Developer. <br>That&#39;s it. <br>That&#39;s my tag.<br>Call me nothing else♥️</p>&mdash; timpratim (@BhosalePratim) <a href="https://twitter.com/BhosalePratim/status/1437088890502873088?ref_src=twsrc%5Etfw">September 12, 2021</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

Let's add "programmer" as an occupation to the original dataset, so we can "extend" the analysis from the original paper:

In [5]:
programmers = []
for i,r in df.query("occupation == 'doctor'").iterrows():
    programmers.append(
        {
            "sentence": r.sentence.replace('doctor', 'programmer'),
            "gender": r.gender,
            "noun phrase": r['noun phrase'],
            "occupation": 'programmer'
        }
    )

In [8]:
programmers_df = pd.DataFrame(programmers)
programmers_df.head()

| | sentence | gender | noun phrase | occupation |
|---|---|---|---|---|
| 0 | He is a programmer. | male | He | programmer |
| 1 | This boy is a programmer. | male | This boy | programmer |
| 2 | This man is a programmer. | male | This man | programmer |
| 3 | My father is a programmer. | male | My father | programmer |
| 4 | My son is a programmer. | male | My son | programmer |


In [9]:
gendered_occupations = pd.concat([df,programmers_df]).reset_index(drop=True)
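
The loop above can be generalized to any occupation. The following is a minimal sketch and not part of the original notebook: `add_occupation` is a hypothetical helper, and the two-row frame stands in for the real corpus so the snippet runs without network access.

```python
import pandas as pd


def add_occupation(df, source, target):
    """Copy the rows of the `source` occupation, swapping in `target` (hypothetical helper)."""
    new_rows = df[df["occupation"] == source].copy()
    # Plain substring replacement, same as `str.replace` in the loop above.
    new_rows["sentence"] = new_rows["sentence"].str.replace(source, target, regex=False)
    new_rows["occupation"] = target
    # Keep only the columns the tutorial uses; an `id` column is not carried over.
    new_rows = new_rows[["sentence", "gender", "noun phrase", "occupation"]]
    return pd.concat([df, new_rows]).reset_index(drop=True)


# Tiny stand-in for the real corpus (same column names as the tutorial's `df`).
toy = pd.DataFrame(
    {
        "sentence": ["He is a doctor.", "She is a doctor."],
        "gender": ["male", "female"],
        "occupation": ["doctor", "doctor"],
        "noun phrase": ["He", "She"],
    }
)

extended = add_occupation(toy, "doctor", "programmer")

# Each occupation should keep the same number of male and female sentences.
balance = extended.groupby(["occupation", "gender"]).size()
print(balance)
```

Checking the per-gender counts after extending the data is a cheap way to confirm the new occupation keeps the dataset gender-balanced.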

## Log default sentiment analysis pipeline predictions (`distilbert-base-uncased-finetuned-sst-2-english`)

In [83]:
from transformers import pipeline

In [84]:
def make_record(row):
    # `nlp` and `prediction_agent` are globals defined in the next cell.
    # The pipeline returns a list of {"label": ..., "score": ...} dicts.
    prediction = [(p['label'], p['score']) for p in nlp(row.sentence)]

    return rb.TextClassificationRecord(
        inputs={"text": row.sentence},
        prediction=prediction,
        metadata={"gender": row.gender, "occupation": row.occupation, "noun_phrase": row["noun phrase"]},
        prediction_agent=prediction_agent
    )

In [85]:
nlp = pipeline(task="sentiment-analysis")
prediction_agent = "sst2"

In [86]:
records = gendered_occupations.apply(make_record, axis=1)

In [87]:
rb.log(records, name="gender_sentiment_base")

BulkResponse(dataset='gender_sentiment_base', processed=40, failed=0)

## Log predictions from Twitter sentiment analysis pipeline (CardiffNLP)

In [88]:
def make_record(row):
    # `nlp`, `mapping` and `prediction_agent` are globals defined in the next cell.
    # `mapping` translates the model's LABEL_0/1/2 names into readable labels.
    prediction = [(mapping[p['label']], p['score']) for p in nlp(row.sentence)]

    return rb.TextClassificationRecord(
        inputs={"text": row.sentence},
        prediction=prediction,
        metadata={"gender": row.gender, "occupation": row.occupation, "noun_phrase": row["noun phrase"]},
        prediction_agent=prediction_agent
    )

In [89]:
nlp = pipeline(
    task="sentiment-analysis", 
    model="cardiffnlp/twitter-roberta-base-sentiment"
)
mapping = {"LABEL_0": "NEGATIVE", "LABEL_1": "NEUTRAL", "LABEL_2":"POSITIVE" }
prediction_agent = "cardiffnlp/twitter-roberta-base-sentiment"

In [90]:
records = gendered_occupations.apply(make_record, axis=1)

In [91]:
rb.log(records, name="gender_sentiment_base")

BulkResponse(dataset='gender_sentiment_base', processed=40, failed=0)
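
The two pipelines return different label names, which is why the second `make_record` applies `mapping`. A single helper could cover both cases. This is a sketch under assumptions: `normalize_prediction` is a hypothetical name, and the input dictionaries below only imitate the `{"label": ..., "score": ...}` shape returned by the pipeline, with made-up scores.

```python
def normalize_prediction(raw_outputs, mapping=None):
    """Turn pipeline-style outputs into (label, score) tuples.

    Labels missing from `mapping` are kept unchanged, so the same function
    works for models that already emit POSITIVE/NEGATIVE.
    """
    mapping = mapping or {}
    return [(mapping.get(p["label"], p["label"]), p["score"]) for p in raw_outputs]


cardiff_mapping = {"LABEL_0": "NEGATIVE", "LABEL_1": "NEUTRAL", "LABEL_2": "POSITIVE"}

# Hand-written outputs with made-up scores, only to illustrate the shapes.
sst2_like = [{"label": "POSITIVE", "score": 0.9}]
cardiff_like = [{"label": "LABEL_1", "score": 0.7}]

print(normalize_prediction(sst2_like))
print(normalize_prediction(cardiff_like, cardiff_mapping))
```

Using shared label names across models makes it easier to compare their predictions side by side in the same dataset.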

## Building an interactive dashboard to analyse and compare the models

The default Rubrix installation includes Kibana, which can be used to build monitoring and analytical dashboards on top of your model predictions.

In this case, we'll be building a Kibana dashboard to examine biases in pre-trained sentiment models.

### Setting up Kibana indexes

A detailed guide for configuring Kibana with Rubrix indexes is coming soon. For now, these are the basic steps to get started:

1. If you are running Rubrix locally, open the following URL: http://localhost:5601
2. Go to http://localhost:5601/app/management and then to Kibana / Index Patterns.
3. Click "Create index pattern" and enter the pattern `.rubrix.dataset.*`. This makes all your Rubrix datasets available for building visualizations and dashboards.
4. From there you can explore Kibana on your own. To start creating a dashboard, go to http://localhost:5601/app/dashboards




### Our dashboard

<img src="img/sentiment-bias/gender_dashboard.png" alt="Occupational Gender Stereotypes in Sentiment Analysis"/>

## Reproducing the examples from the paper


### Social Stereotypes of Occupations

The authors note that the highest-scoring profession with the pre-trained BERT model is scientist and the lowest is truck driver, which is confirmed in our experiment, as shown below:

<img src="img/sentiment-bias/bias-occupations.png" alt="Occupational Gender Stereotypes in Sentiment Analysis" height="40%" width="40%"/>

### Gendered Stereotypes
The authors noticed that:

1. Pilot has the highest positive difference between female and male noun sentences (i.e., female is higher), and
2. Flight attendant has the most negative difference (i.e., male is higher).

The following quick video shows how to create a custom visualization on top of your model's predictions:

<video width="100%" controls><source src="https://github.com/recognai/rubrix-materials/raw/main/tutorials/videos/create_visualization_bias.mp4" type="video/mp4"></video>

Following the process above, we generate the following interactive visualizations:

<img src="img/sentiment-bias/ratios_by_occupation.png" alt="Ratios by occupation"/>

Looking at the results from our experiment:

1. Pilot examples DO NOT have a higher positive difference for females.
2. Flight attendant has a slightly higher negative difference.
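
The same per-occupation comparison can also be computed outside Kibana. The snippet below is a sketch with pandas, assuming one row per sentence with the model's positive-class probability. The column name `positive_prob` and all numbers are made up for illustration and are not results from our experiment.

```python
import pandas as pd

# Made-up positive-class probabilities, one row per sentence; not real model output.
scores = pd.DataFrame(
    {
        "occupation": ["pilot", "pilot", "flight attendant", "flight attendant"],
        "gender": ["female", "male", "female", "male"],
        "positive_prob": [0.60, 0.70, 0.50, 0.80],
    }
)

# Mean positive probability per occupation, with one column per gender.
by_gender = scores.pivot_table(
    index="occupation", columns="gender", values="positive_prob", aggfunc="mean"
)

# Positive values mean female sentences score higher; negative values the opposite.
by_gender["female_minus_male"] = by_gender["female"] - by_gender["male"]

print(by_gender.sort_values("female_minus_male"))
```

Sorting by `female_minus_male` puts the occupations with the most negative difference (male higher) first and the most positive (female higher) last.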



## Appendix: Model explainability with `transformers-interpret`

The main idea is to log the token attributions for each prediction to potentially detect or confirm biases associated with certain words. For this type of use case, Rubrix lets you log the token attributions together with the predictions. Later, you can browse this information and use it as another dimension of your Kibana dashboards.

Unfortunately, this is a work in progress because we've identified a number of issues in the `transformers-interpret` library and did not manage to get meaningful results (see https://github.com/cdpierse/transformers-interpret/issues/65).

In [None]:
%pip install transformers_interpret

In [17]:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformers_interpret import SequenceClassificationExplainer

model_name = "distilbert-base-uncased-finetuned-sst-2-english"#"cardiffnlp/twitter-roberta-base-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

cls_explainer = SequenceClassificationExplainer(model, tokenizer)

records = []

In [18]:
word_attributions = cls_explainer("I love you, I like you")

In [19]:
cls_explainer.visualize()

| True Label | Predicted Label | Attribution Label | Attribution Score | Word Importance |
|---|---|---|---|---|
| 1.0 | POSITIVE (1.00) | POSITIVE | 2.08 | [CLS] i love you , i like you [SEP] |




In [20]:
word_attributions = cls_explainer("My girlfriend is a programmer.")

In [21]:
cls_explainer.visualize()

| True Label | Predicted Label | Attribution Label | Attribution Score | Word Importance |
|---|---|---|---|---|
| 0.0 | NEGATIVE (0.56) | NEGATIVE | -0.67 | [CLS] my girlfriend is a programmer . [SEP] |




## Log attributions into a Rubrix dataset

In [104]:
from rubrix import TextClassificationRecord, TokenAttributions

records = []
for i,example in programmers_df.iterrows():
    word_attributions = cls_explainer(example["sentence"])
    
    token_attributions = [ 
        TokenAttributions(
            token=token, 
            attributions={cls_explainer.predicted_class_name: score}
        )
        for token, score in word_attributions[1:-1] # ignore first (CLS) and last (SEP) tokens
    ]
    record = TextClassificationRecord(
        inputs=example["sentence"],
        prediction=[(cls_explainer.predicted_class_name, cls_explainer.pred_probs)],
        prediction_agent="",
        explanation={"text": token_attributions},
        metadata={"gender": example["gender"], "occupation": example["occupation"], "noun_phrase": example["noun phrase"]},
    )
    records.append(record)

In [105]:
rb.log(records, name="gender_sentiment_sst2_interpret")

BulkResponse(dataset='gender_sentiment_sst2_interpret', processed=40, failed=0)
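
To look for words that consistently push predictions in one direction, the per-sentence attributions can be averaged per token. The following is a minimal sketch in plain Python: `mean_attribution_per_token` is a hypothetical helper, and the scores are made up in the `(token, score)` shape used in the loop above rather than taken from the explainer.

```python
from collections import defaultdict


def mean_attribution_per_token(attributions_per_sentence):
    """Average the attribution score of each token over all sentences."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for word_attributions in attributions_per_sentence:
        # Drop the first ([CLS]) and last ([SEP]) entries, as in the loop above.
        for token, score in word_attributions[1:-1]:
            totals[token] += score
            counts[token] += 1
    return {token: totals[token] / counts[token] for token in totals}


# Made-up scores in the (token, score) shape; not real explainer output.
toy_attributions = [
    [("[CLS]", 0.0), ("he", 0.2), ("is", 0.0), ("a", 0.0), ("programmer", 0.4), ("[SEP]", 0.0)],
    [("[CLS]", 0.0), ("she", -0.2), ("is", 0.0), ("a", 0.0), ("programmer", 0.2), ("[SEP]", 0.0)],
]

averages = mean_attribution_per_token(toy_attributions)

# Tokens sorted from most negative to most positive average attribution.
print(sorted(averages.items(), key=lambda item: item[1]))
```

Note that attributions are relative to the predicted class of each sentence, so averages are easier to interpret when computed separately per predicted label.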