# German Gender Bias Examples

The following example shows how to run sample gender queries against a word embedding model and score them with a fairness metric. The workflow has three steps:

- Download and load a word embedding model in the desired language.
- Build the queries from target sets and attribute sets translated into that language with Google Translate.
- Run the queries on the word embedding model with each fairness metric.

> **Note:** The word sets used in this notebook were translated with Google Translate, so some concepts may have been mistranslated and may need correction. The original English concepts can be loaded with the [load_weat](https://wefe.readthedocs.io/en/latest/generated/dataloaders/wefe.load_weat.html#wefe.load_weat) utility. Use the notebook and its results with caution!


### How to interpret the results

For the WEAT, WEAT-EZ (WEAT effect size), and RNSB metrics, a score of 0 indicates no measured bias, and the further a score is from 0, the stronger the measured bias for that query in the given language. WEAT and WEAT-EZ scores can be negative; the sign only indicates the direction of the association. For the ECT metric, on the other hand, values closer to 1 are better because they represent less bias.

Results are expected to be broadly consistent across metrics, but individual queries may score differently because the metrics are formulated differently.
Visit the [WEFE API](https://wefe.readthedocs.io/en/latest/api.html#metrics) documentation for more information on the formulation of the metrics.
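
As a rough illustration of why a WEAT score of 0 means no measured bias and why its sign only carries direction, the sketch below re-implements the WEAT score in plain Python on toy 2-dimensional vectors. The helper names (`cosine`, `association`, `weat_score`) and the vectors are assumptions made for this illustration; it is not WEFE's implementation, which should be preferred in practice.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, A, B):
    """Mean similarity of w to attribute set A minus mean similarity to B."""
    mean_a = sum(cosine(w, a) for a in A) / len(A)
    mean_b = sum(cosine(w, b) for b in B) / len(B)
    return mean_a - mean_b


def weat_score(X, Y, A, B):
    """Sum of associations over target set X minus the sum over target set Y."""
    return sum(association(x, A, B) for x in X) - sum(association(y, A, B) for y in Y)


# Toy vectors (hypothetical): X leans towards A, Y leans towards B.
X = [[1.0, 0.0], [0.9, 0.1]]
Y = [[0.0, 1.0], [0.1, 0.9]]
A = [[1.0, 0.0]]
B = [[0.0, 1.0]]

biased = weat_score(X, Y, A, B)          # positive: X is closer to A than Y is
reversed_score = weat_score(Y, X, A, B)  # same magnitude, opposite sign
unbiased = weat_score(X, X, A, B)        # identical target sets: no difference
```

Swapping the two target sets flips the sign of the score without changing its magnitude, which is why the aggregated columns in this notebook average absolute values.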

### Notes about the models and `flair`

In most cases, a query returns a score. In a few cases, however, around 20% of a query's words are missing from the model's vocabulary; the query is then declared invalid and returns NaN. The `lost_vocabulary_threshold` parameter controls the maximum fraction of words that may be missing. The examples below set it to 0.3, which tolerates a loss of up to 30%.
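
The effect of the threshold can be sketched in a few lines of plain Python. The function names and the toy vocabulary below are hypothetical and only mimic the idea. Applying the check per word set and using a non-strict `<=` comparison are both assumptions, not a description of WEFE's internals.

```python
def lost_fraction(words, vocabulary):
    """Fraction of `words` that are absent from `vocabulary`."""
    missing = [word for word in words if word not in vocabulary]
    return len(missing) / len(words)


def is_valid_query(word_sets, vocabulary, lost_vocabulary_threshold):
    """Keep a query only if every word set loses at most the allowed fraction."""
    return all(
        lost_fraction(words, vocabulary) <= lost_vocabulary_threshold
        for words in word_sets
    )


# Toy vocabulary (hypothetical): two of the eight career terms are missing.
vocabulary = {"verwaltung", "fachmann", "konzern", "gehalt", "büro", "unternehmen"}
career = [
    "führungskraft", "verwaltung", "fachmann", "konzern",
    "gehalt", "büro", "unternehmen", "werdegang",
]

loss = lost_fraction(career, vocabulary)               # 2 of 8 words are missing
strict = is_valid_query([career], vocabulary, 0.2)     # rejected at a 20% limit
relaxed = is_valid_query([career], vocabulary, 0.3)    # accepted at a 30% limit
```

In this toy case a quarter of the set is lost, so the query would be discarded under a 20% limit but kept under the 30% limit used in this notebook.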


These examples use embeddings trained by the authors of the flair library.
To learn more about the library, visit the [flair repository](https://github.com/flairNLP/flair).

For embeddings in other languages, see flair's [list of classic word embeddings](https://github.com/flairNLP/flair/blob/master/resources/docs/embeddings/CLASSIC_WORD_EMBEDDINGS.md).

## Preparing the Model and the Queries 

In [1]:
# uncomment the following code to install/update flair.
# !pip install -U flair

%cd ../..



### Load the model

In [2]:
from flair.embeddings import WordEmbeddings

from wefe.word_embedding_model import WordEmbeddingModel
from wefe.query import Query
from wefe.metrics import RNSB, WEAT, ECT
from wefe.utils import flair_to_gensim, run_queries, plot_queries_results

flair_model_name = "de"

model = flair_to_gensim(WordEmbeddings(flair_model_name))
wefe_model = WordEmbeddingModel(model, flair_model_name)

# we will test using only one model. You could add more later...
models = [wefe_model]


### Load word sets

In [3]:
### -----------------------------------------------------------------------------------
### Gender Query 1: Male Terms and Female Terms wrt career and family
### -----------------------------------------------------------------------------------

# create the word sets
target_sets1 = [
    ["männlich", "Mann", "Junge", "Bruder", "er", "ihm", "seine", "Sohn"],
    ["weiblich", "Frau", "Mädchen", "Schwester", "Sie", "Sie", "ihres", "Tochter"],
]
target_sets_names1 = ["Male Terms", "Female Terms"]
attribute_sets1 = [
    [
        "Führungskraft",
        "Verwaltung",
        "Fachmann",
        "Konzern",
        "Gehalt",
        "Büro",
        "Unternehmen",
        "Werdegang",
    ],
    [
        "Heimat",
        "Eltern",
        "Kinder",
        "Familie",
        "Cousinen",
        "Hochzeit",
        "Hochzeit",
        "Verwandten",
    ],
]
attribute_sets_names1 = ["career", "family"]
# create the query
gender_query_1 = Query(
    target_sets1, attribute_sets1, target_sets_names1, attribute_sets_names1
)
### -----------------------------------------------------------------------------------
### Gender query 2: Male Terms and Female Terms wrt Science and Arts
### -----------------------------------------------------------------------------------

# create the word sets
target_sets2 = [
    ["männlich", "Mann", "Junge", "Bruder", "er", "ihm", "seine", "Sohn"],
    ["weiblich", "Frau", "Mädchen", "Schwester", "Sie", "Sie", "ihres", "Tochter"],
]
target_sets_names2 = ["Male Terms", "Female Terms"]
attribute_sets2 = [
    [
        "Wissenschaft",
        "Technologie",
        "Physik",
        "Chemie",
        "Einstein",
        "NASA",
        "Experiment",
        "Astronomie",
    ],
    [
        "Poesie",
        "Kunst",
        "tanzen",
        "Literatur",
        "Roman",
        "Symphonie",
        "Theater",
        "Skulptur",
    ],
]
attribute_sets_names2 = ["Science", "Arts"]
# create the query
gender_query_2 = Query(
    target_sets2, attribute_sets2, target_sets_names2, attribute_sets_names2
)

### -----------------------------------------------------------------------------------
### Gender query 3: Male Terms and Female Terms wrt Maths and Arts2
### -----------------------------------------------------------------------------------

# create the word sets
target_sets3 = [
    ["männlich", "Mann", "Junge", "Bruder", "er", "ihm", "seine", "Sohn"],
    ["weiblich", "Frau", "Mädchen", "Schwester", "Sie", "Sie", "ihres", "Tochter"],
]
target_sets_names3 = ["Male Terms", "Female Terms"]
attribute_sets3 = [
    [
        "Mathematik",
        "Algebra",
        "Geometrie",
        "Infinitesimalrechnung",
        "Gleichungen",
        "Berechnung",
        "Zahlen",
        "Zusatz",
    ],
    [
        "Poesie",
        "Kunst",
        "Shakespeare",
        "tanzen",
        "Literatur",
        "Roman",
        "Symphonie",
        "Theater",
    ],
]
attribute_sets_names3 = ["Maths", "Arts2"]
# create the query
gender_query_3 = Query(
    target_sets3, attribute_sets3, target_sets_names3, attribute_sets_names3
)

gender_queries = [gender_query_1, gender_query_2, gender_query_3]


## Run the Queries

### Run the queries using WEAT


*The closer the value is to 0, the less biased the query is.*

The last column of the dataframe is the average of the absolute values of the query scores.

For further information, check the formulation of the metric in [WEAT in the WEFE API](https://wefe.readthedocs.io/en/latest/generated/wefe.WEAT.html#wefe.WEAT).


In [9]:
weat = WEAT()

WEAT_gender_results = run_queries(
    WEAT,
    gender_queries,
    models,
    lost_vocabulary_threshold=0.3,
    metric_params={"preprocessors": [{"lowercase": True}]},
    aggregate_results=True,
    queries_set_name="Gender Queries",
).round(3)

display(WEAT_gender_results)
plot_queries_results(WEAT_gender_results).show()

| model_name | Male Terms and Female Terms wrt career and family | Male Terms and Female Terms wrt Science and Arts | Male Terms and Female Terms wrt Maths and Arts2 | WEAT: Gender Queries average of abs values score |
|---|---|---|---|---|
| de | 0.51 | 0.078 | -0.004 | 0.197 |
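
The aggregated column can be checked by hand from the three WEAT scores shown above. The snippet below is only a plain-Python sanity check on the rounded values from this output, not a call into WEFE; the variable names are chosen for illustration.

```python
# Rounded WEAT scores copied from the results shown above.
weat_scores = {
    "career and family": 0.51,
    "Science and Arts": 0.078,
    "Maths and Arts2": -0.004,
}

# Average of absolute values: negative scores count as bias too.
average_abs = sum(abs(score) for score in weat_scores.values()) / len(weat_scores)

# A signed average would let scores of opposite sign cancel out.
signed_average = sum(weat_scores.values()) / len(weat_scores)

print(round(average_abs, 3))
```

Taking absolute values before averaging keeps a negative score, such as the one for the Maths and Arts2 query, from offsetting the positive ones.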


### Run the queries using WEAT effect size

*The closer the value is to 0, the less biased the query is.*

The last column of the dataframe is the average of the absolute values of the query scores.

For further information, check the formulation of the metric in [WEAT in the WEFE API](https://wefe.readthedocs.io/en/latest/generated/wefe.WEAT.html#wefe.WEAT).
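
For intuition, the effect size can be read as the difference between the mean associations of the two target sets, divided by the standard deviation of the associations over all target words. The sketch below follows that reading on toy vectors. The helper names and data are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since implementations may use the population version instead.

```python
import math
import statistics


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def association(w, A, B):
    """Mean similarity of w to attribute set A minus mean similarity to B."""
    return (
        sum(cosine(w, a) for a in A) / len(A)
        - sum(cosine(w, b) for b in B) / len(B)
    )


def weat_effect_size(X, Y, A, B):
    """Difference of mean associations, scaled by their spread over X and Y."""
    assoc_x = [association(x, A, B) for x in X]
    assoc_y = [association(y, A, B) for y in Y]
    spread = statistics.stdev(assoc_x + assoc_y)  # sample std: an assumption
    return (statistics.mean(assoc_x) - statistics.mean(assoc_y)) / spread


# Toy vectors (hypothetical): X leans towards A, Y leans towards B.
X = [[1.0, 0.0], [0.9, 0.1]]
Y = [[0.0, 1.0], [0.1, 0.9]]
A = [[1.0, 0.0]]
B = [[0.0, 1.0]]

effect = weat_effect_size(X, Y, A, B)
swapped = weat_effect_size(Y, X, A, B)
```

Because the difference is divided by a standard deviation, the effect size is scale-free, which makes it easier to compare across queries than the raw WEAT score.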


In [5]:
WEAT_EZ_gender_results = run_queries(
    WEAT,
    gender_queries,
    models,
    lost_vocabulary_threshold=0.3,
    metric_params={"preprocessors": [{"lowercase": True}], "return_effect_size": True},
    aggregate_results=True,
    queries_set_name="Gender Queries",
).round(3)

display(WEAT_EZ_gender_results)
plot_queries_results(WEAT_EZ_gender_results).show()

| model_name | Male Terms and Female Terms wrt career and family | Male Terms and Female Terms wrt Science and Arts | Male Terms and Female Terms wrt Maths and Arts2 | WEAT: Gender Queries average of abs values score |
|---|---|---|---|---|
| de | 0.938 | 0.506 | 0.207 | 0.55 |


### Run the queries using RNSB

*The closer the value is to 0, the less biased the query is.*

The last column of the dataframe is the average of the absolute values of the query scores.

For further information, check the formulation of the metric in [RNSB in the WEFE API](https://wefe.readthedocs.io/en/latest/generated/wefe.RNSB.html).
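
In outline, RNSB trains a classifier on the attribute words, predicts a negative-class probability for each target word, normalizes those probabilities into a distribution, and reports its KL divergence from the uniform distribution. The sketch below covers only that final step. The probabilities are made up to stand in for classifier output, and the function name is hypothetical, so it should be read as an illustration of why 0 means no measured bias rather than as the metric itself.

```python
import math


def kl_from_uniform(negative_probabilities):
    """KL divergence between the normalized probabilities and a uniform distribution."""
    total = sum(negative_probabilities)
    distribution = [p / total for p in negative_probabilities]
    uniform = 1.0 / len(distribution)
    return sum(p * math.log(p / uniform) for p in distribution if p > 0)


# Made-up classifier outputs for three target words.
balanced = kl_from_uniform([0.3, 0.3, 0.3])  # every word treated alike
skewed = kl_from_uniform([0.9, 0.1, 0.1])    # one word singled out
```

When all target words receive the same probability, the normalized distribution is uniform and the divergence vanishes; any imbalance pushes the score above 0.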

In [6]:
RNSB_gender_results = run_queries(
    RNSB,
    gender_queries,
    models,
    lost_vocabulary_threshold=0.3,
    metric_params={"preprocessors": [{"lowercase": True}]},
    aggregate_results=True,
    queries_set_name="Gender Queries",
).round(3)
display(RNSB_gender_results)
plot_queries_results(RNSB_gender_results).show()


| model_name | Male Terms and Female Terms wrt career and family | Male Terms and Female Terms wrt Science and Arts | Male Terms and Female Terms wrt Maths and Arts2 | RNSB: Gender Queries average of abs values score |
|---|---|---|---|---|
| de | 0.03 | 0.007 | 0.005 | 0.014 |


### Run the queries using ECT


*The closer the value is to **1**, the less biased the query is.*

ECT accepts only two target sets and one attribute set as input, so `run_queries` is called with `generate_subqueries=True` to split each query into subqueries of that size.
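
The splitting can be pictured as pairing the same two target sets with one attribute set at a time. The helper below is a hypothetical sketch of that idea (it is not a WEFE function), using shortened attribute sets for brevity.

```python
def split_by_attribute(target_names, attribute_sets, attribute_names):
    """Build one subquery description per attribute set, keeping all target sets."""
    prefix = " and ".join(target_names)
    return [
        {"name": f"{prefix} wrt {name}", "attribute_set": words}
        for name, words in zip(attribute_names, attribute_sets)
    ]


subqueries = split_by_attribute(
    ["Male Terms", "Female Terms"],
    [["Gehalt", "Büro"], ["Familie", "Kinder"]],  # shortened for illustration
    ["career", "family"],
)
subquery_names = [subquery["name"] for subquery in subqueries]
```

Each of the three gender queries has two attribute sets, so this kind of split yields six subqueries, matching the six score columns in the ECT results.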

The last column of the dataframe is the average of the absolute values of the subquery scores.

For further information, check the formulation of the metric in [ECT in the WEFE API](https://wefe.readthedocs.io/en/latest/generated/wefe.ECT.html).
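
To see why values near 1 indicate less bias, ECT can be read as a rank correlation: each target set is averaged into one vector, every attribute word is scored by its similarity to each average, and the two lists of similarities are compared with Spearman's correlation. The sketch below follows that reading in plain Python. All names and toy vectors are hypothetical, and the simple ranking helper assumes there are no tied similarities.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def mean_vector(vectors):
    """Component-wise average of a list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def ranks(values):
    """Rank positions starting at 1 (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = float(rank)
    return result


def pearson(xs, ys):
    """Pearson correlation of two equally long lists."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


def ect_sketch(T1, T2, A):
    """Spearman correlation between attribute similarities to each target mean."""
    mean_1, mean_2 = mean_vector(T1), mean_vector(T2)
    sims_1 = [cosine(mean_1, a) for a in A]
    sims_2 = [cosine(mean_2, a) for a in A]
    return pearson(ranks(sims_1), ranks(sims_2))


# Toy vectors (hypothetical).
T1 = [[1.0, 0.0], [0.8, 0.2]]
T2 = [[0.0, 1.0], [0.2, 0.8]]
A = [[1.0, 0.1], [0.5, 0.5], [0.1, 1.0]]

same_targets = ect_sketch(T1, T1, A)      # identical rankings
opposite_targets = ect_sketch(T1, T2, A)  # reversed rankings
```

If both target sets rank the attribute words in the same order, the correlation is 1; the more the rankings diverge, the lower the value.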

In [7]:
ECT_gender_results = run_queries(
    ECT,
    gender_queries,
    models,
    lost_vocabulary_threshold=0.3,
    metric_params={"preprocessors": [{"lowercase": True}]},
    aggregate_results=True,
    queries_set_name="Gender Queries",
    generate_subqueries=True
).round(3)

display(ECT_gender_results)
plot_queries_results(ECT_gender_results).show()


| model_name | Male Terms and Female Terms wrt career | Male Terms and Female Terms wrt family | Male Terms and Female Terms wrt Science | Male Terms and Female Terms wrt Arts | Male Terms and Female Terms wrt Maths | Male Terms and Female Terms wrt Arts2 | ECT: Gender Queries average of abs values score |
|---|---|---|---|---|---|---|---|
| de | 0.524 | 0.964 | 0.905 | 0.905 | 0.69 | 0.929 | 0.819 |
