<a href="https://colab.research.google.com/github/mariagrandury/sesgos-en-modelos-del-lenguaje/blob/main/detecci%C3%B3n_y_mitigaci%C3%B3n_de_sesgos_en_Word_Embeddings_con_WEFE.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Detecting and Mitigating Bias in Word Embeddings

To measure and mitigate bias in word embeddings we will use the open-source library [WEFE: "Word Embeddings Fairness Evaluation"](https://github.com/dccuchile/wefe).

In [None]:
!pip install wefe

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting wefe
  Downloading wefe-0.4.0-py3-none-any.whl (7.9 MB)
Collecting semantic-version
  Downloading semantic_version-2.10.0-py2.py3-none-any.whl (15 kB)
Installing collected packages: semantic-version, wefe
Successfully installed semantic-version-2.10.0 wefe-0.4.0


In [None]:
from wefe.datasets.datasets import load_weat
from wefe.query import Query
from wefe.metrics.WEAT import WEAT
from wefe.utils import run_queries
from wefe.word_embedding_model import WordEmbeddingModel

## Download the models

We will compare Spanish word embeddings trained on three corpora:
- Spanish Unannotated Corpora (SUC)
- Spanish Billion Word Corpus (SBWC)
- Spanish Wikipedia (SW)

These models were trained with [fastText](https://github.com/facebookresearch/fastText) by a team from the
Department of Computer Science (Departamento de Ciencias de la Computación) at the Universidad de Chile.

Repo: https://github.com/dccuchile/spanish-word-embeddings

In [None]:
!wget https://zenodo.org/record/3234051/files/embeddings-l-model.vec
!wget http://dcc.uchile.cl/~jperez/word-embeddings/fasttext-sbwc.vec.gz
!wget https://dl.fbaipublicfiles.com/fasttext/vectors-wiki/wiki.es.vec

--2022-12-18 19:03:43--  https://zenodo.org/record/3234051/files/embeddings-l-model.vec
Resolving zenodo.org (zenodo.org)... 188.185.124.72
Connecting to zenodo.org (zenodo.org)|188.185.124.72|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3446609027 (3.2G) [application/octet-stream]
Saving to: ‘embeddings-l-model.vec’


2022-12-18 19:06:49 (17.9 MB/s) - ‘embeddings-l-model.vec’ saved [3446609027/3446609027]

--2022-12-18 19:06:49--  http://dcc.uchile.cl/~jperez/word-embeddings/fasttext-sbwc.vec.gz
Resolving dcc.uchile.cl (dcc.uchile.cl)... 192.80.24.11
Connecting to dcc.uchile.cl (dcc.uchile.cl)|192.80.24.11|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.dcc.uchile.cl/~jperez/word-embeddings/fasttext-sbwc.vec.gz [following]
--2022-12-18 19:06:50--  https://www.dcc.uchile.cl/~jperez/word-embeddings/fasttext-sbwc.vec.gz
Resolving www.dcc.uchile.cl (www.dcc.uchile.cl)... 192.80.24.11, 200.9.99.213
Connectin

In [None]:
from gensim.models import KeyedVectors

suc_embeddings = KeyedVectors.load_word2vec_format('embeddings-l-model.vec')
suc_model = WordEmbeddingModel(suc_embeddings, 'suc_model')

sbwc_embeddings = KeyedVectors.load_word2vec_format('fasttext-sbwc.vec.gz')
sbwc_model = WordEmbeddingModel(sbwc_embeddings, 'sbwc_model')

sw_embeddings = KeyedVectors.load_word2vec_format('wiki.es.vec')
sw_model = WordEmbeddingModel(sw_embeddings, 'sw_model')
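
The `.vec` files loaded above use the word2vec text format: a header line with the vocabulary size and the number of dimensions, then one word per line followed by its values. The sketch below reads that layout from a tiny in-memory file, which can be a cheap sanity check before loading a multi-gigabyte download. The helper names `read_vec_header` and `iter_vectors` are hypothetical and are not part of gensim or WEFE.

```python
import io


def read_vec_header(handle):
    """Return (vocab_size, dimensions) from the first line of a .vec file."""
    vocab_size, dims = handle.readline().split()
    return int(vocab_size), int(dims)


def iter_vectors(handle, dims):
    """Yield (word, vector) pairs from the lines that follow the header."""
    for line in handle:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dims:
            raise ValueError(f"expected {dims} values for {word!r}, got {len(values)}")
        yield word, [float(value) for value in values]


# Toy stand-in for a real file such as 'wiki.es.vec' (the values are made up).
toy_file = io.StringIO("2 3\nmujer 0.1 0.2 0.3\nhombre 0.4 0.5 0.6\n")
vocab_size, dims = read_vec_header(toy_file)
vectors = dict(iter_vectors(toy_file, dims))
```

With a real file, `open(path, encoding="utf-8")` would replace the `StringIO` object. Reading only the header is enough to confirm the vocabulary size and dimensionality that `WordEmbeddingModel` reports.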

In [None]:
models = [suc_model, sbwc_model, sw_model]

In [None]:
models

[<WordEmbeddingModel named 'suc_model' with 1313423 word embeddings of 300 dims>,
 <WordEmbeddingModel named 'sbwc_model' with 855380 word embeddings of 300 dims>,
 <WordEmbeddingModel named 'sw_model' with 985667 word embeddings of 300 dims>]

In [None]:
# list(suc_model.vocab.keys())[:5]

## WEAT Metric

Aylin Caliskan, Joanna J Bryson, and Arvind Narayanan. Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334):183–186, 2017. [DOI: 10.1126/science.aal4230](https://www.science.org/doi/10.1126/science.aal4230)

- The closer the score is to 0, the less biased the model is. In absolute value, scores usually lie between 0 and 2.
- The more positive the WEAT score, the more strongly target set 1 is associated with attribute set 1 and target set 2 with attribute set 2. A negative score indicates the opposite pairing.
- To also obtain a p-value, pass `calculate_p_value=True, p_value_iterations=1000` to `run_query`.
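
To make the score concrete, here is a minimal sketch of the WEAT score and effect size as defined by Caliskan et al. (2017), computed on made-up 2-D vectors. It is not WEFE's implementation: the function names are hypothetical, and using the sample standard deviation in the effect size is an assumption of this sketch.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two vectors given as sequences of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, attr_1, attr_2):
    """Mean similarity of w to attribute set 1 minus its mean similarity to set 2."""
    return mean(cosine(w, a) for a in attr_1) - mean(cosine(w, b) for b in attr_2)


def weat_score(target_1, target_2, attr_1, attr_2):
    """Sum of associations over target set 1 minus the sum over target set 2."""
    s_1 = [association(x, attr_1, attr_2) for x in target_1]
    s_2 = [association(y, attr_1, attr_2) for y in target_2]
    return sum(s_1) - sum(s_2)


def weat_effect_size(target_1, target_2, attr_1, attr_2):
    """Difference of mean associations, scaled by the spread over all targets."""
    s_1 = [association(x, attr_1, attr_2) for x in target_1]
    s_2 = [association(y, attr_1, attr_2) for y in target_2]
    return (mean(s_1) - mean(s_2)) / stdev(s_1 + s_2)


# Toy vectors: target set 1 points towards attribute set 1, target set 2 towards set 2.
toy_target_1 = [(1.0, 0.1), (0.9, 0.2)]
toy_target_2 = [(0.1, 1.0), (0.2, 0.9)]
toy_attr_1 = [(1.0, 0.0), (0.8, 0.1)]
toy_attr_2 = [(0.0, 1.0), (0.1, 0.8)]

score = weat_score(toy_target_1, toy_target_2, toy_attr_1, toy_attr_2)
effect = weat_effect_size(toy_target_1, toy_target_2, toy_attr_1, toy_attr_2)
```

Because the toy targets are aligned with "their" attribute sets, both values come out positive. Swapping the two target sets flips the sign of the score, which is why only the distance from 0 matters when comparing models.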

In [None]:
weat = WEAT()

## Detecting gender bias

> Gender is a spectrum. To simplify the study of bias, we will consider only two c

### Create the query

1. Define the word sets we want to compare
2. Run the query

In [None]:
target_sets_names=["Términos Femeninos", "Términos Masculinos"]
target_sets=[
    ["ella", "mujer", "chica", "niña", "hermana", "madre", "hija", "amiga"],
    ["él", "hombre", "chico", "niño", "hermano", "padre", "hijo", "amigo"]
]

In [None]:
familia_vs_carrera = Query(
    target_sets_names=target_sets_names,
    target_sets=target_sets,
    attribute_sets_names=["Familia", "Carrera Profesional"],
    attribute_sets=[
        [
            "hogar", "casa", "crianza", "familia", "cariño", "matrimonio",
            "boda", "pareja", "cuidar"
        ],
        [
            "dirección", "oficina", "profesional", "corporación", "salario",
            "empresa", "carrera", "responsabilidad", "éxito",
        ],
    ]
)

In [None]:
result = weat.run_query(familia_vs_carrera, sbwc_model)
result

{'query_name': 'Términos Femeninos and Términos Masculinos wrt Familia and Carrera Profesional',
 'result': 0.38658484195669485,
 'weat': 0.38658484195669485,
 'effect_size': 0.8784279893293436,
 'p_value': nan}
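
A score like the one above is only meaningful if most of the query words exist in the model's vocabulary; WEFE's documentation describes a `lost_vocabulary_threshold` parameter for this situation. The sketch below shows one way to check coverage by hand. The helpers `missing_words` and `missing_fraction` and the toy vocabulary are hypothetical; with a real model the vocabulary would come from the model itself.

```python
def missing_words(word_set, vocabulary):
    """Return the words of word_set that are absent from vocabulary."""
    return [word for word in word_set if word not in vocabulary]


def missing_fraction(word_set, vocabulary):
    """Fraction of word_set (0.0 to 1.0) that the vocabulary does not cover."""
    return len(missing_words(word_set, vocabulary)) / len(word_set)


# Hypothetical vocabulary; a real check would use the model's own word list.
toy_vocabulary = {"hogar", "casa", "familia", "cariño", "oficina", "salario"}
familia_sample = ["hogar", "casa", "crianza", "familia", "cariño"]

absent = missing_words(familia_sample, toy_vocabulary)
fraction = missing_fraction(familia_sample, toy_vocabulary)
```

Running this check for each target and attribute set shows which words, if any, silently drop out of the comparison.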

### Evaluate the models with different queries

In [None]:
eleccion_carrera = Query(
    target_sets_names=target_sets_names,
    target_sets=target_sets,
    attribute_sets_names=["Carreras 1", "Carreras 2"],
    attribute_sets=[
        [
            "enfermería", "magisterio", "psicología", "pedagogía",
            "literatura", "peluquería", 
        ],
        [
            "medicina", "derecho", "matemáticas", "física", "ingeniería",
            "arquitectura", "doctorado",
        ],
    ]
)

In [None]:
colores = Query(
    target_sets_names=target_sets_names,
    target_sets=target_sets,
    attribute_sets_names=["Colores 1", "Colores 2"],
    attribute_sets=[
        [
            "rosa", "morado", "fucsia", "lila", "turquesa"
        ],
        [
            "azul", "marino", "verde",
        ],
    ]
)

In [None]:
hobbies = Query(
    target_sets_names=target_sets_names,
    target_sets=target_sets,
    attribute_sets_names=["Hobbies 1", "Hobbies 2"],
    attribute_sets=[
        [
            "bailar", "pintar", "gimnasia", "fotografía", "volley", "cantar",
            "escribir", "maquillar", "peinar", "manualidades"
        ],
        [
            "fútbol", "rugby", "boxeo", "gimnasio", "coches", "motos",
            "conducir", "beisbol", "tenis", "ajedrez", "billar"
        ],
    ]
)

In [None]:
gender_queries = [familia_vs_carrera, eleccion_carrera, colores, hobbies]

In [None]:
WEAT_gender_results = run_queries(
    metric=WEAT, queries=gender_queries, models=models, queries_set_name="Gender Queries"
)

WEAT_gender_results

| model_name | Términos Femeninos and Términos Masculinos wrt Familia and Carrera Profesional | Términos Femeninos and Términos Masculinos wrt Carreras 1 and Carreras 2 | Términos Femeninos and Términos Masculinos wrt Colores 1 and Colores 2 | Términos Femeninos and Términos Masculinos wrt Hobbies 1 and Hobbies 2 |
|---|---|---|---|---|
| suc_model | 0.152173 | 0.27453 | 0.799884 | 0.561708 |
| sbwc_model | 0.386585 | 0.27589 | 0.475496 | 0.423254 |
| sw_model | 0.441849 | 0.288998 | 0.825989 | 0.431055 |


### Visualize the results


In [None]:
from wefe.utils import plot_queries_results, run_queries

plot_queries_results(WEAT_gender_results).show()

### Aggregate the results

In [None]:
WEAT_gender_results_agg = run_queries(
    WEAT,
    gender_queries,
    models,
    metric_params={"preprocessors": [{"lowercase": True}]},
    aggregate_results=True,
    aggregation_function="abs_avg",
    # return_only_aggregation=True,
    queries_set_name="Gender Queries",
)
WEAT_gender_results_agg

| model_name | Términos Femeninos and Términos Masculinos wrt Familia and Carreras | Términos Femeninos and Términos Masculinos wrt Carreras 1 and Carreras 2 | WEAT: Gender Queries average of abs values score |
|---|---|---|---|
| suc_model | 0.152173 | 0.27453 | 0.213351 |
| sbwc_model | 0.386585 | 0.27589 | 0.331237 |
| sw_model | 0.441849 | 0.288998 | 0.365423 |
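
The last column of the output above can be reproduced by hand: `abs_avg` is the mean of the absolute values of each model's per-query scores. A minimal sketch using the two per-query columns shown in that output (the `abs_avg` function here is a local helper, not WEFE's own code):

```python
from statistics import mean


def abs_avg(values):
    """Average of the absolute values of a model's per-query scores."""
    return mean(abs(value) for value in values)


# Per-query scores copied from the aggregated output above.
scores = {
    "suc_model": [0.152173, 0.27453],
    "sbwc_model": [0.386585, 0.27589],
    "sw_model": [0.441849, 0.288998],
}

aggregated = {model: abs_avg(values) for model, values in scores.items()}
```

Taking absolute values first keeps biases of opposite sign from cancelling each other out in the average.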


In [None]:
plot_queries_results(WEAT_gender_results_agg).show()


### Rank the models

In [None]:
from wefe.utils import create_ranking

# create the ranking
gender_ranking = create_ranking(
    [WEAT_gender_results_agg]
)

gender_ranking

| model_name | WEAT: Gender Queries average of abs values score |
|---|---|
| suc_model | 1.0 |
| sbwc_model | 2.0 |
| sw_model | 3.0 |
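
The ranking follows directly from the aggregated scores: the model with the lowest average absolute score is ranked first. A minimal sketch of that ordering, using the aggregated values reported earlier; it ignores ties and is not meant to mirror how `create_ranking` is implemented:

```python
# Aggregated "average of abs values" scores from the previous step.
aggregated = {
    "suc_model": 0.213351,
    "sbwc_model": 0.331237,
    "sw_model": 0.365423,
}

# Sort model names by ascending score, then number them from 1.
ordered = sorted(aggregated, key=aggregated.get)
ranking = {model: position for position, model in enumerate(ordered, start=1)}
```

A lower position means less measured bias on this particular set of queries, not on bias in general.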


In [None]:
from wefe.utils import plot_ranking

fig = plot_ranking(gender_ranking)
fig.show()

## Bias mitigation

In [None]:
from wefe.datasets import fetch_debiaswe
from wefe.debias.hard_debias import HardDebias

In [None]:
debiaswe_wordsets = fetch_debiaswe()

definitional_pairs = debiaswe_wordsets["definitional_pairs"]
equalize_pairs = debiaswe_wordsets["equalize_pairs"]
gender_specific = debiaswe_wordsets["gender_specific"]

In [None]:
definitional_pairs = [
    ["ella", "él"],
    ["mujer", "hombre"],
    ["chica", "chico"],
    ["madre", "padre"],
    ["hija", "hijo"],
    ["hermana", "hermano"],
    ["amiga", "amigo"],
]
A = ["doctora", "profesora", "pintora", "actora", "autora", "escritora", "escultora", "cantautora", "lectora"]
B = ["abogada", "arquitecta"]
# Feminine forms in A drop the final "a"; those in B swap it for "o".
equalize_pairs = (
    definitional_pairs
    + [[palabra, palabra[:-1]] for palabra in A]
    + [[palabra, palabra[:-1] + "o"] for palabra in B]
)
equalize_pairs

[['ella', 'él'],
 ['mujer', 'hombre'],
 ['chica', 'chico'],
 ['madre', 'padre'],
 ['hija', 'hijo'],
 ['hermana', 'hermano'],
 ['amiga', 'amigo'],
 ['doctora', 'doctor'],
 ['profesora', 'profesor'],
 ['pintora', 'pintor'],
 ['actora', 'actor'],
 ['autora', 'autor'],
 ['escritora', 'escritor'],
 ['escultora', 'escultor'],
 ['cantautora', 'cantautor'],
 ['lectora', 'lector'],
 ['abogada', 'abogado'],
 ['arquitecta', 'arquitecto']]

In [None]:
hd = HardDebias(verbose=False, criterion_name="género").fit(
    sbwc_model,
    definitional_pairs=definitional_pairs,
    equalize_pairs=equalize_pairs,
)

In [None]:
gender_debiased_model = hd.transform(sbwc_model, ignore=gender_specific, copy=True)

Model copy created successfully.


100%|██████████| 855380/855380 [00:14<00:00, 59379.83it/s]
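
The core of hard debiasing (Bolukbasi et al., 2016) is the neutralize step: each word that is not gender-specific has its component along the bias direction removed and is then renormalized. The sketch below illustrates that step on made-up 3-D vectors. As a simplifying assumption it takes the bias direction from a single definitional pair, whereas the method itself derives it from several pairs using PCA; all function names are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    """Scale a vector to unit length."""
    length = math.sqrt(dot(u, u))
    return [a / length for a in u]


def bias_direction(feminine, masculine):
    """Unit vector along the difference of one definitional pair (simplification)."""
    return normalize([f - m for f, m in zip(feminine, masculine)])


def neutralize(vector, direction):
    """Remove the projection onto the bias direction, then renormalize."""
    projection = dot(vector, direction)
    residual = [v - projection * d for v, d in zip(vector, direction)]
    return normalize(residual)


# Toy vectors standing in for real embeddings (the values are made up).
ella = [0.8, 0.1, 0.3]
el = [-0.6, 0.2, 0.3]
oficina = [0.3, 0.7, 0.5]

direction = bias_direction(ella, el)
oficina_neutral = neutralize(oficina, direction)
```

After neutralizing, the word has no component left along the bias direction, so it is equally close to both ends of the pair along that axis. The equalize step, not shown here, then makes each pair in `equalize_pairs` equidistant from every neutralized word.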


In [None]:
import pandas as pd
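# Reuse the familia_vs_carrera query so the scores are comparable with the earlier sbwc_model result.
biased_results_1 = weat.run_query(familia_vs_carrera, sbwc_model, normalize=True)
debiased_results_1 = weat.run_query(familia_vs_carrera, gender_debiased_model, normalize=True)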
result_df = pd.DataFrame([biased_results_1, debiased_results_1])
result_df = result_df.assign(model = ['original', 'debiased'])
result_df

| | query_name | result | weat | effect_size | p_value | model |
|---|---|---|---|---|---|---|
| 0 | Términos Femeninos and Términos Masculinos wrt... | 0.386585 | 0.386585 | 0.878428 | NaN | original |
| 1 | Términos Femeninos and Términos Masculinos wrt... | 0.042829 | 0.042829 | 0.105684 | NaN | debiased |


### Save the debiased models

In [None]:
# The debiased vectors come from the fastText SBWC model, so name the file accordingly.
gender_debiased_model.wv.save("gender_debiased_sbwc.kv")