Gender Bias in Masked Language Models for Multiple Languages

Code and data for the paper "Gender Bias in Masked Language Models for Multiple Languages" (NAACL 2022). If you use any part of this work, please include the following citation:

@inproceedings{Kaneko:NAACL2022,
    title = "Gender Bias in Masked Language Models for Multiple Languages",
    author = "Kaneko, Masahiro  and
      Imankulova, Aizhan  and
      Bollegala, Danushka  and
      Okazaki, Naoaki",
    booktitle = "Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL)",
    month = jul,
    year = "2022",
    address = "Seattle",
    publisher = "Association for Computational Linguistics",
}

🛠 Setup

All requirements can be found in requirements.txt. You can install all required packages with pip install -r requirements.txt.

🖥 Evaluating MLMs in multiple languages

You can evaluate bias using --corpus to select a parallel corpus and --lang to select the language.

python eval.py --corpus [ted, news] --lang [de, ja, ar, es, pt, ru, id, zh] --method aula
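
To cover every corpus and language listed above, a small driver script can build one command per combination. This is a minimal sketch, assuming eval.py takes a single value for each of --corpus and --lang per run (the brackets above list the choices). The helpers build_commands and run_all are hypothetical and not part of this repository.

```python
import itertools
import subprocess
import sys

# Choices copied from the command shown above.
CORPORA = ["ted", "news"]
LANGS = ["de", "ja", "ar", "es", "pt", "ru", "id", "zh"]


def build_commands(corpora=CORPORA, langs=LANGS, method="aula"):
    """Return one eval.py argument list per (corpus, language) pair."""
    return [
        [sys.executable, "eval.py",
         "--corpus", corpus, "--lang", lang, "--method", method]
        for corpus, lang in itertools.product(corpora, langs)
    ]


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # Assumption: the script is started from the repository root.
            subprocess.run(cmd, check=True)
```

Calling run_all() only prints the commands. Calling run_all(dry_run=False) from the repository root executes them one after another and stops at the first failure.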

💻 Japanese and Russian corpora to evaluate social biases

japanese.json and russian.json contain CrowS-Pairs data manually translated into Japanese and Russian, respectively. You can evaluate bias on them with the code in this repository.
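
CrowS-Pairs-style data pairs a stereotypical sentence with an anti-stereotypical one. A bias score for such data is commonly reported as the percentage of pairs in which the model scores the stereotypical sentence higher, so 50 indicates no preference. The sketch below shows only that final aggregation step. The function name and the input format are assumptions for illustration: they are not taken from eval.py, and nothing is assumed here about the layout of the JSON files.

```python
def bias_score(pair_scores):
    """Percentage of pairs where the stereotypical sentence scores higher.

    pair_scores: iterable of (stereo_score, anti_stereo_score) tuples,
    for example sentence-level scores from a masked language model
    (hypothetical input format).
    """
    pairs = list(pair_scores)
    if not pairs:
        raise ValueError("no sentence pairs given")
    # Count the pairs in which the stereotypical sentence is preferred.
    preferred = sum(1 for stereo, anti in pairs if stereo > anti)
    return 100.0 * preferred / len(pairs)
```

With this definition, a score well above 50 means the model tends to prefer stereotypical sentences, and a score well below 50 means it tends to prefer anti-stereotypical ones.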

📜 License

See the LICENSE file.
