Evaluating Gender Bias in Machine Translation

This repository extends the work presented in Evaluating Gender Bias in Machine Translation by Gabriel Stanovsky, Noah A. Smith, and Luke Zettlemoyer (ACL 2019), and Gender Coreference and Bias Evaluation at WMT 2020 by Tom Kocmi, Tomasz Limisiewicz, and Gabriel Stanovsky (WMT 2020).

Our project builds on this research by addressing additional biases and adding support for Portuguese, with the aim of improving fairness in machine translation across more languages.

Requirements

fast_align: install and point an environment variable called FAST_ALIGN_BASE to its root folder (the one containing the build folder).

Installation

Create a Conda environment:

conda create -n mypython3 python=3.8
source activate mypython3
conda install anaconda

Clone the mt_gender and fast_align repositories, then install cmake (needed to build fast_align):

git clone https://github.com/gabrielStanovsky/mt_gender.git
git clone https://github.com/clab/fast_align.git
conda install cmake

Compile fast_align:

cd fast_align
mkdir -p build
cd build
cmake ..
make

Check if it was installed properly:

cd ../../ && fast_align/build/fast_align

Set the environment variable FAST_ALIGN_BASE to the root folder of fast_align:
```
export FAST_ALIGN_BASE=/path/to/fast_align
```
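
Before running the evaluation it can help to confirm that the variable really points at a built checkout. The following is a minimal sketch, assuming the binary sits at build/fast_align under FAST_ALIGN_BASE as in the compile steps above; check_fast_align is a hypothetical helper name and is not part of this repository.

```shell
# Sketch: verify that FAST_ALIGN_BASE points at a built fast_align checkout.
# check_fast_align is a hypothetical helper, not shipped with this project.
check_fast_align() {
    # Fail early if the variable is unset or empty.
    if [ -z "${FAST_ALIGN_BASE:-}" ]; then
        echo "FAST_ALIGN_BASE is not set" >&2
        return 1
    fi
    # The compile steps above place the binary in <root>/build/fast_align.
    if [ ! -x "${FAST_ALIGN_BASE}/build/fast_align" ]; then
        echo "no executable at ${FAST_ALIGN_BASE}/build/fast_align" >&2
        return 1
    fi
    echo "fast_align found at ${FAST_ALIGN_BASE}/build/fast_align"
}
```

Calling `check_fast_align || exit 1` at the top of a run script stops the run before any evaluation starts if the path is wrong.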

Project Changes

In this updated version of the project, the following significant enhancements have been made:

Error Correction: Numerous errors identified in the original project have been corrected to enhance the stability and accuracy of the evaluations.
Language Support: Added support for Portuguese, so that gender bias can be assessed in Portuguese translations.
Project unbIAs: These changes were made as part of the unbIAs project, which aims to reduce biases in artificial intelligence systems.

How to Run

After completing the installation steps:

Ensure all dependencies are installed by running:
```
pip install -r requirements.txt
```
Configure the necessary environment variables as described in the Installation section.

For the general gender accuracy number, run:

cd /content/mt_gender/src && ../scripts/evaluate_all_languages.sh ../data/aggregates/en.txt ../../winomtout &> ../../winomtout/baseline

For the accuracy on the pro-stereotypical subset, run:

cd /content/mt_gender/src && ../scripts/evaluate_all_languages.sh ../data/aggregates/en_pro.txt ../../winomtout &> ../../winomtout/pro

For the accuracy on the anti-stereotypical subset, run:

cd /content/mt_gender/src && ../scripts/evaluate_all_languages.sh ../data/aggregates/en_anti.txt ../../winomtout &> ../../winomtout/anti
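
The three invocations above differ only in the aggregate file and the log name, so they can be sketched as a single loop. This is an illustrative wrapper under stated assumptions, not a script from this repository: run_all_evaluations is a hypothetical name, and only the script path, the aggregate files, and the log names are taken from the commands above. It also creates the output directory first, since the redirect fails if that directory does not exist, and it uses `> file 2>&1` as the portable equivalent of bash's `&>`.

```shell
# Sketch: run the baseline, pro, and anti evaluations in one pass.
# run_all_evaluations is a hypothetical wrapper. The body runs in a
# subshell (note the parentheses), so the caller's working directory
# is left unchanged.
run_all_evaluations() (
    src_dir=$1   # e.g. /content/mt_gender/src
    out_dir=$2   # e.g. ../../winomtout, resolved relative to src_dir
    cd "$src_dir" || exit 1
    # The log redirect below needs the output directory to exist.
    mkdir -p "$out_dir"
    # Each entry is <aggregate file>:<log name>, as in the commands above.
    for pair in en.txt:baseline en_pro.txt:pro en_anti.txt:anti; do
        aggregate=${pair%%:*}
        log_name=${pair##*:}
        ../scripts/evaluate_all_languages.sh \
            "../data/aggregates/${aggregate}" "$out_dir" \
            > "${out_dir}/${log_name}" 2>&1
    done
)
```

With the paths used in this README, the call would be `run_all_evaluations /content/mt_gender/src ../../winomtout`; adjust the first argument if the repository is checked out elsewhere.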

For detailed step-by-step instructions, refer to the provided notebook (WinoMT_Scores_add_portuguese.ipynb), which includes specific configurations and examples.

License

This project uses the following license: MIT.
