Evaluate bias in MLM

Code for the paper: Unmasking the Mask -- Evaluating Social Biases in Masked Language Models. If you use any part of this work, please cite the following paper:

@InProceedings{Kaneko:AUL:2022,
  author    = {Masahiro Kaneko and Danushka Bollegala},
  title     = {Unmasking the Mask -- Evaluating Social Biases in Masked Language Models},
  booktitle = {Proceedings of the 36th AAAI Conference on Artificial Intelligence},
  year      = {2022},
  month     = {February},
  address   = {Vancouver, BC, Canada}
}

🛠 Setup

You can install all required packages with the following command.

pip install -r requirements.txt

You can download the CrowS-Pairs (CP) and StereoSet (SS) datasets and preprocess them with the following commands.

mkdir -p data
wget -O data/cp.csv https://raw.githubusercontent.com/nyu-mll/crows-pairs/master/data/crows_pairs_anonymized.csv
wget -O data/ss.json https://raw.githubusercontent.com/moinnadeem/StereoSet/master/data/dev.json
python -u preprocess.py --input crows_pairs --output data/paralled_cp.json
python -u preprocess.py --input stereoset --output data/paralled_ss.json
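
To sanity-check the preprocessed files before evaluating, a minimal sketch like the one below can help. It assumes the output of preprocess.py is either a single JSON document or JSON Lines; the exact field names of each record are not documented here, so it simply prints the first record.

```python
# Illustrative sketch only: peek at the preprocessed CP data.
# Assumption: data/paralled_cp.json is either one JSON document or JSON Lines.
import json

path = "data/paralled_cp.json"
with open(path, encoding="utf-8") as f:
    text = f.read().strip()

try:
    records = json.loads(text)                                          # single JSON document
except json.JSONDecodeError:
    records = [json.loads(line) for line in text.splitlines() if line]  # JSON Lines fallback

print(f"Loaded {len(records)} records from {path}")
print(records[0])  # assumes a list of example pairs; inspect one record
```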

🧑🏻‍💻 How to evaluate

You can evaluate MLMs (BERT, RoBERTa, and ALBERT) with AULA, AUL, CP score (CPS), and SS score (SSS; intrasentence) on the CP and SS datasets using the following command. You can also specify a path to a pre-trained MLM with --model.

python evaluate.py --data [cp, ss] --output /Your/output/path --model [bert, roberta, albert] --method [aula, aul, cps, sss]
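
For example, to score BERT with AULA on CrowS-Pairs (the output path here is just a placeholder):

python evaluate.py --data cp --output output/bert_aula_cp.txt --model bert --method aula

Roughly speaking, AUL scores a sentence by the average log-likelihood the MLM assigns to its tokens when the full, unmasked sentence is given as input, and AULA additionally weights tokens by their attention; for the exact scoring, see evaluate.py. The snippet below is only a rough sketch of that idea, not the repository's implementation, and the model name and example sentence are placeholders.

```python
# Illustrative sketch only (not the repo's evaluate.py): an "all unmasked likelihood"
# style score, i.e. the average log-probability the MLM assigns to each token when the
# full, unmasked sentence is given as input.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_NAME = "bert-base-cased"  # placeholder; use whichever MLM you evaluate
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()

def unmasked_log_likelihood(sentence: str) -> float:
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0]               # (seq_len, vocab_size)
    log_probs = torch.log_softmax(logits, dim=-1)
    ids = inputs["input_ids"][0]
    scores = log_probs[torch.arange(len(ids)), ids]      # log p(token | unmasked sentence)
    return scores[1:-1].mean().item()                    # drop [CLS]/[SEP]

# A higher score for the stereotypical sentence of a pair indicates a biased preference.
print(unmasked_log_likelihood("The nurse prepared the medication."))
```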

📜 License

See the LICENSE file.
