Multilingual large language models leak human stereotypes across language boundaries

This repository contains the code and dataset from our paper [].

Dataset

We conducted a human study in English, Russian, Chinese, and Hindi to collect human judgments of stereotypes about 30 social groups within three categories, across all 16 pairs of traits from the ABC model. See the datasheet in the dataset folder for details of the dataset.

Measure stereotypic associations in LLMs

Run python asso.py to get the group-trait association scores for each template.
Then run python aggre_tem.py to aggregate those per-template scores into group-trait association scores.
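
To illustrate what aggregating over templates can look like, here is a minimal sketch. It is not the code in aggre_tem.py: the row layout (group, trait, template id, score), the sample values, the function name aggregate_over_templates, and the choice of a plain mean are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-template scores: (group, trait, template_id, score).
# The real output format of asso.py may differ.
per_template_scores = [
    ("group_a", "trait_x", 0, 0.20),
    ("group_a", "trait_x", 1, 0.40),
    ("group_a", "trait_y", 0, -0.10),
    ("group_a", "trait_y", 1, 0.10),
    ("group_b", "trait_x", 0, 0.50),
]


def aggregate_over_templates(rows):
    """Average the scores of each (group, trait) pair over all templates.

    A plain mean is an assumption; the repository may aggregate differently.
    """
    buckets = defaultdict(list)
    for group, trait, _template_id, score in rows:
        buckets[(group, trait)].append(score)
    return {key: mean(values) for key, values in buckets.items()}


aggregated = aggregate_over_templates(per_template_scores)

# Print one tab-separated line per (group, trait) pair.
for (group, trait), score in sorted(aggregated.items()):
    print(f"{group}\t{trait}\t{score:.3f}")
```

Keying the result on the (group, trait) pair keeps one score per pair regardless of how many templates contributed to it.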

Measure stereotype leakage

Run python mixed_effect.py to get the results of the mixed-effects analysis.
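
For orientation, a linear mixed-effects model has the general form below. This is the textbook formulation, not the specification used in mixed_effect.py. Which predictors enter as fixed effects and which grouping factors enter as random effects is not stated in this README, so every symbol here is generic.

```latex
y = X\beta + Zu + \varepsilon,
\qquad u \sim \mathcal{N}(0, G),
\qquad \varepsilon \sim \mathcal{N}(0, \sigma^{2} I)
```

Here $y$ is the vector of responses, $X\beta$ the fixed-effects part, $Zu$ the random-effects part with covariance $G$, and $\varepsilon$ the residual noise.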
