Mitigating Source Bias for Fairer Weak Supervision

This repository provides the official implementation of the following paper:

Mitigating Source Bias for Fairer Weak Supervision (NeurIPS 2023)
Changho Shin, Sonia Cromp, Dyah Adila, Frederic Sala

Installation

We recommend you create a conda environment as follows.

conda env create -f environment.yml

and activate it with

conda activate fair-ws

Datasets

Preprocessed datasets and cached SBM mappings are available for download (see the Dropbox link in the commands below). Dataset folders should be placed under the data folder, i.e.

```
fair-ws/
    data/
        adult/
        bank_marketing/
        CelebA/
        census/
        CivilComments/
        hateXplain/
        imdb/
        tennis/
        UTKFace/
    fairws/
    notebook/
```

The following commands download and extract the datasets. The archive is 6.24GB, so this may take about 30 minutes.

cd fair-ws
mkdir data
cd data
wget https://www.dropbox.com/sh/7hwqrigdn5oehn7/AACBq7q-eKuOkKQi5nxM01IKa -O datasets.zip
unzip datasets.zip
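
After extraction, a quick check along the following lines can confirm that the layout matches the tree above. This is a sketch rather than part of the repository: the helper name `missing_dataset_dirs` is hypothetical, and the folder names are simply copied from the directory listing in this README.

```python
from pathlib import Path

# Folder names copied from the directory tree in this README.
EXPECTED_DATASETS = [
    "adult", "bank_marketing", "CelebA", "census", "CivilComments",
    "hateXplain", "imdb", "tennis", "UTKFace",
]


def missing_dataset_dirs(data_root):
    """Return the expected dataset folders that are absent under data_root."""
    root = Path(data_root)
    return [name for name in EXPECTED_DATASETS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Assumes the script is run from the fair-ws root, next to data/.
    missing = missing_dataset_dirs("data")
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("All expected dataset folders found.")
```

Run it from the fair-ws root so that the relative path `data` resolves to the folder created above.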

Sample Usage

from fairws.data_util import load_dataset
from fairws.sbm import find_sbm_mapping, correct_bias

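# Load the data and labeling function (LF) outputs; a_train and a_test are group variables.
# Note: load_LF is not imported above; import it from the module that defines it.
x_train, y_train, a_train, x_test, y_test, a_test = load_dataset(...)
L = load_LF(...)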

# Source bias mitigation (SBM)
ot_type = "linear"
sbm_diff_threshold = 0.05
sbm_mapping = find_sbm_mapping(x_train, a_train, ot_type)
L = correct_bias(L, a_train, sbm_mapping, sbm_diff_threshold)

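# Standard weak supervision pipeline
# Note: LabelModel is not imported above; the fit/predict calls match Snorkel's
# snorkel.labeling.model.LabelModel, which is assumed here.
label_model = LabelModel(...)
label_model.fit(L_train=L, ...)
y_train_pseudo = label_model.predict(L, ...)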
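
To build intuition for the SBM step, the toy example below may help. It is not the `fairws` implementation and does not call `find_sbm_mapping` or `correct_bias`; the names `linear_map_1d` and `transport_votes` are hypothetical. It assumes a single 1-D feature, uses the closed-form affine map between two 1-D Gaussians as a stand-in for a linear transport map, and then copies each labeling-function vote from the nearest point in the other group. The thresholding controlled by `sbm_diff_threshold` is deliberately left out.

```python
from statistics import mean, pstdev


def linear_map_1d(source, target):
    """Fit the affine map that matches the mean and std of source to target.

    For 1-D Gaussians this is the closed-form optimal transport map.
    """
    m_s, m_t = mean(source), mean(target)
    s_s, s_t = pstdev(source), pstdev(target)
    scale = s_t / s_s if s_s > 0 else 1.0
    return lambda x: m_t + scale * (x - m_s)


def transport_votes(x, a, votes, src_group, tgt_group):
    """Replace each source-group vote with the vote of the target-group
    point nearest to the transported source point."""
    src = [xi for xi, ai in zip(x, a) if ai == src_group]
    tgt = [(xi, vi) for xi, ai, vi in zip(x, a, votes) if ai == tgt_group]
    transport = linear_map_1d(src, [xi for xi, _ in tgt])
    corrected = list(votes)
    for i, (xi, ai) in enumerate(zip(x, a)):
        if ai == src_group:
            moved = transport(xi)
            # 1-nearest neighbour among target-group points
            _, corrected[i] = min(tgt, key=lambda p: abs(p[0] - moved))
    return corrected


# Toy data: one feature, two groups, one labeling function.
x = [0.0, 1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 13.0]
a = [0, 0, 0, 0, 1, 1, 1, 1]
votes = [0, 0, 1, 1, 1, 1, 1, 1]  # the LF ignores the feature in group 1

corrected_votes = transport_votes(x, a, votes, src_group=1, tgt_group=0)
print(corrected_votes)
```

In this toy setup the group-1 points are shifted onto the group-0 range, so their votes end up following the same feature-dependent pattern as group 0. Real features are high-dimensional, which is where the repository's `ot_type` options come in.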

Experiments

Real experiment - Section 5.1.
Synthetic experiment - Section 5.2.
FairML compatibility experiment - Section 5.3.
SBM + Domino experiment - Section 5.4.

Citation

If you find our repository useful for your research, please consider citing our paper:

@article{shin2023mitigating,
  title={Mitigating Source Bias for Fairer Weak Supervision},
  author={Shin, Changho and Cromp, Sonia and Adila, Dyah and Sala, Frederic},
  journal={arXiv preprint arXiv:2303.17713},
  year={2023}
}
