W4D

Official code release for the ICML 2026 paper "Test-Time Debiasing with Probabilistic Prompts via Wasserstein Distance in Vision-Language Models".

📰 News

[April 2026] W4D has been accepted to ICML 2026!

🌊 Method Overview

W4D is a lightweight test-time debiasing framework for vision-language models. Instead of correcting a query with a single debiased point, W4D models a distribution of prompt-induced query perturbations and aligns the resulting query distribution with group reference distributions using a Wasserstein-based objective.
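To give a feel for what a Wasserstein-based alignment term measures, here is a minimal sketch. It is not the objective implemented in w4d.py: it assumes both distributions are approximated by diagonal Gaussians fitted to samples, in which case the squared 2-Wasserstein distance has a closed form. The function name gaussian_w2_squared and the random stand-in data are hypothetical.

```python
import numpy as np


def gaussian_w2_squared(x, y):
    """Squared 2-Wasserstein distance between diagonal Gaussians fitted to x and y.

    x, y: arrays of shape (n_samples, dim). For diagonal covariances the distance
    reduces to ||mean_x - mean_y||^2 + ||std_x - std_y||^2.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_term = np.sum((x.mean(axis=0) - y.mean(axis=0)) ** 2)
    std_term = np.sum((x.std(axis=0) - y.std(axis=0)) ** 2)
    return float(mean_term + std_term)


rng = np.random.default_rng(0)
# Stand-in for a set of perturbed query embeddings (assumption: 8-dim toy data).
query_samples = rng.normal(size=(256, 8))
# Stand-in for a group reference distribution: same spread, mean shifted by 0.5.
group_reference = query_samples + 0.5

print(gaussian_w2_squared(query_samples, query_samples))
print(gaussian_w2_squared(query_samples, group_reference))
```

In this toy case only the means differ, so the second value is the squared mean shift, 8 × 0.5² = 2.0, up to floating-point error. A query distribution that sits equally close to every group reference would give similar distances across groups.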

🧩 Repository Structure

.
├── create_dataset.py          # Precompute CLIP image embeddings and 5-fold splits
├── envs.txt                   # Full environment snapshot from the original setup
├── experimental_configs/      # YAML configs for all released experiments
├── queries.py                 # Query class definitions
├── query_templates/           # Query / augmentation templates
├── requirements.txt           # Cleaned dependency list for this repository
├── runner.py                  # Convenience launcher for multi-fold experiments
├── w4d.py                     # Main W4D evaluation script
└── w4d_utils.py               # Metrics and helper utilities

⚙️ Environment Setup

We recommend Python 3.10+ and a separate Conda environment.

conda create -n w4d python=3.10 -y
conda activate w4d

pip install --upgrade pip
pip install -r requirements.txt

envs.txt is the full package dump from the original environment. requirements.txt is a cleaned dependency list for this repository.

If you use GPU acceleration, install the PyTorch build that matches your CUDA version. The original environment used torch 2.8.0 and torchvision 0.23.0.
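As a quick sanity check after installation, a sketch like the following could compare the installed package versions with those of the original environment. It only uses the standard library; the helper name installed_version is hypothetical, and CUDA builds may report a suffixed version string such as a local build tag.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Versions reported for the original environment in this README.
reference = {"torch": "2.8.0", "torchvision": "0.23.0"}

for name, expected in reference.items():
    found = installed_version(name)
    status = "not installed" if found is None else found
    print(f"{name}: found {status}, original environment used {expected}")
```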

📦 Supported Data / Tasks

This release supports the datasets and query settings used in the paper:

CelebA: hair-color queries, debiasing with respect to gender
FairFace: stereotype queries, debiasing with respect to gender or race
UTKFace: stereotype queries, debiasing with respect to gender or race
FACET: job queries and stereotype queries, debiasing with respect to gender or skin tone

Note: the config files named facet_job_race_*.yml run skin-tone debiasing, not race debiasing, because FACET provides skin-tone annotations rather than race labels.

🗂️ Data Preparation

This repository does not ship the raw datasets. After obtaining a dataset, organize your files locally and precompute CLIP image embeddings with create_dataset.py.

The script writes:

data/<dataset>_featurized_<clip-model>.jsonl
data/fold_indices/<dataset>_featurized_<clip-model>_folds.jsonl

Example for CelebA:

python create_dataset.py \
  --dataset_name celeba \
  --_MODEL_NAME clip-vit-base-patch16 \
  --data_path /path/to/celeba_root/ \
  --meta_data_file_name classification_label/CelebAMask-HQ-attribute-anno.txt

Example for FairFace:

python create_dataset.py \
  --dataset_name fairface \
  --_MODEL_NAME clip-vit-base-patch16 \
  --data_path /path/to/fairface_root/ \
  --meta_data_file_name fairface_label_train.csv
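After running one of the commands above, a small helper can tell you where to look for the outputs. This is a sketch under one assumption: that the <dataset> and <clip-model> placeholders in the output paths are filled with the values passed to --dataset_name and --_MODEL_NAME. The function name expected_outputs is hypothetical.

```python
from pathlib import Path


def expected_outputs(dataset, clip_model, data_dir="data"):
    """Build the two output paths following the naming pattern in this README."""
    stem = f"{dataset}_featurized_{clip_model}"
    root = Path(data_dir)
    features = root / f"{stem}.jsonl"
    folds = root / "fold_indices" / f"{stem}_folds.jsonl"
    return features, folds


features, folds = expected_outputs("celeba", "clip-vit-base-patch16")
for path in (features, folds):
    # Report whether each expected file is present in the working directory.
    state = "found" if path.exists() else "missing"
    print(f"{path.as_posix()}: {state}")
```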

create_dataset.py currently contains dataset-specific metadata parsers for:

celeba
fairface
utkface
facet

Make sure the columns in your metadata file match what the corresponding parser in the script expects.

🚀 Running W4D

Run one fold with a selected config:

python w4d.py \
  --enum 0 \
  --config experimental_configs/celeba_hair_gender_clip-vit-base-patch16.yml

Run the default 5-fold launcher:

python runner.py

To evaluate a different setting, either:

pass another YAML file to w4d.py, or
edit the configs list in runner.py
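If you prefer to script your own sweep, the per-fold command above can be generated for several configs. The sketch below is not the implementation of runner.py; it assumes that --enum takes a zero-based fold index from 0 to 4, and the function name build_commands is hypothetical. It only prints the commands, so nothing is launched until you opt in.

```python
import shlex
import sys


def build_commands(configs, n_folds=5, script="w4d.py"):
    """Return one argv list per (config, fold) pair."""
    return [
        [sys.executable, script, "--enum", str(fold), "--config", config]
        for config in configs
        for fold in range(n_folds)
    ]


configs = ["experimental_configs/celeba_hair_gender_clip-vit-base-patch16.yml"]

for argv in build_commands(configs):
    print(shlex.join(argv))
    # To actually launch a run, replace the print with:
    #   subprocess.run(argv, check=True)   (after `import subprocess`)
```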

🤝 Acknowledgement

Parts of this codebase were inspired by and adapted from the excellent BEND-VLM repository:

https://github.com/waltergerych/bend_vlm

We thank the authors for open-sourcing their implementation and supporting reproducible research in debiasing for vision-language models.

✍️ Citation

If you use this code, please cite the corresponding paper once the final bibliographic information is available.
