QDRhhhh/W4D

W4D

Official code release for the ICML 2026 paper "Test-Time Debiasing with Probabilistic Prompts via Wasserstein Distance in Vision-Language Models".

📰 News

  • [April 2026] W4D has been accepted by ICML 2026!

🌊 Method Overview

W4D is a lightweight test-time debiasing framework for vision-language models. Instead of correcting a query with a single debiased point, W4D models a distribution of prompt-induced query perturbations and aligns the resulting query distribution with group reference distributions using a Wasserstein-based objective.
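To make the idea of comparing distributions rather than single points concrete, here is a minimal sketch of an empirical 1-D Wasserstein distance applied to similarity scores. It is not the paper's objective or the code in w4d.py: the random embeddings, the noise-based "perturbations", the group centroids, and the names `wasserstein_1d` and `normalize` are all assumptions made for illustration.

```python
import numpy as np


def wasserstein_1d(a: np.ndarray, b: np.ndarray) -> float:
    """Empirical 1-Wasserstein distance between two equal-sized 1-D samples.

    For equal sample sizes this reduces to the mean absolute difference
    between the sorted samples.
    """
    a, b = np.sort(a), np.sort(b)
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(a - b)))


def normalize(x: np.ndarray) -> np.ndarray:
    """Scale vectors to unit L2 norm along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


rng = np.random.default_rng(0)
dim = 8

# Stand-in for a text query embedding (random, for illustration only).
query = normalize(rng.normal(size=dim))

# Hypothetical prompt-induced perturbations: noisy copies of the query.
perturbed = normalize(query + 0.1 * rng.normal(size=(32, dim)))

# Stand-in reference image embeddings for two attribute groups.
group_a = normalize(rng.normal(size=(32, dim)))
group_b = normalize(rng.normal(size=(32, dim)))
centroid_a = normalize(group_a.mean(axis=0))
centroid_b = normalize(group_b.mean(axis=0))

# Cosine similarity of every perturbed query to each group centroid.
scores_a = perturbed @ centroid_a
scores_b = perturbed @ centroid_b

# A larger gap means the query distribution scores the two groups differently.
gap = wasserstein_1d(scores_a, scores_b)
print(f"W1 gap between group score distributions: {gap:.4f}")
```

A debiasing procedure could try to shrink a gap of this kind; how W4D actually defines and optimizes its objective is specified in the paper and in w4d.py.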

[Figure: W4D method overview]

🧩 Repository Structure

.
├── create_dataset.py          # Precompute CLIP image embeddings and 5-fold splits
├── envs.txt                   # Full environment snapshot from the original setup
├── experimental_configs/      # YAML configs for all released experiments
├── queries.py                 # Query class definitions
├── query_templates/           # Query / augmentation templates
├── runner.py                  # Convenience launcher for multi-fold experiments
├── w4d.py                     # Main W4D evaluation script
└── w4d_utils.py               # Metrics and helper utilities

⚙️ Environment Setup

We recommend Python 3.10+ and a separate Conda environment.

conda create -n w4d python=3.10 -y
conda activate w4d

pip install --upgrade pip
pip install -r requirements.txt

envs.txt is the full package dump from the original environment. requirements.txt is a cleaned dependency list for this repository.

If you use GPU acceleration, install the PyTorch build that matches your CUDA version. The original environment used torch 2.8.0 and torchvision 0.23.0.
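A quick way to compare your installation against the versions quoted above is to query package metadata from the standard library. This is a small sketch, not part of the repository; the helper names `installed_versions` and `report` are illustrative, and the comparison ignores local build suffixes such as `+cu128`.

```python
from importlib import metadata

# Versions reported for the original environment in this README.
REFERENCE_VERSIONS = {"torch": "2.8.0", "torchvision": "0.23.0"}


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def report(found, reference):
    """Build one human-readable status line per reference package."""
    lines = []
    for name, ref in reference.items():
        have = found.get(name)
        if have is None:
            status = "not installed"
        elif have.split("+")[0] == ref:  # drop local suffix, e.g. "+cu128"
            status = "matches"
        else:
            status = "differs"
        lines.append(f"{name}: installed={have}, reference={ref} ({status})")
    return lines


for line in report(installed_versions(REFERENCE_VERSIONS), REFERENCE_VERSIONS):
    print(line)
```

A "differs" status is not necessarily a problem; it only tells you that results were originally produced with another build.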

📦 Supported Data / Tasks

This release supports the datasets and query settings used in the paper:

  • CelebA: hair-color queries, debiasing with respect to gender
  • FairFace: stereotype queries, debiasing with respect to gender or race
  • UTKFace: stereotype queries, debiasing with respect to gender or race
  • FACET: job queries and stereotype queries, debiasing with respect to gender or skin tone

Note: the current config filenames facet_job_race_*.yml correspond to skin-tone debiasing because FACET provides skin-tone annotations rather than race labels.

🗂️ Data Preparation

This repository does not ship the raw datasets. After obtaining a dataset, organize your files locally and precompute CLIP image embeddings with create_dataset.py.

The script writes:

  • data/<dataset>_featurized_<clip-model>.jsonl
  • data/fold_indices/<dataset>_featurized_<clip-model>_folds.jsonl
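Since the record schema of these files is not documented here, it can help to look at the first record before running an experiment. The sketch below reads a JSON Lines file generically and summarizes its top-level fields. The helper names `peek_jsonl` and `describe` are illustrative, and the example path assumes the filename pattern above filled in with `celeba` and `clip-vit-base-patch16`.

```python
import json
from pathlib import Path


def peek_jsonl(path, n=1):
    """Return the first n parsed records of a JSON Lines file."""
    records = []
    with Path(path).open("r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:  # skip blank lines
                continue
            records.append(json.loads(line))
            if len(records) >= n:
                break
    return records


def describe(record):
    """Map each top-level field to its type, with a length for lists/strings."""
    if not isinstance(record, dict):
        return type(record).__name__
    summary = {}
    for key, value in record.items():
        if isinstance(value, (list, str)):
            summary[key] = f"{type(value).__name__}[{len(value)}]"
        else:
            summary[key] = type(value).__name__
    return summary


# Assumed path: the pattern above with dataset "celeba" and this CLIP model.
example = Path("data/celeba_featurized_clip-vit-base-patch16.jsonl")
if example.exists():
    for rec in peek_jsonl(example):
        print(describe(rec))
else:
    print(f"{example} not found; run create_dataset.py first")
```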

Example for CelebA:

python create_dataset.py \
  --dataset_name celeba \
  --_MODEL_NAME clip-vit-base-patch16 \
  --data_path /path/to/celeba_root/ \
  --meta_data_file_name classification_label/CelebAMask-HQ-attribute-anno.txt

Example for FairFace:

python create_dataset.py \
  --dataset_name fairface \
  --_MODEL_NAME clip-vit-base-patch16 \
  --data_path /path/to/fairface_root/ \
  --meta_data_file_name fairface_label_train.csv

create_dataset.py currently contains dataset-specific metadata parsers for:

  • celeba
  • fairface
  • utkface
  • facet

Make sure the columns in your metadata file match what the corresponding parser in create_dataset.py expects.
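One way to catch a column mismatch early is to check the header of a comma-separated metadata file before featurizing. This is a sketch under assumptions: the column names in the example are hypothetical, so read the actual expected names from the parser in create_dataset.py. It also does not cover non-CSV files such as the CelebA annotation .txt.

```python
import csv
import io


def missing_columns(csv_file, required):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # empty file -> empty header
    present = {name.strip() for name in header}
    return [col for col in required if col not in present]


# Hypothetical header and required columns, for illustration only.
sample = io.StringIO("file,age,gender,race\ntrain/1.jpg,50-59,Male,East Asian\n")
print(missing_columns(sample, ["file", "gender", "race"]))  # []

sample = io.StringIO("file,age\ntrain/1.jpg,50-59\n")
print(missing_columns(sample, ["file", "gender", "race"]))  # ['gender', 'race']
```

For a real file, pass an open file handle instead, for example `open(path, newline="", encoding="utf-8")`.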

🚀 Running W4D

Run one fold with a selected config:

python w4d.py \
  --enum 0 \
  --config experimental_configs/celeba_hair_gender_clip-vit-base-patch16.yml

Run the default 5-fold launcher:

python runner.py

To evaluate a different setting, either:

  • pass another YAML file to w4d.py, or
  • edit the configs list in runner.py
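If you prefer not to edit runner.py, the per-fold command shown above can be generated for every fold from a short script. This sketch only builds and prints the commands. It assumes that `--enum` is a zero-based fold index running from 0 to 4, which this README suggests but does not state, and the helper name `fold_commands` is illustrative.

```python
import shlex
import sys


def fold_commands(config, n_folds=5, script="w4d.py"):
    """Build one w4d.py command (as an argument list) per fold."""
    return [
        [sys.executable, script, "--enum", str(fold), "--config", config]
        for fold in range(n_folds)  # assumption: folds are numbered 0..n_folds-1
    ]


config = "experimental_configs/celeba_hair_gender_clip-vit-base-patch16.yml"
for cmd in fold_commands(config):
    print(shlex.join(cmd))
```

To execute the folds instead of printing them, pass each argument list to `subprocess.run(cmd, check=True)` from the repository root.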

🤝 Acknowledgement

Parts of this codebase were inspired by and adapted from the excellent BEND-VLM repository.

We thank the authors for open-sourcing their implementation and supporting reproducible research in debiasing for vision-language models.

✍️ Citation

If you use this code, please cite the corresponding paper once the final bibliographic information is available.

