anzeameol/BiDPO


Compositional Text-to-Image Generation via Region-Aware Bimodal Direct Preference Optimization

1Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University  2Shanghai Collaborative Innovation Center of Intelligent Visual Computing 
* Equal contributions  Corresponding author 

News

  • 2026-03-23: Code and BiComp dataset released.
  • 2026-02-21: BiDPO accepted by CVPR 2026! Data and code will be released soon.

Quick Start

  1. Create environment:
conda create -n bidpo python=3.10 -y
conda activate bidpo
pip install -r requirements.txt
  2. Download the BiComp dataset:
hf download "anzeameol/BiComp" --repo-type "dataset" --local-dir "./datasets/BiComp"

or

bash ./scripts/download/download_BiComp.sh
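After either download command finishes, it can help to confirm that the target directory was actually populated before moving on. The following is a minimal sketch, not part of the repository: `check_dataset_dir` is a hypothetical helper name, and the only path it is meant to be used with here is the `./datasets/BiComp` directory from the `hf download` command above. It checks only that the directory exists and is non-empty; it makes no assumption about the dataset's internal layout.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with BiDPO): succeed only if the given
# directory exists and contains at least one entry.
check_dataset_dir() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir" >&2
    return 1
  fi
  # `ls -A` lists all entries except . and ..; empty output means an empty directory.
  if [ -z "$(ls -A "$dir")" ]; then
    echo "empty directory: $dir" >&2
    return 1
  fi
  echo "ok: $dir"
}

# Usage (path taken from the download command above):
#   check_dataset_dir ./datasets/BiComp
```

A non-zero return status means the download should be retried before running the notebook or any training script.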

Data Visualization

  • Data visualization notebook: ./notebooks/data_visualization.ipynb

Training

Download VisMin dataset and pretrained Stable Diffusion weights

  1. Download VisMin dataset:
bash ./scripts/download/download_vismin.sh
  2. Download pretrained Stable Diffusion weights:
bash ./scripts/download/download_stable_diffusion.sh

Train

Training scripts are located in ./scripts/train/.

For example, to train SDXL with BiDPO:

bash ./scripts/train/train_sdxl_bidpo_region.sh
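Training runs tend to be long, so keeping a copy of the console output is often useful. The sketch below is an optional convenience and an assumption on our part, not something the repository provides: `run_logged` is a hypothetical helper and the `./logs/` directory is an illustrative location. It mirrors a script's output to a log file with `tee` while still returning the script's own exit status rather than `tee`'s.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with BiDPO): run a script, copy its
# stdout and stderr to a log file, and return the script's exit status.
run_logged() {
  local script="$1" log="$2"
  mkdir -p "$(dirname "$log")"
  bash "$script" 2>&1 | tee "$log"
  # PIPESTATUS[0] is the exit status of the script, not of tee.
  return "${PIPESTATUS[0]}"
}

# Usage (the log path is an illustrative assumption):
#   run_logged ./scripts/train/train_sdxl_bidpo_region.sh ./logs/train_sdxl_bidpo_region.log
```

Returning the first element of `PIPESTATUS` means a failed training run is still reported as a failure to the calling shell, even though the pipeline ends in `tee`.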

Inference and Evaluation

Download BiDPO checkpoints:

bash ./scripts/download/download_checkpoints.sh

Inference and evaluation scripts for DPG-Bench, GenEval, and T2I-CompBench are provided in ./scripts/test/ for reference. To evaluate on one of these benchmarks, install that benchmark and follow its own evaluation instructions.

For example, to run inference on DPG-Bench:

bash ./scripts/test/infer_dpgbench.sh

Citation

@InProceedings{Liu_2026_CVPR,
  author    = {Liu, Zhuohan and Peng, Wujian and Chen, Yitong and Wu, Zuxuan},
  title     = {Compositional Text-to-Image Generation Via Region-aware Bimodal Direct Preference Optimization},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2026},
  pages     = {36604-36614}
}

About

[CVPR-26] Official repository of "Compositional Text-to-Image Generation via Region-Aware Bimodal Direct Preference Optimization"
