Ctrl-U

Robust Conditional Image Generation via Uncertainty-aware Reward Modeling

Project Website  arXiv 

Authors: Guiyu Zhang*1,2, Huan-ang Gao*2, Zijian Jiang2, Hao Zhao†2, Zhedong Zheng†1

1 FST, University of Macau 2 AIR, Tsinghua University

News

[2025-2-19]: The code and models have been released 😊!

[2025-1-22]: Our Ctrl-U has been accepted by ICLR 2025 🎉 !

[2024-10-14]: We have released the technical report of Ctrl-U.

Getting Started

🛠️ Environments

git clone https://github.com/grenoble-zhang/Ctrl-U.git
cd Ctrl-U
conda create -n Ctrl-U python=3.10
conda activate Ctrl-U
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
pip3 install -r requirements.txt
pip3 install -U openmim
mim install mmengine
mim install "mmcv==2.1.0"
pip3 install "mmsegmentation>=1.0.0"
pip3 install mmdet
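
After the installation finishes, it can help to confirm that the pinned versions actually landed in the active environment. The sketch below is not part of the repository: `report_versions` is a hypothetical helper that asks pip for each package's installed version and prints `missing` when pip cannot find the package.

```shell
# Hypothetical helper (not in the Ctrl-U repo): print "<package> <version>"
# for each argument, or "<package> missing" if pip cannot find it.
report_versions() {
  local pkg ver
  for pkg in "$@"; do
    # `pip show` prints a "Version: x.y.z" line for installed packages.
    ver=$(python3 -m pip show "$pkg" 2>/dev/null | awk '/^Version:/ {print $2}' || true)
    echo "$pkg ${ver:-missing}"
  done
}

# Packages installed by the commands above.
report_versions torch torchvision torchaudio mmengine mmcv mmsegmentation mmdet
```

With the commands above, `torch` should report a 2.1.2 build and `mmcv` should report 2.1.0. To check that PyTorch sees a GPU, `python3 -c "import torch; print(torch.cuda.is_available())"` should print `True` on a machine with a working CUDA driver.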

🕹️ Data Preparation

All the organized data has been uploaded to Hugging Face and will be automatically downloaded during training or evaluation. You can preview it in advance using the following links to check the data samples and the disk space required.

| Task | Training Data 🤗 | Evaluation Data 🤗 |
|---|---|---|
| LineArt, Hed | Data, 1.14 TB | Data, 2.25 GB |
| Depth | Data, 1.22 TB | Data, 2.17 GB |
| Segmentation ADE20K | Data, 7.04 GB | Same path as training data |
| Segmentation COCOStuff | Data, 61.9 GB | Same path as training data |
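
Because the data is downloaded automatically, a run can fail partway through if the disk fills up. The sketch below checks free space first. It is not part of the repository: `has_free_gb` is a hypothetical helper, and it assumes the downloads land in the Hugging Face cache (by default under `~/.cache/huggingface`, or wherever `HF_HOME` points).

```shell
# Hypothetical helper (not in the Ctrl-U repo): succeed only if the filesystem
# that holds directory $1 has at least $2 GB available.
has_free_gb() {
  local dir=$1 need_gb=$2 free_kb
  # `df -Pk` prints POSIX-format output in 1024-byte blocks;
  # the fourth column of the second line is the available space.
  free_kb=$(df -Pk "$dir" 2>/dev/null | awk 'NR == 2 {print $4}' || true)
  [ -n "$free_kb" ] && [ "$free_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Assumption: downloads go to the Hugging Face cache, which defaults to
# ~/.cache/huggingface and can be moved with the HF_HOME variable.
cache_fs=${HF_HOME:-$HOME}
if has_free_gb "$cache_fs" 8; then
  echo "Enough room for the ADE20K data (7.04 GB)"
else
  echo "Less than 8 GB free on the filesystem holding $cache_fs"
fi
```

For the LineArt/Hed or Depth training data, the same check applies with a threshold of about 1200 GB and 1300 GB respectively.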

😉 Training

bash train/ctrlu_ade20k.sh
bash train/ctrlu_cocostuff.sh
bash train/ctrlu_depth.sh
bash train/ctrlu_hed.sh
bash train/ctrlu_lineart.sh

🧐 Evaluation

Please download the model weights and place each one in its corresponding subfolder of checkpoints:

| Model | HF weights |
|---|---|
| Segmentation_ade20k | model |
| Segmentation_cocostuff | model |
| Depth | model |
| Hed (SoftEdge) | model |
| LineArt | model |

Make sure the checkpoint directory layout matches the paths used in the evaluation scripts. You can then evaluate each model with:

bash eval/eval_ade20k.sh
bash eval/eval_cocostuff.sh
bash eval/eval_depth.sh
bash eval/eval_hed.sh
bash eval/eval_lineart.sh
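
A misplaced checkpoint is an easy mistake to make here, so a quick layout check before launching the scripts above may save a failed run. This is only a sketch: `missing_checkpoints` is a hypothetical helper, and the subfolder names passed to it are placeholders, since the real paths are whatever the eval scripts read.

```shell
# Hypothetical helper (not in the Ctrl-U repo): print every subfolder of $1
# (names given as the remaining arguments) that is missing or empty.
# Returns 0 only when each listed subfolder contains at least one entry.
missing_checkpoints() {
  local root=$1 name status=0
  shift
  for name in "$@"; do
    # `ls -A` prints nothing for an empty or nonexistent directory.
    if [ -z "$(ls -A "$root/$name" 2>/dev/null)" ]; then
      echo "$root/$name"
      status=1
    fi
  done
  return $status
}

# Placeholder folder names; replace them with the paths the eval scripts use.
missing_checkpoints checkpoints ade20k cocostuff depth hed lineart \
  || echo "Fill the folders listed above before running the eval scripts"
```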

Please refer to the code for evaluating CLIP-Score and FID.
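
For orientation, FID compares Inception feature statistics of real and generated images. The standard definition is sketched below; the symbols are the conventional ones rather than names taken from this repository, and the evaluation code remains the reference for how the scores are computed here.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Here $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ are the mean and covariance of the Inception features for real and generated images, and lower is better. CLIP-Score is built on the cosine similarity between the CLIP embeddings of an image and its text prompt, so higher is better; the scaling factor varies between implementations.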

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Acknowledgments

Our work is based on the following open-source projects. We sincerely thank the contributors for these great works!

Citation

If you find Ctrl-U useful in your research or applications, please consider giving us a star ⭐ or citing us using:

@article{zhang2024ctrl,
  title={Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling},
  author={Zhang, Guiyu and Gao, Huan-ang and Jiang, Zijian and Zhao, Hao and Zheng, Zhedong},
  journal={arXiv preprint arXiv:2410.11236},
  year={2024}
}
