- 📝 Link to the paper
- 🤗 Link to Text-To-Pose model
- 🤗 Link to CLaPP (Contrastive Language-Pose Pretraining) model
- 🤗 Link to Pose Adapter model
- 🤗 Link to created COCO-2017 annotated dataset
This repository contains the code for the paper "From Text to Pose to Image: Improving Diffusion Model Control and Quality", published at the NeurIPS 2024 Workshop on Compositional Learning: Perspectives, Methods, and Paths Forward (link to workshop).
| Standard text-to-image generation | Ours: text-to-pose-to-image generation |
|---|---|
If you use this work, please cite the paper using the following BibTeX entry:
@misc{bonnet2024textposeimageimproving,
  title={From Text to Pose to Image: Improving Diffusion Model Control and Quality},
  author={Clément Bonnet and Ariel N. Lee and Franck Wertel and Antoine Tamano and Tanguy Cizain and Pablo Ducru},
  year={2024},
  eprint={2411.12872},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2411.12872},
}