ParaDiffusion

Paragraph-to-Image Generation with Information-Enriched Diffusion Model

🎶 Updates

Mar. 24, 2024. The inference code has been released.
Nov. 28, 2023. ParaPrompts-400 and ParaImage-3k have been released.
Nov. 15, 2023. Repo initialization.

🐱 Abstract

ParaDiffusion is an information-enriched diffusion model for the paragraph-to-image generation task. It explores transferring the extensive semantic comprehension capabilities of large language models to image generation. At its core, it uses a large language model (e.g., Llama V2) to encode long-form text, followed by fine-tuning with LoRA to align the text and image feature spaces for the generation task. To facilitate training for long-text semantic alignment, we also propose ParaImage, a high-quality paragraph-image pair dataset.

🔧 Dependencies and Installation

Python >= 3.10 (Anaconda or Miniconda is recommended)
PyTorch >= 1.13.0+cu11.7
diffusers == 0.20.0

git clone https://github.com/weijiawu/ParaDiffusion
cd ParaDiffusion

conda create -n ParaDiffusion python=3.10
conda activate ParaDiffusion
pip install -r requirements.txt
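
After installation, it can help to confirm that the environment matches the versions listed above. The following is a minimal sketch, not part of the repository: the helper names `parse_version`, `find_problems`, and `installed_version` are hypothetical, and the script assumes the packages are published under the distribution names `torch` and `diffusers`.

```python
import sys
from importlib import metadata


def parse_version(text):
    """Parse the leading numeric parts: '1.13.0+cu117' -> (1, 13, 0)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'dev0' or '0rc1'
        parts.append(int(piece))
    return tuple(parts)


def find_problems(python_version, torch_version, diffusers_version):
    """Return a list of human-readable mismatches (empty if all is fine)."""
    problems = []
    if tuple(python_version[:2]) < (3, 10):
        problems.append("Python >= 3.10 is required")
    if torch_version is None or parse_version(torch_version) < (1, 13, 0):
        problems.append("PyTorch >= 1.13.0 is required")
    if diffusers_version is None or parse_version(diffusers_version) != (0, 20, 0):
        problems.append("diffusers == 0.20.0 is required")
    return problems


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    issues = find_problems(
        sys.version_info,
        installed_version("torch"),
        installed_version("diffusers"),
    )
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment matches the listed requirements.")
```

Run it inside the activated conda environment; any printed warning points at a requirement that is not met.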

⏬ Download Models

Download the pretrained ParaDiffusion model:

mkdir -p weight
cd weight

# download the ParaDiffusion weights to ./weight
git lfs install
git clone https://huggingface.co/weijiawu/ParaDiffusion

We provide two sets of UNet weights; choose the set that matches your testing or inference setup.
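
To see which weight sets were actually downloaded, one option is to list the folders under the cloned directory that contain checkpoint files. This is only a sketch under stated assumptions: the layout of the Hugging Face repository is not described here, so the path `weight/ParaDiffusion`, the file suffixes, and the helper name `find_weight_dirs` are all hypothetical.

```python
from pathlib import Path

# Assumed checkpoint suffixes; the real repository may use others.
WEIGHT_SUFFIXES = {".bin", ".safetensors", ".pt", ".ckpt"}


def find_weight_dirs(root):
    """Return sorted folder paths (relative to root) that hold weight files."""
    root = Path(root)
    if not root.is_dir():
        return []
    found = set()
    for path in root.rglob("*"):
        if path.is_file() and path.suffix in WEIGHT_SUFFIXES:
            found.add(path.parent.relative_to(root).as_posix())
    return sorted(found)


if __name__ == "__main__":
    # Assumed location after running the clone command from ./weight.
    for folder in find_weight_dirs("weight/ParaDiffusion"):
        print(folder)
```

Each printed folder is a candidate weight set; which one `demo.py` expects still has to be checked against the script itself.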

💻 Inference

python demo.py

✏️ Paragraph-Image Dataset: ParaImage-Small

The proposed ParaImage dataset mainly includes two parts:

(a) ParaImage-Big: high-quality images with generative captions, primarily used for paragraph-image alignment learning in Stage 2.

(b) ParaImage-Small: aesthetic images with manual long-form descriptions, primarily used for quality-tuning in Stage 3.

ParaImage-Small consists of a few thousand high-quality images thoughtfully selected from LAION-Aesthetics according to common principles in photography, then professionally annotated by skilled annotators.

ParaImage-Small can be downloaded from Google Drive.

✏️ New Prompts Eval: ParaPrompts-400

Existing test prompts focus on short text-to-image generation and neglect the evaluation of paragraph-to-image generation. We therefore introduce ParaPrompts, a new evaluation set of 400 long-text descriptions.

Previous prompt test sets mostly cover text alignment for prompts of 0-25 words, while our prompts extend to long-text alignment of 100 words or more.
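
The length ranges mentioned above can be checked on any prompt set with a simple word count. The sketch below is illustrative only: it assumes whitespace-separated words, works on an in-memory list of prompts (the file format of ParaPrompts-400 is not described here), and the names `bucket` and `summarize` are hypothetical.

```python
def word_count(prompt):
    """Count whitespace-separated words in a prompt."""
    return len(prompt.split())


def bucket(prompt):
    """Classify a prompt by the length ranges discussed above."""
    n = word_count(prompt)
    if n <= 25:
        return "short"      # range covered by earlier prompt sets
    if n >= 100:
        return "paragraph"  # range targeted by ParaPrompts
    return "medium"


def summarize(prompts):
    """Return how many prompts fall into each length bucket."""
    counts = {"short": 0, "medium": 0, "paragraph": 0}
    for prompt in prompts:
        counts[bucket(prompt)] += 1
    return counts


if __name__ == "__main__":
    examples = ["a cat sitting on a mat", "word " * 50, "word " * 120]
    print(summarize(examples))
```

A prompt set aimed at paragraph-to-image evaluation should place most of its entries in the "paragraph" bucket.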

📖 BibTeX

@misc{wu2023paradiffusion,
      title={Paragraph-to-Image Generation with Information-Enriched Diffusion Model},
      author={Weijia Wu and Zhuang Li and Yefei He and Mike Zheng Shou and Chunhua Shen and Lele Cheng and Yan Li and Tingting Gao and Di Zhang and Zhongyuan Wang},
      year={2023},
      eprint={2311.14284},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

🤗 Acknowledgements

Thanks to Diffusers for the wonderful work and codebase.
