ZiyuGuo99/Thinking-while-Generating

Thinking-while-Generating (TwiG)

Official repository for the paper "Thinking-while-Generating: Interleaving Textual Reasoning throughout Visual Generation".

[🌍 Project Page] [📖 Paper] [🤗 TwiG-50K Dataset]

💥 News

  • [2025.11.20] The paper “Thinking-while-Generating” is released on arXiv. 🚀

💭🎨 Thinking-while-Generating (TwiG)

Existing methods inject textual reasoning either before (pre-planning) or after (post-refinement) visual generation.
TwiG is the first framework to interleave textual reasoning throughout the entire visual synthesis process.

We weave textual thoughts directly into the unfolding canvas, providing on-the-fly semantic guidance and reflection during generation.

Interleaving textual reasoning throughout visual generation.

📌 Where is textual reasoning applied?

🚀 Framework

TwiG decouples generation into Scheduling (When to Think), Reasoning (What to Say), and Reflection (How to Refine).
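The three roles can be pictured as a toy loop. This is only an illustrative sketch, not the repository's implementation: `Canvas`, `schedule`, `reason`, `generate_chunk`, `reflect`, the fixed-interval schedule, and the string "chunks" are all hypothetical stand-ins chosen to show how the stages might interleave.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Canvas:
    """Toy stand-in for a partially generated image (hypothetical)."""
    chunks: List[str] = field(default_factory=list)
    thoughts: List[str] = field(default_factory=list)


def schedule(num_steps: int, think_every: int) -> List[int]:
    """Scheduling (when to think): assumed here to be a fixed interval."""
    return [s for s in range(num_steps) if s % think_every == 0]


def reason(prompt: str, canvas: Canvas, step: int) -> str:
    """Reasoning (what to say): a placeholder thought for the next chunk."""
    return f"step {step}: continue '{prompt}' after {len(canvas.chunks)} chunks"


def generate_chunk(thought: str, step: int) -> str:
    """Placeholder for generating the next visual chunk under a thought."""
    return f"chunk{step}"


def reflect(chunk: str, accept: Callable[[str], bool]) -> str:
    """Reflection (how to refine): keep the chunk, or mark it as revised."""
    return chunk if accept(chunk) else chunk + "-revised"


def thinking_while_generating(
    prompt: str,
    num_steps: int = 6,
    think_every: int = 2,
    accept: Callable[[str], bool] = lambda chunk: True,
) -> Canvas:
    canvas = Canvas()
    think_steps = set(schedule(num_steps, think_every))
    thought = ""
    for step in range(num_steps):
        if step in think_steps:  # a new thought only at scheduled steps
            thought = reason(prompt, canvas, step)
            canvas.thoughts.append(thought)
        chunk = generate_chunk(thought, step)
        canvas.chunks.append(reflect(chunk, accept))
    return canvas
```

Calling `thinking_while_generating("a red cube")` yields six chunks guided by three thoughts. In the real system each placeholder would be a model call, and the schedule need not be a fixed interval.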

💪 Get Started

Installation

Clone the repository:

git clone https://github.com/ZiyuGuo99/Thinking-while-Generating.git
cd Thinking-while-Generating

Create a conda environment:

conda create -n twig python=3.11
conda activate twig

Please follow the instructions here to install both PyTorch and TorchVision dependencies.

Install additional dependencies:

pip install -r requirements.txt
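
After installation, a quick check like the one below can confirm that PyTorch and TorchVision are importable in the `twig` environment. It is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical and not part of this repository.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_packages(["torch", "torchvision"])
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

If either package is reported missing, revisit the PyTorch and TorchVision installation step before running the evaluation.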

📊 Evaluation

  1. Download the Janus-Pro model weights from this link.

  2. Configure the model and output paths in twig.sh:

model_path=Janus-Pro-7B   # Local path to the pre-trained model weights
output_folder=twig        # Subfolder name for this evaluation task
base_output=./results     # Root directory for saving generated images and text results

  3. Run the following command:

bash twig.sh
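
Before launching a long run, it can help to sanity-check the values set in `twig.sh`. The sketch below parses simple `name=value  # comment` lines and reports where results would land. It assumes outputs are written to `base_output/output_folder`, which is inferred from the comments above and not confirmed against the script. `parse_shell_vars` and `output_dir` are hypothetical helpers, and the parser ignores shell quoting rules and variable expansion.

```python
from pathlib import Path


def parse_shell_vars(text):
    """Parse simple `name=value  # comment` lines into a dict (no expansion)."""
    values = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        if not name or " " in name:  # skip commands such as `python x.py --a=b`
            continue
        values[name] = value.strip().strip("\"'")
    return values


def output_dir(cfg):
    """Assumed results location: <base_output>/<output_folder>."""
    return Path(cfg["base_output"]) / cfg["output_folder"]


if __name__ == "__main__":
    cfg = parse_shell_vars(Path("twig.sh").read_text())
    print("model_path exists:", Path(cfg["model_path"]).exists())
    print("results directory:", output_dir(cfg))
```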

🖼️ Visualizations

1. Qualitative Comparison

2. Reflection Capacity

3. The Thinking Process

✔️ Citation

Please cite us if you find this project helpful:

@article{guo2026thinking,
  title={Thinking-while-Generating: Interleaving Textual Reasoning throughout Visual Generation},
  author={Guo, Ziyu and Zhang, Renrui and Li, Hongyu and Zhang, Manyuan and Chen, Xinyan and Wang, Sifan and Feng, Yan and Pei, Peng and Heng, Pheng-Ann},
  journal={arXiv:2511.16671},
  year={2025}
}

📜 Related Work

  • Image Generation with CoT: Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step.
  • T2I-R1: T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT.
  • DPO vs. GRPO: Delving into RL for Image Generation with CoT: A comprehensive comparison of DPO and GRPO.
