GenClaw: Code-Driven Agentic Image Generation

GenClaw explores code-driven agentic image generation: instead of only rewriting prompts, an image generation agent uses code as a controllable visual canvas before calling image generation models for final rendering.

The core idea is simple: think, sketch with code, then render.

Highlights

🎨 Code as a Visual Brush. The agent creates by writing executable visual sketches (SVG, HTML/CSS, Python, lightweight 3D code), turning object count, spatial layout, and text rendering into programs that can be run, verified, and debugged. Image synthesis shifts from implicit diffusion sampling to an explicit, reasoning-friendly process.

Draw as a Human Artist. We mirror the human creative loop (conceptualize → sketch → color → refine) and make every stage transparent: ideation, reference retrieval, drafting, and incremental rendering are all surfaced as artifacts that can be inspected, edited, and reverted. Generation becomes an iterative collaboration rather than one-shot black-box inference.

🔌 Agent Harness for Image Generation. We plug an LLM agent's proven planning, tool-use, and reflection abilities directly into image synthesis, so that creating images becomes a first-class capability inside the agent's toolbox rather than an isolated standalone model.
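
To make the "code as a visual canvas" idea concrete, here is a minimal illustrative sketch in Python. It is not GenClaw's implementation (the project code has not been released). The function names `sketch_objects` and `check_sketch`, the canvas size, and the "apples" label are all hypothetical. The sketch only shows how an SVG draft lets object count and layout be checked programmatically before any image model is called.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def sketch_objects(count, width=400, height=200, radius=20):
    """Hypothetical helper: draw `count` circles evenly spaced on one row."""
    step = width / (count + 1)
    circles = "".join(
        f'<circle cx="{step * (i + 1):.1f}" cy="{height / 2:.1f}" '
        f'r="{radius}" fill="tomato"/>'
        for i in range(count)
    )
    # Text is placed explicitly, so its content and position are inspectable.
    label = (
        f'<text x="{width / 2:.1f}" y="30" text-anchor="middle">'
        f"{count} apples</text>"
    )
    return (
        f'<svg xmlns="{SVG_NS}" width="{width}" height="{height}">'
        f"{circles}{label}</svg>"
    )


def check_sketch(svg_text, expected_count):
    """Hypothetical check: right number of circles, none overlapping."""
    root = ET.fromstring(svg_text)
    circles = root.findall(f"{{{SVG_NS}}}circle")
    xs = sorted(float(c.get("cx")) for c in circles)
    radius = float(circles[0].get("r")) if circles else 0.0
    # Neighbouring centres must be at least one diameter apart.
    no_overlap = all(b - a >= 2 * radius for a, b in zip(xs, xs[1:]))
    return len(circles) == expected_count and no_overlap


if __name__ == "__main__":
    draft = sketch_objects(5)
    print(check_sketch(draft, 5))
```

Under these assumptions, a draft that fails the check (wrong count, or circles packed too tightly) could be regenerated or edited as code, and only a passing draft would be handed to an image generation model for final rendering.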

Showcase

Visual Examples

Complex Scene Composition

Text Rendering and Poster Design

Physical Reasoning

Knowledge-Grounded Generation

Status

The technical report is available now. Code and demos are being prepared and will be released later.

Links

If you find this project interesting, please consider giving it a star and voting for the paper on Hugging Face.

Citation

If you find GenClaw useful, please consider citing our technical report:

@article{ye2026genclaw,
  title={GenClaw: Code-Driven Agentic Image Generation},
  author={Ye, Junyan and others},
  journal={arXiv preprint arXiv:2605.30248},
  year={2026}
}
