abutbul/imgen

Enhanced Sprite Generation Pipeline with ControlNet

A Python script for generating consistent 4-directional sprite views using Stable Diffusion models with optional ControlNet support for improved pose consistency.

✨ New Features

  • 🎮 ControlNet Integration: Use pose control for better consistency across directional views
  • 🎭 Pose Templates: Predefined poses for each direction (north, south, east, west)
  • 🔧 Configurable Strength: Adjust ControlNet influence with --controlnet-strength
  • 📐 Automatic Pose Generation: Creates appropriate pose controls for each viewing angle

Features

  • 🎯 Generate 4-directional sprite views (north, south, east, west)
  • 🎮 NEW: ControlNet support for pose consistency
  • ⚡ Light mode for faster generation with --light flag
  • 🛑 Graceful interruption handling (Ctrl+C)
  • 🎨 Support for both SD v1.5 and SDXL models
  • 📁 Organized output with hashed folder names

Installation

Quick Setup

python setup.py

Manual Installation

pip install -r requirements.txt
python pose_templates.py  # Generate pose templates

Usage

Basic usage:

python pipeline.py "translucent jelly bear"

With ControlNet for better consistency:

python pipeline.py "translucent jelly bear" --controlnet

Fast generation with ControlNet:

python pipeline.py "translucent jelly bear" --light --controlnet

Adjust ControlNet influence:

python pipeline.py "translucent jelly bear" --controlnet --controlnet-strength 0.6

Command line options:

  • --light: Enable light mode (fewer inference steps, lower resolution for SDXL)
  • --models: Choose which models to use (v15, xl, or both)
  • --controlnet: Enable ControlNet for pose consistency
  • --controlnet-strength: Adjust ControlNet conditioning strength (0.0-1.0, default: 0.8)
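As a rough illustration of how the options above fit together, here is a minimal argparse sketch. It is not taken from pipeline.py: the function names build_parser and strength_value, the help strings, and the default of "both" for --models are assumptions made for this example. Only the flag names, the 0.0-1.0 range, and the 0.8 default come from the list above.

```python
import argparse


def strength_value(text: str) -> float:
    """Parse a ControlNet strength and reject values outside 0.0-1.0."""
    number = float(text)
    if not 0.0 <= number <= 1.0:
        raise argparse.ArgumentTypeError("strength must be between 0.0 and 1.0")
    return number


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented command line options."""
    parser = argparse.ArgumentParser(description="Generate 4-directional sprite views")
    parser.add_argument("prompt", help="text description of the character")
    parser.add_argument("--light", action="store_true",
                        help="fewer inference steps, lower SDXL resolution")
    # The real default for --models is not documented; "both" is an assumption.
    parser.add_argument("--models", choices=["v15", "xl", "both"], default="both",
                        help="which models to use")
    parser.add_argument("--controlnet", action="store_true",
                        help="enable ControlNet for pose consistency")
    parser.add_argument("--controlnet-strength", type=strength_value, default=0.8,
                        help="ControlNet conditioning strength (0.0-1.0)")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Validating the range in the type function means an out-of-range value is rejected at parse time with a usage message, before any model is loaded.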

ControlNet Features

How it Works

  1. Pose Templates: The system uses a predefined pose skeleton for each direction:

    • North: Front-facing pose
    • South: Back-facing pose
    • East: Right-facing profile
    • West: Left-facing profile

  2. Consistency: ControlNet keeps the generated character's body proportions and pose similar across all directional views.

  3. Flexibility: Adjust the --controlnet-strength parameter to balance pose consistency against creative freedom.
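The direction-to-pose mapping and the strength setting described above can be sketched as a small lookup. This is an illustration only: POSE_VIEWS and controlnet_settings are hypothetical names, and the README does not say how pipeline.py passes the strength along. The key controlnet_conditioning_scale is the argument name that Hugging Face diffusers ControlNet pipelines accept; using it here is an assumption about the script.

```python
# Direction-to-pose mapping as listed in the README.
POSE_VIEWS = {
    "north": "front-facing pose",
    "south": "back-facing pose",
    "east": "right-facing profile",
    "west": "left-facing profile",
}


def controlnet_settings(direction: str, strength: float = 0.8) -> dict:
    """Return the pose view and conditioning strength for one direction.

    Hypothetical helper: the real script may organise this differently.
    """
    if direction not in POSE_VIEWS:
        raise ValueError(f"unknown direction: {direction!r}")
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    return {
        "pose_view": POSE_VIEWS[direction],
        # Lower values give the model more freedom; higher values follow the pose more closely.
        "controlnet_conditioning_scale": strength,
    }


if __name__ == "__main__":
    for name in POSE_VIEWS:
        print(name, controlnet_settings(name, strength=0.6))
```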

Benefits

  • More consistent character proportions across views
  • Better alignment for sprite sheet usage
  • Improved anatomical accuracy
  • Professional-quality sprite sets

Light Mode Optimizations

When using the --light flag:

  • Reduces inference steps from 25 to 12
  • Lowers guidance scale from 7.5 to 5.5
  • Reduces SDXL resolution from 1024x1024 to 768x768
  • Enables memory optimizations (attention slicing, CPU offload)
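The values listed above can be summarised as a settings object with a light-mode override. This is a sketch, not the script's actual code: GenerationSettings and apply_light_mode are hypothetical names, and it assumes the memory optimizations are off outside light mode, which the README does not state.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GenerationSettings:
    """Default (non-light) values as documented in the README."""
    steps: int = 25
    guidance_scale: float = 7.5
    sdxl_size: int = 1024               # SDXL images are sdxl_size x sdxl_size
    memory_optimizations: bool = False  # assumption: off unless --light is set


def apply_light_mode(settings: GenerationSettings) -> GenerationSettings:
    """Return a copy of the settings with the documented light-mode values."""
    return replace(
        settings,
        steps=12,
        guidance_scale=5.5,
        sdxl_size=768,
        memory_optimizations=True,  # attention slicing, CPU offload
    )


if __name__ == "__main__":
    print(apply_light_mode(GenerationSettings()))
```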

Interruption Handling

You can safely interrupt the generation process with Ctrl+C. The script will:

  • Finish generating the current image
  • Save any completed images
  • Exit gracefully
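One common way to get the behaviour described above is a SIGINT handler that only sets a flag, which the generation loop checks between images. The sketch below shows that pattern under stated assumptions: StopFlag, generate_all, and install are hypothetical names, and render stands in for whatever function produces and saves one image. It is not necessarily how pipeline.py implements it.

```python
import signal


class StopFlag:
    """Records that Ctrl+C was pressed without aborting the current image."""

    def __init__(self) -> None:
        self.requested = False

    def handle(self, signum, frame) -> None:
        # Called by Python when SIGINT arrives; only sets the flag.
        self.requested = True


def generate_all(directions, render, stop: StopFlag) -> list:
    """Render each direction in turn, stopping between images if requested."""
    completed = []
    for direction in directions:
        if stop.requested:
            break  # skip the remaining directions, keep what is done
        # An image in progress always finishes before the flag is checked again.
        completed.append(render(direction))
    return completed


def install(stop: StopFlag) -> None:
    """Route Ctrl+C to the flag (must be called from the main thread)."""
    signal.signal(signal.SIGINT, stop.handle)
```

Because the handler never raises, an image that is mid-generation completes and is saved before the loop exits.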

Output Structure

Images are saved in a folder named: {prompt}_{hash}/

  • v1.5 images: v15_{prompt}_{direction}.png
  • SDXL images: xl-base_{prompt}_{direction}.png
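To make the naming scheme concrete, the sketch below builds the output paths for one prompt. The README does not say how the prompt is turned into a file-safe name or which hash is used, so the underscore slug, the 8-character SHA-256 prefix, and the names slugify and output_paths are all assumptions. Only the {prompt}_{hash}/ folder pattern and the v15_ and xl-base_ prefixes come from the text above.

```python
import hashlib
from pathlib import PurePosixPath

MODEL_PREFIXES = {"v15": "v15", "xl": "xl-base"}  # prefixes from the README
DIRECTIONS = ("north", "south", "east", "west")


def slugify(prompt: str) -> str:
    """Assumed slug rule: lowercase words joined by underscores."""
    return "_".join(prompt.lower().split())


def output_paths(prompt: str, model: str) -> list:
    """Return the four image paths for one prompt and one model key."""
    slug = slugify(prompt)
    # Assumed hash: first 8 hex characters of SHA-256 over the prompt.
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:8]
    folder = PurePosixPath(f"{slug}_{digest}")
    prefix = MODEL_PREFIXES[model]
    return [folder / f"{prefix}_{slug}_{direction}.png" for direction in DIRECTIONS]


if __name__ == "__main__":
    for path in output_paths("translucent jelly bear", "xl"):
        print(path)
```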
