Train & run diffusion models 3x faster with 80% less VRAM
Optimized inference and fine-tuning framework for image & video diffusion models.
Train LoRAs on diffusion models with minimal code. Add your images, run the script, and export your trained LoRA.
from hypergen import model, dataset
m = model.load("stabilityai/stable-diffusion-xl-base-1.0")
ds = dataset.load("./my_images")
lora = m.train_lora(ds, steps=1000)
That's it! HyperGen handles optimization, memory management, and acceleration automatically.
Try HyperGen in the interactive Jupyter notebooks.
Notebooks include:
- Loading datasets from HuggingFace
- Training LoRAs with real diffusion models
- Generating images with trained models
- Side-by-side comparisons
pip install hypergen
git clone https://github.com/ntegrals/hypergen.git
cd hypergen
pip install -e .
Model Family | Model ID | Type |
---|---|---|
FLUX.1-dev | black-forest-labs/FLUX.1-dev | Image |
FLUX.1-schnell | black-forest-labs/FLUX.1-schnell | Image (Fast) |
SDXL | stabilityai/stable-diffusion-xl-base-1.0 | Image |
SDXL Turbo | stabilityai/sdxl-turbo | Image (Fast) |
SD 3 Medium | stabilityai/stable-diffusion-3-medium-diffusers | Image |
SD v1.5 | runwayml/stable-diffusion-v1-5 | Image |
Universal Support: HyperGen works with any diffusers-compatible model from HuggingFace.
from hypergen import model, dataset
# Load model and dataset
m = model.load("stabilityai/stable-diffusion-xl-base-1.0")
m.to("cuda")
ds = dataset.load("./my_images")
# Train LoRA
lora = m.train_lora(ds, steps=1000)
from hypergen import dataset
# Load images from a folder
ds = dataset.load("./my_training_images")
print(f"Loaded {len(ds)} images")
# Captions are supported: put a .txt file next to each image.
# my_images/
#   photo1.jpg
#   photo1.txt  <- "A beautiful sunset"
#   photo2.jpg
#   photo2.txt  <- "A mountain landscape"
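The caption convention above (a .txt file sharing the image's base name) can be sketched in plain Python. This is only an illustration of the folder layout, not HyperGen's actual loader: the function name `pair_images_with_captions` and the set of accepted image extensions are assumptions made for this example.

```python
from pathlib import Path

# Assumed set of image extensions; HyperGen's real list may differ.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def pair_images_with_captions(folder):
    """Return (image_path, caption) pairs; caption is None when no .txt exists."""
    pairs = []
    for image_path in sorted(Path(folder).iterdir()):
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # skip caption files and anything that is not an image
        caption_path = image_path.with_suffix(".txt")
        caption = None
        if caption_path.exists():
            caption = caption_path.read_text(encoding="utf-8").strip()
        pairs.append((image_path, caption))
    return pairs
```

Calling `pair_images_with_captions("./my_images")` on the layout shown above would yield one pair per image, with the caption text read from the matching .txt file.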
# Customize everything
lora = m.train_lora(
    ds,
    steps=2000,
    learning_rate=5e-5,
    rank=32,           # LoRA rank
    alpha=64,          # LoRA alpha
    batch_size=2,      # Or "auto"
    save_steps=500,    # Save checkpoints
    output_dir="./checkpoints"
)
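To give a feel for what `rank` and `alpha` control, here is a small worked calculation based on the standard LoRA formulation, where a frozen weight matrix receives a low-rank update scaled by alpha / rank. Whether HyperGen applies exactly this convention is an assumption (it is the default in PEFT, which the project builds on), and the 1280x1280 layer size is a hypothetical value chosen for illustration.

```python
def lora_scaling(rank, alpha):
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return alpha / rank


def lora_param_count(d_in, d_out, rank):
    """Trainable parameters added by one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


rank, alpha = 32, 64      # values from the train_lora call above
d_in = d_out = 1280       # hypothetical projection size, for illustration only

scale = lora_scaling(rank, alpha)
added = lora_param_count(d_in, d_out, rank)
full = d_in * d_out

print(f"scale={scale}, LoRA params={added}, full layer params={full}, ratio={added / full:.1%}")
```

Doubling `rank` doubles the number of trainable parameters per adapted layer, while `alpha` only changes how strongly the learned update is weighted.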
# Basic generation
image = m.generate("A cat holding a sign that says hello world")
# With options
images = m.generate(
    ["A sunset", "A mountain"],
    num_inference_steps=30,
    guidance_scale=7.5
)
HyperGen provides a production-ready API server with request queuing, similar to vLLM.
# Basic serving
hypergen serve stabilityai/stable-diffusion-xl-base-1.0
# With authentication
hypergen serve stabilityai/stable-diffusion-xl-base-1.0 --api-key token-abc123
# With LoRA
hypergen serve stabilityai/stable-diffusion-xl-base-1.0 --lora ./my_lora --api-key token-abc123
# Custom settings
hypergen serve black-forest-labs/FLUX.1-dev \
    --port 8000 \
    --dtype bfloat16 \
    --max-queue-size 100 \
    --max-batch-size 4
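The `--max-queue-size` and `--max-batch-size` flags suggest a bounded queue that is drained in batches. The sketch below shows one way such a queue could behave. It is a simplified, single-threaded illustration and not HyperGen's implementation: the class `BatchingQueue` and its methods are hypothetical names.

```python
from collections import deque


class BatchingQueue:
    """Hypothetical bounded FIFO queue that hands out requests in batches."""

    def __init__(self, max_queue_size=100, max_batch_size=4):
        self.max_queue_size = max_queue_size
        self.max_batch_size = max_batch_size
        self._pending = deque()

    def submit(self, prompt):
        """Enqueue a prompt; return False when the queue is already full."""
        if len(self._pending) >= self.max_queue_size:
            return False
        self._pending.append(prompt)
        return True

    def next_batch(self):
        """Pop up to max_batch_size prompts in arrival order."""
        size = min(self.max_batch_size, len(self._pending))
        return [self._pending.popleft() for _ in range(size)]


queue = BatchingQueue(max_queue_size=100, max_batch_size=4)
for i in range(6):
    queue.submit(f"prompt {i}")

first = queue.next_batch()   # the four oldest prompts
second = queue.next_batch()  # the remaining two
```

A real server would also need locking or an async queue, per-request result delivery, and a policy for how long to wait before running a partial batch.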
from openai import OpenAI
# Point to your HyperGen server
client = OpenAI(
    api_key="token-abc123",
    base_url="http://localhost:8000/v1"
)
# Generate images (OpenAI-compatible API)
response = client.images.generate(
    model="sdxl",
    prompt="A cat holding a sign that says hello world",
    n=2,
    size="1024x1024"
)
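For clients that cannot use the `openai` package, the same call can be expressed as a plain HTTP request. The sketch below builds that request with the standard library, without sending it. It assumes the server exposes the OpenAI image route `/images/generations` under the base URL and accepts the API key as a bearer token, which is what the OpenAI client above sends; the helper name `build_image_request` is made up for this example.

```python
import json
import urllib.request


def build_image_request(base_url, api_key, prompt, model="sdxl", n=1, size="1024x1024"):
    """Build (but do not send) a POST request in the OpenAI image-generation format."""
    payload = {"model": model, "prompt": prompt, "n": n, "size": size}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_image_request(
    "http://localhost:8000/v1",
    "token-abc123",
    "A cat holding a sign that says hello world",
    n=2,
)

# Sending it requires a running server:
# with urllib.request.urlopen(request) as resp:
#     body = json.load(resp)
```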
API Server Features:
- OpenAI-compatible drop-in replacement for image generation
- Automatic request batching and queuing
- Dynamic LoRA loading and switching
- Optional API key authentication
- Production-ready (FastAPI + uvicorn)
- Dead Simple API: Train LoRAs in 5 lines of code - simple for beginners, powerful for experts
- Universal Model Support: Works with FLUX, SDXL, SD3, and any diffusers-compatible model
- Optimized Performance: 3x faster training with 80% less VRAM
- Production Serving: OpenAI-compatible API server with request queuing
- Built on Best Practices: Leverages diffusers, PEFT, and PyTorch under the hood
Interactive Jupyter notebooks with complete tutorials in notebooks/:
- minimal_example.ipynb - 5-minute quickstart example
- train_lora_quickstart.ipynb - Complete end-to-end tutorial with HuggingFace dataset
Code samples in the examples/ directory:
- quickstart.py - Minimal 5-line training example
- complete_example.py - All features demonstrated
- serve_client.py - API client usage examples
- Model loading and management
- Dataset handling with caption support
- LoRA training implementation
- OpenAI-compatible API server
- Request queue management
- Training loop with noise prediction
- Gradient checkpointing
- Mixed precision training
- Flash Attention support
- Auto-configuration for optimal performance
- Request batching for inference
- Multi-GPU training support
- Multi-GPU serving
- Custom CUDA kernels
- Hot-swappable LoRAs
hypergen/
├── model/          # Model loading and management
├── dataset/        # Dataset handling with captions
├── training/       # LoRA training pipelines
├── serve/          # API server and queue management
├── inference/      # Inference optimizations
└── optimization/   # Performance improvements
Requirements: Python 3.10+
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
MIT License - see LICENSE for details.
Type | Links |
---|---|
Examples | View Examples Directory |
Issues | Report Issues |
Discussions | Join Discussions |
Note on Aura Voice: This repository previously hosted Aura Voice, an early tech demo showcasing AI voice capabilities. As the underlying technology evolved significantly beyond that initial demonstration, the demo is no longer representative of current capabilities and has been deprecated.
Thank you to everyone who supported and used Aura Voice! The original code remains accessible at commit 00c18d2 for reference.
HyperGen represents a new direction focused on optimized diffusion model training and serving.