FluxForge is a memory-efficient implementation of Flux/Schnell models with support for quantization, IP-Adapter, LoRA, and ControlNet (coming soon). Run high-quality image generation models with less than 17GB VRAM!
- ✨ Memory-efficient inference (< 17GB VRAM) with a Diffusers-based IP-Adapter implementation
- 🔧 Multiple model loading options (base, quantized, safetensors)
- 📉 8-bit quantization support
- 🎯 IP-Adapter support (experimental)
- 🔌 LoRA and ControlNet support (coming soon)
- 📚 Training script (coming soon)
```bash
# Install dependencies
pip install -r requirements.txt
```

Before running the inference code, you need to convert the model to FP8 format for optimal memory usage:

- First, run the conversion script:

  ```bash
  python convert.py --model_path path/to/flux_model --output_path path/to/flux_model/flux-fp8 --quantization_type qfloat8
  ```

- After conversion, update your config to use the quantized model in `fluxforge_main.py` or externally:
```python
from fluxforge_main import EnhancedFluxForge, ModelConfig, ModelType

# Basic configuration
config = ModelConfig(
    model_type=ModelType.FLUX,
    model_path="path/to/flux_model",
    enable_quantization=True,
    transformer_path="path/to/flux_model/flux-fp8",
    transformer_loading_mode="quantized"
)

# Initialize and run
forge = EnhancedFluxForge(config)
image = forge.generate(
    prompt="A majestic statue clutching a glowing 'FluxForge' sign, standing proudly at the heart of a bustling railway station, its presence commanding attention amidst the flow of travelers.",
    width=1024,
    height=1024
)
image.save('output.png')
```

The `ModelConfig` class supports various loading strategies:
```python
# 1. Load a pre-quantized model produced by convert.py (preferred method).
#    Loading quantized safetensors, or the base model with on-the-fly
#    quantization, is also supported.
config = ModelConfig(
    model_path="path/to/flux_model",
    transformer_path="path/to/flux_model/flux-fp8",
    transformer_loading_mode="quantized"
)
```

```python
from PIL import Image

from fluxforge_main import EnhancedFluxForge, ModelConfig, ModelType

# Configure for IP-Adapter
config = ModelConfig(
    model_type=ModelType.FLUX_IP,
    model_path="flux_model",
    enable_quantization=True,
    transformer_path="flux_model/flux-fp8",
    transformer_loading_mode="quantized",
    image_encoder_path="openai/clip-vit-large-patch14",
    ip_ckpt="flux_model/flux_ip_adapter/ip_adapter.safetensors"
)

# Initialize forge
forge = EnhancedFluxForge(config)

# Load reference image for IP-Adapter
image = Image.open("assets/example_images/statue.jpg")

# Generate with IP-Adapter
output = forge.generate(
    prompt="wearing glasses",
    width=1024,
    height=1024,
    guidance_scale=4.0,
    num_inference_steps=25,
    seed=123456789,
    image=image
)
output.save('luck5.png')
forge.memory_tracker.print_memory_stats("After Image Generation")
```

The IP-Adapter is still not functioning as expected; work is ongoing to align its behavior with the IP-Adapter in ComfyUI so that both produce similar results.

| Parameter | Description | Options |
|---|---|---|
| `model_type` | Model type | `ModelType.FLUX`, `ModelType.FLUX_IP` |
| `model_path` | Path to base model | Path string |
| `transformer_path` | Path to transformer | Path string (optional) |
| `dtype` | Model dtype | `torch.bfloat16` (default) |
| `enable_quantization` | Enable quantization | `True`/`False` |
| `quantization_type` | Type of quantization | `"qfloat8"`, `"qint8"` |
| `transformer_loading_mode` | Loading strategy | `"base"`, `"quantized"`, `"safetensors"` |
| `use_freeze` | Freeze model weights | `True`/`False` |
| `device_map` | Device mapping | `"auto"` (default) |
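As a rough sanity check on the memory budget, the weight footprint of Flux's ~12B-parameter transformer can be estimated from the bytes per weight. This is a back-of-the-envelope sketch that ignores activations, the text encoders, and the VAE, which add several more GB:

```python
# Back-of-the-envelope weight footprint for a ~12B-parameter transformer.
params = 12e9

bf16_gb = params * 2 / 1e9   # torch.bfloat16: 2 bytes per weight
int8_gb = params * 1 / 1e9   # qfloat8/qint8: 1 byte per weight

print(f"bf16 weights:  {bf16_gb:.0f} GB")   # 24 GB
print(f"8-bit weights: {int8_gb:.0f} GB")   # 12 GB
```

Quantizing the transformer roughly halves the largest single consumer of VRAM, which is what brings the full pipeline under the 17GB target.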
- Base Loading: Set `transformer_loading_mode="base"` and `enable_quantization=False`
- Quantization on Load: Set `transformer_loading_mode="base"` and `enable_quantization=True`
- Pre-quantized Model: Set `transformer_loading_mode="quantized"`
- Safetensors Loading: Set `transformer_loading_mode="safetensors"` with a `.safetensors` file
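The four strategies above map onto `ModelConfig` roughly as follows (a sketch using the `ModelConfig` class from `fluxforge_main`; all paths are placeholders):

```python
from fluxforge_main import ModelConfig

# Base loading: full-precision weights, no quantization
base = ModelConfig(
    model_path="path/to/flux_model",
    transformer_loading_mode="base",
    enable_quantization=False,
)

# Quantization on load: load full-precision weights, then quantize in memory
quant_on_load = ModelConfig(
    model_path="path/to/flux_model",
    transformer_loading_mode="base",
    enable_quantization=True,
    quantization_type="qfloat8",
)

# Pre-quantized model produced by convert.py (fastest to load)
pre_quantized = ModelConfig(
    model_path="path/to/flux_model",
    transformer_path="path/to/flux_model/flux-fp8",
    transformer_loading_mode="quantized",
)

# Safetensors loading: point transformer_path at a .safetensors file
from_safetensors = ModelConfig(
    model_path="path/to/flux_model",
    transformer_path="path/to/transformer.safetensors",
    transformer_loading_mode="safetensors",
)
```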
- LoRA support
- ControlNet integration
- Sub-16GB VRAM optimization
- Training scripts
- Full IP-Adapter integration in diffusers
- IP-Adapter is currently experimental
- Training scripts under development
- Some features may require slightly more VRAM than targeted
This repository uses Flux-dev, a non-commercial model by Black Forest Labs. By using this code, you agree to comply with the Flux-dev license agreement and terms of use. Please review the complete license terms at their official repository.
This project is licensed under the MIT License - see the LICENSE file for details.
For questions and support, please open an issue in the GitHub repository.
Note: This project is under active development. Features and memory requirements may change as we continue to optimize the implementation.
