Description
Feature Summary
Support for FLUX.2, the latest Flux image generation and editing model
Detailed Description
FLUX.2 [dev] is a 32-billion-parameter rectified flow transformer capable of generating, editing, and combining images based on text instructions. It uses a Mistral VLM as its text encoder (I first thought it might be Pixtral, but it appears to be Mistral Small 3.2).
At first glance the architecture seems similar to Flux.1, with Double Blocks followed by Single Blocks.
However, there are significant architectural changes to the blocks themselves. I suspect it will need an implementation separate from Flux.1.
It uses a 32-channel VAE (or possibly 128 channels, going by the ComfyUI implementation? I'm confused here).
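One possible explanation for the 32 vs. 128 discrepancy, offered as an unverified guess: Flux.1 folds each 2x2 block of latent pixels into the channel dimension before the transformer, turning its 16-channel VAE latent into 64-channel tokens. If FLUX.2 does the same (an assumption, not checked against the reference code), a 32-channel VAE would show up as 128 channels at the transformer input. The helper names below are hypothetical and only illustrate the arithmetic:

```python
def packed_channels(latent_channels: int, patch_size: int = 2) -> int:
    """Channel count after folding each patch_size x patch_size block
    of latent pixels into the channel dimension."""
    return latent_channels * patch_size * patch_size


def packed_shape(channels: int, height: int, width: int, patch_size: int = 2):
    """Shape (C, H, W) of a latent after the same folding.

    Assumes height and width are divisible by patch_size.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("latent height and width must be divisible by patch_size")
    return (
        packed_channels(channels, patch_size),
        height // patch_size,
        width // patch_size,
    )


# Flux.1: 16-channel VAE latent, 2x2 packing.
print(packed_channels(16))

# FLUX.2, ASSUMING the same 2x2 packing applies to a 32-channel VAE latent.
print(packed_channels(32))
print(packed_shape(32, 64, 64))
```

If this guess holds, both numbers describe the same tensor at different stages: 32 channels coming out of the VAE and 128 channels per token going into the transformer.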
Additional context
Announcement: https://bfl.ai/blog/flux-2
Model weights: https://huggingface.co/black-forest-labs/FLUX.2-dev
Inference code (Reference): https://github.com/black-forest-labs/flux2
Inference code (ComfyUI): comfyanonymous/ComfyUI#10879
It also seems that a (smaller?) distilled version called FLUX.2 [klein] will be released soon.