This repository provides the implementation of MrFlow, a training-free staged sampling method for accelerating pretrained flow-matching text-to-image diffusion models.
MrFlow first samples a low-resolution image, upsamples the decoded result in pixel space with Real-ESRGAN, re-encodes the upsampled image, injects scheduler-consistent low-strength noise, and performs a short high-resolution refinement. The pipeline shifts most denoising cost from expensive high-resolution sampling to cheaper low-resolution sampling while preserving local detail quality.
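The stage order described above can be sketched as a plain function that chains six callables. This is only an illustration of the control flow: `mrflow_staged_sample` and every callable name (`sample_low`, `decode`, `upsample`, `encode`, `add_noise`, `refine`) are hypothetical placeholders, not functions from this repository, and the default `sigma_start=0.12` is taken from the 12plus1 setting.

```python
from typing import Any, Callable


def mrflow_staged_sample(
    prompt: str,
    *,
    sample_low: Callable[[str], Any],         # hypothetical: low-resolution sampler
    decode: Callable[[Any], Any],             # hypothetical: VAE decode to pixels
    upsample: Callable[[Any], Any],           # hypothetical: x2 pixel-space SR
    encode: Callable[[Any], Any],             # hypothetical: VAE encode to latents
    add_noise: Callable[[Any, float], Any],   # hypothetical: scheduler-consistent noising
    refine: Callable[[str, Any, float], Any], # hypothetical: short high-res refinement
    sigma_start: float = 0.12,
) -> Any:
    """Chain the stages in the order the pipeline description gives."""
    low_latents = sample_low(prompt)                    # 1. sample at low resolution
    low_image = decode(low_latents)                     # 2. decode to pixel space
    high_image = upsample(low_image)                    # 3. upsample in pixel space
    high_latents = encode(high_image)                   # 4. re-encode the upsampled image
    noisy_latents = add_noise(high_latents, sigma_start)  # 5. inject low-strength noise
    return refine(prompt, noisy_latents, sigma_start)   # 6. short high-resolution refinement
```

Passing the stages in as callables keeps the sketch independent of any particular backbone; in practice each one would wrap the corresponding Diffusers pipeline or Real-ESRGAN call.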
- Training-free deployment. No finetuning, learned upsampler, or model-specific retraining is required.
- No custom kernels. The implementation uses standard PyTorch, Diffusers pipelines, and scheduler controls.
- Strong aggressive-speed regime. MrFlow reaches more than 10x end-to-end speedup on Qwen-Image while preserving visual quality.
- Works with distilled models. The same pipeline can be combined with pretrained timestep-distilled models such as Pi-Flow and FLUX-schnell.
- Compact staged design. The implementation transfers across Qwen-Image, FLUX.1-dev, FLUX.2 Klein, and Z-Image families.
- [2026/07] 💡 We add a Practical Tips section and encourage everyone to share useful observations and takeaways with each other.
- [2026/07] 🌱 We add a community contribution area and welcome developers to share MrFlow ports, workflows, and experiments with each other.
- [2026/07] 📰 MrFlow is featured on Hugging Face Daily Papers.
- [2026/07] ⚡ We release the MrFlow ComfyUI plugin.
- [2026/07] 🔥 The MrFlow paper is available on arXiv, and the source code is released.
Create a Diffusers-compatible environment for the target backbone. The demos use:
- PyTorch
- Diffusers
- Transformers
- Real-ESRGAN
MrFlow uses Real-ESRGAN for x2 pixel-space super-resolution. Install Real-ESRGAN from the official project and download the x2 weights:
https://github.com/xinntao/Real-ESRGAN
The scripts contain placeholder checkpoint paths. Replace them with local paths to the pretrained text-to-image model and Real-ESRGAN x2 weights before running.
The repository root keeps only two minimal reference scripts plus the shared scheduler helper:
| Script | Model | Setting | Output |
|---|---|---|---|
| `qwen_image_mrflow.py` | Qwen-Image | MrFlow 12plus1 | `outputs/qwen_image_mrflow_12plus1/` |
| `flux1_mrflow.py` | FLUX.1-dev | MrFlow 12plus1 | `outputs/flux1_mrflow_12plus1/` |
Edit the checkpoint paths at the top of each script:

    MODEL = "/path/to/Qwen-Image"
    REALESRGAN_X2 = "/path/to/RealESRGAN_x2.pth"

Run:

    python qwen_image_mrflow.py
    python flux1_mrflow.py

Each script saves:

- stage1_low.png: low-resolution generated image.
- stage2_upscaled.png: Real-ESRGAN x2 upsampled image.
- stage3_refined.png: final high-resolution refined image.
| Setting | Low-resolution steps | Refinement steps | Direct sigma | Typical use |
|---|---|---|---|---|
| 12plus1 | 12 | 1 | 0.12 | Aggressive acceleration. |
| 20plus1 | 20 | 1 | 0.12 | Higher-quality operating point. |
The high-resolution refinement uses an explicit direct-sigma schedule. For example, 12plus1 denotes 12 low-resolution denoising steps followed by one high-resolution step from sigma=0.12 to 0.
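To make the direct-sigma step concrete, here is a small pure-Python sketch. It assumes the usual rectified-flow convention, in which a noised sample is `(1 - sigma) * x0 + sigma * eps` and the model predicts the velocity `eps - x0`; the repository's scheduler helper may differ in its details. The function names, and the linear spacing used when there is more than one refinement step, are illustrative assumptions.

```python
import random


def direct_sigmas(sigma_start: float, num_steps: int) -> list:
    """Return num_steps + 1 sigmas from sigma_start down to 0 (linear spacing assumed)."""
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    return [sigma_start * (1 - i / num_steps) for i in range(num_steps + 1)]


def inject_noise(x0, sigma, rng):
    """Noise a clean sample: x_sigma = (1 - sigma) * x0 + sigma * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_sigma = [(1 - sigma) * a + sigma * e for a, e in zip(x0, eps)]
    return x_sigma, eps


def euler_step(x, velocity, sigma, sigma_next):
    """One Euler update along the flow: x + (sigma_next - sigma) * velocity."""
    return [a + (sigma_next - sigma) * v for a, v in zip(x, velocity)]


if __name__ == "__main__":
    sigmas = direct_sigmas(0.12, 1)  # the 12plus1 refinement: [0.12, 0.0]
    x0 = [0.5, -1.0, 2.0]            # stand-in for re-encoded latents
    x_sigma, eps = inject_noise(x0, sigmas[0], random.Random(0))
    # With an exact velocity (eps - x0), a single step returns to x0.
    oracle_velocity = [e - a for e, a in zip(eps, x0)]
    print(euler_step(x_sigma, oracle_velocity, sigmas[0], sigmas[1]))
```

With an exact velocity, one Euler step from sigma to 0 recovers the clean sample. In a real run the velocity comes from the pretrained model, so the step only approximates this, which is why the injected noise is kept small.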
Representative end-to-end speedups:
| Backbone | Setting | End-to-end speedup |
|---|---|---|
| FLUX.1-dev | 12 + 1 | 8.25x |
| Qwen-Image | 12 + 1 | 10.3x |
| FLUX.2 Klein Base 9B | 12 + 1 | 8.79x |
| Z-Image-Turbo | 8 + 1 | 21.0x |
| Qwen-Image + Pi-Flow | 4 + 1 | up to 25x |
Speedups are measured end to end, including text encoding, VAE encode/decode, super-resolution, noise preparation, and diffusion forward passes.
Parameterized variants and additional model-family demos are available in examples/.
| Script | Backbone | Notes |
|---|---|---|
| `examples/flux1_mrflow.py` | FLUX.1-dev | Training-free MrFlow. |
| `examples/flux1_piflow_mrflow.py` | FLUX.1-dev + Pi-Flow | Combines MrFlow with distilled weights. |
| `examples/qwen_image_mrflow.py` | Qwen-Image | Training-free MrFlow. |
| `examples/qwen_image_piflow_mrflow.py` | Qwen-Image + Pi-Flow | Combines MrFlow with distilled weights. |
| `examples/flux2_mrflow.py` | FLUX.2 Klein | Base and non-base variants. |
| `examples/zimage_turbo_mrflow.py` | Z-Image-Turbo | Reduced-step model plus MrFlow refinement. |
Run all configured examples with:

    bash examples/run_examples.sh

See examples/README.md for command-line usage, FLUX.2 Klein presets, Z-Image-Turbo refinement defaults, and output filename conventions.
Pi-Flow examples are optional and require a separate local checkout of LakonLab. Set LAKONLAB_ROOT to that checkout before running the Pi-Flow scripts.
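A pre-flight check along the following lines can catch a missing or mistyped `LAKONLAB_ROOT` before a Pi-Flow script starts loading models. This is a hypothetical helper, not part of the repository, and it does not describe how the Pi-Flow scripts themselves read the variable.

```python
import os
from pathlib import Path


def resolve_lakonlab_root(env=None) -> Path:
    """Return the LakonLab checkout path from LAKONLAB_ROOT, or raise with a hint.

    Hypothetical helper: `env` defaults to os.environ and can be overridden for testing.
    """
    env = os.environ if env is None else env
    value = env.get("LAKONLAB_ROOT")
    if not value:
        raise RuntimeError("LAKONLAB_ROOT is not set; point it at a local LakonLab checkout.")
    root = Path(value).expanduser()
    if not root.is_dir():
        raise RuntimeError(f"LAKONLAB_ROOT does not point to a directory: {root}")
    return root
```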
Qwen-Image generation examples. With 12 low-resolution steps and one high-resolution refinement step, MrFlow produces diverse 1024-resolution samples on Qwen-Image while reaching above 10x end-to-end speedup.
Accuracy-efficiency trade-off. On FLUX.1-dev and Qwen-Image, MrFlow offers a flexible trade-off between generation quality and measured end-to-end speedup, and remains effective where other training-free strategies degrade sharply.
Runtime breakdown. For Qwen-Image 12plus1, measured end-to-end latency is 4.77s versus 49.32s for native 50-step inference. The main cost is shifted from high-resolution sampling to cheaper low-resolution sampling, while SR and VAE overhead remain small.
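The reported speedup follows directly from the two latencies quoted above. A quick arithmetic check, with illustrative variable names:

```python
native_latency_s = 49.32  # native 50-step inference on Qwen-Image
mrflow_latency_s = 4.77   # MrFlow 12plus1 on Qwen-Image

speedup = native_latency_s / mrflow_latency_s
print(f"{speedup:.2f}x")  # prints "10.34x"
```

This agrees with the 10.3x figure listed for Qwen-Image.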
The repository also includes ComfyUI-MrFlow/, a ComfyUI custom-node extension for Qwen-oriented MrFlow workflows. It provides helper nodes, editable workflow and API JSON examples, a reusable subgraph, and a model-link helper for split Qwen-Image bundles.
To use it, place or symlink ComfyUI-MrFlow/ into ComfyUI/custom_nodes/, restart ComfyUI, and open ComfyUI-MrFlow/examples/qwen_mrflow_workflow.json or load ComfyUI-MrFlow/subgraphs/qwen_mrflow.json.
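The symlink option can be scripted as in the sketch below. `link_custom_node` is a hypothetical helper, not something shipped with the plugin, and both paths are whatever your local checkout and ComfyUI install use. On Windows, creating symlinks may require elevated privileges or developer mode, in which case copying the folder is the simpler route.

```python
from pathlib import Path


def link_custom_node(plugin_dir, comfyui_root) -> Path:
    """Symlink a plugin folder into <comfyui_root>/custom_nodes and return the link path."""
    plugin = Path(plugin_dir).resolve()
    custom_nodes = Path(comfyui_root) / "custom_nodes"
    if not plugin.is_dir():
        raise FileNotFoundError(f"plugin folder not found: {plugin}")
    if not custom_nodes.is_dir():
        raise FileNotFoundError(f"custom_nodes folder not found: {custom_nodes}")
    link = custom_nodes / plugin.name
    if link.exists() or link.is_symlink():
        return link  # leave any existing entry untouched
    link.symlink_to(plugin, target_is_directory=True)
    return link
```

For example, `link_custom_node("ComfyUI-MrFlow", "/path/to/ComfyUI")` would create `/path/to/ComfyUI/custom_nodes/ComfyUI-MrFlow`; restart ComfyUI afterwards so the nodes are picked up.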
We have seen strong community interest in adapting MrFlow to additional model families, ComfyUI loaders, and local workflows. To make these efforts easier to share early, community contributions are collected in community/experimental/ before selected pieces are tested and polished. Contributions that are ready for broader reuse may later move to a sibling community/verified/ area, or be promoted into the main examples or plugin folders when they become part of the official workflow.
Pull requests are preferred because they are easier to review and track. If you are not familiar with GitHub PRs, it is also fine to open an issue, link your code or workflow, and tag the maintainers directly.
A few notes from our open-source release and community testing:
- After releasing the open-source version, we have found that keeping the high-resolution refinement to a single step while using a larger direct sigma, such as 0.16-0.20, can often improve visual quality, especially for generations that include text. If this matches your experience, feel free to share your feedback anytime.
- RealRebelAI appears to have found strong MrFlow results on Krea-2, one of the newer state-of-the-art models, and we encourage everyone to try MrFlow on more recent models and share what they discover.
- RealRebelAI also found that using 4x Foolhardy Remacri for 2048-resolution output can work well with MrFlow. We encourage everyone to try different super-resolution ratios or stronger super-resolution models, since this kind of exploration is very much in the spirit of MrFlow.
If you find MrFlow useful, please cite our paper:
@misc{zheng2026multiresolutionflowmatchingtrainingfree,
title={Multi-Resolution Flow Matching: Training-Free Diffusion Acceleration via Staged Sampling},
author={Xingyu Zheng and Xianglong Liu and Yifu Ding and Weilun Feng and Junqing Lin and Jinyang Guo and Haotong Qin},
year={2026},
eprint={2607.01642},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2607.01642},
}

This implementation builds on the Diffusers ecosystem and uses Real-ESRGAN for pixel-space super-resolution.




