⚡️Qwen-Image-Fast

🔥Qwen-Image 4.8x🎉 speedup with Hybrid Acceleration for low VRAM GPUs (< 48GiB)

⚙️Installation

pip3 install torch==2.9.0 # >= 2.7.1
pip3 install -U "cache-dit[all]" # >= 1.0.9
pip3 install git+https://github.com/huggingface/diffusers.git # latest main
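After installing, it can be handy to confirm that the environment meets the minimum versions noted in the comments above (torch >= 2.7.1, cache-dit >= 1.0.9). The following is a minimal sketch, not part of this repo: the helper names `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical, and the version parsing is deliberately simple (it only compares the leading numeric components).

```python
from importlib import metadata

# Minimum versions taken from the install comments above.
MINIMUMS = {"torch": "2.7.1", "cache-dit": "1.0.9"}


def version_tuple(version: str) -> tuple:
    """Turn '2.9.0+cu128' or '1.0.9rc1' into a comparable tuple of ints."""
    core = version.split("+")[0]  # drop local build tags such as '+cu128'
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component (e.g. 'dev2025')
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_environment(minimums=MINIMUMS) -> dict:
    """Map each package name to 'ok', 'missing', or a 'too old' message."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

diffusers is not checked here because it is installed from the latest main branch rather than pinned to a release.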

📚Examples

We have released a Hybrid Acceleration example (📚qwen_image_fast.py) for Qwen-Image in this repo. It reaches a 4.8x🎉 speedup by combining Hybrid Cache Acceleration, Context Parallelism, FP8 Weight Only quantization, and Torch Compile. Feel free to give it a try. For example:

# Baseline (NVIDIA L20 48GiB, ~120s w/ Model CPU Offload)
python3 qwen_image_fast.py --height 1024 --width 1024

# + (DBCache + TaylorSeer) 
# + Context Parallelism (Ulysses)
# + FP8 Weight Only (offload is no longer required)
# + Torch Compile (NVIDIA L20x2, ~25s, ~4.8x speedup)
torchrun --nproc_per_node=2 qwen_image_fast.py \
         --height 1024 --width 1024 \
         --parallel-type ulysses --quantize \
         --cache --Fn 1 --rdt 0.12 --mcc 2 --taylorseer \
         --compile
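To make the flags in the command above easier to read, here is a hypothetical sketch of how such a command line could be parsed with `argparse`. It is not the parser from qwen_image_fast.py: only the flag names are taken from the command above, while the defaults, types, and the `enabled_stages` helper are assumptions made for illustration. The comments on `--Fn`, `--rdt`, and `--mcc` reflect what these abbreviations presumably stand for in cache-dit.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: flag names mirror the command above;
    # defaults and types are assumptions, not the script's real values.
    parser = argparse.ArgumentParser(description="Qwen-Image hybrid acceleration (sketch)")
    parser.add_argument("--height", type=int, default=1024)
    parser.add_argument("--width", type=int, default=1024)
    parser.add_argument("--parallel-type", type=str, default=None)  # e.g. ulysses
    parser.add_argument("--quantize", action="store_true")  # FP8 weight only
    parser.add_argument("--cache", action="store_true")  # enable DBCache
    parser.add_argument("--Fn", type=int, default=1)  # presumably: first-n compute blocks
    parser.add_argument("--rdt", type=float, default=0.12)  # presumably: residual diff threshold
    parser.add_argument("--mcc", type=int, default=2)  # presumably: max continuous cached steps
    parser.add_argument("--taylorseer", action="store_true")
    parser.add_argument("--compile", action="store_true")
    return parser


def enabled_stages(args: argparse.Namespace) -> list:
    """List the acceleration stages a given set of flags would switch on."""
    stages = []
    if args.cache:
        stages.append("DBCache + TaylorSeer" if args.taylorseer else "DBCache")
    if args.parallel_type:
        stages.append(f"Context Parallelism ({args.parallel_type})")
    if args.quantize:
        stages.append("FP8 Weight Only")
    if args.compile:
        stages.append("Torch Compile")
    return stages


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(enabled_stages(args) or ["Baseline"])
```

Running this sketch with no flags corresponds to the baseline command, and each extra flag adds one stage of the hybrid setup.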

| 🤖Baseline w/o Acceleration | 🎉w/ Hybrid Acceleration |
|:---:|:---:|
| ~120s, 60+ GiB per GPU | ~25s, ~4.8x speedup, 36 GiB per GPU |
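The speedup figure follows directly from the two timings reported above. A small sketch of the arithmetic, where the function name `speedup` is just an illustrative helper and the 60 GiB baseline is treated as a lower bound because the source reports "60+":

```python
def speedup(baseline_seconds: float, accelerated_seconds: float) -> float:
    """Ratio of baseline latency to accelerated latency."""
    if accelerated_seconds <= 0:
        raise ValueError("accelerated_seconds must be positive")
    return baseline_seconds / accelerated_seconds


if __name__ == "__main__":
    # Timings reported above: ~120s baseline, ~25s with hybrid acceleration.
    print(f"speedup: {speedup(120, 25):.1f}x")  # 120 / 25 = 4.8

    # Memory reported above: 60+ GiB vs 36 GiB per GPU.
    # Using 60 as the baseline gives a lower bound on the reduction.
    reduction = 1 - 36 / 60
    print(f"memory reduction: at least {reduction:.0%}")
```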

©️Acknowledgements

This repo is based on cache-dit and diffusers. Many thanks to these awesome open-source projects.

xlite-dev/qwen-image-fast

Folders and files

Latest commit

History

Repository files navigation

⚡️Qwen-Image-Fast

⚙️Installation

📚Examples

©️Acknowledgements

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages