DPCache accelerates diffusion models via prediction-based caching, using dynamic programming with a Path-Aware Cost Tensor (PACT) to optimize the caching schedule.
- Efficient Caching: Dynamic programming scheduling with Taylor expansion predictor
- Path-Aware Cost Tensor: 3D cost matrix for optimal step selection
- Easy Adaptation: Modular design with `CacheHelper` for quick integration
- Reference Examples: FLUX (text-to-image) and Wan2.1 (image-to-video) implementations
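To illustrate the scheduling idea, here is a hedged sketch of how a dynamic program can pick which denoising steps to compute fully and which to serve from cache. This is a simplification: it uses a 2D matrix `cost[i][j]` (the assumed calibrated error of predicting step `j` from the last fully computed step `i`), whereas the actual PACT is a 3D tensor; the function name `plan_schedule` and the cost model are illustrative, not the repo's implementation.

```python
import numpy as np

def plan_schedule(cost: np.ndarray, k: int) -> list:
    """Pick k full-compute steps (always including step 0) minimizing the
    summed prediction cost of all skipped steps. cost[i][j] = assumed error
    of predicting step j (> i) from the last full-compute step i."""
    T = cost.shape[0]
    INF = float("inf")
    # dp[j][m] = min cost of covering steps 0..j with m full computes,
    # where step j itself is computed fully.
    dp = np.full((T, k + 1), INF)
    parent = np.full((T, k + 1), -1, dtype=int)
    dp[0][1] = 0.0  # the first step is always computed fully
    for j in range(1, T):
        for m in range(2, k + 1):
            for i in range(j):  # previous full-compute step
                if dp[i][m - 1] == INF:
                    continue
                # cost of predicting the skipped steps i+1..j-1 from i
                seg = cost[i, i + 1 : j].sum()
                if dp[i][m - 1] + seg < dp[j][m]:
                    dp[j][m] = dp[i][m - 1] + seg
                    parent[j][m] = i
    # steps after the last full compute are predicted from it (the "tail")
    best_j, best = 0, INF
    for j in range(T):
        total = dp[j][k] + cost[j, j + 1 :].sum()
        if total < best:
            best, best_j = total, j
    # backtrack through parents to recover the chosen schedule
    schedule, j, m = [], best_j, k
    while j != -1:
        schedule.append(j)
        j, m = parent[j][m], m - 1
    return sorted(schedule)
```

With a cost that grows with the distance from the last full step, the optimal schedule spreads the full computes evenly, which matches the intuition that prediction error accumulates over consecutive cached steps.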
- [2025-03] 🚀 Released with support for FLUX (text-to-image) and Wan2.1 (image-to-video).
- [2025-02] 🎉 DPCache has been accepted to CVPR 2026!
| Model | Type | Reference Implementation |
|---|---|---|
| FLUX.1-dev | Text-to-Image | dpcache_flux.py |
| Wan2.1-I2V-14B-720P | Image-to-Video | dpcache_wan.py |
- Python 3.10+
- CUDA 12.0+ (recommended)
- Additional GPU memory is required during calibration
Install dependencies:
```shell
pip install -r requirements.txt
```

Inference with pre-calibrated cost matrix:

```shell
python dpcache_flux_infer.py --mode infer \
  --k 13 \
  --first_full_steps 3 \
  --dataset drawbench \
  --sample_size 100 \
  --output_path "flux_output"
```

Calibration (generate a custom cost matrix):
```shell
python dpcache_flux_infer.py --mode calibrate \
  --cali_prefix "flux_calibration" \
  --dataset drawbench \
  --sample_rule fix \
  --sample_size 10 \
  --output_path "calibration_results"
```

This generates a cost matrix file named `final_cost_matrix_flux_calibration.pkl`. Use it in inference via `--cost_matrix_path "final_cost_matrix_flux_calibration.pkl"`.
Inference with pre-calibrated cost matrix:
```shell
python dpcache_wan_infer.py --mode infer \
  --image_path test.jpg \
  --prompt "a blue car driving down a dirt road near train tracks" \
  --k 12
```

Disable cache (baseline):

```shell
python dpcache_wan_infer.py --mode infer --no_cache
```

DPCache can be integrated into other diffusion models. Follow these steps:
1. Import `CacheHelper` from `dpcache/cache_utils.py`.
2. Initialize the cache using `init_cache()` with your model's transformer blocks.
3. Wrap the forward pass with `CacheHelper` to manage the caching logic (customize the config as needed).
4. Run calibration to generate a cost matrix for your specific model/config.
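The wrapping step above can be sketched as follows. This is a minimal, hypothetical stand-in for the control flow only: the class and method names (`SimpleCacheHelper`, `forward_block`) are illustrative, so consult `dpcache/cache_utils.py` for the real `CacheHelper` API.

```python
# Hypothetical sketch of the cache-or-reuse control flow; not the repo's API.
class SimpleCacheHelper:
    def __init__(self, full_steps):
        self.full_steps = set(full_steps)  # steps that run the real blocks
        self.cache = {}                    # block_id -> last fully computed output

    def forward_block(self, block_id, step, block_fn, hidden):
        if step in self.full_steps:
            out = block_fn(hidden)         # full compute: refresh the cache
            self.cache[block_id] = out
            return out
        return self.cache[block_id]        # skipped step: reuse (or predict)
```

On full-compute steps the real block runs and its output is stored; on skipped steps the cached output is returned (or, in the real implementation, extrapolated by the predictor).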
By default, this repo uses Taylor expansion-based predictors (as in TaylorSeer), but any other predictor can be plugged in, as long as the same predictor is used consistently in both calibration and inference.
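As a hedged sketch of what such a predictor looks like in the spirit of TaylorSeer, a first-order variant extrapolates a block's output at a skipped step from its values at the two most recent fully computed steps; the function name and signature here are illustrative, not the repo's implementation.

```python
import numpy as np

def taylor_predict(f_prev, f_curr, t_prev, t_curr, t_target):
    """First-order Taylor extrapolation of a cached feature:
    f(t) ~ f_curr + f'(t_curr) * (t_target - t_curr), with the derivative
    approximated by a finite difference between the two cached outputs."""
    deriv = (f_curr - f_prev) / (t_curr - t_prev)
    return f_curr + deriv * (t_target - t_curr)
```

Higher-order variants keep more cached outputs and higher-order finite differences; whatever predictor is used, it must be the same one during calibration (so the cost tensor measures its actual errors) and during inference.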
Reference implementations:
- `dpcache_flux.py`: integration with FLUX transformer blocks
- `dpcache_wan.py`: integration with Wan2.1 transformer blocks
For adapting other diffusion models, please refer directly to the above files for how to:
- initialize the cache with `init_cache`
- manage caching logic via `CacheHelper` in the main transformer forward
- cache or reuse block outputs inside each transformer block
If you use this code, please cite:
```bibtex
@article{DPCache2026,
  title={Denoising as Path Planning: Training-Free Acceleration of Diffusion Models with DPCache},
  author={Cui, Bowen and Wang, Yuanbin and Xu, Huajiang and Chen, Biaolong and Zhang, Aixi and Jiang, Hao and Jin, Zhengzheng and Liu, Xu and Huang, Pipei},
  journal={arXiv preprint arXiv:2602.22654},
  year={2026}
}
```

This repository is built upon Diffusers. We thank the following projects for their great work and contributions:
- Models: FLUX by Black Forest Labs, Wan2.1 by Wan-AI
- Methods: TaylorSeer for Taylor expansion-based feature prediction
- Datasets: DrawBench, PartiPrompts, VBench for quality evaluation
