Jung Yi · Wooseok Jang · Paul Hyunbin Cho · Jisu Nam · Heeji Yoon · Seungryong Kim
KAIST AI
Deep Forcing is a training-free framework that enables long-video generation in autoregressive video diffusion models by combining Deep Sink and Participative Compression. Deep Forcing achieves more than 12× length extrapolation (5s → 60s+) without fine-tuning.
- Deep Sink maintains a substantially enlarged attention sink (~50% of the cache) and applies a temporal RoPE adjustment that keeps the sink tokens temporally coherent with the current frames.
- Participative Compression selectively prunes the cache: it computes attention scores from recent frames and retains only the top-C most contextually relevant tokens, evicting redundant and degraded ones.
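The top-C selection described above can be illustrated with a small, dependency-free sketch. This is not the code in this repo: the function name `compress_cache` and its parameters (`sink_size`, `recent_size`, `top_c`) are hypothetical, and it assumes that sink tokens and the most recent tokens are always kept while only the tokens in between are scored and pruned. Scoring by summed softmax attention from the recent queries is likewise an assumption, and the temporal RoPE adjustment is not modeled.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def compress_cache(keys, recent_queries, sink_size, recent_size, top_c):
    """Return the sorted indices of cached tokens to keep.

    keys           -- one key vector per cached token, oldest first
    recent_queries -- query vectors taken from the most recent frames
    sink_size      -- number of leading tokens always kept (the sink)
    recent_size    -- number of trailing tokens always kept
    top_c          -- number of in-between tokens that survive pruning
    """
    n = len(keys)
    sink = list(range(min(sink_size, n)))
    recent_start = max(len(sink), n - recent_size)
    recent = list(range(recent_start, n))
    candidates = list(range(len(sink), recent_start))
    if len(candidates) <= top_c:
        return sink + candidates + recent  # nothing to evict yet

    # Accumulate the attention each cached token receives from recent queries.
    scale = 1.0 / math.sqrt(len(keys[0]))
    scores = [0.0] * n
    for q in recent_queries:
        weights = _softmax([_dot(q, k) * scale for k in keys])
        for i, w in enumerate(weights):
            scores[i] += w

    # Keep the top-C candidates by score; everything else is evicted.
    kept = sorted(candidates, key=lambda i: scores[i], reverse=True)[:top_c]
    return sink + sorted(kept) + recent


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [5.0, 0.0], [0.0, 0.0],
            [4.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
    print(compress_cache(keys, [[1.0, 0.0]], sink_size=2, recent_size=2, top_c=2))
```

In the toy example, tokens 2 and 4 align most strongly with the recent query, so they are the two in-between tokens that survive alongside the sink (0, 1) and the recent tokens (6, 7).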
We tested this repo on the following setup:
- NVIDIA GPU with at least 24 GB of memory (RTX 3090, A6000, and H100 have been tested).
- Linux operating system.
- 64 GB RAM.
Other hardware setups may also work but have not been tested.
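To check the GPU memory requirement before installing, something like the following sketch may help. It is not part of this repo. It parses the output of `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits`; the helper names and the 24,000 MiB threshold are assumptions (cards sold as 24 GB report slightly different totals, so the threshold is deliberately loose).

```python
import subprocess

# Assumption: a loose threshold, since "24 GB" cards report about 24,500 MiB.
MIN_MEMORY_MIB = 24_000


def parse_gpu_memory(csv_text):
    """Parse 'name, memory_mib' lines into (name, memory_mib) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, _, memory = line.rpartition(",")
        gpus.append((name.strip(), int(memory.strip())))
    return gpus


def gpus_meeting_requirement(csv_text, min_mib=MIN_MEMORY_MIB):
    """Return the names of GPUs whose total memory is at least min_mib."""
    return [name for name, mib in parse_gpu_memory(csv_text) if mib >= min_mib]


if __name__ == "__main__":
    query = ["nvidia-smi", "--query-gpu=name,memory.total",
             "--format=csv,noheader,nounits"]
    try:
        out = subprocess.run(query, capture_output=True, text=True,
                             check=True).stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("nvidia-smi is not available on this machine.")
    else:
        print("GPUs meeting the memory requirement:",
              gpus_meeting_requirement(out))
```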
Create a conda environment and install dependencies:
conda create -n self_forcing python=3.10 -y
conda activate self_forcing
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
python setup.py develop
huggingface-cli download Wan-AI/Wan2.1-T2V-1.3B --local-dir-use-symlinks False --local-dir wan_models/Wan2.1-T2V-1.3B
huggingface-cli download gdhe17/Self-Forcing checkpoints/self_forcing_dmd.pt --local-dir .
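As a quick sanity check after the downloads, the sketch below verifies that the two paths used by the commands above exist. It is an illustrative helper, not a script shipped with this repo; the name `missing_assets` is hypothetical, and it only checks that the paths exist, not that the files are complete.

```python
from pathlib import Path

# Paths taken from the two download commands above.
REQUIRED_ASSETS = (
    "wan_models/Wan2.1-T2V-1.3B",
    "checkpoints/self_forcing_dmd.pt",
)


def missing_assets(repo_root="."):
    """Return the required paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in REQUIRED_ASSETS if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_assets()
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All model files found.")
```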
Note:
- Our model works better with long, detailed prompts because it was trained on such prompts. We recommend using a third-party LLM (such as GPT-4o) to extend your prompt before providing it to the model.
- demo.py does not currently support Deep Forcing. Stay tuned.
Example inference script (Deep Sink only):
bash DS_inference.sh
CUDA_VISIBLE_DEVICES=0 python inference.py \
--config_path configs/self_forcing_dmd/self_forcing_dmd_sink14.yaml \
--output_folder ./output/DS \
--checkpoint_path checkpoints/self_forcing_dmd.pt \
--data_path ./prompts/MovieGenVideoBench_txt/line_0010.txt \
--use_ema \
--is_ds_only 1
Note:
- Sink size 10–14 is recommended for Deep Sink–only inference (configs: self_forcing_dmd_sink10.yaml–self_forcing_dmd_sink14.yaml).
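To compare the recommended sink sizes, a small helper can generate one inference command per config. The sketch below is not part of this repo: `deep_sink_command` is a hypothetical name, the per-size output subfolder is an assumption made so runs do not overwrite each other, and it assumes a config file exists for every integer sink size from 10 to 14. The flags themselves are the ones used in the example above.

```python
import shlex


def deep_sink_command(sink_size, prompt_file, output_root="./output/DS"):
    """Build the argument list for a Deep Sink-only run with one sink size."""
    config = f"configs/self_forcing_dmd/self_forcing_dmd_sink{sink_size}.yaml"
    return [
        "python", "inference.py",
        "--config_path", config,
        # Assumption: one subfolder per sink size to keep outputs separate.
        "--output_folder", f"{output_root}/sink{sink_size}",
        "--checkpoint_path", "checkpoints/self_forcing_dmd.pt",
        "--data_path", prompt_file,
        "--use_ema",
        "--is_ds_only", "1",
    ]


if __name__ == "__main__":
    prompt = "./prompts/MovieGenVideoBench_txt/line_0010.txt"
    for size in range(10, 15):  # sink sizes 10 through 14
        print(shlex.join(deep_sink_command(size, prompt)))
```

The script only prints the commands; prefix each with `CUDA_VISIBLE_DEVICES=0` or pass the list to `subprocess.run` to execute them.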
Example inference script (Deep Sink + Participative Compression):
bash DS_PC_inference.sh
CUDA_VISIBLE_DEVICES=0 python inference.py \
--config_path configs/self_forcing_dmd/self_forcing_dmd_sink10.yaml \
--output_folder ./output/DS_PC \
--checkpoint_path checkpoints/self_forcing_dmd.pt \
--data_path ./prompts/MovieGenVideoBench_txt/line_0043.txt \
--use_ema
This codebase is built on top of the open-source implementation of Self Forcing by Xun Huang.
If you find this codebase useful for your research, please kindly cite our paper:
@article{yi2025deep,
title={Deep Forcing: Training-Free Long Video Generation with Deep Sink and Participative Compression},
author={Yi, Jung and Jang, Wooseok and Cho, Paul Hyunbin and Nam, Jisu and Yoon, Heeji and Kim, Seungryong},
journal={arXiv preprint arXiv:2512.05081},
year={2025}
}