Steady-Forcing: Balancing Spatial Persistence and Motion Continuity in Long-Horizon Nature Video Diffusion
Matiur Rahman Minar1, Seunghun Oh2, Ganghyeon Jeong2, Unsang Park1,2
1Department of Computer Science and Engineering, Sogang University 2Department of Artificial Intelligence, Sogang University
- 📝 Technical Report / Paper
- 🌐 Project Homepage
- 💻 Training & Inference Code
- 🤗 Pretrained Model: T2V-1.3B
Steady-Forcing produces long-horizon nature video rollouts from a fixed-camera view. It decouples spatial persistence from motion continuity via a structural dual-memory protocol. This enables stable backgrounds and sustained fluid motion.
TL;DR: We propose a dual-memory framework that balances stability and motion to sustain high background persistence and continuous fluid dynamics over multi-minute horizons for fixed-camera nature video generation.
- Requirements
- Installation
- Pretrained Checkpoints
- Inference
- Training
- Results
- Citation
- Acknowledgements
- NVIDIA GPU with at least 24 GB of memory (tested on an NVIDIA A100 with 80 GB of VRAM)
- Linux operating system
Other hardware may work but has not been tested.
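The memory requirement above can be checked before setup with a small pre-flight script. This is a minimal sketch and not part of the repository: the function names `vram_ok` and `check_gpus` and the 24000 MiB threshold are assumptions chosen for illustration (cards sold as 24 GB can report slightly less than 24576 MiB). Only the `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits` query is a standard command.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not shipped with Steady-Forcing.

# Assumed threshold: a little under 24 GiB, since some 24 GB cards
# report slightly less than 24576 MiB of total memory.
MIN_VRAM_MIB=24000

# Succeeds if the given total memory (in MiB) meets the threshold.
vram_ok() {
  [ "$1" -ge "$MIN_VRAM_MIB" ]
}

# Reads one MiB value per line on stdin and prints a verdict per GPU.
# Returns non-zero if any GPU is below the threshold.
check_gpus() {
  local idx=0 status=0 mib
  while read -r mib; do
    if vram_ok "$mib"; then
      echo "GPU $idx: ${mib} MiB, ok"
    else
      echo "GPU $idx: ${mib} MiB, below ${MIN_VRAM_MIB} MiB"
      status=1
    fi
    idx=$((idx + 1))
  done
  return "$status"
}

if command -v nvidia-smi >/dev/null 2>&1; then
  # memory.total with nounits prints a bare MiB number per GPU.
  nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits \
    | check_gpus || echo "At least one GPU is below the assumed threshold"
else
  echo "nvidia-smi not found; skipping GPU check"
fi
```

Passing this check only confirms the stated minimum; the authors report testing on an A100 with 80 GB, so smaller cards remain untested.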
Create a Python 3.10 environment, install dependencies, and download models:
    bash setup_env.sh
    hf download minar09/Steady-Forcing-T2V-1.3B --local-dir ./ckpt

Note: The training algorithm is data-free distillation; no video data is needed.
After downloading, organize the checkpoints and prompts as follows:
steady-forcing/
├── prompts/
└── ckpt/
    └── steady-forcing-t2v.pt
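To confirm the layout before running inference, a short check along these lines may help. It is a sketch under stated assumptions, not a script from the repository: `check_layout` is a hypothetical helper, and because the exact nesting of the checkpoint file is inferred from the tree above, it accepts `steady-forcing-t2v.pt` either at the project root or one level down (for example inside `ckpt/`).

```shell
#!/usr/bin/env bash
# Hypothetical layout check; not shipped with Steady-Forcing.

# Usage: check_layout [project_root]   (defaults to the current directory)
# Prints each missing item, or "layout ok" when everything is found.
check_layout() {
  local root="${1:-.}" missing=0

  if [ ! -d "$root/prompts" ]; then
    echo "missing: $root/prompts/"
    missing=1
  fi

  if [ ! -d "$root/ckpt" ]; then
    echo "missing: $root/ckpt/"
    missing=1
  fi

  # Accept the checkpoint at the root or one directory below it.
  if [ -z "$(find "$root" -maxdepth 2 -type f -name 'steady-forcing-t2v.pt' 2>/dev/null)" ]; then
    echo "missing: steady-forcing-t2v.pt"
    missing=1
  fi

  [ "$missing" -eq 0 ] && echo "layout ok"
  return "$missing"
}
```

Calling `check_layout ./steady-forcing` from the parent directory returns a non-zero status if anything is missing, so it can gate a later step in a wrapper script.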
Run inference with the provided script:
    bash inference.sh

The repository can also be used for training and evaluation.
    bash train.sh

This training recipe was completed in under 67 hours on 8 A100 GPUs.
Quantitative and qualitative results, including detailed comparisons and visualizations, are reported in the paper (arXiv preprint). Generated videos can be viewed on the project page.
If you use this codebase, please cite:
@article{minar2025steady,
title={Steady-Forcing: Balancing Spatial Persistence and Motion Continuity in Long-Horizon Nature Video Diffusion},
author={Minar, Matiur Rahman and Oh, Seunghun and Jeong, Ganghyeon and Park, Unsang},
journal={arXiv preprint arXiv:2606.7661673},
year={2026}
}

This project builds on the open-source Infinity-RoPE and Reward-Forcing implementations and draws on related work in long-horizon video diffusion, motion continuity, and spatial persistence. We thank the authors for their efforts.
