* Corresponding author · † Project lead
While Test-Time Scaling (TTS) offers a promising way to enhance video generation without surging training costs, current test-time video generation methods based on diffusion models suffer from exorbitant candidate-exploration costs and a lack of temporal guidance. To address these structural bottlenecks, we propose shifting the focus to streaming video generation. We identify that its chunk-level synthesis and few denoising steps are intrinsically suited to TTS, significantly lowering computational overhead while enabling fine-grained temporal control. Driven by this insight, we introduce Stream-T1, a comprehensive TTS framework tailored exclusively for streaming video generation. Evaluated on comprehensive 5s and 30s video benchmarks, Stream-T1 significantly improves temporal consistency, motion smoothness, and frame-level visual quality.
- Stream‑Scaled Noise Propagation: actively refines the initial latent noise of the current chunk using historically proven, high-quality noise from previous chunks, effectively establishing temporal dependency and exploiting the historical Gaussian prior to guide the current generation;
- Stream‑Scaled Reward Pruning: comprehensively evaluates generated candidates to strike an optimal balance between local spatial aesthetics and global temporal coherence by integrating immediate short-term assessments with sliding-window-based long-term evaluations;
- Stream‑Scaled Memory Sinking: dynamically routes context evicted from the KV-cache into distinct updating pathways guided by reward feedback, ensuring that previously generated visual information effectively anchors and guides the subsequent video stream (see the sketch below).
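
The three components above amount to a chunk-level search loop. Below is a minimal, illustrative sketch of that loop covering noise propagation and reward pruning (memory sinking is omitted for brevity). All names here, such as `propagate_noise`, `prune_by_reward`, the mixing weight `alpha`, and the toy reward functions, are placeholders for exposition and are not the actual Stream-T1 API.

```python
# Minimal sketch of the chunk-level TTS loop described above.
# Every function/parameter name is an illustrative placeholder.
import torch

def propagate_noise(prev_best_noise, num_candidates, alpha=0.5):
    """Stream-Scaled Noise Propagation (sketch): mix the previous chunk's
    best initial noise with fresh Gaussian noise to seed each candidate."""
    fresh = torch.randn(num_candidates, *prev_best_noise.shape)
    return alpha * prev_best_noise.unsqueeze(0) + (1 - alpha) * fresh

def prune_by_reward(candidates, history, short_term_reward, long_term_reward,
                    window=3, lam=0.5):
    """Stream-Scaled Reward Pruning (sketch): combine an immediate per-chunk
    score with a sliding-window temporal score, then keep the best candidate."""
    recent = history[-window:]
    scores = torch.stack([
        short_term_reward(c) + lam * long_term_reward(recent + [c])
        for c in candidates
    ])
    best = int(scores.argmax())
    return candidates[best], best

# Hypothetical usage with a dummy one-step "generator" mapping noise -> chunk.
if __name__ == "__main__":
    chunk_shape = (16, 4, 8, 8)                  # frames x channels x H x W (latent)
    generate_chunk = lambda noise: noise          # stand-in for the streaming model
    short_r = lambda c: -c.abs().mean()           # stand-in frame-quality reward
    long_r = lambda cs: -torch.stack(cs).std()    # stand-in temporal-coherence reward

    history, best_noise = [], torch.randn(*chunk_shape)
    for step in range(4):                         # four streamed chunks
        noises = propagate_noise(best_noise, num_candidates=4)
        candidates = [generate_chunk(n) for n in noises]
        chunk, idx = prune_by_reward(candidates, history, short_r, long_r)
        best_noise = noises[idx]                  # carry the winning noise forward
        history.append(chunk)
```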
- Release the paper and project page.
- Release the inference code.
- Release test cases with our pretrained model, prompts, and reference image.
Inference is conducted on a single A800 GPU (80 GB VRAM).
git clone https://github.com/FrameX-AI/Stream-T1.git
cd Stream-T1
cd metrics
git clone https://github.com/KlingAIResearch/VideoAlign.git
All tests are conducted on Linux. To set up the environment, please run:
conda create -n StreamT1 python=3.10 -y
conda activate StreamT1
pip install -r requirements.txt
1. Base model checkpoints
huggingface-cli download Efficient-Large-Model/LongLive --local-dir longlive_models
huggingface-cli download Wan-AI/Wan2.1-T2V-1.3B --local-dir wan_models/Wan2.1-T2V-1.3B
2. Reward model checkpoints
huggingface-cli download MizzenAI/HPSv3 --local-dir metrics/models/hpsv3_model
huggingface-cli download KlingTeam/VideoReward --local-dir metrics/models/videoalign
bash stream_scaling.sh
If you find this work useful in your research, please cite:
@misc{tu2026streamt1testtimescalingstreaming,
title={Stream-T1: Test-Time Scaling for Streaming Video Generation},
author={Yijing Tu and Shaojin Wu and Mengqi Huang and Wenchuan Wang and Yuxin Wang and Chunxiao Liu and Zhendong Mao},
year={2026},
eprint={2605.04461},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2605.04461},
}
- LongLive: the codebase and algorithm we built upon. Thanks for their wonderful work.
- HPSv3 and VideoAlign: the reward models we use. Thanks for their wonderful work.
See LICENSE.
