Eddie0521/NarraStream-Bench

Jinzhuo Liu¹, Jiangning Zhang¹, Wencan Jiang¹, Yabiao Wang², Dingkang Liang³, Zhucun Xue¹, Ran Yi⁴, Yong Liu¹

¹Zhejiang University,    ²Tencent Youtu Lab,    ³Huazhong University of Science and Technology,
⁴Shanghai Jiao Tong University    Corresponding author

   

📷 Introduction

We introduce NarraStream-Bench, a benchmark for narrative streaming video generation. It features 324 multi-prompt scripts spanning six dimensions and a three-dimensional evaluation protocol that combines traditional metrics with multimodal large language model-based assessment. The benchmark is introduced together with IAMFlow.

✨ Highlights

1. Overview of NarraStream-Bench

[Figure: Overview of NarraStream-Bench]

2. Benchmark Comparison

Comparison of related long-video generation benchmarks.

Benchmark VQ TC IC Prompt Type Aggregation Strategy Year
VBench-Long × × Single Slow-Fast Avg. 2024
LV-Bench × Single VDE 2025
NarrLV × Single TNA-based QA 2025
NarraStream-Bench Multi Narrative-Aware 2026

🛠️ Installation

1. Install requirements

git clone git@github.com:Eddie0521/NarraStream-Bench.git
cd NarraStream-Bench
conda create -n NarraStream-Bench python=3.10
conda activate NarraStream-Bench

# Install a PyTorch build that matches your CUDA/runtime first.
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
pip install -r requirements.txt

2. Download checkpoints

Download the metric backbones and auxiliary weights:

bash scripts/download_weights.sh

By default, checkpoints are saved to ./pretrained and resolved by configs/paths.yaml. Expected checkpoints include CLIP, DINO, RAFT, AMT, VTSS, and LanguageBind video weights.

🔑 Usage

1. Prepare the API key

NarraStream-Bench uses API-backed MLLM/VLM metrics by default. Set your API key before running the full evaluation:

export SILICONFLOW_API_KEY=your_api_key
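
If you launch the evaluation from your own Python tooling, it can help to fail early when the key is missing. The following is a minimal sketch, not part of NarraStream-Bench: the helper name `require_api_key` is hypothetical, and treating the literal placeholder `your_api_key` as unset is an assumption made for convenience.

```python
import os


def require_api_key(name="SILICONFLOW_API_KEY", environ=os.environ):
    """Return the API key from the environment, or raise if it is unset.

    Hypothetical helper: NarraStream-Bench itself may read the key differently.
    """
    value = environ.get(name, "").strip()
    # Assumption: the README placeholder counts as "not configured".
    if not value or value == "your_api_key":
        raise RuntimeError(
            f"{name} is not set; export it before running the full evaluation."
        )
    return value
```

Calling `require_api_key()` before starting a long run surfaces a missing key immediately instead of partway through the API-backed metrics.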

2. Prepare the evaluation data

Prepare generated videos and prompts in the following structure:

your_dataset/
├── prompt.jsonl
└── video/
    ├── sample_0.mp4
    ├── sample_1.mp4
    └── ...

Each line in prompt.jsonl should contain one sample:

{"prompts": ["segment prompt 1", "segment prompt 2", "segment prompt 3"]}

The number of videos must match the number of prompt samples. If the videos are not named sample_0.mp4, sample_1.mp4, ..., NarraStream-Bench reads all supported video files in natural sort order, so sample_2.mp4 comes before sample_10.mp4.
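
The layout rules above can be checked before launching an evaluation. The sketch below is a standalone pre-flight check under stated assumptions, not the loader used by NarraStream-Bench: the function names are hypothetical, the set of supported extensions is not listed here (only `.mp4` is assumed), and the natural sort key is one common way to implement natural ordering.

```python
import json
import re
from pathlib import Path

# Assumption: only .mp4 is treated as a supported video extension here.
VIDEO_EXTS = {".mp4"}


def natural_key(name):
    """Sort key that compares digit runs numerically (sample_2 < sample_10)."""
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", name)]


def load_prompts(prompt_file):
    """Read prompt.jsonl and return one list of segment prompts per sample."""
    samples = []
    with open(prompt_file, encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            prompts = json.loads(line).get("prompts")
            if not isinstance(prompts, list) or not prompts:
                raise ValueError(
                    f"line {line_no}: 'prompts' must be a non-empty list"
                )
            samples.append(prompts)
    return samples


def check_dataset(root):
    """Pair each video with its prompt sample, in natural sort order."""
    root = Path(root)
    samples = load_prompts(root / "prompt.jsonl")
    videos = sorted(
        (p for p in (root / "video").iterdir()
         if p.suffix.lower() in VIDEO_EXTS),
        key=lambda p: natural_key(p.name),
    )
    if len(videos) != len(samples):
        raise ValueError(
            f"{len(videos)} videos but {len(samples)} prompt samples"
        )
    return list(zip(videos, samples))
```

`check_dataset("your_dataset")` returns (video path, prompts) pairs in the order the videos would be matched to lines of prompt.jsonl, which makes a misaligned pairing easy to spot.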

3. Run the command

bash scripts/run_narrastream_bench.sh \
  --run-name my_eval \
  --video-dir your_dataset/video \
  --prompts your_dataset/prompt.jsonl \
  --gpu-id 0
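
To evaluate several datasets from a script, the same command can be assembled programmatically. This is a small sketch that uses only the four flags shown above; the helper names `build_eval_command` and `format_command` are hypothetical, and any other options the script may accept are not covered.

```python
import shlex


def build_eval_command(run_name, video_dir, prompts, gpu_id=0):
    """Build the argument list for scripts/run_narrastream_bench.sh."""
    return [
        "bash", "scripts/run_narrastream_bench.sh",
        "--run-name", str(run_name),
        "--video-dir", str(video_dir),
        "--prompts", str(prompts),
        "--gpu-id", str(gpu_id),
    ]


def format_command(args):
    """Render the argument list as a shell-safe string for logging."""
    return shlex.join(args)
```

The list returned by `build_eval_command` can be passed to `subprocess.run(args, check=True)` from the repository root; passing a list rather than a string avoids quoting problems with paths that contain spaces.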

4. See the output

Results are saved under runs/<run-name>/ by default:

runs/<run-name>/
├── processed/
│   ├── eval_data.json
│   ├── .preprocess_signature
│   └── sample_*/
│       ├── seg_0.mp4
│       ├── seg_1.mp4
│       └── ...
└── results/
    ├── results_latest.json
    ├── results_YYYYMMDD_HHMMSS.json
    ├── steps/
    ├── raw_metrics/
    └── artifacts/

The main files to inspect are:

  • results_latest.json: latest resumable snapshot, updated after each metric.
  • results_YYYYMMDD_HHMMSS.json: final timestamped result file.
  • processed/eval_data.json: preprocessed segment metadata.
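
A short script can pick the right result file for a run. The sketch below assumes only the file names listed above; the JSON schema is not documented here, so it just loads the file and leaves interpretation to the caller. The helper names are hypothetical.

```python
import json
import re
from pathlib import Path

# Matches results_YYYYMMDD_HHMMSS.json, the final timestamped result file.
STAMPED = re.compile(r"results_\d{8}_\d{6}\.json")


def find_result_file(run_dir):
    """Prefer the newest timestamped file, else the resumable snapshot."""
    results = Path(run_dir) / "results"
    # YYYYMMDD_HHMMSS names sort chronologically as plain strings.
    stamped = sorted(
        p for p in results.glob("results_*.json") if STAMPED.fullmatch(p.name)
    )
    if stamped:
        return stamped[-1]
    latest = results / "results_latest.json"
    if latest.exists():
        return latest
    raise FileNotFoundError(f"no result files under {results}")


def load_results(run_dir):
    """Return (path, parsed JSON) for the chosen result file."""
    path = find_result_file(run_dir)
    with open(path, encoding="utf-8") as fh:
        return path, json.load(fh)
```

For example, `path, data = load_results("runs/my_eval")` followed by `print(path.name, sorted(data))` shows which file was chosen and its top-level keys, assuming the top-level JSON value is an object.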

🌟 Citation

Please leave us a star 🌟 and cite our paper if you find our work helpful.

@misc{liu2026advancingnarrativelongvideo,
      title={Advancing Narrative Long Video Generation via Training-Free Identity-Aware Memory}, 
      author={Jinzhuo Liu and Jiangning Zhang and Wencan Jiang and Yabiao Wang and Dingkang Liang and Zhucun Xue and Ran Yi and Yong Liu},
      year={2026},
      eprint={2605.18733},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2605.18733}, 
}

About

Benchmark introduced in "Advancing Narrative Long Video Generation via Training-Free Identity-Aware Memory"
