ProactiveVideoQA: A Comprehensive Benchmark Evaluating Proactive Interactions in Video Large Language Models

Introduction

ProactiveVideoQA is the first comprehensive benchmark designed to evaluate a system's ability to engage in proactive interaction in multimodal dialogue settings. Unlike traditional turn-by-turn dialogue systems, a model in proactive interaction must decide when to respond during video playback, so both the timing and the textual content of each response are evaluated.

This repository hosts the dataset for ProactiveVideoQA, along with the evaluation code of our proposed temporal-aware metric, PAUC (Proactive Area Under the Curve).
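To make the idea of scoring timing and content together concrete, here is a toy sketch. It is not the PAUC definition; the official metric lives in pauc/ and is documented in pauc/README.md. The `Reply` class, the latency-tolerance sweep, and both function names are hypothetical and exist only for this illustration: a reply earns its content score only if it arrives within a tolerance window after the ground-truth time, and the score is averaged over a range of tolerances as a normalized area under the curve.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Reply:
    """Hypothetical container for one model reply (illustration only)."""
    time: float     # seconds into the video when the model responded
    quality: float  # content score in [0, 1], e.g. from a text-similarity judge


def timed_score(gt_time: float, reply: Reply, tolerance: float) -> float:
    """Credit the content score only if the reply lands within
    `tolerance` seconds after the ground-truth time; otherwise 0."""
    delay = reply.time - gt_time
    return reply.quality if 0.0 <= delay <= tolerance else 0.0


def area_under_tolerance_curve(
    gt_time: float, reply: Reply, tolerances: Sequence[float]
) -> float:
    """Trapezoidal area under the score-vs-tolerance curve, divided by the
    tolerance range so the result stays in [0, 1].
    `tolerances` must be sorted ascending and contain at least two values."""
    scores = [timed_score(gt_time, reply, tol) for tol in tolerances]
    area = 0.0
    for i in range(1, len(tolerances)):
        width = tolerances[i] - tolerances[i - 1]
        area += width * (scores[i] + scores[i - 1]) / 2.0
    return area / (tolerances[-1] - tolerances[0])


# The same answer text scores higher when it is delivered sooner.
prompt = area_under_tolerance_curve(10.0, Reply(10.0, 0.8), [0, 1, 2, 3, 4])
late = area_under_tolerance_curve(10.0, Reply(12.0, 0.8), [0, 1, 2, 3, 4])
```

In this toy setup a reply that comes before the ground-truth time receives no credit at any tolerance, and a late reply is penalized in proportion to how much of the tolerance range it misses.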

Dataset Statistics

ProactiveVideoQA contains 4 tasks:

Proactive web-video QA [WEB]: centering on general web-video understanding.
Proactive ego-centric video QA [EGO]: centering on first-person-view video comprehension, particularly relevant in robotics and daily assistant applications.
Proactive TV-series video QA [TV]: emphasizing dialogue and social relationship understanding with speech input.
Proactive video anomaly detection [VAD]: targeting surveillance video monitoring and alerting.

1377 videos from different sources
1427 different questions and 3510 ground-truth reply turns
Fully proactive questions and open-ended answers ✅

Project Structure

ProactiveVideoQA/  
├── data/       # ProactiveVideoQA test data (Download from [🤗Huggingface](https://huggingface.co/datasets/wangyueqian/ProactiveVideoQA))
├── pauc/       # PAUC scoring scripts

Dataset

Download the annotations and videos from 🤗Huggingface, and rename the folder to ./data.
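As one possible way to script this step, the sketch below uses `snapshot_download` from the `huggingface_hub` package. It assumes that package is installed, and that the dataset repository id is `wangyueqian/ProactiveVideoQA`, as taken from the Huggingface URL in this README. The helper names `download_kwargs` and `download_dataset` are hypothetical. Downloading straight into `./data` stands in for the manual rename.

```python
REPO_ID = "wangyueqian/ProactiveVideoQA"  # from the Huggingface URL in this README


def download_kwargs(local_dir: str = "./data") -> dict:
    """Build the arguments passed to snapshot_download (hypothetical helper)."""
    return {
        "repo_id": REPO_ID,
        "repo_type": "dataset",   # the repository is hosted under /datasets/
        "local_dir": local_dir,   # write files here rather than only to the cache
    }


def download_dataset(local_dir: str = "./data") -> str:
    """Download annotations and videos; returns the local folder path."""
    # Imported lazily so this module can be loaded without the dependency.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    return snapshot_download(**download_kwargs(local_dir))
```

Calling `download_dataset()` from the repository root should leave the files under `./data`. The download includes videos, so it may take a while and needs enough disk space.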

Evaluation code

See pauc/README.md.

Contact

If you find ProactiveVideoQA useful in your research, please cite our paper:

@misc{wang2025proactivevideoqacomprehensivebenchmarkevaluating,
      title={ProactiveVideoQA: A Comprehensive Benchmark Evaluating Proactive Interactions in Video Large Language Models}, 
      author={Yueqian Wang and Xiaojun Meng and Yifan Wang and Huishuai Zhang and Dongyan Zhao},
      year={2025},
      eprint={2507.09313},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2507.09313}, 
}
