1National University of Singapore
2Singapore Management University
3Carnegie Mellon University
4Fudan University
✉Corresponding author
OSCBench is a benchmark for evaluating whether text-to-video models can generate correct and temporally consistent object state changes.
This repository currently contains prompt resources, action/object taxonomies, frame extraction code, an MLLM-based evaluation script, and a correlation analysis script for comparing automatic judgments with human ratings.
- Install the required dependencies:

  ```bash
  pip install openai opencv-python numpy scipy
  ```

- Set up your OpenAI API credentials in `mllm_eval.py`:

  ```python
  OPENAI_API_KEY = "YOUR_OPENAI_API_KEY"
  ```

Generate videos from OSCBench prompts using your target text-to-video model, then extract evenly sampled frames for automatic evaluation.
Use prompts from prompts.txt or one of the split files under prompt_splits/.
This script samples 20 evenly spaced frames from each video and saves them into one subfolder per video.
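The even-sampling step can be sketched as follows. This is a minimal version written for this README, not the script's exact code: the index helper is dependency-free, while the extraction loop assumes `opencv-python` is installed and MP4-style inputs that OpenCV can decode.

```python
import os


def sample_indices(total_frames: int, n: int = 20) -> list[int]:
    """Return n evenly spaced frame indices covering [0, total_frames - 1]."""
    if total_frames <= n:
        return list(range(total_frames))
    step = (total_frames - 1) / (n - 1)
    return [round(i * step) for i in range(n)]


def extract_frames(video_path: str, out_dir: str, n: int = 20) -> None:
    """Save n evenly spaced frames into one subfolder named after the video."""
    import cv2  # imported lazily so sample_indices stays dependency-free

    name = os.path.splitext(os.path.basename(video_path))[0]
    frame_dir = os.path.join(out_dir, name)
    os.makedirs(frame_dir, exist_ok=True)

    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    for k, idx in enumerate(sample_indices(total, n)):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if ok:
            cv2.imwrite(os.path.join(frame_dir, f"{k:02d}.jpg"), frame)
    cap.release()
```

For a 100-frame video, `sample_indices(100, 20)` starts at frame 0 and ends at frame 99, so the first and last moments of the state change are always covered.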
Usage:

```bash
python extract_frames.py
```

This script evaluates sampled frames using an MLLM through the OpenAI Responses API. It asks the model to inspect the frames chronologically and return evidence-backed scores for eight dimensions:
- 1a. Subject Alignment
- 1b. Manipulated Object Alignment
- 2a. Action Accuracy
- 3a. Object State Change Accuracy
- 3b. Object Change Continuity and Consistency
- 4a. Scene Alignment
- 5a. Realism
- 5b. Aesthetic
Among these dimensions, 3a and 3b directly measure the object state change ability emphasized by OSCBench.
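As a rough sketch, the sampled frames can be attached to a single Responses API request as base64 data URLs. The model name, prompt text, and file layout below are illustrative assumptions, not the exact settings used by `mllm_eval.py`:

```python
import base64
import glob
import os


def build_eval_input(frame_dir: str, prompt: str) -> list[dict]:
    """Build a Responses API `input` payload: one text part followed by the
    frames in chronological (filename-sorted) order as base64 input images."""
    content = [{"type": "input_text", "text": prompt}]
    for path in sorted(glob.glob(os.path.join(frame_dir, "*.jpg"))):
        with open(path, "rb") as f:
            b64 = base64.b64encode(f.read()).decode("ascii")
        content.append({
            "type": "input_image",
            "image_url": f"data:image/jpeg;base64,{b64}",
        })
    return [{"role": "user", "content": content}]


# The actual call would then look like (requires OPENAI_API_KEY):
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.responses.create(
#       model="gpt-4o",  # assumed model; substitute whatever mllm_eval.py uses
#       input=build_eval_input("frames/video_0001", "Score the eight dimensions..."),
#   )
#   print(resp.output_text)
```

Sending all frames in one request lets the model reason over the full temporal sequence, which is what the 3a/3b state-change dimensions require.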
Usage:

```bash
python mllm_eval.py
```

This script analyzes the correlation between MLLM-based evaluation and human evaluation, following the benchmark's automatic-evaluation validation setting described on the project page and in the paper.
For each evaluation dimension, it computes:
- Kendall's tau
- Spearman's rho
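Both statistics reduce to single `scipy.stats` calls; a minimal sketch (the score lists in the usage note are made-up placeholders, not benchmark data):

```python
from scipy.stats import kendalltau, spearmanr


def correlate(mllm_scores: list[float], human_scores: list[float]) -> dict:
    """Return Kendall's tau and Spearman's rho (with p-values) between
    automatic MLLM scores and human ratings for one dimension."""
    tau, tau_p = kendalltau(mllm_scores, human_scores)
    rho, rho_p = spearmanr(mllm_scores, human_scores)
    return {"kendall_tau": tau, "kendall_p": tau_p,
            "spearman_rho": rho, "spearman_p": rho_p}
```

Since both are rank correlations, identical orderings score 1.0 even when the scales differ: `correlate([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])` gives tau = rho = 1.0.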
Usage:

```bash
python result_analyze.py
```

If you find our work useful, please cite:
```bibtex
@article{han2026oscbench,
  title={OSCBench: Benchmarking Object State Change in Text-to-Video Generation},
  author={Han, Xianjing and Zhu, Bin and Hu, Shiqi and Li, Franklin Mingzhe and Carrington, Patrick and Zimmermann, Roger and Chen, Jingjing},
  journal={arXiv preprint arXiv:2603.11698},
  year={2026}
}
```