- [2026-03-11] 🔥🔥🔥 We release the CourtSI dataset, benchmark, and the fine-tuned model! Check out the repository and the Hugging Face links above for more details.
TL;DR: We introduce CourtSI and CourtSI-Bench, the first large-scale dataset and benchmark dedicated to spatial intelligence in sports.
CourtSI contains 1M+ QA pairs built upon a holistic spatial taxonomy covering (i) Spatial Counting, (ii) Distance Measurement, (iii) Localization, and (iv) Relational Reasoning.
CourtSI-Bench is a high-quality benchmark consisting of 3,686 human-verified QA pairs, on which we evaluate 25 state-of-the-art proprietary and open-source VLMs.
Fine-tuning Qwen3-VL-8B on CourtSI yields a +23.5% absolute improvement on CourtSI-Bench.
CourtSI-Ext is an extended version of CourtSI-Bench that focuses on cross-sport generalization.
We leverage court geometry for semi-automatic sports scene reconstruction, which enables the generation of spatially grounded QA pairs at scale.
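To give a rough intuition for how court geometry can ground spatial questions, here is a minimal sketch, not the project's actual pipeline. It assumes the court is a flat plane with known dimensions and that four court corners have been annotated in the image. The function names (`estimate_homography`, `image_to_court`) and the pixel coordinates are hypothetical; the court size is the standard tennis doubles court (10.97 m x 23.77 m).

```python
import numpy as np

def estimate_homography(image_pts, court_pts):
    """Estimate the 3x3 homography mapping image pixels to court-plane
    coordinates with the Direct Linear Transform (needs >= 4 point pairs)."""
    rows = []
    for (u, v), (x, y) in zip(image_pts, court_pts):
        rows.append([-u, -v, -1, 0, 0, 0, x * u, x * v, x])
        rows.append([0, 0, 0, -u, -v, -1, y * u, y * v, y])
    A = np.asarray(rows, dtype=float)
    # The homography is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / np.linalg.norm(H)

def image_to_court(H, pixel):
    """Project an image pixel lying on the court surface to court coordinates (metres)."""
    u, v = pixel
    x, y, w = H @ np.array([u, v, 1.0])
    return np.array([x / w, y / w])

# Hypothetical pixel annotations of the four court corners (assumed, not from the dataset).
image_corners = [(200, 650), (1080, 650), (840, 250), (440, 250)]
# Matching court-plane coordinates in metres: near-left, near-right, far-right, far-left.
court_corners = [(0.0, 0.0), (10.97, 0.0), (10.97, 23.77), (0.0, 23.77)]

H = estimate_homography(image_corners, court_corners)

# Hypothetical foot positions of two players, in pixels.
player_a = image_to_court(H, (420, 600))
player_b = image_to_court(H, (700, 300))
distance_m = float(np.linalg.norm(player_a - player_b))
print(f"Estimated on-court distance: {distance_m:.2f} m")
```

Once image points on the court surface are mapped to metric coordinates like this, quantities such as distances or relative positions can be computed directly, which is the kind of ground truth a distance-measurement or localization question needs. Objects off the ground plane (for example, a ball in flight) would require additional 3D reasoning that this sketch does not cover.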
Please refer to the documentation in the protocol folder for detailed instructions.
If you find our work useful, please consider citing:
@misc{yang2026CourtSI,
title={Stepping VLMs onto the Court: Benchmarking Spatial Intelligence in Sports},
author={Yuchen Yang and Yuqing Shao and Duxiu Huang and Linfeng Dong and Yifei Liu and Suixin Tang and Xiang Zhou and Yuanyuan Gao and Wei Wang and Yue Zhou and Xue Yang and Yanfeng Wang and Xiao Sun and Zhihang Zhong},
year={2026},
eprint={2603.09896},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2603.09896},
}
