Repository for the Video Person Clustering Dataset (VPCD) and the code for the associated paper. This repository contains the dataset (below) and will contain the code (coming soon) from the associated paper, for the task of video person-clustering.
The dataset can be downloaded here. The tar.gz file contains the dataset and a README detailing its contents.
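As a minimal sketch of unpacking the download, the Python snippet below extracts a tar.gz archive and locates any README files inside it. The archive name `vpcd.tar.gz`, the output directory `vpcd`, and the helper names `extract_dataset` and `find_readmes` are assumptions made for illustration; the actual file name and internal layout are described in the README shipped with the dataset.

```python
import tarfile
from pathlib import Path


def extract_dataset(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        # Only extract archives obtained from a source you trust.
        tar.extractall(dest)
    return names


def find_readmes(root_dir):
    """Return all files under root_dir whose name starts with 'readme' (any case)."""
    return sorted(
        p for p in Path(root_dir).rglob("*")
        if p.is_file() and p.name.lower().startswith("readme")
    )


if __name__ == "__main__":
    # Hypothetical paths: replace with the real location of the downloaded archive.
    archive = Path("vpcd.tar.gz")
    if archive.exists():
        extract_dataset(archive, "vpcd")
        for readme in find_readmes("vpcd"):
            print(readme)
```

Reading the extracted README first is the safest way to learn the directory structure before writing any loading code.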
VPCD is built upon popular video datasets that are commonly used in the Computer Vision community (e.g. TBBT, Buffy, Friends, Sherlock, About Last Night, Hidden Figures).
The code to produce video person-clustering results: Coming soon...
Details of the raw resolution of the videos and the frame rates used in the dataset can be found in this document.
The currently available dataset does not exactly match the statistics quoted in the paper. A corrected version will be made available soon.
If you find VPCD or the code useful, please consider citing:
@misc{brown2021face,
title={Face, Body, Voice: Video Person-Clustering with Multiple Modalities},
author={Andrew Brown and Vicky Kalogeiton and Andrew Zisserman},
year={2021},
eprint={2105.09939},
archivePrefix={arXiv},
primaryClass={cs.CV}
}