ClipShots is the first large-scale dataset for shot boundary detection. It is collected from YouTube and Weibo and covers more than 20 categories, including sports, TV shows, and animals. Previous shot boundary detection datasets, e.g. TRECVID and RAI, consist only of documentaries or talk shows in which the frames are relatively static. In contrast, we construct a database of short videos, many of which are home-made and pose additional challenges such as hand-held camera shake and large occlusions. The videos are varied and include movie spotlights, competition highlights, and family videos recorded on mobile phones. Each video is 1-20 minutes long. The gradual transitions in our database include dissolve, fade in/fade out, and slide in/slide out.
The database contains three sets of data: a training set, a testing set, and an 'only_gradual' set. The training set and the 'only_gradual' set are used for training, and the testing set is used for evaluation. For the 'only_gradual' set, we annotate the gradual transitions to compensate for the insufficient number of gradual transitions in the training set. In video_lists, there are three files that contain the video names of the three sets respectively. The evaluation script is in
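
As a rough illustration of how the three list files might be consumed, the sketch below groups video names into training and evaluation splits. It assumes each list file is plain text with one video name per line, which is not confirmed here. The function names `read_video_list` and `build_splits` are hypothetical, and the list file names are passed in by the caller because they are not stated above.

```python
from pathlib import Path


def read_video_list(list_path):
    """Return the video names in a list file, one per non-empty line.

    Assumption: the file is plain text with one video name per line.
    """
    text = Path(list_path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_splits(list_dir, train_file, gradual_file, test_file):
    """Group video names into training and evaluation splits.

    The training set and the 'only_gradual' set are both used for training;
    the testing set is used only for evaluation. The file names are supplied
    by the caller (hypothetical parameters, not the dataset's real names).
    """
    list_dir = Path(list_dir)
    return {
        "train": read_video_list(list_dir / train_file)
        + read_video_list(list_dir / gradual_file),
        "test": read_video_list(list_dir / test_file),
    }
```

Keeping the testing list in its own split makes it harder to mix evaluation videos into training by accident.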
We list some strong baselines here.
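|Method|Cut precision|Cut recall|Cut F1|Gradual precision|Gradual recall|Gradual F1|
|---|---|---|---|---|---|---|
|deepSBD (Alexnet-like, origin)|0.731|0.921|0.815|0.837|0.386|0.528|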