YTO dataset annotations in PASCAL VOC format

YouTube-Objects v2.3

Vicky Kalogeiton, Vittorio Ferrari, and Cordelia Schmid


The dataset is composed of videos collected from YouTube for 10 moving object classes of the PASCAL VOC Challenge. This release provides annotations in PASCAL VOC 2007 format for the same 7,000 bounding boxes that were annotated in YTO v2.2.

In the training set, we annotated one instance per frame, while in the test set we annotated all instances of the desired object class.
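Because the annotations follow the PASCAL VOC 2007 XML layout, they can be read with a standard XML parser. The snippet below is a minimal sketch, not part of the devkit: the sample annotation (file name, class and coordinates) and the helper `parse_voc_annotation` are invented for illustration, and it assumes the usual VOC fields (`filename`, `object/name`, `object/bndbox`).

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the PASCAL VOC 2007 layout. The file name,
# class and coordinates are made up and do not come from the dataset.
SAMPLE_XML = """<annotation>
  <filename>example.jpg</filename>
  <size><width>640</width><height>360</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>30</ymin><xmax>320</xmax><ymax>300</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_annotation(xml_text):
    """Return (filename, [(class_name, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    boxes = []
    # One <object> element per annotated instance.
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = tuple(
            int(float(box.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"),) + coords)
    return filename, boxes


filename, boxes = parse_voc_annotation(SAMPLE_XML)
print(filename, boxes)
```

Given the annotation policy described above, a training frame would be expected to yield a single box, while a test frame may yield several.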


You can explore the annotated frames with the Dataset viewer.


  1. Download and extract the directory called YTOdevkit

    curl | tar xz
  2. Go to the YTOdevkit directory

    cd YTOdevkit
  3. Download and extract the VOCdevkit Code

    curl | tar x --strip-components 1 VOCdevkit/VOCcode
  4. Download and extract the YouTube-Objects v2.3 image sets, annotations and jpg images

    curl | tar xz -C YTO/
    curl | tar xz -C YTO/
    curl | tar xz -C YTO/

After these steps, the YTOdevkit directory should have this basic structure:

YTOdevkit/             # development kit
├── results/           # YTO results
├── VOCcode/           # VOC utility code
├── YTO/               # image sets, annotations, etc.
│   ├── Annotations/
│   ├── ImageSets/
│   └── JPEGImages/
└── ...
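The layout above can be checked with a few lines of Python before running any evaluation code. This is a minimal sketch, not part of the devkit: `EXPECTED_DIRS` and `missing_dirs` are hypothetical names, and the list only covers the directories shown in the tree above.

```python
from pathlib import Path

# Directories shown in the tree above, relative to the devkit root.
EXPECTED_DIRS = (
    "results",
    "VOCcode",
    "YTO/Annotations",
    "YTO/ImageSets",
    "YTO/JPEGImages",
)


def missing_dirs(devkit_root):
    """Return the expected subdirectories that are absent under devkit_root."""
    root = Path(devkit_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


# Assumes the script is run from the directory that contains YTOdevkit.
missing = missing_dirs("YTOdevkit")
if missing:
    print("Missing directories:", ", ".join(missing))
else:
    print("YTOdevkit layout looks complete.")
```

An empty list means every directory from the tree is present; any names printed point to a download or extraction step that should be repeated.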


If you find YouTube-Objects v2.3 useful in your research, please consider citing:

  title={Analysing domain shift factors between videos and images for object detection},
  author={Kalogeiton, Vicky and Ferrari, Vittorio and Schmid, Cordelia},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},