WildLife Documentary (WLD) Dataset

Introduction

The dataset contains 15 documentary films downloaded from YouTube, with durations ranging from 9 to 50 minutes and more than 747,000 frames in total. More than 4,000 object tracklets of 65 categories are annotated.

Here is an overview of the dataset. [Figure: Dataset overview]

Content

The dataset is organized in the following structure:

  • videos/: Downloaded raw videos should be extracted here.
  • frames/: Video frames will be generated here.
  • subtitles/: Subtitles of the videos, in SRT format. The subtitles were originally auto-generated by YouTube, and we manually corrected some obvious mistakes.
  • annotations/: Bounding box annotations, in json format. Coordinates are 0-based and the bounding boxes are labeled as [x1, y1, x2, y2]. The videos are fully annotated with the help of object tracking.
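As an illustration of the coordinate convention, here is a minimal sketch of reading one annotation entry. Only the 0-based [x1, y1, x2, y2] box format comes from this README; the JSON field names and the inclusive-coordinate assumption are hypothetical.

```python
# Sketch: interpreting a WLD bounding box annotation.
# The JSON layout below is hypothetical; only the [x1, y1, x2, y2],
# 0-based coordinate convention comes from the README.
import json

def box_to_xywh(box):
    """Convert an [x1, y1, x2, y2] box to (x, y, width, height).

    Assumes inclusive pixel coordinates, so a box spanning columns
    x1..x2 is (x2 - x1 + 1) pixels wide.
    """
    x1, y1, x2, y2 = box
    return (x1, y1, x2 - x1 + 1, y2 - y1 + 1)

# Hypothetical annotation entry for one object in one frame.
entry = json.loads('{"category": "lion", "bbox": [10, 20, 49, 59]}')
print(box_to_xywh(entry["bbox"]))  # (10, 20, 40, 40)
```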

Citation

If you use WLD dataset in your research, please consider citing our paper:

@inproceedings{chen2017discover,
  author = {Kai Chen and Hang Song and Chen Change Loy and Dahua Lin},
  title = {Discover and Learn New Objects from Documentaries},
  booktitle = {Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {July},
  year = {2017}
}

Download

  1. Download the raw videos from Google Drive and extract all videos to the folder videos/.
  2. Run the script video2frames.py (OpenCV required) to convert all videos into frames.
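The provided video2frames.py is the authoritative conversion script; as a rough illustration of what step 2 does, a minimal OpenCV frame-dump loop might look like the sketch below. The frames/<name>/NNNNNN.jpg output naming is an assumption, not the script's actual layout.

```python
# Sketch of converting one video to frames with OpenCV (cv2).
# video2frames.py in the repo is authoritative; the frames/<name>/NNNNNN.jpg
# naming used here is an assumption.
import os

def frame_path(video_name, index):
    # Zero-padded frame filename, e.g. frames/doc01/000001.jpg
    return os.path.join("frames", video_name, f"{index:06d}.jpg")

def video_to_frames(video_path):
    import cv2  # pip install opencv-python
    name = os.path.splitext(os.path.basename(video_path))[0]
    os.makedirs(os.path.join("frames", name), exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of video (or read error)
            break
        cv2.imwrite(frame_path(name, index), frame)
        index += 1
    cap.release()
    return index  # number of frames written
```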
