
Convert the Open Images v4 dataset to Pascal VOC format XML. Open Images is a dataset of ~9 million images that have been annotated with image-level labels and bounding boxes spanning thousands of classes. https://github.com/openimages/dataset

AtriSaxena/OIDv4_to_VOC


OIDv4 To VOC XML format

If you have experience working with the Pascal VOC format but have not been able to work with Open Images Dataset v4, which has 600 classes, the steps below show how to download images per class and convert their annotations to XML files.

The code is documented and easy to understand. See the usage steps below.

Open Image Dataset v4

All the information related to this huge dataset can be found on the official Open Images website. The lines below summarize some statistics and important tips.

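|        | Train      | Validation | Test    | #Classes |
|--------|------------|------------|---------|----------|
| Images | 1,743,042  | 41,620     | 125,436 | -        |
| Boxes  | 14,610,229 | 204,621    | 625,282 | 600      |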

Getting Started

Installation

Python3 is required.

  1. Clone this repository.
   git clone https://github.com/AtriSaxena/OIDv4_to_VOC.git
  2. Install the required packages.
   pip3 install -r requirements.txt

Check the requirements file first to see whether you already have everything installed. Most of the dependencies are common libraries.

Launch the ToolKit to check the available options

For a quick reminder of all the options the script offers, launch OIDv4_to_VOC.py from your console of choice. Always run it from the main directory of the project.

python3 OIDv4_to_VOC.py

or, to get more information:

python3 OIDv4_to_VOC.py -h

Download the Dataset

To download the dataset per class, go to this repository: https://github.com/EscVM/OIDv4_ToolKit

Follow its README.md file to download the classes you need.

Convert Annotations to XML Format

To convert a class, say 'Apple', give the source path of the class folder, which contains the images and a Labels folder.

└───Apple

    |0fdea8a716155a8e.jpg
    |2fe4f21e409f0a56.jpg
    |...
    └───Labels
            |0fdea8a716155a8e.txt
            |2fe4f21e409f0a56.txt
            |...
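Before converting, it can help to confirm that the folder really follows this layout. The snippet below is a minimal sketch, not part of this repository: the helper name `find_unlabeled_images` is hypothetical, and it assumes images use the `.jpg` extension and that each label file shares its image's base name, as in the tree above.

```python
from pathlib import Path


def find_unlabeled_images(class_dir):
    """Return the names of .jpg images with no matching .txt file in Labels/."""
    class_dir = Path(class_dir)
    labels_dir = class_dir / "Labels"
    # A label file shares its image's base name and uses a .txt extension.
    labeled = {p.stem for p in labels_dir.glob("*.txt")}
    return sorted(p.name for p in class_dir.glob("*.jpg") if p.stem not in labeled)
```

Calling `find_unlabeled_images("Dataset/train/Apple")` should return an empty list when every image has a label file.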

Then give the destination path in which to store the converted XML files.

python3 OIDv4_to_VOC.py --sourcepath Dataset/train/Apple --dest_path Dataset/train/Annotation/Apple
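When several classes have been downloaded, the same command can be repeated once per class. The snippet below is a rough sketch that only builds the command lines, using the two flags shown above. The helper name `build_commands`, the second class name, and the `Annotation/<class>` destination layout are assumptions carried over from the example command rather than anything the script requires.

```python
import sys
from pathlib import Path


def build_commands(split_dir, class_names, script="OIDv4_to_VOC.py"):
    """Build one argument list per class, mirroring the single-class command."""
    split_dir = Path(split_dir)
    commands = []
    for name in class_names:
        commands.append([
            sys.executable, script,
            "--sourcepath", (split_dir / name).as_posix(),
            "--dest_path", (split_dir / "Annotation" / name).as_posix(),
        ])
    return commands


# Print the commands for two example classes without running them.
for cmd in build_commands("Dataset/train", ["Apple", "Orange"]):
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the main directory of the project to run the conversion.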

After running the script, the annotations will be saved in the destination path.
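To give an idea of what such a conversion involves, here is a minimal sketch. It is not the code in OIDv4_to_VOC.py. It assumes each label line has the form `<class name> <xmin> <ymin> <xmax> <ymax>` in pixel coordinates, and that the caller supplies the image width and height. The function names and the coordinates in the example are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parse_label_line(line):
    """Split '<class name> xmin ymin xmax ymax' into (name, [four ints])."""
    parts = line.split()
    # The last four fields are coordinates; everything before them is the
    # class name, which may contain spaces (e.g. "Traffic light").
    name = " ".join(parts[:-4])
    coords = [int(round(float(v))) for v in parts[-4:]]
    return name, coords


def labels_to_voc_xml(filename, width, height, label_lines, depth=3):
    """Build a Pascal VOC style annotation and return it as an XML string."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    for tag, value in (("width", width), ("height", height), ("depth", depth)):
        ET.SubElement(size, tag).text = str(value)
    for line in label_lines:
        if not line.strip():
            continue  # skip blank lines
        name, (xmin, ymin, xmax, ymax) = parse_label_line(line)
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        box = ET.SubElement(obj, "bndbox")
        for tag, value in (("xmin", xmin), ("ymin", ymin),
                           ("xmax", xmax), ("ymax", ymax)):
            ET.SubElement(box, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Illustrative values only: one made-up box on a 1024x768 image.
print(labels_to_voc_xml("0fdea8a716155a8e.jpg", 1024, 768,
                        ["Apple 10.0 20.0 300.0 400.0"]))
```

Passing the image size in as arguments keeps the sketch free of image-library dependencies. A real converter would normally read the size from the image file itself.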
