
HwangToeMat/Open-Images_EasyDownload


Open-Images_EasyDownload

Helper library for downloading Open Images (https://storage.googleapis.com/openimages/web/index.html) by category.

Open Images is, in many regards, the largest annotated image dataset available for training the latest deep convolutional neural networks on computer vision tasks. However, the sheer size of Open Images can make it difficult to extract only the data you need.


With this code, you can easily get the data you need, including Bounding Boxes (600 classes), Object Segmentations, Visual Relationships, and Localized Narratives.

Settings

This code requires 'ratelim', 'tqdm', and 'checkpoint'. Both 'tqdm' and 'checkpoint' are included in this repository, but you need to install 'ratelim' with the command below before running.

pip install ratelim

Usage

usage: EasyDownloader.py [-h] [--category CATEGORY] [--type TYPE] [--ndata NDATA]
               [--label LABEL] [--annotation ANNOTATION] [--imageURL IMAGEURL]
               [--savepath SAVEPATH]
  
optional arguments:
  -h, --help            show this help message and exit
  --category CATEGORY   Enter the category you want. If you want multi-
                        category, please tag each category.
  --type TYPE           Enter the type of data you want. If you want 'Union
                        data' enter 'sum' else if you want 'intersection data'
                        enter 'inter'.
  --ndata NDATA         Number of data you want
  --label LABEL         Path of class descriptions file.
  --annotation ANNOTATION
                        Path of bbox annotation file.
  --imageURL IMAGEURL   Path of imageURL file.
  --savepath SAVEPATH   Path where downloaded data will be saved

An example of usage is shown as follows.

### If you run this code in Colab, add '!' at the beginning of the line.

python EasyDownloader.py --category "Football" --category "Person" --type "inter" --savepath "Football_data"

In this example, you get only images that contain both the 'Football' category and the 'Person' category.

If you enter "sum" instead of "inter", you get images that contain the 'Football' category or the 'Person' category (or both).

Other category combinations work the same way.
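The difference between "inter" and "sum" can be sketched with plain Python sets. This is only an illustration of the selection logic, not the code used in EasyDownloader.py; the function name `select_image_ids` and the image IDs are hypothetical.

```python
def select_image_ids(ids_by_category, mode):
    """Combine per-category image-ID sets.

    mode="inter" keeps images that appear in every category;
    mode="sum" keeps images that appear in at least one category.
    """
    id_sets = [set(ids) for ids in ids_by_category.values()]
    if not id_sets:
        return set()
    if mode == "inter":
        return set.intersection(*id_sets)
    if mode == "sum":
        return set.union(*id_sets)
    raise ValueError(f"unknown type: {mode!r}")


# Hypothetical image IDs, for illustration only.
ids_by_category = {
    "Football": {"img_a", "img_b", "img_c"},
    "Person": {"img_b", "img_c", "img_d"},
}

print(sorted(select_image_ids(ids_by_category, "inter")))  # ['img_b', 'img_c']
print(sorted(select_image_ids(ids_by_category, "sum")))    # ['img_a', 'img_b', 'img_c', 'img_d']
```

With "inter" the result shrinks as more categories are added, while with "sum" it grows, so "inter" may return fewer images than requested with --ndata.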

Images are saved at "{--savepath}/images/[imageURL].jpg".

Bounding box information is saved at "{--savepath}/bbox/bbox_data.csv".

Label information is saved at "{--savepath}/bbox/label_data.csv".

To match annotations to images, use the image file name together with the 'OriginalURL' column of 'bbox_data.csv'.
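A minimal sketch of that matching step, using only the standard library. It assumes the saved file name equals the last path segment of 'OriginalURL'; check this against your downloaded files, since the exact naming scheme is not specified here. The function name `boxes_by_image_name`, the 'LabelName' column, and the sample rows are hypothetical.

```python
import csv
import io
import posixpath
from collections import defaultdict
from urllib.parse import urlparse


def boxes_by_image_name(csv_file):
    """Group annotation rows by a file name derived from 'OriginalURL'."""
    grouped = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Assumption: the image file name is the last segment of the URL path.
        name = posixpath.basename(urlparse(row["OriginalURL"]).path)
        grouped[name].append(row)
    return grouped


# Hypothetical rows standing in for the bbox CSV; real files have more columns.
sample = io.StringIO(
    "OriginalURL,LabelName\n"
    "https://example.com/photos/abc.jpg,Football\n"
    "https://example.com/photos/abc.jpg,Person\n"
    "https://example.com/photos/xyz.jpg,Person\n"
)
grouped = boxes_by_image_name(sample)
print(len(grouped["abc.jpg"]))  # 2
```

To use it on real data, pass an open file handle for the bbox CSV in place of `sample`, and adjust the name derivation if your files are named differently.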

To download faster, change the parameters of ratelim on line 122.

### Too many calls in a short time can lead to missing data.

@ratelim.patient(5, 5) # 5 times in 5 seconds (Gets called at most every 1 second)
@ratelim.patient(10, 5) # 10 times in 5 seconds (Gets called at most every 0.5 seconds)
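To make the two settings concrete, here is a small standard-library sketch of "patient" rate limiting, where calls are spaced at least period / max_calls seconds apart. It is not ratelim's implementation; the class name `PatientLimiter` and its injectable `clock` and `sleep` parameters are hypothetical and exist only to illustrate the spacing.

```python
import time


class PatientLimiter:
    """Space calls at least period / max_calls seconds apart."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = period / max_calls
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        """Block until the minimum interval since the previous call has passed."""
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


print(PatientLimiter(5, 5).min_interval)   # 1.0
print(PatientLimiter(10, 5).min_interval)  # 0.5
```

Calling `wait()` before each download enforces the spacing. A smaller interval speeds up downloads but, as noted above, too many calls in a short time can lead to missing data.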
