Get Data for your projects with just three lines of code. Currently supported datasets: Pascal VOC, NYUDv2, Person-Parts, DUTS

thawro/geda


GeDa

GeDa is a Python package that helps you to Get the Data for your project easily.

Installation

pip install geda

Usage

Using a specific data provider class

from geda.data_providers.voc import VOCSemanticSegmentationDataProvider

root = "<directory>/<to>/<store>/<data>" # e.g. "data/VOC"
dataprovider = VOCSemanticSegmentationDataProvider(root)
dataprovider.get_data()

Using the get_data shortcut

from geda import get_data

root = "<directory>/<to>/<store>/<data>" # e.g. "data/VOC"
dataprovider = get_data(name="VOC_SemanticSegmentation", root=root)
dataprovider.get_data()

The get_data function currently supports the following names: MNIST, DUTS, NYUDv2, VOC_InstanceSegmentation, VOC_SemanticSegmentation, VOC_PersonPartSegmentation, VOC_Main, VOC_Action, VOC_Layout, MPII, COCO_Keypoints
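
Because the dataset is selected by a plain string, a misspelled name only surfaces once get_data is called. A minimal sketch of a pre-check, assuming the list above is complete for the installed version. SUPPORTED_NAMES and check_name are hypothetical helpers written for this example and are not part of geda:

```python
import difflib

# Names copied from the list above; other geda versions may differ.
SUPPORTED_NAMES = (
    "MNIST", "DUTS", "NYUDv2",
    "VOC_InstanceSegmentation", "VOC_SemanticSegmentation",
    "VOC_PersonPartSegmentation", "VOC_Main", "VOC_Action", "VOC_Layout",
    "MPII", "COCO_Keypoints",
)


def check_name(name: str) -> str:
    """Return name unchanged if it is listed, otherwise raise with a suggestion."""
    if name in SUPPORTED_NAMES:
        return name
    # Look for the closest listed name to include in the error message.
    close = difflib.get_close_matches(name, SUPPORTED_NAMES, n=1)
    hint = f" Did you mean {close[0]!r}?" if close else ""
    raise ValueError(f"Unsupported dataset name {name!r}.{hint}")
```

The checked name can then be passed on, e.g. get_data(name=check_name("VOC_Main"), root=root).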

What it does

Calling dataprovider.get_data() runs the data through the following pipeline:

  1. Download the data from the source (specified by the _URLS variable in each module)
  2. Unpack the files if needed (when the download is a tar, zip or gz archive)
  3. Move the files to the <root>/raw directory
  4. Find the split ids (file basenames or indices, depending on the dataset)
  5. Arrange the files, i.e. move (or copy) them from the <root>/raw directory to task-specific directories
  6. [Optional] Create labels in a specific format (e.g. YOLO)
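
Steps 2 to 5 of the pipeline above can be sketched with the standard library alone. This is an illustration under stated assumptions and not geda's actual implementation: the function names, the one-id-per-line split file, the JPEGImages source folder and the images/<split> target layout are all hypothetical. The download step is left out so the sketch runs offline.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def unpack_to_raw(archive: Path, root: Path) -> Path:
    """Steps 2-3: unpack an archive so its contents end up in <root>/raw."""
    raw = root / "raw"
    raw.mkdir(parents=True, exist_ok=True)
    # unpack_archive picks the format (zip, tar, tar.gz, ...) from the extension.
    shutil.unpack_archive(str(archive), str(raw))
    return raw


def read_split_ids(split_file: Path) -> list[str]:
    """Step 4: read one sample id (file basename) per non-empty line."""
    lines = split_file.read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def arrange(raw: Path, task_dir: Path, ids: list[str], split: str) -> int:
    """Step 5: copy the images of one split into a task-specific directory.

    Returns the number of files copied; ids without a matching file are skipped.
    """
    dst = task_dir / "images" / split  # assumed layout, for illustration only
    dst.mkdir(parents=True, exist_ok=True)
    copied = 0
    for sample_id in ids:
        src = raw / "JPEGImages" / f"{sample_id}.jpg"
        if src.exists():
            shutil.copy2(src, dst / src.name)
            copied += 1
    return copied
```

Copying instead of moving keeps <root>/raw intact, so the arrange step can be rerun for another task without downloading again.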

Example

Directory structure produced by get_data(name="VOC_SemanticSegmentation", root="data/VOC"):

.
└── data
    └── VOC
        ├── raw
        │   ├── Annotations
        │   ├── ImageSets
        │   ├── JPEGImages
        │   ├── SegmentationClass
        │   └── SegmentationObject
        ├── SegmentationClass
        │   ├── annots
        │   ├── images
        │   ├── labels
        │   └── masks
        └── trainval_2012.tar
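
To confirm that a run produced this layout, the tree can be checked programmatically. A minimal sketch, assuming the sub-directories shown above; EXPECTED_DIRS and missing_dirs are hypothetical names for this example and are not provided by geda:

```python
from __future__ import annotations

from pathlib import Path

# Sub-directories taken from the tree above (VOC_SemanticSegmentation only).
EXPECTED_DIRS = (
    "raw/Annotations",
    "raw/ImageSets",
    "raw/JPEGImages",
    "raw/SegmentationClass",
    "raw/SegmentationObject",
    "SegmentationClass/annots",
    "SegmentationClass/images",
    "SegmentationClass/labels",
    "SegmentationClass/masks",
)


def missing_dirs(root: str) -> list[str]:
    """Return the expected sub-directories that do not exist under root."""
    base = Path(root)
    return [d for d in EXPECTED_DIRS if not (base / d).is_dir()]
```

An empty list from missing_dirs("data/VOC") means every directory listed above is present.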

Currently supported datasets

Image classification

Image Segmentation

Keypoints detection

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT
