This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[mxnet 2.0][item 4.8][RFC] Gluon Data API Extension and Fixes(Part 1) #17263

Open
zhreshold opened this issue Jan 9, 2020 · 2 comments

Comments

@zhreshold
Member

Description

This is part 1 of the Gluon Data API extension and fixes, which mainly focuses on cleaning up the diverging usage of the mxnet module API and gluon.
After a long evolution, there are currently two streams of data loading conventions implemented in mxnet: the old iterators and the gluon Dataset + DataLoader API.

To eliminate this confusion and reduce the maintenance effort, the plan is to drop all old iterators and provide a similar Dataset + DataLoader experience in the gluon data API.

Things to be removed

iterators

Augmenters from mxnet.image and mxnet.image.detection module

Random augmenters, e.g. (https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/image/image.py#L615) will be removed.

transform= args in gluon.data datasets

The transform= argument is no longer supported; it can be replaced with dataset.transform or dataset.transform_first.
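To make the replacement concrete, here is a minimal sketch of how a chained dataset.transform / dataset.transform_first call can stand in for a constructor-level transform= argument. It is a pure-Python stand-in, not the mxnet implementation: `SimpleDataset` and `_LazyTransformDataset` are hypothetical names, and the unpacking behaviour (tuple samples are passed to the function as separate arguments, transform_first touches only the first element) is assumed to mirror what gluon datasets do.

```python
class SimpleDataset:
    """Hypothetical stand-in for a gluon-style dataset (not the mxnet class)."""

    def __init__(self, samples):
        self._samples = list(samples)

    def __len__(self):
        return len(self._samples)

    def __getitem__(self, idx):
        return self._samples[idx]

    def transform(self, fn):
        """Return a new dataset that lazily applies fn to each whole sample."""
        return _LazyTransformDataset(self, fn)

    def transform_first(self, fn):
        """Return a new dataset that applies fn to the first element only."""
        def first_only(first, *rest):
            # Leave labels (and any other trailing fields) untouched.
            return (fn(first),) + rest if rest else fn(first)
        return self.transform(first_only)


class _LazyTransformDataset(SimpleDataset):
    """Wraps a base dataset and applies fn on every access."""

    def __init__(self, base, fn):
        self._base = base
        self._fn = fn

    def __len__(self):
        return len(self._base)

    def __getitem__(self, idx):
        item = self._base[idx]
        # Tuple samples are unpacked so fn sees (data, label) as two arguments.
        if isinstance(item, tuple):
            return self._fn(*item)
        return self._fn(item)


# Instead of SomeDataset(..., transform=fn), chain the call on the dataset.
pairs = SimpleDataset([(1.0, 0), (2.0, 1)])
scaled = pairs.transform_first(lambda x: x * 10)   # only the data is changed
swapped = pairs.transform(lambda x, y: (y, x))     # the whole sample is changed
```

Because each call returns a new dataset, the original `pairs` stays unchanged and several differently transformed views can share it.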

Things to be added

Gluon Data Datasets

Dataset + Transform combos that simulate the removed iterators

For example, NDArrayIter can be reimplemented as NDArrayDataset + empty transform function.
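The NDArrayIter example can be sketched roughly as follows. This is only an illustration in plain Python: `PairDataset` is a hypothetical stand-in for the proposed NDArrayDataset, `identity` plays the role of the empty transform, and `iterate_batches` is a hypothetical helper that takes the place of a DataLoader. Nested lists are used instead of NDArrays, and the sketch simply keeps a short final batch rather than reproducing NDArrayIter's last-batch handling options.

```python
def identity(*sample):
    """The 'empty' transform: hand the sample back unchanged."""
    return sample if len(sample) > 1 else sample[0]


class PairDataset:
    """Hypothetical stand-in for the proposed NDArrayDataset."""

    def __init__(self, data, label):
        if len(data) != len(label):
            raise ValueError("data and label must have the same length")
        self._data = data
        self._label = label

    def __len__(self):
        return len(self._data)

    def __getitem__(self, idx):
        return self._data[idx], self._label[idx]


def iterate_batches(dataset, batch_size, transform=identity):
    """Yield (data_batch, label_batch) pairs in order, one batch at a time."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        samples = [transform(*dataset[i]) for i in range(start, stop)]
        data, label = zip(*samples)
        yield list(data), list(label)


dataset = PairDataset([[0, 1], [2, 3], [4, 5]], [0, 1, 0])
batches = list(iterate_batches(dataset, batch_size=2))
```

Splitting the iterator into a dataset, a transform, and a batching step means each piece can be swapped independently, which is the main point of the Dataset + Transform combination.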

Gluon Data Augmenters/Transforms

Data augmenters as mxnet.gluon.Block

Candidates TBD. Useful candidates can be taken from GluonCV (https://github.com/dmlc/gluon-cv/tree/master/gluoncv/data/transforms) and GluonNLP (https://github.com/dmlc/gluon-nlp/blob/v0.8.x/src/gluonnlp/data/transforms.py)
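As a rough illustration of what "augmenters as blocks" could look like from the user's side, the sketch below models an augmenter as a callable object that holds its own configuration and can be chained with others. It is plain Python and does not subclass mxnet.gluon.Block; `RandomFlip` and `Compose` are hypothetical names chosen for this example, and images are represented as nested lists of pixel values.

```python
import random


class RandomFlip:
    """Hypothetical block-like augmenter: flips an image left-right with probability p."""

    def __init__(self, p=0.5, seed=None):
        self.p = p
        # A private RNG keeps the augmenter reproducible when a seed is given.
        self._rng = random.Random(seed)

    def __call__(self, image):
        # image is a list of rows; a left-right flip reverses every row.
        if self._rng.random() < self.p:
            return [row[::-1] for row in image]
        return image


class Compose:
    """Hypothetical container that applies augmenters in sequence."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, x):
        for t in self.transforms:
            x = t(x)
        return x


image = [[1, 2, 3], [4, 5, 6]]
always_flip = RandomFlip(p=1.0)
flipped = always_flip(image)
round_trip = Compose([RandomFlip(p=1.0), RandomFlip(p=1.0)])(image)
```

An object like this can be passed directly to dataset.transform_first, so the augmentation applies to the image and leaves the label alone.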

mxnet.image

Image processing functions will be absorbed from GluonCV (https://github.com/dmlc/gluon-cv/blob/master/gluoncv/data/transforms/image.py)

@ptrendx
Member

ptrendx commented Jan 10, 2020

What about mx.io.ImageRecordIter? Also, what about the return type of those iterators - mx.io iterators return mx.io.DataBatch, will that be changed too?

@JanuszL FYI since DALI MXNet plugin produces mx.io.DataBatch and may be affected.

@zhreshold
Member Author

The old iterators will get a special gluon dataset wrapper that has no length and forbids random access or sampling from the dataloader. They keep their original arguments during iteration.
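One way to picture such a wrapper is the sketch below. It is an assumption about the shape of the design, not the actual mxnet code: `IteratorDataset` is a hypothetical name, and a plain Python iterator stands in for a legacy iterator such as mx.io.ImageRecordIter. The wrapper only supports sequential iteration; asking for its length or indexing into it raises an error.

```python
class IteratorDataset:
    """Hypothetical wrapper exposing a legacy iterator as a sequential-only dataset."""

    def __init__(self, iterator_factory):
        # A factory is stored so that every pass (epoch) starts a fresh iterator.
        # The legacy iterator's original arguments live inside the factory.
        self._factory = iterator_factory

    def __iter__(self):
        return iter(self._factory())

    def __len__(self):
        raise TypeError("IteratorDataset has no length")

    def __getitem__(self, idx):
        raise TypeError("IteratorDataset does not support random access")


def legacy_iter(batch_size):
    """Stand-in for an old iterator; batch_size is its 'original argument'."""
    records = ["a", "b", "c", "d"]
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


dataset = IteratorDataset(lambda: legacy_iter(batch_size=2))
first_epoch = [batch for batch in dataset]
second_epoch = [batch for batch in dataset]
```

Since there is no length and no indexing, a loader built on top of this wrapper cannot shuffle or sample; it can only consume the batches in the order the underlying iterator produces them.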
