This repository has been archived by the owner on May 30, 2019. It is now read-only.

cifar10 and mnist should be sharded #463

Open
ry opened this issue Mar 29, 2018 · 0 comments

ry (Contributor) commented Mar 29, 2018

cifar10/train is 150 MB and mnist/train is 50 MB. Accessing a single element from either of these triggers a download of the whole set.

Ideally we can split these datasets into chunks of about 2 MB so they can be downloaded progressively. If only one batch is inspected (for debugging, say), only about 2 MB needs to be downloaded.
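To make the chunk arithmetic concrete, here is a rough sketch of how a shard layout could be planned for a target chunk size. The function name `plan_shards` and its parameters are hypothetical, and it assumes items are stored as raw uint8 pixels (3072 bytes per CIFAR-10 image, 784 bytes per MNIST image) with no per-file header overhead counted.

```python
import math


def plan_shards(num_items, bytes_per_item, target_shard_bytes=2 * 1024 * 1024):
    """Return (items_per_shard, num_shards) for a target shard size in bytes.

    Hypothetical helper: ignores any file header and assumes every item
    occupies exactly bytes_per_item bytes.
    """
    if num_items <= 0 or bytes_per_item <= 0:
        raise ValueError("num_items and bytes_per_item must be positive")
    # Pack as many whole items as fit in one shard, but at least one.
    items_per_shard = max(1, target_shard_bytes // bytes_per_item)
    # The last shard may be smaller than the others.
    num_shards = math.ceil(num_items / items_per_shard)
    return items_per_shard, num_shards


# CIFAR-10 train: 50,000 images of 32x32x3 uint8 pixels.
cifar_plan = plan_shards(50_000, 32 * 32 * 3)
# MNIST train: 60,000 images of 28x28 uint8 pixels.
mnist_plan = plan_shards(60_000, 28 * 28)
```

A fixed item count per shard keeps the index-to-shard mapping a simple integer division, which is why the sketch rounds down to whole items rather than cutting at exact byte boundaries.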

I've already split the cifar10/train images into the following files and uploaded them (using this script):

http://ar.propelml.org/cifar10_train_images_00.npy
http://ar.propelml.org/cifar10_train_images_01.npy
...
http://ar.propelml.org/cifar10_train_images_49.npy
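With the files above, reading one element only needs the shard that contains it. The following is a minimal sketch of that lookup, not the project's loader: `locate`, `ShardCache`, and the injected `fetch` callable are hypothetical names, and `ITEMS_PER_SHARD = 1000` assumes the 50,000 training images are split evenly across the 50 files.

```python
URL_TEMPLATE = "http://ar.propelml.org/cifar10_train_images_{:02d}.npy"
NUM_SHARDS = 50         # files 00 through 49, as listed above
ITEMS_PER_SHARD = 1000  # assumption: 50,000 images split evenly


def locate(index):
    """Map a dataset index to (shard URL, offset inside that shard)."""
    if not 0 <= index < NUM_SHARDS * ITEMS_PER_SHARD:
        raise IndexError(index)
    shard, offset = divmod(index, ITEMS_PER_SHARD)
    return URL_TEMPLATE.format(shard), offset


class ShardCache:
    """Fetch shards on first access and keep them in memory.

    `fetch` is a caller-supplied function (hypothetical) that takes a URL
    and returns an indexable sequence of the items in that shard.
    """

    def __init__(self, fetch):
        self._fetch = fetch
        self._shards = {}

    def get(self, index):
        url, offset = locate(index)
        if url not in self._shards:
            # Only the shard that holds this index is downloaded.
            self._shards[url] = self._fetch(url)
        return self._shards[url][offset]
```

Passing `fetch` in from outside keeps the download and `.npy` decoding separate from the indexing logic, so the cache can be exercised with a stub that never touches the network.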