Read directly from archives on datasets with many records #686

Closed
pierrot0 opened this issue Jun 17, 2019 · 2 comments
@pierrot0
Collaborator

image/downsampled_imagenet.py extracts about 1M files (11 GiB) into a single directory.
Some file systems handle directories with that many files poorly.
We should read from the archive directly instead of extracting it (see imagenet for an example).

Chanchal, since you added the downsampled_imagenet dataset, could you look into this please?
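
For illustration, a minimal sketch of reading records straight from an archive instead of extracting it first. It assumes the data ships as a tar archive and uses only the standard library `tarfile` module; `iter_archive` is a hypothetical helper named for this example, not necessarily what the imagenet builder does.

```python
import tarfile


def iter_archive(path):
    """Yield (member_name, file_bytes) for each regular file in a tar archive.

    Nothing is written to disk, so no directory ever holds ~1M files.
    """
    # "r|*" opens the archive as a sequential stream (any compression),
    # so members are read one after another without seeking.
    with tarfile.open(path, mode="r|*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, etc.
            file_obj = archive.extractfile(member)
            yield member.name, file_obj.read()
```

A dataset builder could then loop over `iter_archive(archive_path)` and decode each image from the returned bytes, keeping only one file in memory at a time.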

@pierrot0 added the bug (Something isn't working) label on Jun 17, 2019
@ChanchalKumarMaji
Contributor

Sure, I would like to work on this. Thanks.
