Read directly from archives on datasets with many records #686

pierrot0 · 2019-06-17T13:17:37Z

image/downsampled_imagenet.py extracts about 1M files (11 GiB) into a single directory.
Some file systems handle that many files in one directory poorly.
We should read from the archive directly instead (look at imagenet for an example).

Chanchal, since you added the downsampled_imagenet dataset, could you look into this please?
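
For illustration, here is a minimal sketch of what "reading from the archive directly" can look like, using only Python's standard `tarfile` module. It is not the code from imagenet or from the eventual fix; the helper name `iter_tar_members` is hypothetical, and it assumes the archive is a tar file small enough per member to read each record into memory.

```python
import io
import tarfile


def iter_tar_members(archive_path):
    """Yield (member_name, file_object) pairs without extracting to disk.

    Hypothetical helper for illustration; not part of any library API.
    """
    # Mode "r:*" lets tarfile detect the compression (none, gzip, bz2, xz).
    with tarfile.open(archive_path, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # Skip directories, links and other non-regular entries.
            extracted = tar.extractfile(member)
            # Copy the bytes so the object stays usable after the archive is closed.
            yield member.name, io.BytesIO(extracted.read())
```

A dataset builder could then loop over `iter_tar_members(path)` and decode each record as it goes, so no per-record file is ever created in the file system.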

ChanchalKumarMaji · 2019-06-17T14:03:19Z

Sure, I would like to work on this. Thanks.

pierrot0 added the bug label Jun 17, 2019

pierrot0 assigned ChanchalKumarMaji Jun 17, 2019


ChanchalKumarMaji mentioned this issue Jun 21, 2019

Read directly from archives on datasets with many records. #701

Merged

pierrot0 closed this as completed Jul 3, 2019
