[data request] Downsampled ImageNet #45

Closed
rodrigob opened this issue Feb 8, 2019 · 24 comments
Labels: dataset request (Request for a new dataset to be added)

Comments

@rodrigob
Contributor

rodrigob commented Feb 8, 2019

  • Name of dataset: Downsampled ImageNet
  • URL of dataset: http://image-net.org/small/download.php
  • License of dataset: Same as ImageNet
  • Short description of dataset and use case(s): great for prototyping and quick experimentation beyond MNIST.

Folks who would also like to see this dataset in tensorflow/datasets, please +1/thumbs-up so the developers can know which requests to prioritize.

@rodrigob rodrigob added the dataset request Request for a new dataset to be added label Feb 8, 2019
@tabshaikh

As this is a good first issue, I would like to take this up 😀

@tabshaikh

tabshaikh commented Feb 25, 2019

There are 2 versions of this dataset: 32x32 images and 64x64 images. Should both be done?

@rsepassi
Contributor

That's great @tabshaikh, thank you!

Yes, I think we should have both using "heavy" configuration, but you can start with just one.

@rsepassi
Contributor

I'll assign the issue to you as soon as you accept the collaborator invite! 😃

@tabshaikh

tabshaikh commented Feb 25, 2019

@rsepassi Invite accepted, I will have a PR soon :)

@rsepassi
Contributor

Sounds good! Thank you!

@Anupam-tripathi-zz

@rsepassi I would love to collaborate with @tabshaikh and make a pull request for Downsampled ImageNet. Please assign it to me also.

@rsepassi
Contributor

rsepassi commented Mar 1, 2019

Hi @anupam-tripathi, thanks for your interest! Are you already working directly with @tabshaikh? If not, let's give him a chance to get a PR in. If he'd like the help, then please do work together to get something in!

@Anupam-tripathi-zz

No, I have not joined him yet, but I will contact him directly.

@tabshaikh

tabshaikh commented Mar 1, 2019

@anupam-tripathi I would love to collaborate with you, but the PR is almost done, with only a few changes left, and hopefully I have added the dataset correctly. @rsepassi I will open a PR by the middle of next week, as I will be at an ML hackathon over the weekend. I also have some questions, which I will ask in the draft PR.
@anupam-tripathi Let us collaborate on adding a big dataset; it would be great to have a teammate for it.

@rsepassi
Contributor

rsepassi commented Mar 2, 2019 via email

@Anupam-tripathi-zz

Ya, surely I will prove to be a good teammate.

@tabshaikh

tabshaikh commented Mar 5, 2019

@rodrigob @rsepassi The link http://image-net.org/small/download.php contains neither the whole downsampled ImageNet dataset nor the labels.
I also found https://patrykchrabaszcz.github.io/Imagenet32/, which gives dataset details, and http://image-net.org/download-images, which contains the whole dataset.
I could not work out which ImageNet dev kit contains the labels mentioned at https://patrykchrabaszcz.github.io/Imagenet32/.
Also, the data requires a login and comes as a pickle file that unpickles into a dictionary.
Can you help me with how to proceed? :)
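
A pickled batch of this kind could be unpacked roughly as follows. This is a minimal sketch, not an official loader: the key names `data` and `labels`, the flat channel-first row layout (as in CIFAR), and the helper names `unpack_batch` and `load_batch` are all assumptions to verify against the actual files.

```python
import pickle

import numpy as np


def unpack_batch(batch, size=32):
    """Turn a pickled batch dict into (images, labels) arrays.

    Assumed layout (verify against the real files): batch["data"] is an
    (N, 3 * size * size) uint8 array with each row stored channel-first,
    and batch["labels"] is a list of N integer class ids.
    """
    data = np.asarray(batch["data"], dtype=np.uint8)
    # (N, 3*size*size) -> (N, 3, size, size) -> (N, size, size, 3)
    images = data.reshape(-1, 3, size, size).transpose(0, 2, 3, 1)
    labels = np.asarray(batch["labels"], dtype=np.int64)
    return images, labels


def load_batch(path, size=32):
    """Read one pickled batch file from disk and unpack it."""
    with open(path, "rb") as f:
        batch = pickle.load(f)
    return unpack_batch(batch, size=size)
```

Since unpickling can execute arbitrary code, `load_batch` should only be pointed at files from a trusted source.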

@cyfra
Contributor

cyfra commented Mar 5, 2019

@tabshaikh - Yes, this dataset has only a subset of subsampled imagenet images and does NOT have labels.

This is on purpose: it was used for autoregressive models that generate output images (rather than trying to predict the class).

Please download from the official link rather than from third-party ones.

@tabshaikh

@cyfra okay cool

@rodrigob
Contributor Author

rodrigob commented Mar 5, 2019

The idea of this ticket was to create a version of ImageNet small enough that most people can prototype and experiment without having to worry about download time or disk space.

I would suggest going for the 32x32 and 64x64 versions, ideally with labels so that supervised training (à la MNIST and CIFAR) can also be used.

@tabshaikh

@rodrigob okay thanks :)

@joel-shor
Contributor

Any thoughts on doing the 128x128 version while you're at it?

@rsepassi
Contributor

rsepassi commented Mar 5, 2019

Joel, looks like there are only 2 versions listed, 32 and 64: http://image-net.org/small/download.php

@rsepassi
Contributor

rsepassi commented Mar 5, 2019

@joel-shor Do you have a link to 128x128?

@tabshaikh

@rsepassi No, there are actually 4 versions (8x8, 16x16, 32x32, 64x64) here: http://image-net.org/download-images. The link you pointed to is incomplete, as it has no labels.
@joel-shor Can you point me to the 128x128 version? :)
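
For a sense of scale across the 8x8 to 64x64 versions, the raw pixel footprint can be estimated with a little arithmetic. This sketch assumes uncompressed 8-bit RGB and 1,281,167 training images (the ILSVRC-2012 train split size); both are assumptions, not figures taken from the download page, and the helper name `raw_bytes` is made up for illustration.

```python
NUM_TRAIN_IMAGES = 1_281_167  # assumption: ILSVRC-2012 training set size


def raw_bytes(size, num_images=NUM_TRAIN_IMAGES, channels=3):
    """Bytes needed for num_images uncompressed uint8 images of size x size."""
    return num_images * size * size * channels


# Print the estimated footprint of each resolution in GiB.
for size in (8, 16, 32, 64):
    gib = raw_bytes(size) / 2**30
    print(f"{size}x{size}: {raw_bytes(size, num_images=1)} bytes/image, {gib:.2f} GiB total")
```

The footprint grows with the square of the side length, so each doubling of resolution costs four times the storage.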

@joel-shor
Contributor

Sorry for the delay. The 128x128 imagenet, which is used in a number of state-of-the-art GANs (such as Self Attention GAN), can be found here: https://github.com/openai/improved-gan/blob/master/imagenet/convert_imagenet_to_records.py

If you were able to turn this into a TFDS dataset, you would be a hero!
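
As a rough illustration of what producing 128x128 images from full-size ones involves, here is a minimal sketch that center-crops to a square and resizes with nearest-neighbour sampling. It is not the method used by the linked script; the function name `center_crop_resize` and the choice of nearest-neighbour sampling are assumptions made to keep the example dependency-free beyond NumPy.

```python
import numpy as np


def center_crop_resize(image, size=128):
    """Center-crop an (H, W[, C]) array to a square, then resize to size x size."""
    h, w = image.shape[:2]
    side = min(h, w)
    top = (h - side) // 2
    left = (w - side) // 2
    square = image[top:top + side, left:left + side]
    # Nearest-neighbour sampling: take the source pixel under the centre
    # of each target cell.
    idx = ((np.arange(size) + 0.5) * side / size).astype(np.int64)
    return square[idx][:, idx]
```

A real pipeline would more likely use an anti-aliased resize from an imaging library, since nearest-neighbour sampling can introduce aliasing when shrinking by large factors.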

@joel-shor
Contributor

@tabshaikh Have you moved on from this?

@Conchylicultor
Member

This has been added in #613. Closing this now.
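
With the dataset in TFDS, loading it might look like the sketch below. It assumes the dataset is registered under the name `downsampled_imagenet` with configs `32x32` and `64x64`; the thread does not state the registered name, so check the TFDS catalog. The helper names `config_name` and `load_downsampled_imagenet` are made up for illustration.

```python
def config_name(size):
    """Build the 'dataset/config' string for a given resolution."""
    if size not in (32, 64):
        raise ValueError("only 32 and 64 are assumed to be available")
    # Assumption: the dataset name and config naming scheme below.
    return f"downsampled_imagenet/{size}x{size}"


def load_downsampled_imagenet(size=32, split="train"):
    """Load one split as a tf.data.Dataset (downloads data on first use)."""
    # Imported lazily so the helper above works without the package installed.
    import tensorflow_datasets as tfds

    return tfds.load(config_name(size), split=split)
```

Calling `load_downsampled_imagenet(32).take(1)` should then yield a single example. As noted earlier in the thread, the data from the official link has no labels, so each example is expected to carry only an image.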
