Cannot load image from CC3M #13

Closed
viyjy opened this issue Sep 25, 2021 · 8 comments

Comments

@viyjy

viyjy commented Sep 25, 2021

I get the following error:
PIL.UnidentifiedImageError: cannot identify image file '/home/ubuntu/data/CC3M/DownloadConceptualCaptions/validation/10481_3355970027'

The error is raised by this line in caption_dataset.py:
image = Image.open(ann['image']).convert('RGB')

BTW, I can only download 2.4M images from CC3M/training. How did you download 2.95M images? Thanks.

@LiJunnan1992
Contributor

Hi, I used this code to download the dataset:
https://github.com/igorbrigadir/DownloadConceptualCaptions

@viyjy
Author

viyjy commented Sep 26, 2021

Thanks. How about the first question? The code cannot identify CC3M images (the path is correct, and the images do exist), while it can identify images from other datasets.

@LiJunnan1992
Contributor

It could be because the image was not downloaded correctly, so PIL cannot load it.
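
One way to check this is to look at the first bytes of each downloaded file. A failed download is often an empty file or an HTML error page, and neither starts with an image signature. The following is only a sketch using the standard library: the helper names `sniff_image_type` and `find_suspect_files` are made up for illustration, the signature table covers only a few common web formats (PIL supports more), and a truncated image with an intact header would not be caught.

```python
from pathlib import Path

# Leading bytes of a few common web image formats.
# This table is deliberately small and not exhaustive.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "gif": b"GIF8",
}


def sniff_image_type(path):
    """Return a format name guessed from the file header, or None."""
    with open(path, "rb") as f:
        head = f.read(12)
    for name, magic in SIGNATURES.items():
        if head.startswith(magic):
            return name
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "webp"
    return None


def find_suspect_files(root):
    """List files under `root` whose header matches no known signature."""
    return [
        p for p in sorted(Path(root).rglob("*"))
        if p.is_file() and sniff_image_type(p) is None
    ]
```

Running `find_suspect_files` on the validation folder from the traceback would list the files worth re-downloading or dropping from the annotation file. It assumes the folder contains only downloaded images.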

@viyjy
Author

viyjy commented Sep 27, 2021

Thanks for your help!

viyjy closed this as completed Sep 27, 2021

@shoutOutYangJie

> Hi, I used this code to download the dataset: https://github.com/igorbrigadir/DownloadConceptualCaptions

How can I download the CC12M dataset? Can you share the download tool?

@LiJunnan1992
Contributor

LiJunnan1992 commented Dec 14, 2021

> How can I download the CC12M dataset? Can you share the download tool?

Hi, I simply modified the download code for CC3M; the formats of the two datasets are very similar.

@shoutOutYangJie

> Hi, I simply modified the download code for CC3M; the formats of the two datasets are very similar.

This is for CC3M rather than CC12M.

@LiJunnan1992
Contributor

> This is for CC3M rather than CC12M.

Yes, you can slightly modify the code to download CC12M.
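
As a rough illustration of the kind of change involved (this is not the actual download script), the sketch below reads URL and caption pairs from a two-column TSV with a switch for the column order. It assumes the CC3M TSV lists the caption first and the URL second, while the CC12M TSV lists the URL first. Please check this against the files you downloaded. `read_pairs` and its `url_first` parameter are hypothetical names.

```python
import csv
import io


def read_pairs(tsv_file, url_first):
    """Yield (url, caption) tuples from a two-column, header-less TSV stream."""
    reader = csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 2:
            continue  # skip malformed lines
        if url_first:
            yield row[0], row[1]  # assumed CC12M layout: url, caption
        else:
            yield row[1], row[0]  # assumed CC3M layout: caption, url


# Usage with in-memory data (the example URLs and captions are made up):
cc3m = io.StringIO("a dog on a beach\thttp://example.com/1.jpg\n")
cc12m = io.StringIO("http://example.com/2.jpg\ta cat on a sofa\n")
print(list(read_pairs(cc3m, url_first=False)))
print(list(read_pairs(cc12m, url_first=True)))
```

With the pairs in a common `(url, caption)` form, the rest of a CC3M download loop could in principle stay unchanged.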
