This repository has been archived by the owner on Feb 16, 2022. It is now read-only.

training_feat_all.lmdb and caption_train.json not found #7

Closed
HubHop opened this issue Jan 31, 2020 · 9 comments
Comments

@HubHop

HubHop commented Jan 31, 2020

Hi, thanks for your excellent work.

I'm trying to run your code for pretraining on Conceptual Captions, but some files do not exist, and I couldn't find any script to generate them. I've tried to set up the data as specified here, but step 3 only gives me a data.mdb and a lock.mdb. Could you provide training_feat_all.lmdb and caption_train.json?

Many thanks.

@yangapku

training_feat_all.lmdb should be the name of the directory that stores data.mdb and lock.mdb.
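As a quick sanity check of that layout, here is a minimal sketch using only the Python standard library. The helper name `looks_like_lmdb_dir` is made up for illustration; it only checks that the two files named in this thread sit inside a directory, and does not validate the LMDB contents.

```python
from pathlib import Path

# The two files mentioned in this thread that LMDB writes into its directory.
EXPECTED_FILES = ("data.mdb", "lock.mdb")


def looks_like_lmdb_dir(path):
    """Return True if `path` is a directory containing data.mdb and lock.mdb.

    Hypothetical helper: it checks the file layout only, not the database itself.
    """
    root = Path(path)
    return root.is_dir() and all((root / name).is_file() for name in EXPECTED_FILES)
```

For example, `looks_like_lmdb_dir("training_feat_all.lmdb")` should return True once the two generated files have been placed in a directory with that name. As far as I know, py-lmdb's `lmdb.open` also treats its path argument as a directory by default, which matches this layout.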

@HubHop
Author

HubHop commented Jan 31, 2020

Hi @yangapku, thanks a lot, it works. But how do you prepare caption_train.json?

@vedanuj
Contributor

vedanuj commented Jan 31, 2020

Hi @HubHop

For caption_train.json, you will need to download the captions for the Conceptual Captions dataset.

@HubHop HubHop closed this as completed Feb 2, 2020
@zaynmi

zaynmi commented Apr 16, 2020

@HubHop Hi, where can I get caption_train.json? The Conceptual Captions dataset officially provides only .tsv files. Or should I generate the images and captions myself?

@HubHop
Author

HubHop commented Apr 16, 2020

Hi @zaynmi, I prepared my data based on this script:
https://github.com/jackroos/VL-BERT/blob/master/data/conceptual-captions/ReadMe.txt

Cheers
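For readers who cannot follow the link, here is a rough sketch of what a TSV-to-JSON conversion of this kind might look like. It is not the linked script and not this repository's method. It assumes each TSV row is `caption<TAB>url`, that captions are already space-tokenized, and that images are named by zero-padded row index; the function name `tsv_to_json_lines` and the `image_prefix` value are hypothetical.

```python
import csv
import json


def tsv_to_json_lines(tsv_path, out_path, image_prefix="train_image.zip@/"):
    """Write one JSON object per line: {"image": ..., "caption": [tokens]}.

    Assumptions (not confirmed by the thread): column 0 is the caption,
    column 1 is the image URL, and images are named by zero-padded row index.
    Returns the number of records written.
    """
    written = 0
    with open(tsv_path, newline="", encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        # QUOTE_NONE keeps quote characters inside captions as literal text.
        reader = csv.reader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
        for index, row in enumerate(reader):
            if len(row) < 2:
                continue  # skip rows without both a caption and a URL
            record = {
                "image": f"{image_prefix}{index:08d}.jpg",
                "caption": row[0].split(),  # whitespace tokenization
            }
            dst.write(json.dumps(record) + "\n")
            written += 1
    return written
```

This sketch does not download images or drop rows whose image is unavailable, so the output would still need to be reconciled with whichever images were actually fetched.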

@leyuan

leyuan commented Apr 22, 2020

Hi @HubHop, what did you get after generating the JSON? I got a weird format:

{"image": "val_image.zip@/00000000.jpg", "caption": ["author", ":", "a", "life", "in", "photography", "--", "in", "pictures"]}
{"image": "val_image.zip@/00000002.jpg", "caption": ["photograph", "of", "the", "sign", "being", "repaired", "by", "brave", "person"]}
{"image": "val_image.zip@/00000003.jpg", "caption": ["the", "player", "staring", "intently", "at", "a", "computer", "screen", "."]}

Did you get caption_train.py to work? Thanks.
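A note on the format above: it has one JSON object per line rather than a single JSON document, so `json.load` on the whole file fails with an "Extra data" error, while parsing line by line works. The sketch below reads such a file and joins each token list back into a caption string. The function name `read_caption_lines` is hypothetical, and whether this repository's data loader expects this layout is not confirmed in the thread.

```python
import json


def read_caption_lines(path):
    """Yield (image, caption_text) pairs from a one-object-per-line JSON file.

    Assumes each line has an "image" string and a "caption" list of tokens,
    as in the sample posted above.
    """
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            yield record["image"], " ".join(record["caption"])
```

Calling `list(read_caption_lines("caption_train.json"))` would then give a list of image/caption pairs, which can be re-serialized into whatever layout the training code turns out to need.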

@wu-zhonghua

> @HubHop Hi, where can I get caption_train.json? The Conceptual Captions dataset officially provides only .tsv files. Or should I generate the images and captions myself?

Have you found caption_train.json? I still haven't found it. Thanks.

@wu-zhonghua

> Hi @HubHop, what did you get after generating the JSON? I got a weird format:
>
> {"image": "val_image.zip@/00000000.jpg", "caption": ["author", ":", "a", "life", "in", "photography", "--", "in", "pictures"]}
> {"image": "val_image.zip@/00000002.jpg", "caption": ["photograph", "of", "the", "sign", "being", "repaired", "by", "brave", "person"]}
> {"image": "val_image.zip@/00000003.jpg", "caption": ["the", "player", "staring", "intently", "at", "a", "computer", "screen", "."]}
>
> Did you get caption_train.py to work? Thanks.

Have you found caption_train.json? I still haven't found it. Thanks.

@wu-zhonghua

@leyuan @zaynmi Hi, have you found the caption_train.json file? Could you please tell me how to get it? Thanks.
