training_feat_all.lmdb and caption_train.json not found #7

HubHop · 2020-01-31T11:52:32Z

Hi, thanks for your excellent work.

I'm trying to run your code for pretraining on Conceptual Captions, but some files do not exist, and I couldn't find any script to generate them. I've tried to set up the data as specified here, but step 3 only gives me a data.mdb and a lock.mdb. Can you provide training_feat_all.lmdb and caption_train.json?

Many thanks.


yangapku · 2020-01-31T11:57:48Z

The training_feat_all.lmdb should be the name of the directory storing data.mdb and lock.mdb.
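In other words, an LMDB "file" is a directory on disk. A minimal sketch for checking that the directory is laid out this way, using only the standard library; the helper name `check_lmdb_dir` is hypothetical and not part of the repository:

```python
from pathlib import Path


def check_lmdb_dir(path):
    """Return the expected LMDB files that are missing from `path`.

    Hypothetical helper: it only checks the directory layout described
    above and does not open or validate the database itself.
    """
    root = Path(path)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    expected = ("data.mdb", "lock.mdb")
    return [name for name in expected if not (root / name).is_file()]
```

An empty list from `check_lmdb_dir("training_feat_all.lmdb")` means both files are in place; if the files were generated somewhere else, renaming or moving their parent directory to `training_feat_all.lmdb` should be enough.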

HubHop · 2020-01-31T13:00:43Z

Hi @yangapku, thanks a lot, it works. But how do you prepare caption_train.json?

vedanuj · 2020-01-31T18:03:05Z

Hi @HubHop

For caption_train.json, you will need to download the captions for the Conceptual Captions dataset.

zaynmi · 2020-04-16T02:11:39Z

@HubHop Hi, where can I get caption_train.json? The Conceptual Captions dataset officially provides only .tsv files. Or should I generate the images and captions myself?
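For reference, a rough sketch of turning such a .tsv file into a JSON file. It assumes each row holds the caption in the first column and the image URL in the second, separated by a tab. The output schema (row index as a string mapped to the caption) is purely an assumption for illustration: this thread does not state what the training code expects, and the keys would have to match whatever image ids the feature LMDB uses. The function name is hypothetical.

```python
import csv
import json


def tsv_to_caption_json(tsv_path, json_path):
    """Write a {row_index: caption} JSON file from a caption/URL TSV.

    Assumed input: one row per image, caption first, URL second.
    Assumed output schema: hypothetical, not confirmed by the repository.
    """
    captions = {}
    with open(tsv_path, newline="", encoding="utf-8") as f:
        # QUOTE_NONE keeps quotation marks inside captions untouched.
        reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        for index, row in enumerate(reader):
            if len(row) < 2:
                continue  # skip malformed or empty rows
            captions[str(index)] = row[0]
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(captions, f)
    return captions
```

Images that fail to download would leave gaps, so the ids here should be filtered against the images that actually have extracted features.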

HubHop · 2020-04-16T02:20:30Z

Hi @zaynmi, I prepared my data based on this script:
https://github.com/jackroos/VL-BERT/blob/master/data/conceptual-captions/ReadMe.txt

Cheers

leyuan · 2020-04-22T18:28:00Z

Hi @HubHop, what did you get after generating the JSON? I got a weird format:

{"image": "val_image.zip@/00000000.jpg", "caption": ["author", ":", "a", "life", "in", "photography", "--", "in", "pictures"]}
{"image": "val_image.zip@/00000002.jpg", "caption": ["photograph", "of", "the", "sign", "being", "repaired", "by", "brave", "person"]}
{"image": "val_image.zip@/00000003.jpg", "caption": ["the", "player", "staring", "intently", "at", "a", "computer", "screen", "."]}

Did you get caption_train.py to work? Thanks.
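The output above has one JSON object per line, each with a tokenized caption, so calling `json.load` on the whole file would fail. A minimal sketch for reading it line by line and joining the tokens back into plain strings; the function name and the choice of the bare file name as the key are assumptions, and the schema the training code actually expects is not stated in this thread:

```python
import json


def load_jsonl_captions(lines):
    """Map image file name -> caption string from JSON-lines records.

    Each record is assumed to look like the samples above:
    {"image": "<zip>@/<name>.jpg", "caption": [token, token, ...]}
    """
    captions = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        # Keep only the part after "@/", e.g. "00000000.jpg".
        image_name = record["image"].split("@/")[-1]
        captions[image_name] = " ".join(record["caption"])
    return captions
```

Typical use would be `load_jsonl_captions(open(path, encoding="utf-8"))`, followed by `json.dump` if a single JSON document is needed.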

wu-zhonghua · 2020-07-17T08:34:11Z

> @HubHop Hi, where can I get caption_train.json? The Conceptual Captions dataset officially provides only .tsv files. Or should I generate the images and captions myself?

Have you found caption_train.json? I still haven't found it. Thanks.

wu-zhonghua · 2020-07-17T08:34:27Z

> Hi @HubHop, what did you get after generating the JSON? I got a weird format:
>
> {"image": "val_image.zip@/00000000.jpg", "caption": ["author", ":", "a", "life", "in", "photography", "--", "in", "pictures"]}
> {"image": "val_image.zip@/00000002.jpg", "caption": ["photograph", "of", "the", "sign", "being", "repaired", "by", "brave", "person"]}
> {"image": "val_image.zip@/00000003.jpg", "caption": ["the", "player", "staring", "intently", "at", "a", "computer", "screen", "."]}
>
> Did you get caption_train.py to work? Thanks.

Have you found caption_train.json? I still haven't found it. Thanks.

wu-zhonghua · 2020-07-17T08:36:10Z

@leyuan @zaynmi Hi, have you found the caption_train.json file? Could you please tell me how to get it? Thanks.

HubHop closed this as completed Feb 2, 2020
