
Could you please provide access to the required data files? #1

Closed
yangapku opened this issue Jan 17, 2020 · 10 comments

Comments

@yangapku

Hi! Thank you for releasing this great project! However, I notice that the data files (including the lmdb feature files as well as other metadata) needed to run pre-training and multi-task fine-tuning are not accessible. Could you please add accessible links to them? Alternatively, a readme explaining how to generate them would also be fine. Thank you very much!

@vedanuj
Contributor

vedanuj commented Jan 21, 2020

Hi @yangapku, I have added a PR that includes scripts for feature extraction and for converting the extracted features to an LMDB file that can be used for training.

Unfortunately, we cannot make the feature files public at this time. It should be easy to extract features and use them once the Readme is updated. Stay tuned and monitor the PR for updates.
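
For orientation, here is a minimal sketch of what the conversion step might look like, assuming the extractor writes one `.npy` file per image; the file names, key scheme, and value layout below are assumptions, not the repo's actual format:

```python
import glob
import pickle

import lmdb
import numpy as np

# Hypothetical sketch: pack per-image .npy feature files into one LMDB.
# The key scheme and value layout are assumptions; see the repo's
# conversion script for the actual format.
feature_files = glob.glob("extracted_features/*.npy")

env = lmdb.open("features.lmdb", map_size=1 << 40)  # generous max map size
with env.begin(write=True) as txn:
    keys = []
    for path in feature_files:
        item = np.load(path, allow_pickle=True).item()  # dict with boxes/features
        image_id = str(item["image_id"]).encode()
        txn.put(image_id, pickle.dumps(item))
        keys.append(image_id)
    # Store the key list so readers can enumerate entries without a full scan.
    txn.put(b"keys", pickle.dumps(keys))
```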

@yangapku
Author

Thank you for your reply! I will try to use the script to generate the feature files.

@yangapku
Author

Hi, could you please provide more details about the arguments for running the extract_features.py script, such as "num_features", "feature_name", "confidence_threshold", and "background"? Is using the default values the appropriate approach? Thank you!

@vedanuj
Contributor

vedanuj commented Jan 24, 2020

Yes, using the default arguments should work for this project.
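
For illustration, an invocation with those arguments spelled out might look like the following; only the four argument names discussed above come from the script, while the input/output flags and the values shown are assumptions (the "background" option is left at its default):

```bash
# Hypothetical invocation; flag names other than the four discussed
# above, and all values shown, are illustrative assumptions.
python script/extract_features.py \
    --image_dir data/images \
    --output_folder data/features \
    --num_features 100 \
    --feature_name fc6 \
    --confidence_threshold 0
```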

@yangapku
Author

Thanks! Do you mean that the default arguments work for preprocessing both Conceptual Captions and the downstream datasets (COCO, VCR, etc.)? Meanwhile, I noticed that the arguments differ from the original ViLBERT, such as the increase in the number of boxes to 100 and the decrease in the confidence threshold to 0. Will that be okay?

@vedanuj
Contributor

vedanuj commented Jan 30, 2020

For Conceptual Captions you can use 36 boxes; for downstream tasks it can be 100.
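
Concretely, with the same hypothetical flags as in the sketch above:

```bash
# Pre-training on Conceptual Captions: 36 boxes per image.
python script/extract_features.py --num_features 36 ...
# Downstream tasks (COCO, VCR, RefCOCO, ...): 100 boxes per image.
python script/extract_features.py --num_features 100 ...
```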

@yangapku
Author

Thank you! May I ask another question? For VCR and RefCOCO, features need to be generated from the given ground-truth bounding boxes. The original ViLBERT has a generate_tsv_gt.py script for this case. How do I achieve this with the new feature extractor script?

@vedanuj
Contributor

vedanuj commented Jan 31, 2020

Thanks for asking this. We will add that script as well.
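
Until that script lands, here is a minimal sketch of the underlying idea: pool backbone features over the provided ground-truth boxes instead of detector proposals. The torchvision backbone, the layer cut-off, and the shapes are assumptions for illustration; the repo's actual script may differ:

```python
import torch
from torchvision.models import resnet50
from torchvision.ops import roi_align

# Hypothetical sketch: extract per-region features from ground-truth boxes
# rather than RPN proposals. Backbone choice is an assumption.
backbone = resnet50(pretrained=True)
# Keep layers up to the final conv stage (stride-32 feature map),
# dropping the average pool and classification head.
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-2])
feature_extractor.eval()

image = torch.randn(1, 3, 800, 800)  # dummy preprocessed image
# Boxes in (batch_index, x1, y1, x2, y2) format, in image coordinates.
gt_boxes = torch.tensor([[0., 100., 100., 300., 300.]])

with torch.no_grad():
    fmap = feature_extractor(image)  # (1, 2048, 25, 25) for an 800x800 input
    # spatial_scale maps image coordinates onto the stride-32 feature map.
    pooled = roi_align(fmap, gt_boxes, output_size=(7, 7), spatial_scale=1 / 32)
    region_feats = pooled.mean(dim=(2, 3))  # (num_boxes, 2048) region features
```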

@yangapku
Author

yangapku commented Feb 6, 2020

Hi @vedanuj, may I ask whether there has been any progress on adding the script we discussed earlier? Thank you!

@vedanuj
Contributor

vedanuj commented Feb 7, 2020

@yangapku The scripts are added. Please check the readme in the data directory. Let us know if you face any problems running the scripts.

@vedanuj vedanuj closed this as completed Feb 19, 2020