Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add VQA V2.0 and Visual Dialog V0.9. #54

Merged
merged 9 commits into from May 12, 2017

Conversation

@jiasenlu
Copy link
Contributor

jiasenlu commented May 8, 2017

This pull request contains
1: add vqa_v2 task loader.
2: add VisDial_v0.9 task loader.
3: disable download COCO image as default
4: move COCO image path to "--download_path"

To test the new added task

For VQA 2.0: python examples/display_data.py -t vqa_coco2014_v2 --download-path 'path_to_COCO_img'

For Visual Dialog: python examples/display_data.py -t visdial

Currently, the Visual Dialog inherit from the default "DialogTeacher" class. There is no placeholder for the additional image information. The "DialogData" class with the format [(x, y, r, c), new_episode?], could we extend this to [(x, y, r, c, i), new_episode?] where "i" is some optional additional information such as image_id? If so, I can send another pull request to modify that.

@facebook-github-bot
Copy link

facebook-github-bot commented May 8, 2017

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please sign up at https://code.facebook.com/cla - and if you have received this in error or have any questions, please drop us a line at cla@fb.com. Thanks!

If you are contributing on behalf of someone else (eg your employer): the individual CLA is not sufficient - use https://developers.facebook.com/opensource/cla?type=company instead. Contact cla@fb.com if you have any questions.

@facebook-github-bot
Copy link

facebook-github-bot commented May 8, 2017

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@alexholdenmiller
Copy link
Member

alexholdenmiller commented May 8, 2017

Thanks Jiasen!! I think we'd like to keep the dataset downloading the images automatically (to datapath)--I know it's large, but users should only need to do it once anyways. Everything in downloads is also automatically downloaded (currently just the Memnn github repo).

As far as DialogTeacher/DialogData, I think we want to write a new one instead that includes some of the data loading from your vqa_coco2014 (v1) and prepares us to handle the preprocessing options that we talked about before. We don't want image_id as part of the action/observation dictionary--just images themselves--so we want to include the image loading as part of that data class.

@@ -36,7 +35,7 @@ def _path(opt):
annotation_path = os.path.join(opt['datapath'], 'VQA-COCO2014',
annotation_suffix + '_annotations.json')

image_path = os.path.join(opt['datapath'], 'VQA-COCO2014', img_suffix)

This comment has been minimized.

Copy link
@alexholdenmiller

alexholdenmiller May 8, 2017

Member

actually, we'd like to keep the images in datapath

This comment has been minimized.

Copy link
@jiasenlu

jiasenlu May 9, 2017

Author Contributor

I see. Then I think we might not want the image stay in 'VQA-COCO2014' folder, other task such as Visual Dialog may also use the COCO image. How about put the image under COCO-IMG ? Then, multiple task can share the image data.

This comment has been minimized.

Copy link
@alexholdenmiller

alexholdenmiller May 9, 2017

Member

Sounds great! Yeah giving it the most general name for the data file is perfect, and then multiple tasks can depend on it (and it won't rebuild it if it's already there).

anno = self.annotation['annotations'][self.episode_idx]
answers = [ans['answer'] for ans in anno['answers']]
else:
answers = ['fake_answer']

This comment has been minimized.

Copy link
@alexholdenmiller

alexholdenmiller May 8, 2017

Member

actually, just leave the labels field of the dict empty (accessing it gives KeyError, not None)

This comment has been minimized.

Copy link
@jiasenlu

jiasenlu May 9, 2017

Author Contributor

ok, thanks. will change this.

@facebook-github-bot
Copy link

facebook-github-bot commented May 10, 2017

@jiasenlu updated the pull request - view changes

Summary:

Fix the VQA, VisalDialog, VQA v2.0 based on last update of ParAI.

Test Plan:

python examples/display_data.py -t vqa_coco2014
python examples/display_data.py -t vqa_coco2014_v2
python examples/display_data.py -t visdial

Reviewers:

Subscribers:

Tasks:

Tags:

Blame Revision:
@facebook-github-bot
Copy link

facebook-github-bot commented May 10, 2017

@jiasenlu updated the pull request - view changes

@facebook-github-bot
Copy link

facebook-github-bot commented May 10, 2017

@jiasenlu updated the pull request - view changes

@facebook-github-bot
Copy link

facebook-github-bot commented May 11, 2017

@jiasenlu updated the pull request - view changes

Copy link
Member

alexholdenmiller left a comment

looking really good overall! how are you testing if the images are loading properly?

would be awesome to get a model in next so that we can try to train it and see if we can reproduce the results from the paper

@@ -265,7 +265,7 @@ def get(self, episode_idx, entry_idx=0):
table['reward'] = entry[2]
if len(entry) > 3:
table['label_candidates'] = entry[3]
if len(entry) > 4 and not opt.get('no_images', False):
if len(entry) > 4 and not self.opt.get('no_images', False):

This comment has been minimized.

Copy link
@alexholdenmiller

alexholdenmiller May 11, 2017

Member

ah good catch thank you

This comment has been minimized.

Copy link
@jiasenlu

jiasenlu May 11, 2017

Author Contributor

:)

self.len = len(self.ques['questions'])

class DefaultTeacher(OeTeacher):
pass

This comment has been minimized.

Copy link
@alexholdenmiller

alexholdenmiller May 11, 2017

Member

does v2 have a multiple-choice version?

This comment has been minimized.

Copy link
@jiasenlu

jiasenlu May 11, 2017

Author Contributor

no, VQA v2.0 doesn't have the multiple-choice now.

@jaseweston
Copy link
Contributor

jaseweston commented May 11, 2017

@alexholdenmiller
Copy link
Member

alexholdenmiller commented May 11, 2017

simple example which prints ascii version of images (after pip install asciimatics), or we could write our own similar one

import sys
from asciimatics.renderers import ImageFile
img = ImageFile(sys.argv[1], height=30)
print(img)

maybe we want something like this in display_data?

@jiasenlu
Copy link
Contributor Author

jiasenlu commented May 11, 2017

Yeah, I think we should do that. Maybe we can add this in a separate pull request?

@alexholdenmiller alexholdenmiller merged commit 10453fd into facebookresearch:master May 12, 2017
@alexholdenmiller
Copy link
Member

alexholdenmiller commented May 12, 2017

Just pushed this to get that fix in

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.