Faster RCNN model version and Object Tag Sequences #13

Closed
xiaoleihuang opened this issue Jun 13, 2020 · 10 comments

Comments

@xiaoleihuang

Did you use the open-source Faster R-CNN from torchvision (https://pytorch.org/docs/stable/torchvision/models.html#torchvision.models.detection.fasterrcnn_resnet50_fpn)?
Also, did you use the open-source tags and labels?

@xjli
Collaborator

xjli commented Jun 13, 2020

For the VG (Visual Genome) object labels, we use the open-source bottom-up attention repo (https://github.com/peteanderson80/bottom-up-attention).

@xiaoleihuang
Author

For the VG (Visual Genome) object labels, we use the open-source bottom-up attention repo (https://github.com/peteanderson80/bottom-up-attention).

Did you directly apply the released pretrained Faster R-CNN model (https://www.dropbox.com/s/5xethd2nxa8qrnq/resnet101_faster_rcnn_final.caffemodel?dl=1) to the VQA images to get the semantic tags?

@xiaoleihuang
Author

VQA uses train2014_vg_qla_mrcnn as input. Were the tags extracted by Mask R-CNN? The paper says the tags were extracted by Faster R-CNN, so I just want to confirm.

@xiaoleihuang xiaoleihuang reopened this Jun 18, 2020
@xjli
Collaborator

xjli commented Jun 18, 2020

The work uses two kinds of corpora, a pre-training corpus and downstream task fine-tuning corpora, and both use tag sets. For the pre-training corpus, we use the VG tag set from Faster R-CNN. For fine-tuning on VQA, we observed that the best tag set is the COCO one (80 categories), not the VG one: although the VG tag set has more categories (1600), the tag predictions of the Faster R-CNN pretrained on the VG corpus are not precise enough. So here we use a high-precision Mask R-CNN (trained on the COCO tag set) to generate the tags. For a downstream task fine-tuning corpus, you can use any off-the-shelf object detector to generate the tags; it is a trade-off between higher precision and more categories. One more example is the NoCaps fine-tuning corpus, where we use the Open Images tag set, which works better on the NoCaps task.
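The choices described in this comment can be summarized as a small lookup sketch. The name `TAG_SOURCES`, its keys, and its field names are hypothetical labels chosen for illustration, not identifiers from the repository; the values restate only what the comment says.

```python
# Hypothetical summary of which tag set each corpus uses, per the comment above.
# Key and field names are illustrative and do not come from the repository.
TAG_SOURCES = {
    "pretraining": {"tag_set": "VG", "num_categories": 1600, "detector": "Faster R-CNN"},
    "vqa_finetuning": {"tag_set": "COCO", "num_categories": 80, "detector": "Mask R-CNN"},
    # The thread names only the tag set for NoCaps, not the detector or its size.
    "nocaps_finetuning": {"tag_set": "Open Images", "num_categories": None, "detector": None},
}


def tag_source(corpus: str) -> dict:
    """Return the tag-set description for a corpus; raises KeyError if unknown."""
    return TAG_SOURCES[corpus]
```

Fewer categories with a more precise detector (COCO) versus more categories with noisier predictions (VG) is the trade-off the comment refers to.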

@xjli xjli closed this as completed Jun 18, 2020
@xiaoleihuang
Author

Could you share the Mask R-CNN model trained on the COCO tag set?

@xjli
Collaborator

xjli commented Jun 18, 2020

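I used torchvision's Mask R-CNN to generate the tags:

import torchvision
from torchvision.models.detection.generalized_rcnn import GeneralizedRCNN  # assuming torchvision's own class

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)  # COCO-pretrained weights
maskrcnn = GeneralizedRCNN(model.backbone, model.rpn, model.roi_heads, model.transform)

Both the API and the pretrained models are open source.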
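A minimal sketch of how tags might be produced with this model, assuming a torchvision release that still accepts `pretrained=True` (newer releases prefer a `weights=` argument). The helper names `labels_to_tags` and `detect_tags` are illustrative assumptions, not the code used to build the released files.

```python
from typing import List, Sequence


def labels_to_tags(label_ids: Sequence[int], category_names: Sequence[str]) -> List[str]:
    """Map predicted integer label ids to category-name strings."""
    return [category_names[i] for i in label_ids]


def detect_tags(image_path: str, category_names: Sequence[str]) -> List[str]:
    """Run the COCO-pretrained Mask R-CNN on one image and return its tags.

    Imports are local so labels_to_tags can be used without torch installed.
    """
    import torch
    import torchvision
    from PIL import Image
    from torchvision.transforms.functional import to_tensor

    model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
    model.eval()  # inference mode: the model returns detections, not losses

    image = to_tensor(Image.open(image_path).convert("RGB"))
    with torch.no_grad():
        output = model([image])[0]  # dict with "boxes", "labels", "scores", "masks"
    return labels_to_tags(output["labels"].tolist(), category_names)
```

Here `category_names` would be the index-ordered list that matches the checkpoint, such as the `COCO_INSTANCE_CATEGORY_NAMES` list from the torchvision documentation.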

@xiaoleihuang
Author

I see. I thought you had used a different version of Mask R-CNN. I have tried the open-source one from torchvision, but the tag set in train2014_qla_mrcnn differs from the open-source tag set: 28 tags appear in the JSON file but not in the open-source categories. I also tested train+val2014_qla_mrcnn and the one that uses VG. Their tag sets differ from the open-source version as well.

@xjli
Collaborator

xjli commented Jun 18, 2020

You could try different values of confidence_threshold (0.2, 0.4, or 0.0) when generating (filtering) the tags.
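As a rough sketch of what such a threshold does, the snippet below filters detections by score. The function name `filter_tags`, the `>=` comparison, and the sample detections are assumptions for illustration; the thread does not show how the released files were actually filtered.

```python
from typing import List, Sequence


def filter_tags(tags: Sequence[str], scores: Sequence[float],
                confidence_threshold: float) -> List[str]:
    """Keep tags whose detection score is at least confidence_threshold."""
    return [tag for tag, score in zip(tags, scores) if score >= confidence_threshold]


# Made-up detections, for illustration only.
tags = ["person", "dog", "frisbee", "bench"]
scores = [0.98, 0.55, 0.31, 0.12]

for threshold in (0.0, 0.2, 0.4):
    print(threshold, filter_tags(tags, scores, threshold))
```

A lower threshold keeps more, and noisier, tags per image, so the tag sequence for the same image can change with this setting.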

@xiaoleihuang
Author

The default category list is COCO_INSTANCE_CATEGORY_NAMES (https://pytorch.org/docs/stable/torchvision/models.html). I collected the tag set from each of the JSON files mentioned above and compared it with that list, and I found 28 tags that are in the JSON files but not in the default category list.
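The comparison described here amounts to a set difference, which can be sketched as follows. The function name `unknown_tags` and the sample data are hypothetical, and how tags are read out of the JSON files depends on their layout, which this thread does not describe, so the function simply takes an iterable of tag strings.

```python
from typing import Iterable, List, Sequence


def unknown_tags(observed_tags: Iterable[str], category_names: Sequence[str]) -> List[str]:
    """Return, sorted, the tags seen in the data that are absent from category_names."""
    return sorted(set(observed_tags) - set(category_names))


# Illustrative data only: a shortened category list and some made-up observed tags.
category_names = ["__background__", "person", "dog"]
observed = ["person", "dog", "man", "person", "tree"]

print(unknown_tags(observed, category_names))
```

Running this over the tags collected from each file, with the full category list, would list the tags that have no counterpart in the default categories.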

@xjli
Collaborator

xjli commented Jun 19, 2020

See pytorch/vision#990.
Please take your time to explore by yourself ...
