Faster RCNN model version and Object Tag Sequences #13

xiaoleihuang · 2020-06-13T05:58:02Z

Did you use the open-source Faster R-CNN from torchvision (https://pytorch.org/docs/stable/torchvision/models.html#torchvision.models.detection.fasterrcnn_resnet50_fpn)?
Also, did you use the open-source versions of the tags and labels?

xjli · 2020-06-13T06:01:41Z

For the VG object labels, we use the open-source bottom-up attention repo (https://github.com/peteanderson80/bottom-up-attention).

xiaoleihuang · 2020-06-17T21:02:42Z

> For the VG object labels, we use the open-source bottom-up attention repo (https://github.com/peteanderson80/bottom-up-attention).

Did you directly apply the released pretrained Faster R-CNN model (https://www.dropbox.com/s/5xethd2nxa8qrnq/resnet101_faster_rcnn_final.caffemodel?dl=1) to the VQA images to get the semantic tags?

xiaoleihuang · 2020-06-18T06:10:26Z

The VQA setup uses train2014_vg_qla_mrcnn as input. Were the tags extracted by Mask R-CNN? The paper says the tags were extracted by Faster R-CNN, so I just want to confirm.

xjli · 2020-06-18T06:35:13Z

There are two kinds of corpus in this work, the pre-training corpus and the downstream-task finetuning corpus, and both use tag sets. For the pre-training corpus, we use the VG tag set from Faster R-CNN. For finetuning on VQA, we observed that the best tag set is the COCO one (80 categories), not the VG one: although the VG tag set has more categories (1600), the tag prediction precision of Faster R-CNN (pretrained on the VG corpus) is not good enough. So here we use a high-precision Mask R-CNN (trained on the COCO tag set) to generate the tags. For a downstream-task finetuning corpus, you can use any off-the-shelf object detector to generate the tags; it is a trade-off between higher precision and more categories. One more example is the NoCaps finetuning corpus, where we use the Open Images tag set, which works better on the NoCaps task.

xiaoleihuang · 2020-06-18T21:49:46Z

Can you share the Mask R-CNN model trained on the COCO tag set?

xjli · 2020-06-18T22:15:47Z

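I used torchvision's Mask R-CNN to generate the tags:

import torchvision
from torchvision.models.detection.generalized_rcnn import GeneralizedRCNN  # assumed to be torchvision's class
model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
maskrcnn = GeneralizedRCNN(model.backbone, model.rpn, model.roi_heads, model.transform)

Both the API and the pretrained models are open source.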

xiaoleihuang · 2020-06-18T23:24:46Z

I see. I thought you used a different version of Mask R-CNN. I have tried the open-source one from torchvision, but the tag set in train2014_qla_mrcnn differs from the tag set of the open-source version: 28 tags appear in the json file but not in the open-source categories. I also tested train+val2014_qla_mrcnn and the one using vg; their tag sets differ from the open-source version as well.

xjli · 2020-06-18T23:31:25Z

You could try a different confidence_threshold (e.g. 0.2, 0.4, or 0.0) to generate (filter) the tags.
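To make the effect of the threshold concrete, here is a minimal sketch of filtering detections by score before turning them into tags. It is not the code used for the released files: the function name `detections_to_tags`, the shortened `CATEGORY_NAMES` table, the `>=` comparison, and the sample labels and scores are all assumptions for illustration. With a torchvision detection model, the `labels` and `scores` tensors of one output dict could be converted with `.tolist()` first.

```python
from typing import List, Sequence

# Hypothetical, shortened index-to-name table (the real torchvision list is longer).
CATEGORY_NAMES = ["__background__", "person", "bicycle", "car"]


def detections_to_tags(
    labels: Sequence[int],
    scores: Sequence[float],
    category_names: Sequence[str],
    confidence_threshold: float = 0.2,
) -> List[str]:
    """Keep the category name of every detection whose score passes the threshold.

    Assumption: a detection is kept when score >= confidence_threshold, and
    repeated tags are kept in detection order.
    """
    tags = []
    for label, score in zip(labels, scores):
        if score >= confidence_threshold:
            tags.append(category_names[label])
    return tags


# Made-up detections for a single image.
labels = [1, 3, 2]
scores = [0.95, 0.35, 0.10]

for threshold in (0.0, 0.2, 0.4):
    print(threshold, detections_to_tags(labels, scores, CATEGORY_NAMES, threshold))
```

A lower threshold keeps more (and noisier) tags per image, so the set of distinct tags seen across a corpus can change with the threshold as well.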

xiaoleihuang · 2020-06-18T23:47:23Z

The default category list is COCO_INSTANCE_CATEGORY_NAMES (https://pytorch.org/docs/stable/torchvision/models.html). I collected the full tag set from each of the json files mentioned above and compared them against it: 28 tags appear in the json files but not in the default category list.
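The comparison described above can be sketched as a set difference. This is only an illustration: the function name `tags_missing_from_categories` and the sample data are hypothetical, and how tags are actually stored in the json files is not stated in this thread, so the sketch takes an already-extracted list of tag strings.

```python
from typing import Iterable, List, Sequence

# Placeholder entries that a category list may contain but that are not real tags.
PLACEHOLDERS = {"__background__", "N/A"}


def tags_missing_from_categories(
    observed_tags: Iterable[str], category_names: Sequence[str]
) -> List[str]:
    """Return the sorted tags that occur in the data but not in the category list."""
    known = {name for name in category_names if name not in PLACEHOLDERS}
    return sorted(set(observed_tags) - known)


# Made-up example data, not taken from the released files.
categories = ["__background__", "person", "N/A", "car", "traffic light"]
observed = ["person", "car", "light", "traffic", "car"]

print(tags_missing_from_categories(observed, categories))
```

The sketch compares whole names. If tag strings are split on whitespace, a multi-word category name such as "traffic light" would show up as separate tokens that do not match any category; whether that applies to these files is not confirmed here.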

xjli · 2020-06-19T00:57:18Z

See pytorch/vision#990.
Please take your time to explore by yourself ...

xiaoleihuang closed this as completed Jun 15, 2020

xiaoleihuang reopened this Jun 18, 2020

xjli closed this as completed Jun 18, 2020

yangapku mentioned this issue Apr 6, 2021

VQA object tags are different from image feature #73

Open

CCYChongyanChen mentioned this issue Oct 7, 2021

VINVL code on the VizWiz datasets #84

Open
