Faster RCNN model version and Object Tag Sequences #13
Comments
For the VG object labels, we use the open-source bottom-up attention repo (https://github.com/peteanderson80/bottom-up-attention).
Did you directly apply the released pretrained Faster R-CNN model (https://www.dropbox.com/s/5xethd2nxa8qrnq/resnet101_faster_rcnn_final.caffemodel?dl=1) to the VQA images to get the semantic tags?
VQA uses train2014_vg_qla_mrcnn as input. Were the tags extracted by Mask R-CNN? The paper says the tags were extracted by Faster R-CNN, so I just want to confirm.
There are two kinds of corpora in this work: the pre-training corpus and the downstream-task fine-tuning corpus. Both use tag sets. For fine-tuning on VQA, we observed that the best tag set is the COCO one (80 categories), not the VG one (1600 categories): the Faster R-CNN pretrained on the VG corpus covers more categories, but its tag prediction precision is not good enough. So here we use a high-precision Mask R-CNN (trained on the COCO tag set) to generate the tags. For the pre-training corpus, we use the VG tag set from Faster R-CNN. For a downstream fine-tuning corpus, you can use any off-the-shelf object detector to generate the tags; it is a trade-off between higher precision and more categories. One more example is the NoCaps fine-tuning corpus, where we use the Open Images tag set, which works better on the NoCaps task.
Can you share the Mask R-CNN model trained on the COCO tag set?
I used the Mask R-CNN from torchvision to generate the tags: model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True). Both the API and the pretrained models are open source.
I see. I thought you used a different version of Mask R-CNN. I have tried the open-source one from torchvision. But I find that the tag set in
You may try a different confidence_threshold (0.2, 0.4, or 0.0) to generate (filter) the tags.
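For anyone following along, here is a minimal sketch of what filtering by confidence_threshold could look like. It assumes a detection result shaped like the dicts returned by torchvision detection models (parallel "labels" and "scores" entries); the function name filter_tags, the shortened category list, and the toy scores are hypothetical and only for illustration, not the code used to build the released tag files.

```python
from typing import Dict, List, Sequence


def filter_tags(
    detection: Dict[str, Sequence],
    category_names: Sequence[str],
    confidence_threshold: float = 0.2,
) -> List[str]:
    """Return the category names of detections scoring at least the threshold."""
    tags = []
    for label, score in zip(detection["labels"], detection["scores"]):
        # int()/float() keep this usable with plain numbers or 0-d tensors.
        if float(score) >= confidence_threshold:
            tags.append(category_names[int(label)])
    return tags


# Toy inputs (hypothetical): a shortened category list and one fake detection.
names = ["__background__", "person", "bicycle", "car"]
detection = {"labels": [1, 3, 2], "scores": [0.95, 0.35, 0.10]}

for threshold in (0.0, 0.2, 0.4):
    print(threshold, filter_tags(detection, names, threshold))
```

A lower threshold keeps more (noisier) tags and a higher one keeps fewer, which is the same precision versus coverage trade-off discussed above.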
The default category list is COCO_INSTANCE_CATEGORY_NAMES (https://pytorch.org/docs/stable/torchvision/models.html). I then collected the tags from each of the JSON files mentioned above and compared them against that list, and found 28 tags that appear in the JSON files but not in the default category list.
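The comparison described here can be sketched roughly as a set difference. How the tags are read out of the JSON files is not shown in this thread, so the sketch just takes two collections of strings; the function name and the toy values are hypothetical.

```python
from typing import Iterable, List


def tags_missing_from_categories(
    found_tags: Iterable[str], category_names: Iterable[str]
) -> List[str]:
    """Return, sorted, the tags that do not appear in the category list."""
    known = set(category_names)
    return sorted(set(found_tags) - known)


# Toy inputs (hypothetical): a shortened default list and some collected tags.
default_names = ["__background__", "person", "bicycle", "N/A"]
found = ["person", "bicycle", "person", "scooter", "helmet"]

print(tags_missing_from_categories(found, default_names))
```

Printing the result (rather than only its length) makes it easier to see whether the extra tags are real categories or just naming variants.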
See pytorch/vision#990.
Did you use the open-source Faster R-CNN from torchvision: https://pytorch.org/docs/stable/torchvision/models.html#torchvision.models.detection.fasterrcnn_resnet50_fpn?
And did you use the open-source version of the tags and labels?