Does the pre-trained model you provide cover the categories in the LVIS dataset? If I want to do open-world object detection on LVIS, can I directly use your pre-trained model to generate the proposals, or do I need to filter the dataset so that it doesn't contain any objects from the LVIS dataset?
Thank you for your interest in our work. Our MAVL model is trained on 1.3M aligned image-text pairs from Flickr30k, MS-COCO (2014), and Visual Genome (VG). We refer to this dataset as the LMDet dataset (see Sec. 2 of the paper). Note that we do not explicitly include LVIS categories in LMDet; however, many LVIS categories are mentioned in the text used for training MAVL.
So for a fair open-world comparison on LVIS, we recommend training MAVL on a filtered dataset with all captions/text that mention any of the LVIS categories removed. We followed a similar setting when reporting ORE results on COCO using MAVL proposals (see Sec. 4.2 of the paper).
However, in our COCO open-world OD experiments, we observed very little difference in results between proposals from the original MAVL and those from MAVL trained on the filtered dataset.
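For the filtered-dataset setting, a minimal sketch of the caption filtering step might look like the following. This is not the exact procedure used for the paper; it assumes each training sample is a dict with a `"caption"` key (a hypothetical format), that category names use underscores for spaces, and that a simple plural-aware word match is good enough. It will miss synonyms and paraphrases.

```python
import re


def build_category_pattern(category_names):
    """Compile one regex that matches any category name as a whole word."""
    # Normalise names such as "traffic_light" to "traffic light" (assumed convention).
    names = {name.replace("_", " ").lower() for name in category_names}
    # Try longer names first so multi-word names win over their substrings.
    ordered = sorted(names, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in ordered)
    # Allow a trailing "s" or "es" as a crude plural heuristic.
    return re.compile(rf"\b(?:{alternation})(?:e?s)?\b", re.IGNORECASE)


def filter_captions(samples, category_names):
    """Keep only samples whose caption mentions none of the category names."""
    if not category_names:
        return list(samples)
    pattern = build_category_pattern(category_names)
    return [s for s in samples if not pattern.search(s["caption"])]


# Toy data for illustration only; real category names would come from the
# LVIS annotation file.
samples = [
    {"caption": "A dog chasing a frisbee"},
    {"caption": "Two zebras grazing"},
    {"caption": "A quiet street at dusk"},
]
kept = filter_captions(samples, ["zebra", "frisbee"])
```

Here `kept` contains only the street caption, since the other two mention a listed category. A stricter filter would also need to handle synonyms and irregular plurals.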
I hope this is helpful. Do let me know if you have any questions or face any difficulty training MAVL. Thanks