
About obtaining UVO dataset AP or boundary AP #114

Open
yaohusama opened this issue Dec 24, 2023 · 5 comments

@yaohusama

I would like to ask whether the UVO dataset is a video segmentation dataset, and how instance segmentation is performed on it. UVO contains two kinds of videos: the dense split, which is labeled with COCO categories, and the sparse split, which is only labeled as COCO or non-COCO. Since the FocalNet-DINO detector you used was trained on COCO, is it possible to use the dense split to measure the metrics reported in the paper?
Thank you.

@ymq2017
Collaborator

ymq2017 commented Dec 26, 2023

Hi, we use an earlier version of UVO, v0.5. It has class-agnostic labels and an image/frame set.

@yaohusama
Author

Then I would like to ask: across datasets such as COCO, HQ-YTVIS, LVIS, and UVO, when selecting boxes by score, is COCO the only one where selection uses not only the detector's box score but also the mask score, i.e., the score predicted by SAM's IoU token? Do HQ-YTVIS, LVIS, and UVO rely only on the box score output by the detector, without SAM's predicted score?
Sorry to bother you.
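
For context, a minimal sketch of the two selection strategies being asked about; `boxes`, `det_scores`, and `iou_preds` are hypothetical names, not identifiers from the HQ-SAM codebase:

```python
import torch

def select_top_boxes(boxes, det_scores, iou_preds, top_k=100, use_mask_score=True):
    """Rank candidate boxes either by the detector's confidence alone,
    or by the product of detector confidence and SAM's predicted mask
    IoU (the IoU token output), then keep the top_k."""
    scores = det_scores * iou_preds if use_mask_score else det_scores
    order = torch.argsort(scores, descending=True)[:top_k]
    return boxes[order], scores[order]
```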

@yaohusama
Author

When I run instance segmentation inference across datasets such as COCO, should I resize the image while preserving the aspect ratio so that the long side reaches 1024 and then zero-pad to 1024x1024, rather than resizing directly to 1024x1024? In the public HQ-SAM code for HQSeg-44K, the inference code resizes the image directly to 1024x1024. Is no other data augmentation used during inference?
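
For reference, a sketch of the two preprocessing variants contrasted above, written with plain PIL/NumPy (the official SAM code implements the first variant via a ResizeLongestSide transform plus padding); function names here are illustrative:

```python
import numpy as np
from PIL import Image

def resize_longest_side_and_pad(img: Image.Image, target: int = 1024) -> np.ndarray:
    """Scale so the longer side equals `target` (aspect ratio kept),
    then zero-pad bottom/right to a square target x target canvas."""
    w, h = img.size
    scale = target / max(w, h)
    new_w, new_h = int(round(w * scale)), int(round(h * scale))
    resized = np.asarray(img.resize((new_w, new_h), Image.BILINEAR))
    padded = np.zeros((target, target, 3), dtype=resized.dtype)
    padded[:new_h, :new_w] = resized
    return padded

def resize_direct(img: Image.Image, target: int = 1024) -> np.ndarray:
    """Stretch directly to target x target, ignoring aspect ratio."""
    return np.asarray(img.resize((target, target), Image.BILINEAR))
```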

@lkeab
Collaborator

lkeab commented Jan 10, 2024

We provided the COCO evaluation code here: #113 (comment)
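
For anyone else landing here, a standard pycocotools mask-AP evaluation looks roughly like the snippet below; the JSON paths are placeholders, and the actual script linked in #113 may differ:

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Placeholder paths: ground-truth annotations and model predictions
# in standard COCO results format (RLE-encoded masks).
coco_gt = COCO("annotations/instances_val.json")
coco_dt = coco_gt.loadRes("results.json")

coco_eval = COCOeval(coco_gt, coco_dt, iouType="segm")
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # prints AP, AP50, AP75, etc.
```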

@yaohusama
Author

> Hi, we use an earlier version of UVO, v0.5. It has class-agnostic labels and an image/frame set.

I followed COCO's testing procedure and evaluated on the UVO dataset, but SAM-L only reached 29.2, while the paper reports 29.7. Could you please provide the configuration file used for the UVO evaluation? Do you keep only the top 100 boxes output by the detector for UVO? When evaluating on UVO, do you also use the score predicted by SAM's IoU token to select boxes? And which framework did you use to measure the UVO metrics, for example Detectron2 or MMDetection?

UVO v0.5 only has the single "object" class, while FocalNet-DINO outputs 91 categories. When you compute AP, do you treat predictions whose category ID falls outside the 80 COCO classes as background and the rest as foreground, effectively computing AP over two classes?
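
As a side note on the class-agnostic question: one common approach (not necessarily what the authors did) for evaluating a COCO-trained detector against class-agnostic annotations is to collapse every predicted category ID to a single "object" class before running COCO evaluation; the file names below are placeholders:

```python
import json

# Hypothetical: collapse all predicted COCO category ids to a single
# class-agnostic "object" id (1) so AP is computed over one category,
# matching UVO v0.5's class-agnostic annotations.
with open("results_coco80.json") as f:
    dets = json.load(f)

for det in dets:
    det["category_id"] = 1  # everything becomes the one "object" class

with open("results_class_agnostic.json", "w") as f:
    json.dump(dets, f)
```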
