getting the mask of first frame without using XMem or SAM as a preprocessing #51

Closed
monajalal opened this issue Apr 11, 2024 · 9 comments

Comments

@monajalal

This is a follow-up question.

Is there a way to make FoundationPose work with only the 2D bounding box of the object of interest?

Has anyone streamlined it so that the mask of the first frame is no longer required as a prerequisite?


Also, I am not sure how to provide the prerequisite mask by clicking a single point on the object. Can someone please walk me through it?

@wenbowen123
Collaborator

Is there a way to make FoundationPose work with only the 2D bounding box of the object of interest?

Yes, you can convert the bbox to a segmentation mask and run the same way. It will work fine. To convert, make the pixels inside the box >0 and background==0.

@monajalal
Author

monajalal commented Apr 12, 2024

Thanks for your response. Could you please clarify this or please link me to a reference? Any chance you may be able to provide an example of this?

Yes, you can convert the bbox to a segmentation mask and run the same way. It will work fine. To convert, make the pixels inside the box >0 and background==0.

@monajalal
Author

Do you expect the performance to drop if I use 2D bbox instead of segmentation mask?

@wenbowen123
Collaborator

Suppose your bbox is [umin, vmin, umax, vmax]

import numpy as np
# height, width: image size in pixels; rows index v (y), columns index u (x)
mask = np.zeros((height, width), dtype=bool)
mask[vmin:vmax, umin:umax] = True
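
A slightly more defensive version of the snippet above might look like the following. This is only a sketch: the helper name `bbox_to_mask` is hypothetical, and rounding float coordinates and clipping them to the image bounds are assumptions added here, not something stated in this thread.

```python
import numpy as np


def bbox_to_mask(bbox, height, width):
    """Return a boolean (height, width) mask that is True inside bbox.

    bbox is [umin, vmin, umax, vmax] in pixels, with u along columns (x)
    and v along rows (y). Hypothetical helper, not part of FoundationPose.
    """
    # Detectors often return floats, so round to the nearest pixel (assumption)
    umin, vmin, umax, vmax = (int(round(float(c))) for c in bbox)
    # Clip to the image so negative indices cannot wrap around
    umin, umax = max(umin, 0), min(umax, width)
    vmin, vmax = max(vmin, 0), min(vmax, height)
    mask = np.zeros((height, width), dtype=bool)
    mask[vmin:vmax, umin:umax] = True
    return mask


mask = bbox_to_mask([100.4, 50.0, 220.6, 180.0], height=480, width=640)
print(mask.shape, int(mask.sum()))
```

Without the clipping step, a box that extends past the left or top edge would give negative slice indices, which NumPy interprets as counting from the end of the axis.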

@wenbowen123
Collaborator

No, it should work as well as the segmentation mask. I've tried this many times.

@monajalal
Author

@wenbowen123
Thanks a lot for your guidance. I just wanted to confirm that I was able to run FoundationPose using only the 2D bbox of the first frame, given in YOLOX format and converted to a binary mask.
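
The thread does not say which box layout "yolox format" refers to here. If the detector already returns pixel corners, no conversion is needed. If the boxes are instead normalized center-format values (cx, cy, w, h), as in YOLO-style label files, a conversion along these lines could be applied first. This is a sketch under that assumption, and `center_box_to_corners` is a hypothetical helper name.

```python
import numpy as np


def center_box_to_corners(cx, cy, w, h, img_width, img_height):
    """Convert a normalized (cx, cy, w, h) box to pixel [umin, vmin, umax, vmax].

    Assumes all four inputs are fractions of the image size in [0, 1].
    """
    umin = (cx - w / 2.0) * img_width
    umax = (cx + w / 2.0) * img_width
    vmin = (cy - h / 2.0) * img_height
    vmax = (cy + h / 2.0) * img_height
    return [umin, vmin, umax, vmax]


img_width, img_height = 640, 480
umin, vmin, umax, vmax = center_box_to_corners(0.5, 0.5, 0.25, 0.5, img_width, img_height)

# Fill the box region, rounding to whole pixels and clipping to the image
mask = np.zeros((img_height, img_width), dtype=bool)
r0, r1 = max(int(round(vmin)), 0), min(int(round(vmax)), img_height)
c0, c1 = max(int(round(umin)), 0), min(int(round(umax)), img_width)
mask[r0:r1, c0:c1] = True
```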

@wenbowen123
Collaborator

wenbowen123 commented Apr 16, 2024

yes

@abhishekmonogram

@wenbowen123 In this case, the generated mask will be completely white. Will it still work?

@wenbowen123
Collaborator

@wenbowen123 In this case, the generated mask will be completely white. Will it still work?

Yes, the area inside the 2D box will be all white. This will be fine.
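
To make "all white inside the box" concrete, a boolean box mask can be turned into an 8-bit image in which the box is 255 and the background is 0. This is a minimal sketch with made-up box coordinates and image size. Whether the mask then needs to be written to disk, and in what file format, is not covered in this thread.

```python
import numpy as np

# Example box mask: True inside the box, False elsewhere (illustrative values)
mask = np.zeros((480, 640), dtype=bool)
mask[50:180, 100:220] = True

# 8-bit version: 255 (white) inside the box, 0 (black) background
mask_u8 = mask.astype(np.uint8) * 255

# The ">0 inside, ==0 background" rule from earlier in the thread still holds
recovered = mask_u8 > 0
```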
