near real-time?? #17
Comments
Regarding segmentation, I think the idea is to run the XMem segmentation code (https://github.com/hkchengrex/XMem). This requires only the mask for one frame and then outputs the masks for all other frames. This worked for me (just follow the setup and instructions on the XMem repo). The README.md of this (BundleSDF) project mentions something along those lines I think (and that this code couldn't be included in this repo because of licensing issues).
Thanks @redgreenblue3 for the explanation! Yes, and part of the reason is that the current release includes some code refactoring that does not necessarily reproduce the same speed. But we hope this code release can still benefit the community, and we are also looking forward to seeing future work make it blazingly fast. That said, there are many parameters in the config that you can tune; the current config is not optimal for live running. As I mentioned, settings such as the sync restriction, image resolution, etc. will also affect the speed a lot. @monajalal The V100 is also not a state-of-the-art GPU.
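To illustrate the resolution point above: downscaling the input frames before tracking is one of the easiest speed knobs, since per-frame pixel work drops roughly quadratically with the scale factor. A minimal sketch (this is not BundleSDF's actual pipeline code; the stride-based downsample and factor of 2 are just an example):

```python
import numpy as np

def downscale(frame: np.ndarray, factor: int = 2) -> np.ndarray:
    """Naive stride-based downsample: keep every `factor`-th pixel.
    Halving H and W cuts the pixel count (and much of the per-frame
    compute in a vision pipeline) by roughly 4x."""
    return frame[::factor, ::factor]

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # dummy VGA RGB frame
small = downscale(frame, 2)
assert small.shape == (240, 320, 3)
```

In practice you would trade this speedup against tracking accuracy, since feature matching degrades at lower resolutions.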
While I understand the V100 is not a state-of-the-art GPU, could you please state the minimum GPU needed for your experiments? Is it a 3090 Ti with 24 GB of VRAM, or a GPU with certain specs and compute capability? Thanks for your help with the open-source community. As someone who is interested in reproducing the results but doesn't necessarily have access to a 3090 Ti, I do have access to an Azure subscription and can potentially find something suitable if I know the specs, so this would be a great help.
May I ask how to combine these two? Is the idea to obtain the segmentation masks with XMem first, and then use them to build the dataset that BundleSDF requires?
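That is my understanding of the combination too. A sketch of the glue step, under assumptions: the folder names `rgb/`, `depth/`, `masks/` and `cam_K.txt` follow the dataset layout described in the BundleSDF README, XMem writes one PNG mask per frame, and the mask filenames already match the RGB frame names (your XMem output directory may differ):

```python
import shutil
from pathlib import Path

def build_bundlesdf_masks(xmem_out: Path, dataset: Path) -> int:
    """Copy XMem's per-frame PNG masks into the masks/ folder of a
    BundleSDF-style dataset directory. Returns the number of masks
    copied. Filenames are assumed to already match the rgb/ frames
    (e.g. 0000.png, 0001.png, ...)."""
    mask_dir = dataset / "masks"
    mask_dir.mkdir(parents=True, exist_ok=True)
    n = 0
    for png in sorted(xmem_out.glob("*.png")):
        shutil.copy(png, mask_dir / png.name)
        n += 1
    return n
```

After this, the dataset folder should contain `rgb/`, `depth/`, `masks/`, and `cam_K.txt`, and you can point the BundleSDF run script at it.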
Hello, thank you for reproducing the work. I have the same question. Why does running the first line of code take some people 3 hours? Doesn't this conflict with the authors' claim of real-time performance?
Thank you for your excellent work @wenbowen123. run_demo.py took an hour; does that conflict with what you said about real-time? My understanding is not thorough enough. What exactly does this script do? Would it be convenient for you to answer my questions? Sorry for taking up some of your valuable time.
@wenbowen123 With an RTX ADA6000 GPU and 48 GB of VRAM, it is not (near) real-time even if we pass in all the masks for each frame of the video (and don't count that time). Are you planning to release the version of the code that corresponds to your abstract's statement in the OP?
Hi @wenbowen123
I am trying to fill in the gap for my own understanding.
So, your method works for 6-DoF pose of novel objects (and novel classes) that were not seen during training and are only shown to the model at inference time.
However, what I don't understand is why it takes ~3 hours, as reported by other users, just to run the demo on the milk sequence.
Also, in the paper you mention that only the segmentation mask for the first frame of the video is needed; however, in the milk input data from Google Drive, the entire video has segmentation masks.
Any update would be really great, especially regarding 6-DoF pose for novel objects of novel classes.