Using a mask prompt for boundary refinement #169
Comments
I encountered the same problem. I found the input mask size must be 256×256, but when I resize my mask to this size, the output segmentation results are a mess and make no sense. Does anyone have a clue? |
Encountered the same issue |
I'm observing the same behavior. |
Not yet, unfortunately. The only crutch I came up with is sampling points inside an instance's mask (with different sampling strategies, e.g. w.r.t. the distance transform) as the "positive" class and sampling more points outside of the mask as the "negative" class, then combining those sparse inputs with a coarse binary mask into a prompt and feeding it into SAM. But that's still not perfect for refining the dataset. A rough sketch is below.
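This is roughly what that sampling looks like in code; a minimal sketch, assuming NumPy/SciPy are available (the function name, point counts, and the distance-transform weighting are illustrative choices, not the exact code used):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def sample_prompt_points(instance_mask, n_pos=10, n_neg=20, rng=None):
    """Sample positive points inside the mask (biased towards its interior via the
    distance transform) and negative points outside, for SamPredictor.predict()."""
    rng = np.random.default_rng() if rng is None else rng
    inside = np.argwhere(instance_mask > 0)       # (row, col) pixel coordinates
    outside = np.argwhere(instance_mask == 0)
    # Weight interior pixels by distance to the boundary so samples avoid the edge.
    dist = distance_transform_edt(instance_mask > 0)
    weights = dist[inside[:, 0], inside[:, 1]]
    weights = weights / weights.sum()
    pos_idx = rng.choice(len(inside), size=min(n_pos, len(inside)), replace=False, p=weights)
    neg_idx = rng.choice(len(outside), size=min(n_neg, len(outside)), replace=False)
    # SAM expects points in (x, y) order, so flip the (row, col) coordinates.
    coords = np.concatenate([inside[pos_idx][:, ::-1], outside[neg_idx][:, ::-1]]).astype(np.float32)
    labels = np.concatenate([np.ones(len(pos_idx)), np.zeros(len(neg_idx))]).astype(np.int32)
    return coords, labels
```

The resulting `coords` / `labels` then go into `predictor.predict(point_coords=..., point_labels=..., mask_input=...)` together with the coarse mask. |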
@Gpoxolcku |
@kampelmuehler when you say pad the mask, is this so that the mask fits over the transformed input image to the model? (which is also padded and squared.) |
Using only the mask logits did not work for me; it produced nonsense results. On the other hand, querying positive and negative points from the binary mask yielded a better result, and it improves a lot if you do it iteratively: for a number of iterations, feed the predictor with random samples of positive and negative points from the binary mask together with the best logits from the previous outcome (the paper mentions 11 iterations; see Appendix A, Training Algorithm). The number of query points does not seem to matter too much; it mostly seems to affect how much fidelity the result keeps with the original mask. In my case, despite the iterative process improving the SAM outcome, it didn't refine fine details. See the sketch below.
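A sketch of that iterative loop as I understand it, assuming the hypothetical `sample_prompt_points` helper from the earlier comment and a `SamPredictor` that already has the image set (the iteration and point counts are arbitrary defaults, not tuned values):

```python
import numpy as np

def iterative_refine(predictor, coarse_mask, n_iters=11, n_pos=8, n_neg=8):
    """Repeatedly prompt SAM with fresh point samples from the coarse mask plus the
    best low-resolution logits from the previous round."""
    mask_input = None
    best_mask = None
    for _ in range(n_iters):
        coords, labels = sample_prompt_points(coarse_mask, n_pos=n_pos, n_neg=n_neg)
        masks, scores, logits = predictor.predict(
            point_coords=coords,
            point_labels=labels,
            mask_input=mask_input,
            multimask_output=True,
        )
        best = int(np.argmax(scores))
        best_mask = masks[best]
        mask_input = logits[best][None, :, :]   # feed the best (1, 256, 256) logits back in
    return best_mask
```
|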
I would be interested in boundary refinement using a mask prompt. First, I tried using a bbox and the object was delineated with good accuracy. Any help with this, please? |
@Davidyao99 yes, precisely |
The mask prompt and bbox prompt need to be provided together to generate a proper mask.
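If I understand this correctly, that means passing both to a single `predict()` call; a minimal illustration, assuming a `SamPredictor` with the image already set, a placeholder XYXY box, and `low_res_mask` as a (1, 256, 256) float array (e.g. logits from a previous pass or a converted mask, as discussed further down):

```python
import numpy as np

# `predictor` is a SamPredictor with set_image() already called;
# `low_res_mask` is a (1, 256, 256) float array (assumptions for this sketch).
box = np.array([50, 40, 420, 380])      # placeholder XYXY pixel coordinates of the object
masks, scores, _ = predictor.predict(
    box=box,
    mask_input=low_res_mask,
    multimask_output=False,
)
```
|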
@antoniocandito , did you manage to make the mask input work? @GoGoPen that's really useful information, thanks for sharing! |
What is the proper way to pad the mask? Do you add the pad to the lower right, or do you center the mask in the target dimensions and add padding to the top, bottom, left, and right? |
Not sure if this is the proper way to get the mask output, but this is what I discovered... SamPredictor.predict() states that the mask_input should be a low-resolution 1×256×256 mask, typically coming from a previous prediction iteration.
Looking at a histogram of the values the model produces in that low-resolution output, they are floats that are strongly negative in the background and positive inside the mask. In Sam.py, the mask_threshold is hardcoded to 0.0. Thresholding the low-resolution output at that value, and looking at a thresholded and scaled ×128 version of it, shows it lines up with the expected mask. Since SAM pads the resized image to the bottom-right, the custom mask should be padded the same way. By making a custom mask_input that assigns roughly -8 to the background and 1 to the foreground, rather than 0 and 1, the input behaves much more like the model's own low-resolution output. |
Very good observations @markushunter! I'll try to add padding to the bottom-right and see if the results change. I don't fully understand the part about assigning values of -8 / 1. Do you mean that the binary mask values (0/1) should be replaced with -8 and 1 because of the 0.0 mask_threshold in Sam.py? Thanks for the info! 💙🖖 |
@cip8 Yes, instead of using 0 or 1 for the values in the mask, you need to represent the negative space with a number far less than zero. Since SAM thresholds the mask at the floating point value 0.0, having the negative space as 0.0 isn't good enough. The histogram seemed to imply that negative space in the output mask has values around -8 to -10, so I just ran with -8. |
The docs say that logits from a previous run can be used for this, and these logits are indeed floats. From what I understand they represent probabilities for the mask; do you know if that's accurate? |
Grayscale mask to SAM mask_input: Based on the info discussed so far, this is how I implemented a conversion between a grayscale mask and SAM's mask_input:
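The original snippet did not survive in this copy of the thread; what follows is only a minimal sketch reconstructed from the details above (256×256 low-resolution size, bottom-right padding, roughly -8 for background and a positive value for foreground). The function name and exact values are placeholders, not the author's code:

```python
import numpy as np
import cv2

def mask_to_mask_input(mask, neg_val=-8.0, pos_val=8.0, low_res=256):
    """Convert a grayscale/binary mask (same size as the original image) into a
    (1, 256, 256) pseudo-logit array for SamPredictor.predict(mask_input=...)."""
    binary = (mask > 0).astype(np.float32)
    h, w = binary.shape
    # SAM resizes the image so its longest side fits the model input and pads the
    # bottom-right; the low-resolution mask space follows the same layout at 256x256.
    scale = low_res / max(h, w)
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    resized = cv2.resize(binary, (new_w, new_h), interpolation=cv2.INTER_LINEAR)
    # Background must fall well below Sam's mask_threshold of 0.0, hence the -8.
    canvas = np.full((low_res, low_res), neg_val, dtype=np.float32)
    canvas[:new_h, :new_w] = np.where(resized > 0.5, pos_val, neg_val)
    return canvas[None, :, :]   # shape (1, 256, 256)
```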
Usage example:
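This example is also missing from the thread copy; roughly, assuming the hypothetical helper above and placeholder file paths:

```python
import cv2
from segment_anything import sam_model_registry, SamPredictor

image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)   # placeholder path
coarse_mask = cv2.imread("coarse_mask.png", 0)                        # HxW grayscale mask

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)
predictor.set_image(image)

mask_input = mask_to_mask_input(coarse_mask)
masks, scores, logits = predictor.predict(
    mask_input=mask_input,
    multimask_output=False,
)
# As noted elsewhere in the thread, results tend to improve when a point or box
# prompt is passed alongside mask_input rather than using the mask alone.
```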
Experimental findings
Improvement suggestions
|
Hi all, I am trying to use SAM to refine a cell segmentation foreground/background mask predicted by another model. I have tried the iterative approach and the grayscale-to-mask_input approach (as mentioned by @cip8), but neither helped. Could someone please guide me? All my images are grayscale with size (256, 256). |
Hi all, has anyone managed to solve this problem efficiently? |
Hi everyone, if you check this demo notebook, it is explained that the input mask is not such a mask: it is supposed to be the output low-resolution mask from a previous iteration (prediction).
So for now, it seems it is not possible to prompt with an accurate mask (or at least not with good results). Hope it helps!
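For reference, this is roughly the pattern the demo notebook describes: prompt with a point first, then feed the best low-resolution logits back into a second call (the point coordinates here are placeholders, and `predictor` is assumed to be a `SamPredictor` with the image already set):

```python
import numpy as np

input_point = np.array([[500, 375]])   # placeholder click on the object
input_label = np.array([1])

# First pass: prompt with a point and get low-resolution logits back.
masks, scores, logits = predictor.predict(
    point_coords=input_point,
    point_labels=input_label,
    multimask_output=True,
)
mask_input = logits[np.argmax(scores), :, :]   # best (256, 256) logit map

# Second pass: feed those logits back in alongside the same point prompt.
masks, _, _ = predictor.predict(
    point_coords=input_point,
    point_labels=input_label,
    mask_input=mask_input[None, :, :],
    multimask_output=False,
)
```
|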
Is there a law that says we are not allowed to "fake" these logits? 😃 So far in this conversation people have come up with different conclusions on how to replicate the behavior of these masks, where the threshold point is, etc. I don't think an answer that doesn't take the rest of the thread into consideration is helpful. Anyone can say "this can't be done", but that's not a real hacker mentality and rarely achieves anything. |
I think the problem resides in the "weight" associated with this extra mask parameter. Maybe the next version will put more importance on this param, and maybe accept sizes greater than 256×256 - this would make the model easier to include in existing image processing pipelines. As a trick to bypass this I extract a grid of points from the mask and pass it to SAM instead - the results are much better than the minor changes provided by using mask_input. I wish someone from Meta could clarify this for us 🙏 💙🖖
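A rough sketch of that grid trick, assuming a coarse HxW mask and a `SamPredictor` that already has the image set (the grid spacing and function name are arbitrary):

```python
import numpy as np

def grid_points_from_mask(mask, step=32):
    """Take every `step`-th pixel on a regular grid that falls inside the mask and
    use those as positive point prompts instead of (or alongside) mask_input."""
    ys, xs = np.mgrid[0:mask.shape[0]:step, 0:mask.shape[1]:step]
    ys, xs = ys.ravel(), xs.ravel()
    keep = mask[ys, xs] > 0
    coords = np.stack([xs[keep], ys[keep]], axis=1).astype(np.float32)   # (x, y) order
    labels = np.ones(len(coords), dtype=np.int32)
    return coords, labels

coords, labels = grid_points_from_mask(coarse_mask)
masks, scores, _ = predictor.predict(point_coords=coords, point_labels=labels,
                                     multimask_output=False)
```
|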
@cip8 apologies if my response was not to your liking or not what you were looking for. If you had taken the time to deeply read the paper and replicate the SAM architecture (not just read the docs...) you would understand the purpose of this mask_input better. Of course it is possible to replicate it; it is just coding and imitating. My point was about replicating it with the desired results. Once again, if you read this thread with all its comments you can check that nobody has gotten the "refinement" results that everyone (including me) was expecting. This is because mask_input is expected to be used in conjunction with a point prompt input (or box), not alone by itself. PS: if it were so straightforward, Meta would have released it for mask prompting... |
It's not about that @dankresio - every contribution is of course helpful and I appreciate your reply, truly! It just seemed to me that your answer didn't take into consideration what was discussed before, and I'm also quite easily triggered by "can't be done" types of answers 😅 I also apologize for my harsh reply 💙🖖 |
I wonder whether, instead of "extracting a grid of points from the mask and passing it to SAM", shrinking the mask prompt by a certain number of pixels (to avoid the sampled points later falling outside the ground-truth mask) and sampling a few points on the edge of the shrunken mask would provide better results. To me, it may constrain SAM in a way similar to a mask prompt. I might test this and report back if I do; others can update if anyone here has time to try it out.
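If anyone wants to try it, a sketch of that idea using OpenCV erosion (the shrink radius, point count, and function name are arbitrary, and `predictor`/`coarse_mask` are assumed as before):

```python
import numpy as np
import cv2

def edge_points_from_shrunk_mask(mask, shrink_px=5, n_points=16, rng=None):
    """Erode the mask by a few pixels, then sample points along the eroded boundary so
    every sampled point lands safely inside the original (ground-truth) mask."""
    rng = np.random.default_rng() if rng is None else rng
    kernel = np.ones((2 * shrink_px + 1, 2 * shrink_px + 1), np.uint8)
    shrunk = cv2.erode((mask > 0).astype(np.uint8), kernel)
    contours, _ = cv2.findContours(shrunk, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    edge = np.concatenate([c.reshape(-1, 2) for c in contours])   # already in (x, y) order
    idx = rng.choice(len(edge), size=min(n_points, len(edge)), replace=False)
    coords = edge[idx].astype(np.float32)
    return coords, np.ones(len(coords), dtype=np.int32)

coords, labels = edge_points_from_shrunk_mask(coarse_mask)
masks, _, _ = predictor.predict(point_coords=coords, point_labels=labels, multimask_output=False)
```
|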
It seems that converting self-made masks to logits was implemented in micro-sam for ellipse and polygon prompts, and it appears to work correctly. |
You guys might want to check out these two repositories and try creating some sort of pipeline stitching everything together: https://github.com/danielgatis/rembg/tree/main |
Does anyone know how to convert |
Hi, I have a roughly labeled dataset and I'm trying to feed its labels as prompts into SAM. I want SAM to refine the segmentation labels and improve my dataset quality. In my case I don't use any additional prompt artifacts like points or boxes (though SAM works pretty well with such prompts). It seems to me that a pure mask prompt should be supported as well, according to the paper, but the results I obtain are rather unreliable: the output mask mostly repeats the input one, sometimes even making it slightly worse. Is there a code snippet to build prompts out of foreign masks?
Thanks in advance!