SOTA Model for Text Prompt Segmentation #575
Comments
I recommend this: https://github.com/luca-medeiros/lang-segment-anything
Thank you for the recommendation. However, I have tried it and found that it is just an easier-to-read version of Grounded-Segment-Anything. It uses the same method: GroundingDINO translates the text prompt into a box prompt, which is then sent to SAM, so the results are similar to those of the Grounded-Segment-Anything approach mentioned earlier. I believe that a model that segments directly from the text prompt (rather than invoking two stages) is necessary to address the issue at hand and facilitate broader downstream applications.
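For readers following along, the two-stage flow described in the comment above can be sketched roughly as follows. This is only an illustration of the control flow, not the actual code of Grounded-Segment-Anything or lang-segment-anything: `two_stage_segment`, `toy_detector`, and `toy_segmenter` are hypothetical stand-ins, and the toy "models" are deliberately crude placeholders for GroundingDINO and SAM.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive
Image = List[List[int]]          # toy grayscale image, indexed image[y][x]
Mask = List[List[bool]]          # boolean mask, indexed mask[y][x]


def two_stage_segment(
    image: Image,
    text: str,
    detect_boxes: Callable[[Image, str], List[Box]],
    segment_box: Callable[[Image, Box], Mask],
) -> List[Mask]:
    """Stage 1 turns text into boxes; stage 2 turns each box into a mask.

    Note that stage 2 never sees the text prompt, only the box.
    """
    boxes = detect_boxes(image, text)
    return [segment_box(image, box) for box in boxes]


def toy_detector(image: Image, text: str) -> List[Box]:
    """Hypothetical stand-in for a text-conditioned detector."""
    if text != "lane line":
        return []
    points = [(x, y) for y, row in enumerate(image)
              for x, value in enumerate(row) if value > 0]
    if not points:
        return []
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Tight bounding box around all bright pixels.
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]


def toy_segmenter(image: Image, box: Box) -> Mask:
    """Hypothetical stand-in for a box-prompted segmenter: fills the box."""
    x0, y0, x1, y1 = box
    return [[x0 <= x < x1 and y0 <= y < y1 for x in range(len(row))]
            for y, row in enumerate(image)]


# A thin diagonal "lane line" on a 3x3 image.
image = [[1, 0, 0],
         [0, 1, 0],
         [0, 0, 1]]
masks = two_stage_segment(image, "lane line", toy_detector, toy_segmenter)
```

In this sketch the final mask depends entirely on the box handed over by stage 1, which is one way to see why thin, elongated targets such as lane lines are awkward for a box-based hand-off.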
Do you have any good solutions? I'm facing the same problem now.
@TerryYiDa No. So, I hope this issue can track the progress of advanced text-prompt segmentation models.
I have the same problem. Did you find a solution?
Lol, really wish it was possible to open up the ability to use text prompts. A two-stage approach like Grounded-Segment-Anything is neither useful nor elegant.😣
Has anyone made progress with this issue?
I am looking for a state-of-the-art (SOTA) model for text prompt segmentation. Currently, I am aware of two choices: Grounded-Segment-Anything and SEEM. However, both of these models fail to meet my requirements.
Consider the following example: I want the model to segment the lane lines on the road, but the results from the aforementioned methods are as follows:
Grounded-Segment-Anything:
![image](https://private-user-images.githubusercontent.com/33119031/270579943-89883c7f-d5de-41ae-89d0-a79f10680a17.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjMzOTc3MDMsIm5iZiI6MTcyMzM5NzQwMywicGF0aCI6Ii8zMzExOTAzMS8yNzA1Nzk5NDMtODk4ODNjN2YtZDVkZS00MWFlLTg5ZDAtYTc5ZjEwNjgwYTE3LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA4MTElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwODExVDE3MzAwM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTg5NzZmZjhjZjliNjhmM2E3NTE0NmI4OWI5NGUxZTQ3MTNlNDU4ZTVlOWZhMGY4ZWVhODgyOGI3YjU5NjM2ZjImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.b72dMWp3M5cvG8sGkuhsenicHBAg2491tSvrlE_WipE)
SEEM Model:
![image](https://private-user-images.githubusercontent.com/33119031/270579442-9c724f67-643f-4071-92fa-acf61cdead98.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjMzOTc3MDMsIm5iZiI6MTcyMzM5NzQwMywicGF0aCI6Ii8zMzExOTAzMS8yNzA1Nzk0NDItOWM3MjRmNjctNjQzZi00MDcxLTkyZmEtYWNmNjFjZGVhZDk4LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA4MTElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwODExVDE3MzAwM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWIxMGI5ZmE0MTVkODVlNGIyOTNiNWE2YmY1MDYwMGY2Y2MwMWZmNjZlMmNiZTRmMzMxNGYzYjVlY2M1NDQxMWImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.k2X5e65b6J4K8k7_qqJNDIYB0xJXnv2WDbMp6cNMeXM)
Unfortunately, neither of them can solve this problem effectively. I would greatly appreciate any recommendations you may have.
Any information regarding the timeline for the release of SAM text-prompt capabilities would be welcome.