
For OWL-ViT, is there a demo that shows how to use image patches as queries for one-shot detection? #325

Closed
Edwardmark opened this issue May 19, 2022 · 11 comments

Comments

@Edwardmark

Hi, thanks for your great work. The text zero-shot demo is amazing.
For OWL-ViT, is there a demo that shows how to use image patches as queries for one-shot detection?
Thanks.

@mjlm
Collaborator

mjlm commented May 20, 2022

Hi, we're actively working on this demo and will let you know when it's available, hopefully some time next week.

@Edwardmark
Author

@mjlm Also, what prompts are used in the COCO evaluation? The paper says it uses the seven best prompts, so what are the seven best text prompts? Thanks.

@AlexeyG
Collaborator

AlexeyG commented May 27, 2022

The prompts can be found in the CLIP repository. During inference we used the 7 ensembling prompts from the colab.
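For illustration only, here is a rough sketch of prompt ensembling as it is commonly done with CLIP-style text encoders. The template strings below are placeholders (the actual seven prompts are the ones listed in the CLIP repository and the colab), and `encode_text` is a hypothetical stand-in for the model's text encoder:

```python
import numpy as np

# Placeholder templates: the real seven ensembling prompts are listed in the
# CLIP repository and in the OWL-ViT colab.
TEMPLATES = [
    "itap of a {}.",
    "a photo of the {}.",
    "a photo of a {}.",
]

def ensemble_text_embedding(class_name, encode_text):
    """Embed a class name with every template, then average and re-normalize.
    `encode_text` is a stand-in for the model's text encoder (str -> 1-D array)."""
    embs = np.stack([encode_text(t.format(class_name)) for t in TEMPLATES])
    embs /= np.linalg.norm(embs, axis=-1, keepdims=True)  # unit-normalize each prompt embedding
    mean = embs.mean(axis=0)
    return mean / np.linalg.norm(mean)  # unit-normalize the ensembled embedding
```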

@AlexeyG AlexeyG closed this as completed May 27, 2022
@stevebottos

Is this still in the works? I've been interested in seeing how image input queries could be used as well.

@xishanhan

Hi, we're actively working on this demo and will let you know when it's available, hopefully some time next week.

Hi, is this one-shot detection demo finished? I'm also very interested in it and would like to try it.

@mjlm mjlm reopened this Jun 13, 2022
@mjlm
Collaborator

mjlm commented Jun 13, 2022

We're still working on this and will let you know here when the demo is ready. I re-opened the issue to keep track.

@xishanhan

We're still working on this and will let you know here when the demo is ready. I re-opened the issue to keep track.

That would be very nice, thank you!

@mjlm
Collaborator

mjlm commented Jun 22, 2022

We just added a Playground Colab with an interactive demo of both text-conditioned and image-conditioned detection:

OWL-ViT text inference demo OWL-ViT image inference demo

The underlying code illustrates how to extract an embedding for a given image patch, specifically here: https://github.com/google-research/scenic/blob/main/scenic/projects/owl_vit/notebooks/inference.py#L110-L131
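For a rough idea of what that step does, here is an illustrative sketch only (not the project's code; `class_embeddings`, `pred_boxes`, and the helper names are hypothetical, and the canonical implementation is in the linked inference.py). It picks the per-token class embedding whose predicted box best overlaps the user-specified query box:

```python
import numpy as np

def box_iou(a, b):
    """IoU of two boxes given as [x0, y0, x1, y1]."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x1 - x0) * max(0.0, y1 - y0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def get_query_embedding(class_embeddings, pred_boxes, query_box):
    """Select the class embedding of the predicted box that best overlaps the
    query box on the source image, and unit-normalize it for cosine scoring."""
    ious = np.array([box_iou(b, query_box) for b in pred_boxes])
    best = int(np.argmax(ious))
    emb = class_embeddings[best]
    return emb / (np.linalg.norm(emb) + 1e-9)
```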

Let us know if you have any questions!

@xishanhan

We just added a Playground Colab with an interactive demo of both text-conditioned and image-conditioned detection:

OWL-ViT text inference demo OWL-ViT image inference demo

The underlying code illustrates how to extract an embedding for a given image patch, specifically here: https://github.com/google-research/scenic/blob/main/scenic/projects/owl_vit/notebooks/inference.py#L110-L131

Let us know if you have any questions!

Thanks for your reply! I don't have any problems now.

@AlexeyG AlexeyG closed this as completed Jul 1, 2022
@BIGBALLON

Hi, @mjlm , thanks for your great work!

I wonder if there are any plans to implement multi-query image-conditioned detection.

A single query image is often unable to capture all the features of an object, and using multiple query images to represent it can yield better results.

Thanks again!

@mjlm
Collaborator

mjlm commented Sep 26, 2023

You can simply average the embeddings of multiple boxes to get a query embedding. This is how we implemented few-shot (i.e. more than one-shot) detection in the paper.

#890 will add example code for image-conditioned detection to the colab. The example shows how to get a query_embedding from the class_embeddings of the source (query) image. If you have e.g. two query embeddings representing the same object, you can simply do two_shot_query_embedding = (query_embedding_1 + query_embedding_2) / 2. This simple method worked for us. Another option would be to keep the embeddings separate, but map them to the same class after classification.
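As a minimal sketch of that averaging (assuming the per-box query embeddings have already been extracted as above; re-normalizing after the mean is an assumption for cosine-style scoring, not necessarily what the paper does):

```python
import numpy as np

def average_query_embeddings(query_embeddings):
    """Average several per-box query embeddings into a single few-shot query."""
    mean = np.mean(np.stack(query_embeddings), axis=0)
    return mean / (np.linalg.norm(mean) + 1e-9)  # re-normalization is an assumption

# Two-shot example, equivalent to (query_embedding_1 + query_embedding_2) / 2
# up to the final normalization:
# two_shot_query_embedding = average_query_embeddings(
#     [query_embedding_1, query_embedding_2])
```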
