Pipeline support for image similarity #20830

Closed
sayakpaul opened this issue Dec 19, 2022 · 10 comments

@sayakpaul
Member

sayakpaul commented Dec 19, 2022

Feature request

Given that we have a tutorial notebook on image similarity and an upcoming blog post, and given the usefulness of the use case, it's time we added a pipeline for this task.

Motivation

Image similarity is an important use case in the industry.

Your contribution

Happy to contribute the pipeline.

The following describes some of the design decisions I have in mind for this pipeline.

By default, we provide the most downloaded image classification model (trained on ImageNet-1k).

Image inputs to the __call__() of the pipeline would be similar to those of an ImageClassificationPipeline, except that the input needs to be a list of two images / URLs, etc.

We return a matrix quantifying the similarity scores (cosine similarity) between all the input images.

We might want to also provide recommendations to the users when using this pipeline. For example, the input images would need to be provided in accordance with the provided model. If you're using a model that was pre-trained / fine-tuned on medical images then there's no point in passing images of cats and dogs to compute similarity over.
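
As a rough illustration of the similarity matrix mentioned above, here is a minimal sketch in plain Python. It assumes each image has already been turned into an embedding vector by some model; the function names `cosine_similarity` and `similarity_matrix` are hypothetical and not part of any library API.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings):
    """Pairwise cosine similarities; entry [i][j] compares image i with image j."""
    return [[cosine_similarity(u, v) for v in embeddings] for u in embeddings]


# Hypothetical embeddings standing in for model outputs (assumption, not real data).
toy_embeddings = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]]
matrix = similarity_matrix(toy_embeddings)
```

The matrix is symmetric with ones on the diagonal, so for the two-image case only the off-diagonal entry carries information.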

Related: huggingface/huggingface.js#338

@sayakpaul
Member Author

Ccing @NielsRogge @osanseviero @nateraw

@nateraw
Contributor

nateraw commented Jan 4, 2023

We don't currently have a text-similarity/sentence-similarity pipeline either, right? I think for that task, folks use the feature-extraction pipeline to get embeddings, then just compute the similarity. Here's an example.

So, with that in mind, maybe the pipeline could be an equivalent image-feature-extraction for vision?
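
For reference, the text workflow described above might look roughly like the sketch below. It assumes the `feature-extraction` pipeline returns, for a single string, a nested list of shape (1, sequence length, hidden size), and that mean pooling over tokens is an acceptable way to get one vector per text. The helper names and the default model name are illustrative assumptions, not something prescribed in this thread.

```python
import math


def mean_pool(token_vectors):
    """Average per-token vectors (seq_len x hidden) into a single vector."""
    count = len(token_vectors)
    return [sum(column) / count for column in zip(*token_vectors)]


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def text_similarity(text_a, text_b, model_name="sentence-transformers/all-MiniLM-L6-v2"):
    """Embed two texts with the feature-extraction pipeline and compare them.

    The model name is an assumption chosen for illustration only.
    """
    # Imported lazily so the pure helpers above work without transformers installed.
    from transformers import pipeline

    extractor = pipeline("feature-extraction", model=model_name)
    # extractor(text) has a leading batch dimension of 1, hence the [0].
    vec_a = mean_pool(extractor(text_a)[0])
    vec_b = mean_pool(extractor(text_b)[0])
    return cosine(vec_a, vec_b)
```

An image equivalent would follow the same shape: extract features, pool, then compare.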

Unfortunately, the name '-feature extractor' is quite confusing since that's what the image processing utils are called still (I think?).

@sayakpaul
Member Author

> So, with that in mind, maybe the pipeline could be an equivalent image-feature-extraction for vision?

Yes, let's do that!

> Unfortunately, the name '-feature extractor' is quite confusing since that's what the image processing utils are called still (I think?).

@amyeroberts worked on porting the image feature extractors to XXXImageProcessor (example). We also throw a warning when users call XXXFeatureExtractor from the library.

With that in mind, image-feature-extraction does seem alright to me.

@nateraw
Contributor

nateraw commented Jan 4, 2023

Even with it being legacy, I'm slightly concerned this may become confusing to some users. I'll let some others weigh in here!

I can live with image-feature-extraction if nobody else vetoes.

@osanseviero
Member

Afaik feature-extraction already works for image feature extraction as well (see https://huggingface.co/google/vit-base-patch16-224-in21k for example)

@sayakpaul
Member Author

> Afaik feature-extraction already works for image feature extraction as well (see https://huggingface.co/google/vit-base-patch16-224-in21k for example)

I need to try it out to verify if it works with the feature extraction pipeline. Will confirm soon.

@NielsRogge
Contributor

NielsRogge commented Jan 4, 2023

Yes, I'm not sure there's a need for a new image-feature-extraction pipeline; one can already leverage the feature-extraction pipeline.

@sayakpaul
Member Author

I verified the feature-extraction pipeline, and it seems like it always assumes the preprocessing will use a tokenizer as opposed to an image processor:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-55a624159a32> in <module>
      1 image_one = "https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png"
      2 
----> 3 image_feature_extractor(image_one)

3 frames
/usr/local/lib/python3.8/dist-packages/transformers/pipelines/feature_extraction.py in __call__(self, *args, **kwargs)
    103             A nested list of `float`: The features computed by the model.
    104         """
--> 105         return super().__call__(*args, **kwargs)

/usr/local/lib/python3.8/dist-packages/transformers/pipelines/base.py in __call__(self, inputs, num_workers, batch_size, *args, **kwargs)
   1072             return self.iterate(inputs, preprocess_params, forward_params, postprocess_params)
   1073         else:
-> 1074             return self.run_single(inputs, preprocess_params, forward_params, postprocess_params)
   1075 
   1076     def run_multi(self, inputs, preprocess_params, forward_params, postprocess_params):

/usr/local/lib/python3.8/dist-packages/transformers/pipelines/base.py in run_single(self, inputs, preprocess_params, forward_params, postprocess_params)
   1078 
   1079     def run_single(self, inputs, preprocess_params, forward_params, postprocess_params):
-> 1080         model_inputs = self.preprocess(inputs, **preprocess_params)
   1081         model_outputs = self.forward(model_inputs, **forward_params)
   1082         outputs = self.postprocess(model_outputs, **postprocess_params)

/usr/local/lib/python3.8/dist-packages/transformers/pipelines/feature_extraction.py in preprocess(self, inputs, **tokenize_kwargs)
     77     def preprocess(self, inputs, **tokenize_kwargs) -> Dict[str, GenericTensor]:
     78         return_tensors = self.framework
---> 79         model_inputs = self.tokenizer(inputs, return_tensors=return_tensors, **tokenize_kwargs)
     80         return model_inputs
     81 

TypeError: 'NoneType' object is not callable

The design choice seems reasonable if image feature extraction was not considered when the pipeline was written.
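
To make the failure mode concrete, here is a toy sketch (not the actual transformers code) of why the traceback ends in `'NoneType' object is not callable`, plus one hypothetical way a `preprocess` step could fall back to an image processor. Both class names and the `image_processor` argument are assumptions made purely for illustration.

```python
class ToyFeatureExtractionPipeline:
    """Mimics the failing step: preprocess always calls the tokenizer."""

    def __init__(self, tokenizer=None, image_processor=None):
        self.tokenizer = tokenizer
        self.image_processor = image_processor

    def preprocess(self, inputs):
        # With a vision checkpoint no tokenizer is loaded, so this calls None.
        return self.tokenizer(inputs)


class ToyImageAwarePipeline(ToyFeatureExtractionPipeline):
    """Hypothetical variant that dispatches on whichever preprocessor exists."""

    def preprocess(self, inputs):
        if self.tokenizer is not None:
            return self.tokenizer(inputs)
        if self.image_processor is not None:
            return self.image_processor(inputs)
        raise ValueError("no tokenizer or image processor available")


# Stand-in callable for an image processor (assumption; real ones return tensors).
fake_image_processor = lambda image: {"pixel_values": image}
pipe = ToyImageAwarePipeline(image_processor=fake_image_processor)
model_inputs = pipe.preprocess("parrots.png")
```

Whether such dispatching belongs in the existing pipeline or in a separate image-feature-extraction pipeline is exactly the open question in this thread.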

Here's my Colab Notebook.

@nateraw @NielsRogge @osanseviero

@github-actions

github-actions bot commented Feb 7, 2023

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@sayakpaul
Member Author

Closing this as we internally deprioritized this pipeline.
