Pipeline support for image similarity #20830
Comments
Ccing @NielsRogge @osanseviero @nateraw |
We don't currently have a So, with that in mind, maybe the pipeline could be an equivalent Unfortunately, the name '-feature extractor' is quite confusing since that's what the image processing utils are called still (I think?). |
Yes, let's do that!
@amyeroberts worked on porting the image feature extractors to With that in mind, |
Even with it being legacy, I'm slightly concerned this may become confusing to some users. I'll let some others weigh in here! I can live with |
Afaik |
I need to try it out to verify if it works with the feature extraction pipeline. Will confirm soon. |
Yes I'm not sure there's a need for a new |
I verified the feature-extraction pipeline, and it seems like it always assumes the preprocessing will use a tokenizer as opposed to an image processor:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-10-55a624159a32> in <module>
1 image_one = "https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png"
2
----> 3 image_feature_extractor(image_one)
3 frames
/usr/local/lib/python3.8/dist-packages/transformers/pipelines/feature_extraction.py in __call__(self, *args, **kwargs)
103 A nested list of `float`: The features computed by the model.
104 """
--> 105 return super().__call__(*args, **kwargs)
/usr/local/lib/python3.8/dist-packages/transformers/pipelines/base.py in __call__(self, inputs, num_workers, batch_size, *args, **kwargs)
1072 return self.iterate(inputs, preprocess_params, forward_params, postprocess_params)
1073 else:
-> 1074 return self.run_single(inputs, preprocess_params, forward_params, postprocess_params)
1075
1076 def run_multi(self, inputs, preprocess_params, forward_params, postprocess_params):
/usr/local/lib/python3.8/dist-packages/transformers/pipelines/base.py in run_single(self, inputs, preprocess_params, forward_params, postprocess_params)
1078
1079 def run_single(self, inputs, preprocess_params, forward_params, postprocess_params):
-> 1080 model_inputs = self.preprocess(inputs, **preprocess_params)
1081 model_outputs = self.forward(model_inputs, **forward_params)
1082 outputs = self.postprocess(model_outputs, **postprocess_params)
/usr/local/lib/python3.8/dist-packages/transformers/pipelines/feature_extraction.py in preprocess(self, inputs, **tokenize_kwargs)
77 def preprocess(self, inputs, **tokenize_kwargs) -> Dict[str, GenericTensor]:
78 return_tensors = self.framework
---> 79 model_inputs = self.tokenizer(inputs, return_tensors=return_tensors, **tokenize_kwargs)
80 return model_inputs
81
TypeError: 'NoneType' object is not callable
The design choice seems reasonable in case image feature extraction was not considered. Here's my Colab Notebook. |
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored. |
Closing this as we internally deprioritized this pipeline. |
Feature request
Given that we have a tutorial notebook on image similarity and an upcoming blog post, and given the usefulness of the use case, it's time we added a pipeline for this task.
Motivation
Image similarity is an important use case in the industry.
Your contribution
Happy to contribute the pipeline.
The following describes some of the design decisions I had in mind for this pipeline.
By default, we provide the most downloaded image classification model (trained on ImageNet-1k).
Image inputs to the __call__() of the pipeline would be similar to an ImageClassificationPipeline, except that the input needs to be a list of two images / URLs, etc.
We return a matrix quantifying the similarity scores (cosine similarity) between all the input images.
We might also want to provide recommendations to users of this pipeline. For example, the input images need to match the domain of the provided model. If you're using a model that was pre-trained / fine-tuned on medical images, then there's no point in passing images of cats and dogs to compute similarity over.
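To make the proposed return value concrete, here is a minimal sketch of the similarity matrix step only. It assumes each input image has already been turned into an embedding vector by some model; the function name cosine_similarity_matrix and the example vectors are hypothetical and not part of any existing pipeline API.

```python
import math
from typing import List, Sequence


def cosine_similarity_matrix(embeddings: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return an N x N matrix of pairwise cosine similarities."""
    # Pre-compute the Euclidean norm of every embedding once.
    norms = [math.sqrt(sum(x * x for x in vec)) for vec in embeddings]
    if any(norm == 0.0 for norm in norms):
        raise ValueError("cosine similarity is undefined for a zero vector")

    matrix = []
    for vec_a, norm_a in zip(embeddings, norms):
        row = []
        for vec_b, norm_b in zip(embeddings, norms):
            dot = sum(a * b for a, b in zip(vec_a, vec_b))
            row.append(dot / (norm_a * norm_b))
        matrix.append(row)
    return matrix


# Hypothetical embeddings standing in for the model outputs of two images.
image_embeddings = [
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
]
scores = cosine_similarity_matrix(image_embeddings)
```

The matrix is symmetric with values close to 1 on the diagonal, so for two images the entry of interest is scores[0][1]. A real pipeline would likely compute this on batched tensors rather than Python lists; that choice is left open here.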
Related: huggingface/huggingface.js#338