-
Notifications
You must be signed in to change notification settings - Fork 2.2k
feat: add local_files_only
to sentence-transformers embedders
#9400
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add local_files_only
to sentence-transformers embedders
#9400
Conversation
local_files_only
to sentence-transformers embedders
Pull Request Test Coverage Report for Build 15117750305Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
Hey @jantrienes, thanks for the PR. I would say that there is no need to introduce a new parameter since we already provide This works as expected and raises an error if the model is not available locally: from haystack.components.embedders import SentenceTransformersTextEmbedder
text_embedder = SentenceTransformersTextEmbedder(model="sentence-transformers/all-mpnet-base-v2",
config_kwargs={"local_files_only":True})
text_embedder.warm_up() However, I agree that this feature is a bit hidden. What do you think? |
Hi @anakin87, thanks a lot! We initially used After closer inspection, we found that other parts of While these paths are guarded by Given that |
Thanks. I'll take a better look then. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your clarification, that is correct.
I agree with this change and only left minor comments on the wording.
haystack/components/embedders/sentence_transformers_document_embedder.py
Outdated
Show resolved
Hide resolved
haystack/components/embedders/sentence_transformers_text_embedder.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Thanks!
Related Issues
None, however a similar issue came up in the past: #1256
Proposed Changes:
Pass
local_files_only
argument toSentenceTransformer
to allow use of embedders in offline environments. Whenlocal_files_only=True
and the model is available in cache, no outside networking requests are made. When the model is not available in cache, an exception is raised. See offline mode docs of Huggingface.Note that for proper offline use, passing
local_files_only=True
is necessary even if the model is available in cache, as there are other places where the flag is used inSentenceTransformer
(e.g., here).How did you test it?
Unit tests. Additionally, following snippet demonstrates the desired behavior:
Notes for the reviewer
Checklist
fix:
,feat:
,build:
,chore:
,ci:
,docs:
,style:
,refactor:
,perf:
,test:
and added!
in case the PR includes breaking changes.