
Issue: How can I load text embeddings from a local model? #9421

Closed
dm-2111 opened this issue Aug 17, 2023 · 4 comments
Labels
Ɑ: embeddings — Related to text embedding models module
🤖:question — A specific question about the codebase, product, project, or how to use a feature

Comments


dm-2111 commented Aug 17, 2023

Issue you'd like to raise.

I do not have access to huggingface.co in my environment, but I do have the Instructor model (hkunlp/instructor-large) saved locally. How do I point LangChain's HuggingFaceInstructEmbeddings class at a local model?

I tried the below code but received an error:

from langchain.embeddings import HuggingFaceInstructEmbeddings

model_name = "<local_filepath>/hkunlp/instructor-large"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': True}
hf = HuggingFaceInstructEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)

embeddings = HuggingFaceInstructEmbeddings(
    query_instruction="Represent the query for retrieval: "
)


Error:
HfHubHTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/api/models/hkunlp/instructor-large

Suggestion:

No response
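A side note on the 403 above: that URL shows the wrapper resolving the name against the Hugging Face Hub over the network. Hedged, since exact behaviour depends on the installed huggingface_hub / transformers versions: both libraries honour offline environment variables, which can be set before any model-loading imports so that nothing attempts to reach huggingface.co at all:

```python
import os

# Assumption: the installed huggingface_hub / transformers releases honour
# these flags. They must be set before either library is imported.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"
```

With these set, a name that cannot be resolved locally fails fast with a local error instead of a network 403, which makes misconfigured paths easier to spot.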

@dosubot added the Ɑ: embeddings and 🤖:question labels Aug 17, 2023

dosubot bot commented Aug 17, 2023

🤖

Hello,

Thank you for reaching out. Based on the information you've provided and similar issues in the LangChain repository, you can load a local model with the HuggingFaceInstructEmbeddings class by passing the local path as the model_name parameter.

Here's how you can do it:

from langchain.embeddings import HuggingFaceInstructEmbeddings

# Path to your local model
local_model_path = "<local_filepath>/hkunlp/instructor-large"

model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': True}

# Instantiate the HuggingFaceInstructEmbeddings class
hf = HuggingFaceInstructEmbeddings(
    model_name=local_model_path,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)

In this example, replace "<local_filepath>/hkunlp/instructor-large" with the actual path to your locally saved Instructor model.

Please note that the sentence_transformers and InstructorEmbedding Python packages must be installed.
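Under the hood (a sketch of the behaviour, not LangChain's actual code): sentence-transformers treats model_name as a local model when the path exists on disk, and only falls back to a Hub lookup otherwise. A minimal stdlib illustration of that decision:

```python
import os

def resolve_model_source(model_name: str) -> str:
    # Hypothetical helper mirroring how sentence-transformers decides
    # between a local directory and a Hub model id: an existing path is
    # loaded from disk, anything else triggers a huggingface.co lookup.
    if os.path.isdir(model_name):
        return "local"
    return "hub"

resolve_model_source(os.getcwd())               # existing directory -> "local"
resolve_model_source("hkunlp/instructor-large")  # bare model id -> "hub"
```

This is why the 403 appears: "hkunlp/instructor-large" on its own is not an existing path, so it is treated as a Hub id and resolved over the network.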

Please try this solution and let me know if it works for you or if you have any other questions.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.


dmyoung9 commented Aug 18, 2023

Based on #2667 (thanks @dosu-beta[bot] 🙏), it looks like in order for it to recognize that you're using a local model, you just need to give it the local path, extension and all. Looks like you're almost there 🤏
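Building on the point above about handing over the full local path: a quick stdlib sanity check can confirm that the folder you pass in actually looks like a saved model. This is a sketch only; the exact file names are an assumption based on how sentence-transformers typically saves models.

```python
import os

def looks_like_saved_model(path: str) -> bool:
    # Heuristic only: a sentence-transformers save directory usually holds
    # a config.json plus a modules.json manifest. Adjust for your model.
    expected = {"config.json", "modules.json"}
    try:
        return expected.issubset(os.listdir(path))
    except OSError:
        # Path doesn't exist or isn't a directory
        return False
```

Running this against the directory before constructing the embeddings class separates "wrong path" errors from genuine loading problems.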


dosubot bot commented Nov 17, 2023

Hi, @dm-2111! I'm Dosu, and I'm helping the LangChain team manage our backlog. I wanted to let you know that we are marking this issue as stale.

Based on my understanding, you encountered a 403 Forbidden error when using the HuggingFaceInstructEmbeddings function with a local model. I suggested passing the local path to the model_name parameter, and user "dmyoung9" mentioned that the local path should include the extension. However, there haven't been any further updates or resolutions provided.

Before we close this issue, we wanted to check if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your understanding and contribution to the LangChain project!

@dosubot added the stale label (issue has not had recent activity or appears to be solved) Nov 17, 2023
@dosubot closed this as not planned (won't fix, can't repro, duplicate, stale) Nov 24, 2023
@dosubot removed the stale label Nov 24, 2023

MiCodes2 commented Mar 25, 2024

I am facing a similar issue. I am using LangChain 0.15, and I can't upgrade to the latest version due to compatibility issues with other libraries.
I am using the following code:

import torch
from sentence_transformers import SentenceTransformer
from langchain.embeddings import HuggingFaceEmbeddings

# Load the locally saved SentenceTransformer model
modelPath = r"******\MiniLM"
model = SentenceTransformer(modelPath)

# Set the device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Initialize the embeddings
embeddings = HuggingFaceEmbeddings(model=model, model_kwargs={'device': device})
