Issue: HuggingFaceEmbeddings can not take trust_remote_code argument #6080

Open
jackfrost1411 opened this issue Jun 13, 2023 · 18 comments

Comments

@jackfrost1411
Contributor

jackfrost1411 commented Jun 13, 2023

HuggingFaceEmbeddings cannot accept the trust_remote_code argument.

image

Suggestion:

No response

@damingerdai

damingerdai commented Aug 10, 2023

I am also running into the same issue.

@damingerdai

I fixed it with this code:

from langchain.llms import HuggingFacePipeline
from langchain.memory.buffer import ConversationBufferMemory
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline, AutoModelForSeq2SeqLM

# trust_remote_code is passed through model_kwargs alongside the generation settings.
local_llm = HuggingFacePipeline.from_model_id(
    model_id="chatglm-6b-int4",
    task="text-generation",
    model_kwargs={
        "temperature": 0,
        "max_length": 64,
        "trust_remote_code": True,  # the line that was added
    },
)
print(local_llm("What is the capital of France? "))

@nbbaier
Contributor

nbbaier commented Oct 29, 2023

The above solution doesn't work for HuggingFaceEmbeddings:

from langchain.embeddings import HuggingFaceEmbeddings

model_name = "jinaai/jina-embeddings-v2-small-en"
model_kwargs = {"device": "cpu", "trust_remote_code": True}
encode_kwargs = {
    "normalize_embeddings": False,
}
hf = HuggingFaceEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs,
)
text = "This is a test document."

res = hf.embed_query(text)

print(len(res))

Results in the following error:

Traceback (most recent call last):
  File "/Users/nbbaier/grubby-galaxy/main.py", line 9, in <module>
    hf = HuggingFaceEmbeddings(
         ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nbbaier/.pyenv/versions/3.11.0/lib/python3.11/site-packages/langchain/embeddings/huggingface.py", line 66, in __init__
    self.client = sentence_transformers.SentenceTransformer(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: SentenceTransformer.__init__() got an unexpected keyword argument 'trust_remote_code'
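
The traceback suggests that the `model_kwargs` dict is unpacked into the `SentenceTransformer` constructor, so a key that the installed version does not declare raises a `TypeError`. A minimal sketch of that mechanism follows. It uses hypothetical stand-in classes (`OldStyleModel`, `NewStyleModel`) instead of the real library, and a hypothetical helper (`accepts_keyword`) built on the standard `inspect` module to check a constructor before forwarding kwargs to it:

```python
import inspect


class OldStyleModel:
    """Hypothetical stand-in for a constructor without trust_remote_code."""

    def __init__(self, model_name_or_path=None, device=None):
        self.model_name_or_path = model_name_or_path
        self.device = device


class NewStyleModel:
    """Hypothetical stand-in for a constructor that declares the argument."""

    def __init__(self, model_name_or_path=None, device=None, trust_remote_code=False):
        self.model_name_or_path = model_name_or_path
        self.device = device
        self.trust_remote_code = trust_remote_code


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    param = params.get(name)
    if param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs catch-all also accepts arbitrary keyword names.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


model_kwargs = {"device": "cpu", "trust_remote_code": True}

# Forwarding an undeclared keyword reproduces the kind of error shown above.
try:
    OldStyleModel("some-model", **model_kwargs)
    old_style_error = None
except TypeError as exc:
    old_style_error = str(exc)

# The same kwargs are accepted once the constructor declares the argument.
new_style = NewStyleModel("some-model", **model_kwargs)

print(accepts_keyword(OldStyleModel, "trust_remote_code"))
print(accepts_keyword(NewStyleModel, "trust_remote_code"))
print(old_style_error)
```

Against a real installation, the equivalent check would be `accepts_keyword(SentenceTransformer, "trust_remote_code")` after importing `SentenceTransformer` from `sentence_transformers`; that call is not run here.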

@GlitChed2k2

Facing the same issue! Any solution for this?

@gururise
Contributor

gururise commented Nov 9, 2023

Has anyone found a solution to this? It seems like we should be able to pass 'trust_remote_code' into the kwargs of HuggingFaceEmbeddings.

@Nafay-0

Nafay-0 commented Nov 9, 2023

from langchain.embeddings import HuggingFaceEmbeddings
from transformers import AutoModel
model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-base-en', trust_remote_code=True) 

model_name = "jinaai/jina-embeddings-v2-base-en"
model_kwargs = {'device': 'cpu'}
embeddings = HuggingFaceEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
)

I found the workaround above for the problem: loading the embedding model via the Transformers AutoModel class.

@christophfroeschl

christophfroeschl commented Nov 14, 2023

The workaround is good, but a fix for the underlying issue (passing trust_remote_code into the kwargs of HuggingFaceEmbeddings) would be welcome.

@mallapraveen

Is this fixed?

@jcgeo9

jcgeo9 commented Dec 14, 2023

Same issue here, still waiting for the fix :)

@manjunathshiva
Contributor

Same issue! A fix is needed for this.

@weissenbacherpwc

weissenbacherpwc commented Jan 28, 2024

So is the performance the same when loading the embedding model with:

from transformers import AutoModel
model = AutoModel.from_pretrained('PATH_TO_LOCAL_EMBEDDING_MODEL_FOLDER', trust_remote_code=True)

instead of:

from langchain.embeddings import HuggingFaceEmbeddings

model_name = "PATH_TO_LOCAL_EMBEDDING_MODEL_FOLDER"
model_kwargs = {'device': 'cpu'}
embeddings = HuggingFaceEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
)

I found that some embeddings have slightly different values, so support for "trust_remote_code=True" would be appreciated!
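
One way to quantify how far two embeddings of the same text diverge is to compute their cosine similarity and largest element-wise difference. The sketch below uses only the standard library. The helper names (`cosine_similarity`, `max_abs_difference`) are hypothetical, and the two vectors are made-up placeholders, not real model output; in practice they would come from the two loading paths being compared.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def max_abs_difference(a, b):
    """Largest absolute element-wise difference between two vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


# Placeholder values for illustration only (assumed, not measured).
embedding_a = [0.12, -0.40, 0.33, 0.05]
embedding_b = [0.12, -0.41, 0.33, 0.06]

print("cosine similarity:", cosine_similarity(embedding_a, embedding_b))
print("max abs difference:", max_abs_difference(embedding_a, embedding_b))
```

A similarity very close to 1 with a small maximum difference would point to numerical noise, while a larger gap would suggest the two paths are not running the same model code.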

@hidacow

hidacow commented Feb 20, 2024

pip install -U sentence-transformers

Updating sentence-transformers to >=2.3.1 works for me.
It seems trust_remote_code was not introduced into the constructor of SentenceTransformer before 2.3.
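
To check whether an environment meets that version, the installed package version can be read with the standard `importlib.metadata` module. This is a rough sketch: the 2.3.1 threshold is taken from the comment above and has not been checked against the release notes, and the helper names (`parse_version`, `meets_minimum`, `installed_sentence_transformers_version`) are hypothetical.

```python
from importlib import metadata


def parse_version(text):
    """Parse leading numeric components, e.g. '2.3.1' -> (2, 3, 1)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if len(digits) != len(piece):
            # Stop at suffixes such as 'rc1' or 'dev0'.
            break
    return tuple(parts)


def meets_minimum(installed, minimum="2.3.1"):
    """Return True if `installed` is at least `minimum` (numeric parts only)."""
    return parse_version(installed) >= parse_version(minimum)


def installed_sentence_transformers_version():
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version("sentence-transformers")
    except metadata.PackageNotFoundError:
        return None


installed = installed_sentence_transformers_version()
if installed is None:
    print("sentence-transformers is not installed")
elif meets_minimum(installed):
    print(f"sentence-transformers {installed} meets the assumed minimum of 2.3.1")
else:
    print(f"sentence-transformers {installed} is older than 2.3.1; consider upgrading")
```

The hand-rolled parser ignores pre-release suffixes; where the `packaging` library is available, its `Version` class is a more robust way to compare versions.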

@shimonl20

shimonl20 commented Feb 20, 2024

This worked for me...

model_kwargs = {'trust_remote_code': True}
embeddings = HuggingFaceEmbeddings(model_name="jinaai/jina-embeddings-v2-base-en",
                                       model_kwargs=model_kwargs)

@weissenbacherpwc

Are you using the LangChain HuggingFaceEmbeddings class here?

@shimonl20

Yes

@KasemOmary

KasemOmary commented Feb 26, 2024

embedding_model = HuggingFaceEmbeddings(model_name="whatever-model-you-are-using", model_kwargs={"trust_remote_code":True})

Worked fine for me. As mentioned earlier, be sure to update the versions being used. I've used transformer 2.4 models with no issues.

@ps360pa

ps360pa commented Mar 29, 2024

pip install -U sentence-transformers==2.4.0
It works with 2.4.0, but with sentence-transformers 2.5.1 I get an error.

@ramamimu

ramamimu commented Apr 3, 2024

pip install -U sentence-transformers==2.4.0

I am still facing another error, which looks similar:
TypeError: __init__() takes 1 positional argument but 2 were given
