
[Bug]: OpenAIEmbeddings is broken in 0.10.6 #10977

Closed
david1542 opened this issue Feb 19, 2024 · 8 comments
Labels
bug (Something isn't working), triage (Issue needs to be triaged/prioritized)

Comments

@david1542

Bug Description

Hey everyone :)

I'm trying to store & embed some documents using OpenAI embeddings but the process seems to crash due to an illegal assignment to the embed_model object.

This is what I'm trying to do in my code (llama-index==0.10.6):

vector_store = PineconeVectorStore(
    pinecone_index=pc_index, namespace=organization_id
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
embed_model = OpenAiEmbedding(
    api_key=openai_api_key,
    api_base=OPENAI_API_BASE,
    api_version=OPENAI_API_VERSION,
    model="text-embedding-3-large",
    dimensions=768,
)

VectorStoreIndex.from_documents(
    documents,
    vector_store=pc_index,
    show_progress=True,
    embed_model=embed_model,
    storage_context=storage_context,
)

The error I'm seeing:

llama_index\core\embeddings\utils.py", line 134, in resolve_embed_model
    embed_model.callback_manager = callback_manager or Settings.callback_manager
TypeError: "OpenAiEmbedding" is immutable and does not support item assignment

It seems to be a bug in the resolve_embed_model function. Can anyone verify that?

Version

0.10.6

Steps to Reproduce

Use OpenAIEmbeddings as in the example I provided and it should crash 🐵

Relevant Logs/Tracebacks

llama_index\core\embeddings\utils.py", line 134, in resolve_embed_model
    embed_model.callback_manager = callback_manager or Settings.callback_manager
TypeError: "OpenAiEmbedding" is immutable and does not support item assignment
david1542 added the bug and triage labels on Feb 19, 2024

dosubot bot commented Feb 19, 2024

🤖

Hey @david1542! Great to see you back. Running into some turbulence with OpenAIEmbeddings, huh? Let's see what we can do about that.

Thank you for reporting this issue. It seems like you're encountering a problem with the OpenAiEmbedding class being immutable and not supporting item assignment for the callback_manager attribute.

This issue seems similar to a previously resolved issue: [Bug]: ValueError: "HuggingFaceEmbeddings" object has no field "callback_manager". In that case, the solution was to wrap the LLM class with the LangchainLLM class when passing it into the service context. However, this might not be directly applicable to your case as you're dealing with the OpenAiEmbedding class.

The callback_manager attribute in the OpenAiEmbedding class is used to manage events during the embedding process. It seems like the callback_manager is being assigned in the resolve_embed_model function, which is causing the error you're seeing.

As a workaround, you might want to try creating a new instance of OpenAiEmbedding with the callback_manager attribute set during initialization. However, this would require changes to the OpenAiEmbedding class to accept callback_manager as an argument during initialization, which might not be ideal.
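As a rough sketch of that workaround, assuming the embedding class in use accepts callback_manager at construction (llama_index's own OpenAIEmbedding does, since it inherits from the core BaseEmbedding):

from llama_index.core.callbacks import CallbackManager
from llama_index.embeddings.openai import OpenAIEmbedding

# Pass the callback manager at construction time instead of assigning it to an
# already-built (and possibly immutable) model afterwards.
embed_model = OpenAIEmbedding(
    model="text-embedding-3-large",
    dimensions=768,
    callback_manager=CallbackManager([]),
)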

I'm sorry for the inconvenience this issue is causing you. I'll forward this to the development team so they can investigate further. In the meantime, if you have any other questions or need further clarification, please don't hesitate to ask.


@logan-markewich
Collaborator

What is OpenAiEmbedding? Isn't it OpenAIEmbedding?

This works for me

>>> from llama_index.embeddings.openai import OpenAIEmbedding
>>> from llama_index.core import Document, VectorStoreIndex
>>> embed_model = OpenAIEmbedding(model="text-embedding-3-large", dimensions=768)
>>> index = VectorStoreIndex.from_documents([Document.example()], embed_model=embed_model)

@fearfulsteel420

I’m also facing issues with OpenAIEmbedding. Getting a connection error every time. Increasing the timeout doesn’t seem to have any effect. Follow the code below to replicate.

from llama_index.embeddings.openai import OpenAIEmbedding

embed_model = OpenAIEmbedding(model="text-embedding-3-small", dimensions=256, timeout=60)

embeddings = embed_model.get_text_embedding(
    "I’ve tried increasing timeouts and max tries as well but that doesn’t seem to help"
)

@logan-markewich
Collaborator

logan-markewich commented Feb 19, 2024

@fearfulsteel420 Any details on the connection error? I cannot replicate

>>> from llama_index.embeddings.openai import OpenAIEmbedding
>>> embed_model=OpenAIEmbedding(model="text-embedding-3-small",dimensions=256, timeout=60)
>>> embeddings = embed_model.get_text_embedding(
...     "I’ve tried increasing timeouts and max tries as well but that doesn’t seem to help"
... )
>>> embeddings[:10]
[-0.06855928152799606, -0.012188607826828957, 0.15770725905895233, 0.012306182645261288, -0.060041628777980804, 0.008269441314041615, 0.0244555976241827, -0.030151452869176865, 0.00643396470695734, 0.025957945734262466]
>>> 

@fearfulsteel420

@logan-markewich Apologies. Looks like I was declaring the OPENAI_API_KEY env variable incorrectly. Seems to work fine now.

However, if my declaration was incorrect, shouldn't I have gotten a different error message? All I was getting was the following WARNING message:

WARNING:llama_index.embeddings.openai.utils:Retrying llama_index.embeddings.openai.base.get_embedding in 0.5722216962934636 seconds as it raised APIConnectionError: Connection error..

followed by this error:

---------------------------------------------------------------------------
LocalProtocolError                        Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py in map_httpcore_exceptions()
     66     try:
---> 67         yield
     68     except Exception as exc:

51 frames
LocalProtocolError: Illegal header value b'Bearer '

The above exception was the direct cause of the following exception:

LocalProtocolError                        Traceback (most recent call last)
LocalProtocolError: Illegal header value b'Bearer '

The above exception was the direct cause of the following exception:

APIConnectionError                        Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/openai/_base_client.py in _request(self, cast_to, options, remaining_retries, stream, stream_cls)
    950 
    951             log.debug("Raising connection error")
--> 952             raise APIConnectionError(request=request) from err
    953 
    954         log.debug(

APIConnectionError: Connection error.

This is how I was incorrectly declaring the env variable:

%env OPENAI\_API\_KEY=sk-…

Forgive my rookie mistake again
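For anyone else hitting this, a quick sanity check from plain Python confirms whether the key actually made it into the environment (illustrative only); an empty value is exactly what produces the "Illegal header value b'Bearer '" error above:

import os

# Fails loudly if the variable is missing, empty, or was declared under a
# slightly different name than the OpenAI client expects.
assert os.environ.get("OPENAI_API_KEY", "").startswith("sk-"), "OPENAI_API_KEY is not set"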

@logan-markewich
Collaborator

logan-markewich commented Feb 19, 2024

LocalProtocolError: Illegal header value b'Bearer ' -- this means that there is no Bearer token, which means the API key is missing.

Probably we could catch this and give a more informative error. A PR for another day :) Glad it works!
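A minimal sketch of what such a check could look like (illustrative only; the function name and message are made up, not the library's actual code):

import os

def require_openai_api_key() -> str:
    # Hypothetical guard: fail fast with a readable message instead of letting
    # httpx reject the empty "Bearer " Authorization header deep in the stack.
    api_key = os.environ.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise ValueError(
            "OPENAI_API_KEY is empty or unset; requests would be sent with an "
            "'Authorization: Bearer ' header and fail with APIConnectionError."
        )
    return api_key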

@david1542
Author

@logan-markewich Thank you so much. It turns out I imported OpenAiEmbedding from llama_index_client instead of OpenAIEmbedding from llama_index.
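For anyone landing here with the same traceback, the fix is just the import; llama_index_client exposes an immutable pydantic model with a confusingly similar name:

# Wrong: immutable API-client model, easily picked up by an IDE auto-import
# from llama_index_client import OpenAiEmbedding

# Right: the embedding class the index actually expects
from llama_index.embeddings.openai import OpenAIEmbedding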

@logan-markewich
Collaborator

Glad we got that cleared up!
