GPTSimpleVectorIndex azure embedding error #990

Closed
kwin-wang opened this issue Mar 30, 2023 · 6 comments
Comments

@kwin-wang

After referring to this example Azure OpenAI demo and running the code, I received the following error message. How can I fix this problem?

INFO:openai:error_code=None error_message="This model's maximum context length is 2046 tokens, however you requested 3383 tokens (3383 in your prompt; 0 for the completion). Please reduce your prompt; or completion length." error_param=None error_type=invalid_request_error message='OpenAI API error received' stream_error=False
Traceback (most recent call last):
  File "/workspaces/gptdemo/test.py", line 61, in <module>
    index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)
  File "/home/codespace/.python/current/lib/python3.10/site-packages/llama_index/indices/base.py", line 100, in from_documents
    return cls(
  File "/home/codespace/.python/current/lib/python3.10/site-packages/llama_index/indices/vector_store/vector_indices.py", line 94, in __init__
    super().__init__(
  File "/home/codespace/.python/current/lib/python3.10/site-packages/llama_index/indices/vector_store/base.py", line 58, in __init__
    super().__init__(
  File "/home/codespace/.python/current/lib/python3.10/site-packages/llama_index/indices/base.py", line 69, in __init__
    index_struct = self.build_index_from_nodes(nodes)
  File "/home/codespace/.python/current/lib/python3.10/site-packages/llama_index/token_counter/token_counter.py", line 78, in wrapped_llm_predict
    f_return_val = f(_self, *args, **kwargs)
  File "/home/codespace/.python/current/lib/python3.10/site-packages/llama_index/indices/vector_store/base.py", line 214, in build_index_from_nodes
    return self._build_index_from_nodes(nodes)
  File "/home/codespace/.python/current/lib/python3.10/site-packages/llama_index/indices/vector_store/base.py", line 203, in _build_index_from_nodes
    self._add_nodes_to_index(index_struct, nodes)
  File "/home/codespace/.python/current/lib/python3.10/site-packages/llama_index/indices/vector_store/base.py", line 182, in _add_nodes_to_index
    embedding_results = self._get_node_embedding_results(
  File "/home/codespace/.python/current/lib/python3.10/site-packages/llama_index/indices/vector_store/base.py", line 100, in _get_node_embedding_results
    ) = self._service_context.embed_model.get_queued_text_embeddings()
  File "/home/codespace/.python/current/lib/python3.10/site-packages/llama_index/embeddings/base.py", line 151, in get_queued_text_embeddings
    embeddings = self._get_text_embeddings(cur_batch_texts)
  File "/home/codespace/.python/current/lib/python3.10/site-packages/llama_index/embeddings/base.py", line 103, in _get_text_embeddings
    result = [self._get_text_embedding(text) for text in texts]
  File "/home/codespace/.python/current/lib/python3.10/site-packages/llama_index/embeddings/base.py", line 103, in <listcomp>
    result = [self._get_text_embedding(text) for text in texts]
  File "/home/codespace/.python/current/lib/python3.10/site-packages/llama_index/embeddings/langchain.py", line 30, in _get_text_embedding
    return self._langchain_embedding.embed_documents([text])[0]
  File "/home/codespace/.python/current/lib/python3.10/site-packages/langchain/embeddings/openai.py", line 254, in embed_documents
    response = embed_with_retry(
  File "/home/codespace/.python/current/lib/python3.10/site-packages/langchain/embeddings/openai.py", line 53, in embed_with_retry
    return _completion_with_retry(**kwargs)
  File "/home/codespace/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "/home/codespace/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "/home/codespace/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 314, in iter
    return fut.result()
  File "/home/codespace/.python/current/lib/python3.10/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/home/codespace/.python/current/lib/python3.10/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/home/codespace/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
  File "/home/codespace/.python/current/lib/python3.10/site-packages/langchain/embeddings/openai.py", line 51, in _completion_with_retry
    return embeddings.client.create(**kwargs)
  File "/home/codespace/.python/current/lib/python3.10/site-packages/openai/api_resources/embedding.py", line 33, in create
    response = super().create(*args, **kwargs)
  File "/home/codespace/.python/current/lib/python3.10/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "/home/codespace/.python/current/lib/python3.10/site-packages/openai/api_requestor.py", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/home/codespace/.python/current/lib/python3.10/site-packages/openai/api_requestor.py", line 619, in _interpret_response
    self._interpret_response_line(
  File "/home/codespace/.python/current/lib/python3.10/site-packages/openai/api_requestor.py", line 682, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 2046 tokens, however you requested 3383 tokens (3383 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.

@eerhil commented Mar 31, 2023

I'm running into the same problem myself. Somehow the embedding model does not respect the maximum token length from the PromptHelper, and I could not find any alternative way to set it from the documentation.

Could someone help us solve this?

@DavidLiCIG

It doesn't seem like the embedding operation uses the PromptHelper. I got past that error by explicitly setting chunk_size_limit:

service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, embed_model=embedding_llm, prompt_helper=prompt_helper, chunk_size_limit=1000)
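
For reference, here is a minimal sketch of where that setting fits into the Azure OpenAI demo setup (llama_index 0.5.x-era API). The endpoint, key, API version, and deployment names below are placeholders, and the exact embedding parameter names may vary with the langchain version:

```python
import openai
from langchain.llms import AzureOpenAI
from langchain.embeddings import OpenAIEmbeddings
from llama_index import GPTSimpleVectorIndex, LLMPredictor, PromptHelper, ServiceContext, SimpleDirectoryReader
from llama_index.embeddings.langchain import LangchainEmbedding

# Azure OpenAI connection settings (placeholders).
openai.api_type = "azure"
openai.api_base = "https://<your-resource>.openai.azure.com/"
openai.api_version = "2022-12-01"
openai.api_key = "<your-api-key>"

llm_predictor = LLMPredictor(llm=AzureOpenAI(deployment_name="<your-llm-deployment>"))

embedding_llm = LangchainEmbedding(OpenAIEmbeddings(
    deployment="<your-embedding-deployment>",  # as in the demo; parameter name may differ by langchain version
    openai_api_key=openai.api_key,
))

# Example values only; these govern the LLM side, not the embedding calls.
prompt_helper = PromptHelper(max_input_size=2048, num_output=256, max_chunk_overlap=20)

# chunk_size_limit caps the size of each chunk sent to the embedding endpoint,
# which is what keeps a single request under the model's ~2046-token window.
service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor,
    embed_model=embedding_llm,
    prompt_helper=prompt_helper,
    chunk_size_limit=1000,
)

documents = SimpleDirectoryReader("data").load_data()  # "data" is a placeholder path
index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)
```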

@fms-santos

I'm having the exact same problem when using the Azure OpenAI example

@fms-santos

> I'm running into the same problem myself. Somehow the embedding model does not respect the maximum token length from the PromptHelper, and I could not find any alternative way to set it from the documentation.

This didn't solve the issue in my case.

@eerhil commented Apr 11, 2023

> chunk_size_limit=1000

This solved the error for me. Maybe the OpenAI demo could be updated accordingly?
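
As a side note, a quick way to sanity-check chunk sizes before embedding (a sketch that assumes the tiktoken package; cl100k_base is the encoding used by text-embedding-ada-002, older embedding models use other encodings):

```python
import tiktoken

# Count tokens the way text-embedding-ada-002 does (cl100k_base encoding).
encoding = tiktoken.get_encoding("cl100k_base")

def num_tokens(text: str) -> int:
    return len(encoding.encode(text))

# Any chunk whose token count exceeds the embedding model's context window
# (2046 tokens in the error above) will trigger the InvalidRequestError.
print(num_tokens("example chunk of text"))
```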

@dosubot (bot) commented Aug 20, 2023

Hi, @kwin-wang! I'm here to help the LlamaIndex team manage their backlog and I wanted to let you know that we are marking this issue as stale.

Based on my understanding, the issue you reported was that the code requested more tokens than the model's maximum context length. It appears the issue was resolved by setting chunk_size_limit explicitly. Additionally, the embedding models have since been updated to take the maximum token lengths from the PromptHelper into account.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LlamaIndex repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.

Thank you for your contribution to the LlamaIndex repository!

@dosubot (bot) added the stale label on Aug 20, 2023.
@dosubot (bot) closed this as not planned (won't fix, can't repro, duplicate, stale) on Sep 10, 2023.
@dosubot (bot) removed the stale label on Sep 10, 2023.