
Chunks too Large #410

Closed
Patrick-Davis-MSFT opened this issue Dec 19, 2023 · 5 comments · Fixed by #478

@Patrick-Davis-MSFT
Collaborator

Occurs in Delta Release

Describe the bug
When uploading some documents, we receive the following error while using the Azure embeddings model ADA.

openai.error.InvalidRequestError: This model's maximum context length is 8191 tokens, however you requested 22089 tokens (22089 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.

The embeddings model has a capacity of 352K TPM.
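Note that the 352K TPM quota is a per-minute rate limit, while the 8191-token figure in the error is a hard per-request context limit for text-embedding-ada-002, so extra capacity does not avoid this failure. A minimal pre-flight check is sketched below; tiktoken and the cl100k_base encoding are assumptions here, not necessarily what the pipeline uses:

```python
# Hypothetical pre-flight check: count tokens before calling the API so an
# oversized chunk fails fast instead of round-tripping an error from Azure.
import tiktoken

ADA_MAX_TOKENS = 8191  # per-request context limit cited in the error above

def fits_in_context(text: str) -> bool:
    # cl100k_base is the encoding used by text-embedding-ada-002
    encoding = tiktoken.get_encoding("cl100k_base")
    return len(encoding.encode(text)) <= ADA_MAX_TOKENS
```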

To Reproduce
Steps to reproduce the behavior:

  1. Install as normal
  2. Upload a file
  3. View the error in Cosmos DB

Expected behavior
The embeddings should resolve as normal; if a chunk is too big, it should be split into smaller chunks.
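A minimal sketch of that splitting, assuming tiktoken (this is not the fix that landed in #478): decode fixed-size windows of tokens back into text so each piece fits the model's context.

```python
import tiktoken

ADA_MAX_TOKENS = 8191

def split_chunk(text: str, max_tokens: int = ADA_MAX_TOKENS) -> list[str]:
    """Split text into pieces that each fit the embedding model's context."""
    encoding = tiktoken.get_encoding("cl100k_base")
    tokens = encoding.encode(text)
    # Decode fixed-size token windows back into text; each piece can then be
    # embedded separately while staying under the 8191-token cap.
    return [
        encoding.decode(tokens[i : i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]
```

A real fix would likely split on sentence or section boundaries and overlap windows to preserve retrieval quality; this only shows the token-window mechanics.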

Screenshots
The full error from 2023-12-19 18:52:43 is below:

"Traceback (most recent call last):  File \"/opt/python/3.10.13/lib/python3.10/threading.py\", line 973, in _bootstrap
self._bootstrap_inner()
  File \"/opt/python/3.10.13/lib/python3.10/threading.py\", line 1016, in _bootstrap_inner
    self.run()
  File \"/tmp/8dc00b6931bbb9f/antenv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py\", line 807, in run
    result = context.run(func, *args)
  File \"/tmp/8dc00b6931bbb9f/app.py\", line 405, in poll_queue
    statusLog.upsert_document(blob_path, f'Message requed to embeddings queue, attempt {str(requeue_count)}. Visible in {str(backoff)} seconds. Error: {str(error)}.',
  File \"/tmp/8dc00b6931bbb9f/antenv/lib/python3.10/site-packages/tenacity/__init__.py\", line 382, in __call__
    result = fn(*args, **kwargs)
  File \"/tmp/8dc00b6931bbb9f/app.py\", line 85, in encode
    response = openai.Embedding.create(
  File \"/tmp/8dc00b6931bbb9f/antenv/lib/python3.10/site-packages/openai/api_resources/embedding.py\", line 33, in create
    response = super().create(*args, **kwargs)
  File \"/tmp/8dc00b6931bbb9f/antenv/lib/python3.10/site-packages/openai/api_resources/abstract/engine_api_resource.py\", line 153, in create
    response, _, api_key = requestor.request(
  File \"/tmp/8dc00b6931bbb9f/antenv/lib/python3.10/site-packages/openai/api_requestor.py\", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
  File \"/tmp/8dc00b6931bbb9f/antenv/lib/python3.10/site-packages/openai/api_requestor.py\", line 619, in _interpret_response
    self._interpret_response_line(
  File \"/tmp/8dc00b6931bbb9f/antenv/lib/python3.10/site-packages/openai/api_requestor.py\", line 679, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 8191 tokens, however you requested 22089 tokens (22089 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File \"/tmp/8dc00b6931bbb9f/app.py\", line 228, in embed_texts
    embeddings = model_obj.encode(texts)
  File \"/tmp/8dc00b6931bbb9f/antenv/lib/python3.10/site-packages/tenacity/__init__.py\", line 289, in wrapped_f
    return self(f, *args, **kw)
  File \"/tmp/8dc00b6931bbb9f/antenv/lib/python3.10/site-packages/tenacity/__init__.py\", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File \"/tmp/8dc00b6931bbb9f/antenv/lib/python3.10/site-packages/tenacity/__init__.py\", line 326, in iter
    raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x770bd92d7640 state=finished raised InvalidRequestError>]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File \"/tmp/8dc00b6931bbb9f/app.py\", line 348, in poll_queue
    embedding = embed_texts(target_embeddings_model, [text])
  File \"/tmp/8dc00b6931bbb9f/app.py\", line 242, in embed_texts
    raise HTTPException(status_code=500, detail=f\"Failed to embed: {str(error)}\") from error
fastapi.exceptions.HTTPException
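One detail visible in the trace: tenacity retries the call even though an InvalidRequestError is deterministic (the prompt is simply too long), so every retry fails identically before the RetryError surfaces. A sketch of excluding it from retries follows; the decorator settings are assumptions, not the repo's actual configuration:

```python
import openai
from tenacity import retry, retry_if_not_exception_type, stop_after_attempt, wait_exponential

@retry(
    # Retry transient failures, but give up immediately on a request that
    # can never succeed because the input exceeds the context limit.
    retry=retry_if_not_exception_type(openai.error.InvalidRequestError),
    stop=stop_after_attempt(5),
    wait=wait_exponential(multiplier=1, max=30),
)
def encode(texts):
    # openai<1.0 style call, matching the traceback; `engine` is the Azure
    # deployment name and is an assumption here.
    response = openai.Embedding.create(input=texts, engine="text-embedding-ada-002")
    return [item["embedding"] for item in response["data"]]
```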

Desktop (please complete the following information):

  • OS: Windows
  • Browser: Edge
  • Version: Delta 0.4 (Main Branch)

Alpha version details

  • GitHub branch: Main
  • Latest commit: Dec 15, 2023

Additional context
A commit for the autoscaler was pushed while I was writing this ticket. I will retry with the latest and update once done.

@Patrick-Davis-MSFT
Collaborator Author

This is confirmed to exist on the Main Branch as of 12/19. Reach out for the file.

@Patrick-Davis-MSFT
Collaborator Author

On the 12/19 build of Main: it does work with the BAAI embeddings.

@Patrick-Davis-MSFT
Collaborator Author

Recreated with the attached file: What would be better.pdf

@dayland dayland added the bug Something isn't working label Jan 4, 2024
@dayland dayland added this to the 1.0 milestone Jan 4, 2024
@dayland
Contributor

dayland commented Jan 10, 2024

@Patrick-Davis-MSFT, as you can see, George and the team have begun working on this issue. The fix is a bit too deep and complex to make it into the v1.0 release this close to the release date, so we have begun the work but are targeting a hotfix after the v1.0 release. Just wanted to give you an update on the plan.

@dayland
Contributor

dayland commented Feb 6, 2024

@Patrick-Davis-MSFT, this hotfix has been applied to main in #478. Closing this issue.

@dayland dayland closed this as completed Feb 6, 2024