.xlsx file extension not uploading #545

Closed
sohyb-qasem opened this issue Mar 10, 2024 · 3 comments · Fixed by #558
Labels: bug, duplicate

@sohyb-qasem
Collaborator

When trying to upload a file with the .xlsx extension, the following error is logged in the Cosmos DB statuscontainer:

{
"status": "An error occurred, max requeue limit was reached. Error description: ",
"status_timestamp": "2024-03-05 01:21:09",
"status_classification": "Error",
"stack_trace": "Traceback (most recent call last):\n File \"/opt/python/3.10.12/lib/python3.10/threading.py\", line 973, in _bootstrap\n self._bootstrap_inner()\n File \"/opt/python/3.10.12/lib/python3.10/threading.py\", line 1016, in _bootstrap_inner\n self.run()\n File \"/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py\", line 807, in run\n result = context.run(func, *args)\n File \"/tmp/8dc38d50313da49/app.py\", line 410, in poll_queue\n statusLog.upsert_document(\n File \"/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/tenacity/__init__.py\", line 382, in __call__\n result = fn(*args, **kwargs)\n File \"/tmp/8dc38d50313da49/app.py\", line 85, in encode\n response = openai.Embedding.create(\n File \"/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/openai/api_resources/embedding.py\", line 33, in create\n response = super().create(*args, **kwargs)\n File \"/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/openai/api_resources/abstract/engine_api_resource.py\", line 153, in create\n response, _, api_key = requestor.request(\n File \"/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/openai/api_requestor.py\", line 226, in request\n resp, got_stream = self._interpret_response(result, stream)\n File \"/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/openai/api_requestor.py\", line 619, in _interpret_response\n self._interpret_response_line(\n File \"/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/openai/api_requestor.py\", line 679, in _interpret_response_line\n raise self.handle_error_response(\nopenai.error.InvalidRequestError: This model's maximum context length is 8191 tokens, however you requested 58161 tokens (58161 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n File \"/tmp/8dc38d50313da49/app.py\", line 228, in embed_texts\n embeddings = model_obj.encode(texts)\n File \"/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/tenacity/__init__.py\", line 289, in wrapped_f\n return self(f, *args, **kw)\n File \"/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/tenacity/__init__.py\", line 379, in __call__\n do = self.iter(retry_state=retry_state)\n File \"/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/tenacity/__init__.py\", line 326, in iter\n raise retry_exc from fut.exception()\ntenacity.RetryError: RetryError[<Future at 0x7f32ac164910 state=finished raised InvalidRequestError>]\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n File \"/tmp/8dc38d50313da49/app.py\", line 348, in poll_queue\n embedding = embed_texts(target_embeddings_model, [text])\n File \"/tmp/8dc38d50313da49/app.py\", line 242, in embed_texts\n raise HTTPException(status_code=500, detail=f\"Failed to embed: {str(error)}\") from error\nfastapi.exceptions.HTTPException\n"
}
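
The root cause in the trace is the InvalidRequestError: a single embedding request carried 58161 tokens against an 8191-token limit, so the text extracted from the .xlsx file was apparently sent as one oversized chunk. As a rough illustration of the general idea of capping chunk size before embedding, here is a minimal sketch. It is not the project's code and not what PR #558 does; `split_into_chunks` is a hypothetical helper, and the default whitespace tokenizer is only a stand-in, since the embedding model's real tokenizer counts tokens differently.

```python
from typing import Callable, List


def split_into_chunks(
    text: str,
    max_tokens: int = 8191,
    tokenize: Callable[[str], List[str]] = str.split,
) -> List[str]:
    """Split text into pieces of at most max_tokens tokens each.

    `tokenize` defaults to whitespace splitting, which is an assumption
    made for illustration only; swap in the model's tokenizer for real counts.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    tokens = tokenize(text)
    # Walk the token list in fixed-size windows and rejoin each window.
    return [
        " ".join(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]


# Illustrative input sized like the failing request (58161 "tokens").
sample = " ".join(f"cell{i}" for i in range(58161))
chunks = split_into_chunks(sample)
print(len(chunks))  # 8 chunks: seven of 8191 tokens and one of 824
```

Each chunk could then be embedded in its own request. Fixed-size windows are the simplest choice; splitting on sheet or row boundaries would probably keep more meaning per chunk for spreadsheet data.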

version used is: 0.4 Delta

region of Azure Open AI Service: Australia East

What ChatGPT model are you using? gpt-4

model name: (e.g. gpt-3.5-turbo, gpt-4): gpt-4

model version: (e.g. 0613): 1106-Preview

What embeddings model are you using?: text-embedding-ada-002

Has anyone faced this issue, and if so, how did you solve it?

@dayland dayland added the duplicate This issue or pull request already exists label Mar 11, 2024
@dayland dayland added this to the 1.1 milestone Mar 11, 2024
@dayland
Contributor

dayland commented Mar 11, 2024

This is the same issue as described in #492. A recent update to Unstructured.io will address this. We are moving to the latest version of Unstructured.io in our next release.

@dayland
Contributor

dayland commented Mar 14, 2024

PR #558 was merged into main to address these issues. Pull the latest from main and re-run make deploy.

@georearl
Contributor

Closing due to inactivity.
