.xlsx file extension not uploading #545

sohyb-qasem · 2024-03-10T23:20:37Z

When trying to upload a file with the .xlsx extension, the Cosmos DB status container shows the following error:

{
"status": "An error occurred, max requeue limit was reached. Error description: ",
"status_timestamp": "2024-03-05 01:21:09",
"status_classification": "Error",
"stack_trace": "Traceback (most recent call last):\n File "/opt/python/3.10.12/lib/python3.10/threading.py", line 973, in _bootstrap\n self._bootstrap_inner()\n File "/opt/python/3.10.12/lib/python3.10/threading.py", line 1016, in _bootstrap_inner\n self.run()\n File "/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run\n result = context.run(func, *args)\n File "/tmp/8dc38d50313da49/app.py", line 410, in poll_queue\n statusLog.upsert_document(\n File "/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/tenacity/init.py", line 382, in call\n result = fn(*args, **kwargs)\n File "/tmp/8dc38d50313da49/app.py", line 85, in encode\n response = openai.Embedding.create(\n File "/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/openai/api_resources/embedding.py", line 33, in create\n response = super().create(*args, **kwargs)\n File "/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create\n response, _, api_key = requestor.request(\n File "/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/openai/api_requestor.py", line 226, in request\n resp, got_stream = self._interpret_response(result, stream)\n File "/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/openai/api_requestor.py", line 619, in _interpret_response\n self._interpret_response_line(\n File "/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/openai/api_requestor.py", line 679, in _interpret_response_line\n raise self.handle_error_response(\nopenai.error.InvalidRequestError: This model's maximum context length is 8191 tokens, however you requested 58161 tokens (58161 in your prompt; 0 for the completion). 
Please reduce your prompt; or completion length.\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n File "/tmp/8dc38d50313da49/app.py", line 228, in embed_texts\n embeddings = model_obj.encode(texts)\n File "/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/tenacity/init.py", line 289, in wrapped_f\n return self(f, *args, **kw)\n File "/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/tenacity/init.py", line 379, in call\n do = self.iter(retry_state=retry_state)\n File "/tmp/8dc38d50313da49/antenv/lib/python3.10/site-packages/tenacity/init.py", line 326, in iter\n raise retry_exc from fut.exception()\ntenacity.RetryError: RetryError[<Future at 0x7f32ac164910 state=finished raised InvalidRequestError>]\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n File "/tmp/8dc38d50313da49/app.py", line 348, in poll_queue\n embedding = embed_texts(target_embeddings_model, [text])\n File "/tmp/8dc38d50313da49/app.py", line 242, in embed_texts\n raise HTTPException(status_code=500, detail=f"Failed to embed: {str(error)}") from error\nfastapi.exceptions.HTTPException\n"
}

version used is: 0.4 Delta

region of Azure Open AI Service: Australia East

What ChatGPT model are you using? gpt-4

model name: (i.e. gpt-3.5-turbo, gpt-4): gpt-4

model version: (i.e. 0613): 1106-Preview

What embeddings model are you using?: text-embedding-ada-002

Have you ever faced this issue, and if so, how did you solve it?

dayland · 2024-03-11T13:56:26Z

This is the same issue as described in #492. A recent update to Unstructured.io will address this. We are moving to the latest version of Unstructured.io in our next release.

dayland · 2024-03-14T22:51:08Z

PR #558 was applied to main to address these issues. Pull the latest from main and re-run make deploy.
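
For context on the trace above: the embedding call was rejected because a single chunk (58161 tokens) exceeded the model's 8191-token limit. The sketch below shows one way to keep chunks under such a limit by packing lines greedily and halving any line that is too large on its own. It is only an illustration, not the change made in PR #558. The names `split_to_budget`, `_fit`, and `rough_token_count` are hypothetical, and the 4-characters-per-token estimate is a crude assumption; a real pipeline would count tokens with the model's own tokenizer.

```python
from typing import Callable, List

# Limit quoted in the error message for the embeddings model.
MAX_EMBED_TOKENS = 8191


def rough_token_count(text: str) -> int:
    """Hypothetical stand-in for a real tokenizer: about 1 token per 4 characters."""
    return (len(text) + 3) // 4


def _fit(piece: str, max_tokens: int, count_tokens: Callable[[str], int]) -> List[str]:
    """Halve a piece repeatedly until every part fits within max_tokens."""
    if count_tokens(piece) <= max_tokens or len(piece) <= 1:
        return [piece]
    mid = len(piece) // 2
    return (_fit(piece[:mid], max_tokens, count_tokens)
            + _fit(piece[mid:], max_tokens, count_tokens))


def split_to_budget(
    text: str,
    max_tokens: int = MAX_EMBED_TOKENS,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> List[str]:
    """Greedily pack lines into chunks that each stay within max_tokens."""
    chunks: List[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        for piece in _fit(line, max_tokens, count_tokens):
            # Start a new chunk when adding this piece would exceed the budget.
            if current and count_tokens(current + piece) > max_tokens:
                chunks.append(current)
                current = ""
            current += piece
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # A table-like text that is far too large to embed in one request.
    table_text = "col_a,col_b,col_c\n" * 20000
    parts = split_to_budget(table_text)
    print(len(parts), max(rough_token_count(p) for p in parts))
```

Each returned chunk can then be embedded separately. Concatenating the chunks reproduces the original text, so no rows are dropped.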

georearl · 2024-04-16T18:04:07Z

Closing due to inactivity.

dayland added the duplicate label Mar 11, 2024

dayland added this to the 1.1 milestone Mar 11, 2024

dayland mentioned this issue Mar 11, 2024

Geearl/6403 unstructured updates #530

Merged

dayland added the bug label Mar 11, 2024

dayland mentioned this issue Mar 14, 2024

Merge pull request #523 and #530 from vNext-Dev for large table fixes #558

Merged

georearl closed this as completed Apr 16, 2024
