🐛 Bug Report: Large zip breaking stream endpoint #859

pabik · 2024-02-21T14:11:21Z

📜 Description

The stream endpoint doesn't return an answer when a file embedded in the zip archive is long.

👟 Reproduction steps

1. Upload a zip file
2. Try chatting

Attachment: docs_tester.zip

👍 Expected behavior

DocsGPT should provide an answer.

👎 Actual Behavior with Screenshots

No answer, stream endpoint breaks.

💻 Operating system

MacOS

What browsers are you seeing the problem on?

Chrome

🤖 What development environment are you experiencing this bug on?

Docker

🔒 Did you set the correct environment variables in the right path? List the environment variable names (not values please!)

No response

📃 Provide any additional context for the Bug.

No response

📖 Relevant log output

No response

👀 Have you spent some time to check if this bug has been raised before?

I checked and didn't find a similar issue

🔗 Are you willing to submit PR?

None

🧑‍⚖️ Code of Conduct

I agree to follow this project's Code of Conduct

dartpain · 2024-06-07T13:51:07Z

Looks like it happens because the file is not being chunked properly, or at all, when answering, which overloads the context with too many tokens.

nayelimdejesus · 2024-06-10T17:59:46Z

Hi, I would like to work on this issue.

nayelimdejesus · 2024-06-15T18:32:43Z

When you upload a big zip file, what answer should it provide?

dartpain · 2024-06-17T13:39:14Z

Just shouldn't break. Basically make sure that it doesn't error out.
Try running it with the file attached.

jayantp2003 · 2024-10-10T21:30:11Z

I am interested in working on this issue.

jayantp2003 · 2024-10-10T23:47:18Z

I was playing around with the zip file and a couple of different files. I found that it's not an issue related to chunking of code; there is some issue with the RstParser class. When I changed the file extensions to text files, it worked fine.

Currently checking the RstParser class to figure out the changes required.

jayantp2003 · 2024-10-11T10:07:58Z

The issue is with the implementation of the rst parser. In each file it looks for a header and the text below it, but the zip file we are testing on contains just a single file with no header, so it is not being chunked. This header-and-text breakdown also seems to be an issue for the markdown parser. The file should be chunked based on tokens or bytes, and this tuple implementation also needs to be updated.
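To illustrate the behaviour described above, here is a minimal sketch of a header-based splitter. It is not the actual RstParser code; `split_by_headers` and `UNDERLINE` are hypothetical names, and the underline detection is a simplification of real RST rules. It only shows why a file without any header ends up as one large (header, text) tuple.

```python
import re
from typing import List, Optional, Tuple

# Simplified RST section underline: at least three copies of one punctuation character.
UNDERLINE = re.compile(r"^([=\-~^#*+])\1{2,}\s*$")


def split_by_headers(text: str) -> List[Tuple[Optional[str], str]]:
    """Split text into (header, body) tuples at underlined section titles."""
    lines = text.splitlines()
    sections: List[Tuple[Optional[str], str]] = []
    header: Optional[str] = None
    body: List[str] = []

    def flush() -> None:
        # Store the current section if it has a header or any non-blank body text.
        if header is not None or any(line.strip() for line in body):
            sections.append((header, "\n".join(body).strip()))

    i = 0
    while i < len(lines):
        next_line = lines[i + 1] if i + 1 < len(lines) else ""
        is_title = (
            lines[i].strip()
            and not UNDERLINE.match(lines[i])
            and UNDERLINE.match(next_line)
        )
        if is_title:
            flush()
            header, body = lines[i].strip(), []
            i += 2  # skip the title and its underline
            continue
        body.append(lines[i])
        i += 1
    flush()
    return sections


with_headers = "Title\n=====\nintro\n\nSub\n---\nmore"
no_headers = "word " * 50000  # one long file without any section title

print(len(split_by_headers(with_headers)))  # two sections
print(len(split_by_headers(no_headers)))    # a single, very large section
```

With no header to split on, the whole file becomes one section, which matches the symptom of the context being overloaded.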

dartpain · 2024-10-11T10:28:38Z

Yeah, seems like that's the issue. Let's add another token size handler to it, maybe?
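One possible shape for such a handler, as a rough sketch only: the function name `split_oversized`, the `max_tokens` default, and the use of whitespace-separated words as a stand-in for real tokens are all assumptions for illustration, not the project's code. A real implementation would likely count tokens with the model's tokenizer.

```python
from typing import List, Optional, Tuple

Section = Tuple[Optional[str], str]


def split_oversized(sections: List[Section], max_tokens: int = 500) -> List[Section]:
    """Break any (header, text) tuple longer than max_tokens into smaller tuples.

    Tokens are approximated by whitespace-separated words (an assumption).
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")

    result: List[Section] = []
    for header, text in sections:
        words = text.split()
        if len(words) <= max_tokens:
            # Small enough: keep the section unchanged.
            result.append((header, text))
            continue
        # Too large: emit consecutive windows of at most max_tokens words,
        # repeating the header so each chunk keeps its context.
        for start in range(0, len(words), max_tokens):
            chunk = " ".join(words[start:start + max_tokens])
            result.append((header, chunk))
    return result


sections = [("Intro", "short text"), (None, "word " * 1200)]
chunks = split_oversized(sections, max_tokens=500)
print([len(text.split()) for _, text in chunks])
```

Running the handler after the header-based split means files with headers are still split by section, and only sections that exceed the limit are cut further.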

jayantp2003 · 2024-10-11T11:50:18Z

Hey, I have updated the code and created a PR. Can you review and approve it? I am new to open source contributions and do not know how the process works after making a PR. Open to feedback.

jayantp2003 · 2024-10-11T17:54:13Z

@dartpain Can you review my changes, provide feedback, and approve if the implementation seems correct?

dartpain added the help wanted (Extra attention is needed) and bug (Something isn't working) labels Feb 22, 2024

pabik changed the title ~~🐛 Bug Report:~~ 🐛 Bug Report: Large zip breaking stream endpoint Mar 1, 2024

dartpain assigned nayelimdejesus Jun 10, 2024

nayelimdejesus removed their assignment Jul 3, 2024

pabik added hacktoberfest backend labels Oct 7, 2024

dartpain assigned jayantp2003 Oct 10, 2024

jayantp2003 added a commit to jayantp2003/DocsGPT that referenced this issue Oct 11, 2024

Fix arc53#859: Resolve issue with large zip breaking stream endpoint

a2ef45e

jayantp2003 added a commit to jayantp2003/DocsGPT that referenced this issue Oct 11, 2024

Fix arc53#859: Resolve issue with large zip breaking stream endpoint

3db07f3

jayantp2003 mentioned this issue Oct 11, 2024

Bugfix/859 large zip breaking stream endpoint #1303

Open
