fix: Fixed high CPU usage issue related to index #18

chandrasekharan-zipstack · 2024-03-13T05:40:41Z

What

Perform text extraction only when indexing is actually needed
Removed unused packages from the SDK (these were previously needed for LLMWhisperer)
Bumped SDK to 0.15.1

Why

We had an issue with high CPU usage because text extraction ran every time during indexing, even when no indexing was needed. This issue affects on-prem deployments the most.

How

...

Relevant Docs

Related Issues or PRs

Dependencies Versions / Env Variables

Removed the following:

  "filetype==1.2.0",
  "pdfplumber==0.10.3",
  "pytesseract==0.3.10",

Notes on Testing

Able to generate the index only once; confirmed with a debugger that we don't re-extract every time
Able to fetch a response for a prompt; didn't check the eval parts though

Screenshots

Checklist

I have read and understood the Contribution Guidelines.

jaseemjaskp

LGTM

Fixed high CPU usage issue by avoiding extracting before check for index. Bumped SDK to 0.15.1 and removed unused packages

96b7670

chandrasekharan-zipstack requested review from arun-venkataswamy, hari-kuriakose, jaseemjaskp and nehabagdia March 13, 2024 05:40

chandrasekharan-zipstack self-assigned this Mar 13, 2024

jaseemjaskp approved these changes Mar 13, 2024

arun-venkataswamy approved these changes Mar 13, 2024

arun-venkataswamy merged commit 1ed1ad7 into main Mar 13, 2024

arun-venkataswamy deleted the fix/prevent-extraction-before-indexing branch March 13, 2024 05:45

chandrasekharan-zipstack mentioned this pull request Mar 13, 2024

deps: Bumped SDK to 0.15.1 in backend and prompt-service Zipstack/unstract#92

Merged
