@chandrasekharan-zipstack
Contributor

What

  • Text extraction now runs only when indexing is actually needed
  • Removed unused packages from the SDK (previously needed for LLMWhisperer)
  • Bumped SDK to 0.15.1
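
For reviewers who want the idea in code form, here is a minimal sketch of "extract only when indexing is needed". It is not the SDK's actual implementation: `make_doc_id`, `InMemoryIndex`, `index_file`, and the `reindex` flag are hypothetical names used only for illustration, and the in-memory dict stands in for whatever vector store is really used. The point it shows is the ordering: look up the document in the index first, and call the expensive extractor only on a miss.

```python
import hashlib
from typing import Callable, Dict


def make_doc_id(file_bytes: bytes, config_key: str) -> str:
    """Hypothetical helper: stable ID from file content plus index settings."""
    return hashlib.sha256(file_bytes + config_key.encode("utf-8")).hexdigest()


class InMemoryIndex:
    """Stand-in for a vector store (assumption); maps doc IDs to indexed text."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def contains(self, doc_id: str) -> bool:
        return doc_id in self._docs

    def add(self, doc_id: str, text: str) -> None:
        self._docs[doc_id] = text


def index_file(
    file_bytes: bytes,
    config_key: str,
    store: InMemoryIndex,
    extract_text: Callable[[bytes], str],
    reindex: bool = False,
) -> str:
    """Index a file, running text extraction only if it is not indexed yet."""
    doc_id = make_doc_id(file_bytes, config_key)
    # Check the index first so the CPU-heavy extraction is skipped on a hit.
    if store.contains(doc_id) and not reindex:
        return doc_id
    text = extract_text(file_bytes)
    store.add(doc_id, text)
    return doc_id
```

With this ordering, a second call for the same file and settings returns the existing ID without invoking `extract_text` again, which is the behaviour checked with a debugger in the testing notes.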

Why

  • Text extraction ran every time during indexing, which caused high CPU usage. This issue affects on-prem deployments the most.

How

...

Relevant Docs

Related Issues or PRs

Dependencies Versions / Env Variables

  • Removed the following packages:
      "filetype==1.2.0",
      "pdfplumber==0.10.3",
      "pytesseract==0.3.10",

Notes on Testing

  • Able to generate the index only once and ensured we don't re-extract every time (checked with a debugger)
  • Able to fetch a response for a prompt; the eval parts were not checked

Screenshots


Checklist

I have read and understood the Contribution Guidelines.

…dex.

Bumped SDK to 0.15.1 and removed unused packages
Contributor

@jaseemjaskp jaseemjaskp left a comment


LGTM
