Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUGFIX] All documents returned for the summariser #691

Merged
merged 5 commits into from
Jun 28, 2024

Conversation

wpfl-dbt
Copy link
Collaborator

@wpfl-dbt wpfl-dbt commented Jun 28, 2024

Context

The all document retriever wasn't bringing back all documents -- it was limited to 10. Fixed with a new class, which in turn throw up some more problems that needed fixing.

Changes proposed in this pull request

  • Adds AllElasticsearchRetriever to return all chunks in order
    • Removed embeddings from the returned object
    • Implemented scan() to get all results
  • Enforces the shape of returned documents at the retrieval class level, and simplifies the returned structure
    • Chunk and Document both handled
  • Refactors core_api.src.retriever to be simpler for the new classes and functions\
  • Refactors API to use these changes

Guidance to review

  • Look through core_api.src.retriever most closely -- this is where the action is
  • Ensure it all runs

Things to check

Will Langdale added 3 commits June 28, 2024 14:24
…ents in a consistent format, and ensuring the all_chunks retriever actually gets the whole document
Copy link
Contributor

@jamesrichards4 jamesrichards4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, this is what my refactor should have been in the first place

@jamesrichards4 jamesrichards4 marked this pull request as ready for review June 28, 2024 15:11
@wpfl-dbt wpfl-dbt merged commit 70bd346 into main Jun 28, 2024
9 checks passed
@wpfl-dbt wpfl-dbt deleted the bugfix/all-doc-retriever branch September 26, 2024 09:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants