-
Notifications
You must be signed in to change notification settings - Fork 682
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanup scores, search results and DocID #3849
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…anking algorithms
donomii
changed the title
Fix explainScore returning nulls on a search with no query
Fix code keeping separate copies of "score", fix assigning score to .Dist
Dec 6, 2023
…o explainScore-nulls
donomii
changed the title
Fix code keeping separate copies of "score", fix assigning score to .Dist
Cleanup scores, search results and DocID
Dec 22, 2023
dirkkul
previously approved these changes
Dec 29, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did not look at every change in detail, but this contains nothing scary :)
Quality Gate failedFailed conditions 47.5% Duplication on New Code (required ≤ 3%) |
dirkkul
approved these changes
Jan 19, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What's being changed:
This PR is motived by reports from users about issues including scores not appearing, or scores being incorrect. I was unable to reproduce the exact issues reported, but a look through the code revealed the opportunity for some improvements.
There were some code paths where it was possible for scores to become out of sync or misreported, so this PR reworks our internal handling of scores. For instance, DocID was not being kept in the same structure as the UUID and Score and other object attributes. This led to situations where code would try to rebuild the DocID from the UUID, or by attaching an auxillary structure to carry that information. DocID is now kept in the same structure as the rest of the object attributes, from the moment the structure is created. DocID is a pointer value because 0x00000 is a valid document id, so there needs to be a way to tell the difference between document #0 and an uninitialised value.
Additionally, sometimes the score was being stored in the .Additional[] property structure, sometimes it was kept in the .Score field. This patch removes all the places where score was directly assigned to .Additional, and replaces them with .Score. .Additional is(and was) populated at the end of the request, using values from the result structure.
There were also some places where .SecondarySortValue appeared to be set to the wrong value, or not set at all. Correcting these did not trigger any tests, so they were probably unused, which is its own problem.
The remainder are some smaller cleanups to match the Weaviate coding style, adding more tests.
Remaining TODO: we do some extra copying on the search results that we don't need to, so we can save a little bit of memory and time by moving all functions to accept []*search.Result
Review checklist