LUCENE-8952: Use a sort key instead of true distance in NearestNeighbor. #832
This commit addresses a TODO in `NearestNeighbor` around switching to `SloppyMath.haversinSortKey`. When comparing candidate hits, we now compute only a distance sort key. The sort key is converted to a true distance when returning the final set of `FieldDocs`.
At a glance, this seems like a great win to me. It avoids the expensive asin() call, which is why we have the function split. Thank you.
I think we can use `haversinSortKey` when calculating `approxBestDistance`. In addition, can you add an entry in CHANGES.txt?
FYI: I opened #842, which seems to help a lot with performance as well.
Thanks for the review @iverase, I pushed some new changes.
Thank you @jtibshirani!
This commit addresses a TODO in `NearestNeighbor` around switching to `SloppyMath.haversinSortKey`. When comparing candidate hits, we now compute only a distance sort key. The sort key is converted to a true distance only when returning the final set of `FieldDocs` and when calculating the bbox for the current search area.
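The idea in the commit message can be illustrated with a minimal, self-contained sketch. It is not the Lucene code: the class and method names (`SortKeyNearestSketch`, `sortKey`, `toMeters`, `nearestMeters`) are hypothetical, the Earth radius is an assumed constant, and the key is the plain haversine term rather than whatever `SloppyMath.haversinSortKey` returns internally. It only shows why ranking on a monotonic key and deferring `asin()` to the final results gives the same ordering.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch; names and constants are assumptions, not Lucene's API.
class SortKeyNearestSketch {
    // Assumed mean Earth radius in meters.
    static final double EARTH_RADIUS_METERS = 6_371_008.8;

    /** Haversine term in [0, 1]; it grows monotonically with great-circle distance. */
    static double sortKey(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double sinDLat = Math.sin((phi2 - phi1) / 2);
        double sinDLon = Math.sin(Math.toRadians(lon2 - lon1) / 2);
        return sinDLat * sinDLat + Math.cos(phi1) * Math.cos(phi2) * sinDLon * sinDLon;
    }

    /** Converts a sort key to meters; the only place asin() is evaluated. */
    static double toMeters(double sortKey) {
        return 2 * EARTH_RADIUS_METERS * Math.asin(Math.min(1.0, Math.sqrt(sortKey)));
    }

    /** Distances in meters of the k points nearest to (lat, lon), nearest first. */
    static List<Double> nearestMeters(double lat, double lon, double[][] points, int k) {
        List<Double> result = new ArrayList<>();
        if (k <= 0) {
            return result;
        }
        // Max-heap on the sort key: the head is the worst of the current best k.
        PriorityQueue<Double> heap = new PriorityQueue<>(Comparator.reverseOrder());
        for (double[] p : points) {
            double key = sortKey(lat, lon, p[0], p[1]); // cheap comparison value
            if (heap.size() < k) {
                heap.add(key);
            } else if (key < heap.peek()) {
                heap.poll();
                heap.add(key);
            }
        }
        // Convert to true distances only for the final hits.
        while (!heap.isEmpty()) {
            result.add(0, toMeters(heap.poll())); // farthest is polled first, so prepend
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 1}, {0, 2}, {10, 10}, {0, 0.5}};
        System.out.println(nearestMeters(0, 0, points, 2));
    }
}
```

With n candidates and k results, this sketch evaluates `asin()` k times instead of n times, which is the saving the review comment above refers to.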