Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LUCENE-8952: Use a sort key instead of true distance in NearestNeighbor. #832

Merged
merged 3 commits into from
Aug 23, 2019

Conversation

jtibshirani
Copy link
Member

@jtibshirani jtibshirani commented Aug 14, 2019

This commit addresses a TODO in NearestNeighbors around switching to
SloppyMath.haversinSortKey. When comparing candidate hits, we now only compute
a distance sort key. The sort key is converted to a true distance only when returning
the final set of FieldDocs and when calculating the bbox for the current search area.

This commit addresses a TODO in NearestNeighbors around switching to
`SloppyMath.haversinSortKey`. When comparing candidate hits, we now only compute
a distance sort key. The sort key is converted to a true distance when returning
the final set of `FieldDocs`.
Copy link
Member

@rmuir rmuir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at a glance, seems like a great win to me. this avoids the expensive asin() call which is why we have the fn split. thank you

Copy link
Contributor

@iverase iverase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use haversinSortKey when calculating approxBestDistance. In addition can you add an entry in CHANGES.txt

@iverase
Copy link
Contributor

iverase commented Aug 22, 2019

FYI: I opened #842 which seems to help a lot in performance as well.

@jtibshirani jtibshirani changed the title LUCENE-8952: Use a sort key instead of true distance in NearestNeighbors. LUCENE-8952: Use a sort key instead of true distance in NearestNeighbor. Aug 23, 2019
@jtibshirani
Copy link
Member Author

Thanks for the review @iverase, I pushed some new changes.

@iverase iverase merged commit 152756f into apache:master Aug 23, 2019
@iverase
Copy link
Contributor

iverase commented Aug 23, 2019

Thank you @jtibshirani!

@jtibshirani jtibshirani deleted the nn-distance-sort-key branch August 23, 2019 16:40
MarcusSorealheis added a commit to MarcusSorealheis/lucene-solr that referenced this pull request Aug 27, 2019
…cusSorealheis/lucene-solr into enhancement_blockUnkwon-default-true

* 'enhancement_blockUnkwon-default-true' of github.com:MarcusSorealheis/lucene-solr: (37 commits)
  SOLR-13699 - maxChars no longer working on CopyField with javabin
  SOLR-13699 - maxChars no longer working on CopyField with Javabin
  SOLR-13655: Fix precommit
  SOLR-11601: Improve geodist error message when using with LLPSF.
  SOLR-13655: Added CHANGES entry
  SOLR-13655:Upgrade Collections.unModifiableSet to Set.of and Set.copyOf (apache#817)
  SOLR-13702: Fix precommit
  SOLR-13702: Some components register twice their metric names (apache#834)
  LUCENE-8952: Use a sort key instead of true distance in NearestNeighbor. (apache#832)
  SOLR-13707: API to expose the currently used package name, details for each plugin
  SOLR-13707: API to expose the currently used package name, details for each plugin (apache#841)
  Additional logging in test framework methods that 'waitFor' something to better trace order of operations when failures occur
  SOLR-13257: Support deterministic replica routing
  SOLR-13706: Config API output is broken for "highlight" component
  LUCENE-8755: Spatial-extras quad and packed-quad trees now index  points a little faster, and also fix an edge case bug.  Fixes apache#824
  removed unnecessary comments
  SOLR-13650: AwaitsFix TestContainerReqHandler.testCacheFromGlobalLoader
  SOLR-13650:ref guide typo
  SOLR-13704: correct error codes for client errors in expand component
  SOLR-13650: ref guide
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants