DrillSideways optimizations #11803
Overall LGTM except for one assertion. Also, do we have existing unit tests covering this code? (I hope there's a code coverage report here :)
lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysScorer.java
Changes LGTM. Do we need to add some unit tests?
Thanks @zhaih. Let me consider some specific tests for this. I know our randomized testing for DrillSideways covers these code paths, but maybe some specific, non-random tests would be useful.
@zhaih I reexamined our test coverage and think we're actually in good shape already. We've got good coverage of drill-sideways correctness with multiple dimensions, etc. (including random and non-random tests). We could try to take these further by somehow asserting that advance is being used in favor of nextDoc when appropriate, but I think those tests would be reasonably complex to write and I'm not sure they add tremendous value. I'd rather we spent time building drill-sideways benchmarks that focus on ensuring our performance doesn't regress. But that's just my opinion. Please let me know if you feel differently and we can keep discussing. Thanks!
Hi @gsmiller, I don't have a strong opinion on adding a test because I don't really know what coverage we already have. One thing that makes me a little worried about the coverage is that the assertion I caught should have been caught by a unit test, and that's why I'm asking. But if you think it will be covered by the randomized tests, then please feel free to push it!
@zhaih that's a good point and a valid concern. I dug into the existing tests and it looks like we have lots of coverage, except that the majority of it uses basic, single-phase drill-down dimensions. I'm going to augment our randomized testing to randomly use two-phase drill-downs to broaden coverage. Thanks for the discussion!
@gsmiller Thank you for checking and for your continued effort!
@zhaih well, thank you for keeping me honest with testing. I think I've already found an insidious potential bug with some beefier tests.
2. remove the "validateState" assertion since it's illegal to call matches() more than once for the same doc (state validation would require separately tracking the match results for all two-phase iterators, which doesn't seem worth it)
f8f1f27 to e0ce888
Thank you, test LGTM!
DrillSidewaysScorer now breaks up first- and second-phase matching and makes use of advance when possible over nextDoc.
Description
This change makes use of advance instead of next where possible and splits out first- and second-phase checking to avoid match confirmation when unnecessary.
Note that I only focused on the doQueryFirstScoring implementation here and didn't modify the other two scoring approaches. "Progress not perfection" and all that (plus, I think we should strongly consider removing these other two implementations, but we'd want to benchmark to be certain).
Unfortunately, luceneutil doesn't have dedicated drill-sideways benchmarks, but some benchmarks on our internal software that makes use of drill sideways showed a +2% QPS improvement and no obvious regressions.
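For readers unfamiliar with the idea, here is a rough, self-contained sketch of what "advance instead of next" plus split first- and second-phase checking can look like. It is not the code from this PR and does not use Lucene classes: ToyIterator, collect, demo, and the "hit"/"miss" labels are all hypothetical names made up for illustration. The assumption is the usual drill-sideways rule: a doc matching the base query is a hit if every dimension matches, a near miss if exactly one dimension fails, and is dropped otherwise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Toy model only: ToyIterator is a hypothetical stand-in, not Lucene's
// DocIdSetIterator or TwoPhaseIterator.
final class DrillSidewaysSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    static final class ToyIterator {
        private final int[] docs;           // sorted doc IDs: the cheap first-phase approximation
        private final IntPredicate confirm; // the costly second-phase check
        private int idx = -1;
        int confirmCalls = 0;               // counts second-phase confirmations

        ToyIterator(int[] docs, IntPredicate confirm) {
            this.docs = docs;
            this.confirm = confirm;
        }

        int docID() {
            if (idx < 0) return -1;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        int nextDoc() {
            if (idx < docs.length) idx++;
            return docID();
        }

        // Moves to the first doc >= target; callers only invoke this when docID() < target.
        int advance(int target) {
            do {
                idx++;
            } while (idx < docs.length && docs[idx] < target);
            return docID();
        }

        boolean matches() {
            confirmCalls++;
            return confirm.test(docID());
        }
    }

    static List<String> collect(ToyIterator base, ToyIterator[] dims) {
        List<String> out = new ArrayList<>();
        nextBaseDoc:
        for (int doc = base.nextDoc(); doc != NO_MORE_DOCS; doc = base.nextDoc()) {
            int failedDim = -1;
            // First phase: approximations only, jumping straight to doc with advance.
            for (int i = 0; i < dims.length; i++) {
                if (dims[i].docID() < doc) dims[i].advance(doc);
                if (dims[i].docID() != doc) {
                    if (failedDim != -1) continue nextBaseDoc; // two failures: no confirmation needed
                    failedDim = i;
                }
            }
            // Second phase: confirm each surviving dimension at most once for this doc.
            for (int i = 0; i < dims.length; i++) {
                if (i == failedDim) continue;
                if (!dims[i].matches()) {
                    if (failedDim != -1) continue nextBaseDoc;
                    failedDim = i;
                }
            }
            out.add(failedDim == -1 ? "hit:" + doc : "miss[" + failedDim + "]:" + doc);
        }
        return out;
    }

    static List<String> demo() {
        ToyIterator base = new ToyIterator(new int[] {1, 2, 3, 4, 5, 8}, d -> true);
        ToyIterator[] dims = {
            new ToyIterator(new int[] {1, 3, 5, 8}, d -> d != 5),
            new ToyIterator(new int[] {1, 2, 8}, d -> true)
        };
        return collect(base, dims);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [hit:1, miss[0]:2, miss[1]:3, hit:8]
    }
}
```

In this toy run, doc 4 fails both approximations and is dropped without any matches() call, and doc 5 is dropped as soon as the one remaining dimension fails confirmation. Each dimension is confirmed at most once per doc, which is in line with the constraint discussed above about not calling the second-phase check repeatedly for the same doc.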