-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LUCENE-5309: Optimize facet counting for single-valued SSDV / StringValueFacetCounts #255
LUCENE-5309: Optimize facet counting for single-valued SSDV / StringValueFacetCounts #255
Conversation
…ocValues / StringValueFacetCounts
@@ -82,7 +82,6 @@ public int nextDoc() throws IOException { | |||
|
|||
if (newDocID == NO_MORE_DOCS) { | |||
currentValues = null; | |||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These were just unnecessary continue statements so I removed them.
@@ -452,7 +449,6 @@ public static SortedNumericDocValues getSortedNumericValues( | |||
|
|||
boolean anyReal = false; | |||
final SortedNumericDocValues[] values = new SortedNumericDocValues[size]; | |||
final int[] starts = new int[size + 1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
starts is never referenced, so I pulled it; unrelated cleanup
@@ -680,9 +673,9 @@ public static SortedSetDocValues getSortedSetValues(final IndexReader r, final S | |||
*/ | |||
public static class MultiSortedDocValues extends SortedDocValues { | |||
/** docbase for each leaf: parallel with {@link #values} */ | |||
public final int docStarts[]; | |||
public final int[] docStarts; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cleaned up some array declaration styling where I was in this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love it.
@@ -415,6 +415,8 @@ Improvements | |||
This prevents caching a query clause when it is much more expensive than | |||
running the top-level query. (Julie Tibshirani) | |||
|
|||
* LUCENE-5309: Optimize facet counting for single-valued SSDV / StringValueFacetCounts. (Greg Miller) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would backport this to 8x
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks @gsmiller! What an awesome performance pop!
@@ -680,9 +673,9 @@ public static SortedSetDocValues getSortedSetValues(final IndexReader r, final S | |||
*/ | |||
public static class MultiSortedDocValues extends SortedDocValues { | |||
/** docbase for each leaf: parallel with {@link #values} */ | |||
public final int docStarts[]; | |||
public final int[] docStarts; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love it.
@@ -103,7 +105,7 @@ public FacetResult getTopChildren(int topN, String dim, String... path) throws I | |||
return getDim(dim, ordRange, topN); | |||
} | |||
|
|||
private final FacetResult getDim(String dim, OrdRange ordRange, int topN) throws IOException { | |||
private FacetResult getDim(String dim, OrdRange ordRange, int topN) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm why did we remove this final
? I wonder if the whole class should be final
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mikemccand no need to mark private
methods as final since they're not visible for sub-classes to override
@@ -286,7 +310,7 @@ private final void count(List<MatchingDocs> matchingDocs) | |||
} | |||
|
|||
/** Does all the "real work" of tallying up the counts. */ | |||
private final void countAll() throws IOException, InterruptedException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More final
attrition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, same comment as above. Since private methods aren't visible to sub-classes, it's not necessary to mark them final. I don't feel strongly about this though, so happy to add it back if the community prefers to keep it.
Description
This change introduces special-case logic when facet counting on a
SortedDocValues
field. Instead of using the more-general logic that supports multi-valued fields (i.e.,SortedSetDocValues
) it has a separate implementation for the single-valued case.Solution
Unwrap the SSDV into a SDV if possible and provide a separate counting implementation for the single-valued case.
Tests
Introduced some new basic unit tests that specifically test single-valued cases, and also modified the randomized testing to occasionally produce all single-valued docs.
Checklist
Please review the following and check all that apply:
main
branch../gradlew check
.