New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LUCENE-10488: Optimize Facets#getTopDims in IntTaxonomyFacets #779
Conversation
Please see attached for the benchmark results (updated after rebase):
|
16fe4fc
to
fd537e9
Compare
I have not taken a detailed look at the PR yet. From the benchmark results posted, I see that we've got a regression in several taxonomy tasks that utilize BInaryDocValues. See (-15, -10, -10, -10). Would you mind re-running the benchmark just to be sure @Yuti-G (also verify that the only difference between mainline and candidate are your commits). Thanks for the effort btw!
|
Hi @gautamworah96, thank you so much! I have re-run the benchmark with the up-to-date mainline, and the regressions have gone, so they could be noise. Please see the results.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks really good. Thanks! I left a few small comments.
lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacets.java
Outdated
Show resolved
Hide resolved
lucene/facet/src/java/org/apache/lucene/facet/taxonomy/IntTaxonomyFacets.java
Outdated
Show resolved
Hide resolved
lucene/facet/src/java/org/apache/lucene/facet/taxonomy/IntTaxonomyFacets.java
Outdated
Show resolved
Hide resolved
lucene/facet/src/java/org/apache/lucene/facet/taxonomy/IntTaxonomyFacets.java
Outdated
Show resolved
Hide resolved
fd537e9
to
1285431
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the feedback! I've got one more comment for you. Thanks again!
lucene/facet/src/java/org/apache/lucene/facet/taxonomy/IntTaxonomyFacets.java
Outdated
Show resolved
Hide resolved
Thanks! I changed the parameter name to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! I'll work on merging. Thanks @Yuti-G !
Description
This change overrides and optimizes the default implementation of getTopDims in IntTaxonomyFacets which is extended by FastTaxonomyFacetCounts and TaxonomyFacetSumIntAssociations.
Solution
Override getTopDims and refactor the getTopChildren function in IntTaxonomyFacets to get dimCount (aggregated dim values) more efficiently by checking if dimCount has been populated in indexing time for a dim that is hierarchical or multiValued && requireDimCount, before aggregating dimCount by iterating its child ordinal.
Tests
Added new testing for the overridden implementations of getTopDims in IntTaxonomyFacets
Checklist
Please review the following and check all that apply:
main
branch../gradlew check
.