New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fielddata: Remove custom comparators and use Lucene's instead #6981
Conversation
This commit removes custom comparators in favor of the ones that are in Lucene. The major change is for nested documents: instead of having a comparator wrapper that deals with nested documents, this is done at the fielddata level by having a selector that returns the value to use for comparison. Sorting with custom missing string values might be slower since it is using TermValComparator since Lucene's TermOrdValComparator only supports sorting missing values first or last. But other than this particular case, this change will allow us to benefit from improvements on comparators from the Lucene side. Close elastic#5980
* parent + 1, or 0 if there is no previous parent, and R (excluded). | ||
*/ | ||
// TODO: better name | ||
public static class NestedLayout { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just call it Nested
maybe?
I left a bunch of cosmetic comments - this looks very clean and a nice reduction of complexity :) |
@s1monw pushed a new commit to address your comments |
} else { | ||
final FixedBitSet fixedBitSet = new FixedBitSet(maxDoc); | ||
fixedBitSet.or(set.iterator()); | ||
return fixedBitSet; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we deprecate the method so it doesnt spread? If it spreads somewhere else, then it would need more logic to handle all the cases (e.g. DISI.iterator() == null etc)
LGTM maybe @rmuir can take another look? |
return new BytesRefOrdValComparator((IndexOrdinalsFieldData) indexFieldData, numHits, sortMode, missingBytes); | ||
// The ordinal-based comparator only supports sorting missing values first or last so when | ||
// a missing value is provided we fall back to the (slow) value-based comparator | ||
if (sortMissingFirst(missingValue) || sortMissingLast(missingValue)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a TODO/followup here to just make SortedDocValues "view" that exposes 'missing' as arbitrary value? I can work on it, just so its always fast...
Awesome, this looks great. I just added minor questions/comments/TODOs, but i think its ready. |
@rmuir pushed a new commit |
+1 |
This commit removes custom comparators in favor of the ones that are in Lucene.
The major change is for nested documents: instead of having a comparator wrapper
that deals with nested documents, this is done at the fielddata level by having
a selector that returns the value to use for comparison.
Sorting with custom missing string values might be slower since it is using
TermValComparator since Lucene's TermOrdValComparator only supports sorting
missing values first or last. But other than this particular case, this change
will allow us to benefit from improvements on comparators from the Lucene side.
Close #5980