-
Notifications
You must be signed in to change notification settings - Fork 24.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize Min and Max BKD optimizations #44315
Optimize Min and Max BKD optimizations #44315
Conversation
Pinging @elastic/es-analytics-geo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @michalperlak! Left a few comments/questions and tagged @iverase to have a look, since he's more familiar with the BKD stuff than I am.
@@ -219,6 +222,9 @@ public void visit(int docID) { | |||
|
|||
@Override | |||
public void visit(int docID, byte[] packedValue) { | |||
if (lookupCounter.incrementAndGet() > MAX_BKD_LOOKUPS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we exit early, we need to set result back to null
to signal that we didn't find a value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition, I think it makes more sense to put this check at the bottom of the method. If we have traversed the tree to this leaf value it seems wasteful to exit before checking if the value is the one we need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be wrong, but it looks that the only possibility where returned result[0]
is not null
is the case when the liveDocs.get(docID)
is true
, so when the intersection process is terminated by early exit the returned value is null
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry, I was looking at the line where result = new Number[1];
and missed that the return value is result[0]
and not just result
. All good 👍
@@ -210,6 +212,7 @@ static Number findLeafMinValue(LeafReader reader, String fieldName, Function<byt | |||
return converter.apply(pointValues.getMinPackedValue()); | |||
} | |||
final Number[] result = new Number[1]; | |||
final AtomicInteger lookupCounter = new AtomicInteger(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could be wrong, but I think we can move the counter inside the PointValues.IntersectVisitor()
anonymous class and avoid having to use an atomic. I'll defer to @iverase though, in case intersect behaves differently than I'm thinking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, the tree is traversed in one single thread so it make more sense to add the counter inside the visitor and not using an atomic +1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, atomic replaced with short
field in the visitor.
@@ -186,7 +186,10 @@ public void visit(int docID, byte[] packedValue) { | |||
if (liveDocs.get(docID)) { | |||
// we need to collect all values in this leaf (the sort is ascending) where | |||
// the last live doc is guaranteed to contain the max value for the segment. | |||
result[0] = converter.apply(packedValue); | |||
if (result[0] == null) { | |||
result[0] = new byte[packedValue.length]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hadn't thought about this before, @iverase do you know if the packedValue
is guaranteed to have same length for all values in the tree? Or is it a variable packing scheme?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is ok, it is guaranteed to have the same length.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 ta
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only thing here is to change the use of atomic for a counter inside the visitor class.
@@ -219,6 +222,9 @@ public void visit(int docID) { | |||
|
|||
@Override | |||
public void visit(int docID, byte[] packedValue) { | |||
if (lookupCounter.incrementAndGet() > MAX_BKD_LOOKUPS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition, I think it makes more sense to put this check at the bottom of the method. If we have traversed the tree to this leaf value it seems wasteful to exit before checking if the value is the one we need.
@@ -210,6 +212,7 @@ static Number findLeafMinValue(LeafReader reader, String fieldName, Function<byt | |||
return converter.apply(pointValues.getMinPackedValue()); | |||
} | |||
final Number[] result = new Number[1]; | |||
final AtomicInteger lookupCounter = new AtomicInteger(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, the tree is traversed in one single thread so it make more sense to add the counter inside the visitor and not using an atomic +1.
@@ -186,7 +186,10 @@ public void visit(int docID, byte[] packedValue) { | |||
if (liveDocs.get(docID)) { | |||
// we need to collect all values in this leaf (the sort is ascending) where | |||
// the last live doc is guaranteed to contain the max value for the segment. | |||
result[0] = converter.apply(packedValue); | |||
if (result[0] == null) { | |||
result[0] = new byte[packedValue.length]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is ok, it is guaranteed to have the same length.
@elasticmachine test this please |
9eaccd8
to
1443120
Compare
@elasticmachine test this please |
1 similar comment
@elasticmachine test this please |
Sorry for the delay @michalperlak, was out on holiday. I think those test failures are unrelated, I'm going to merge in master to see if that helps unstick CI. |
@elasticmachine update branch |
@elasticmachine test this please |
Merged, thanks @michalperlak! |
Great, thank you :) |
MinAggregator - skip BKD optimization when no result found after 1024 lookups. MaxAggregator - skip unnecessary conversions.
MinAggregator - skip BKD optimization when no result found after 1024 lookups. MaxAggregator - skip unnecessary conversions.
MinAggregator - skip BKD optimization when no result found after 1024 lookups. MaxAggregator - skip unnecessary conversions.
A good example! |
Applied optimizations for BKD flow in Min/Max aggregators from #44290.
MinAggregator - skip BKD optimization when no result found after 1024 lookups.
MaxAggregator - skip unnecessary conversions.
References #44290.