Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Min and Max BKD optimizations #44315

Merged
merged 4 commits into from
Jul 26, 2019

Conversation

michalperlak
Copy link
Contributor

Applied optimizations for BKD flow in Min/Max aggregators from #44290.

MinAggregator - skip BKD optimization when no result found after 1024 lookups.
MaxAggregator - skip unnecessary conversions.

References #44290.

@javanna javanna added the :Analytics/Aggregations Aggregations label Jul 15, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo

Copy link
Contributor

@polyfractal polyfractal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @michalperlak! Left a few comments/questions and tagged @iverase to have a look, since he's more familiar with the BKD stuff than I am.

@@ -219,6 +222,9 @@ public void visit(int docID) {

@Override
public void visit(int docID, byte[] packedValue) {
if (lookupCounter.incrementAndGet() > MAX_BKD_LOOKUPS) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we exit early, we need to set result back to null to signal that we didn't find a value?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition, I think it makes more sense to put this check at the bottom of the method. If we have traversed the tree to this leaf value it seems wasteful to exit before checking if the value is the one we need.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be wrong, but it looks that the only possibility where returned result[0] is not null is the case when the liveDocs.get(docID) is true, so when the intersection process is terminated by early exit the returned value is null.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, I was looking at the line where result = new Number[1]; and missed that the return value is result[0] and not just result. All good 👍

@@ -210,6 +212,7 @@ static Number findLeafMinValue(LeafReader reader, String fieldName, Function<byt
return converter.apply(pointValues.getMinPackedValue());
}
final Number[] result = new Number[1];
final AtomicInteger lookupCounter = new AtomicInteger(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could be wrong, but I think we can move the counter inside the PointValues.IntersectVisitor() anonymous class and avoid having to use an atomic. I'll defer to @iverase though, in case intersect behaves differently than I'm thinking.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the tree is traversed in one single thread so it make more sense to add the counter inside the visitor and not using an atomic +1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, atomic replaced with short field in the visitor.

@@ -186,7 +186,10 @@ public void visit(int docID, byte[] packedValue) {
if (liveDocs.get(docID)) {
// we need to collect all values in this leaf (the sort is ascending) where
// the last live doc is guaranteed to contain the max value for the segment.
result[0] = converter.apply(packedValue);
if (result[0] == null) {
result[0] = new byte[packedValue.length];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't thought about this before, @iverase do you know if the packedValue is guaranteed to have same length for all values in the tree? Or is it a variable packing scheme?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ok, it is guaranteed to have the same length.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 ta

Copy link
Contributor

@iverase iverase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing here is to change the use of atomic for a counter inside the visitor class.

@@ -219,6 +222,9 @@ public void visit(int docID) {

@Override
public void visit(int docID, byte[] packedValue) {
if (lookupCounter.incrementAndGet() > MAX_BKD_LOOKUPS) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition, I think it makes more sense to put this check at the bottom of the method. If we have traversed the tree to this leaf value it seems wasteful to exit before checking if the value is the one we need.

@@ -210,6 +212,7 @@ static Number findLeafMinValue(LeafReader reader, String fieldName, Function<byt
return converter.apply(pointValues.getMinPackedValue());
}
final Number[] result = new Number[1];
final AtomicInteger lookupCounter = new AtomicInteger(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the tree is traversed in one single thread so it make more sense to add the counter inside the visitor and not using an atomic +1.

@@ -186,7 +186,10 @@ public void visit(int docID, byte[] packedValue) {
if (liveDocs.get(docID)) {
// we need to collect all values in this leaf (the sort is ascending) where
// the last live doc is guaranteed to contain the max value for the segment.
result[0] = converter.apply(packedValue);
if (result[0] == null) {
result[0] = new byte[packedValue.length];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ok, it is guaranteed to have the same length.

@polyfractal
Copy link
Contributor

@elasticmachine test this please

@polyfractal
Copy link
Contributor

@elasticmachine test this please

1 similar comment
@iverase
Copy link
Contributor

iverase commented Jul 18, 2019

@elasticmachine test this please

@polyfractal
Copy link
Contributor

polyfractal commented Jul 25, 2019

Sorry for the delay @michalperlak, was out on holiday. I think those test failures are unrelated, I'm going to merge in master to see if that helps unstick CI.

@polyfractal
Copy link
Contributor

@elasticmachine update branch

@polyfractal
Copy link
Contributor

@elasticmachine test this please

@polyfractal polyfractal merged commit 20284b8 into elastic:master Jul 26, 2019
@polyfractal
Copy link
Contributor

Merged, thanks @michalperlak!

@michalperlak
Copy link
Contributor Author

Great, thank you :)

@michalperlak michalperlak deleted the min_max_bkd_optimization branch July 26, 2019 21:28
polyfractal pushed a commit that referenced this pull request Jul 29, 2019
MinAggregator - skip BKD optimization when no result found after 1024 lookups.
MaxAggregator - skip unnecessary conversions.
polyfractal pushed a commit to polyfractal/elasticsearch that referenced this pull request Jul 29, 2019
MinAggregator - skip BKD optimization when no result found after 1024 lookups.
MaxAggregator - skip unnecessary conversions.
jkakavas pushed a commit that referenced this pull request Jul 31, 2019
MinAggregator - skip BKD optimization when no result found after 1024 lookups.
MaxAggregator - skip unnecessary conversions.
@minglinlu
Copy link

A good example!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants