Optimize Min and Max BKD optimizations :) #44290

polyfractal · 2019-07-12T17:02:18Z

The Max aggregator has an optimization that uses the BKD tree to find the max, bypassing an expensive collection of all documents. It does this by checking the largest leaf in the tree for the max value. Today this process decodes the packed value for every live doc in that leaf, which is unnecessary. We could instead just cache the packed value and decode it once after intersecting.
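To make the "cache, then decode once" idea concrete, here is a minimal sketch. It is a toy model and not the actual aggregator or Lucene code: `MaxLeafSketch`, `encode`, `decode`, and `maxOfLeaf` are hypothetical names, and the sortable byte encoding is an assumption chosen so that unsigned byte order matches numeric order.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.BitSet;

// Toy model of one leaf; every name here is hypothetical, not Lucene/Elasticsearch API.
class MaxLeafSketch {
    static int decodeCalls = 0; // counts decodes so the saving is visible

    // Assumed sortable encoding: big-endian long with the sign bit flipped,
    // so comparing the bytes as unsigned matches numeric order.
    static byte[] encode(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value ^ Long.MIN_VALUE).array();
    }

    static long decode(byte[] packed) {
        decodeCalls++;
        return ByteBuffer.wrap(packed).getLong() ^ Long.MIN_VALUE;
    }

    // Scans the leaf, keeps the largest packed value among live docs as raw
    // bytes, and decodes only once at the end. Returns null if no doc is live.
    static Long maxOfLeaf(int[] docIds, byte[][] packedValues, BitSet liveDocs) {
        byte[] best = null;
        for (int i = 0; i < docIds.length; i++) {
            if (!liveDocs.get(docIds[i])) {
                continue; // deleted doc, ignore its value
            }
            if (best == null || Arrays.compareUnsigned(packedValues[i], best) > 0) {
                best = packedValues[i].clone(); // defensive copy in case the caller reuses the buffer
            }
        }
        return best == null ? null : decode(best);
    }

    public static void main(String[] args) {
        int[] docIds = {0, 1, 2, 3};
        byte[][] packed = {encode(3), encode(-7), encode(42), encode(15)};
        BitSet liveDocs = new BitSet();
        liveDocs.set(0);
        liveDocs.set(1);
        liveDocs.set(3); // doc 2 (value 42) is deleted
        System.out.println(maxOfLeaf(docIds, packed, liveDocs) + " (decodes: " + decodeCalls + ")");
    }
}
```

In this sketch the comparison happens on the packed bytes, so the number of decodes stays at one regardless of how many live docs the leaf holds.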

The Min aggregator works a little differently. Since values are sorted ascending in the BKD tree, we can start at the beginning and iterate until we find a live doc (i.e. a non-deleted document), then exit and use that value.

This is potentially problematic if many or most documents are deleted, since we could spend a long time traversing the BKD tree. It might be faster to collect the documents normally, since that path skips deleted documents. We should probably include some kind of heuristic and fall back to the non-BKD approach if we can't find the value within a bounded number of lookups (max 1024 lookups?).
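A rough sketch of what such a heuristic could look like, using plain arrays in place of the BKD tree. Everything here is hypothetical (`MinLookupSketch`, `minViaSortedWalk`, `minViaCollection`, `min`), and the 1024 limit is just the number floated above, not a tuned value.

```java
import java.util.BitSet;
import java.util.OptionalLong;

// Toy model; all names are hypothetical and no Lucene/Elasticsearch API is used.
class MinLookupSketch {
    static final int MAX_LOOKUPS = 1024; // the limit suggested in this issue, untuned

    // Stand-in for the BKD walk: values arrive in ascending order, and we stop
    // at the first live doc or after maxLookups entries, whichever comes first.
    static OptionalLong minViaSortedWalk(long[] sortedValues, int[] sortedDocIds,
                                         BitSet liveDocs, int maxLookups) {
        int limit = Math.min(sortedValues.length, maxLookups);
        for (int i = 0; i < limit; i++) {
            if (liveDocs.get(sortedDocIds[i])) {
                return OptionalLong.of(sortedValues[i]);
            }
        }
        return OptionalLong.empty(); // budget exhausted or no live doc seen
    }

    // Stand-in for normal collection: only live docs are visited.
    static OptionalLong minViaCollection(long[] valueByDoc, BitSet liveDocs) {
        OptionalLong min = OptionalLong.empty();
        for (int doc = liveDocs.nextSetBit(0); doc >= 0; doc = liveDocs.nextSetBit(doc + 1)) {
            if (!min.isPresent() || valueByDoc[doc] < min.getAsLong()) {
                min = OptionalLong.of(valueByDoc[doc]);
            }
        }
        return min;
    }

    // Try the bounded walk first and revert to collection if it gives up.
    static OptionalLong min(long[] valueByDoc, long[] sortedValues, int[] sortedDocIds,
                            BitSet liveDocs, int maxLookups) {
        OptionalLong fast = minViaSortedWalk(sortedValues, sortedDocIds, liveDocs, maxLookups);
        return fast.isPresent() ? fast : minViaCollection(valueByDoc, liveDocs);
    }

    public static void main(String[] args) {
        long[] valueByDoc = {50, 10, 30, 20, 40};
        long[] sortedValues = {10, 20, 30, 40, 50};
        int[] sortedDocIds = {1, 3, 2, 4, 0};
        BitSet liveDocs = new BitSet();
        liveDocs.set(0);
        liveDocs.set(4); // docs 1, 2 and 3 are deleted
        // A budget of 2 forces the fallback; the default budget finds it in the walk.
        System.out.println(min(valueByDoc, sortedValues, sortedDocIds, liveDocs, 2));
        System.out.println(min(valueByDoc, sortedValues, sortedDocIds, liveDocs, MAX_LOOKUPS));
    }
}
```

Both paths return the same answer; the budget only caps how long we are willing to step over deleted docs before giving up on the sorted walk.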

This would be a good first issue for someone wanting to get into the agg framework, or learn how the BKD tree works, or both :)

elasticmachine · 2019-07-12T17:02:20Z

Pinging @elastic/es-analytics-geo

michalperlak · 2019-07-12T19:31:54Z

Hi @polyfractal,
Can I work on this issue?

polyfractal · 2019-07-12T20:06:35Z

@michalperlak Absolutely! Let me know if you have any questions :)

deXetrous · 2019-08-24T08:23:59Z

Hi @polyfractal, I want to work on this if that's fine. Can you please help me get started and understand the code base?

polyfractal · 2019-08-26T15:43:32Z

Oops, this ticket should be closed actually. It was implemented by #44315 linked above.

Sorry! There are other issues labeled "low hanging fruit" and "help needed" which are good issues for new contributors.

polyfractal added the >enhancement, good first issue, low hanging fruit, and :Analytics/Aggregations labels on Jul 12, 2019

michalperlak mentioned this issue Jul 14, 2019

Optimize Min and Max BKD optimizations #44315

Merged

polyfractal closed this as completed Aug 26, 2019
