Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fork TDigest library #96086

Merged
merged 72 commits into from
Jun 13, 2023
Merged

Fork TDigest library #96086

merged 72 commits into from
Jun 13, 2023

Conversation

kkrik-es
Copy link
Contributor

@kkrik-es kkrik-es commented May 15, 2023

The TDigest library is used for percentile aggregations in ES. The library contains two underlying implementations: (a) one using an AVL tree to maintain a sorted set of centroids and (b) one using pre-allocated arrays of centroids that get sorted and merged repeatedly as more samples are added. ES uses (a) by default, but
early results show that switching to the merging implementation can offers speedups exceeding 10x in percentile calculations.

The currently used library (v.3.2) has bugs in the merging implementation that were later addressed, and the latest available version (v.3.3) provides inaccurate percentile calculations even for simple median calls. We are therefore forking the latest version and addressing these shortcomings for both implementations, before switching to the merging implementation. This PR includes fixes for MergingDigest, but keeps using AVLTreeDigest in TDigestState. The switch to MergingDigest will happen in a subsequent PR, to better track the impact of each change in isolation.

Major changes compared to the initial implementation:

  • Quantile calculation uses interpolation between centroids instead of rounding, for singleton centroids
  • Quantile calculation properly tracks the centroid range (off-by-one error fixed)
  • Smaller sizes for weight and mean buffer arrays in MergingDigest, to reduce memory pressure
  • Centroid retrieval in MergingDigest triggers merging (if needed), as it is used when writing to StreamOutput
  • Call compress() on TDigestState objects when passed to aggregation objects, so that the former are immutable throughout the lifetime of the latter - needed for thread safety
  • No tracking of added data, other than through centroids
  • Removal of misfiring asserts and dead code, esp. around Serialization
  • Style changes to match ES codebase requirements

The top 2 changes lead to slightly different results, compared to the previous implementation from v.3.2. Still, they lead to more accurate quantile calculations. The change is thus deemed acceptable and will be documented in the release notes.

More work needed for TDigestPercentile*Tests and the TDigestTest (and
the rest of the tests) in the tdigest lib to pass.
@kkrik-es kkrik-es added >feature :Analytics/Aggregations Aggregations Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels May 15, 2023
@kkrik-es kkrik-es self-assigned this May 15, 2023
@github-actions
Copy link

Documentation preview:

@elasticsearchmachine
Copy link
Collaborator

Hi @kkrik-es, I've created a changelog YAML for you.

@@ -82,7 +82,7 @@ public void testAllowStyles() {

public void testDefaultFormattingAllowed() {
String html = "<b></b><i></i><s></s><u></u><o></o><sup></sup><sub></sub><ins></ins><del></del><strong></strong>"
+ "<strike></strike><tt></tt><code></code><big></big><small></small><span></span><br /><em></em><hr />";
+ "<strike></strike><code></code><code></code><big></big><small></small><span></span><br /><em></em><hr />";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated, reverted.

*/
public class TDigestState extends AVLTreeDigest {
public class TDigestState extends MergingDigest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the wire format between avl and merging can stay the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, serialization is handled by this class so there are no compatibility issues - other than the changes in quantile calculations.

@@ -63,5 +63,19 @@ BuildParams.bwcVersions.withWireCompatible { bwcVersion, baseName ->
tasks.register(bwcTaskName(bwcVersion)) {
dependsOn "${baseName}#mixedClusterTest"
}

tasks.named("${baseName}#mixedClusterTest").configure {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use skip versions instead? I think there are other qa tests that reuse the rest api spec tests too.
This should work, since we test new behaviour and not old behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also ensure that at least a base percentile and percentile_ranks tests are ran in mixed cluster tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we're doing both. I saw a number of ml tests that are using skip in the YAML. We should align on one or the other.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this has been reverted.

* computing the asin function involved in the merge is expensive. This argues for collecting more samples
* before sorting and merging them into the digest.
*/
@BenchmarkMode(Mode.AverageTime)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we fork all these micro benchmarks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept just one for the tdigest and one for sorting, in case they come handy.

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only looked at AVLTreeDigest, which is the bit I'm familiar with. One thing looks a bit fishy, but other than that it looks good to me.

Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think is ready to be merged - LGTM

While this helps with the case where the digest contains only
singletons (perfect accuracy), it has a major issue problem
(non-monotonic quantile function) when the first singleton is followed
by a non-singleton centroid. It's preferable to revert to the old
version from 3.2; inaccuracies in a singleton-only digest should be
mitigated by using a sorted array for small sample counts.
@kkrik-es kkrik-es removed the :Analytics/Aggregations Aggregations label Jun 10, 2023
@elasticsearchmachine elasticsearchmachine removed the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jun 10, 2023
Update Dist.cdf to use interpolation, use the same cdf
version in AVLTreeDigest and MergingDigest.
@kkrik-es
Copy link
Contributor Author

@elasticsearchmachine run elasticsearch-ci/*

Copy link
Contributor

@mark-vieira mark-vieira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

* under the License.
*/
apply plugin: 'elasticsearch.build'
apply plugin: 'elasticsearch.publish'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kkrik-es kkrik-es merged commit 67211be into elastic:main Jun 13, 2023
13 checks passed
@martijnvg
Copy link
Member

nice! 🚀

HiDAl pushed a commit to HiDAl/elasticsearch that referenced this pull request Jun 14, 2023
* Initial import for TDigest forking.

* Fix MedianTest.

More work needed for TDigestPercentile*Tests and the TDigestTest (and
the rest of the tests) in the tdigest lib to pass.

* Fix Dist.

* Fix AVLTreeDigest.quantile to match Dist for uniform centroids.

* Update docs/changelog/96086.yaml

* Fix `MergingDigest.quantile` to match `Dist` on uniform distribution.

* Add merging to TDigestState.hashCode and .equals.

Remove wrong asserts from tests and MergingDigest.

* Fix style violations for tdigest library.

* Fix typo.

* Fix more style violations.

* Fix more style violations.

* Fix remaining style violations in tdigest library.

* Update results in docs based on the forked tdigest.

* Fix YAML tests in aggs module.

* Fix YAML tests in x-pack/plugin.

* Skip failing V7 compat tests in modules/aggregations.

* Fix TDigest library unittests.

Remove redundant serializing interfaces from the library.

* Remove YAML test versions for older releases.

These tests don't address compatibility issues in mixed cluster tests as
the latter contain a mix of older and newer nodes, so the output depends
on which node is picked as a data node since the forked TDigest library
is not backwards compatible (produces slightly different results).

* Fix test failures in docs and mixed cluster.

* Reduce buffer sizes in MergingDigest to avoid oom.

* Exclude more failing V7 compatibility tests.

* Update results for JdbcCsvSpecIT tests.

* Update results for JdbcDocCsvSpecIT tests.

* Revert unrelated change.

* More test fixes.

* Use version skips instead of blacklisting in mixed cluster tests.

* Switch TDigestState back to AVLTreeDigest.

* Update docs and tests with AVLTreeDigest output.

* Update flaky test.

* Remove dead code, esp around tracking of incoming data.

* Update docs/changelog/96086.yaml

* Delete docs/changelog/96086.yaml

* Remove explicit compression calls.

This was added to prevent concurrency tests from failing, but it leads
to reduces precision. Submit this to see if the concurrency tests are
still failing.

* Revert "Remove explicit compression calls."

This reverts commit 5352c96.

* Remove explicit compression calls to MedianAbsoluteDeviation input.

* Add unittests for AVL and merging digest accuracy.

* Fix spotless violations.

* Delete redundant tests and benchmarks.

* Fix spotless violation.

* Use the old implementation of AVLTreeDigest.

The latest library version is 50% slower and less accurate, as verified
by ComparisonTests.

* Update docs with latest percentile results.

* Update docs with latest percentile results.

* Remove repeated compression calls.

* Update more percentile results.

* Use approximate percentile values in integration tests.

This helps with mixed cluster tests, where some of the tests where
blocked.

* Fix expected percentile value in test.

* Revert in-place node updates in AVL tree.

Update quantile calculations between centroids and min/max values to
match v.3.2.

* Add SortingDigest and HybridDigest.

The SortingDigest tracks all samples in an ArrayList that
gets sorted for quantile calculations. This approach
provides perfectly accurate results and is the most
efficient implementation for up to millions of samples,
at the cost of bloated memory footprint.

The HybridDigest uses a SortingDigest for small sample
populations, then switches to a MergingDigest. This
approach combines to the best performance and results for
small sample counts with very good performance and
acceptable accuracy for effectively unbounded sample
counts.

* Remove deps to the 3.2 library.

* Remove unused licenses for tdigest.

* Revert changes for SortingDigest and HybridDigest.

These will be submitted in a follow-up PR for enabling MergingDigest.

* Remove unused Histogram classes and unit tests.

Delete dead and commented out code, make the remaining tests run
reasonably fast. Remove unused annotations, esp. SuppressWarnings.

* Remove Comparison class, not used.

* Small fixes.

* Add javadoc and tests.

* Remove special logic for singletons in the boundaries.

While this helps with the case where the digest contains only
singletons (perfect accuracy), it has a major issue problem
(non-monotonic quantile function) when the first singleton is followed
by a non-singleton centroid. It's preferable to revert to the old
version from 3.2; inaccuracies in a singleton-only digest should be
mitigated by using a sorted array for small sample counts.

* Revert changes to expected values in tests.

This is due to restoring quantile functions to match head.

* Revert changes to expected values in tests.

This is due to restoring quantile functions to match head.

* Tentatively restore percentile rank expected results.

* Use cdf version from 3.2

Update Dist.cdf to use interpolation, use the same cdf
version in AVLTreeDigest and MergingDigest.

* Revert "Tentatively restore percentile rank expected results."

This reverts commit 7718dbb.

* Revert remaining changes compared to main.

* Revert excluded V7 compat tests.

* Exclude V7 compat tests still failing.

* Exclude V7 compat tests still failing.

* Restore bySize function in TDigest and subclasses.
stratoula added a commit to elastic/kibana that referenced this pull request Jun 19, 2023
## Summary

Closes #159615

This started faling most possibly because ES upgraded the TDigest
library used for percentiles
elastic/elasticsearch#96086 and now we can have
more accurate results.


Runner
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2397
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Delivery/Build Build or test infrastructure >enhancement Team:Delivery Meta label for Delivery team test-full-bwc Trigger full BWC version matrix tests v8.9.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants