fix(core): use compensated summation for histograms by atanzu · Pull Request #1666 · DataDog/saluki

atanzu · 2026-05-15T19:33:06Z

Summary

Use a modified Neumaier algorithm to calculate sums, counts, and quantiles for histogram samples.

The naive `sum += value * weight` loop suffers catastrophic cancellation when the sample stream contains values of wildly different magnitudes. For the classic Kahan/Peters counterexample `{1, +1e100, 1, -1e100}`, naive summation returns 0, while the new algorithm returns the correct 2.0.
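To make the difference concrete, here is a minimal sketch of the textbook Neumaier (improved Kahan-Babuska) summation next to the naive loop. It is not the code from this PR: the PR describes a *modified* variant whose details are not shown here, and the function names and the `(value, weight)` tuple layout are assumptions made for illustration.

```rust
/// Naive weighted sum: `sum += value * weight` for each sample.
/// Small terms are absorbed and lost when the running sum is huge.
fn naive_weighted_sum(samples: &[(f64, f64)]) -> f64 {
    let mut sum = 0.0_f64;
    for &(value, weight) in samples {
        sum += value * weight;
    }
    sum
}

/// Textbook Neumaier summation over weighted samples (illustrative only).
/// A separate compensation term accumulates the low-order bits that each
/// addition to `sum` rounds away.
fn neumaier_weighted_sum(samples: &[(f64, f64)]) -> f64 {
    let mut sum = 0.0_f64;
    let mut compensation = 0.0_f64;
    for &(value, weight) in samples {
        let term = value * weight;
        let total = sum + term;
        if sum.abs() >= term.abs() {
            // `sum` dominates: the low-order bits of `term` were lost.
            compensation += (sum - total) + term;
        } else {
            // `term` dominates: the low-order bits of `sum` were lost.
            compensation += (term - total) + sum;
        }
        sum = total;
    }
    // Apply the accumulated correction once, at the end.
    sum + compensation
}
```

With the samples `(1.0, 1.0), (1e100, 1.0), (1.0, 1.0), (-1e100, 1.0)`, the naive function returns 0.0 because both 1.0 terms are absorbed by 1e100, while the compensated version recovers them and returns 2.0.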

Change Type

- [x] Bug fix
- [ ] New feature
- [ ] Non-functional (chore, refactoring, docs)
- [ ] Performance

How did you test this PR?

Added unit tests to check correctness.

References

Similar PR in Datadog-Agent (DataDog/datadog-agent#49913).

Use modified Neumaier algorithm to calculate sums, counts, and quantiles for histogram samples.

The naive `sum += value * weight` loop suffers catastrophic cancellation when the sample stream contains values of wildly different magnitudes. The classic Kahan/Peters counter-example `{1, +1e100, 1, -1e100}` evaluates to 0 with naive summation but to the correct 2.0 with the new algorithm.

Signed-off-by: Mark Kirichenko <mark.kirichenko@datadoghq.com>

pr-commenter · 2026-05-15T19:40:10Z

Binary Size Analysis (Agent Data Plane)

Target: 55bca14 (baseline) vs df02eb3 (comparison) diff
Analysis Type: Stripped binaries (debug symbols excluded)
Baseline Size: 37.36 MiB
Comparison Size: 37.32 MiB
Size Change: -39.87 KiB (-0.10%)
Pass/Fail Threshold: +5%
Result: PASSED ✅

Changes by Module

Module	File Size	Symbols
`core`	-33.99 KiB	1451
`smallvec`	+22.57 KiB	62
`figment`	-10.78 KiB	12
`saluki_core::data_model::event`	-8.63 KiB	22
`[Unmapped]`	-4.36 KiB	1
`anon.0cac25fc52ba6f4fc475348a8c66d8e3.39.llvm.4081512695721216103`	+3.90 KiB	1
`anon.8bb1cfcdf181d421e8889bc3626b8144.17.llvm.125851317713716366`	-3.81 KiB	1
`[sections]`	-3.30 KiB	7
`anon.bcfbe2edddf7aafb2d5d5e0cc5ffa1e5.16.llvm.12024661772337668300`	-3.15 KiB	1
`hashbrown`	+3.15 KiB	24
`anon.0f2fa1d1fad1031510176699744ee20b.644.llvm.3708903403341574001`	+3.06 KiB	1
`papaya`	+2.56 KiB	11
`anon.eb51c975e2567ebfca80d7da0abd4cd1.7.llvm.11556325092004374934`	-2.55 KiB	1
`anon.40340ab29c26454e228b882d4ecb70d4.0.llvm.14605026808982526804`	+2.54 KiB	1
`saluki_api::DynamicRoute::http`	+1.99 KiB	1
`anon.bdadb00871588c874a55b8b73ce579a9.0.llvm.8997752761958168434`	-1.93 KiB	1
`anon.41b0b8befc3118c3a2b0f17ec06872eb.9.llvm.17862187855904175761`	+1.93 KiB	1
`alloc`	-1.91 KiB	68
`serde_core`	+1.88 KiB	40
`tokio_util`	-1.84 KiB	11

Detailed Symbol Changes

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  [NEW] +6.41Ki  [NEW] +6.33Ki    matchit::router::Router<T>::insert::h261aa988f7ce6a25
  [NEW] +6.14Ki  [NEW] +6.05Ki    matchit::router::Router<T>::insert::hfe75556626a71a8d
  [NEW] +6.07Ki  [NEW] +5.99Ki    matchit::tree::Node<T>::insert::h889792c8a42776b1
  +904% +5.75Ki +10e2% +5.75Ki    axum::routing::Router<S>::route::h0531592b945792f9
  [NEW] +5.42Ki  [NEW] +5.33Ki    serde_core::de::MapAccess::next_value::ha8691cfa6c85eadd
  +795% +5.40Ki  +908% +5.40Ki    tokio::runtime::runtime::Runtime::block_on::h59580466f3413de1
  +992% +5.27Ki +12e2% +5.27Ki    tokio::runtime::runtime::Runtime::block_on::h26cc5fddc7a2e095
  [NEW] +3.90Ki  [NEW]     +16    anon.0cac25fc52ba6f4fc475348a8c66d8e3.39.llvm.4081512695721216103
  [NEW] +3.06Ki  [NEW]     +74    anon.0f2fa1d1fad1031510176699744ee20b.644.llvm.3708903403341574001
  [DEL] -2.55Ki  [DEL]     -80    anon.eb51c975e2567ebfca80d7da0abd4cd1.7.llvm.11556325092004374934
  [DEL] -2.89Ki  [DEL] -2.77Ki    quick_cache::shard::CacheShard<Key,Val,We,B,L,Plh>::insert::haf55a3f4a81ec55a
  [DEL] -3.15Ki  [DEL]     -74    anon.bcfbe2edddf7aafb2d5d5e0cc5ffa1e5.16.llvm.12024661772337668300
  [DEL] -3.48Ki  [DEL] -2.33Ki    _<serde_core::de::impls::<impl serde_core::de::Deserialize for core::time::Duration>::deserialize::DurationVisitor as serde_core::de::Visitor>::visit_map::hed68c5e9e330be6d
  [DEL] -3.81Ki  [DEL]     -16    anon.8bb1cfcdf181d421e8889bc3626b8144.17.llvm.125851317713716366
 -51.3% -4.36Ki  [ = ]       0    [Unmapped]
  [DEL] -5.26Ki  [DEL] -5.10Ki    _<figment::value::de::ConfiguredValueDe<I> as serde_core::de::Deserializer>::deserialize_struct::hedaa6bc6b4faff3a
  [DEL] -6.01Ki  [DEL] -5.93Ki    matchit::tree::Node<T>::insert::ha63513d725e691c6
  [DEL] -6.11Ki  [DEL] -6.02Ki    matchit::router::Router<T>::insert::h02e8e157485e9a5b
  [DEL] -6.17Ki  [DEL] -6.06Ki    axum::routing::path_router::PathRouter<S,_>::route::h75fd76341dec4a4b
  [DEL] -6.39Ki  [DEL] -6.31Ki    matchit::tree::Node<T>::insert::h2e5068899f601750
  -0.6% -37.1Ki  -0.6% -25.2Ki    [7148 Others]
  -0.1% -39.9Ki  -0.1% -19.6Ki    TOTAL

pr-commenter · 2026-05-15T19:54:30Z

Regression Detector (Agent Data Plane)

Run ID: 540d69aa-d5f5-4b28-a395-65c42b03be5d
Baseline: 55bca143 · Comparison: df02eb32 · Diff

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment (35)

Experiments configured with `erratic: true` are tagged "(ignored)" and skipped when determining which experiments regressed or improved. Experiments detected as erratic at runtime are tagged "(erratic)" to flag that the run's sample dispersion was high, but their regression / improvement signal still counts.

experiment	goal	Δ mean %	links
dsd_uds_1mb_3k_contexts_cpu (erratic)	cpu	⚪ +7.96	metrics profiles logs
otlp_ingest_metrics_5mb_memory	memory	⚪ +3.73	metrics profiles logs
dsd_uds_500mb_3k_contexts_throughput	throughput	⚪ -3.00	metrics profiles logs
dsd_uds_10mb_3k_contexts_cpu (erratic)	cpu	⚪ +2.36	metrics profiles logs
otlp_ingest_traces_ottl_transform_5mb_cpu (erratic)	cpu	⚪ +0.75	metrics profiles logs
dsd_uds_100mb_3k_contexts_cpu (erratic)	cpu	⚪ +0.36	metrics profiles logs
dsd_uds_10mb_3k_contexts_memory	memory	⚪ +0.28	metrics profiles logs
otlp_ingest_traces_ottl_transform_5mb_memory	memory	⚪ +0.24	metrics profiles logs
otlp_ingest_traces_5mb_cpu (erratic)	cpu	⚪ +0.22	metrics profiles logs
otlp_ingest_traces_5mb_memory	memory	⚪ +0.16	metrics profiles logs
quality_gates_rss_dsd_heavy	memory	⚪ +0.16	metrics profiles logs
quality_gates_rss_dsd_low	memory	⚪ +0.14	metrics profiles logs
dsd_uds_512kb_3k_contexts_cpu (erratic)	cpu	⚪ +0.10	metrics profiles logs
quality_gates_rss_idle	memory	⚪ +0.09	metrics profiles logs
otlp_ingest_metrics_5mb_throughput	throughput	⚪ -0.02	metrics profiles logs
otlp_ingest_logs_5mb_throughput (ignored)	throughput	⚪ -0.02	metrics profiles logs
otlp_ingest_traces_ottl_filtering_5mb_memory	memory	⚪ +0.01	metrics profiles logs
dsd_uds_512kb_3k_contexts_throughput	throughput	⚪ -0.00	metrics profiles logs
dsd_uds_1mb_3k_contexts_throughput	throughput	⚪ -0.00	metrics profiles logs
dsd_uds_100mb_3k_contexts_throughput	throughput	⚪ +0.00	metrics profiles logs
dsd_uds_10mb_3k_contexts_throughput	throughput	⚪ +0.02	metrics profiles logs
otlp_ingest_traces_5mb_throughput	throughput	⚪ +0.03	metrics profiles logs
dsd_uds_500mb_3k_contexts_memory	memory	⚪ -0.03	metrics profiles logs
otlp_ingest_traces_ottl_transform_5mb_throughput	throughput	⚪ +0.08	metrics profiles logs
otlp_ingest_traces_ottl_filtering_5mb_throughput	throughput	⚪ +0.12	metrics profiles logs
dsd_uds_512kb_3k_contexts_memory	memory	⚪ -0.16	metrics profiles logs
quality_gates_rss_dsd_medium	memory	⚪ -0.16	metrics profiles logs
quality_gates_rss_dsd_ultraheavy	memory	⚪ -0.17	metrics profiles logs
dsd_uds_100mb_3k_contexts_memory	memory	⚪ -0.23	metrics profiles logs
dsd_uds_1mb_3k_contexts_memory	memory	⚪ -0.25	metrics profiles logs
otlp_ingest_traces_ottl_filtering_5mb_cpu (erratic)	cpu	⚪ -0.55	metrics profiles logs
dsd_uds_500mb_3k_contexts_cpu (erratic)	cpu	⚪ -0.94	metrics profiles logs
otlp_ingest_metrics_5mb_cpu (erratic)	cpu	⚪ -1.27	metrics profiles logs
otlp_ingest_logs_5mb_memory (ignored)	memory	⚪ -1.29	metrics profiles logs
otlp_ingest_logs_5mb_cpu (ignored)	cpu	⚪ -1.80	metrics profiles logs

Bounds Checks: ✅ Passed (5)

experiment	check	replicates	observed	links
quality_gates_rss_dsd_heavy	memory_usage	10/10	✅ 123 MiB ≤ 140 MiB	metrics profiles logs
quality_gates_rss_dsd_low	memory_usage	10/10	✅ 39.8 MiB ≤ 50 MiB	metrics profiles logs
quality_gates_rss_dsd_medium	memory_usage	10/10	✅ 60.9 MiB ≤ 75 MiB	metrics profiles logs
quality_gates_rss_dsd_ultraheavy	memory_usage	10/10	✅ 178 MiB ≤ 200 MiB	metrics profiles logs
quality_gates_rss_idle	memory_usage	10/10	✅ 27.1 MiB ≤ 40 MiB	metrics profiles logs

Explanation

A change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal and SMP marks the experiment as a regression (`is_regression: true`). Improvements use the matching criteria for the improving direction. Experiments configured with `erratic: true` (tagged "(ignored)") are skipped outright; experiments detected as erratic at runtime (tagged "(erratic)") still count, since that flag describes sample dispersion rather than directional certainty. The Δ mean % cell is colored accordingly: 🟢 = improvement, 🔴 = regression, ⚪ = neutral. A reduction in CPU or memory is an improvement; a reduction in ingress throughput is a regression.
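The decision rule described above can be sketched roughly as follows. This is an illustration only, not the regression detector's code: the `Goal` and `Verdict` enums, the `classify` function, and the single `smp_flagged` parameter (standing in for SMP marking the experiment as a regression or improvement in the observed direction) are all hypothetical names.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Goal {
    Cpu,
    Memory,
    Throughput,
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Improvement,
    Regression,
    Neutral,
    Ignored,
}

/// Threshold on |Δ mean %| quoted in the explanation above.
const THRESHOLD_PCT: f64 = 5.0;

/// Hypothetical classifier following the rule in the explanation.
/// `smp_flagged` is an assumed stand-in for SMP's own significance flag.
fn classify(goal: Goal, delta_mean_pct: f64, smp_flagged: bool, configured_erratic: bool) -> Verdict {
    // Experiments configured `erratic: true` are skipped outright.
    if configured_erratic {
        return Verdict::Ignored;
    }
    // Both conditions must hold: a large enough delta and SMP's flag.
    if delta_mean_pct.abs() <= THRESHOLD_PCT || !smp_flagged {
        return Verdict::Neutral;
    }
    // More CPU or memory is bad; more ingress throughput is good.
    let increase_is_bad = matches!(goal, Goal::Cpu | Goal::Memory);
    let increased = delta_mean_pct > 0.0;
    if increased == increase_is_bad {
        Verdict::Regression
    } else {
        Verdict::Improvement
    }
}
```

Under this sketch, an experiment such as `dsd_uds_1mb_3k_contexts_cpu` at +7.96 stays neutral when SMP does not flag it, even though the delta exceeds the threshold.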

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: df02eb32c4

chatgpt-codex-connector · 2026-05-18T05:28:22Z

                let mut ddsketch = DDSketch::default();
                for sample in histogram.samples() {
-                    ddsketch.insert_n(sample.value.into_inner(), sample.weight);
+                    ddsketch.insert_n(sample.value.into_inner(), sample.weight.0 as u64);


Preserve fractional histogram weights when encoding sketches

When this encoder handles a histogram built from a non-integer sample rate such as @0.21, Histogram::insert now stores sample.weight as the raw weight (~4.76), while summary.count() rounds that to the nearest sample count. This cast truncates the same sample to 4 before inserting it into the DDSketch, so encoded histogram payloads undercount fractional-weight samples and can disagree with the aggregate count/sum produced from the same histogram. Convert the raw weight using the same rounding/accounting policy before passing it to insert_n.
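A small sketch of the discrepancy the review describes, assuming the raw weight is the reciprocal of the sample rate (consistent with `@0.21` giving roughly 4.76) and that the count is rounded to the nearest integer. The helper names are hypothetical and are not part of the saluki API; the actual rounding policy used by `summary.count()` is not shown in this PR page.

```rust
/// Assumed raw weight for a sample rate such as `@0.21`: 1 / rate.
fn raw_weight(sample_rate: f64) -> f64 {
    1.0 / sample_rate
}

/// What a plain `as u64` cast does: truncate toward zero.
fn truncated_count(weight: f64) -> u64 {
    weight as u64
}

/// Round-to-nearest conversion, the policy the review suggests matching.
fn rounded_count(weight: f64) -> u64 {
    weight.round() as u64
}
```

For a sample rate of 0.21, `truncated_count` yields 4 while `rounded_count` yields 5, so a sketch built with the truncating cast would report one fewer sample than a count derived by rounding.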

## Summary

Use modified Neumaier algorithm to calculate sums, counts, and quantiles for histogram samples.

The naive `sum += value * weight` loop suffers catastrophic cancellation when the sample stream contains values of wildly different magnitudes. The classic Kahan/Peters counter-example `{1, +1e100, 1, -1e100}` evaluates to 0 with naive summation but to the correct 2.0 with the new algorithm.

## Change Type

- [x] Bug fix
- [ ] New feature
- [ ] Non-functional (chore, refactoring, docs)
- [ ] Performance

## How did you test this PR?

Added unit tests to check correctness.

## References

[Similar PR in Datadog-Agent](DataDog/datadog-agent#49913).

Co-authored-by: mark.kirichenko <mark.kirichenko@datadoghq.com>

19791e3

dd-octo-sts Bot added the following labels May 15, 2026:

- area/core: Core functionality, event model, etc.
- area/components: Sources, transforms, and destinations.
- transform/aggregate: Aggregate transform.
- destination/prometheus: Prometheus Scrape destination.
- encoder/datadog-metrics: Datadog Metrics encoder.

tobz changed the title ~~Use compensated summation for histograms~~ fix(chore): use compensated summation for histograms May 15, 2026

tobz approved these changes May 15, 2026

atanzu changed the title ~~fix(chore): use compensated summation for histograms~~ fix(core): use compensated summation for histograms May 18, 2026

atanzu marked this pull request as ready for review May 18, 2026 05:25

atanzu requested a review from a team as a code owner May 18, 2026 05:25

gh-worker-dd-devflow-36fce6 Bot added mergequeue-status: queued mergequeue-status: in_progress and removed mergequeue-status: queued labels May 18, 2026

chatgpt-codex-connector Bot reviewed May 18, 2026

gh-worker-dd-mergequeue-cf854d Bot merged commit 19791e3 into main May 18, 2026
80 of 82 checks passed

gh-worker-dd-devflow-36fce6 Bot added mergequeue-status: done and removed mergequeue-status: in_progress labels May 18, 2026

atanzu deleted the mark.kirichenko/use-compensated-summation branch May 18, 2026 05:52
