fix(dogstatsd): match agent timestamped count sampling #1629

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 3 commits into main from feat/dsd-timestamp-extension-correctness on May 13, 2026

Contributor

@thieman thieman commented May 12, 2026

Summary

Updates the lading-payload dependency to the lading commit that can generate DogStatsD metric timestamps, then adds a correctness case for DogStatsD |T timestamp extension parity.

The test exposed a DogStatsD count parity bug: Saluki reinflated timestamped count values by @sample_rate, while the Datadog Agent no-aggregation pipeline forwards timestamped counts as pre-aggregated samples and does not apply sample-rate reinflation. This PR changes Saluki to match the Agent behavior for timestamped counts while preserving existing sample-rate handling for non-timestamped counts.
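
The behavioral difference can be illustrated with a minimal sketch. The type and function names below (`CountSample`, `effective_count`) are hypothetical and are not Saluki's actual types; the sketch only assumes the usual DogStatsD convention that a sampled count is scaled by `1 / sample_rate`, and that an out-of-range rate is passed through unchanged.

```rust
/// Hypothetical, simplified view of a parsed DogStatsD count sample.
/// These names are illustrative and are not Saluki's actual types.
#[derive(Debug, Clone, Copy)]
pub struct CountSample {
    /// Raw value from the packet, e.g. the `3` in `hits:3|c|@0.5`.
    pub value: f64,
    /// Sample rate from the `@` field; `None` if absent.
    pub sample_rate: Option<f64>,
    /// Unix timestamp in seconds from the `|T` field; `None` if absent.
    pub timestamp: Option<u64>,
}

/// Returns the value to forward for a count sample.
pub fn effective_count(sample: &CountSample) -> f64 {
    match (sample.timestamp, sample.sample_rate) {
        // Timestamped: treated as pre-aggregated, so the sample rate is ignored.
        (Some(_), _) => sample.value,
        // Not timestamped, usable sample rate: reinflate by 1 / rate.
        (None, Some(rate)) if rate > 0.0 && rate <= 1.0 => sample.value / rate,
        // No sample rate, or an out-of-range one (an assumption of this sketch).
        (None, _) => sample.value,
    }
}
```

Under these assumptions, `hits:3|c|@0.5` would be forwarded as 6, while `hits:3|c|@0.5|T1656581400` would be forwarded as 3.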

Key changes

  • Bump lading-payload to f06a75d63397c4e0210ca9184a3240515e547fec.
  • Add dsd-timestamp-extension correctness coverage for DogStatsD count and gauge metrics with |T on every generated metric.
  • Match Datadog Agent no-aggregation semantics by ignoring @sample_rate for timestamped DogStatsD counts.
  • Add parser unit coverage for timestamped count sample-rate behavior.
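
For readers unfamiliar with the wire format, the following is a rough sketch of how the `@` and `T` fields of a count line might be picked out. It is not Saluki's parser: `ParsedCount` and `parse_count` are hypothetical names, tags and other optional fields are skipped, and metric names containing `:` are not handled.

```rust
/// Hypothetical result of parsing one DogStatsD count line.
#[derive(Debug, PartialEq)]
pub struct ParsedCount {
    pub name: String,
    pub value: f64,
    pub sample_rate: Option<f64>,
    pub timestamp: Option<u64>,
}

/// Parses lines shaped like `name:value|c|@rate|T<unix seconds>`.
/// Returns `None` for anything that is not a well-formed count.
pub fn parse_count(line: &str) -> Option<ParsedCount> {
    let (name, rest) = line.split_once(':')?;
    let mut fields = rest.split('|');

    let value = fields.next()?.parse::<f64>().ok()?;
    if fields.next()? != "c" {
        return None; // only counts are handled in this sketch
    }

    let mut sample_rate: Option<f64> = None;
    let mut timestamp: Option<u64> = None;
    for field in fields {
        if let Some(rate) = field.strip_prefix('@') {
            sample_rate = Some(rate.parse::<f64>().ok()?);
        } else if let Some(ts) = field.strip_prefix('T') {
            timestamp = Some(ts.parse::<u64>().ok()?);
        }
        // Other optional fields (for example `#tags`) are ignored here.
    }

    Some(ParsedCount { name: name.to_string(), value, sample_rate, timestamp })
}
```

A unit test in this style would parse a timestamped and a non-timestamped line with the same `@` rate and assert that only the non-timestamped one is later reinflated.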

Test plan

  • cargo check -p millstone --locked
  • cargo test -p saluki-io metric_ --locked
  • cargo check -p saluki-io --locked
  • make fmt
  • cargo check --workspace
  • cargo check --workspace --tests
  • make build-correctness-tools-image
  • make build-datadog-agent-image-release
  • make test-correctness-case CASE=dsd-timestamp-extension

@dd-octo-sts dd-octo-sts Bot added the area/io (General I/O and networking.) and area/test (All things testing: unit/integration, correctness, SMP regression, etc.) labels May 12, 2026
@thieman thieman force-pushed the feat/dsd-timestamp-extension-correctness branch from 0901125 to 7aca132 May 12, 2026 13:37
pr-commenter Bot commented May 12, 2026

Binary Size Analysis (Agent Data Plane)

Target: eaba660 (baseline) vs 6ce845b (comparison) diff
Analysis Type: Stripped binaries (debug symbols excluded)
Baseline Size: 37.79 MiB
Comparison Size: 37.79 MiB
Size Change: -1.45 KiB (-0.00%)
Pass/Fail Threshold: +5%
Result: PASSED ✅

Changes by Module

| Module | File Size | Symbols |
|---|---|---|
| anon.cf211850cb5a0474cf4fe0d46cc11d3e.99.llvm.17789380969698316968 | -1.69 KiB | 1 |
| anon.fd7b529fe56f46398ecdea2e5e277011.5.llvm.6425510092197803827 | +1.69 KiB | 1 |
| anon.4dcc86aaff35521a9f90eabb166ceeb8.418.llvm.15542242523218362313 | -1.33 KiB | 1 |
| anon.dd5b1f3bfca0f8b909ebdc618a7bf277.21.llvm.1618916015308137990 | +1.33 KiB | 1 |
| core | +1.24 KiB | 220 |
| crossbeam_channel | -957 B | 11 |
| [sections] | -901 B | 6 |
| anon.7f89611b641a2d0028b07647399f351e.777.llvm.10929549773975020062 | +810 B | 1 |
| anon.eeae9b2e26f2bcbc4d2b3483aec2acd0.124.llvm.5119949059111124976 | -809 B | 1 |
| saluki_io::net::client | +795 B | 5 |
| anon.5b14a8caac0857b9c5e903ad657737db.4.llvm.5658531652013958232 | +727 B | 1 |
| anon.d2356be232af963d32de3ff679fa24b8.14.llvm.5968407893044336533 | -726 B | 1 |
| anon.f6d4818ac94534343f6adba3392a5bb9.1.llvm.11196830799817033691 | -643 B | 1 |
| anon.8d8d427fc2a63f9b42ef8d31008c06b2.19.llvm.979421898526345460 | +641 B | 1 |
| anon.5b14a8caac0857b9c5e903ad657737db.48.llvm.5658531652013958232 | -641 B | 1 |
| anon.9f797fdb4cbea7c17bea34eda4b15446.137.llvm.6064287306420364224 | +640 B | 1 |
| anon.1bca455675b592a6268745e3ab7b6811.88.llvm.15185501355693506477 | +635 B | 1 |
| anon.eeae9b2e26f2bcbc4d2b3483aec2acd0.92.llvm.5119949059111124976 | -634 B | 1 |
| anon.a1685a5790225e85dd63e440de0fda1d.159.llvm.17794694613133727232 | -552 B | 1 |
| anon.5b14a8caac0857b9c5e903ad657737db.54.llvm.5658531652013958232 | +551 B | 1 |

Detailed Symbol Changes

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  [NEW] +1.69Ki  [NEW]     +97    anon.fd7b529fe56f46398ecdea2e5e277011.5.llvm.6425510092197803827
  [NEW] +1.33Ki  [NEW]     +91    anon.dd5b1f3bfca0f8b909ebdc618a7bf277.21.llvm.1618916015308137990
  [NEW] +1.16Ki  [NEW]    +147    _<std::sync::poison::PoisonError<T> as core::fmt::Debug>::fmt::h3af5c6f2b6b015e6
   +17% +1.13Ki   +17% +1.13Ki    h2::codec::framed_write::Encoder<B>::buffer::h7014e99ca07a5110
  +830% +1.07Ki +30e2% +1.07Ki    _<&T as core::fmt::Debug>::fmt::h7696318a445b5915
  [NEW]    +810  [NEW]     +82    anon.7f89611b641a2d0028b07647399f351e.777.llvm.10929549773975020062
  [NEW]    +727  [NEW]     +98    anon.5b14a8caac0857b9c5e903ad657737db.4.llvm.5658531652013958232
  [NEW]    +641  [NEW]    +100    anon.8d8d427fc2a63f9b42ef8d31008c06b2.19.llvm.979421898526345460
  [NEW]    +640  [NEW]     +96    anon.9f797fdb4cbea7c17bea34eda4b15446.137.llvm.6064287306420364224
  [NEW]    +635  [NEW]     +92    anon.1bca455675b592a6268745e3ab7b6811.88.llvm.15185501355693506477
  [DEL]    -634  [DEL]     -92    anon.eeae9b2e26f2bcbc4d2b3483aec2acd0.92.llvm.5119949059111124976
  [DEL]    -641  [DEL]     -96    anon.5b14a8caac0857b9c5e903ad657737db.48.llvm.5658531652013958232
  [DEL]    -643  [DEL]    -100    anon.f6d4818ac94534343f6adba3392a5bb9.1.llvm.11196830799817033691
  [DEL]    -691  [DEL]    -595    _<&T as core::fmt::Debug>::fmt::h8ad6e70360eee548
  [DEL]    -714  [DEL]    -623    h2::frame::data::Data<T>::encode_chunk::h7028208793ac0190
  [DEL]    -726  [DEL]     -98    anon.d2356be232af963d32de3ff679fa24b8.14.llvm.5968407893044336533
  [DEL]    -797  [DEL]    -147    _<std::sync::poison::PoisonError<T> as core::fmt::Debug>::fmt::hd6bd3671164cce68
  [DEL]    -809  [DEL]     -82    anon.eeae9b2e26f2bcbc4d2b3483aec2acd0.124.llvm.5119949059111124976
  [DEL] -1.33Ki  [DEL]     -91    anon.4dcc86aaff35521a9f90eabb166ceeb8.418.llvm.15542242523218362313
  [DEL] -1.69Ki  [DEL]     -97    anon.cf211850cb5a0474cf4fe0d46cc11d3e.99.llvm.17789380969698316968
  -0.1% -2.66Ki  -0.0%    -929    [1013 Others]
  -0.0% -1.45Ki  +0.0%    +104    TOTAL

pr-commenter Bot commented May 12, 2026

Regression Detector (Agent Data Plane)

Run ID: f702bb66-d6b4-4635-ba0a-68b7d047c282
Baseline: eaba6602 · Comparison: 6ce845bb · Diff

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment (35)

Experiments configured erratic: true are tagged (ignored) and skipped when determining which experiments regressed or improved. Experiments which are detected as erratic at runtime are tagged (erratic) to flag that the run's sample dispersion was high, but their regression / improvement signal still counts.

| experiment | goal | Δ mean % | links |
|---|---|---|---|
| otlp_ingest_logs_5mb_memory (ignored) | memory | ⚪ +7.12 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_cpu (erratic) | cpu | ⚪ +4.39 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_cpu (erratic) | cpu | ⚪ +3.49 | metrics profiles logs |
| otlp_ingest_metrics_5mb_cpu (erratic) | cpu | ⚪ +2.31 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_cpu (erratic) | cpu | ⚪ +0.50 | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory | ⚪ +0.29 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_memory | memory | ⚪ +0.21 | metrics profiles logs |
| otlp_ingest_traces_5mb_memory | memory | ⚪ +0.09 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_memory | memory | ⚪ +0.05 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_throughput | throughput | ⚪ -0.04 | metrics profiles logs |
| otlp_ingest_metrics_5mb_throughput | throughput | ⚪ -0.01 | metrics profiles logs |
| otlp_ingest_logs_5mb_throughput (ignored) | throughput | ⚪ -0.01 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_memory | memory | ⚪ +0.01 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_throughput | throughput | ⚪ -0.00 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_throughput | throughput | ⚪ -0.00 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_throughput | throughput | ⚪ -0.00 | metrics profiles logs |
| otlp_ingest_traces_5mb_throughput | throughput | ⚪ +0.00 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_throughput | throughput | ⚪ +0.01 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_memory | memory | ⚪ -0.01 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_cpu (erratic) | cpu | ⚪ -0.01 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_throughput | throughput | ⚪ +0.02 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_memory | memory | ⚪ -0.04 | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory | ⚪ -0.05 | metrics profiles logs |
| quality_gates_rss_dsd_low | memory | ⚪ -0.14 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_memory | memory | ⚪ -0.15 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_memory | memory | ⚪ -0.18 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_throughput | throughput | ⚪ +0.18 | metrics profiles logs |
| otlp_ingest_logs_5mb_cpu (ignored) | cpu | ⚪ -0.32 | metrics profiles logs |
| otlp_ingest_traces_5mb_cpu (erratic) | cpu | ⚪ -0.35 | metrics profiles logs |
| quality_gates_rss_dsd_heavy | memory | ⚪ -0.49 | metrics profiles logs |
| quality_gates_rss_idle | memory | ⚪ -0.53 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_cpu (erratic) | cpu | ⚪ -0.68 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_cpu (erratic) | cpu | ⚪ -1.20 | metrics profiles logs |
| otlp_ingest_metrics_5mb_memory | memory | ⚪ -2.28 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_cpu (erratic) | cpu | ⚪ -3.95 | metrics profiles logs |

Bounds Checks: ✅ Passed (5)

| experiment | check | replicates | observed | links |
|---|---|---|---|---|
| quality_gates_rss_dsd_heavy | memory_usage | 10/10 ✅ | 120 MiB ≤ 140 MiB | metrics profiles logs |
| quality_gates_rss_dsd_low | memory_usage | 10/10 ✅ | 39.7 MiB ≤ 50 MiB | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory_usage | 10/10 ✅ | 60.5 MiB ≤ 75 MiB | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory_usage | 10/10 ✅ | 175 MiB ≤ 200 MiB | metrics profiles logs |
| quality_gates_rss_idle | memory_usage | 10/10 ✅ | 27 MiB ≤ 40 MiB | metrics profiles logs |

Explanation

A change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression (is_regression: true). Improvements use the matching criteria for the improving direction. Experiments configured erratic: true (tagged (ignored)) are skipped outright; experiments detected as erratic at runtime (tagged (erratic)) still count, since that flag describes sample dispersion rather than directional certainty. The Δ mean % cell is colored accordingly: 🟢 = improvement, 🔴 = regression, ⚪ = neutral. Reduction in CPU or memory is an improvement; reduction in ingress throughput is a regression.
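
The decision rule described in this explanation can be sketched roughly as follows. This is an illustration only, not the detector's code: the `Experiment` fields, the `Verdict` enum, and in particular the `smp_is_improvement` flag (assumed by analogy with `is_regression`) are hypothetical names.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Goal { Cpu, Memory, Throughput }

#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Verdict { Improvement, Regression, Neutral, Ignored }

/// Hypothetical per-experiment inputs; field names are assumptions.
pub struct Experiment {
    pub goal: Goal,
    pub delta_mean_pct: f64,
    /// `erratic: true` in the experiment configuration.
    pub configured_erratic: bool,
    pub smp_is_regression: bool,
    pub smp_is_improvement: bool,
}

const THRESHOLD_PCT: f64 = 5.0;

pub fn classify(e: &Experiment) -> Verdict {
    // Experiments configured as erratic are skipped outright.
    if e.configured_erratic {
        return Verdict::Ignored;
    }
    // The change must exceed the threshold in absolute value.
    if e.delta_mean_pct.abs() <= THRESHOLD_PCT {
        return Verdict::Neutral;
    }
    // More CPU or memory is worse; less throughput is worse.
    let worse = match e.goal {
        Goal::Cpu | Goal::Memory => e.delta_mean_pct > 0.0,
        Goal::Throughput => e.delta_mean_pct < 0.0,
    };
    // The direction must agree with the corresponding SMP flag.
    match (worse, e.smp_is_regression, e.smp_is_improvement) {
        (true, true, _) => Verdict::Regression,
        (false, _, true) => Verdict::Improvement,
        _ => Verdict::Neutral,
    }
}
```

Under this sketch, the `otlp_ingest_logs_5mb_memory (ignored)` row at +7.12 is skipped despite exceeding 5%, which is consistent with its neutral marker in the table.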

Comment thread Cargo.toml Outdated
  home = { version = "0.5", default-features = false }
  reqwest = { version = "0.13", default-features = false }
- lading-payload = { git = "https://github.com/DataDog/lading", rev = "3eaedacabff0f3fc9947b019c020c5d020adf808" }
+ lading-payload = { git = "https://github.com/DataDog/lading", rev = "08f9908c9b459fff9e1f001e58f31f0951251db3" }
Contributor Author

Need to wait for DataDog/lading#1873 to land and then restamp this.

Contributor Author

Done in 3b11dc6

@thieman thieman force-pushed the feat/dsd-timestamp-extension-correctness branch from 7aca132 to 44c49ec May 12, 2026 14:01
@thieman thieman marked this pull request as ready for review May 12, 2026 14:22
@thieman thieman requested a review from a team as a code owner May 12, 2026 14:22
@thieman thieman force-pushed the feat/dsd-timestamp-extension-correctness branch from 3b11dc6 to 6ce845b May 13, 2026 09:13
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot merged commit f0a3687 into main May 13, 2026
75 checks passed
dd-octo-sts Bot pushed a commit that referenced this pull request May 13, 2026
Co-authored-by: travis.thieman <travis.thieman@datadoghq.com> f0a3687
Labels

area/io (General I/O and networking.), area/test (All things testing: unit/integration, correctness, SMP regression, etc.), mergequeue-status: done
