feat(metrics): support v3 payload splitting #1758

Merged
rayz merged 3 commits into tobz/datadog-metrics-v3-payload-support from rayz/metrics-v3-payload-splitting-clean on May 28, 2026

Conversation

@rayz rayz commented May 27, 2026

Summary

Adds V3 metrics payload splitting so ADP no longer assumes each V3 flush produces a single request.

The encoder now enforces V3 series/sketch payload limits, drops unsendable zero-point or oversized metrics, and emits correct X-Metrics-Request-Seq / X-Metrics-Request-Len headers when a flush produces multiple requests.
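
As a rough illustration of the header scheme (not the encoder's actual code), the sketch below builds the two headers for every request in one flush. The helper name `batch_headers`, the zero-based sequence numbering, and attaching the headers to every request are assumptions made for the example; this description only states that the headers are emitted when a flush produces multiple requests.

```rust
/// Hypothetical helper: returns the (name, value) header pairs for each
/// request produced by a single flush.
///
/// Assumption: sequence numbers start at 0. The real encoder may number
/// requests differently.
fn batch_headers(request_count: usize) -> Vec<[(&'static str, String); 2]> {
    (0..request_count)
        .map(|seq| {
            [
                // Position of this request within the flush.
                ("X-Metrics-Request-Seq", seq.to_string()),
                // Total number of requests produced by the flush.
                ("X-Metrics-Request-Len", request_count.to_string()),
            ]
        })
        .collect()
}
```

With this sketch, a flush that splits into three requests yields `Seq` values 0, 1, and 2, each paired with a `Len` of 3, so a receiver could tell whether it has seen the whole batch.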

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

How did you test this PR?

unit tests / ci

References

@rayz rayz changed the title from "add support for splitting v3 payloads" to "feat(metrics): support v3 payload splitting" May 27, 2026
@dd-octo-sts dd-octo-sts Bot added the area/components (Sources, transforms, and destinations.) and encoder/datadog-metrics (Datadog Metrics encoder.) labels May 27, 2026
@rayz rayz force-pushed the rayz/metrics-v3-payload-splitting-clean branch from 1a4fe2d to 4989436 Compare May 27, 2026 22:32
@rayz rayz marked this pull request as ready for review May 27, 2026 22:34
@rayz rayz requested a review from a team as a code owner May 27, 2026 22:34
Copilot AI review requested due to automatic review settings May 27, 2026 22:34
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4989436a02

Comment on lines +296 to +298
let v3_payload_limits = V3PayloadLimits::new(
self.max_series_payload_size,
self.max_series_uncompressed_payload_size,
P2: Separate V3 sketch payload limits from series limits

When V3 sketches are enabled, this single v3_payload_limits value is also used by encode_and_flush_v3_sketch_metrics, so sketch splitting/dropping is controlled by serializer_max_series_payload_size and serializer_max_series_uncompressed_payload_size rather than the generic sketch limits. In a config that only intends to constrain series payloads (for example setting the series max to 0 or lower than the generic limit), V3 sketch payloads are unexpectedly split or dropped even though serializer_max_payload_size would allow them.

Comment on lines +989 to +992
if metric_points == 0 {
// The Agent drops zero-point V3 metrics before writing them.
context.telemetry.events_dropped_encoder().increment(1);
continue;
P2: Exclude zero-point metrics from open V3 ranges

For zero-point metrics that occur after a non-empty metric, this continue leaves current_start unchanged, so the later range (for example 0..metrics.len()) still includes the metric that was just counted as dropped. That means a batch like [valid_metric, empty_metric] still encodes and reports the empty V3 metric instead of dropping it, while also incrementing the dropped counter.
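
One way to avoid the problem described above is to close the open range before skipping an empty metric. The sketch below is a simplified, standalone illustration of that idea and is not taken from the fix commit: `ranges_without_empty` is a hypothetical function, and it works on a plain slice of per-metric point counts rather than the encoder's real types.

```rust
use std::ops::Range;

/// Hypothetical sketch: given the point count of each metric in a batch,
/// returns the index ranges that should be encoded and the number of
/// metrics that were dropped for having zero points.
fn ranges_without_empty(point_counts: &[usize]) -> (Vec<Range<usize>>, usize) {
    let mut ranges = Vec::new();
    let mut dropped = 0;
    // Start index of the range currently being accumulated, if any.
    let mut current_start: Option<usize> = None;

    for (idx, &points) in point_counts.iter().enumerate() {
        if points == 0 {
            // Close the open range *before* skipping, so the empty metric
            // at `idx` is not swept into a later range.
            if let Some(start) = current_start.take() {
                ranges.push(start..idx);
            }
            dropped += 1;
            continue;
        }
        if current_start.is_none() {
            current_start = Some(idx);
        }
    }

    // Flush whatever range is still open at the end of the batch.
    if let Some(start) = current_start {
        ranges.push(start..point_counts.len());
    }
    (ranges, dropped)
}
```

For the `[valid_metric, empty_metric]` batch from the comment, this returns the single range `0..1` and a dropped count of 1, so the empty metric is counted as dropped and is also left out of the encoded range.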

Copilot AI left a comment

Pull request overview

Adds V3 metrics payload splitting so a single flush can produce multiple HTTP requests while enforcing payload size and point-count limits.

Changes:

  • Introduces V3 payload limit/request helper types.
  • Splits V3 series and sketch flushes by point count and encoded payload size.
  • Adds tests for V3 splitting and batch header propagation.
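
The splitting behavior summarized above can be pictured as a greedy pass over the batch. The following is a minimal sketch under stated assumptions, not the PR's implementation: `Limits` and `split_payloads` are hypothetical names, each metric is reduced to a `(points, encoded_bytes)` pair, encoded sizes are treated as simply additive (real payload sizes depend on encoding and compression), and a metric that exceeds a limit on its own is dropped rather than subdivided.

```rust
/// Hypothetical stand-in for the PR's limit type; the fields are assumptions.
struct Limits {
    max_points: usize,
    max_bytes: usize,
}

/// Greedily groups metrics, given as `(points, encoded_bytes)`, into payloads.
/// Returns the metric indices in each payload and the number of dropped metrics.
fn split_payloads(metrics: &[(usize, usize)], limits: &Limits) -> (Vec<Vec<usize>>, usize) {
    let mut payloads = Vec::new();
    let mut current = Vec::new();
    let (mut points, mut bytes, mut dropped) = (0usize, 0usize, 0usize);

    for (idx, &(m_points, m_bytes)) in metrics.iter().enumerate() {
        // Unsendable: no points, or too large to fit in any single payload.
        if m_points == 0 || m_points > limits.max_points || m_bytes > limits.max_bytes {
            dropped += 1;
            continue;
        }
        // Start a new payload when adding this metric would exceed a limit.
        if points + m_points > limits.max_points || bytes + m_bytes > limits.max_bytes {
            payloads.push(std::mem::take(&mut current));
            points = 0;
            bytes = 0;
        }
        current.push(idx);
        points += m_points;
        bytes += m_bytes;
    }

    if !current.is_empty() {
        payloads.push(current);
    }
    (payloads, dropped)
}
```

The number of payloads returned here is what a caller would use as the batch length when labelling each request of the flush.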

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| lib/saluki-components/src/encoders/datadog/metrics/v3/payload.rs | Adds V3 payload limit and request metadata structs. |
| lib/saluki-components/src/encoders/datadog/metrics/v3/mod.rs | Exposes the new V3 payload module internally. |
| lib/saluki-components/src/encoders/datadog/metrics/mod.rs | Wires V3 payload limits into the encoder and implements split/flush behavior with tests. |

Comment on lines +296 to +301
let v3_payload_limits = V3PayloadLimits::new(
self.max_series_payload_size,
self.max_series_uncompressed_payload_size,
self.max_metrics_per_payload,
SERIES_V3_POINTS_PER_PAYLOAD_LIMIT,
);
Contributor Author

Intentional, for parity with datadog-agent.

Comment on lines +990 to +991
// The Agent drops zero-point V3 metrics before writing them.
context.telemetry.events_dropped_encoder().increment(1);
Contributor Author

Fixed in 9a656ae

pr-commenter Bot commented May 27, 2026

Binary Size Analysis (Agent Data Plane)

Baseline: 1ad7cde · Comparison: 608d534 · diff
Analysis Configuration: stripped binaries · Pass/Fail Threshold: +5%
Sizes: 37.65 MiB (baseline) vs 37.98 MiB (comparison)
Size Change: +333.23 KiB (+0.86%)

✅ Binary size difference within threshold

Changes by Module
| Module | File Size | Symbols |
| --- | --- | --- |
| saluki_components::encoders::datadog | +73.60 KiB | 356 |
| core | +53.32 KiB | 14508 |
| anyhow | +39.32 KiB | 1699 |
| alloc | +36.42 KiB | 2407 |
| hyper | +27.89 KiB | 583 |
| saluki_components::sources::otlp | -21.78 KiB | 236 |
| hashbrown | +20.89 KiB | 1114 |
| figment | +20.88 KiB | 700 |
| [sections] | +15.08 KiB | 9 |
| axum | +13.21 KiB | 427 |
| chrono | -13.08 KiB | 23 |
| http | +11.74 KiB | 419 |
| serde_core | +10.91 KiB | 889 |
| hyper_util | -10.21 KiB | 139 |
| tokio | +9.87 KiB | 4504 |
| h2 | +9.04 KiB | 779 |
| saluki_components::common::datadog | +8.80 KiB | 439 |
| saluki_components::relays::otlp | +7.62 KiB | 47 |
| tracing | -6.83 KiB | 175 |
| http_body_util | -6.40 KiB | 215 |

Detailed Symbol Changes
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +1.6%  +332Ki  +1.6%  +260Ki    [47066 Others]
  [NEW]  +136Ki  [NEW]  +136Ki    agent_data_plane::cli::run::handle_run_command::_{{closure}}::hf265beba717a8ffb
  [NEW] +66.7Ki  [NEW] +66.5Ki    saluki_core::topology::built::BuiltTopology::spawn::_{{closure}}::hef6f38503239c166
  [NEW] +66.1Ki  [NEW] +66.0Ki    agent_data_plane::cli::run::create_topology::_{{closure}}::hdcdf70adbde87ffc
  [NEW] +58.7Ki  [NEW] +58.5Ki    agent_data_plane::internal::env::workload::build_collector::_{{closure}}::ha56b75595a60c965
  [NEW] +58.2Ki  [NEW] +58.0Ki    saluki_core::topology::blueprint::TopologyBlueprint::build::_{{closure}}::hacbbeaf29f1d76dc
  [NEW] +57.1Ki  [NEW] +56.9Ki    agent_data_plane::internal::env::ADPEnvironmentProvider::from_configuration::_{{closure}}::hb77df6cc347f39b1
  [NEW] +56.7Ki  [NEW] +56.5Ki    agent_data_plane::cli::debug::handle_debug_command::_{{closure}}::ha2e6796c57380dae
  [NEW] +56.0Ki  [NEW] +55.8Ki    agent_data_plane::cli::dogstatsd::handle_dogstatsd_command::_{{closure}}::hb45f834f247c7f00
  [NEW] +49.8Ki  [NEW] +49.5Ki    agent_data_plane::main::_{{closure}}::hafe55c655b44bc0f
  [NEW] +49.7Ki  [NEW] +49.6Ki    core::ops::function::FnOnce::call_once::he29f19ec791745c2
  [DEL] -48.9Ki  [DEL] -48.8Ki    core::ops::function::FnOnce::call_once::h669ff363e842dff3
  [DEL] -49.7Ki  [DEL] -49.4Ki    agent_data_plane::main::_{{closure}}::h7d49a9469c34b899
  [DEL] -56.5Ki  [DEL] -56.3Ki    agent_data_plane::cli::dogstatsd::handle_dogstatsd_command::_{{closure}}::he79b0808df75cda1
  [DEL] -56.6Ki  [DEL] -56.4Ki    agent_data_plane::cli::debug::handle_debug_command::_{{closure}}::h7c655b6902e2f89a
  [DEL] -57.1Ki  [DEL] -56.9Ki    agent_data_plane::internal::env::ADPEnvironmentProvider::from_configuration::_{{closure}}::ha0920a2b79b02cce
  [DEL] -57.8Ki  [DEL] -57.6Ki    saluki_core::topology::blueprint::TopologyBlueprint::build::_{{closure}}::h32219546faa6f65a
  [DEL] -58.4Ki  [DEL] -58.2Ki    agent_data_plane::internal::env::workload::build_collector::_{{closure}}::ha8e6b4b93de14a84
  [DEL] -66.2Ki  [DEL] -66.0Ki    saluki_core::topology::built::BuiltTopology::spawn::_{{closure}}::hfe1e9d26bba4c0cd
  [DEL] -66.3Ki  [DEL] -66.1Ki    agent_data_plane::cli::run::create_topology::_{{closure}}::h9611a56813a60eb8
  [DEL]  -136Ki  [DEL]  -136Ki    agent_data_plane::cli::run::handle_run_command::_{{closure}}::h4399bd59161f275e
  +0.9%  +333Ki  +0.8%  +261Ki    TOTAL

pr-commenter Bot commented May 27, 2026

Regression Detector (Agent Data Plane)

Run ID: 90bc22be-e1b8-4fcb-9f94-32c3e4f104a6
Baseline: 1ad7cde2 · Comparison: 608d5349 · diff

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment (35)

Experiments configured erratic: true are tagged (ignored) and skipped when determining which experiments regressed or improved. Experiments which are detected as erratic at runtime are tagged (erratic) to flag that the run's sample dispersion was high, but their regression / improvement signal still counts.

| experiment | goal | Δ mean % | links |
| --- | --- | --- | --- |
| otlp_ingest_logs_5mb_memory (ignored) | memory | ⚪ +6.22 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_cpu (erratic) | cpu | ⚪ +4.67 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_memory | memory | ⚪ +1.43 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_memory | memory | ⚪ +1.22 | metrics profiles logs |
| quality_gates_rss_idle | memory | ⚪ +1.17 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_memory | memory | ⚪ +1.14 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_memory | memory | ⚪ +1.11 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_memory | memory | ⚪ +0.99 | metrics profiles logs |
| otlp_ingest_logs_5mb_cpu (ignored) | cpu | ⚪ +0.92 | metrics profiles logs |
| otlp_ingest_traces_5mb_cpu (erratic) | cpu | ⚪ +0.89 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_memory | memory | ⚪ +0.52 | metrics profiles logs |
| otlp_ingest_traces_5mb_throughput | throughput | ⚪ -0.49 | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory | ⚪ +0.47 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_cpu (erratic) | cpu | ⚪ +0.47 | metrics profiles logs |
| otlp_ingest_metrics_5mb_memory | memory | ⚪ +0.46 | metrics profiles logs |
| quality_gates_rss_dsd_low | memory | ⚪ +0.36 | metrics profiles logs |
| otlp_ingest_traces_5mb_memory | memory | ⚪ +0.33 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_cpu (erratic) | cpu | ⚪ +0.30 | metrics profiles logs |
| quality_gates_rss_dsd_heavy | memory | ⚪ +0.20 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_memory | memory | ⚪ +0.19 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_throughput | throughput | ⚪ -0.03 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_throughput | throughput | ⚪ -0.03 | metrics profiles logs |
| otlp_ingest_metrics_5mb_throughput | throughput | ⚪ -0.02 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_throughput | throughput | ⚪ -0.00 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_throughput | throughput | ⚪ +0.00 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_throughput | throughput | ⚪ +0.00 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_throughput | throughput | ⚪ +0.01 | metrics profiles logs |
| otlp_ingest_logs_5mb_throughput (ignored) | throughput | ⚪ +0.02 | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory | ⚪ -0.02 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_cpu (erratic) | cpu | ⚪ -0.13 | metrics profiles logs |
| otlp_ingest_metrics_5mb_cpu (erratic) | cpu | ⚪ -0.75 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_throughput | throughput | ⚪ +1.48 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_cpu (erratic) | cpu | ⚪ -1.55 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_cpu (erratic) | cpu | ⚪ -1.88 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_cpu (erratic) | cpu | ⚪ -3.30 | metrics profiles logs |

Bounds Checks: ✅ Passed (5)

| experiment | check | replicates | observed | links |
| --- | --- | --- | --- | --- |
| quality_gates_rss_dsd_heavy | memory_usage | 10/10 | ✅ 123 MiB ≤ 140 MiB | metrics profiles logs |
| quality_gates_rss_dsd_low | memory_usage | 10/10 | ✅ 40 MiB ≤ 50 MiB | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory_usage | 10/10 | ✅ 60.3 MiB ≤ 75 MiB | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory_usage | 10/10 | ✅ 178 MiB ≤ 200 MiB | metrics profiles logs |
| quality_gates_rss_idle | memory_usage | 10/10 | ✅ 27.3 MiB ≤ 40 MiB | metrics profiles logs |

Explanation

A change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression (is_regression: true). Improvements use the matching criteria for the improving direction. Experiments configured erratic: true (tagged (ignored)) are skipped outright; experiments detected as erratic at runtime (tagged (erratic)) still count, since that flag describes sample dispersion rather than directional certainty. The Δ mean % cell is colored accordingly: 🟢 = improvement, 🔴 = regression, ⚪ = neutral. Reduction in CPU or memory is an improvement; reduction in ingress throughput is a regression.
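
The rule above can be written out as a small decision function. This is only an illustrative sketch of the stated criteria, not the detector's code: `Verdict`, `Goal`, and `classify` are hypothetical names, and `smp_flagged` is assumed to mean that SMP flagged the change in the direction it moved (the text only spells out `is_regression: true` for the regression case).

```rust
#[derive(Debug, PartialEq)]
enum Verdict {
    Improvement, // 🟢
    Regression,  // 🔴
    Neutral,     // ⚪
}

#[derive(Clone, Copy)]
enum Goal {
    Cpu,
    Memory,
    Throughput,
}

/// Threshold on |Δ mean %| taken from the explanation text.
const THRESHOLD_PCT: f64 = 5.0;

/// Hypothetical classifier for one experiment row.
/// `ignored` corresponds to experiments configured `erratic: true`.
fn classify(goal: Goal, delta_mean_pct: f64, smp_flagged: bool, ignored: bool) -> Verdict {
    // Ignored experiments are skipped outright; otherwise both the size of
    // the change and the SMP flag are required.
    if ignored || !smp_flagged || delta_mean_pct.abs() <= THRESHOLD_PCT {
        return Verdict::Neutral;
    }
    let increased = delta_mean_pct > 0.0;
    match (goal, increased) {
        // More CPU or memory is worse; less is better.
        (Goal::Cpu | Goal::Memory, true) => Verdict::Regression,
        (Goal::Cpu | Goal::Memory, false) => Verdict::Improvement,
        // More ingress throughput is better; less is worse.
        (Goal::Throughput, true) => Verdict::Improvement,
        (Goal::Throughput, false) => Verdict::Regression,
    }
}
```

Under these assumptions, the `otlp_ingest_logs_5mb_memory` row (+6.22, tagged (ignored)) stays neutral despite exceeding the threshold, which matches the ⚪ shown in the table.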

@rayz rayz merged commit 56bd735 into tobz/datadog-metrics-v3-payload-support May 28, 2026
73 checks passed
@rayz rayz deleted the rayz/metrics-v3-payload-splitting-clean branch May 28, 2026 20:09