Skip to content

enhancement(datadog encoder): add datadog metrics v3 mode switching#1599

Merged
tobz merged 7 commits into
tobz/datadog-metrics-v3-payload-supportfrom
rayz/v3-mode-switching
May 8, 2026
Merged

enhancement(datadog encoder): add datadog metrics v3 mode switching#1599
tobz merged 7 commits into
tobz/datadog-metrics-v3-payload-supportfrom
rayz/v3-mode-switching

Conversation

@rayz
Copy link
Copy Markdown
Contributor

@rayz rayz commented May 6, 2026

Summary

This PR adds Datadog metrics V3 mode switching to ADP, aligning the encoder/forwarder behavior with
the datadog-agent V2/V3 setup. This branch is branched off of tobz/datadog-metrics-v3-payload-support which contains the initial v3 implementation.

It supports the Agent-style modes per endpoint and payload family:

  • V2Only: send V2 payloads only.
  • V3Enabled: send V3 to endpoints configured for V3, while still allowing V2 payloads for legacy/
    additional endpoints.
  • Validation: send both V2 and V3 for comparison, with matching validation batch IDs.

Correctness coverage

  • Adds a dsd-plain-v3-validation correctness scenario for DogStatsD metrics with V3 validation mode
    enabled for series and sketches.
  • Updates the correctness intake to correlate V2/V3 validation payloads using X-Metrics-Request-*
    headers, compare paired payloads, and count V3 payloads in the metrics dump.

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

How did you test this PR?

New correctness test added

References

@dd-octo-sts dd-octo-sts Bot added area/components Sources, transforms, and destinations. area/test All things testing: unit/integration, correctness, SMP regression, etc. encoder/datadog-metrics Datadog Metrics encoder. labels May 6, 2026
@rayz rayz changed the title Rayz/v3 mode switching enhancement(datadog encoder): add datadog metrics v3 mode switching May 6, 2026
@dd-octo-sts dd-octo-sts Bot added the area/ci CI/CD, automated testing, etc. label May 6, 2026
@rayz rayz marked this pull request as ready for review May 6, 2026 19:01
@rayz rayz requested a review from a team as a code owner May 6, 2026 19:01
Copilot AI review requested due to automatic review settings May 6, 2026 19:01
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2dc91276bc

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +491 to 493
if endpoint_mode.needs_batch_id() && (v2_flushed || v3_flushed) {
*batch_id = None;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Retain batch ID after V2-triggered split flush

When v2_flushed is true, this branch clears batch_id immediately even though the current metric is pushed back into v3_metrics for the next batch. If no additional metric arrives before the timeout flush, that pending metric is flushed with series_active_batch_id/sketches_active_batch_id = None, so X-Metrics-Request-* headers are omitted and the intake cannot correlate the V2/V3 validation pair. This causes intermittent loss of validation pairing under low traffic or stream shutdown right after a size-triggered flush.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in 5716def

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds Datadog metrics V3 “mode switching” (V2-only / V3-enabled / validation) to the metrics encoder/forwarder pipeline, and introduces a new correctness scenario to exercise validation mode by sending both V2+V3 with correlated X-Metrics-Request-* headers and pairing them in the correctness intake.

Changes:

  • Introduces per-metric-family encoder modes (series vs sketches) and adjusts V2/V3 batching + tagging/validation header behavior accordingly.
  • Updates the forwarder I/O path to strip validation headers for endpoints that should not receive them (validation headers become endpoint- and family-scoped).
  • Adds a new correctness test case (dsd-plain-v3-validation) and extends CI/Makefile targets to run it; correctness intake now correlates V2/V3 validation payloads by request headers and compares paired batches.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
test/correctness/dsd-plain-v3-validation/millstone.yaml Adds Millstone workload for the new V3 validation correctness scenario.
test/correctness/dsd-plain-v3-validation/datadog.yaml Configures agent to enable V3 validation mode for series + sketches.
test/correctness/dsd-plain-v3-validation/config.yaml Wires baseline/comparison/intake/millstone containers for the new scenario.
Makefile Adds a test-correctness-dsd-plain-v3-validation target and includes it in the suite.
lib/saluki-components/src/encoders/datadog/metrics/mod.rs Implements per-family encoder modes and updates batching/tagging/validation flush behavior.
lib/saluki-components/src/common/datadog/protocol.rs Removes any_v3_enabled helper (no longer needed with mode switching).
lib/saluki-components/src/common/datadog/io.rs Strips validation headers for endpoints that shouldn’t receive them.
lib/saluki-components/src/common/datadog/endpoints.rs Adds endpoint decision helper for whether validation headers should be sent; includes unit test.
bin/correctness/datadog-intake/src/app/metrics/state.rs Tracks validation batches and compares V2/V3 paired payloads.
bin/correctness/datadog-intake/src/app/metrics/handlers.rs Extracts validation headers and routes requests into validation accumulation vs normal merge.
.gitlab/e2e.yml Adds CI jobs for dsd-plain-v3 and dsd-plain-v3-validation correctness cases.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread bin/correctness/datadog-intake/src/app/metrics/handlers.rs
Comment thread bin/correctness/datadog-intake/src/app/metrics/state.rs
Comment thread bin/correctness/datadog-intake/src/app/metrics/state.rs Outdated
Comment thread bin/correctness/datadog-intake/src/app/metrics/state.rs
Copilot AI review requested due to automatic review settings May 6, 2026 20:23
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Copy Markdown
Member

@tobz tobz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to me as-is.

Upstream branch is definitely behind main so hopefully I don't break any of this when rebasing it. 😅

@tobz
Copy link
Copy Markdown
Member

tobz commented May 8, 2026

Going to merge directly despite the failures because, again, the upstream branch needs to be heavily rebased anyways. Will resolve any of the stuff here that carries over when doing so.

@tobz tobz merged commit 669410d into tobz/datadog-metrics-v3-payload-support May 8, 2026
57 of 62 checks passed
@tobz tobz deleted the rayz/v3-mode-switching branch May 8, 2026 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/ci CI/CD, automated testing, etc. area/components Sources, transforms, and destinations. area/test All things testing: unit/integration, correctness, SMP regression, etc. encoder/datadog-metrics Datadog Metrics encoder.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants