enhancement(datadog encoder): add datadog metrics v3 mode switching#1599
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2dc91276bc
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if endpoint_mode.needs_batch_id() && (v2_flushed || v3_flushed) { | ||
| *batch_id = None; | ||
| } |
There was a problem hiding this comment.
Retain batch ID after V2-triggered split flush
When v2_flushed is true, this branch clears batch_id immediately even though the current metric is pushed back into v3_metrics for the next batch. If no additional metric arrives before the timeout flush, that pending metric is flushed with series_active_batch_id/sketches_active_batch_id = None, so X-Metrics-Request-* headers are omitted and the intake cannot correlate the V2/V3 validation pair. This causes intermittent loss of validation pairing under low traffic or stream shutdown right after a size-triggered flush.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
This PR adds Datadog metrics V3 “mode switching” (V2-only / V3-enabled / validation) to the metrics encoder/forwarder pipeline, and introduces a new correctness scenario to exercise validation mode by sending both V2+V3 with correlated X-Metrics-Request-* headers and pairing them in the correctness intake.
Changes:
- Introduces per-metric-family encoder modes (series vs sketches) and adjusts V2/V3 batching + tagging/validation header behavior accordingly.
- Updates the forwarder I/O path to strip validation headers for endpoints that should not receive them (validation headers become endpoint- and family-scoped).
- Adds a new correctness test case (
dsd-plain-v3-validation) and extends CI/Makefile targets to run it; correctness intake now correlates V2/V3 validation payloads by request headers and compares paired batches.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| test/correctness/dsd-plain-v3-validation/millstone.yaml | Adds Millstone workload for the new V3 validation correctness scenario. |
| test/correctness/dsd-plain-v3-validation/datadog.yaml | Configures agent to enable V3 validation mode for series + sketches. |
| test/correctness/dsd-plain-v3-validation/config.yaml | Wires baseline/comparison/intake/millstone containers for the new scenario. |
| Makefile | Adds a test-correctness-dsd-plain-v3-validation target and includes it in the suite. |
| lib/saluki-components/src/encoders/datadog/metrics/mod.rs | Implements per-family encoder modes and updates batching/tagging/validation flush behavior. |
| lib/saluki-components/src/common/datadog/protocol.rs | Removes any_v3_enabled helper (no longer needed with mode switching). |
| lib/saluki-components/src/common/datadog/io.rs | Strips validation headers for endpoints that shouldn’t receive them. |
| lib/saluki-components/src/common/datadog/endpoints.rs | Adds endpoint decision helper for whether validation headers should be sent; includes unit test. |
| bin/correctness/datadog-intake/src/app/metrics/state.rs | Tracks validation batches and compares V2/V3 paired payloads. |
| bin/correctness/datadog-intake/src/app/metrics/handlers.rs | Extracts validation headers and routes requests into validation accumulation vs normal merge. |
| .gitlab/e2e.yml | Adds CI jobs for dsd-plain-v3 and dsd-plain-v3-validation correctness cases. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
tobz
left a comment
There was a problem hiding this comment.
Seems fine to me as-is.
Upstream branch is definitely behind main so hopefully I don't break any of this when rebasing it. 😅
|
Going to merge directly despite the failures because, again, the upstream branch needs to be heavily rebased anyways. Will resolve any of the stuff here that carries over when doing so. |
669410d
into
tobz/datadog-metrics-v3-payload-support
Summary
This PR adds Datadog metrics V3 mode switching to ADP, aligning the encoder/forwarder behavior with
the datadog-agent V2/V3 setup. This branch is branched off of
tobz/datadog-metrics-v3-payload-supportwhich contains the initial v3 implementation.It supports the Agent-style modes per endpoint and payload family:
V2Only: send V2 payloads only.V3Enabled: send V3 to endpoints configured for V3, while still allowing V2 payloads for legacy/additional endpoints.
Validation: send both V2 and V3 for comparison, with matching validation batch IDs.Correctness coverage
dsd-plain-v3-validationcorrectness scenario for DogStatsD metrics with V3 validation modeenabled for series and sketches.
X-Metrics-Request-*headers, compare paired payloads, and count V3 payloads in the metrics dump.
Change Type
How did you test this PR?
New correctness test added
References