
chore(datadog_metrics sink): switch to v2 endpoint #24842

Merged
vladimir-dd merged 15 commits into master from vladimir-dd/metrics-v2
Mar 17, 2026

Conversation

@vladimir-dd
Contributor

@vladimir-dd vladimir-dd commented Mar 4, 2026

This PR switches the Datadog metrics sink to the v2 series endpoint by default, automatically caps the batcher per endpoint to fix a memory issue and improve throughput, and adds a statsd_to_datadog_metrics regression test for a high-load, metrics-only pipeline (run).

Memory: This problem already existed on v1, but v2's smaller payload limit (5 MiB vs 60 MiB) means that unbounded batching causes memory pressure at much lower loads. The fix automatically caps the batcher to each endpoint's payload limit so batches never need splitting, which improves memory efficiency for both v1 and v2. Full benchmark results: #24874.
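The capping rule described above can be sketched roughly as follows. This is a minimal illustration, not the sink's actual code: `SeriesApi`, `payload_limit`, and `effective_max_bytes` are hypothetical names, and the only values taken from this PR are the 5 MiB and 60 MiB limits. It also assumes that a smaller user-configured `max_bytes` is kept as-is.

```rust
/// Hypothetical endpoint selector; not Vector's real type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum SeriesApi {
    V1,
    V2,
}

const MIB: usize = 1024 * 1024;

/// Uncompressed payload limits as quoted in the PR description.
fn payload_limit(api: SeriesApi) -> usize {
    match api {
        SeriesApi::V1 => 60 * MIB,
        SeriesApi::V2 => 5 * MIB,
    }
}

/// Assumed rule: never let a batch grow past the endpoint limit,
/// but keep a smaller user-supplied cap unchanged.
fn effective_max_bytes(user_max_bytes: Option<usize>, api: SeriesApi) -> usize {
    let limit = payload_limit(api);
    user_max_bytes.map_or(limit, |user| user.min(limit))
}
```

Under these assumptions, an unset `max_bytes` on v2 resolves to 5 MiB, so a full batch already fits in one request and never has to be split after the fact.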

Throughput: The regression tests detected a throughput improvement after the memory fix:
[Screenshot 2026-03-11 at 21:23:07]
Regression tests run before the fix (with only the switch from v1 to v2) showed no difference.

Correctness: Validated end to end against the real Datadog API (#24879): all metric types (counter, gauge, set, distribution, aggregated histogram, aggregated summary) pass for both v1 and v2, and v1 and v2 produce identical aggregated values.

@vladimir-dd vladimir-dd requested a review from a team as a code owner March 4, 2026 12:39
@github-actions github-actions Bot added the domain: sinks Anything related to the Vector's sinks label Mar 4, 2026
@vladimir-dd vladimir-dd changed the title Switch to Datadog Metrics v2 endpoint chore(datadog_metrics sink): Re-enable v2 endpoint Mar 4, 2026
@vladimir-dd vladimir-dd marked this pull request as draft March 4, 2026 13:03
Member

@pront pront left a comment


Leaving some early feedback, this looks good overall.

It would be interesting to run some stress tests and compare the results against v1.

There's an existing regression experiment here:

https://github.com/search?q=repo%3Avectordotdev%2Fvector+path%3A%2F%5Eregression%5C%2F%2F+datadog_metrics&type=code

Comment thread src/sinks/datadog/metrics/config.rs Outdated
@datadog-vectordotdev

datadog-vectordotdev Bot commented Mar 4, 2026

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🔗 Commit SHA: f64f245

@vladimir-dd
Contributor Author

> Leaving some early feedback, this looks good overall.
>
> It would be interesting to run some stress tests and compare the results against v1.
>
> There's an existing regression experiment here:
>
> https://github.com/search?q=repo%3Avectordotdev%2Fvector+path%3A%2F%5Eregression%5C%2F%2F+datadog_metrics&type=code

thanks, was about to ask for some guidance for performance testing here 🙏

Comment thread src/sinks/datadog/metrics/encoder.rs Outdated
@vladimir-dd vladimir-dd force-pushed the vladimir-dd/metrics-v2 branch 2 times, most recently from 8148cb3 to 9bb9b2b Compare March 5, 2026 17:26
Add regression test to validate datadog_metrics sink v2 endpoint
performance under realistic high-throughput DogStatsD load.

Test Configuration:
- Load: Default lading dogstatsd settings (realistic ~2 KB messages)
- Throughput: 500 MB/s → ~250k events/sec
- Batch: Default settings (100k max_events, 2s timeout)
- Validates batch splitting when payloads exceed v2 size limits

This test ensures v2 endpoint correctly handles batch splitting
with realistic high-cardinality DogStatsD metrics under load.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@vladimir-dd vladimir-dd force-pushed the vladimir-dd/metrics-v2 branch from 9bb9b2b to 0cb05d2 Compare March 6, 2026 06:40
Different series endpoints have different uncompressed payload limits (v2
is 12x smaller than v1). This ensures each batch fits in a single HTTP
request without splitting, reducing memory overhead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@pront
Member

pront commented Mar 6, 2026

Note: will unsubscribe from this PR until it's "ready for review". When it's ready, we will prioritize reviewing it over other PRs.

Comment thread dd_metrics_v2_bench_results.md Fixed
vladimir-dd and others added 2 commits March 10, 2026 13:22
… config

- Remove references to DatadogMetricsCompression and request_compression
  from encoder.rs tests (those symbols don't exist in current codebase;
  they belong to an unmerged compression-options branch)
- Fix batcher_user_max_bytes_is_preserved test to avoid struct update
  syntax with private PhantomData fields in BatchConfig

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…TRICS_SERIES_V2_API env var

The old opt-in env var is now a no-op since v2 is the default. Emit a
one-time warning so existing users know they can safely remove it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
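
A one-time warning of the kind this commit describes could look roughly like the sketch below. It is an illustration only: `LegacyFlagWarning` and its message are hypothetical, the real environment variable name is truncated in the commit title above and is deliberately not spelled out, and the caller is assumed to pass in whatever value it read from the environment.

```rust
use std::sync::Once;

/// Hypothetical helper that warns at most once about a no-op opt-in flag.
struct LegacyFlagWarning {
    once: Once,
}

impl LegacyFlagWarning {
    const fn new() -> Self {
        Self { once: Once::new() }
    }

    /// `flag_value` is the value of the legacy env var, if it was set.
    /// Returns true only for the call that actually printed the warning.
    fn check(&self, flag_value: Option<&str>) -> bool {
        let mut emitted = false;
        if flag_value.is_some() {
            // `call_once` runs the closure on the first call only.
            self.once.call_once(|| {
                eprintln!("warning: the v2 opt-in env var is a no-op; v2 is the default");
                emitted = true;
            });
        }
        emitted
    }
}
```

A caller might read the variable with `std::env::var(name).ok()` and pass `value.as_deref()` to `check` each time a sink is built; repeated calls stay silent after the first warning.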
@vladimir-dd vladimir-dd force-pushed the vladimir-dd/metrics-v2 branch from 55877a4 to 8ecf755 Compare March 10, 2026 12:24
@vladimir-dd vladimir-dd marked this pull request as ready for review March 10, 2026 14:50

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8ecf75526a


Comment thread src/sinks/datadog/metrics/config.rs Outdated
vladimir-dd and others added 3 commits March 10, 2026 16:00
… and Sketches

Series v2 has a 5 MiB uncompressed payload limit while Sketches allows 60 MiB.
Previously both used the same (Series-derived) cap, which over-fragmented
sketch-heavy workloads into many small requests.

To support per-partition batch configuration, `PartitionedBatcher` now passes
the partition key to the batch config factory closure (`Fn(&Key) -> C` instead
of `Fn() -> C`). The explicit `timeout: Duration` parameter replaces the
previous extraction via `settings().timeout()`. All existing callers are
updated mechanically.

The `datadog_metrics` sink uses the new capability to select the appropriate
byte size limit per endpoint partition, keeping Series at 5 MiB and Sketches
at 60 MiB.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
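
The change from `Fn() -> C` to `Fn(&Key) -> C` can be illustrated with a toy batcher. This is a sketch under stated assumptions, not `PartitionedBatcher` itself: `Endpoint`, `BatchLimits`, `ToyPartitionedBatcher`, and `limits_for` are hypothetical names, timeouts are left out, and only the 5 MiB and 60 MiB figures come from the commit message.

```rust
use std::collections::HashMap;

const MIB: usize = 1024 * 1024;

/// Hypothetical partition key: one partition per Datadog endpoint.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Endpoint {
    Series,
    Sketches,
}

#[derive(Debug, PartialEq)]
struct BatchLimits {
    max_bytes: usize,
}

/// Key-aware factory: the limit depends on which partition is being opened.
fn limits_for(key: &Endpoint) -> BatchLimits {
    match key {
        Endpoint::Series => BatchLimits { max_bytes: 5 * MIB },
        Endpoint::Sketches => BatchLimits { max_bytes: 60 * MIB },
    }
}

/// Toy stand-in for a partitioned batcher that tracks bytes per partition.
struct ToyPartitionedBatcher<F: Fn(&Endpoint) -> BatchLimits> {
    make_limits: F,
    open: HashMap<Endpoint, (BatchLimits, usize)>,
}

impl<F: Fn(&Endpoint) -> BatchLimits> ToyPartitionedBatcher<F> {
    fn new(make_limits: F) -> Self {
        Self { make_limits, open: HashMap::new() }
    }

    /// Adds `size` bytes to the partition for `key`. Returns true when the
    /// open batch had to be flushed first because the item would not fit.
    fn push(&mut self, key: Endpoint, size: usize) -> bool {
        let make_limits = &self.make_limits;
        let (limits, used) = self
            .open
            .entry(key)
            .or_insert_with(|| (make_limits(&key), 0));
        if *used + size > limits.max_bytes {
            *used = size; // start a fresh batch holding only this item
            true
        } else {
            *used += size;
            false
        }
    }
}
```

With a single shared limit, a 50 MiB run of sketches would be cut into many 5 MiB requests; with the key-aware factory, `ToyPartitionedBatcher::new(limits_for)` lets the Sketches partition keep filling up to 60 MiB while Series still flushes at 5 MiB.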
@vladimir-dd vladimir-dd force-pushed the vladimir-dd/metrics-v2 branch from 104855e to 365d824 Compare March 11, 2026 18:09
@vladimir-dd vladimir-dd changed the title chore(datadog_metrics sink): Re-enable v2 endpoint chore(datadog_metrics sink): switch to v2 endpoint Mar 11, 2026
…refactor

- Add `Batch = B` constraint to `with_timer` test constructor so the
  compiler can infer `B` from the batch config type
- Replace `Box::new(move |_| ...)` with unboxed `move |_: &u8| ...` in
  tests to avoid HRTB inference issues with `Fn(&Key) -> C`
- Run `make fmt` to fix formatting in several sink files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
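
The closure annotation mentioned in this commit can be shown in isolation. The sketch below is a standalone illustration, not the test code from the PR: `batch_sizes_for_keys` and `demo` are hypothetical. The point is that a closure stored in a variable before it meets a `Fn(&Key) -> C` bound may have its parameter inferred with one specific lifetime, while writing the parameter type as `&u8` makes the closure valid for any lifetime.

```rust
/// Hypothetical consumer with a higher-ranked bound, similar in shape
/// to a batch config factory that receives the partition key by reference.
fn batch_sizes_for_keys<F, C>(keys: &[u8], factory: F) -> Vec<C>
where
    F: Fn(&u8) -> C,
{
    keys.iter().map(|key| factory(key)).collect()
}

fn demo() -> Vec<usize> {
    let max_bytes = 1024usize;
    // The explicit `&u8` annotation lets the compiler treat the closure as
    // `for<'a> Fn(&'a u8) -> usize` even though it is defined before use.
    let factory = move |_: &u8| max_bytes;
    batch_sizes_for_keys(&[1, 2, 3], factory)
}
```

Dropping the annotation (`move |_| max_bytes`) in `demo` is the kind of change that can trigger an "implementation of `FnOnce` is not general enough" error, which is consistent with the inference issue the commit message refers to.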
@vladimir-dd vladimir-dd force-pushed the vladimir-dd/metrics-v2 branch from 474cced to 3988235 Compare March 11, 2026 18:57
@vladimir-dd vladimir-dd requested a review from pront March 11, 2026 19:33
Member

@pront pront left a comment


The implementation looks good to me. I left a comment about a pre-existing issue and would be interested to hear your thoughts.

Comment thread changelog.d/datadog_metrics_series_v2_default.enhancement.md Outdated
@vladimir-dd vladimir-dd requested a review from a team as a code owner March 13, 2026 14:24
@github-actions github-actions Bot added the domain: external docs Anything related to Vector's external, public documentation label Mar 13, 2026
@vladimir-dd vladimir-dd requested a review from pront March 13, 2026 14:37
hestonhoffman previously approved these changes Mar 13, 2026
pront previously approved these changes Mar 13, 2026
Member

@pront pront left a comment


Thanks, excited about this change.

@vladimir-dd vladimir-dd added this pull request to the merge queue Mar 16, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Mar 16, 2026
…zation strips interval_ms

`into_absolute()` always sets `interval_ms: None`, so the encoded
interval for gauges is always 0. The test expectation of 10 was
dead code until v2 became the default endpoint.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
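
The effect described in this commit can be modelled with a toy type. Everything here is an assumption made for illustration: `ToyGauge` and `encoded_interval_secs` are hypothetical, and the conversion from milliseconds to seconds is a guess at how an interval would be encoded. The only behaviour taken from the commit message is that `into_absolute()` clears `interval_ms` and that the encoded interval then comes out as 0.

```rust
/// Hypothetical stand-in for a gauge metric carrying an optional interval.
#[derive(Debug)]
struct ToyGauge {
    value: f64,
    interval_ms: Option<u32>,
}

impl ToyGauge {
    /// Mirrors the behaviour described above: normalization drops the interval.
    fn into_absolute(self) -> Self {
        Self { interval_ms: None, ..self }
    }
}

/// Assumed encoding rule: whole seconds, or 0 when no interval is present.
fn encoded_interval_secs(gauge: &ToyGauge) -> u32 {
    gauge.interval_ms.map_or(0, |ms| ms / 1000)
}
```

In this model a gauge built with `interval_ms: Some(10_000)` encodes as 10 before normalization and as 0 after `into_absolute()`, which matches the corrected test expectation.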
@vladimir-dd vladimir-dd enabled auto-merge March 16, 2026 12:12
@vladimir-dd vladimir-dd added this pull request to the merge queue Mar 16, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Mar 16, 2026
Member

@pront pront left a comment


Check failed: https://github.com/vectordotdev/vector/actions/runs/23143552898/job/67226322004

We need to update the DD metrics E2E tests: https://github.com/vectordotdev/vector/blob/master/tests/e2e/datadog-metrics/config/test.yaml#L14

I would add series_api_version: ['v1', 'v2'] entry to the matrix above.

@vladimir-dd vladimir-dd dismissed stale reviews from pront and hestonhoffman via 08f4032 March 16, 2026 14:24
@pront
Member

pront commented Mar 17, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Breezy!


@vladimir-dd vladimir-dd force-pushed the vladimir-dd/metrics-v2 branch from b72c6d2 to 8635670 Compare March 17, 2026 17:51
…in e2e tests

- Add series_api_version matrix dimension ['v1', 'v2'] to test.yaml, expanding coverage from 3 to 6 environments
- Pass CONFIG_SERIES_API_VERSION through docker compose environment to the Vector container
- Parameterize series_api_version in vector.toml from the env var
- Refactor series::validate() to dispatch to v1 or v2 fetch function based on CONFIG_SERIES_API_VERSION
- Extract assertions into compare_intakes() helper (no new assertion logic)
- Replace blocking std::thread::sleep with async tokio::time::sleep in the test entry point

Rationale: Vector now supports configurable series_api_version (v1/v2) in the datadog_metrics sink. The e2e test previously hardcoded v1 for the vector pipeline. This change runs the same assertions against both API versions via the CI matrix, ensuring correctness is validated for each version independently without duplicating test logic.
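
The version dispatch described above might be sketched as follows. This is not the e2e test code: `SeriesApiVersion`, `parse_series_api_version`, `fetch_v1`, `fetch_v2`, and this `validate` are hypothetical, the fetch functions return labels instead of querying an intake, and treating an unset or unknown value as an error is an assumption. Only the `CONFIG_SERIES_API_VERSION` variable and the `v1`/`v2` values come from the commit message.

```rust
#[derive(Debug, PartialEq)]
enum SeriesApiVersion {
    V1,
    V2,
}

/// Parses the raw value of CONFIG_SERIES_API_VERSION.
fn parse_series_api_version(raw: Option<&str>) -> Result<SeriesApiVersion, String> {
    match raw {
        Some("v1") => Ok(SeriesApiVersion::V1),
        Some("v2") => Ok(SeriesApiVersion::V2),
        Some(other) => Err(format!("unsupported series API version: {other}")),
        None => Err("CONFIG_SERIES_API_VERSION is not set".to_string()),
    }
}

// Placeholder fetchers; real ones would query the fake intake.
fn fetch_v1() -> &'static str {
    "series fetched via v1"
}

fn fetch_v2() -> &'static str {
    "series fetched via v2"
}

/// Picks the fetch function for the configured version, so the same
/// assertions can then run on whichever result comes back.
fn validate(raw: Option<&str>) -> Result<&'static str, String> {
    let fetched = match parse_series_api_version(raw)? {
        SeriesApiVersion::V1 => fetch_v1(),
        SeriesApiVersion::V2 => fetch_v2(),
    };
    Ok(fetched)
}
```

A test entry point could call `validate(std::env::var("CONFIG_SERIES_API_VERSION").ok().as_deref())`, letting the CI matrix choose the version without duplicating the comparison logic.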
@vladimir-dd vladimir-dd force-pushed the vladimir-dd/metrics-v2 branch from 8635670 to e7b77ca Compare March 17, 2026 17:52
@pront
Member

pront commented Mar 17, 2026

LGTM - feel free to enqueue whenever you want

@vladimir-dd vladimir-dd added this pull request to the merge queue Mar 17, 2026
Merged via the queue into master with commit dc9d8de Mar 17, 2026
71 checks passed
@vladimir-dd vladimir-dd deleted the vladimir-dd/metrics-v2 branch March 17, 2026 19:12
@github-actions github-actions Bot locked and limited conversation to collaborators Mar 17, 2026

Labels

domain: external docs (Anything related to Vector's external, public documentation)
domain: sinks (Anything related to the Vector's sinks)


5 participants