
feat!: add encoder from v04 to v1 #1896

Open

anais-raison wants to merge 3 commits into main from anais/add-v1-encoder-from-v04

Conversation

anais-raison (Contributor) commented Apr 20, 2026

What does this PR do?

Implements the v0.4 to V1 encoder for the Trace Exporter.

Key additions:

  • StringTable for streaming string interning: the first occurrence writes the string, subsequent occurrences write an integer ID (see the sketch below)
  • Chunk-level attribute extraction (128-bit trace_id, sampling_priority, origin) promoted from the root span
  • Integer keys for all msgpack fields
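
A minimal sketch of the interning idea (illustrative only; the names and exact API are assumptions, not this PR's code): the first occurrence of a string is recorded and assigned the next integer ID, so later occurrences can be written as that ID instead of the full string.

use std::collections::HashMap;

struct StringTable {
    seen: HashMap<String, u32>,
    next_id: u32,
}

impl StringTable {
    fn new() -> Self {
        Self { seen: HashMap::new(), next_id: 1 }
    }

    /// Returns `None` on the first occurrence (the caller writes the full string)
    /// and `Some(id)` afterwards (the caller writes only the integer ID).
    fn intern(&mut self, s: &str) -> Option<u32> {
        if let Some(&id) = self.seen.get(s) {
            Some(id)
        } else {
            let id = self.next_id;
            self.next_id += 1;
            self.seen.insert(s.to_string(), id);
            None
        }
    }
}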

Motivation

APMSP-2808

Additional Notes

  • V04 and V05 formats are untouched; no existing behavior changes
  • TraceExporterOutputFormat::V1 is not exposed in the C FFI yet, so there is no production impact
  • The string interning is scoped per payload (table reset on each to_vec call)

How to test the change?

Added unit tests.

github-actions Bot (Contributor) commented Apr 20, 2026

📚 Documentation Check Results

⚠️ 1474 documentation warning(s) found

📦 libdd-data-pipeline - 912 warning(s)

📦 libdd-trace-utils - 562 warning(s)


Updated: 2026-04-20 14:16:09 UTC | Commit: f67cd79 | missing-docs job results

github-actions Bot (Contributor) commented Apr 20, 2026

Clippy Allow Annotation Report

Comparing clippy allow annotations between branches:

  • Base Branch: origin/main
  • PR Branch: origin/anais/add-v1-encoder-from-v04

Summary by Rule

| Rule | Base Branch | PR Branch | Change |
| --- | --- | --- | --- |
| expect_used | 0 | 2 | ⚠️ +2 (N/A) |
| unimplemented | 1 | 1 | No change (0%) |
| unwrap_used | 2 | 2 | No change (0%) |
| Total | 3 | 5 | ⚠️ +2 (+66.7%) |

Annotation Counts by File

| File | Base Branch | PR Branch | Change |
| --- | --- | --- | --- |
| libdd-data-pipeline/src/trace_exporter/mod.rs | 2 | 2 | No change (0%) |
| libdd-trace-utils/src/msgpack_encoder/v1/mod.rs | 0 | 2 | ⚠️ +2 (N/A) |
| libdd-trace-utils/src/tracer_payload.rs | 1 | 1 | No change (0%) |

Annotation Stats by Crate

| Crate | Base Branch | PR Branch | Change |
| --- | --- | --- | --- |
| clippy-annotation-reporter | 5 | 5 | No change (0%) |
| datadog-ffe-ffi | 1 | 1 | No change (0%) |
| datadog-ipc | 21 | 21 | No change (0%) |
| datadog-live-debugger | 6 | 6 | No change (0%) |
| datadog-live-debugger-ffi | 10 | 10 | No change (0%) |
| datadog-profiling-replayer | 4 | 4 | No change (0%) |
| datadog-remote-config | 3 | 3 | No change (0%) |
| datadog-sidecar | 56 | 56 | No change (0%) |
| libdd-common | 10 | 10 | No change (0%) |
| libdd-common-ffi | 12 | 12 | No change (0%) |
| libdd-data-pipeline | 4 | 4 | No change (0%) |
| libdd-ddsketch | 2 | 2 | No change (0%) |
| libdd-dogstatsd-client | 1 | 1 | No change (0%) |
| libdd-profiling | 13 | 13 | No change (0%) |
| libdd-telemetry | 19 | 19 | No change (0%) |
| libdd-tinybytes | 4 | 4 | No change (0%) |
| libdd-trace-normalization | 2 | 2 | No change (0%) |
| libdd-trace-obfuscation | 8 | 8 | No change (0%) |
| libdd-trace-stats | 1 | 1 | No change (0%) |
| libdd-trace-utils | 15 | 17 | ⚠️ +2 (+13.3%) |
| Total | 197 | 199 | ⚠️ +2 (+1.0%) |

About This Report

This report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality.

github-actions Bot (Contributor) commented Apr 20, 2026

🔒 Cargo Deny Results

⚠️ 7 issue(s) found, showing only errors (advisories, bans, sources)

📦 libdd-data-pipeline - 4 error(s)

error[unsound]: Rand is unsound with a custom logger using `rand::rng()`
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:208:1
    │
208 │ rand 0.8.5 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ unsound advisory detected
    │
    ├ ID: RUSTSEC-2026-0097
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0097
    ├ It has been reported (by @lopopolo) that the `rand` library is [unsound](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#soundness-of-code--of-a-library) (i.e. that safe code using the public API can cause Undefined Behaviour) when all the following conditions are met:
      
      - The `log` and `thread_rng` features are enabled
      - A [custom logger](https://docs.rs/log/latest/log/#implementing-a-logger) is defined
      - The custom logger accesses `rand::rng()` (previously `rand::thread_rng()`) and calls any `TryRng` (previously `RngCore`) methods on `ThreadRng`
      - The `ThreadRng` (attempts to) reseed while called from the custom logger (this happens every 64 kB of generated data)
      - Trace-level logging is enabled or warn-level logging is enabled and the random source (the `getrandom` crate) is unable to provide a new seed
      
      `TryRng` (previously `RngCore`) methods for `ThreadRng` use `unsafe` code to cast `*mut BlockRng<ReseedingCore>` to `&mut BlockRng<ReseedingCore>`. When all the above conditions are met this results in an aliased mutable reference, violating the Stacked Borrows rules. Miri is able to detect this violation in sample code. Since construction of [aliased mutable references is Undefined Behaviour](https://doc.rust-lang.org/stable/nomicon/references.html), the behaviour of optimized builds is hard to predict.
    ├ Announcement: https://github.com/rust-random/rand/pull/1763
    ├ Solution: Upgrade to >=0.10.1 OR <0.10.0, >=0.9.3 OR <0.9.0, >=0.8.6 (try `cargo update -p rand`)
    ├ rand v0.8.5
      ├── libdd-common v3.0.2
      │   ├── libdd-capabilities-impl v0.1.0
      │   │   ├── (dev) libdd-data-pipeline v3.0.1
      │   │   ├── libdd-trace-stats v2.0.0
      │   │   │   └── libdd-data-pipeline v3.0.1 (*)
      │   │   └── libdd-trace-utils v3.0.1
      │   │       ├── libdd-data-pipeline v3.0.1 (*)
      │   │       ├── libdd-trace-stats v2.0.0 (*)
      │   │       └── (dev) libdd-trace-utils v3.0.1 (*)
      │   ├── libdd-data-pipeline v3.0.1 (*)
      │   ├── libdd-dogstatsd-client v2.0.0
      │   │   └── libdd-data-pipeline v3.0.1 (*)
      │   ├── libdd-shared-runtime v0.1.0
      │   │   ├── libdd-data-pipeline v3.0.1 (*)
      │   │   ├── libdd-telemetry v4.0.0
      │   │   │   └── libdd-data-pipeline v3.0.1 (*)
      │   │   └── libdd-trace-stats v2.0.0 (*)
      │   ├── libdd-telemetry v4.0.0 (*)
      │   ├── libdd-trace-stats v2.0.0 (*)
      │   └── libdd-trace-utils v3.0.1 (*)
      ├── (dev) libdd-data-pipeline v3.0.1 (*)
      ├── (dev) libdd-trace-normalization v2.0.0
      │   └── libdd-trace-utils v3.0.1 (*)
      ├── (dev) libdd-trace-stats v2.0.0 (*)
      ├── libdd-trace-utils v3.0.1 (*)
      └── proptest v1.5.0
          └── (dev) libdd-tinybytes v1.1.0
              ├── libdd-data-pipeline v3.0.1 (*)
              ├── (dev) libdd-tinybytes v1.1.0 (*)
              └── libdd-trace-utils v3.0.1 (*)

error[vulnerability]: Name constraints for URI names were incorrectly accepted
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:230:1
    │
230 │ rustls-webpki 0.103.10 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ security vulnerability detected
    │
    ├ ID: RUSTSEC-2026-0098
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0098
    ├ Name constraints for URI names were ignored and therefore accepted.
      
      Note this library does not provide an API for asserting URI names, and URI name constraints are otherwise not implemented.  URI name constraints are now rejected unconditionally.
      
      Since name constraints are restrictions on otherwise properly-issued certificates, this bug is reachable only after signature verification and requires misissuance to exploit.
      
      This vulnerability is identified as [GHSA-965h-392x-2mh5](https://github.com/rustls/webpki/security/advisories/GHSA-965h-392x-2mh5). Thank you to @1seal for the report.
    ├ Solution: Upgrade to >=0.103.12, <0.104.0-alpha.1 OR >=0.104.0-alpha.6 (try `cargo update -p rustls-webpki`)
    ├ rustls-webpki v0.103.10
      └── rustls v0.23.37
          ├── hyper-rustls v0.27.7
          │   └── libdd-common v3.0.2
          │       ├── libdd-capabilities-impl v0.1.0
          │       │   ├── (dev) libdd-data-pipeline v3.0.1
          │       │   ├── libdd-trace-stats v2.0.0
          │       │   │   └── libdd-data-pipeline v3.0.1 (*)
          │       │   └── libdd-trace-utils v3.0.1
          │       │       ├── libdd-data-pipeline v3.0.1 (*)
          │       │       ├── libdd-trace-stats v2.0.0 (*)
          │       │       └── (dev) libdd-trace-utils v3.0.1 (*)
          │       ├── libdd-data-pipeline v3.0.1 (*)
          │       ├── libdd-dogstatsd-client v2.0.0
          │       │   └── libdd-data-pipeline v3.0.1 (*)
          │       ├── libdd-shared-runtime v0.1.0
          │       │   ├── libdd-data-pipeline v3.0.1 (*)
          │       │   ├── libdd-telemetry v4.0.0
          │       │   │   └── libdd-data-pipeline v3.0.1 (*)
          │       │   └── libdd-trace-stats v2.0.0 (*)
          │       ├── libdd-telemetry v4.0.0 (*)
          │       ├── libdd-trace-stats v2.0.0 (*)
          │       └── libdd-trace-utils v3.0.1 (*)
          ├── libdd-common v3.0.2 (*)
          └── tokio-rustls v0.26.0
              ├── hyper-rustls v0.27.7 (*)
              └── libdd-common v3.0.2 (*)

error[vulnerability]: Name constraints were accepted for certificates asserting a wildcard name
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:230:1
    │
230 │ rustls-webpki 0.103.10 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ security vulnerability detected
    │
    ├ ID: RUSTSEC-2026-0099
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0099
    ├ Permitted subtree name constraints for DNS names were accepted for certificates asserting a wildcard name.
      
      This was incorrect because, given a name constraint of `accept.example.com`, `*.example.com` could feasibly allow a name of `reject.example.com` which is outside the constraint.
      This is very similar to [CVE-2025-61727](https://go.dev/issue/76442).
      
      Since name constraints are restrictions on otherwise properly-issued certificates, this bug is reachable only after signature verification and requires misissuance to exploit.
      
      This vulnerability is identified as [GHSA-xgp8-3hg3-c2mh](https://github.com/rustls/webpki/security/advisories/GHSA-xgp8-3hg3-c2mh). Thank you to @1seal for the report.
    ├ Solution: Upgrade to >=0.103.12, <0.104.0-alpha.1 OR >=0.104.0-alpha.6 (try `cargo update -p rustls-webpki`)
    ├ rustls-webpki v0.103.10
      └── rustls v0.23.37
          ├── hyper-rustls v0.27.7
          │   └── libdd-common v3.0.2
          │       ├── libdd-capabilities-impl v0.1.0
          │       │   ├── (dev) libdd-data-pipeline v3.0.1
          │       │   ├── libdd-trace-stats v2.0.0
          │       │   │   └── libdd-data-pipeline v3.0.1 (*)
          │       │   └── libdd-trace-utils v3.0.1
          │       │       ├── libdd-data-pipeline v3.0.1 (*)
          │       │       ├── libdd-trace-stats v2.0.0 (*)
          │       │       └── (dev) libdd-trace-utils v3.0.1 (*)
          │       ├── libdd-data-pipeline v3.0.1 (*)
          │       ├── libdd-dogstatsd-client v2.0.0
          │       │   └── libdd-data-pipeline v3.0.1 (*)
          │       ├── libdd-shared-runtime v0.1.0
          │       │   ├── libdd-data-pipeline v3.0.1 (*)
          │       │   ├── libdd-telemetry v4.0.0
          │       │   │   └── libdd-data-pipeline v3.0.1 (*)
          │       │   └── libdd-trace-stats v2.0.0 (*)
          │       ├── libdd-telemetry v4.0.0 (*)
          │       ├── libdd-trace-stats v2.0.0 (*)
          │       └── libdd-trace-utils v3.0.1 (*)
          ├── libdd-common v3.0.2 (*)
          └── tokio-rustls v0.26.0
              ├── hyper-rustls v0.27.7 (*)
              └── libdd-common v3.0.2 (*)

error[vulnerability]: Denial of Service via Stack Exhaustion
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:270:1
    │
270 │ time 0.3.41 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ security vulnerability detected
    │
    ├ ID: RUSTSEC-2026-0009
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0009
    ├ ## Impact
      
      When user-provided input is provided to any type that parses with the RFC 2822 format, a denial of
      service attack via stack exhaustion is possible. The attack relies on formally deprecated and
      rarely-used features that are part of the RFC 2822 format used in a malicious manner. Ordinary,
      non-malicious input will never encounter this scenario.
      
      ## Patches
      
      A limit to the depth of recursion was added in v0.3.47. From this version, an error will be returned
      rather than exhausting the stack.
      
      ## Workarounds
      
      Limiting the length of user input is the simplest way to avoid stack exhaustion, as the amount of
      the stack consumed would be at most a factor of the length of the input.
    ├ Announcement: https://github.com/time-rs/time/blob/main/CHANGELOG.md#0347-2026-02-05
    ├ Solution: Upgrade to >=0.3.47 (try `cargo update -p time`)
    ├ time v0.3.41
      └── tracing-appender v0.2.3
          └── libdd-log v1.0.0
              └── (dev) libdd-data-pipeline v3.0.1

advisories FAILED, bans ok, sources ok

📦 libdd-trace-utils - 3 error(s)

error[unsound]: Rand is unsound with a custom logger using `rand::rng()`
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:177:1
    │
177 │ rand 0.8.5 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ unsound advisory detected
    │
    ├ ID: RUSTSEC-2026-0097
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0097
    ├ It has been reported (by @lopopolo) that the `rand` library is [unsound](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#soundness-of-code--of-a-library) (i.e. that safe code using the public API can cause Undefined Behaviour) when all the following conditions are met:
      
      - The `log` and `thread_rng` features are enabled
      - A [custom logger](https://docs.rs/log/latest/log/#implementing-a-logger) is defined
      - The custom logger accesses `rand::rng()` (previously `rand::thread_rng()`) and calls any `TryRng` (previously `RngCore`) methods on `ThreadRng`
      - The `ThreadRng` (attempts to) reseed while called from the custom logger (this happens every 64 kB of generated data)
      - Trace-level logging is enabled or warn-level logging is enabled and the random source (the `getrandom` crate) is unable to provide a new seed
      
      `TryRng` (previously `RngCore`) methods for `ThreadRng` use `unsafe` code to cast `*mut BlockRng<ReseedingCore>` to `&mut BlockRng<ReseedingCore>`. When all the above conditions are met this results in an aliased mutable reference, violating the Stacked Borrows rules. Miri is able to detect this violation in sample code. Since construction of [aliased mutable references is Undefined Behaviour](https://doc.rust-lang.org/stable/nomicon/references.html), the behaviour of optimized builds is hard to predict.
    ├ Announcement: https://github.com/rust-random/rand/pull/1763
    ├ Solution: Upgrade to >=0.10.1 OR <0.10.0, >=0.9.3 OR <0.9.0, >=0.8.6 (try `cargo update -p rand`)
    ├ rand v0.8.5
      ├── (dev) libdd-common v3.0.2
      │   ├── libdd-capabilities-impl v0.1.0
      │   │   └── libdd-trace-utils v3.0.1
      │   │       └── (dev) libdd-trace-utils v3.0.1 (*)
      │   └── libdd-trace-utils v3.0.1 (*)
      ├── (dev) libdd-trace-normalization v2.0.0
      │   └── libdd-trace-utils v3.0.1 (*)
      ├── libdd-trace-utils v3.0.1 (*)
      └── proptest v1.5.0
          └── (dev) libdd-tinybytes v1.1.0
              ├── (dev) libdd-tinybytes v1.1.0 (*)
              └── libdd-trace-utils v3.0.1 (*)

error[vulnerability]: Name constraints for URI names were incorrectly accepted
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:199:1
    │
199 │ rustls-webpki 0.103.10 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ security vulnerability detected
    │
    ├ ID: RUSTSEC-2026-0098
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0098
    ├ Name constraints for URI names were ignored and therefore accepted.
      
      Note this library does not provide an API for asserting URI names, and URI name constraints are otherwise not implemented.  URI name constraints are now rejected unconditionally.
      
      Since name constraints are restrictions on otherwise properly-issued certificates, this bug is reachable only after signature verification and requires misissuance to exploit.
      
      This vulnerability is identified as [GHSA-965h-392x-2mh5](https://github.com/rustls/webpki/security/advisories/GHSA-965h-392x-2mh5). Thank you to @1seal for the report.
    ├ Solution: Upgrade to >=0.103.12, <0.104.0-alpha.1 OR >=0.104.0-alpha.6 (try `cargo update -p rustls-webpki`)
    ├ rustls-webpki v0.103.10
      └── rustls v0.23.37
          ├── hyper-rustls v0.27.7
          │   └── libdd-common v3.0.2
          │       ├── libdd-capabilities-impl v0.1.0
          │       │   └── libdd-trace-utils v3.0.1
          │       │       └── (dev) libdd-trace-utils v3.0.1 (*)
          │       └── libdd-trace-utils v3.0.1 (*)
          ├── libdd-common v3.0.2 (*)
          └── tokio-rustls v0.26.0
              ├── hyper-rustls v0.27.7 (*)
              └── libdd-common v3.0.2 (*)

error[vulnerability]: Name constraints were accepted for certificates asserting a wildcard name
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:199:1
    │
199 │ rustls-webpki 0.103.10 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ security vulnerability detected
    │
    ├ ID: RUSTSEC-2026-0099
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0099
    ├ Permitted subtree name constraints for DNS names were accepted for certificates asserting a wildcard name.
      
      This was incorrect because, given a name constraint of `accept.example.com`, `*.example.com` could feasibly allow a name of `reject.example.com` which is outside the constraint.
      This is very similar to [CVE-2025-61727](https://go.dev/issue/76442).
      
      Since name constraints are restrictions on otherwise properly-issued certificates, this bug is reachable only after signature verification and requires misissuance to exploit.
      
      This vulnerability is identified as [GHSA-xgp8-3hg3-c2mh](https://github.com/rustls/webpki/security/advisories/GHSA-xgp8-3hg3-c2mh). Thank you to @1seal for the report.
    ├ Solution: Upgrade to >=0.103.12, <0.104.0-alpha.1 OR >=0.104.0-alpha.6 (try `cargo update -p rustls-webpki`)
    ├ rustls-webpki v0.103.10
      └── rustls v0.23.37
          ├── hyper-rustls v0.27.7
          │   └── libdd-common v3.0.2
          │       ├── libdd-capabilities-impl v0.1.0
          │       │   └── libdd-trace-utils v3.0.1
          │       │       └── (dev) libdd-trace-utils v3.0.1 (*)
          │       └── libdd-trace-utils v3.0.1 (*)
          ├── libdd-common v3.0.2 (*)
          └── tokio-rustls v0.26.0
              ├── hyper-rustls v0.27.7 (*)
              └── libdd-common v3.0.2 (*)

advisories FAILED, bans ok, sources ok

Updated: 2026-04-20 14:18:08 UTC | Commit: f67cd79 | dependency-check job results

datadog-datadog-prod-us1-2 Bot commented Apr 20, 2026

Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 62.28%
Overall Coverage: 71.50% (-0.06%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 01e38e3 | Docs | Datadog PR Page | Give us feedback!

codecov-commenter commented

Codecov Report

❌ Patch coverage is 62.27848% with 149 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.50%. Comparing base (ca7a74b) to head (01e38e3).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1896      +/-   ##
==========================================
- Coverage   71.55%   71.50%   -0.06%     
==========================================
  Files         431      434       +3     
  Lines       69158    69547     +389     
==========================================
+ Hits        49489    49729     +240     
- Misses      19669    19818     +149     
Components Coverage Δ
libdd-crashtracker 65.99% <ø> (-0.05%) ⬇️
libdd-crashtracker-ffi 34.47% <ø> (ø)
libdd-alloc 98.77% <ø> (ø)
libdd-data-pipeline 85.31% <26.66%> (-0.17%) ⬇️
libdd-data-pipeline-ffi 71.94% <ø> (ø)
libdd-common 79.16% <ø> (ø)
libdd-common-ffi 73.87% <ø> (ø)
libdd-telemetry 68.06% <ø> (ø)
libdd-telemetry-ffi 19.37% <ø> (ø)
libdd-dogstatsd-client 82.64% <ø> (ø)
datadog-ipc 74.84% <ø> (ø)
libdd-profiling 81.61% <ø> (+0.01%) ⬆️
libdd-profiling-ffi 64.36% <ø> (ø)
datadog-sidecar 29.43% <ø> (ø)
datdog-sidecar-ffi 8.04% <ø> (ø)
spawn-worker 54.69% <ø> (ø)
libdd-tinybytes 93.16% <ø> (ø)
libdd-trace-normalization 81.71% <ø> (ø)
libdd-trace-obfuscation 87.26% <ø> (ø)
libdd-trace-protobuf 68.25% <ø> (ø)
libdd-trace-utils 87.90% <63.68%> (-1.35%) ⬇️
datadog-tracer-flare 86.88% <ø> (ø)
libdd-log 74.69% <ø> (ø)

dd-octo-sts Bot (Contributor) commented Apr 20, 2026

Artifact Size Benchmark Report

aarch64-alpine-linux-musl
Artifact Baseline Commit Change
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.a 83.31 MB 83.42 MB +.12% (+110.62 KB) 🔍
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.so 7.63 MB 7.63 MB 0% (0 B) 👌
aarch64-unknown-linux-gnu
Artifact Baseline Commit Change
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.a 99.40 MB 99.49 MB +.09% (+99.67 KB) 🔍
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.so 10.10 MB 10.17 MB +.62% (+65.00 KB) 🔍
libdatadog-x64-windows
Artifact Baseline Commit Change
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.dll 25.18 MB 25.21 MB +.11% (+30.50 KB) 🔍
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.lib 79.90 KB 79.90 KB 0% (0 B) 👌
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.pdb 184.34 MB 184.43 MB +.04% (+88.00 KB) 🔍
/libdatadog-x64-windows/debug/static/datadog_profiling_ffi.lib 917.46 MB 917.76 MB +.03% (+307.50 KB) 🔍
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.dll 7.89 MB 7.90 MB +.08% (+6.50 KB) 🔍
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.lib 79.90 KB 79.90 KB 0% (0 B) 👌
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.pdb 23.66 MB 23.69 MB +.13% (+32.00 KB) 🔍
/libdatadog-x64-windows/release/static/datadog_profiling_ffi.lib 46.18 MB 46.22 MB +.08% (+41.91 KB) 🔍
libdatadog-x86-windows
Artifact Baseline Commit Change
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.dll 21.65 MB 21.68 MB +.11% (+26.00 KB) 🔍
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.lib 81.14 KB 81.14 KB 0% (0 B) 👌
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.pdb 188.43 MB 188.51 MB +.04% (+80.00 KB) 🔍
/libdatadog-x86-windows/debug/static/datadog_profiling_ffi.lib 902.77 MB 903.08 MB +.03% (+325.72 KB) 🔍
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.dll 6.13 MB 6.14 MB +.11% (+7.50 KB) 🔍
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.lib 81.14 KB 81.14 KB 0% (0 B) 👌
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.pdb 25.35 MB 25.38 MB +.12% (+32.00 KB) 🔍
/libdatadog-x86-windows/release/static/datadog_profiling_ffi.lib 43.67 MB 43.71 MB +.09% (+40.97 KB) 🔍
x86_64-alpine-linux-musl
Artifact Baseline Commit Change
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.a 74.42 MB 74.51 MB +.12% (+98.09 KB) 🔍
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.so 8.52 MB 8.53 MB +.09% (+8.00 KB) 🔍
x86_64-unknown-linux-gnu
Artifact Baseline Commit Change
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.a 91.77 MB 91.86 MB +.09% (+91.44 KB) 🔍
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so 10.20 MB 10.20 MB +.04% (+4.76 KB) 🔍

anais-raison changed the title from "feat: add encoder from v04 to v1" to "feat!: add encoder from v04 to v1" on Apr 20, 2026
anais-raison marked this pull request as ready for review on April 21, 2026 09:32
anais-raison requested review from a team as code owners on April 21, 2026 09:32
Aaalibaba42 (Contributor) left a comment:

I'm not well versed in the data flow, etc.; I mostly just commented on the code, which does have a few smells, but nothing blocking, I'd say.

Comment on lines +74 to +83
    let chunks = trace_utils::collect_trace_chunks(traces, false).map_err(|e| {
        TraceExporterError::Deserialization(DecodeError::InvalidFormat(e.to_string()))
    })?;
    match chunks {
        tracer_payload::TraceChunks::V04(traces) => {
            Ok(tracer_payload::TraceChunks::V1(traces))
        }
        other => Ok(other),
    }
}
Contributor:

You call collect_trace_chunks with false for use_v05_format, so it will always return TraceChunks::V04(traces); the other arm should be unreachable, if I understand correctly. I'm not even sure what the point of this is, or whether collect_trace_chunks should change to take the traces and the format enum directly; it seems to do the same logic twice.

Contributor:

Plus, in terms of design, passing a bare true/false to a function without context is not great; I had to look up the function in my editor to know what it means. I'd be keen to make use_v05_format of collect_trace_chunks an enum anyway (but maybe that's out of scope for this PR).
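
A minimal sketch of what that enum could look like (TraceEncoding and the helper below are hypothetical names, not the crate's API):

/// Explicit format selector replacing the bare `use_v05_format: bool`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TraceEncoding {
    V04,
    V05,
    V1,
}

fn uses_v05_wire_format(encoding: TraceEncoding) -> bool {
    // A call site would then read `collect_trace_chunks(traces, TraceEncoding::V1)`
    // instead of `collect_trace_chunks(traces, false)`.
    matches!(encoding, TraceEncoding::V05)
}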

fn new() -> Self {
Self {
seen: HashMap::new(),
next_id: 1,
Aaalibaba42 (Contributor) commented Apr 21, 2026:

You could probably derive that from the size of the hashmap (I'm not saying it would be more readable or better, just putting the option out there).

yannham (Contributor) commented Apr 21, 2026:

[EDIT: Ok I get what you meant, sorry for the noise] Are you talking about next_id?

Comment on lines +163 to +187
fn encode_array_element<W: RmpWrite, T: TraceData>(
    writer: &mut W,
    value: &AttributeArrayValue<T>,
    table: &mut StringTable,
) -> Result<(), ValueWriteError<W::Error>> {
    match value {
        AttributeArrayValue::String(s) => {
            write_uint8(writer, AnyValueKey::String as u8)?;
            table.write_interned(writer, s.borrow())?;
        }
        AttributeArrayValue::Boolean(b) => {
            write_uint8(writer, AnyValueKey::Bool as u8)?;
            write_bool(writer, *b).map_err(ValueWriteError::InvalidDataWrite)?;
        }
        AttributeArrayValue::Integer(i) => {
            write_uint8(writer, 4u8)?; // Int64
            write_sint(writer, *i)?;
        }
        AttributeArrayValue::Double(d) => {
            write_uint8(writer, AnyValueKey::Double as u8)?;
            write_f64(writer, *d)?;
        }
    }
    Ok(())
}
Contributor:

I'm not against this per se, but if it needs to be a function and not a lambda, I'd write it outside the enclosing function's body. To me, either it's an auxiliary function and it doesn't live in the body, or it's a lambda and it can live there, unless you have good reasons to do otherwise.

Contributor:

I kind of like this pattern. It keeps the function scoped, so there's no need to wonder whether it's called elsewhere, and having an actual function is not much worse than a closure.
The only thing gained by making this a closure would probably be a shorter declaration thanks to inference.

Contributor:

I guess it's just a point of style; I agree it's not a problem in any way, it just feels a bit icky to me.

Contributor:

It's a matter of taste, but I kinda agree with @paullegranddc here. If it's a local reusable block that doesn't actually need to capture any environment, I personally use the fn syntax, which I find more uniform (you also don't have to worry about making it mut or a reference, etc.). For that matter, I've seen struct, enum, or even trait items defined locally. I don't think it's wrong in any way, when justified.

Contributor:

Just to pile on, I agree with @paullegranddc and @yannham. The scoping here makes the intentions clear.

/// - `trace_id` is not encoded in the span (it belongs to the chunk).
/// - `error` is encoded as a boolean.
/// - String values use streaming string interning via `StringTable`.
#[inline(always)]
Contributor:

Why? It doesn't look like the typical function that benefits from being inlined; I'd leave that decision to the compiler unless you have good reasons not to.

yannham (Contributor) commented Apr 21, 2026:

I agree, especially given that this function is quite big. FWIW, my "runbook" for inline annotations:

  • if it's a very small function (a getter or setter, no more than a few operations really) and it's a pub item that might be reused from other crates: #[inline] (it often doesn't change the compiler's decision much, but it enables cross-crate inlining, which is otherwise not possible by default and is usually the actual motivation for #[inline] annotations); a small illustration follows after this list
  • if it's a small-ish function that is very performance critical (e.g. called repeatedly in a hot loop) and I have thoroughly measured with benchmarks that inlining it shows a clear, reproducible improvement: #[inline(always)] might be considered. But inlining can be a pessimization, and whether it is can depend heavily on the call sites as well (and change over time), so it's risky business.
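
As a small illustration of the first bullet (hypothetical type, not from this PR):

pub struct SpanCount {
    value: u64,
}

impl SpanCount {
    /// Tiny public accessor: a plain `#[inline]` hint mostly just enables
    /// cross-crate inlining; the compiler still decides whether to inline.
    #[inline]
    pub fn get(&self) -> u64 {
        self.value
    }
}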

Contributor:

Agreed. I think our msgpack encoder / decoders had a lot of inline annotations in the past (and probably some still hanging around) and most of them made no difference.

Comment on lines +38 to +49
pub(crate) struct StringTable {
    seen: HashMap<String, u32>,
    next_id: u32,
}

impl StringTable {
    fn new() -> Self {
        Self {
            seen: HashMap::new(),
            next_id: 1,
        }
    }
Contributor:

Could be:

Suggested change, replacing:

pub(crate) struct StringTable {
    seen: HashMap<String, u32>,
    next_id: u32,
}

impl StringTable {
    fn new() -> Self {
        Self {
            seen: HashMap::new(),
            next_id: 1,
        }
    }

with:

pub(crate) struct StringTable {
    seen: HashMap<String, u32>,
}

impl StringTable {
    fn new() -> Self {
        let mut seen = HashMap::new();
        seen.insert(String::new(), 0);
        Self { seen }
    }
}

Then you can remove the special empty string case in write_interned and do:

let id = self.seen.len();
self.seen.insert(s.to_string(), id);

Contributor:

I said that first è_é #1896 (comment)

Contributor:

when I started my review you hadn't posted yours I think 😄

Comment on lines +84 to +85
/// Origin tag from `_dd.origin` meta on the root span.
origin: Option<String>,
Contributor:

Is it possible to borrow the origin as an &str instead of copying it?

Comment on lines +105 to +106
}
}
Contributor:

Can we break out of the loop here once we encounter a root span?
This seems to take the "last root span" in the chunk if there are multiple, which I don't think is really deliberate.

Contributor:

@ajgajg1134 - Weren't we discussing something similar in the agent recently? A scenario where we encounter multiple root spans?

Comment on lines +97 to +98
// Chunk-level attributes come from the root span (parent_id == 0).
if span.parent_id == 0 {
Contributor:

This is not complete.
Span chunks can have a remote parent, in which case the top-level span is one whose parent exists but is not in the chunk. In that case we need to promote its attributes.

You can look for _dd.top_level I think, as it is tagged before in the pipeline https://github.com/DataDog/libdatadog/blob/main/libdd-trace-utils/src/span/trace_utils.rs#L16

Btw, this is not obvious when looking at the v04 to v1 conversion in the agent, but it does happen when the converter iterates over all spans.

It puts fields to be promoted in a SpanConvertedFields and then promotes these fields to chunk-level attributes later.

(Example of how this works for the sampling priority:

// Collect promoted fields
// https://github.com/DataDog/datadog-agent/blob/paullgdc/trace/add_agent_telemetry_batching/pkg/proto/pbgo/trace/idx/span.go#L1651-L1657

// handlePromotedMetricsFields processes promoted metrics fields that have dedicated span fields
// If we fail to parse a value we don't use the promoted value, but the original string will still be in the span attributes
func (s *InternalSpan) handlePromotedMetricsFields(key uint32, val float64, convertedFields *SpanConvertedFields) {
	if s.Strings.Get(key) == "_sampling_priority_v1" {
		convertedFields.SamplingPriority = int32(val)
	}
}

//  promotes them https://github.com/DataDog/datadog-agent/blob/paullgdc/trace/add_agent_telemetry_batching/pkg/proto/pbgo/trace/idx/span.go#L931

// ApplyPromotedFields applies span promoted fields to the chunk, copying chunk promoted fields to chunkConvertedFields
func (c *InternalTraceChunk) ApplyPromotedFields(convertedFields *SpanConvertedFields, chunkConvertedFields *ChunkConvertedFields) {
    // ...
	c.Priority = int32(convertedFields.SamplingPriority)
    // ...
}

)

Comment on lines +78 to +79
/// Promoted fields extracted from spans and written at the chunk level.
struct ChunkAttrs {
Contributor:

I think you are missing the sampling mechanism (_dd.p.dm)

) -> Result<(), ValueWriteError<W::Error>> {
let mut table = StringTable::new();

// Top-level map contains only the chunks array for now.
Contributor:

What about the payload promoted fields?

  • env
  • apm mode
  • hostname
  • app version
  • git commit sha

tracer_payload::TraceChunks::V04(traces) => {
Ok(tracer_payload::TraceChunks::V1(traces))
}
other => Ok(other),
Contributor:

This shouldn't be reachable, right? So it should error.

write_map_len(writer, fields)?;

// 128-bit trace ID as 16-byte big-endian binary.
write_uint8(writer, ChunkKey::TraceId as u8)?;
Contributor:

Suggested change, replacing:

write_uint8(writer, ChunkKey::TraceId as u8)?;

with:

write_uint(writer, ChunkKey::TraceId)?;

The write methods will use the smallest possible encoding anyway, and you can remove all the casts.

Contributor:

The casts are there because ChunkKey is a repr(u8) enum, I think, no? So using write_uint will not change anything.

I would prefer something like this pattern, though. There is no need for an enum because we never match on it:

mod chunk_key {
    const PRIORITY: u8 = 1;
    // other keys
    const TRACE_ID: u8 = 6;
}

/// The string table is scoped per payload: each `to_vec` / `write_to_slice` call starts with a
/// fresh table so deduplication is payload-local.
pub(crate) struct StringTable {
seen: HashMap<String, u32>,
Contributor:

It's probably for much later, I'm just thinking out loud here. I wonder if we can make the string table slightly more performant with the following scheme, if all the strings have the same lifetime:

  • allocate the strings in a local bump arena
  • have seen be a HashMap<&'self str, u32> (it's self-referential, but easy to do with something like ouroboros)

Something like this:

#[ouroboros::self_referencing]
struct StringTable {
    /// Preallocates space where strings are stored.
    arena: Arena<u8>,

    /// Deduplicate strings.
    #[borrows(arena)]
    #[covariant]
    map: HashMap<&'this str, u32>,
}

The gain would be faster string allocation and de-allocation. This is only if the encoding is perf sensitive.

Contributor:

I agree it doesn't have to be in this PR, but I think this is worth exploring deeper. String allocation is traditionally the biggest drag on performance. @anais-raison Can you create a ticket for the epic to investigate later?

Comment on lines +69 to +71
let id = self.next_id;
self.next_id += 1;
self.seen.insert(s.to_string(), id);
Contributor:

Suggested change, replacing:

let id = self.next_id;
self.next_id += 1;
self.seen.insert(s.to_string(), id);

with:

self.seen.insert(s.to_string(), self.next_id);
self.next_id += 1;

/// # Errors
/// Returns a `ValueWriteError` if the underlying writer fails.
pub fn write_to_slice<T: TraceData, S: AsRef<[Span<T>]>>(
slice: &mut &mut [u8],
Contributor:

Out of curiosity, why does it need to be &mut &mut here?

) -> Vec<u8> {
let mut buf = ByteBuf::with_capacity(capacity as usize);
#[allow(clippy::expect_used)]
encode_payload(&mut buf, traces).expect("infallible: the error is std::convert::Infallible");
Contributor:

I think there's a way to unwrap an infallible without actually needing a panicking function, although the second match is a bit confusing:

let unwrapped = match result {
    Ok(x) => x,
    Err(e) => match e {},
};

It's not uncommon to have a little helper unwrap_infallible in some crates to do that for you.
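
A sketch of such a helper (hypothetical; not an existing dependency in this repo):

use std::convert::Infallible;

/// Unwraps a `Result` whose error type is `Infallible` without any panicking
/// code path: the empty match on `Infallible` proves the `Err` arm cannot be reached.
fn unwrap_infallible<T>(result: Result<T, Infallible>) -> T {
    match result {
        Ok(value) => value,
        Err(never) => match never {},
    }
}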

Contributor:

Additionally, since you don't do anything with the result, you could also do something like
let _ = encode_payload(...) with a comment explaining why you ignore the result. That might be slightly nicer than the allow(clippy), the expect, and an error message that will never happen.

Contributor:

+1 we should try to avoid panics unless absolutely necessary (like acquiring mutex locks, for example)

}

/// Returns the number of bytes the V1 payload for `traces` would occupy.
pub fn to_len<T: TraceData, S: AsRef<[Span<T>]>>(traces: &[S]) -> u32 {
Contributor:

Nitpick: I find the name of this function a bit obscure.


for link in span_links.iter() {
let trace_id_128 = ((link.trace_id_high as u128) << 64) | link.trace_id as u128;
let link_len = 1 /* trace_id (always) */
Contributor:

Suggested change, replacing:

let link_len = 1 /* trace_id (always) */

with:

let link_len = 1 // trace_id (always)

write_uint8(writer, SpanKey::SpanEvents as u8)?;
rmp::encode::write_array_len(writer, span_events.len() as u32)?;

for event in span_events.iter() {
Contributor:

I think &[_] implements IntoIterator<Item = &_>?

Suggested change, replacing:

for event in span_events.iter() {

with:

for event in span_events {

rmp::encode::write_array_len(writer, span_events.len() as u32)?;

for event in span_events.iter() {
let event_len = 2 /* time_unix_nano, name */
Contributor:

Suggested change, replacing:

let event_len = 2 /* time_unix_nano, name */

with:

let event_len = 2 // time_unix_nano, name

match self {
TraceExporterOutputFormat::V04 => "/v0.4/traces",
TraceExporterOutputFormat::V05 => "/v0.5/traces",
TraceExporterOutputFormat::V1 => "/v1.0/traces",
Contributor:

I believe the builder is going to fail validation if input is v0.4 and output is v1.0. Is that intentional? With this PR we should have no issue serializing to v1.0, right?

#[default]
V04,
V05,
V1,
Contributor:

Can we include an integration test that does a snapshot test with the test agent? I assume you'll have to bump the version of the test agent to pick up the V1 support.
