perf(encode): emit span fields and event attributes as compact msgpack ints #8229
Merged
BridgeAR merged 3 commits on May 11, 2026
Overall package size

Self size: 5.76 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| import-in-the-middle | 3.0.1 | 82.56 kB | 817.39 kB |
| dc-polyfill | 0.1.10 | 26.73 kB | 26.73 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe
🎉 All green! ❄️ No new flaky tests detected. 🔗 Commit SHA: ac38d52
Benchmarks

Benchmark execution time: 2026-05-06 22:13:08

Comparing candidate commit ac38d52 in PR branch

Found 23 performance improvements and 1 performance regression! Performance is the same for 1724 metrics, 96 unstable metrics.

- scenario:encoders-0.4-20
- scenario:encoders-0.4-22
- scenario:encoders-0.4-24
- scenario:encoders-0.4-events-native-18
- scenario:encoders-0.4-events-native-20
- scenario:encoders-0.4-events-native-22
- scenario:encoders-0.4-events-native-24
- scenario:encoders-0.5-22
- scenario:encoders-0.5-events-legacy-18
- scenario:encoders-0.5-events-legacy-20
- scenario:encoders-0.5-events-legacy-22
- scenario:encoders-0.5-events-legacy-24
The 0.4 trace export used to emit `span.error`, `span.start`, `span.duration`, every integer value in `span.meta` / `span.metrics`, and every `int_value` in span events as a fixed 9-byte float64. msgpack's compact int variants (fixint / uint8 / uint16 / uint32 / int*) carry the same number across the wire in 1-5 bytes for the small values these fields actually hold in production. The agent decodes both encodings as the same Go int64 / float64.

The following improvements are implemented:

1. The pre-encoded `KEY_*_PREFIX` buffers for `error`, `start`, and `duration` are replaced with the bare key buffers. The hot loop now emits the key, then calls `#encodeIntOrFloat`, which picks the smallest valid msgpack encoding for the value. `#writeIntegerField` and `#writeLongField` are removed with their only callers.
2. `#encodeMetaEntries` (the 0.4-private fast path for `span.meta` and `span.metrics`) emits the numeric value via `#encodeIntOrFloat`. Integer metrics like `process.pid: 4321` shrink from 9 to 3 bytes; small ones like `_sampling_priority_v1: 1` shrink to a single fixint byte.
3. Span event attribute `int_value` in `#emitAttribute` and `#emitArrayItem` emits via the same helper. The `type: 2` tag stays; only the wire encoding of the value byte changes.

`#encodeIntOrFloat` deliberately avoids the NaN-coerces-to-0 behavior of `MsgpackEncoder.encodeNumber`: `Number.isInteger(NaN)` is `false`, so NaN keeps its float64 bits.

`_encodeMap` (shared with the 0.5 encoder and the CI-visibility encoders) is left on float64 for now: those go to different agent intakes that weren't part of the wire-format review.

The 0.4 spec test updates the four `start, 123n` / `duration, 456n` assertions to bare numbers, since small ints now decode as `Number` regardless of `useBigInt64`. Trace / span / parent IDs still use uint64 and stay on bigint.

Microbench numbers are flat; per-span CPU is unchanged. The win is on the wire: a typical 30-span trace shrinks by ~50 B per span (4 B from `error`, ~8 B per small `duration`, ~6 B per integer metric, ~8 B per integer event attribute).
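To make the size claims concrete, here is a minimal sketch of a "smallest valid msgpack encoding" helper. It is not the PR's code: `tagged`, `float64`, and this standalone `encodeIntOrFloat` are hypothetical names for illustration, and sending integers outside the 64-bit range to float64 is a choice made for this sketch only. The tag bytes follow the msgpack format.

```javascript
// Hypothetical helpers for this sketch; none of them are taken from dd-trace.

// Builds [tag, ...payload]; DataView writes the payload big-endian by default.
function tagged (tag, size, write) {
  const bytes = new Uint8Array(1 + size)
  bytes[0] = tag
  write(new DataView(bytes.buffer, 1))
  return Array.from(bytes)
}

function float64 (value) {
  return tagged(0xcb, 8, view => view.setFloat64(0, value))
}

// Picks the smallest msgpack form for integers; everything else stays float64.
function encodeIntOrFloat (value) {
  // NaN, ±Infinity and fractional values are not integers, so they keep float64 bits.
  if (!Number.isInteger(value)) return float64(value)
  if (value >= 0) {
    if (value <= 0x7f) return [value] // positive fixint
    if (value <= 0xff) return [0xcc, value] // uint8
    if (value <= 0xffff) return tagged(0xcd, 2, view => view.setUint16(0, value))
    if (value <= 0xffffffff) return tagged(0xce, 4, view => view.setUint32(0, value))
    if (value < 2 ** 64) return tagged(0xcf, 8, view => view.setBigUint64(0, BigInt(value)))
    return float64(value) // does not fit uint64 (sketch-only fallback)
  }
  if (value >= -0x20) return [value & 0xff] // negative fixint, 0xe0..0xff
  if (value >= -0x80) return [0xd0, value & 0xff] // int8
  if (value >= -0x8000) return tagged(0xd1, 2, view => view.setInt16(0, value))
  if (value >= -0x80000000) return tagged(0xd2, 4, view => view.setInt32(0, value))
  if (value >= -(2 ** 63)) return tagged(0xd3, 8, view => view.setBigInt64(0, BigInt(value)))
  return float64(value) // does not fit int64 (sketch-only fallback)
}

console.log(encodeIntOrFloat(1).length) // 1
console.log(encodeIntOrFloat(4321).length) // 3
console.log(encodeIntOrFloat(NaN).length) // 9
```

Under these assumptions, `_sampling_priority_v1: 1` costs one byte and `process.pid: 4321` costs three, matching the figures quoted above, while NaN still takes the 9-byte float64 form.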
Following the 0.4 change, the 0.5 trace export now also emits `span.error`, `span.start`, `span.duration`, and every integer value in `span.meta` / `span.metrics` as the smallest valid msgpack int. Same agent decoder behavior, smaller wire.

Two targeted edits:

1. The `_encode` loop swaps `_encodeInteger(span.error)`, `_encodeLong(span.start || 0)`, and `_encodeLong(span.duration || 0)` for `_encodeIntOrFloat`. Large timestamps and durations still take the uint64 branch (same 9 bytes), but small values like `error: 0` collapse to a single fixint byte.
2. `_encodeMap` is overridden in 0.5 with the same shape as the inherited 0.4 method, but the numeric branch goes through `_encodeIntOrFloat`. The 0.4 base method is left on float64 because the CI-visibility and span-stats encoders inherit it and target a different intake that wasn't part of this review.

`#encodeIntOrFloat` moves from `#private` to `_encodeIntOrFloat` so the subclass can call it. NaN handling stays: `Number.isInteger(NaN)` is `false`, so NaN keeps its float64 bits instead of coercing to fixint 0 the way `MsgpackEncoder.encodeNumber` would.
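The override described above can be sketched roughly as follows. The class names and the `ratio` metric are hypothetical; only `_encodeMap` and `_encodeIntOrFloat` are names from the description. To stay short, the methods return byte counts for numeric values instead of writing to a buffer. The helper is underscore-prefixed rather than `#private` because JavaScript private methods are not reachable from subclasses.

```javascript
// Stand-in for the base encoder; returns sizes in bytes rather than bytes.
class BaseEncoder {
  // Size of the smallest msgpack form for a number.
  _encodeIntOrFloat (value) {
    if (!Number.isInteger(value)) return 9 // float64: tag + 8 bytes
    if (value >= 0) {
      if (value <= 0x7f) return 1 // positive fixint
      if (value <= 0xff) return 2 // uint8
      if (value <= 0xffff) return 3 // uint16
      if (value <= 0xffffffff) return 5 // uint32
      return 9 // uint64
    }
    if (value >= -0x20) return 1 // negative fixint
    if (value >= -0x80) return 2 // int8
    if (value >= -0x8000) return 3 // int16
    if (value >= -0x80000000) return 5 // int32
    return 9 // int64
  }

  // Base behavior: every numeric map value costs a 9-byte float64.
  _encodeMap (map) {
    let bytes = 0
    for (const value of Object.values(map)) {
      if (typeof value === 'number') bytes += 9
    }
    return bytes
  }
}

// Stand-in for the 0.5 encoder: same loop shape, compact numeric branch.
class CompactEncoder extends BaseEncoder {
  _encodeMap (map) {
    let bytes = 0
    for (const value of Object.values(map)) {
      if (typeof value === 'number') bytes += this._encodeIntOrFloat(value)
    }
    return bytes
  }
}

// Stand-in for an encoder that inherits the base method unchanged.
class InheritingEncoder extends BaseEncoder {}

const metrics = { _sampling_priority_v1: 1, 'process.pid': 4321, ratio: 0.5 }
console.log(new CompactEncoder()._encodeMap(metrics)) // 13 (1 + 3 + 9)
console.log(new InheritingEncoder()._encodeMap(metrics)) // 27 (3 × 9)
```

Overriding in the subclass leaves every other inheritor of the base method on its existing wire format, which is the isolation the description relies on.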
`error: 0`, `_sampling_priority_v1: 1`, attribute counts, http status codes, and most small metric values land in the msgpack positive-fixint range (0..127). The current dispatch through `Number.isInteger`, the `>= 0` branch, and `MsgpackEncoder.encodeUnsigned`'s size cascade costs more than the actual encoding for these values.

A single check identifies them: `value === (value & 0x7F)` is true iff `value` is an exact integer in [0, 127]. The bitwise `&` coerces non-integer and out-of-range values into something that can't equal `value` again. NaN, ±Infinity, negatives, and floats with a fractional part all fall through to the existing dispatch.

Recovers the ~3% CPU the compact-int change cost on metric-heavy spans without growing the wire format.
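The check is easy to try in isolation. The sketch below wraps the quoted expression in a hypothetical `isPositiveFixint` helper (the name is not from the PR) and runs it over a few representative values.

```javascript
// Hypothetical helper name; the expression is the one quoted in the description.
function isPositiveFixint (value) {
  // `&` converts its operand to a 32-bit integer first: fractions are
  // truncated, NaN and ±Infinity become 0, large values wrap, and the mask
  // clears the sign bit. None of those results can equal the original value.
  return value === (value & 0x7f)
}

const samples = [0, 1, 127, 128, 200, -1, 1.5, NaN, Infinity, 2 ** 32 + 5]
for (const sample of samples) {
  console.log(sample, isPositiveFixint(sample))
}
// Only 0, 1 and 127 report true; every other sample reports false.
```

One edge case worth knowing: `-0` also passes, because `-0 === 0` is true in JavaScript, so a fast path built on this check would treat it as integer zero.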
ca25265 to ac38d52
bengl approved these changes May 11, 2026

rochdev pushed a commit that referenced this pull request May 13, 2026
…k ints (#8229)

* perf(encode): emit span fields and event attributes as compact msgpack ints
* perf(encode): extend compact msgpack int encoding to the 0.5 wire
* perf(encode): inline a fixint fast path in _encodeIntOrFloat