Reduce allocations in extendReuseSlice growth path #6863

Merged
mapno merged 3 commits into grafana:main from mapno:perf/extend-reuse-slice-single-alloc
Apr 7, 2026

@mapno commented Apr 7, 2026

What this PR does:

Replaces the append + make pattern in extendReuseSlice with a single make + copy when the slice needs to grow. The old code created a temporary slice via make([]T, sz-len(in)) and then append-ed it, causing two heap allocations on the growth path. The new code performs one allocation and copies existing data directly.

Internal continuous profiling shows extendReuseSlice accounts for ~11.8% of total allocated bytes (memory:alloc_space), called from traceToParquetWithMapping during WAL block writes and block creation. While the growth path is infrequent per-buffer in steady state, the aggregate impact across many tenant instances is significant.

Benchmark results
goos: darwin
goarch: arm64
pkg: github.com/grafana/tempo/tempodb/encoding/vparquet5
cpu: Apple M2 Pro
                    │   old.txt    │               new.txt                │
                    │    sec/op    │    sec/op     vs base                │
ExtendReuseSlice-12   15.99µ ± 25%   10.41µ ± 51%  -34.90% (p=0.019 n=10)

                    │    old.txt    │               new.txt                │
                    │     B/op      │     B/op      vs base                │
ExtendReuseSlice-12   105.56Ki ± 0%   89.56Ki ± 0%  -15.16% (p=0.000 n=10)

                    │  old.txt   │            new.txt             │
                    │ allocs/op  │ allocs/op   vs base            │
ExtendReuseSlice-12   6.000 ± 0%   6.000 ± 0%  ~ (p=1.000 n=10)

Which issue(s) this PR fixes:
N/A — found via continuous profiling.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Copilot AI left a comment

Pull request overview

Optimizes extendReuseSlice in the vParquet5 encoding to reduce allocation overhead on the slice growth path, targeting a hot allocation site during trace-to-parquet conversion used in WAL/block creation.

Changes:

  • Replaces append(...make...) growth logic with a single make + copy in extendReuseSlice.
  • Updates the micro-benchmark to exercise a more varied resize pattern using []Attribute.
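
A benchmark of the kind described above might look like the sketch below. The Attribute struct, the size sequence, and the helper resizeSequence are all hypothetical stand-ins chosen for illustration; the actual benchmark in schema_test.go may differ:

```go
package main

import (
	"fmt"
	"testing"
)

// Attribute is a hypothetical stand-in for the vparquet5 Attribute type.
type Attribute struct {
	Key   string
	Value string
}

// extendReuseSlice mirrors the new growth path described in this PR;
// the cap(in) >= sz fast path is an assumption.
func extendReuseSlice[T any](sz int, in []T) []T {
	if cap(in) >= sz {
		return in[:sz]
	}
	out := make([]T, sz)
	copy(out, in)
	return out
}

// resizeSequence reuses one buffer across a sequence of sizes that mixes
// growing and shrinking requests, and returns the final buffer.
func resizeSequence(sizes []int) []Attribute {
	var buf []Attribute
	for _, sz := range sizes {
		buf = extendReuseSlice(sz, buf)
	}
	return buf
}

// BenchmarkExtendReuseSlice would normally live in a _test.go file.
func BenchmarkExtendReuseSlice(b *testing.B) {
	sizes := []int{4, 16, 8, 64, 32, 256} // hypothetical sizes
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		resizeSequence(sizes)
	}
}

func main() {
	buf := resizeSequence([]int{4, 16, 8, 64, 32, 256})
	fmt.Println(len(buf), cap(buf))
}
```

Shrinking requests in the sequence hit the reslice path, so only the requests that exceed the current capacity allocate.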

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File | Description
tempodb/encoding/vparquet5/schema.go | Implements the new one-allocation growth strategy for extendReuseSlice.
tempodb/encoding/vparquet5/schema_test.go | Adjusts the benchmark to resize a reusable buffer across a sequence of sizes.

- in = in[:cap(in)]
- return append(in, make([]T, sz-len(in))...)
+ out := make([]T, sz)
+ copy(out, in)

Copilot AI Apr 7, 2026

The new growth path allocates out := make([]T, sz) and copies only len(in) elements, which changes behavior relative to the previous implementation when len(in) < cap(in) and sz > cap(in): the old code effectively preserved elements up to cap(in). Could we add a unit test that covers this scenario (e.g., shrink a slice to a smaller len, then grow past cap) to lock in the intended semantics and prevent regressions?

Suggested change
- copy(out, in)
+ copy(out, in[:cap(in)])

Contributor Author

[...] that's wrong for our use case — callers always overwrite all elements (traceToParquet). The existing test already covers the len < cap < sz case and expects zero-filled elements. I'll add a test for clarity though.

Comment thread tempodb/encoding/vparquet5/schema.go
Comment thread tempodb/encoding/vparquet5/schema.go
@mapno force-pushed the perf/extend-reuse-slice-single-alloc branch from d4295fe to c627e5a April 7, 2026 08:24
mapno added 2 commits April 7, 2026 10:29
Replace append+make pattern with a single make+copy when the slice
needs to grow. The old code allocated a temporary slice via make then
appended it, causing two allocations. The new code does one allocation
and copies existing data directly.
Copilot AI review requested due to automatic review settings April 7, 2026 08:32
@mapno force-pushed the perf/extend-reuse-slice-single-alloc branch from c627e5a to 4d37321 April 7, 2026 08:32

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Comment thread tempodb/encoding/vparquet3/schema.go
Comment thread tempodb/encoding/vparquet4/schema.go
Comment thread CHANGELOG.md

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Comment on lines +930 to +934
{
	// len < cap < sz: slice was shrunk then grown past cap
	sz:       6,
	in:       append(make([]int, 0, 4), 1, 2),
	expected: []int{1, 2, 0, 0, 0, 0},

Copilot AI Apr 7, 2026

The new len < cap < sz test case uses a zero-initialized backing array (append(make(...), 1, 2)), so it would pass both the old and new extendReuseSlice implementations and doesn’t actually lock in the intended semantic change (discarding any stale elements beyond len(in)). Could this be adjusted to populate the underlying array beyond len with non-zero values, then reslice down and extend, so the test would fail under the previous behavior? Same idea applies to the equivalent tests in vparquet3/vparquet4.

Contributor Author

Not worth it. The old code preserving elements between len and cap was an accidental side effect of the append pattern, not intentional. Those elements are stale data that callers will overwrite anyway (traceToParquetWithMapping fills every position).

@mapno marked this pull request as ready for review April 7, 2026 12:51

@javiermolinar left a comment

I like the new approach: cleaner, more predictable, and safer.

Comment thread Makefile
endif

- FILES_TO_FMT=$(shell find . -type d \( -path ./vendor -o -path ./tools/vendor -o -path ./opentelemetry-proto -o -path ./vendor-fix \) -prune -o -name '*.go' -not -name "*.pb.go" -not -name '*.y.go' -not -name '*.gen.go' -print)
+ FILES_TO_FMT=$(shell find . -type d \( -path ./vendor -o -path ./tools/vendor -o -path ./opentelemetry-proto -o -path ./vendor-fix -o -path ./.claude \) -prune -o -name '*.go' -not -name "*.pb.go" -not -name '*.y.go' -not -name '*.gen.go' -print)

Is this needed?

Contributor Author

It was causing trouble when running make fmt because of how Claude creates worktrees in .claude.

@mapno merged commit f386416 into grafana:main Apr 7, 2026
31 checks passed
@mapno deleted the perf/extend-reuse-slice-single-alloc branch April 7, 2026 14:04