
[turbopack-trace-server] optimize loading#93264

Merged
lukesandberg merged 8 commits into canary from 04-25-_turbopack-trace-server_inline_1_arg_per_span_via_smallvec
Apr 28, 2026

Conversation

@lukesandberg
Contributor

@lukesandberg lukesandberg commented Apr 26, 2026

Land a few optimizations to the trace server:

  • Shrink SpanEvent from 40 bytes to 32 bytes by triggering a niche optimization
  • Change args and events to be a SmallVec with inline size 1
    • for args it is size <=1 ~31% of the time
    • for events it is size <=1 69% of the time
  • Compute min/max timestamps in a single pass instead of two when inserting into the SelfTimeTree
  • Bundle the dynamically computed 'total' fields behind a single OnceLock
    • saves 40 bytes per span due to OnceLock overhead
  • Inline SpanTimeData and SpanNames into Span
    • We get little benefit from deferring the allocations, and by inlining we save time and improve memory locality.
      • post load, SpanTimeData is allocated for 94% of spans, but after loading trace.nextjs.org it is 100%
      • post load, SpanNames is allocated for 0% of spans, but after loading it is 96.2% of spans
  • Remove the inner OnceLocks from SpanNames; we can just allocate these all together

Measured with one 10 GB trace file, loading time drops from 75.7s (33 GB of RAM) to 60.5s (19.5 GB of RAM), with load throughput occasionally exceeding 200 MB/s.
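To put the measurements above in relative terms, here is a small arithmetic sketch. It assumes the trace file is exactly 10,000 MB, which the description only states approximately; the helper names are made up for illustration. Under that assumption the numbers work out to roughly 20% less load time and roughly 41% less memory.

```rust
/// Percentage by which `after` is smaller than `before`.
fn percent_reduction(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

/// Average throughput in MB/s for `size_mb` megabytes loaded in `seconds`.
fn throughput_mb_per_s(size_mb: f64, seconds: f64) -> f64 {
    size_mb / seconds
}

fn main() {
    // Figures quoted in the PR description.
    let (time_before, time_after) = (75.7, 60.5); // seconds
    let (ram_before, ram_after) = (33.0, 19.5); // GB
    // Assumption: "10 GB" is treated as exactly 10,000 MB.
    let size_mb = 10_000.0;

    println!(
        "load time reduced by {:.1}%",
        percent_reduction(time_before, time_after)
    );
    println!(
        "memory reduced by {:.1}%",
        percent_reduction(ram_before, ram_after)
    );
    println!(
        "average throughput: {:.0} MB/s before, {:.0} MB/s after",
        throughput_mb_per_s(size_mb, time_before),
        throughput_mb_per_s(size_mb, time_after)
    );
}
```

The average throughput comes out below the quoted 200 MB/s, which fits the description's wording that the higher rate is only reached occasionally.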

Contributor Author

lukesandberg commented Apr 26, 2026

This stack of pull requests is managed by Graphite.

@github-actions github-actions Bot added created-by: Turbopack team PRs by the Turbopack team. Turbopack Related to Turbopack with Next.js. labels Apr 26, 2026
@lukesandberg lukesandberg changed the base branch from trace_zip_file to graphite-base/93264 April 26, 2026 18:24
@lukesandberg lukesandberg force-pushed the 04-25-_turbopack-trace-server_inline_1_arg_per_span_via_smallvec branch from 48953a4 to 13fe392 Compare April 26, 2026 18:24
@lukesandberg lukesandberg changed the base branch from graphite-base/93264 to canary April 26, 2026 18:25
@github-actions
Contributor

github-actions Bot commented Apr 26, 2026

Stats from current PR

✅ No significant changes detected

📊 All Metrics
📖 Metrics Glossary

Dev Server Metrics:

  • Listen = TCP port starts accepting connections
  • First Request = HTTP server returns successful response
  • Cold = Fresh build (no cache)
  • Warm = With cached build artifacts

Build Metrics:

  • Fresh = Clean build (no .next directory)
  • Cached = With existing .next directory

Change Thresholds:

  • Time: Changes < 50ms AND < 10%, OR < 2% are insignificant
  • Size: Changes < 1KB AND < 1% are insignificant
  • All other changes are flagged to catch regressions

⚡ Dev Server

Metric Canary PR Change Trend
Cold (Listen) 812ms 812ms █████
Cold (Ready in log) 795ms 798ms ▇█▇▇█
Cold (First Request) 1.287s 1.288s ▃▆▅▇█
Warm (Listen) 812ms 813ms █████
Warm (Ready in log) 794ms 804ms ▇█▇██
Warm (First Request) 616ms 637ms ▃▇▄▇█

📦 Dev Server (Webpack)

Metric Canary PR Change Trend
Cold (Listen) 811ms 810ms ▅▁█▅█
Cold (Ready in log) 782ms 781ms ▆▇██▂
Cold (First Request) 3.156s 3.150s ▇▇▇█▄
Warm (Listen) 811ms 810ms ▃▃▇▅▇
Warm (Ready in log) 781ms 784ms █▇██▁
Warm (First Request) 3.159s 3.161s █▇▅▆▄

⚡ Production Builds

Metric Canary PR Change Trend
Fresh Build 5.035s 5.016s ▆▇▆▇▇
Cached Build 5.082s 5.125s ▅▇▇██

📦 Production Builds (Webpack)

Metric Canary PR Change Trend
Fresh Build 23.574s 23.795s ▅▃▅█▂
Cached Build 23.732s 23.844s ▅▆██▄
node_modules Size 495 MB 495 MB ▁▁▁▁▁
📦 Bundle Sizes


⚡ Turbopack

Client

Main Bundles
Canary PR Change
0_09canb0ezn8.js gzip 153 B N/A -
02k5onwb4z3s6.js gzip 167 B N/A -
03t9nzq88k1pe.js gzip 155 B N/A -
04tqxk-qcsi2f.js gzip 156 B N/A -
0cz1d0mv5g_q7.js gzip 39.4 kB 39.4 kB
0fli3_wppnim5.js gzip 12.9 kB N/A -
0kb7_ep3r1z0_.js gzip 10.1 kB N/A -
0kw8xgqdrilf6.js gzip 8.56 kB N/A -
0nnx767ck3zz4.js gzip 157 B N/A -
0ojkk2e654xsc.js gzip 8.59 kB N/A -
0wxpyd8r-vipl.js gzip 1.47 kB N/A -
0xrh8l7b4d3s2.js gzip 156 B N/A -
0xy2fhla48_rd.js gzip 9.24 kB N/A -
10wqsvi2mgfmi.js gzip 9.82 kB N/A -
16lhqjoqbznyg.js gzip 220 B 220 B
16vepdkipri3r.js gzip 8.51 kB N/A -
17n96uu6y1pxq.js gzip 8.6 kB N/A -
18y4_8-9or0mn.js gzip 8.51 kB N/A -
1elt1qium-r2m.css gzip 115 B 115 B
1gq145j3kps-h.js gzip 8.62 kB N/A -
1k3mvlb4-0ngf.js gzip 155 B N/A -
1ke_4s9soy654.js gzip 156 B N/A -
1nsh-mbn0e-se.js gzip 8.56 kB N/A -
1qngoc418rk6i.js gzip 65.6 kB N/A -
1r3s1n8vyb7h6.js gzip 161 B N/A -
1tsrrp1tdngti.js gzip 13.3 kB N/A -
1v-qecyz63-0b.js gzip 154 B N/A -
2__-e_ym8n788.js gzip 450 B N/A -
22o6xd9_ywdu6.js gzip 233 B N/A -
25n272-g99oa1.js gzip 7.61 kB N/A -
2c9mvd-i9rxxl.js gzip 160 B N/A -
2d4njk_907vw4.js gzip 157 B N/A -
2faj3acmavn9n.js gzip 13.1 kB N/A -
2kvj8yrfznmwx.js gzip 5.69 kB N/A -
2qv7m7xjnokgr.js gzip 8.58 kB N/A -
2ue5g3yr_f1ds.js gzip 70.9 kB N/A -
342ijzvrpe53h.js gzip 2.29 kB N/A -
3afk9e9-iuwwd.js gzip 157 B N/A -
3k1k5gtofm6eq.js gzip 10.4 kB N/A -
3xq6of2nocani.js gzip 49.5 kB N/A -
42_02jza_7yny.js gzip 13.8 kB N/A -
turbopack-04..w00-.js gzip 4.19 kB N/A -
turbopack-0g..9a3o.js gzip 4.19 kB N/A -
turbopack-0l..g-ev.js gzip 4.19 kB N/A -
turbopack-0m..2-2t.js gzip 4.18 kB N/A -
turbopack-0t..xwt-.js gzip 4.19 kB N/A -
turbopack-0u..t26z.js gzip 4.2 kB N/A -
turbopack-1k..1uu6.js gzip 4.19 kB N/A -
turbopack-1m..q5n1.js gzip 4.19 kB N/A -
turbopack-2j..87q5.js gzip 4.19 kB N/A -
turbopack-2q..y41_.js gzip 4.19 kB N/A -
turbopack-2y..xjoo.js gzip 4.19 kB N/A -
turbopack-3l..z3of.js gzip 4.17 kB N/A -
turbopack-3s..s0yf.js gzip 4.19 kB N/A -
turbopack-3u..-zq7.js gzip 4.19 kB N/A -
03kysncgx5l7w.js gzip N/A 155 B -
0arkbdqpxc37i.js gzip N/A 8.6 kB -
0bz-xifewa17d.js gzip N/A 8.63 kB -
0efh6erg1kc4c.js gzip N/A 158 B -
0tvekitj587fh.js gzip N/A 8.51 kB -
0yvk6-wi8e9wh.js gzip N/A 13.3 kB -
1-jqyfc89tixo.js gzip N/A 1.46 kB -
10y3h86mnhs_2.js gzip N/A 10.4 kB -
12hxdatac0fxj.js gzip N/A 49.5 kB -
139jydanoq6-d.js gzip N/A 154 B -
14t1kneseb8th.js gzip N/A 2.3 kB -
15sb1-dsqfk_j.js gzip N/A 8.59 kB -
1ab2xruymo-oj.js gzip N/A 449 B -
1b3xo3p2pa8_a.js gzip N/A 70.9 kB -
1dt49_v4y8lxb.js gzip N/A 13.8 kB -
1tu25qtsmfhar.js gzip N/A 9.82 kB -
1v3ftpmn8m_ud.js gzip N/A 161 B -
1vein_gnv3mwr.js gzip N/A 8.56 kB -
1vmibvuhp1gey.js gzip N/A 13.1 kB -
1wzrm0xjjbzn5.js gzip N/A 10.1 kB -
1z1geo4e53wn1.js gzip N/A 156 B -
1z3g0uaqtv9_3.js gzip N/A 8.56 kB -
2-2ld71a0y6d5.js gzip N/A 157 B -
213wdc0nef-no.js gzip N/A 168 B -
248gz0gduuney.js gzip N/A 157 B -
2bi5hx402juv-.js gzip N/A 8.58 kB -
2hy56297fog9u.js gzip N/A 8.52 kB -
2k0exemzm1ral.js gzip N/A 157 B -
2pch5duiz7pl1.js gzip N/A 155 B -
2u_rpxq3tzytl.js gzip N/A 233 B -
2zg0rr542d7qb.js gzip N/A 156 B -
314cbinszt68n.js gzip N/A 155 B -
35nh2lh_i5pyh.js gzip N/A 7.61 kB -
368lim5wq0o0r.js gzip N/A 12.9 kB -
3drqjohogojbw.js gzip N/A 5.69 kB -
3inn3g12k7ggr.js gzip N/A 153 B -
3lx6lyx6jwnsa.js gzip N/A 65.5 kB -
3wpp8nvyoj121.js gzip N/A 9.24 kB -
turbopack-02..h8bk.js gzip N/A 4.19 kB -
turbopack-03..d822.js gzip N/A 4.19 kB -
turbopack-04..-sj1.js gzip N/A 4.19 kB -
turbopack-0d..etty.js gzip N/A 4.19 kB -
turbopack-0e..h9db.js gzip N/A 4.19 kB -
turbopack-0h..um4t.js gzip N/A 4.18 kB -
turbopack-0m.._t4k.js gzip N/A 4.19 kB -
turbopack-0o..jeg1.js gzip N/A 4.19 kB -
turbopack-15..3bjj.js gzip N/A 4.2 kB -
turbopack-18..r3ht.js gzip N/A 4.19 kB -
turbopack-1b..dtt2.js gzip N/A 4.19 kB -
turbopack-3b..mz6c.js gzip N/A 4.19 kB -
turbopack-3n..av5a.js gzip N/A 4.17 kB -
turbopack-3y..9apy.js gzip N/A 4.19 kB -
Total 465 kB 465 kB ⚠️ +48 B

Server

Middleware
Canary PR Change
middleware-b..fest.js gzip 718 B 719 B
Total 718 B 719 B ⚠️ +1 B
Build Details
Build Manifests
Canary PR Change
_buildManifest.js gzip 435 B 433 B
Total 435 B 433 B ✅ -2 B

📦 Webpack

Client

Main Bundles
Canary PR Change
2637-HASH.js gzip 4.63 kB N/A -
7724.HASH.js gzip 169 B N/A -
8274-HASH.js gzip 61.4 kB N/A -
8817-HASH.js gzip 5.59 kB N/A -
c3500254-HASH.js gzip 62.8 kB N/A -
framework-HASH.js gzip 59.7 kB 59.7 kB
main-app-HASH.js gzip 254 B 255 B
main-HASH.js gzip 39.4 kB 39.4 kB
webpack-HASH.js gzip 1.68 kB 1.68 kB
5887-HASH.js gzip N/A 5.61 kB -
6522-HASH.js gzip N/A 60.7 kB -
6779-HASH.js gzip N/A 4.63 kB -
8854.HASH.js gzip N/A 169 B -
eab920f9-HASH.js gzip N/A 62.8 kB -
Total 236 kB 235 kB ✅ -652 B
Polyfills
Canary PR Change
polyfills-HASH.js gzip 39.4 kB 39.4 kB
Total 39.4 kB 39.4 kB
Pages
Canary PR Change
_app-HASH.js gzip 193 B 193 B
_error-HASH.js gzip 182 B 182 B
css-HASH.js gzip 333 B 334 B
dynamic-HASH.js gzip 1.81 kB 1.8 kB
edge-ssr-HASH.js gzip 255 B 255 B
head-HASH.js gzip 353 B 349 B 🟢 4 B (-1%)
hooks-HASH.js gzip 384 B 382 B
image-HASH.js gzip 581 B 581 B
index-HASH.js gzip 260 B 259 B
link-HASH.js gzip 2.52 kB 2.52 kB
routerDirect..HASH.js gzip 316 B 318 B
script-HASH.js gzip 386 B 386 B
withRouter-HASH.js gzip 313 B 314 B
1afbb74e6ecf..834.css gzip 106 B 106 B
Total 7.99 kB 7.98 kB ✅ -10 B

Server

Edge SSR
Canary PR Change
edge-ssr.js gzip 126 kB 126 kB
page.js gzip 274 kB 274 kB
Total 400 kB 399 kB ✅ -542 B
Middleware
Canary PR Change
middleware-b..fest.js gzip 616 B 616 B
middleware-r..fest.js gzip 156 B 156 B
middleware.js gzip 43.9 kB 44.4 kB 🔴 +465 B (+1%)
edge-runtime..pack.js gzip 842 B 842 B
Total 45.5 kB 46 kB ⚠️ +465 B
Build Details
Build Manifests
Canary PR Change
_buildManifest.js gzip 722 B 719 B
Total 722 B 719 B ✅ -3 B
Build Cache
Canary PR Change
0.pack gzip 4.4 MB 4.4 MB
index.pack gzip 115 kB 116 kB
index.pack.old gzip 116 kB 114 kB 🟢 1.68 kB (-1%)
Total 4.63 MB 4.63 MB ✅ -3.94 kB

🔄 Shared (bundler-independent)

Runtimes
Canary PR Change
app-page-exp...dev.js gzip 348 kB 348 kB
app-page-exp..prod.js gzip 193 kB 193 kB
app-page-tur...dev.js gzip 348 kB 348 kB
app-page-tur..prod.js gzip 193 kB 193 kB
app-page-tur...dev.js gzip 344 kB 344 kB
app-page-tur..prod.js gzip 191 kB 191 kB
app-page.run...dev.js gzip 345 kB 345 kB
app-page.run..prod.js gzip 191 kB 191 kB
app-route-ex...dev.js gzip 77.3 kB 77.3 kB
app-route-ex..prod.js gzip 52.8 kB 52.8 kB
app-route-tu...dev.js gzip 77.4 kB 77.4 kB
app-route-tu..prod.js gzip 52.8 kB 52.8 kB
app-route-tu...dev.js gzip 77 kB 77 kB
app-route-tu..prod.js gzip 52.5 kB 52.5 kB
app-route.ru...dev.js gzip 76.9 kB 76.9 kB
app-route.ru..prod.js gzip 52.5 kB 52.5 kB
dist_client_...dev.js gzip 324 B 324 B
dist_client_...dev.js gzip 326 B 326 B
dist_client_...dev.js gzip 318 B 318 B
dist_client_...dev.js gzip 317 B 317 B
pages-api-tu...dev.js gzip 44.2 kB 44.2 kB
pages-api-tu..prod.js gzip 33.7 kB 33.7 kB
pages-api.ru...dev.js gzip 44.2 kB 44.2 kB
pages-api.ru..prod.js gzip 33.7 kB 33.7 kB
pages-turbo....dev.js gzip 53.7 kB 53.7 kB
pages-turbo...prod.js gzip 39.4 kB 39.4 kB
pages.runtim...dev.js gzip 53.6 kB 53.6 kB
pages.runtim..prod.js gzip 39.4 kB 39.4 kB
server.runti..prod.js gzip 63.1 kB 63.1 kB
Total 3.08 MB 3.08 MB
📎 Tarball URL
https://vercel-packages.vercel.app/next/commits/e400faf9f16229767b4c3bb92c83cf098b89e711/next

Commit: e400faf

@github-actions
Contributor

github-actions Bot commented Apr 26, 2026

Tests Passed

Commit: e400faf

@lukesandberg lukesandberg changed the title from "[turbopack-trace-server] optimize SelfTimeTree query+split" to "[turbopack-trace-server] optimize loading" Apr 26, 2026
@lukesandberg lukesandberg marked this pull request as ready for review April 26, 2026 23:01
@lukesandberg lukesandberg requested a review from a team April 26, 2026 23:02
@graphite-app
Contributor

graphite-app Bot commented Apr 27, 2026

Merge activity

  • Apr 27, 3:34 PM UTC: This pull request cannot be added to the Graphite merge queue. Please try rebasing and resubmitting to merge when ready.
  • Apr 27, 3:34 PM UTC: Graphite disabled "merge when ready" on this PR due to: a merge conflict with the target branch; resolve the conflict and try again.
  • Apr 28, 10:46 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Apr 28, 10:46 PM UTC: @lukesandberg merged this pull request with Graphite.

@lukesandberg lukesandberg force-pushed the 04-25-_turbopack-trace-server_inline_1_arg_per_span_via_smallvec branch from d57d362 to 50d5373 Compare April 27, 2026 22:28
@mischnic mischnic requested a review from sokra April 28, 2026 08:42
Comment thread turbopack/crates/turbopack-trace-server/src/store.rs Outdated
Two small wins in self_time_tree.rs:

1. Replace the two-pass min_by_key/max_by_key in distribute_entries
   with a single fold. Only fires on the first split of a node so the
   absolute saving is small, but the change is free.

2. Add a fast path to lookup_range_corrected_time that skips the
   sort-and-sweep when every overlapping interval fully contains the
   query window. This is the common case for short SelfTime events
   queried via SpanRef::corrected_self_time / SpanEventRef::corrected_self_time.
   Pre-reserves the changes vec to avoid amortized RawVec growth on
   the slow path. Also defensively returns ZERO when there are no
   overlapping intervals (the original code would have divided by
   zero, but the calling span's own self-time event is always in the
   tree so this case isn't reachable in practice).
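The single-fold change in the first item can be sketched as follows. This is a minimal illustration, not the code in `distribute_entries`: the entry type (a `(start, end)` pair of `u64` timestamps) and the function names are assumptions made for the example.

```rust
/// Returns `(min_start, max_end)` over all entries in a single pass,
/// or `None` when there are no entries.
fn time_bounds(entries: &[(u64, u64)]) -> Option<(u64, u64)> {
    entries
        .iter()
        .fold(None, |acc: Option<(u64, u64)>, &(start, end)| match acc {
            None => Some((start, end)),
            Some((lo, hi)) => Some((lo.min(start), hi.max(end))),
        })
}

/// The two-pass equivalent, shown only for comparison.
fn time_bounds_two_pass(entries: &[(u64, u64)]) -> Option<(u64, u64)> {
    let lo = entries.iter().map(|&(start, _)| start).min()?;
    let hi = entries.iter().map(|&(_, end)| end).max()?;
    Some((lo, hi))
}

fn main() {
    let entries = [(5, 9), (2, 4), (7, 12)];
    // Both variants agree; the fold just walks the slice once.
    assert_eq!(time_bounds(&entries), time_bounds_two_pass(&entries));
    println!("{:?}", time_bounds(&entries));
}
```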

Tests: 4 new unit tests cover no-overlap, single-interval,
full-containment fast path, and partial overlap.
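The fast path in the second item can be illustrated with a standalone sketch. It assumes that "corrected" time means each instant of the query window is divided by the number of intervals active at that instant; that reading, the flat slice of `(start, end)` pairs, and the function name are assumptions for the example and do not reflect the actual tree-based implementation in self_time_tree.rs.

```rust
/// Concurrency-corrected time inside the half-open window `[start, end)`.
/// Assumption: every instant counts as `1 / (number of active intervals)`.
/// Requires `start <= end`.
fn corrected_time(intervals: &[(u64, u64)], start: u64, end: u64) -> f64 {
    debug_assert!(start <= end);

    // Keep only intervals that intersect the query window.
    let overlapping: Vec<(u64, u64)> = intervals
        .iter()
        .copied()
        .filter(|&(s, e)| s < end && e > start)
        .collect();

    // Defensive case: nothing overlaps, so there is nothing to divide by.
    if overlapping.is_empty() {
        return 0.0;
    }

    // Fast path: every interval fully contains the window, so the
    // concurrency is constant and no sort-and-sweep is needed.
    if overlapping.iter().all(|&(s, e)| s <= start && e >= end) {
        return (end - start) as f64 / overlapping.len() as f64;
    }

    // Slow path: sweep over clamped start/end change points.
    let mut changes: Vec<(u64, i32)> = Vec::with_capacity(overlapping.len() * 2);
    for &(s, e) in &overlapping {
        changes.push((s.max(start), 1));
        changes.push((e.min(end), -1));
    }
    changes.sort_unstable();

    let mut concurrency = 0i32;
    let mut last = start;
    let mut total = 0.0;
    for (t, delta) in changes {
        if concurrency > 0 {
            total += (t - last) as f64 / concurrency as f64;
        }
        concurrency += delta;
        last = t;
    }
    total
}

fn main() {
    // Two intervals both contain the window 10..20: fast path.
    println!("{}", corrected_time(&[(0, 100), (0, 100)], 10, 20));
    // The second interval starts mid-window: slow path.
    println!("{}", corrected_time(&[(0, 100), (15, 100)], 10, 20));
}
```

The four cases named in the Tests line (no overlap, single interval, full containment, partial overlap) map onto the three branches of this sketch, with the single-interval case also taking the fast path when it contains the window.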
Change `SpanEventSelfTime { start, end, ... }` to
`SpanEventSelfTime { start, duration: NonZeroU64, ... }`. The
`NonZeroU64` gives the larger variant a niche, and combined with
`Child`'s existing `NonZeroUsize` the compiler packs the enum without
a separate discriminant byte. Verified by
`const _: () = assert!(size_of::<SpanEvent>() == 32)`.

Saves ~8 bytes per event. With ~10 events/span on average (47M spans
giving ~470M events), that's roughly 3.76 GB (3.5 GiB).

Callers must filter zero-duration self-time events before constructing.
The construction sites in store.rs (`add_self_time` and
`set_total_time`) are updated:

- `add_self_time` now uses `SpanEvent::self_time(start, end)` which
  returns `None` for zero/negative duration (early-returns instead of
  pushing).
- `set_total_time`'s three internal pushes use `if let Some(...)` to
  defensively skip zero-duration events.

`SpanEventSelfTimeRef::end()` now computes `start + duration.get()`
on demand. The redundant zero-duration check in `corrected_self_time()`
is removed since `NonZeroU64` guarantees the invariant.

Tests: 3 new tests for SpanEvent (size, zero-duration filter, ctor).
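A simplified sketch of the niche technique described in this commit message. The two enums below are hypothetical stand-ins with fewer fields than the real `SpanEvent`, so their sizes will not be 40 and 32 bytes. Enum layout is also not guaranteed by the language, so the sketch prints the sizes rather than asserting exact values.

```rust
use std::mem::size_of;
use std::num::{NonZeroU64, NonZeroUsize};

/// Before: the larger variant has no niche, so a separate tag is needed.
#[allow(dead_code)]
enum WideEvent {
    SelfTime { start: u64, end: u64 },
    Child { index: NonZeroUsize },
}

/// After: `duration` can never be zero, which gives the compiler a spare
/// bit pattern it may use to tell the variants apart.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum PackedEvent {
    SelfTime { start: u64, duration: NonZeroU64 },
    Child { index: NonZeroUsize },
}

impl PackedEvent {
    /// Returns `None` for zero or negative durations, so callers cannot
    /// construct an event that violates the non-zero invariant.
    fn self_time(start: u64, end: u64) -> Option<Self> {
        let duration = NonZeroU64::new(end.checked_sub(start)?)?;
        Some(PackedEvent::SelfTime { start, duration })
    }

    /// The end timestamp is recomputed on demand from start + duration.
    fn end(&self) -> Option<u64> {
        match self {
            PackedEvent::SelfTime { start, duration } => Some(start + duration.get()),
            PackedEvent::Child { .. } => None,
        }
    }
}

fn main() {
    println!(
        "wide: {} bytes, packed: {} bytes",
        size_of::<WideEvent>(),
        size_of::<PackedEvent>()
    );
    assert!(PackedEvent::self_time(10, 10).is_none());
    assert_eq!(PackedEvent::self_time(10, 25).and_then(|e| e.end()), Some(25));
}
```

Pinning the expected size with a `const` assertion, as the commit does, turns any later layout regression into a build failure instead of a silent memory increase.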
Replace `Span::args: Vec<(RcStr, RcStr)>` with
`pub type SpanArgs = SmallVec<[(RcStr, RcStr); 1]>`. Most spans have
0–1 args (typically just a `name` key for `turbo_tasks::function`
spans), and `SmallVec<[T; 1]>` with the workspace's `union` feature is
the same 24 bytes as `Vec` while inlining one entry. Net effect:
zero-arg spans pay no heap allocation; single-arg spans (the common
case) also pay no heap allocation; spans with 2+ args spill to heap as
before.

Backed by `const _: () = assert!(size_of::<SpanArgs>() == 24)` so any
layout regression breaks the build.

Cap intentionally pinned at 1, not 2: bumping it would grow `Span` by
16 bytes (~750 MB at 47M spans) for a marginal additional saving.
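To show the idea of inlining one entry without pulling in a dependency, here is a std-only stand-in. It is not the `smallvec` crate and makes no claim about matching its 24-byte layout; the `InlineOne` type and the use of `(String, String)` in place of `(RcStr, RcStr)` are assumptions for illustration.

```rust
/// Holds zero or one element without touching the heap and only
/// allocates a `Vec` once a second element is pushed.
#[derive(Debug)]
enum InlineOne<T> {
    Empty,
    One(T),
    Many(Vec<T>),
}

impl<T> InlineOne<T> {
    fn new() -> Self {
        InlineOne::Empty
    }

    fn push(&mut self, value: T) {
        // Take the current state out, then put the grown state back.
        *self = match std::mem::replace(self, InlineOne::Empty) {
            InlineOne::Empty => InlineOne::One(value),
            InlineOne::One(first) => InlineOne::Many(vec![first, value]),
            InlineOne::Many(mut items) => {
                items.push(value);
                InlineOne::Many(items)
            }
        };
    }

    fn as_slice(&self) -> &[T] {
        match self {
            InlineOne::Empty => &[],
            InlineOne::One(item) => std::slice::from_ref(item),
            InlineOne::Many(items) => items.as_slice(),
        }
    }

    /// True once the contents have moved to the heap.
    fn spilled(&self) -> bool {
        matches!(self, InlineOne::Many(_))
    }
}

fn main() {
    let mut args: InlineOne<(String, String)> = InlineOne::new();
    args.push(("name".to_string(), "my_function".to_string()));
    println!("len={} spilled={}", args.as_slice().len(), args.spilled());
    args.push(("extra".to_string(), "value".to_string()));
    println!("len={} spilled={}", args.as_slice().len(), args.spilled());
}
```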
Change `LazySortedVec`'s backing storage from `Vec<T>` to
`SmallVec<[T; 1]>`. ~69% of spans have <=1 event (a single self-time
event for leaf spans), so inlining one entry avoids a heap allocation
in this common case. `Deref` now returns `&[T]` so callers iterate
through the slice rather than `&Vec<T>`.

`set_total_time` builds events into a local `Vec<SpanEvent>` and
converts via the existing `From<Vec<T>>` impl on assignment.
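Exposing `&[T]` through `Deref` is what lets the backing storage change without touching callers. The sketch below uses a hypothetical `SortedEvents` wrapper that sorts eagerly on conversion; the real `LazySortedVec` may behave differently, and a plain `Vec` stands in for the `SmallVec` backing store.

```rust
use std::ops::Deref;

/// Hypothetical wrapper: callers only ever see a slice, so the field
/// type could be swapped for another container later.
struct SortedEvents<T> {
    items: Vec<T>,
}

impl<T: Ord> From<Vec<T>> for SortedEvents<T> {
    /// Assumption for this sketch: sort once, at conversion time.
    fn from(mut items: Vec<T>) -> Self {
        items.sort();
        Self { items }
    }
}

impl<T> Deref for SortedEvents<T> {
    type Target = [T];

    fn deref(&self) -> &[T] {
        &self.items
    }
}

fn main() {
    // Build into a local Vec first, then convert on assignment.
    let mut local = Vec::new();
    local.push(30u64);
    local.push(10);
    local.push(20);
    let events: SortedEvents<u64> = local.into();

    // Slice methods are available through Deref.
    for event in events.iter() {
        println!("{event}");
    }
    println!("count = {}", events.len());
}
```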
…ceLock

Replace `Span`'s six `OnceLock<u32|u64>` fields (`max_depth`,
`total_allocations`, `total_deallocations`,
`total_persistent_allocations`, `total_allocation_count`,
`total_span_count`) with a single `OnceLock<SpanTotals>` bundling all
six values. Computation walks the subtree once on first access and
fills every field; subsequent calls — regardless of which getter — hit
the cache.

Trades a small amount of read-side work (always populating all six
fields, even if the caller only wanted one) for a much smaller
per-Span lock count. With ~47M spans this saves on the order of
hundreds of MB of OnceLock overhead.

`SpanRef::max_depth`, `total_allocations`, etc. now read through
`SpanRef::totals()`. Invalidation in `Store::invalidate_outdated_spans`
collapses the per-field `take()` calls into one `span.totals.take()`.
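A minimal sketch of bundling several cached totals behind one `OnceLock`, using only the standard library. The `Span` here is a toy tree with three of the six totals; its fields and method names are assumptions for illustration and are not the trace server's types.

```rust
use std::sync::OnceLock;

/// All lazily computed totals live in one struct behind one lock.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SpanTotals {
    max_depth: u32,
    total_allocations: u64,
    total_span_count: u64,
}

struct Span {
    allocations: u64,
    children: Vec<Span>,
    totals: OnceLock<SpanTotals>,
}

impl Span {
    fn new(allocations: u64, children: Vec<Span>) -> Self {
        Span { allocations, children, totals: OnceLock::new() }
    }

    /// Walks the subtree on first access and fills every field at once.
    fn totals(&self) -> &SpanTotals {
        self.totals.get_or_init(|| {
            let mut totals = SpanTotals {
                max_depth: 0,
                total_allocations: self.allocations,
                total_span_count: 1,
            };
            for child in &self.children {
                let child_totals = child.totals();
                totals.max_depth = totals.max_depth.max(child_totals.max_depth + 1);
                totals.total_allocations += child_totals.total_allocations;
                totals.total_span_count += child_totals.total_span_count;
            }
            totals
        })
    }

    /// Individual getters all read through the shared cache.
    fn max_depth(&self) -> u32 {
        self.totals().max_depth
    }

    fn total_allocations(&self) -> u64 {
        self.totals().total_allocations
    }

    /// Invalidation is a single `take()` instead of one per field.
    fn invalidate(&mut self) {
        let _ = self.totals.take();
    }
}

fn main() {
    let mut root = Span::new(
        10,
        vec![Span::new(5, vec![Span::new(1, vec![])]), Span::new(2, vec![])],
    );
    println!("depth={} allocations={}", root.max_depth(), root.total_allocations());

    // Change this span's own data, then drop its cached totals.
    root.allocations += 100;
    root.invalidate();
    println!("allocations after update={}", root.total_allocations());
}
```

In this toy version only the mutated span is invalidated; a real store would also need to invalidate any ancestors whose totals include it.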
@lukesandberg lukesandberg force-pushed the 04-25-_turbopack-trace-server_inline_1_arg_per_span_via_smallvec branch from 648f228 to e400faf Compare April 28, 2026 21:52
@lukesandberg lukesandberg merged commit 8dee7ac into canary Apr 28, 2026
340 of 342 checks passed
@lukesandberg lukesandberg deleted the 04-25-_turbopack-trace-server_inline_1_arg_per_span_via_smallvec branch April 28, 2026 22:46

Labels

created-by: Turbopack team PRs by the Turbopack team. Turbopack Related to Turbopack with Next.js.

2 participants