
Conversation

Member

@sokra sokra commented Feb 4, 2026

What?

Adjusts block sizing constants and heuristics in turbo-persistence to improve the balance between small and medium values, reduce block count, and improve access performance.

Changes

  1. MAX_SMALL_VALUE_SIZE: 64 KiB → 4 KiB. Values up to 4 KiB are now stored as small values (packed into shared blocks). Values larger than 4 KiB become medium values with dedicated blocks that can be copied without decompression during compaction.

  2. MAX_SMALL_VALUE_BLOCK_SIZE → MIN_SMALL_VALUE_BLOCK_SIZE: Renamed and changed from a maximum (64 KiB) to a minimum (8 KiB). Small value blocks are now emitted once they accumulate at least 8 KiB, resulting in actual block sizes of 8–12 KiB.

  3. KEY_BLOCK_ENTRY_META_OVERHEAD: Updated from 8 to 20 to reflect the actual worst-case overhead per entry in a key block (type, position, hash, block index, size, position in block).

  4. Block count overflow protection: Added ValueBlockCountTracker to prevent exceeding the u16 block index limit (MAX_VALUE_BLOCK_COUNT = u16::MAX / 2), which accounts for the 50/50 merge-and-split during compaction (a sketch follows after this list).

  5. README: Updated value type documentation with size boundaries and added a trade-off table covering compression, compaction, access cost, and storage overhead.
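
A minimal sketch of how the new constants and the block-count guard could fit together. The constant names, the `MAX_VALUE_BLOCK_COUNT = u16::MAX / 2` bound, and the `is_full` name come from this PR; the struct fields and method signatures below are illustrative assumptions, not the actual turbo-persistence implementation:

```rust
// Sketch only; the real constants and tracker live in turbo-persistence.
pub const MAX_SMALL_VALUE_SIZE: usize = 4 * 1024; // values <= 4 KiB are "small"
pub const MIN_SMALL_VALUE_BLOCK_SIZE: usize = 8 * 1024; // emit a small value block at >= 8 KiB
pub const KEY_BLOCK_ENTRY_META_OVERHEAD: usize = 20; // worst-case per-entry key block overhead

/// Keep block indices within u16 range, halved so that a 50/50 merge-and-split
/// during compaction cannot overflow the index space.
pub const MAX_VALUE_BLOCK_COUNT: u32 = u16::MAX as u32 / 2;

/// Assumed shape of the tracker; only the idea (count blocks, report fullness)
/// is taken from the PR description.
pub struct ValueBlockCountTracker {
    blocks: u32,
    pending_small_bytes: usize,
}

impl ValueBlockCountTracker {
    pub fn new() -> Self {
        Self { blocks: 0, pending_small_bytes: 0 }
    }

    /// Every medium value occupies one dedicated value block.
    pub fn add_medium_value(&mut self) {
        self.blocks += 1;
    }

    /// Small values are packed into shared blocks; count one block whenever the
    /// pending bytes reach the minimum block size (remainder handling omitted).
    pub fn add_small_value(&mut self, size: usize) {
        self.pending_small_bytes += size;
        if self.pending_small_bytes >= MIN_SMALL_VALUE_BLOCK_SIZE {
            self.blocks += 1;
            self.pending_small_bytes = 0;
        }
    }

    /// Signal that the current SST file should be finished before the
    /// u16 block index limit is exceeded.
    pub fn is_full(&self) -> bool {
        self.blocks >= MAX_VALUE_BLOCK_COUNT
    }
}
```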

Value type trade-offs

|                       | Inline        | Small            | Medium                | Blob                             |
| --------------------- | ------------- | ---------------- | --------------------- | -------------------------------- |
| Size                  | ≤ 8 B         | 9 B .. 4 kB      | 4 kB .. 64 MB         | > 64 MB                          |
| Compression unit size | ≤ 16 kB       | 8 kB .. 12 kB    | 4 kB .. 64 MB         | > 64 MB                          |
| Access cost           | none          | decompress ~8 kB | decompress value size | open file, decompress value size |
| Compaction            | re-compressed | re-compressed    | copied compressed     | pointer copied                   |
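
For illustration, a hedged sketch of how a value's size maps onto these categories, using only the boundaries from the table above; the enum and helper are hypothetical and not part of the crate's API:

```rust
/// Hypothetical classification mirroring the size boundaries in the table.
enum ValueKind {
    Inline, // <= 8 B, stored directly in the key block
    Small,  // 9 B .. 4 kB, packed into shared small value blocks
    Medium, // 4 kB .. 64 MB, one dedicated block per value
    Blob,   // > 64 MB, stored in a separate file
}

fn classify(size: usize) -> ValueKind {
    const KIB: usize = 1024;
    const MIB: usize = 1024 * KIB;
    match size {
        0..=8 => ValueKind::Inline,
        s if s <= 4 * KIB => ValueKind::Small,
        s if s <= 64 * MIB => ValueKind::Medium,
        _ => ValueKind::Blob,
    }
}

fn main() {
    // e.g. a 100 kB value would get its own medium value block
    assert!(matches!(classify(100 * 1024), ValueKind::Medium));
}
```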

@nextjs-bot nextjs-bot added created-by: Turbopack team PRs by the Turbopack team. Turbopack Related to Turbopack with Next.js. labels Feb 4, 2026
Member Author

sokra commented Feb 4, 2026

Collaborator

nextjs-bot commented Feb 4, 2026

Tests Passed


codspeed-hq bot commented Feb 4, 2026

Merging this PR will not alter performance

✅ 17 untouched benchmarks
⏩ 3 skipped benchmarks¹


Comparing sokra/db-compaction-heuristic (5da224d) with canary (c76b0fe)

Open in CodSpeed

Footnotes

  1. 3 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Collaborator

nextjs-bot commented Feb 4, 2026

Stats from current PR

✅ No significant changes detected

📊 All Metrics
📖 Metrics Glossary

Dev Server Metrics:

  • Listen = TCP port starts accepting connections
  • First Request = HTTP server returns successful response
  • Cold = Fresh build (no cache)
  • Warm = With cached build artifacts

Build Metrics:

  • Fresh = Clean build (no .next directory)
  • Cached = With existing .next directory

Change Thresholds:

  • Time: Changes < 50ms AND < 10%, OR < 2% are insignificant
  • Size: Changes < 1KB AND < 1% are insignificant
  • All other changes are flagged to catch regressions

⚡ Dev Server

| Metric               | Canary | PR     | Change | Trend |
| -------------------- | ------ | ------ | ------ | ----- |
| Cold (Listen)        | 559ms  | 559ms  |        | ▅█▅▁█ |
| Cold (Ready in log)  | 544ms  | 534ms  |        | ▇▇▇▆▇ |
| Cold (First Request) | 1.036s | 1.027s |        | ▇██▇▇ |
| Warm (Listen)        | 558ms  | 559ms  |        | ▁█▁▁▁ |
| Warm (Ready in log)  | 542ms  | 543ms  |        | ▃▇▄▁▆ |
| Warm (First Request) | 415ms  | 413ms  |        | ▂▂█▁▂ |

📦 Dev Server (Webpack) (Legacy)

| Metric               | Canary | PR     | Change | Trend |
| -------------------- | ------ | ------ | ------ | ----- |
| Cold (Listen)        | 456ms  | 456ms  |        | ▁▁▁█▁ |
| Cold (Ready in log)  | 435ms  | 435ms  |        | ▃▁▂█▂ |
| Cold (First Request) | 1.834s | 1.816s |        | ▁▁▁█▁ |
| Warm (Listen)        | 455ms  | 456ms  |        | ▁▁▁█▁ |
| Warm (Ready in log)  | 436ms  | 436ms  |        | ▂▁▂█▁ |
| Warm (First Request) | 1.857s | 1.842s |        | ▁▁▁█▁ |

⚡ Production Builds

| Metric       | Canary | PR     | Change | Trend |
| ------------ | ------ | ------ | ------ | ----- |
| Fresh Build  | 4.910s | 4.938s |        | ▄▃▄▂▂ |
| Cached Build | 4.912s | 4.814s |        | ▂▁▂▁▁ |

📦 Production Builds (Webpack) (Legacy)

| Metric            | Canary  | PR      | Change | Trend |
| ----------------- | ------- | ------- | ------ | ----- |
| Fresh Build       | 13.773s | 13.772s |        | ▁▁▁█▁ |
| Cached Build      | 13.868s | 13.894s |        | ▁▁▁█▁ |
| node_modules Size | 467 MB  | 467 MB  |        | ▁▁▁▁▁ |

📦 Bundle Sizes


⚡ Turbopack

Client

Main Bundles: **437 kB** → **437 kB** ⚠️ +11 B

81 files with content-based hashes (individual files not comparable between builds)

Server

Middleware

|                            | Canary | PR    | Change  |
| -------------------------- | ------ | ----- | ------- |
| middleware-b..fest.js gzip | 756 B  | 759 B |         |
| Total                      | 756 B  | 759 B | ⚠️ +3 B |

Build Details
Build Manifests

|                        | Canary | PR    | Change  |
| ---------------------- | ------ | ----- | ------- |
| _buildManifest.js gzip | 451 B  | 452 B |         |
| Total                  | 451 B  | 452 B | ⚠️ +1 B |

📦 Webpack

Client

Main Bundles

|                        | Canary  | PR      | Change    |
| ---------------------- | ------- | ------- | --------- |
| 5528-HASH.js gzip      | 5.47 kB | N/A     | -         |
| 6280-HASH.js gzip      | 57 kB   | N/A     | -         |
| 6335.HASH.js gzip      | 169 B   | N/A     | -         |
| 912-HASH.js gzip       | 4.53 kB | N/A     | -         |
| e8aec2e4-HASH.js gzip  | 62.5 kB | N/A     | -         |
| framework-HASH.js gzip | 59.7 kB | 59.7 kB |           |
| main-app-HASH.js gzip  | 255 B   | 253 B   |           |
| main-HASH.js gzip      | 39.1 kB | 39.1 kB |           |
| webpack-HASH.js gzip   | 1.68 kB | 1.68 kB |           |
| 262-HASH.js gzip       | N/A     | 4.53 kB | -         |
| 2889.HASH.js gzip      | N/A     | 169 B   | -         |
| 5602-HASH.js gzip      | N/A     | 5.49 kB | -         |
| 6948ada0-HASH.js gzip  | N/A     | 62.5 kB | -         |
| 9544-HASH.js gzip      | N/A     | 57.6 kB | -         |
| Total                  | 230 kB  | 231 kB  | ⚠️ +612 B |

Polyfills

|                        | Canary  | PR      | Change |
| ---------------------- | ------- | ------- | ------ |
| polyfills-HASH.js gzip | 39.4 kB | 39.4 kB |        |
| Total                  | 39.4 kB | 39.4 kB |        |

Pages

|                            | Canary  | PR      | Change       |
| -------------------------- | ------- | ------- | ------------ |
| _app-HASH.js gzip          | 194 B   | 194 B   |              |
| _error-HASH.js gzip        | 183 B   | 180 B   | 🟢 3 B (-2%) |
| css-HASH.js gzip           | 331 B   | 330 B   |              |
| dynamic-HASH.js gzip       | 1.81 kB | 1.81 kB |              |
| edge-ssr-HASH.js gzip      | 256 B   | 256 B   |              |
| head-HASH.js gzip          | 351 B   | 352 B   |              |
| hooks-HASH.js gzip         | 384 B   | 383 B   |              |
| image-HASH.js gzip         | 580 B   | 581 B   |              |
| index-HASH.js gzip         | 260 B   | 260 B   |              |
| link-HASH.js gzip          | 2.49 kB | 2.49 kB |              |
| routerDirect..HASH.js gzip | 320 B   | 319 B   |              |
| script-HASH.js gzip        | 386 B   | 386 B   |              |
| withRouter-HASH.js gzip    | 315 B   | 315 B   |              |
| 1afbb74e6ecf..834.css gzip | 106 B   | 106 B   |              |
| Total                      | 7.97 kB | 7.97 kB | ✅ -1 B      |

Server

Edge SSR

|                  | Canary | PR     | Change    |
| ---------------- | ------ | ------ | --------- |
| edge-ssr.js gzip | 126 kB | 126 kB |           |
| page.js gzip     | 249 kB | 249 kB |           |
| Total            | 375 kB | 376 kB | ⚠️ +446 B |

Middleware

|                            | Canary  | PR      | Change    |
| -------------------------- | ------- | ------- | --------- |
| middleware-b..fest.js gzip | 615 B   | 614 B   |           |
| middleware-r..fest.js gzip | 156 B   | 155 B   |           |
| middleware.js gzip         | 33.1 kB | 33.2 kB |           |
| edge-runtime..pack.js gzip | 842 B   | 842 B   |           |
| Total                      | 34.7 kB | 34.8 kB | ⚠️ +142 B |

Build Details
Build Manifests

|                        | Canary | PR    | Change  |
| ---------------------- | ------ | ----- | ------- |
| _buildManifest.js gzip | 733 B  | 735 B |         |
| Total                  | 733 B  | 735 B | ⚠️ +2 B |

Build Cache

|                     | Canary  | PR      | Change            |
| ------------------- | ------- | ------- | ----------------- |
| 0.pack gzip         | 3.84 MB | 3.85 MB | 🔴 +8.26 kB (+0%) |
| index.pack gzip     | 103 kB  | 103 kB  |                   |
| index.pack.old gzip | 104 kB  | 103 kB  | 🟢 1.23 kB (-1%)  |
| Total               | 4.05 MB | 4.05 MB | ⚠️ +7.54 kB       |

🔄 Shared (bundler-independent)

Runtimes

|                            | Canary  | PR      | Change |
| -------------------------- | ------- | ------- | ------ |
| app-page-exp...dev.js gzip | 315 kB  | 315 kB  |        |
| app-page-exp..prod.js gzip | 167 kB  | 167 kB  |        |
| app-page-tur...dev.js gzip | 315 kB  | 315 kB  |        |
| app-page-tur..prod.js gzip | 167 kB  | 167 kB  |        |
| app-page-tur...dev.js gzip | 312 kB  | 312 kB  |        |
| app-page-tur..prod.js gzip | 166 kB  | 166 kB  |        |
| app-page.run...dev.js gzip | 312 kB  | 312 kB  |        |
| app-page.run..prod.js gzip | 166 kB  | 166 kB  |        |
| app-route-ex...dev.js gzip | 70.5 kB | 70.5 kB |        |
| app-route-ex..prod.js gzip | 49 kB   | 49 kB   |        |
| app-route-tu...dev.js gzip | 70.5 kB | 70.5 kB |        |
| app-route-tu..prod.js gzip | 49 kB   | 49 kB   |        |
| app-route-tu...dev.js gzip | 70.1 kB | 70.1 kB |        |
| app-route-tu..prod.js gzip | 48.8 kB | 48.8 kB |        |
| app-route.ru...dev.js gzip | 70.1 kB | 70.1 kB |        |
| app-route.ru..prod.js gzip | 48.7 kB | 48.7 kB |        |
| dist_client_...dev.js gzip | 324 B   | 324 B   |        |
| dist_client_...dev.js gzip | 326 B   | 326 B   |        |
| dist_client_...dev.js gzip | 318 B   | 318 B   |        |
| dist_client_...dev.js gzip | 317 B   | 317 B   |        |
| pages-api-tu...dev.js gzip | 43.2 kB | 43.2 kB |        |
| pages-api-tu..prod.js gzip | 32.9 kB | 32.9 kB |        |
| pages-api.ru...dev.js gzip | 43.2 kB | 43.2 kB |        |
| pages-api.ru..prod.js gzip | 32.8 kB | 32.8 kB |        |
| pages-turbo....dev.js gzip | 52.5 kB | 52.5 kB |        |
| pages-turbo...prod.js gzip | 39.4 kB | 39.4 kB |        |
| pages.runtim...dev.js gzip | 52.5 kB | 52.5 kB |        |
| pages.runtim..prod.js gzip | 39.4 kB | 39.4 kB |        |
| server.runti..prod.js gzip | 62.7 kB | 62.7 kB |        |
| Total                      | 2.8 MB  | 2.8 MB  |        |

@sokra sokra force-pushed the sokra/db-compaction-heuristic branch 2 times, most recently from 9fc1be9 to 7f428b3 Compare February 5, 2026 18:01
@sokra sokra force-pushed the sokra/db-compaction-heuristic branch 2 times, most recently from 265b4f7 to e44a55f Compare February 5, 2026 22:27
@sokra sokra changed the base branch from sokra/db-bench to graphite-base/89497 February 5, 2026 23:44
@sokra sokra force-pushed the sokra/db-compaction-heuristic branch from e44a55f to cf0ebe2 Compare February 5, 2026 23:44
@sokra sokra changed the base branch from graphite-base/89497 to sokra/remove-amqf-cache February 5, 2026 23:44
@sokra sokra force-pushed the sokra/db-compaction-heuristic branch from cf0ebe2 to dbc4dee Compare February 6, 2026 00:01
@sokra sokra force-pushed the sokra/remove-amqf-cache branch 2 times, most recently from bf8ee6c to 3a076c1 Compare February 6, 2026 09:50
@sokra sokra force-pushed the sokra/db-compaction-heuristic branch from dbc4dee to 008301a Compare February 6, 2026 09:50
@sokra sokra force-pushed the sokra/db-compaction-heuristic branch 2 times, most recently from 22404fd to d6784c9 Compare February 6, 2026 10:41
@sokra sokra force-pushed the sokra/remove-amqf-cache branch 2 times, most recently from c69fc0d to 7703abd Compare February 6, 2026 15:34
@sokra sokra force-pushed the sokra/db-compaction-heuristic branch from d6784c9 to 5821f7d Compare February 6, 2026 15:34
Contributor

I don't understand the PR description. This adjusts the mix of medium and small values, not the size of SST files? Or, if it does, is that an indirect effect of changing the block overheads?

@sokra sokra force-pushed the sokra/remove-amqf-cache branch 2 times, most recently from 6004949 to 7d03ae9 Compare February 8, 2026 08:10
@sokra sokra force-pushed the sokra/db-compaction-heuristic branch from 5821f7d to 4c141f6 Compare February 8, 2026 08:10
@sokra sokra marked this pull request as ready for review February 8, 2026 08:19
@sokra sokra changed the base branch from sokra/remove-amqf-cache to graphite-base/89497 February 8, 2026 08:44
@sokra sokra force-pushed the sokra/db-compaction-heuristic branch from 4c141f6 to 13e142c Compare February 8, 2026 08:45
@sokra sokra force-pushed the graphite-base/89497 branch from 7d03ae9 to fc13ca9 Compare February 8, 2026 08:45
@graphite-app graphite-app bot changed the base branch from graphite-base/89497 to canary February 8, 2026 08:46
@sokra sokra force-pushed the sokra/db-compaction-heuristic branch from 13e142c to 26dd63f Compare February 8, 2026 08:46
Contributor

@lukesandberg lukesandberg left a comment


This will greatly increase the number of small value blocks (and medium blocks, and blocks overall).

Each block has a 4 byte overhead from the SST file level 'directory', plus of course the 4 byte decompressed-size header. I'm guessing this will greatly increase the number of blocks.

I see the benefit of medium values during compaction, but couldn't we leverage the same optimization for small values (remapping block indices when merging)?

Also, the statement in the PR description ("Small values benefit from better compression by being merged together in blocks, avoiding the need for a compression dictionary.") is only partially true: we only use compression dictionaries for key blocks, not value blocks.

@sokra sokra force-pushed the sokra/db-compaction-heuristic branch from 26dd63f to 324e005 Compare February 9, 2026 10:42
…ck count

Track value block count during collection and compaction to prevent
exceeding the u16 block index limit in SST files. Adds a
ValueBlockCountTracker that monitors medium values (1 block each) and
small value block packing, triggering is_full when approaching the limit.
Member Author

sokra commented Feb 10, 2026

Small values vs medium values is basically a trade-off.

| What?            | Inlined                      | Small value                                        | Medium value                                | Blob value                                |
| ---------------- | ---------------------------- | -------------------------------------------------- | ------------------------------------------- | ----------------------------------------- |
| Compression size | <= 16 kB (MAX_KEY_BLOCK_SIZE) | <= 4 kB, before 64 kB (MAX_SMALL_VALUE_BLOCK_SIZE) | > 1 kB, before 64 kB (MAX_SMALL_VALUE_SIZE) | > 64 MB (MAX_MEDIUM_VALUE_SIZE)           |
| Compaction       | re-compressed                | re-compressed                                      | copied compressed                           | pointer copied                            |
| Access cost      | no extra overhead            | uncompress 4 kB (before 64 kB)                     | uncompress value size                       | open separate file, uncompress value size |
| Storage overhead | 0                            | 8 + value size / 4 kB * 8                          | 2 + 8                                       | 4 + 4                                     |

Access cost

The idea of my change was to make accessing small values cheaper: before, every access to a small value needed to decompress 64 KiB before it could read the value. Now it only needs to decompress 4 KiB. Since decompression dominated the access time, this is a big improvement.

Storage overhead

This change also results in more blocks. The storage overhead of every small value increased from 8 + value size / 64 kB * 8 to 8 + value size / 4 kB * 8 (8 bytes in the key block plus 8 bytes in the block table, where the block table entry is amortized over 4 KiB of small values). The worst case (a maximum-size small value of 1024 bytes) would be 8 + 2 = 10 bytes of overhead, which is about 1% of the value. Smaller values have a bigger percentage of overhead, but that was true before:

| size (bytes) | old overhead (bytes) | new overhead (bytes) |
| -----------: | -------------------- | -------------------- |
| 1            | 0 (inlined)          | 0 (inlined)          |
| 2            | 0 (inlined)          | 0 (inlined)          |
| 4            | 0 (inlined)          | 0 (inlined)          |
| 8            | 0 (inlined)          | 0 (inlined)          |
| 16           | 8.0020 (50.01%)      | 8.0313 (50.20%)      |
| 32           | 8.0039 (25.01%)      | 8.0625 (25.20%)      |
| 64           | 8.0078 (12.51%)      | 8.1250 (12.70%)      |
| 128          | 8.0156 (6.26%)       | 8.2500 (6.45%)       |
| 256          | 8.0313 (3.14%)       | 8.5000 (3.32%)       |
| 512          | 8.0625 (1.57%)       | 9.0000 (1.76%)       |
| 1024         | 8.1250 (0.79%)       | 10.0000 (0.98%)      |
| 2048         | 8.2500 (0.40%)       | 10.0000 (0.49%)      |
| 4096         | 8.5000 (0.21%)       | 10.0000 (0.24%)      |
| 8192         | 9.0000 (0.11%)       | 10.0000 (0.12%)      |
| 16384        | 10.0000 (0.06%)      | 10.0000 (0.06%)      |
| 32768        | 12.0000 (0.04%)      | 10.0000 (0.03%)      |
| 65536        | 16.0000 (0.02%)      | 10.0000 (0.02%)      |
| 131072       | 10.0000 (0.01%)      | 10.0000 (0.01%)      |

So the storage cost difference is very small according to this formula. It's a bit inefficient to use small values that are larger than 1/4 of the small value block size.
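
As a sanity check of that formula, here is a small illustrative helper that reproduces a few of the small-value rows above (sizes above MAX_SMALL_VALUE_SIZE become medium values with a flat 2 + 8 = 10 byte overhead instead, so they are not covered by this function; nothing here is real crate API):

```rust
/// Estimated storage overhead in bytes for a small value of `size` bytes:
/// 8 bytes of key block metadata plus an 8-byte block table entry amortized
/// over `block_size` bytes of packed small values.
fn small_value_overhead(size: f64, block_size: f64) -> f64 {
    8.0 + size / block_size * 8.0
}

fn main() {
    for size in [16.0, 512.0, 1024.0] {
        let old = small_value_overhead(size, 64.0 * 1024.0); // old 64 kB small value blocks
        let new = small_value_overhead(size, 4.0 * 1024.0); // 4 kB small value blocks
        println!(
            "{:>6}: old {:>7.4} ({:.2}%)   new {:>7.4} ({:.2}%)",
            size,
            old,
            old / size * 100.0,
            new,
            new / size * 100.0
        );
    }
}
```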

But we have to take into account that the 8 bytes in the key block are compressed (though Luke figured out that key blocks are usually not compressed anyway), while the 8 bytes in the block table are not compressed. Maybe we should compress the block table and decompress it into memory when we open the SST file?

A problem we are running into with the increased block count is the u16::MAX block index limit. I added some code to prevent exceeding it, but that code makes SST files smaller when hitting the limit, which is probably not what we want here...

Compression Size

By decreasing the block size (for both small and medium values), we reduce our compression ratio: compression is more efficient when more data is compressed at once. For smaller blocks it's recommended to use a compression dictionary. Once the compressed unit is larger than 4 kB, a compression dictionary isn't really needed anymore, as there is enough data in the block itself to compress well.

My change caused some (medium value) blocks to be between 1kB and 4kB, which isn't optimal.

I think we could increase MAX_SMALL_VALUE_SIZE to 4kB to address that.
MAX_SMALL_VALUE_BLOCK_SIZE must be at least 2x that to avoid very small small value blocks due to fragmentation, but I think we could rename it to MIN_SMALL_VALUE_BLOCK_SIZE and only emit a small value block once that size is reached.
That would make every small value block at least that size. In rare cases the block size could reach up to 2 * MIN_SMALL_VALUE_BLOCK_SIZE.
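
A hedged sketch of that semantic change; the struct and method names are hypothetical, and only the two flush conditions illustrate the old "don't exceed this size" vs. the proposed "emit once this size is reached" behavior:

```rust
// Hypothetical writer state; the real code in turbo-persistence will differ.
struct SmallValueBlockWriter {
    buffered_bytes: usize,
    max_block_size: usize, // old tunable: upper bound on a block
    min_block_size: usize, // proposed tunable: lower bound before emitting
}

impl SmallValueBlockWriter {
    /// Old semantics: flush before adding a value that would push the block
    /// over the maximum, so emitted blocks can end up far smaller than the limit.
    fn should_flush_before(&self, next_value_len: usize) -> bool {
        self.buffered_bytes + next_value_len > self.max_block_size
    }

    /// Proposed semantics: flush only once the buffer has accumulated at least
    /// the minimum, so every emitted block is at least min_block_size and at
    /// most min_block_size plus one maximum-size small value.
    fn should_flush_after(&self) -> bool {
        self.buffered_bytes >= self.min_block_size
    }
}

fn main() {
    let w = SmallValueBlockWriter {
        buffered_bytes: 9 * 1024,
        max_block_size: 4 * 1024,
        min_block_size: 8 * 1024,
    };
    assert!(w.should_flush_before(1));
    assert!(w.should_flush_after());
}
```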

Summary

Given that we are hitting the block count limit and that larger blocks compress better, I think it makes sense to increase MIN_SMALL_VALUE_BLOCK_SIZE to 8kB. That cuts the block count in half but doubles the decompression cost per access, which sounds acceptable.

This would be the updated table:

| size (bytes) | old overhead (bytes) | new overhead (bytes) |
| -----------: | -------------------- | -------------------- |
| 1            | 0 (inlined)          | 0 (inlined)          |
| 2            | 0 (inlined)          | 0 (inlined)          |
| 4            | 0 (inlined)          | 0 (inlined)          |
| 8            | 0 (inlined)          | 0 (inlined)          |
| 16           | 8.0020 (50.01%)      | 8.0156 (50.10%)      |
| 32           | 8.0039 (25.01%)      | 8.0313 (25.10%)      |
| 64           | 8.0078 (12.51%)      | 8.0625 (12.60%)      |
| 128          | 8.0156 (6.26%)       | 8.1250 (6.35%)       |
| 256          | 8.0313 (3.14%)       | 8.2500 (3.22%)       |
| 512          | 8.0625 (1.57%)       | 8.5000 (1.66%)       |
| 1024         | 8.1250 (0.79%)       | 9.0000 (0.88%)       |
| 2048         | 8.2500 (0.40%)       | 10.0000 (0.49%)      |
| 4096         | 8.5000 (0.21%)       | 10.0000 (0.24%)      |
| 8192         | 9.0000 (0.11%)       | 10.0000 (0.12%)      |
| 16384        | 10.0000 (0.06%)      | 10.0000 (0.06%)      |
| 32768        | 12.0000 (0.04%)      | 10.0000 (0.03%)      |
| 65536        | 16.0000 (0.02%)      | 10.0000 (0.02%)      |

…IN_SMALL_VALUE_BLOCK_SIZE of 8kB

Increase MAX_SMALL_VALUE_SIZE from 1kB to 4kB so more values are packed
into shared blocks instead of getting dedicated medium value blocks. This
significantly reduces block count, avoiding the u16::MAX block limit.

Rename MAX_SMALL_VALUE_BLOCK_SIZE to MIN_SMALL_VALUE_BLOCK_SIZE (8kB)
and change semantics from "don't exceed this size" to "emit block once
this size is reached". This halves the block count compared to the
previous 4kB setting while keeping access cost acceptable (~8-12kB
decompression per lookup).

Update README with value type trade-off documentation.
…e size boundaries

Update outdated comments referencing old MAX_SMALL_VALUE_SIZE (1024)
to reflect the new value (4096). Expand batch_get_different_sizes test
to cover all value types: empty, inline, small, medium, larger, and blob.
@sokra sokra force-pushed the sokra/db-compaction-heuristic branch from 324e005 to 5da224d Compare February 10, 2026 10:08
@sokra sokra requested review from bgw and lukesandberg February 10, 2026 10:16
@sokra sokra merged commit 384cb2d into canary Feb 11, 2026
165 checks passed
@sokra sokra deleted the sokra/db-compaction-heuristic branch February 11, 2026 07:25