perf: reduce TOML and repeated join allocations by He-Pin · Pull Request #828 · databricks/sjsonnet

He-Pin · 2026-05-07T21:31:21Z

Motivation

std.manifestTomlEx spends avoidable time and allocation in TOML string/key rendering and table traversal. The go_suite/manifestTomlEx benchmark is sensitive to regex matches, intermediate escaped strings, Seq path copies, repeated sorting, and per-value renderer allocation.

cpp_suite/large_string_join builds std.makeArray(76846, function(_) 'x') and then joins it. That used to allocate a large backing array filled with the same constant value, then walk it again in std.join.

Modification

Write TOML strings and keys directly to the StringWriter.
Replace bare-key regex matching with a small ASCII scan.
Avoid TOML array-render string concatenations and StringWriter.flush().
Use the sorted key cache plus section flags instead of partition plus two sorts.
Reuse one TomlRenderer per table level and keep table paths in a mutable buffer.
Represent large constant std.makeArray results as a constant array view.
Teach string std.join to consume constant/repeated string arrays with exact-size StringBuilder output and ASCII-safe result tagging.
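
As a rough illustration of the first two items, the sketch below replaces a regex test for TOML bare keys with a character scan and writes the key straight into an output buffer, with no intermediate escaped string. It is written in Python for brevity and is not the sjsonnet code: `is_bare_key`, `write_key`, and the simplified escape set are hypothetical, and `io.StringIO` stands in for the JVM `StringWriter`.

```python
import io
import string

# TOML bare keys may contain only ASCII letters, digits, '_' and '-'.
_BARE_KEY_CHARS = frozenset(string.ascii_letters + string.digits + "_-")


def is_bare_key(key: str) -> bool:
    """Return True if `key` is non-empty and every character is bare-key safe."""
    if not key:
        return False
    for ch in key:
        if ch not in _BARE_KEY_CHARS:
            return False
    return True


def write_key(out: io.StringIO, key: str) -> None:
    """Write `key` into `out`, quoting and escaping only when required.

    Assumption: a simplified escape set, not sjsonnet's exact output.
    """
    if is_bare_key(key):
        out.write(key)
        return
    out.write('"')
    for ch in key:
        if ch == '"':
            out.write('\\"')
        elif ch == "\\":
            out.write("\\\\")
        elif ch == "\n":
            out.write("\\n")
        elif ch == "\t":
            out.write("\\t")
        elif ord(ch) < 0x20 or ch == "\x7f":
            # Remaining control characters use the \uXXXX form.
            out.write("\\u%04X" % ord(ch))
        else:
            out.write(ch)
    out.write('"')
```

The scan exits on the first character outside the bare-key set, so common keys avoid the regex engine entirely, and quoted keys are emitted one character at a time into the shared buffer.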

Results

JMH regression cases, upstream/master at 8b67cb1e:

| benchmark | upstream/master | this PR | delta |
| --- | --- | --- | --- |
| `go_suite/manifestTomlEx.jsonnet` | 0.073 ms/op | 0.054 ms/op | 26.0% faster |
| `cpp_suite/large_string_join.jsonnet` | 0.625 ms/op | 0.461 ms/op | 26.2% faster |

Native CLI hyperfine against latest source-built jrsonnet master 5b43fa8 (jrsonnet 0.5.0-pre98):

| benchmark | upstream/master | this PR | latest jrsonnet | highlight |
| --- | --- | --- | --- | --- |
| `cpp_suite/large_string_join.jsonnet` | 5.2 +/- 0.4 ms | 4.7 +/- 1.2 ms | 8.3 +/- 1.2 ms | this PR is 1.77x faster than latest jrsonnet; CLI timings are short/noisy |

Verification

git diff --check
./mill -i __.checkFormat
./mill -i 'sjsonnet.jvm[3.3.7].test'
./mill -i 'sjsonnet.js[3.3.7].test'
./mill -i 'sjsonnet.native[3.3.7].test'
./mill -i 'sjsonnet.jvm[2.12.21].compile'
./mill -i 'sjsonnet.jvm[2.13.18].compile'
./mill -i 'sjsonnet.js[2.13.18].compile'
./mill -i 'sjsonnet.native[2.13.18].compile'
./mill -i 'sjsonnet.native[3.3.7].nativeLink'
./mill -i bench.runRegressions bench/resources/go_suite/foldl.jsonnet bench/resources/cpp_suite/bench.04.jsonnet bench/resources/cpp_suite/large_string_join.jsonnet bench/resources/go_suite/manifestTomlEx.jsonnet
hyperfine --warmup 10 --min-runs 50

Boundary Checks

TOML output order remains non-section keys first, then section tables; both groups stay sorted.
TOML section classification still follows visibleKeyNames order to preserve the previous evaluation/trace order.
Existing TomlRenderer.escapeKey API is preserved.
Constant array views materialize back to the same repeated Eval array when an existing caller asks for asLazyArray.
std.join falls back to the existing element-by-element path for non-string constants, mixed arrays, and lazy values.
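
The last two checks can be pictured with a small sketch of a constant array view and a join that special-cases it. This is a Python illustration under stated assumptions, not the sjsonnet implementation: `ConstArrayView`, `materialize`, and `join` are hypothetical names, the fallback's type check is a simplification of `std.join`, and the ASCII-safe result tagging is not modeled.

```python
from collections.abc import Sequence


class ConstArrayView(Sequence):
    """Read-only view of `length` copies of `value`, using O(1) storage."""

    def __init__(self, value, length):
        self.value = value
        self._length = length

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("ConstArrayView supports integer indices only")
        if index < 0:
            index += self._length
        if not 0 <= index < self._length:
            raise IndexError(index)
        return self.value

    def materialize(self):
        """Expand to an ordinary list for callers that need a real array."""
        return [self.value] * self._length


def join(sep, items):
    """Join strings with `sep`, with a fast path for constant string views."""
    if isinstance(items, ConstArrayView) and isinstance(items.value, str):
        n = len(items)
        if n == 0:
            return ""
        # The final size is known up front:
        # n * len(value) + (n - 1) * len(sep).
        return (items.value + sep) * (n - 1) + items.value
    # Fallback: generic element-by-element path for everything else.
    parts = []
    for item in items:
        if not isinstance(item, str):
            raise TypeError("join expects string elements")
        parts.append(item)
    return sep.join(parts)
```

For example, `join(",", ConstArrayView("ab", 3))` takes the fast path, while `join("-", ["a", "b", "c"])` and a view over a non-string constant go through the generic loop.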

References

Head: a2a96ed47bf87ea33fab7cbc6a47642c358b3e75
Latest source-built jrsonnet: 5b43fa88b8c43856dd5a2daa9c5c251153c5e14d

Motivation: std.manifestTomlEx spends avoidable time and allocation in TOML string/key rendering and table traversal. The go_suite manifestTomlEx regression benchmark is small but sensitive to regex matches, intermediate escaped strings, Seq path copies, repeated sorting, and per-value renderer allocation.

Modification: Write TOML strings and keys directly to the StringWriter, replace bare-key regex matching with a small ASCII scan, avoid array-render string concatenations and StringWriter.flush(), use sorted key cache plus section flags instead of partition + sorting, reuse one TomlRenderer per table level, and keep table paths in a mutable buffer while preserving visible-key section classification order.

Result: bench/resources/go_suite/manifestTomlEx.jsonnet: upstream/master 0.069 ms/op, this change 0.052 ms/op (-24.6%).

Verification:
./mill -i __.checkFormat
git diff --check
./mill -i 'sjsonnet.jvm[3.3.7].test'
./mill -i 'sjsonnet.js[3.3.7].test'
JDK_JAVA_OPTIONS="--enable-native-access=ALL-UNNAMED -Xmx8G -XX:+UseG1GC" ./mill -i 'sjsonnet.native[3.3.7].test'
./mill -i bench.runRegressions bench/resources/go_suite/manifestTomlEx.jsonnet

He-Pin added 2 commits May 8, 2026 05:30

perf: speed up constant string joins

a2a96ed

He-Pin changed the title ~~perf: reduce TOML manifestation allocations~~ perf: reduce TOML and repeated join allocations May 8, 2026

He-Pin mentioned this pull request May 8, 2026

perf: speed up constant string joins #825

Open

He-Pin wants to merge 2 commits into databricks:master from He-Pin:perf/toml-render-fastpath
