
perf: reduce TOML and repeated join allocations #828

Open
He-Pin wants to merge 2 commits into databricks:master from He-Pin:perf/toml-render-fastpath


He-Pin commented May 7, 2026

Motivation

std.manifestTomlEx spends avoidable time and allocation in TOML string/key rendering and table traversal. The go_suite/manifestTomlEx benchmark is sensitive to regex matches, intermediate escaped strings, Seq path copies, repeated sorting, and per-value renderer allocation.

cpp_suite/large_string_join builds std.makeArray(76846, function(_) 'x') and then joins it. That used to allocate a large backing array filled with the same constant value, then walk it again in std.join.
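As a rough illustration of the second case, the sketch below joins a repeated string without ever materializing the backing array. `RepeatedJoin` and `ConstArray` are hypothetical names used only here, not sjsonnet's actual classes, and the ASCII-safe result tagging is not modeled.

```scala
object RepeatedJoin {
  // Hypothetical stand-in for a constant array view: one value plus a length,
  // instead of `length` references to the same value.
  final case class ConstArray(value: String, length: Int)

  // Joins the repeated value with `sep`, sizing the builder exactly once.
  def join(sep: String, arr: ConstArray): String =
    if (arr.length == 0) ""
    else {
      // Compute the size as a Long so an oversized result fails instead of wrapping.
      val size =
        arr.length.toLong * arr.value.length + (arr.length - 1).toLong * sep.length
      require(size <= Int.MaxValue, "joined string too large")
      val sb = new java.lang.StringBuilder(size.toInt)
      sb.append(arr.value)
      var i = 1
      while (i < arr.length) {
        sb.append(sep).append(arr.value)
        i += 1
      }
      sb.toString
    }
}
```

With this shape, `RepeatedJoin.join("", RepeatedJoin.ConstArray("x", 76846))` appends into one pre-sized buffer and never allocates a 76846-element array.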

Modification

  • Write TOML strings and keys directly to the StringWriter.
  • Replace bare-key regex matching with a small ASCII scan.
  • Avoid TOML array-render string concatenations and StringWriter.flush().
  • Use the sorted key cache plus section flags instead of partition plus two sorts.
  • Reuse one TomlRenderer per table level and keep table paths in a mutable buffer.
  • Represent large constant std.makeArray results as a constant array view.
  • Teach string std.join to consume constant/repeated string arrays with exact-size StringBuilder output and ASCII-safe result tagging.
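
The first two items can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code in this PR: `TomlKeys`, `isBareKey`, `writeKey`, and `writeQuoted` are hypothetical names, and the escape set is a simplified reading of TOML basic strings.

```scala
import java.io.StringWriter

object TomlKeys {
  // TOML bare keys are non-empty and contain only ASCII letters, digits, '_' and '-'.
  // A plain character scan replaces the regex match.
  def isBareKey(key: String): Boolean = {
    val n = key.length
    var i = 0
    while (i < n) {
      val c = key.charAt(i)
      val ok =
        (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
          (c >= '0' && c <= '9') || c == '_' || c == '-'
      if (!ok) return false
      i += 1
    }
    n > 0
  }

  // Writes the key straight to the writer, quoting only when required.
  def writeKey(out: StringWriter, key: String): Unit =
    if (isBareKey(key)) out.write(key) else writeQuoted(out, key)

  // Escapes while writing, so no intermediate escaped string is built.
  def writeQuoted(out: StringWriter, s: String): Unit = {
    out.write("\"")
    var i = 0
    while (i < s.length) {
      s.charAt(i) match {
        case '"'  => out.write("\\\"")
        case '\\' => out.write("\\\\")
        case '\n' => out.write("\\n")
        case '\t' => out.write("\\t")
        case '\r' => out.write("\\r")
        case c if c < ' ' || c.toInt == 127 =>
          // Remaining control characters use a \uXXXX escape.
          out.write("\\u")
          out.write(String.format("%04X", Integer.valueOf(c.toInt)))
        case c => out.write(c.toInt)
      }
      i += 1
    }
    out.write("\"")
  }
}
```

For example, `writeKey` emits `name` unchanged but writes `a b` as `"a b"`, in both cases without allocating a temporary string for the escaped form.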

Results

JMH regression cases, upstream/master at 8b67cb1e:

| benchmark | upstream/master | this PR | delta |
| --- | --- | --- | --- |
| go_suite/manifestTomlEx.jsonnet | 0.073 ms/op | 0.054 ms/op | 26.0% faster |
| cpp_suite/large_string_join.jsonnet | 0.625 ms/op | 0.461 ms/op | 26.2% faster |

Native CLI hyperfine against latest source-built jrsonnet master 5b43fa8 (jrsonnet 0.5.0-pre98):

| benchmark | upstream/master | this PR | latest jrsonnet | highlight |
| --- | --- | --- | --- | --- |
| cpp_suite/large_string_join.jsonnet | 5.2 +/- 0.4 ms | 4.7 +/- 1.2 ms | 8.3 +/- 1.2 ms | this PR is 1.77x faster than latest jrsonnet; CLI timings are short/noisy |
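
For readers who want to recheck the delta and speedup figures, the arithmetic is presumably the usual relative-time reduction and time ratio. The helper below is a hypothetical sketch (`BenchDelta` is not part of the benchmark tooling) that reproduces the reported numbers from the mean timings.

```scala
object BenchDelta {
  // Percentage reduction in time per operation: (before - after) / before.
  def percentFaster(before: Double, after: Double): Double =
    (before - after) / before * 100.0

  // Ratio of the slower mean time to the faster mean time.
  def speedup(slower: Double, faster: Double): Double = slower / faster
}
```

`BenchDelta.percentFaster(0.073, 0.054)` comes out at roughly 26.0, and `BenchDelta.speedup(8.3, 4.7)` at roughly 1.77; the latter uses means only and ignores the +/- spread noted in the table.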

Verification

  • git diff --check
  • ./mill -i __.checkFormat
  • ./mill -i 'sjsonnet.jvm[3.3.7].test'
  • ./mill -i 'sjsonnet.js[3.3.7].test'
  • ./mill -i 'sjsonnet.native[3.3.7].test'
  • ./mill -i 'sjsonnet.jvm[2.12.21].compile'
  • ./mill -i 'sjsonnet.jvm[2.13.18].compile'
  • ./mill -i 'sjsonnet.js[2.13.18].compile'
  • ./mill -i 'sjsonnet.native[2.13.18].compile'
  • ./mill -i 'sjsonnet.native[3.3.7].nativeLink'
  • ./mill -i bench.runRegressions bench/resources/go_suite/foldl.jsonnet bench/resources/cpp_suite/bench.04.jsonnet bench/resources/cpp_suite/large_string_join.jsonnet bench/resources/go_suite/manifestTomlEx.jsonnet
  • hyperfine --warmup 10 --min-runs 50

Boundary Checks

  • TOML output order remains non-section keys first, then section tables; both groups stay sorted.
  • TOML section classification still follows visibleKeyNames order to preserve the previous evaluation/trace order.
  • Existing TomlRenderer.escapeKey API is preserved.
  • Constant array views materialize back to the same repeated Eval array when an existing caller asks for asLazyArray.
  • std.join falls back to the existing element-by-element path for non-string constants, mixed arrays, and lazy values.
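
The ordering rule in the first two checks can be pictured with the sketch below: classify keys in visible-key order, sort once, then split the sorted keys by flag. `TomlOrder`, `renderOrder`, and the `Map`-based section test are assumptions made for illustration and do not reflect sjsonnet's real types.

```scala
import scala.collection.mutable

object TomlOrder {
  // Hypothetical classifier: a value counts as a section if it is a Map.
  def isSection(v: Any): Boolean = v match {
    case _: Map[_, _] => true
    case _            => false
  }

  def renderOrder(visibleKeys: Seq[String], lookup: String => Any): Seq[String] = {
    // Classify in visible-key order so values are inspected in their original order.
    val sectionFlag = mutable.HashMap.empty[String, Boolean]
    visibleKeys.foreach(k => sectionFlag(k) = isSection(lookup(k)))
    // One sort; the flags then split the sorted keys into the two output groups.
    val sorted = visibleKeys.sorted
    sorted.filterNot(k => sectionFlag(k)) ++ sorted.filter(k => sectionFlag(k))
  }
}
```

Because both groups are filtered from the same sorted sequence, each stays sorted without a second sort.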

References

  • Head: a2a96ed47bf87ea33fab7cbc6a47642c358b3e75
  • Latest source-built jrsonnet: 5b43fa88b8c43856dd5a2daa9c5c251153c5e14d

He-Pin added 2 commits May 8, 2026 05:30
Motivation:
std.manifestTomlEx spends avoidable time and allocation in TOML string/key rendering and table traversal. The go_suite manifestTomlEx regression benchmark is small but sensitive to regex matches, intermediate escaped strings, Seq path copies, repeated sorting, and per-value renderer allocation.

Modification:
Write TOML strings and keys directly to the StringWriter, replace bare-key regex matching with a small ASCII scan, avoid array-render string concatenations and StringWriter.flush(), use sorted key cache plus section flags instead of partition + sorting, reuse one TomlRenderer per table level, and keep table paths in a mutable buffer while preserving visible-key section classification order.
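
A minimal sketch of the mutable path buffer idea, assuming nested `Map` values stand in for TOML tables and that all keys are bare (no quoting). `TomlPaths` and `headers` are hypothetical names, not the renderer's API.

```scala
import scala.collection.mutable.ArrayBuffer

object TomlPaths {
  // Collects the [a.b.c] headers of nested tables using one shared path buffer,
  // rather than copying an immutable Seq path at every level.
  def headers(table: Map[String, Any]): Seq[String] = {
    val path = ArrayBuffer.empty[String]
    val out = Seq.newBuilder[String]

    def walk(t: Map[String, Any]): Unit =
      t.keys.toSeq.sorted.foreach { k =>
        t(k) match {
          case child: Map[_, _] =>
            path += k // push this table's key
            out += path.mkString("[", ".", "]")
            walk(child.asInstanceOf[Map[String, Any]])
            path.remove(path.length - 1) // pop on the way back up
          case _ => ()
        }
      }

    walk(table)
    out.result()
  }
}
```

The push/pop pair keeps the buffer consistent across recursion, so the only per-table allocation left is the header string itself.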

Result:
bench/resources/go_suite/manifestTomlEx.jsonnet: upstream/master 0.069 ms/op, this change 0.052 ms/op (-24.6%).

Verification:
./mill -i __.checkFormat
git diff --check
./mill -i 'sjsonnet.jvm[3.3.7].test'
./mill -i 'sjsonnet.js[3.3.7].test'
JDK_JAVA_OPTIONS="--enable-native-access=ALL-UNNAMED -Xmx8G -XX:+UseG1GC" ./mill -i 'sjsonnet.native[3.3.7].test'
./mill -i bench.runRegressions bench/resources/go_suite/manifestTomlEx.jsonnet
He-Pin changed the title from "perf: reduce TOML manifestation allocations" to "perf: reduce TOML and repeated join allocations" May 8, 2026