
Improve stringification performance#106

Merged: DecimalTurn merged 18 commits into latest from perf-stringify on Feb 14, 2026

Conversation

@DecimalTurn
Owner

Performance Improvements for stringification process:

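  • Replaced the generic traverse function with inline switch-based traversal in applyWrites (writer.ts) and toTOML (to-toml.ts).
  • Introduced a dirty_roots WeakSet to track AST roots with pending modifications, so applyWrites can return early when nothing has changed.
  • Minor refactor in walkInlineTable to iterate the generator directly instead of spreading it into an array.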

Add early return for single-line nodes in write() to skip the
regex split + filter that allocates two temporary arrays per call.
The vast majority of AST nodes are single-line, so this avoids
unnecessary allocations on the hot path.
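
To illustrate the early return described above, here is a minimal sketch, not the library's actual write(). The names `writeAt`, `putLine`, `lineAt`, and `Pos` are hypothetical, the multi-line path restarts at column 0 for simplicity, and the filter step from the original is omitted.

```typescript
// Hypothetical shapes; the real writer's types are not shown in this PR text.
type Pos = { line: number; column: number }; // line is 1-based, column is 0-based

// Always returns a string, creating empty lines as needed.
function lineAt(lines: string[], line: number): string {
  for (let i = lines.length; i < line; i++) lines.push("");
  return lines[line - 1];
}

// Overwrites `text` into the given line at `column`, padding with spaces.
function putLine(lines: string[], line: number, column: number, text: string): void {
  const current = lineAt(lines, line).padEnd(column, " ");
  lines[line - 1] = current.slice(0, column) + text + current.slice(column + text.length);
}

function writeAt(lines: string[], start: Pos, raw: string): void {
  // Fast path: no line break, so no split and no temporary arrays.
  if (raw.indexOf("\n") === -1) {
    putLine(lines, start.line, start.column, raw);
    return;
  }
  // Slow path: multi-line content is split and written line by line.
  // Assumption: continuation lines start at column 0.
  const parts = raw.split(/\r?\n/);
  for (let i = 0; i < parts.length; i++) {
    putLine(lines, start.line + i, i === 0 ? start.column : 0, parts[i]);
  }
}
```

The check costs one scan of the string, which is cheaper than the regex split it replaces for the common single-line case.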

Also add STRINGIFY_OPTIMIZATION.md tracking the optimization plan.

Three targeted optimizations to reduce traversal overhead in writer.ts:

1. Dirty tracking for applyWrites — WeakSet<Root> tracks which roots have
   pending offsets. applyWrites returns immediately when clean, skipping
   redundant full-tree traversals between phases.

2. Early return in shiftNode for (0,0) shifts — the first item inserted
   into a container always has zero shift since generated nodes start at
   position (1,0).

3. Fast-path shiftNode for leaf nodes and simple KeyValues — direct
   property access instead of generic traverse() with function call and
   switch dispatch overhead. Covers String, Integer, Float, Boolean,
   DateTime, Key, Comment, and KeyValue-with-leaf-value.
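
Items 2 and 3 above can be sketched roughly as follows. This is an illustration under assumptions, not the code in writer.ts: the node shapes (`AstNode`, `Leaf`, `KeyValueNode`, `InlineArrayNode`) are hypothetical, and the shift simply adds the same deltas to every start and end position.

```typescript
// Hypothetical, simplified node shapes for illustration only.
type Pos = { line: number; column: number };
type Loc = { start: Pos; end: Pos };
type Leaf = {
  type: "String" | "Integer" | "Float" | "Boolean" | "DateTime" | "Key" | "Comment";
  loc: Loc;
};
type InlineArrayNode = { type: "InlineArray"; loc: Loc; items: Leaf[] };
type KeyValueNode = { type: "KeyValue"; loc: Loc; key: Leaf; value: Leaf | InlineArrayNode };
type AstNode = Leaf | KeyValueNode | InlineArrayNode;

function shiftLoc(loc: Loc, lines: number, columns: number): void {
  loc.start.line += lines;
  loc.start.column += columns;
  loc.end.line += lines;
  loc.end.column += columns;
}

function isLeaf(node: AstNode): node is Leaf {
  return node.type !== "KeyValue" && node.type !== "InlineArray";
}

function shiftNode(node: AstNode, lines: number, columns: number): void {
  // Early return: a (0,0) shift moves nothing.
  if (lines === 0 && columns === 0) return;

  shiftLoc(node.loc, lines, columns);

  // Fast path: leaves have no children to visit.
  if (isLeaf(node)) return;

  if (node.type === "KeyValue") {
    shiftLoc(node.key.loc, lines, columns);
    // Fast path: KeyValue with a leaf value needs no recursion.
    if (isLeaf(node.value)) {
      shiftLoc(node.value.loc, lines, columns);
      return;
    }
    shiftNode(node.value, lines, columns);
    return;
  }

  // General path for containers.
  for (const item of node.items) shiftNode(item, lines, columns);
}
```

The point of the fast paths is that the common cases touch a handful of properties directly instead of going through a generic visitor.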

Results on benchmark suite:
- Scalar-heavy tests: +73-85%
- Inline table/array tests: +14-51%
- Spec example: +9% (927 -> 1,011 ops/sec)
- Small doc: +42% (26,871 -> 38,149 ops/sec)
- Overall average: +18%

- Replace generic traverse() visitor dispatch in applyWrites with
  type-specific switch/case inline traversal. Eliminates visitor
  object creation, dynamic property lookup, and enter/exit checks.
  Enables V8 to use monomorphic inline caches instead of megamorphic.

- Replace traverse() in toTOML with inline emitNode() switch/case.
  Adds writeSingle()/writeChars() fast paths for bracket/comma writes
  that avoid creating temporary Location objects.

- Replace pipe() with direct function calls in parseJS. Avoids rest
  args array allocation, reduce callback, and wrapper closures.

- Remove unused formatPrintWidth call (currently a no-op).

- P6: Remove [...walkObject()] spread in walkInlineTable, iterate
  the generator directly.
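
The pipe() and generator-spread items above both remove temporary allocations. A small sketch of the before and after shapes, using hypothetical stand-ins (`walkEntries`, `trim`, `addNewline`) rather than the functions in parse-js.ts:

```typescript
// Hypothetical generator standing in for walkObject(); yields [key, value] pairs.
function* walkEntries(obj: Record<string, unknown>): Generator<[string, unknown]> {
  for (const key of Object.keys(obj)) {
    yield [key, obj[key]];
  }
}

// Before: the spread materialises every pair into a temporary array first.
function renderWithSpread(obj: Record<string, unknown>): string[] {
  const out: string[] = [];
  for (const [key, value] of [...walkEntries(obj)]) {
    out.push(`${key} = ${JSON.stringify(value)}`);
  }
  return out;
}

// After: the loop consumes the generator directly, one pair at a time.
function renderDirect(obj: Record<string, unknown>): string[] {
  const out: string[] = [];
  for (const [key, value] of walkEntries(obj)) {
    out.push(`${key} = ${JSON.stringify(value)}`);
  }
  return out;
}

// Before: a generic pipe() allocates a rest-args array and a closure,
// and runs a reduce callback on every call.
function pipe<T>(...fns: Array<(value: T) => T>): (input: T) => T {
  return (input) => fns.reduce((acc, fn) => fn(acc), input);
}

// Hypothetical pipeline stages.
const trim = (s: string): string => s.trim();
const addNewline = (s: string): string => s + "\n";

const formatViaPipe = pipe(trim, addNewline);

// After: the same stages called directly, in order.
function formatDirect(s: string): string {
  return addNewline(trim(s));
}
```

Both pairs produce identical results; only the number of intermediate objects differs.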

Results (spec example): 1,031 → 1,177 ops/sec (+14%)
Full suite average: 3,185 → 3,495 ops/sec (+10%)
Small doc: 36,581 → 45,738 ops/sec (+25%)
Nested-1000: 4.93 → 6.61 ops/sec (+34%)

Copilot AI left a comment


Pull request overview

This PR implements performance optimizations for the stringification process in a TOML parser/serializer library. The changes focus on replacing generic traversal patterns with specialized inline implementations to improve V8 inline caching and reduce function call overhead.

Changes:

  • Replaced the generic traverse function with inline switch-based traversal in applyWrites (writer.ts) and toTOML (to-toml.ts) to avoid megamorphic inline cache misses
  • Introduced a dirty_roots WeakSet to track which AST roots have pending modifications, enabling early return from applyWrites when no changes have been made
  • Added fast-path optimizations for common cases in shiftNode (leaf nodes and KeyValue with leaf values) and write (single-line content)
  • Streamlined walkInlineTable to iterate directly over the generator without creating an intermediate array
  • Removed the no-op formatPrintWidth function call and simplified the formatting pipeline
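
As a rough picture of what switch-based traversal looks like, here is a sketch with a hypothetical, much reduced node set. Unlike the real emitNode, which writes nodes at their recorded positions, this version just appends text in order.

```typescript
// Hypothetical, reduced AST for illustration only.
type ScalarNode = { type: "String" | "Integer" | "Boolean"; raw: string };
type ArrayNode = { type: "InlineArray"; items: ValueNode[] };
type ValueNode = ScalarNode | ArrayNode;
type KVNode = { type: "KeyValue"; key: string; value: ValueNode };
type TableNode = { type: "Table"; name: string; items: KVNode[] };
type DocNode = { type: "Document"; items: Array<KVNode | TableNode> };
type AnyNode = DocNode | TableNode | KVNode | ValueNode;

// One function, one switch: each case reads only the fields of its own
// node type, with no visitor object and no enter/exit lookups.
function emitNode(node: AnyNode, out: string[]): void {
  switch (node.type) {
    case "Document":
      for (const item of node.items) emitNode(item, out);
      break;
    case "Table":
      out.push(`[${node.name}]\n`);
      for (const item of node.items) emitNode(item, out);
      break;
    case "KeyValue":
      out.push(`${node.key} = `);
      emitNode(node.value, out);
      out.push("\n");
      break;
    case "InlineArray":
      out.push("[");
      for (let i = 0; i < node.items.length; i++) {
        if (i > 0) out.push(", ");
        emitNode(node.items[i], out);
      }
      out.push("]");
      break;
    default:
      // Remaining types are scalars, which carry their raw text.
      out.push(node.raw);
  }
}

function toText(doc: DocNode): string {
  const out: string[] = [];
  emitNode(doc, out);
  return out.join("");
}
```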

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/writer.ts | Added dirty_roots tracking, replaced traverse with inline visitNode in applyWrites, added fast paths in shiftNode |
| src/to-toml.ts | Replaced traverse with inline emitNode, added writeSingle/writeChars helpers, added fast path for single-line writes |
| src/parse-js.ts | Removed pipe composition and formatPrintWidth call, streamlined walkInlineTable iteration |
| benchmark/stringify-benchmark.mjs | Added @rainbowatcher/toml-edit-js to benchmarks with WASM init support |
| benchmark/parse-benchmark.mjs | Added @rainbowatcher/toml-edit-js to benchmarks with WASM init support |


DecimalTurn marked this pull request as ready for review on February 14, 2026 at 18:32

- Add markdown output to benchmark scripts (benchmark-parse.md, benchmark-stringify.md)
- Create check-performance.mjs to verify performance vs baseline
- Add workflow steps to check performance and write results to GitHub Actions summary
- Fail build if current performance is >10% slower than latest release
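
The contents of check-performance.mjs are not shown here, so the following is only a sketch of how a ">10% slower" gate could be computed. The `BenchResult` shape and both function names are assumptions.

```typescript
// Hypothetical result shape; the real benchmark output format is not shown.
type BenchResult = { name: string; opsPerSec: number };

// Relative change in throughput, in percent (positive means faster).
function percentChange(baseline: number, current: number): number {
  return ((current - baseline) / baseline) * 100;
}

// Returns the names of benchmarks whose throughput dropped by more than
// the allowed percentage relative to the baseline run.
function findRegressions(
  baseline: BenchResult[],
  current: BenchResult[],
  maxSlowdownPercent = 10,
): string[] {
  const baselineByName = new Map<string, number>();
  for (const b of baseline) baselineByName.set(b.name, b.opsPerSec);

  const regressions: string[] = [];
  for (const c of current) {
    const base = baselineByName.get(c.name);
    if (base === undefined) continue; // new benchmark, nothing to compare against
    if (percentChange(base, c.opsPerSec) < -maxSlowdownPercent) {
      regressions.push(c.name);
    }
  }
  return regressions;
}
```

A CI step could exit with a non-zero status whenever `findRegressions` returns a non-empty list. The same `percentChange` arithmetic reproduces the figures quoted in this PR, for example 927 to 1,011 ops/sec is about +9%.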

Copilot AI left a comment


Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 6 comments.



The paddingChar parameter was computed but never actually used - write()
always pads with SPACE regardless. Tab conversion happens in post-processing
for leading indentation only (lines 165-177).

Fixes issue identified by Copilot PR review.

getLine() always returns a string (initializes empty lines as needed),
so the 'line === undefined' checks are unreachable dead code in the hot path.

Reduces bundle size from 67.92 KB to 67.61 KB.

Fixes issue identified by Copilot PR review.

If visitNode() throws (e.g., unrecognized node type), we must not mark
the root as clean while offset maps still contain data. This would leave
inconsistent state where subsequent applyWrites() calls no-op but stale
offsets remain.

Now dirty_roots.delete() happens only after a successful traversal,
so a traversal that throws leaves the root marked dirty instead of
clean with stale offsets.

Fixes issue identified by Copilot PR review.
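
The dirty-tracking pattern and the ordering fix described above can be sketched like this. It is a minimal illustration under assumptions: `DocRoot`, `queueOffset`, `applyPending`, and the per-root offset list are hypothetical, and the real offset maps in writer.ts are not shown in this PR text.

```typescript
// Hypothetical root type and bookkeeping for illustration only.
type DocRoot = { type: "Document"; items: unknown[] };

const dirtyRoots = new WeakSet<DocRoot>();
const pendingOffsets = new WeakMap<DocRoot, number[]>();

// Recording an offset marks the root as dirty.
function queueOffset(root: DocRoot, offset: number): void {
  const list = pendingOffsets.get(root) ?? [];
  list.push(offset);
  pendingOffsets.set(root, list);
  dirtyRoots.add(root);
}

function applyPending(root: DocRoot, visit: (root: DocRoot, offsets: number[]) => void): void {
  // Clean roots skip the traversal entirely.
  if (!dirtyRoots.has(root)) return;

  // May throw; nothing below runs in that case, so the root stays dirty.
  visit(root, pendingOffsets.get(root) ?? []);

  // Only reached after a successful traversal.
  pendingOffsets.delete(root);
  dirtyRoots.delete(root);
}
```

Using a WeakSet keyed by the root means the bookkeeping does not keep discarded ASTs alive.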
DecimalTurn merged commit f06ae66 into latest on Feb 14, 2026
2 checks passed