Add profiling and improve parsing performance by DecimalTurn · Pull Request #105 · DecimalTurn/toml-patch

DecimalTurn · 2026-02-13T05:37:04Z

This pull request introduces performance improvements and a profiling script.
The main optimizations target string parsing, bare key validation, and line/column location lookup, along with micro-optimizations for array merging and iterator completion.

Replace Cursor/utf16Iterator generator with direct index-based scanning:

- Eliminate per-character IteratorResult object allocation (~14% CPU)
- Replace IS_WHITESPACE regex with charCode comparisons on hot path
- Replace string concatenation (raw += char) with input.slice(start, end)
- Optimize checkThree: backward backslash walk instead of input.slice(0,n).match()
- Inline specialCharacter(): yield token objects directly in main loop

549 unit tests + 855 TOML spec compliance tests pass.

Parse speedup: ~2x across all file sizes (spec example: 9x→4.7x vs smol-toml).
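The index-based scanning described in this commit could look roughly like the sketch below. It is an illustration only, not the code from src/tokenizer.ts: `scanRuns`, `isWhitespaceCode`, and `isEscaped` are hypothetical names, and whitespace is assumed to mean space and tab, as in the TOML spec.

```typescript
// Character codes compared directly on the hot path instead of running a regex.
const TAB = 0x09;
const SPACE = 0x20;
const BACKSLASH = 0x5c;

function isWhitespaceCode(code: number): boolean {
  return code === SPACE || code === TAB;
}

interface RawToken {
  raw: string;
  start: number;
  end: number; // exclusive
}

// Hypothetical helper: split the input into whitespace-delimited runs using
// only integer indices, so no per-character iterator result is allocated.
function scanRuns(input: string): RawToken[] {
  const tokens: RawToken[] = [];
  const length = input.length;
  let i = 0;
  while (i < length) {
    // Skip leading whitespace.
    while (i < length && isWhitespaceCode(input.charCodeAt(i))) i++;
    if (i >= length) break;
    const start = i;
    while (i < length && !isWhitespaceCode(input.charCodeAt(i))) i++;
    // One slice per token instead of `raw += char` for every character.
    tokens.push({ raw: input.slice(start, i), start, end: i });
  }
  return tokens;
}

// Backward backslash walk: count the backslashes immediately before `index`.
// An odd count means the character at `index` is escaped.
function isEscaped(input: string, index: number): boolean {
  let count = 0;
  for (let j = index - 1; j >= 0 && input.charCodeAt(j) === BACKSLASH; j--) {
    count++;
  }
  return (count & 1) === 1;
}
```

The backward walk only touches the backslashes directly preceding the position, whereas slicing the prefix and matching a regex against it copies and rescans everything before that position.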

parse-string.ts:

- Replace 7-pass regex pipeline with single-pass state machine
- Handle all escape sequences (\b,\t,\n,\f,\r,\",\\,\e,\uXXXX,\UXXXXXXXX,\xHH) in one traversal instead of separate regex passes
- Fold multiline preprocessing (line-ending backslash, newline escaping, quote escaping) into the same single pass
- Use hex lookup table for fast hex digit parsing
- Add fast path: skip processing entirely when no backslash present
- Properly reject \<space> without newline in multiline strings

location.ts:

- Replace Array.findIndex (linear scan) with binary search in findPosition
- Lines array is already sorted; O(T*log L) instead of O(T*L)

549 unit tests + 855 TOML spec compliance tests pass.

Parse improvement vs smol-toml:

- spec example: 4.7x -> 3.4x
- inline-arrays: 12.8x -> 5.3x
- inline-tables: 5.4x -> 4.3x
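For the location.ts change, a binary search over a sorted line index might look like the following sketch. The data layout is an assumption made for illustration: `lineStarts` holds the offset at which each line begins, and `buildLineStarts`, `LineColumn`, and this `findPosition` signature are hypothetical rather than taken from the repository.

```typescript
interface LineColumn {
  line: number; // 1-based
  column: number; // 0-based
}

// Assumed layout: lineStarts[k] is the offset of the first character of
// line k + 1, so lineStarts[0] is always 0 and the array is sorted.
function buildLineStarts(input: string): number[] {
  const starts = [0];
  for (let i = 0; i < input.length; i++) {
    if (input.charCodeAt(i) === 0x0a) starts.push(i + 1);
  }
  return starts;
}

// Binary search for the last line start that is <= offset.
// Each lookup costs O(log L) instead of the O(L) of a linear findIndex.
function findPosition(lineStarts: readonly number[], offset: number): LineColumn {
  let lo = 0;
  let hi = lineStarts.length - 1;
  while (lo < hi) {
    // Round up so that `lo = mid` always makes progress.
    const mid = (lo + hi + 1) >>> 1;
    if (lineStarts[mid] <= offset) {
      lo = mid;
    } else {
      hi = mid - 1;
    }
  }
  return { line: lo + 1, column: offset - lineStarts[lo] };
}
```

With T token lookups against L lines, this gives the O(T*log L) total mentioned in the commit message.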

The generic traverse visits every AST node (Key, String, Integer, etc.) even though toJS only cares about Table, TableArray, and KeyValue. The specialized walk iterates only the relevant nodes and skips Comments and all Value sub-nodes entirely, eliminating traverseNode dispatch, visitor lookup, and type-check overhead for ~60% of nodes.
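A specialized walk of this kind could be sketched as follows. The node shapes are simplified assumptions, not the library's real AST types: `TomlDocument`, `TableNode`, `KeyValueNode`, `CommentNode`, and `ValueNode` are hypothetical, and TableArray handling is omitted for brevity.

```typescript
// Simplified, hypothetical node shapes for illustration only.
interface ValueNode {
  type: "String" | "Integer" | "Boolean";
  value: string | number | boolean;
}
interface KeyValueNode {
  type: "KeyValue";
  key: string[];
  value: ValueNode;
}
interface CommentNode {
  type: "Comment";
  raw: string;
}
interface TableNode {
  type: "Table";
  key: string[];
  items: Array<KeyValueNode | CommentNode>;
}
interface TomlDocument {
  type: "Document";
  items: Array<TableNode | KeyValueNode | CommentNode>;
}

type Obj = { [key: string]: unknown };

// Walk (and create if needed) nested objects along a dotted key path.
function ensurePath(root: Obj, path: readonly string[]): Obj {
  let target = root;
  for (const part of path) {
    const existing = target[part];
    if (typeof existing === "object" && existing !== null) {
      target = existing as Obj;
    } else {
      const next: Obj = {};
      target[part] = next;
      target = next;
    }
  }
  return target;
}

function assign(target: Obj, kv: KeyValueNode): void {
  const last = kv.key[kv.key.length - 1];
  if (last === undefined) return; // empty key: nothing to assign
  // The value is read directly; value sub-nodes are never visited as nodes.
  ensurePath(target, kv.key.slice(0, -1))[last] = kv.value.value;
}

// Only Table and KeyValue nodes are inspected; Comments fall through untouched
// and there is no generic per-node visitor dispatch.
function toJS(doc: TomlDocument): Obj {
  const result: Obj = {};
  for (const item of doc.items) {
    if (item.type === "KeyValue") {
      assign(result, item);
    } else if (item.type === "Table") {
      const table = ensurePath(result, item.key);
      for (const child of item.items) {
        if (child.type === "KeyValue") assign(table, child);
      }
    }
  }
  return result;
}
```

The trade-off is that the walk is coupled to the shape of the AST: a new container node type would need its own branch here, where a generic traverse would pick it up automatically.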

…te findLines pass

Instead of scanning the entire input upfront to find newline positions (createLocate → findLines), record newlines as the tokenizer encounters them in its main loop and multiline string scanner. findPosition is updated to handle the missing end-sentinel gracefully. This eliminates one full O(n) pass through the input on every parse.
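Recording newlines during the scan itself might look like the sketch below. It is a simplified stand-in for the real tokenizer: `scanWithLineIndex` and `lineOf` are hypothetical names, the "tokens" are just whitespace-delimited words, and multiline strings are not modeled.

```typescript
interface ScanResult {
  words: string[];
  newlineOffsets: number[]; // offset of every "\n" seen, in ascending order
}

// The line index is built as a side effect of the one scanning loop,
// so no separate pass over the input is needed.
function scanWithLineIndex(input: string): ScanResult {
  const words: string[] = [];
  const newlineOffsets: number[] = [];
  let i = 0;
  while (i < input.length) {
    const code = input.charCodeAt(i);
    if (code === 0x0a) {
      newlineOffsets.push(i); // record the newline as it is encountered
      i++;
      continue;
    }
    if (code === 0x20 || code === 0x09 || code === 0x0d) {
      i++;
      continue;
    }
    const start = i;
    while (i < input.length) {
      const c = input.charCodeAt(i);
      if (c === 0x0a || c === 0x20 || c === 0x09 || c === 0x0d) break;
      i++;
    }
    words.push(input.slice(start, i));
  }
  return { words, newlineOffsets };
}

// 1-based line number of `offset`: one plus the number of newlines strictly
// before it. No end sentinel is required, because an offset past the last
// recorded newline simply lands on the final line.
function lineOf(newlineOffsets: readonly number[], offset: number): number {
  let lo = 0;
  let hi = newlineOffsets.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (newlineOffsets[mid] < offset) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo + 1;
}
```

One consequence of this design is that the index is only complete up to the point the scanner has reached, so lookups are only meaningful for offsets that have already been scanned.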

Copilot

Pull request overview

This PR introduces significant performance improvements to the TOML parser through algorithmic optimizations and micro-optimizations across multiple modules. The main focus is on reducing allocations, replacing regex with character code comparisons, and optimizing hot-path operations. A comprehensive profiling script is also added to help identify performance bottlenecks.

Changes:

- Refactored tokenizer from cursor-based to index-based scanning with incremental line indexing
- Optimized string parsing with single-pass unescaping and hex lookup tables
- Replaced generic AST traversal with specialized inline walk for faster processing
- Added micro-optimizations for bare key validation, array operations, and iterator completion
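As an illustration of single-pass unescaping with a hex lookup table, here is a minimal sketch for single-line basic strings. It is not the implementation in src/parse-string.ts: the body of `unescapeBasicString` below, plus the `HEX`, `SIMPLE`, and `readHex` helpers, are assumptions, and the multiline-specific handling (line-ending backslash) is left out.

```typescript
// HEX[code] is the value of an ASCII hex digit (0-15), or -1 for anything else.
const HEX = new Int8Array(128).fill(-1);
for (let d = 0; d < 10; d++) HEX[0x30 + d] = d; // '0'-'9'
for (let d = 0; d < 6; d++) {
  HEX[0x41 + d] = 10 + d; // 'A'-'F'
  HEX[0x61 + d] = 10 + d; // 'a'-'f'
}

// Single-character escapes listed in the commit message.
const SIMPLE: Record<string, string | undefined> = {
  b: "\b", t: "\t", n: "\n", f: "\f", r: "\r", '"': '"', "\\": "\\", e: "\x1b",
};

function readHex(input: string, start: number, digits: number): number {
  if (start + digits > input.length) throw new SyntaxError("Truncated escape");
  let value = 0;
  for (let i = start; i < start + digits; i++) {
    const code = input.charCodeAt(i);
    const digit = code < 128 ? HEX[code] : -1;
    if (digit < 0) throw new SyntaxError("Invalid hex digit in escape");
    value = value * 16 + digit; // multiply rather than shift: 8 digits exceed int32
  }
  return value;
}

function unescapeBasicString(input: string): string {
  let next = input.indexOf("\\");
  if (next === -1) return input; // fast path: nothing to unescape

  let out = "";
  let copied = 0; // start of the literal run not yet copied to `out`
  while (next !== -1) {
    out += input.slice(copied, next);
    const ch = input.charAt(next + 1);
    const simple = SIMPLE[ch];
    if (simple !== undefined) {
      out += simple;
      copied = next + 2;
    } else if (ch === "u" || ch === "U" || ch === "x") {
      const digits = ch === "u" ? 4 : ch === "U" ? 8 : 2;
      const cp = readHex(input, next + 2, digits);
      // Only Unicode scalar values are accepted (no surrogates, max U+10FFFF).
      if (cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff)) {
        throw new SyntaxError("Escape is not a Unicode scalar value");
      }
      out += String.fromCodePoint(cp);
      copied = next + 2 + digits;
    } else {
      throw new SyntaxError("Invalid escape sequence");
    }
    next = input.indexOf("\\", copied);
  }
  return out + input.slice(copied);
}
```

Each backslash is visited exactly once and literal runs are copied with `slice`, so the work is proportional to the input length regardless of how many kinds of escape appear.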

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/tokenizer.ts | Replaced cursor-based scanning with index-based approach, added character code constants, implemented incremental line indexing |
| src/to-js.ts | Removed traverse dependency, added specialized inline walk, optimized validateKey with pre-computed candidates |
| src/parse-toml.ts | Added isBareKeyCode function using charCode ranges, removed unnecessary cloneLocation calls, replaced merge with direct array push |
| src/parse-string.ts | Implemented single-pass unescapeBasicString with hex lookup table, removed regex-based approaches |
| src/location.ts | Replaced linear search with binary search in findPosition, optimized findLines with charCode comparisons |
| src/cursor.ts | Added frozen DONE sentinel object to avoid repeated allocations |
| src/utils.ts | Optimized has() function to use `in` operator instead of hasOwnProperty.call |
| benchmark/profile.mjs | Added comprehensive profiling script with V8 CPU profiling integration |
| package.json | Added "profile" npm script |
| .gitignore | Added *.cpuprofile to ignore profiling outputs |
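Two of the micro-optimizations in this summary, the charCode-based bare key check and the frozen DONE sentinel, can be sketched as below. The function bodies are assumptions based on the TOML definition of bare keys (ASCII letters, digits, `_`, and `-`); `isBareKey` and `charCursor` are hypothetical helpers, not the repository's code.

```typescript
// Range checks on the character code replace a regex test per key.
function isBareKeyCode(code: number): boolean {
  return (
    (code >= 0x61 && code <= 0x7a) || // a-z
    (code >= 0x41 && code <= 0x5a) || // A-Z
    (code >= 0x30 && code <= 0x39) || // 0-9
    code === 0x5f || // _
    code === 0x2d // -
  );
}

function isBareKey(key: string): boolean {
  if (key.length === 0) return false;
  for (let i = 0; i < key.length; i++) {
    if (!isBareKeyCode(key.charCodeAt(i))) return false;
  }
  return true;
}

// One shared, frozen "done" result instead of a fresh object on every call
// made after the iterator is exhausted.
const DONE: IteratorReturnResult<undefined> = Object.freeze({
  done: true as const,
  value: undefined,
});

function charCursor(input: string): Iterator<string, undefined> {
  let index = 0;
  return {
    next(): IteratorResult<string, undefined> {
      if (index >= input.length) return DONE;
      return { done: false, value: input.charAt(index++) };
    },
  };
}
```

Freezing the sentinel guards against a caller mutating the shared object, which would otherwise affect every consumer of the iterator.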

DecimalTurn added 17 commits February 13, 2026 05:19

WIP: Exploration

3069b09

WIP: Plan — performance improvement roadmap for Phase 2b/2c

ae5a3b9

perf: use charCodeAt in findLines to avoid per-char string allocation

37c1573

perf: precompute candidate keys in validateKey to eliminate repeated concat+join

e3ed269

perf: push directly into target arrays instead of intermediate alloc + merge

c9d1f7f

perf: replace bare key regex with charCode range checks

86ca83d

perf: use 'key in object' instead of hasOwnProperty.call in has()

bab47a9

perf: defer array concat in trackNestedInlineTables to InlineTable branch

b36cde6

perf: reuse frozen singleton for done() in Cursor instead of allocating per call

2c88e10

perf: skip cloneLocation for ephemeral token locs in parser

dbb155d

docs: mark Phase 2c items #8-10 complete in perf plan

e6a3e9e

CleanUp

06b2834

chore: add profile script to package.json and ignore generated .cpuprofile files

8004394

DecimalTurn requested a review from Copilot February 13, 2026 05:37

Copilot started reviewing on behalf of DecimalTurn February 13, 2026 05:37

Copilot AI reviewed Feb 13, 2026

refactor: Minor edits

314a41c

DecimalTurn marked this pull request as ready for review February 14, 2026 06:20

docs: Update changelog

23d9b25

DecimalTurn merged commit 3187316 into latest Feb 14, 2026
2 checks passed

DecimalTurn changed the title ~~Add profiling and improve performance~~ Add profiling and improve parsing performance Feb 14, 2026

DecimalTurn merged 19 commits into latest from perf-impr2