
Perf/handlebars v2 parser single pass#17

Open
johanrd wants to merge 12 commits into main from perf/compile-pipeline-exploration

Conversation


@johanrd johanrd commented Apr 15, 2026

Unified single-pass HTML+HBS parser for @glimmer/syntax

Replaces both @handlebars/parser (Jison) and simple-html-tokenizer with a single left-to-right indexOf-based scanner that builds ASTv1 directly — no tokenizer pipeline, no intermediate representation, one pass.

Builds on the index-based cursor parser from #13, which replaces only the HBS layer (Jison → recursive descent) while keeping simple-html-tokenizer. This PR goes further and replaces the HTML layer too.

Exported as unifiedPreprocess() alongside the existing preprocess(). All tests pass.
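As a rough illustration of the approach only (not the real 2507-line parser.ts — the node shapes and error handling here are heavily simplified stand-ins, not actual ASTv1 types), a left-to-right indexOf-based single-pass scanner has this shape:

```typescript
// Minimal sketch: scan the template left to right, jumping with indexOf,
// and build nodes directly -- no tokenizer, no intermediate representation.
type Node =
  | { type: 'TextNode'; chars: string }
  | { type: 'MustacheStatement'; path: string };

function scan(template: string): Node[] {
  const nodes: Node[] = [];
  let pos = 0;
  while (pos < template.length) {
    // Everything up to the next mustache open is plain text.
    const open = template.indexOf('{{', pos);
    if (open === -1) {
      nodes.push({ type: 'TextNode', chars: template.slice(pos) });
      break;
    }
    if (open > pos) {
      nodes.push({ type: 'TextNode', chars: template.slice(pos, open) });
    }
    const close = template.indexOf('}}', open + 2);
    if (close === -1) throw new Error('unclosed mustache');
    nodes.push({
      type: 'MustacheStatement',
      path: template.slice(open + 2, close).trim(),
    });
    pos = close + 2;
  }
  return nodes;
}
```

The real parser additionally handles HTML elements, attributes, blocks, comments, escapes, and source locations, but the control flow is the same single cursor moving forward via indexOf.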

Benchmark (pnpm bench:precompile)

Apple M1 Max, Node 24.14, prod dist.

| phase | size | main (Jison) | index-based | this PR (index-based single-pass) |
| --- | --- | --- | --- | --- |
| precompile | small (1517c) | 1.76 ms | 1.46 ms (1.2×) | 1.26 ms (1.4×) |
| precompile | medium (4551c) | 5.36 ms | 4.39 ms (1.2×) | 3.81 ms (1.4×) |
| precompile | large (33374c) | 42.26 ms | 35.57 ms (1.2×) | 31.14 ms (1.4×) |
| parse | small | 592 µs | 295 µs (2.0×) | 156 µs (3.8×) |
| parse | medium | 1.73 ms | 874 µs (2.0×) | 488 µs (3.5×) |
| parse | large | 14.68 ms | 7.70 ms (1.9×) | 3.86 ms (3.8×) |
| normalize | small | 747 µs | 453 µs (1.6×) | 299 µs (2.5×) |
| normalize | medium | 2.24 ms | 1.34 ms (1.7×) | 896 µs (2.5×) |
| normalize | large | 18.53 ms | 11.32 ms (1.6×) | 7.85 ms (2.4×) |

The index-based cursor parser (#13) replaces Jison's HBS layer only. This PR extends that by also replacing simple-html-tokenizer, yielding a further ~2× on parse. End-to-end precompile is 1.4× faster than main.

Reproduce with `pnpm build && pnpm bench:precompile`, comparing branches.

Exploration tooling (also included)

Per-phase timing instrumentation used to investigate where compile time goes after parse is no longer the bottleneck:

  • _precompileJSONWithPhaseTiming — instrumented precompileJSON returning {timings: {parse, normalize, pass0, pass2, stringify, total}}.
  • normalizeAST — factored out of normalize() so you can bench normalize without re-parsing.
  • bench-phases.mjs — per-phase benchmark at realistic sizes.
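The shape of that instrumentation can be sketched generically (this helper is a hypothetical stand-in, not the actual _precompileJSONWithPhaseTiming implementation): run each pipeline stage in order, timing each one, and return the result alongside a timings map.

```typescript
// Sketch: thread a value through named phases, recording wall-clock time
// per phase plus a total, mirroring the { timings: { parse, normalize,
// ..., total } } result shape described above.
function timePhases<T>(
  phases: Record<string, (input: any) => any>,
  input: T
): { result: any; timings: Record<string, number> } {
  const timings: Record<string, number> = {};
  const start = performance.now();
  let value: any = input;
  for (const [name, fn] of Object.entries(phases)) {
    const t0 = performance.now();
    value = fn(value); // each phase consumes the previous phase's output
    timings[name] = performance.now() - t0;
  }
  timings.total = performance.now() - start;
  return { result: value, timings };
}
```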

This tooling found the hbsPosFor O(n²) hotspot (~28% self-time at large sizes), fixed separately in emberjs#21314. Also found the double-normalization in callParts(), extracted as johanrd/ember.js#18.


github-actions bot commented Apr 15, 2026

📊 Package size report   -3.47%↓

| File | Before (Size / Brotli) | After (Size / Brotli) |
| --- | --- | --- |
| dist/dev/packages/shared-chunks/compiler-BkQGSgew.js | 177.7 kB / 34 kB | 24%↑ 220.8 kB / 24%↑ 42.2 kB |
| dist/dev/packages/shared-chunks/transform-resolutions-lE6wZX-U.js | 188.6 kB / 38.1 kB | -69.2%↓ 58.1 kB / -66.7%↓ 12.7 kB |
| dist/prod/packages/shared-chunks/compiler-ZJEGDERI.js | 190.9 kB / 36.4 kB | 23%↑ 234.1 kB / 22%↑ 44.5 kB |
| dist/prod/packages/shared-chunks/transform-resolutions-C8A20IBl.js | 174.2 kB / 35.3 kB | -75%↓ 43.6 kB / -72.3%↓ 9.8 kB |
| types/stable/@glimmer/compiler/lib/compiler.d.ts | 697 B / 249 B | 92%↑ 1.3 kB / 71%↑ 427 B |
| types/stable/@glimmer/syntax/lib/parser.d.ts | 5.1 kB / 1.1 kB | |
| types/stable/@glimmer/syntax/lib/parser/handlebars-node-visitors.d.ts | 2.3 kB / 640 B | |
| types/stable/@glimmer/syntax/lib/parser/parser.d.ts | 583 B / 274 B | |
| types/stable/@glimmer/syntax/lib/parser/tokenizer-event-handlers.d.ts | 5.4 kB / 1.5 kB | -29.5%↓ 3.8 kB / -26.2%↓ 1.1 kB |
| types/stable/@glimmer/syntax/lib/v2/normalize.d.ts | 2.3 kB / 696 B | 18%↑ 2.7 kB / 17%↑ 814 B |
| types/stable/@handlebars/parser/types/ast.d.ts | 3.6 kB / 614 B | |
| types/stable/@handlebars/parser/types/index.d.ts | 400 B / 183 B | |
| types/stable/ember-template-compiler/minimal.d.ts | 343 B / 148 B | 33%↑ 457 B / 26%↑ 186 B |
| types/stable/index.d.ts | 42.8 kB / 4 kB | -0.47%↓ 42.6 kB / -0.9%↓ 4 kB |
| Total (Includes all files) | 5.4 MB / 1.3 MB | -3.47%↓ 5.2 MB / -2.9%↓ 1.2 MB |
| Tarball size | 1.2 MB | -3.73%↓ 1.2 MB |

🤖 This report was automatically generated by pkg-size-action

…ract tests

Split into three files by concern:

- parser-escape-test.ts: backslash escape sequences (\{{, \\{{, \\\{{)
  in top-level text, elements, attributes, and unclosed cases.
- parser-whitespace-test.ts: tilde stripping and standalone detection.
- parser-error-test.ts: inputs that must be rejected ({{}}}, {{~}}, {{@}}, etc.).

parser-node-test.ts is unchanged.
@johanrd force-pushed the perf/compile-pipeline-exploration branch from 2b7d50e to 8369d3f on April 16, 2026 at 22:20
@johanrd changed the title from "Perf: compile-pipeline exploration (normalize hotspot + bench)" to "Perf/handlebars v2 parser single pass" on April 16, 2026
johanrd added 7 commits April 17, 2026 08:11
Replaces both @handlebars/parser (Jison) and simple-html-tokenizer with
a single left-to-right indexOf-based scanner that builds ASTv1 directly.
No tokenizer pipeline, no intermediate representation, one pass.

- parser.ts: 2507-line unified scanner, exported as unifiedPreprocess()
- tokenizer-event-handlers.ts: simplified to delegate to the unified scanner
- handlebars-node-visitors.ts: removed (logic absorbed into parser.ts)
- @handlebars/parser: package deleted entirely
- bench.mjs: 3-way comparison bench (Jison vs unified, retained for investigation)
- CI/config: updated to remove @handlebars/parser references, add prettier
  snapshot update step for error-message differences

All existing tests pass. Escape handling matches Jison (backslash counting,
emu-state content merging). Error messages aligned where ember tests check
specific text (Expecting 'ID').
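The backslash counting mentioned above can be sketched as follows (assumed semantics matching Handlebars: an odd-length run of backslashes immediately before {{ escapes the mustache, an even run leaves it live; the function name and signature are illustrative, not the parser's actual API):

```typescript
// Given the index where a `{{` opens, count the run of backslashes
// immediately preceding it. Odd count: the mustache is escaped and should
// be emitted as literal text. Even count: the backslashes pair off and
// the mustache stays live.
function isEscapedMustache(template: string, openIndex: number): boolean {
  let backslashes = 0;
  for (let i = openIndex - 1; i >= 0 && template[i] === '\\'; i--) {
    backslashes++;
  }
  return backslashes % 2 === 1;
}
```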
Adds _precompileJSONWithPhaseTiming to @glimmer/compiler for benchmarking.
Exposes normalizeAST (the ASTv1→ASTv2 part of normalize, without re-parsing)
so phases can be measured independently, and exposes the timing function
through ember-template-compiler/minimal.ts (a real ESM entry).

Findings (bench-phases.mjs): on real-world templates, normalize is 54%
of compile time and pass0 is only 11% — opposite of what code volume
suggested. The ASTv1→ASTv2 transform is the next target, not Pass0.
…00, extra-large ~25000)

Route templates in real apps are nowhere near 25 chars. The previous
size ladder (25 → 3500) didn't stress the compile path enough to
surface the O(n²) normalize behavior that the newline-cache fix
eliminates, nor the IDE-case scaling of Jison.

At realistic sizes the deltas become concrete:

  IDE (parse-only)          Jison       unified-1pass   speedup
  small       (462c)        0.22ms      0.06ms          3.4×
  medium     (1494c)        0.69ms      0.17ms          4.1×
  large      (4482c)        2.45ms      0.57ms          4.3×
  extra-large(25398c)      24.73ms      3.15ms          7.8×

  Full build                Jison       unified-1pass   speedup
  extra-large(25398c)      93.32ms     25.94ms          3.6×

Jison's per-char cost roughly doubles from medium to extra-large
(0.46 → 0.97 μs/char), while unified-1pass holds steady at
~0.12 μs/char. Route templates like the one the user was just
inspecting (~98k chars) will see the biggest impact.
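The per-character figures follow directly from the table above; as a quick sanity check:

```typescript
// cost in µs/char = parse time (ms) * 1000 / template size (chars)
function usPerChar(ms: number, chars: number): number {
  return (ms * 1000) / chars;
}

// Jison, medium:       usPerChar(0.69, 1494)   -> ~0.46 µs/char
// Jison, extra-large:  usPerChar(24.73, 25398) -> ~0.97 µs/char
// unified, extra-large: usPerChar(3.15, 25398) -> ~0.12 µs/char
```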
@johanrd force-pushed the perf/compile-pipeline-exploration branch from b5b1a77 to a27b008 on April 17, 2026 at 06:13
johanrd added 4 commits April 17, 2026 08:18
callParts() was normalizing all parameters twice — once to calculate
paramLoc (line 277), then again for the actual positional builder
(line 282). Reuse the already-computed paramList instead.

Impact: normalize phase 12-22% faster (measured via bench-phases.mjs
on dev build, confirmed on prod build). End-to-end precompile ~6%
faster on large templates.
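The reuse pattern can be sketched like this (all names are hypothetical stand-ins for the callParts() internals; the call counter only exists to show that normalization now runs once per parameter rather than twice):

```typescript
// Counts how often the expensive normalization step runs.
let normalizeCalls = 0;

function normalizeParam(raw: string): { value: string } {
  normalizeCalls++; // stand-in for the costly per-parameter normalization
  return { value: raw.trim() };
}

function callPartsSketch(rawParams: string[]) {
  // Before the fix, rawParams.map(normalizeParam) ran here for the
  // location calculation, then ran AGAIN for the positional builder.
  const paramList = rawParams.map(normalizeParam); // normalize once
  const paramLoc = paramList.map((p) => p.value);  // stand-in for the paramLoc math
  const positional = paramList;                    // reused by the positional builder
  return { paramLoc, positional };
}
```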
element.attributes was iterated twice — once filtering for non-@ attrs,
once for @-prefixed args. Combined into a single loop.

Minor win (~5% normalize improvement on medium templates), but cleaner
code: a single allocation and a single pass.
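A minimal sketch of the combined loop (the attribute shape and function name are illustrative, not the real ASTv1 types):

```typescript
interface Attr { name: string }

// Partition element attributes into plain attributes and @-prefixed args
// in one iteration, instead of two separate filter passes over the list.
function splitAttrs(attributes: Attr[]): { attrs: Attr[]; args: Attr[] } {
  const attrs: Attr[] = [];
  const args: Attr[] = [];
  for (const attr of attributes) {
    // One pass decides the bucket for each entry.
    (attr.name.startsWith('@') ? args : attrs).push(attr);
  }
  return { attrs, args };
}
```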
