
POC: perf(@glimmer/syntax): unified single-pass HTML+HBS scanner (3.2x–4.2x faster parse)#14

Open
johanrd wants to merge 3 commits into `perf/handlebars-v2-parser` from `perf/handlebars-v2-parser-single-pass`

Conversation


@johanrd johanrd commented Apr 14, 2026

## POC: Unified single-pass HTML+HBS scanner

Claude continuing the perf work from perf/handlebars-v2-parser. Where that branch replaced the Jison-generated HBS parser with a hand-written recursive descent JS parser (keeping `simple-html-tokenizer` for the HTML layer), this PR replaces both parsers with a single left-to-right indexOf-based scanner that builds ASTv1 directly — no tokenizer pipeline at all.

The current parse path scans every template twice:

  1. `@handlebars/parser` — treats HTML as opaque `ContentStatement` nodes, scans only for `{{}}` mustache syntax
  2. `simple-html-tokenizer` — re-tokenizes all `ContentStatement` content character-by-character to extract HTML element structure

The unified scanner does one left-to-right pass with cursor arithmetic (`indexOf('{{', pos)`, `indexOf('<', pos)`) and builds the full ASTv1 tree — `ElementNode`, `MustacheStatement`, `BlockStatement`, `TextNode`, etc. — without the intermediate representation.
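The dispatch loop can be sketched roughly as follows. This is a simplified illustration, not the PR's code: `Segment` and `scan` are hypothetical names, the output is a flat segment list rather than an ASTv1 tree, and nesting, attributes, and triple-stache handling are omitted.

```typescript
// Illustrative sketch of a single-pass, indexOf-driven scan: at each cursor
// position, find the nearer of the next "{{" and the next "<" and consume
// everything before it as text. Names and shapes here are hypothetical.
type Segment =
  | { kind: "text"; value: string }
  | { kind: "mustache"; value: string }
  | { kind: "tag"; value: string };

function scan(src: string): Segment[] {
  const out: Segment[] = [];
  let pos = 0;
  while (pos < src.length) {
    // One indexOf per delimiter kind; the nearer match wins.
    const m = src.indexOf("{{", pos);
    const t = src.indexOf("<", pos);
    const next = Math.min(m === -1 ? Infinity : m, t === -1 ? Infinity : t);
    if (next === Infinity) {
      out.push({ kind: "text", value: src.slice(pos) });
      break;
    }
    if (next > pos) out.push({ kind: "text", value: src.slice(pos, next) });
    if (next === m) {
      // Mustache: consume through the matching "}}".
      const close = src.indexOf("}}", next + 2);
      const end = close === -1 ? src.length : close + 2;
      out.push({ kind: "mustache", value: src.slice(next, end) });
      pos = end;
    } else {
      // Tag: consume through ">".
      const close = src.indexOf(">", next + 1);
      const end = close === -1 ? src.length : close + 1;
      out.push({ kind: "tag", value: src.slice(next, end) });
      pos = end;
    }
  }
  return out;
}
```

The point of the shape is that cost scales with the number of delimiters, not the number of characters: text between delimiters is sliced in one operation instead of being walked character-by-character as `simple-html-tokenizer` does.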

Exports as `unifiedPreprocess()` alongside the existing `preprocess()`.


### Four parsers compared

All benchmarks: Node 24, warmed JIT, same machine.


### IDE case: parse-only (ms/call)

The Glint hot path: one `preprocess()` call per keystroke in a `.gts` file.

| template | chars | Jison | v2-parser | unified-1pass | rust/wasm |
| --- | --- | --- | --- | --- | --- |
| small | 25 | 0.0211ms | 0.0056ms | 0.0049ms | 0.0334ms |
| medium | 352 | 0.1652ms | 0.0534ms | 0.0435ms | 0.4627ms |
| real-world | 1494 | 0.6082ms | 0.1973ms | 0.1492ms | 3.8217ms |
| large (10x med) | 3520 | 1.6801ms | 0.4935ms | 0.4243ms | 23.4431ms |

### Build case: full `precompile()` → wire format (ms/call)

The ember-cli/Vite path: parse + ASTv2 normalize + opcode encode + wire format.

The unified-1pass column is `unified_parse + (precompile_v2 − preprocess_v2)` — the compile step is identical code in all parsers.

| template | chars | Jison | v2-parser | unified-1pass | rust/wasm |
| --- | --- | --- | --- | --- | --- |
| small | 25 | 0.0385ms | 0.0302ms | 0.0295ms | 0.0508ms |
| medium | 352 | 0.4851ms | 0.4022ms | 0.3922ms | 0.7764ms |
| real-world | 1494 | 1.7992ms | 1.4712ms | 1.4231ms | 4.9425ms |
| large (10x med) | 3520 | 5.0392ms | 4.3665ms | 4.2974ms | 27.0429ms |
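As a sanity check on that column derivation, the medium row reproduces from the figures in the two tables above:

```typescript
// Reconstruct the unified-1pass build-case number (medium template):
// unified parse time plus the shared compile step, where the compile step
// is v2-parser's full precompile minus its parse-only time.
const unifiedParseMs = 0.0435; // unified-1pass, parse-only, medium
const precompileV2Ms = 0.4022; // v2-parser, full precompile, medium
const preprocessV2Ms = 0.0534; // v2-parser, parse-only, medium
const unifiedBuildMs = unifiedParseMs + (precompileV2Ms - preprocessV2Ms);
// ≈ 0.3923ms, matching the reported 0.3922ms up to rounding
```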

### Parse vs compile split (medium template)

| phase | Jison | v2-parser | unified-1pass | rust/wasm |
| --- | --- | --- | --- | --- |
| `preprocess()` only | 0.165ms (38%) | 0.049ms (14%) | 0.041ms (12%) | 0.459ms (63%) |
| compile only (shared) | 0.272ms (62%) | 0.302ms (86%) | 0.302ms (88%) | 0.267ms (37%) |
| total | 0.437ms | 0.351ms | 0.342ms | 0.726ms |

### 500-template build projection

Using real-world template timing:

  • Jison: 900ms (1.799ms × 500)
  • v2-parser: 736ms (1.471ms × 500) — 1.22x faster than Jison
  • unified-1pass: 712ms (1.423ms × 500) — 1.26x faster than Jison, 1.03x faster than v2
  • rust/wasm: 2471ms (4.942ms × 500) — 2.7x slower than Jison
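The projection is plain arithmetic on the real-world full-pipeline row; a quick check of the totals and speedup ratios:

```typescript
// 500-template build projection from the real-world build-case row above.
const realWorldMs = { jison: 1.7992, v2: 1.4712, unified: 1.4231 };
const totalMs = (perCall: number) => perCall * 500;
const speedupVsJison = (perCall: number) => realWorldMs.jison / perCall;
// totalMs(realWorldMs.jison) ≈ 900ms
// speedupVsJison(realWorldMs.unified) ≈ 1.26
```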

### What this shows

The two use cases have very different profiles:

IDE case (parse-only): The unified scanner is 3.8x–4.3x faster than Jison and 1.1x–1.3x faster than v2-parser on real-world templates. The per-keystroke parse cost drops from 0.61ms to 0.15ms on a real-world template. This directly benefits Glint's reparse-on-keystroke hot path.

Build case (full pipeline): The parse improvement is smaller in absolute terms because the compile step (ASTv2 normalization + opcode encoding) costs ~0.30ms regardless of which parser is used. At real-world templates the unified scanner is 1.26x faster end-to-end vs Jison. The v2-parser already captured 1.22x of the build-time gain; unified takes it to 1.26x.

rust/wasm: The JSON bridge (`serde_json::to_string` → `JSON.parse()` → `convertLocations()` walk) is O(AST size), so the gap grows with template size (1.6x slower than Jison at medium → 5.4x at large). The unified scanner is faster than Jison at all sizes without any FFI overhead.

Parse is now 12% of the pipeline for unified vs 38% for Jison and 14% for v2. The compile step dominates, so further parse improvements have diminishing returns on build time — though the IDE case still benefits fully since it's parse-only.


### Correctness

All 8778 tests pass, including the WhitespaceControl/standalone-stripping semantics, which required a careful port of the Handlebars post-pass for block helpers on their own lines and chained `{{else if}}` blocks.
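The core of standalone stripping is a positional test; the following is a hypothetical sketch of that test (`isStandalone` and its exact rule are illustrative, not the PR's implementation): a block tag counts as standalone when only horizontal whitespace separates it from the surrounding newlines, and for such tags the post-pass strips that whitespace and the trailing newline.

```typescript
// Sketch of the "is this block tag alone on its line?" check that
// whitespace-control post-passes apply. Hypothetical helper, for
// illustration only.
function isStandalone(src: string, start: number, end: number): boolean {
  // Walk left from the tag: only spaces/tabs back to a newline (or input start).
  let i = start - 1;
  while (i >= 0 && (src[i] === " " || src[i] === "\t")) i--;
  const openOk = i < 0 || src[i] === "\n";
  // Walk right from the tag: only spaces/tabs forward to a newline (or input end).
  let j = end;
  while (j < src.length && (src[j] === " " || src[j] === "\t")) j++;
  const closeOk = j >= src.length || src[j] === "\n";
  return openOk && closeOk;
}
```

Chained `{{else if}}` blocks complicate this because each `{{else …}}` tag closes one program and opens another, so the same positional test has to run against both the enclosing block's source span and the chained block's.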

Replace the two-phase parse (Handlebars parser → simple-html-tokenizer) with
a single left-to-right indexOf-based scanner that builds ASTv1 directly.
Exports unifiedPreprocess() alongside the existing preprocess().

Parse-only speedup (warmed JIT):
  small    4.2x faster (0.0195ms → 0.0047ms)
  medium   3.2x faster (0.1569ms → 0.0495ms)
  real-world 3.7x faster (0.5862ms → 0.1583ms)
  large    3.9x faster (1.6488ms → 0.4209ms)

Full precompile() pipeline (parse + normalize + encode):
  medium   1.31x faster (0.449ms → 0.342ms)
  real-world 1.33x faster (1.716ms → 1.288ms)

All 8778 tests pass.

github-actions bot commented Apr 14, 2026

📊 Package size report 4%↑

| File | Before (Size / Brotli) | After (Size / Brotli) |
| --- | --- | --- |
| dist/ember-template-compiler.js | 765.9 kB / 113.8 kB | 923.9 kB (21%↑) / 133.3 kB (17%↑) |
| dist/ember-template-compiler.js.map | 1.5 MB / 259.7 kB | 1.7 MB (18%↑) / 301.4 kB (16%↑) |
| dist/packages/@ember/template-compiler/lib/template.js | 185.3 kB / 35.8 kB | 312.7 kB (69%↑) / 56.4 kB (57%↑) |
| dist/packages/ember-template-compiler/index.js | 466.4 kB / 92.1 kB | 587.1 kB (26%↑) / 111 kB (21%↑) |
| dist/packages/shared-chunks/transform-resolutions-xZ1sfMcr.js | 188.2 kB / 38 kB | 181.5 kB (3.55%↓) / 36.5 kB (3.96%↓) |
| types/stable/@glimmer/syntax/lib/parser/unified-scanner.d.ts | (new) | 814 B / 376 B |
| Total (includes all files) | 17.1 MB / 3.2 MB | 17.7 MB (4%↑) / 3.3 MB (3%↑) |
| Tarball size | 3.8 MB | 4 MB (3%↑) |

🤖 This report was automatically generated by pkg-size-action

…error messages

parse.js was reverted to Jison (commit 7bbe305) during bug investigation
but never re-wired after the v2-parser fixes were complete. Re-enable it.

Also fix error messages for invalid @-prefixed paths (@, @0, @1, @@, etc.)
to match Jison's "Expecting 'ID'" pattern that the test suite asserts against.

All 8778 tests pass.
@johanrd johanrd changed the title perf(@glimmer/syntax): unified single-pass HTML+HBS scanner (3.2x–4.2x faster parse) POC: perf(@glimmer/syntax): unified single-pass HTML+HBS scanner (3.2x–4.2x faster parse) Apr 14, 2026
…anner

- Track inverseStart (pos after {{else}}/{{else if}}'s }}) and programEnd
  (start of {{else}} tag) in BlockFrame so inverse block and default
  program body get exact source spans matching the reference v2-parser.
- Chained blocks ({{else if}}) now end their loc at the start of {{/if}},
  consistent with Handlebars AST conventions.
- Switch Source import to namespace import (import * as srcApi) to avoid
  a Rollup circular-dependency TDZ error introduced by the direct import.
- Wire unifiedPreprocess as the fast-path in tokenizer-event-handlers.ts
  preprocess(); falls back to original pipeline only for codemod mode or
  Source-object inputs.

All 8778 tests pass (0 failures, 13 skipped).
