POC: Perf/handlebars v2 parser by johanrd · Pull Request #13 · johanrd/ember.js

johanrd · 2026-04-14T06:11:57Z

POC from claude: Three parsers compared

main: current @handlebars/parser (Jison-generated) + simple-html-tokenizer
v2-parser (johanrd/ember.js @ perf/handlebars-v2-parser): a Claude-iterated recursive-descent JS parser that replaces only the Jison HBS layer and leaves simple-html-tokenizer untouched. Where Jison's generated lexer tests up to 40 regexes per token and slices the input string on every match, the v2-parser uses an index-based cursor, with indexOf('{{') for content scanning and charCodeAt dispatch for mustache classification: no string copies, no regex gauntlet. That is why it is ~1.8x faster at the HBS layer even though it does the same parse.
rust/wasm: "[FEATURE rust-parser] Rust/WASM template parser using pest.rs" (emberjs/ember.js#21313)
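
To make the index-based approach concrete, here is a minimal sketch of a cursor that finds mustaches with indexOf('{{') and classifies them with a charCodeAt switch. It is not the v2-parser's code: the function names and token shapes are hypothetical, and it deliberately ignores escapes, strip flags, raw blocks, and comments that contain `}}`.

```javascript
// Illustrative only: function names and token shapes are hypothetical,
// not copied from the v2-parser source.
const CH_BANG = 33;  // '!'
const CH_HASH = 35;  // '#'
const CH_SLASH = 47; // '/'
const CH_GT = 62;    // '>'
const CH_OPEN = 123; // '{'

// Classify a mustache by the character code right after '{{'.
function classifyMustache(input, pos) {
  switch (input.charCodeAt(pos)) {
    case CH_HASH: return 'block-open';
    case CH_SLASH: return 'block-close';
    case CH_BANG: return 'comment';
    case CH_GT: return 'partial';
    case CH_OPEN: return 'triple';
    default: return 'mustache';
  }
}

// Walk the input with one integer cursor; tokens store offsets, not substrings.
function scan(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    const open = input.indexOf('{{', pos);
    if (open === -1) {
      tokens.push({ type: 'content', start: pos, end: input.length });
      break;
    }
    if (open > pos) tokens.push({ type: 'content', start: pos, end: open });
    const type = classifyMustache(input, open + 2);
    const closer = type === 'triple' ? '}}}' : '}}';
    const close = input.indexOf(closer, open + 2);
    if (close === -1) throw new Error(`Unclosed mustache at index ${open}`);
    tokens.push({ type, start: open, end: close + closer.length });
    pos = close + closer.length;
  }
  return tokens;
}
```

Calling `scan('a{{#if x}}b{{/if}}')` yields four offset-only tokens (content, block-open, content, block-close); no substring is created until a consumer actually needs the text.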

Benchmarks run with Node 24, warmed JIT, on the full ember-template-compiler precompile() path (so this includes preprocess() → ASTv2 normalization → opcode encoding → wire format — the whole thing).
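
The benchmark harness itself is not shown in this PR. A minimal sketch of a warmed ms/call measurement might look like the following; the helper name, the warmup count, and the iteration count are all assumptions, and `performance.now()` is the global timer available in current Node versions.

```javascript
// Hypothetical harness: name and default counts are assumptions, not the PR's script.
function benchMsPerCall(fn, { warmup = 1000, iterations = 5000 } = {}) {
  // Warm-up calls give the JIT a chance to optimize before timing starts.
  for (let i = 0; i < warmup; i++) fn();
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  // Average wall-clock milliseconds per call over the timed loop.
  return (performance.now() - start) / iterations;
}
```

In use, `fn` would wrap a single call such as `precompile(templateSource)` for each of the three builds, with the same template string passed to all of them.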

Parse only (ms/call)

| template | main (Jison) | v2-parser (index-based cursor parser) | rust/wasm (#21313) |
|---|---|---|---|
| small (25 chars) | 0.024ms | 0.015ms | 0.038ms |
| medium (352 chars) | 0.177ms | 0.102ms | 0.478ms |
| real-world (1505 chars) | 0.605ms | 0.343ms | 3.857ms |
| large (3520 chars) | 1.688ms | 0.942ms | 22.960ms |

Full pipeline results (ms/call)

| template | chars | main (Jison) | v2-parser | rust/wasm (#21313) |
|---|---|---|---|---|
| small | 25 | 0.047ms | 0.038ms | 0.067ms |
| medium | 352 | 0.492ms | 0.397ms | 0.779ms |
| real-world | 1494 | 1.832ms | 1.577ms | 4.947ms |
| large (10x medium) | 3520 | 5.095ms | 4.667ms | 27.107ms |

Parse vs compile split (medium template)

| phase | main (Jison) | v2-parser | rust/wasm |
|---|---|---|---|
| `preprocess()` only | 0.175ms (40%) | 0.093ms (26%) | 0.480ms (66%) |
| compile only (shared) | 0.262ms (60%) | 0.266ms (74%) | 0.250ms (34%) |
| total | 0.438ms | 0.358ms | 0.730ms |

The compile step (ASTv2 normalization + opcode encoding) costs the same ~0.25ms in all three — it's identical code. Only the parse phase differs.
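
The percentages in the split table can be re-derived from the two per-phase timings. A small sketch, with the timings copied from the table and a hypothetical helper name:

```javascript
// Timings (ms/call) copied from the medium-template split table.
const timings = {
  'main (Jison)': { preprocess: 0.175, compile: 0.262 },
  'v2-parser': { preprocess: 0.093, compile: 0.266 },
  'rust/wasm': { preprocess: 0.480, compile: 0.250 },
};

// Share of the total spent in preprocess(), as a whole percentage.
function parseShare({ preprocess, compile }) {
  return Math.round((100 * preprocess) / (preprocess + compile));
}
```

Applying `parseShare` to each entry reproduces the 40%, 26%, and 66% figures, which is the basis for saying that the parsers differ only in the parse phase.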

What does this show?

Because the compile step is shared code, the gap is entirely in preprocess(), and it compounds: rust/wasm's JSON bridge (serde_json::to_string → JSON.parse() → convertLocations() walk) is O(AST size), so the gap widens with template complexity. On the full pipeline, rust/wasm is ~2x slower than v2-parser on the medium template and ~5.8x slower on the large one.
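
To illustrate why the JS side of such a bridge touches every node, here is a rough sketch. The node shape (`start`, `end`, `children`) and the offset-to-line/column conversion are assumptions made for illustration; the actual convertLocations() in emberjs/ember.js#21313 may work differently.

```javascript
// Hypothetical bridge: node shape and conversion logic are assumptions.

// Offsets at which each line of the source begins.
function lineStarts(source) {
  const starts = [0];
  for (let i = source.indexOf('\n'); i !== -1; i = source.indexOf('\n', i + 1)) {
    starts.push(i + 1);
  }
  return starts;
}

// Binary search for the last line start that is <= offset.
function toPosition(starts, offset) {
  let lo = 0;
  let hi = starts.length - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1;
    if (starts[mid] <= offset) lo = mid;
    else hi = mid - 1;
  }
  return { line: lo + 1, column: offset - starts[lo] };
}

// Recursive walk: one visit per node, so cost grows with AST size.
function convertLocations(node, starts, stats) {
  stats.visited++;
  node.loc = { start: toPosition(starts, node.start), end: toPosition(starts, node.end) };
  for (const child of node.children ?? []) convertLocations(child, starts, stats);
  return node;
}

// Stand-in for the JS half of the bridge: parse the JSON string, then walk it.
function bridge(jsonFromWasm, source) {
  const stats = { visited: 0 };
  const ast = convertLocations(JSON.parse(jsonFromWasm), lineStarts(source), stats);
  return { ast, stats };
}
```

Both `JSON.parse` and the walk scale with the number of nodes, and both run on top of the serialization that happens inside the WASM module.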

The single-pass architecture is a real win in theory — the current pipeline genuinely scans HTML twice (@handlebars/parser treats it as opaque content, then simple-html-tokenizer re-tokenizes it via tokenizePart()).

…t (v2-parser)

The Jison LALR(1) parser was the #1 bottleneck in @glimmer/syntax's preprocess(), taking ~50% of total parse time. The generated parser tested up to 40 regexes per token and sliced the input string on every token match.

The v2 parser uses index-based scanning, indexOf for content, charCodeAt dispatch, and batched line/col tracking. It produces AST-identical output (104/104 unit tests pass).

- HBS parse: 6-10x faster
- End-to-end preprocess(): 2-3x faster

See PERF-INVESTIGATION.md for full analysis and benchmarks.
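
The commit message does not spell out what "batched line/col tracking" looks like. One plausible reading, sketched here with a hypothetical `advance` helper, is to move the cursor across a whole span at once and count newlines with indexOf rather than updating line and column per character:

```javascript
// Hypothetical helper: one possible form of batched line/column tracking.
// state = { pos, line, column }; `to` is the new cursor offset (to >= state.pos).
function advance(state, input, to) {
  let lastNewline = -1;
  let nl = input.indexOf('\n', state.pos);
  // Hop from newline to newline inside [state.pos, to).
  while (nl !== -1 && nl < to) {
    state.line++;
    lastNewline = nl;
    nl = input.indexOf('\n', nl + 1);
  }
  // No newline crossed: extend the column; otherwise count from the last newline.
  state.column = lastNewline === -1
    ? state.column + (to - state.pos)
    : to - lastNewline - 1;
  state.pos = to;
}
```

With this shape, skipping a long run of content costs one indexOf call per newline crossed instead of one branch per character.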

8 bugs fixed:

1. Sub-expression path locations (4 cases): paths like {{(helper).bar}} now correctly span from the sub-expression start, not just the .tail portion. Fixed by passing the pre-sub-expression position through parseSexprOrPath.
2. {{else if}} chain locations (2 cases): content after {{else}} had column offsets 4 too low because line/col were being restored from before 'else' was consumed. Fixed position tracking in consumeOpen's else-chain handling.
3. Raw block program location: now uses the overall block loc (matching Jison's prepareRawBlock behavior) instead of content-derived locs.
4. Nested raw blocks: {{{{bar}}}}...{{{{/bar}}}} inside {{{{foo}}}}...{{{{/foo}}}} is now correctly treated as raw content (not parsed as a nested block). Added depth tracking and mismatch detection for raw block close tags.

104/104 @handlebars/parser tests pass. 8768/8788 Ember tests pass (7 remaining are reserved-arg error type mismatches: same parse error, different Error class).
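
The depth tracking described for nested raw blocks can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `findRawBlockEnd` is a hypothetical name, and the sketch treats any `{{{{name}}}}` inside the body as a nested open and only checks the name on the outermost close.

```javascript
// Hypothetical sketch of raw-block depth tracking.
// `pos` is the offset just after the opening {{{{name}}}} tag.
function findRawBlockEnd(input, pos, name) {
  let depth = 0;
  let i = pos;
  while (true) {
    const open = input.indexOf('{{{{', i);
    if (open === -1) throw new Error(`Unclosed raw block "${name}"`);
    const close = input.indexOf('}}}}', open + 4);
    if (close === -1) throw new Error(`Unclosed raw block "${name}"`);
    const inner = input.slice(open + 4, close).trim();
    if (inner.charCodeAt(0) === 47) { // '/' marks a close tag
      const closing = inner.slice(1).trim();
      if (depth === 0) {
        // Outermost close: the name must match the block being parsed.
        if (closing !== name) throw new Error(`"${closing}" doesn't match "${name}"`);
        return { contentEnd: open, tagEnd: close + 4 };
      }
      depth--; // close of a nested raw block: still raw content
    } else {
      depth++; // nested raw open: still raw content
    }
    i = close + 4;
  }
}
```

For `{{{{foo}}}}a{{{{bar}}}}b{{{{/bar}}}}c{{{{/foo}}}}`, the body returned is the whole `a{{{{bar}}}}b{{{{/bar}}}}c` span, which is the "treated as raw content" behavior the fix describes.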

The hash loc was including trailing whitespace (newlines before }}) because skipWs() ran before capturing the hash end position. Now captures endP before the trailing whitespace skip. Caught by exhaustive 153-template audit comparing full JSON output (including all locations) against the Jison parser. 153/153 identical.

Found by stress testing: \{{foo}} caused an infinite loop in scanContent(). Two bugs:

1. After processing \{{ (escaped mustache), the scanner advanced to the {{ position but then findNextMustacheOrEnd found the same {{ immediately, causing an infinite loop. Fixed by advancing past the {{ and including it as literal content.
2. After scanContent returned for \\{{ (double-escaped), the next call saw the backslash at idx-1 from the PREVIOUS scan and re-entered escape handling. Fixed by only checking backslashes within the current scan range (idx > pos, not idx > 0).

Also added stress-test.mjs with 181 test cases covering:

- Escaped mustaches (single, double, with surrounding text)
- Unicode identifiers
- Whitespace edge cases
- All strip flag combinations
- Comment edge cases (short, long, adjacent, containing }}/{{)
- Raw blocks (empty, nested, with mustache-like content)
- Deeply nested sub-expressions
- Complex block nesting with else chains
- Real-world Ember patterns
- Error cases
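
Both fixes come down to two rules: always advance past an escaped {{, and only inspect backslashes that lie inside the current scan range. A minimal sketch of a search obeying both rules, with a hypothetical function name and none of the real scanner's other responsibilities:

```javascript
const BACKSLASH = 92; // '\'

// Hypothetical sketch: return the offset of the next unescaped '{{' at or
// after `pos`, or -1 if there is none.
function findUnescapedMustache(input, pos) {
  let from = pos;
  while (true) {
    const idx = input.indexOf('{{', from);
    if (idx === -1) return -1;
    // Only look at backslashes inside the current scan range (idx > pos).
    const hasBackslash = idx > pos && input.charCodeAt(idx - 1) === BACKSLASH;
    // '\\{{' is a literal backslash followed by a real mustache.
    const doubled = hasBackslash && idx - 1 > pos && input.charCodeAt(idx - 2) === BACKSLASH;
    if (!hasBackslash || doubled) return idx;
    // Escaped: step past the '{{' so the same match cannot be found again.
    from = idx + 2;
  }
}
```

Starting the search at `pos = 1` on `\{{x}}` returns 1, because the backslash at offset 0 belongs to the previous scan and is ignored; that is the idx > pos guard from the second fix.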

Round 2 of stress testing (106 additional cases) found:

1. Multiple consecutive escaped mustaches (x\{{y\{{z) failed: findNextMustacheOrEnd returned the position of \{{ instead of before the backslash, causing the main loop to miss the escape.
2. Content splitting after \{{ didn't match Jison. Jison emits separate ContentStatements at each \{{ boundary (emu state). The v2 parser now matches: x\{{y\{{z produces 3 content nodes ["x", "{{y", "{{z"] instead of one merged ["x{{y{{z"].

287 total stress tests now pass (181 round 1 + 106 round 2). 104/104 unit tests. 8771/8791 Ember tests.
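
The splitting rule can be shown in isolation. This sketch uses a hypothetical helper that assumes its input is a run of content with no real mustaches and no double backslashes; it starts a new content value at every \{{ and drops the backslash:

```javascript
// Hypothetical helper: split a content run at each escaped mustache,
// mirroring the "separate ContentStatement per \{{ boundary" behavior.
function splitEscapedContent(text) {
  const parts = [];
  let start = 0; // start of the content value being collected
  let from = 0;  // where the next search begins
  while (true) {
    const idx = text.indexOf('\\{{', from);
    if (idx === -1) break;
    if (idx > start) parts.push(text.slice(start, idx));
    start = idx + 1; // drop the backslash; the next value begins at '{{'
    from = idx + 3;  // continue searching after the escaped '{{'
  }
  if (start < text.length) parts.push(text.slice(start));
  return parts;
}
```

For the input `x\{{y\{{z` this returns the three values quoted above rather than one merged string.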

Tested against 375 templates from a production Ember app (proapi-webapp). Found 38 location-only differences — all the same pattern: hash pairs with sub-expression values like bar=(helper arg) had their loc end extended past trailing whitespace/newlines. Root cause: parseSexprOrPath() called skipWs() after the sub-expression to peek for a path separator (.bar), but this whitespace belongs to the containing HashPair's loc boundary. Fixed by save/restore of pos around the peek. 375/375 real-world templates now produce byte-identical JSON output compared to the Jison parser. 104/104 unit tests. 287/287 stress tests.
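
The save/restore fix follows a common recursive-descent pattern: a speculative peek must leave the cursor untouched when it fails. A minimal sketch, using a hypothetical `Cursor` class rather than the parser's real internals:

```javascript
// Hypothetical cursor: names are illustrative, not the v2-parser's.
function isWs(code) {
  return code === 32 || code === 9 || code === 10 || code === 13;
}

class Cursor {
  constructor(input, pos = 0) {
    this.input = input;
    this.pos = pos;
  }

  skipWs() {
    while (this.pos < this.input.length && isWs(this.input.charCodeAt(this.pos))) {
      this.pos++;
    }
  }

  // Consume optional whitespace plus '.' if a path continues; otherwise
  // restore pos so the whitespace stays outside the enclosing node's loc.
  tryConsumePathSeparator() {
    const saved = this.pos;
    this.skipWs();
    if (this.input.charCodeAt(this.pos) === 46) { // '.'
      this.pos++;
      return true;
    }
    this.pos = saved;
    return false;
  }
}
```

After a sub-expression such as `(helper arg)` followed by a newline, the failed peek leaves `pos` at the closing parenthesis boundary, so a HashPair that ends there does not absorb the trailing whitespace.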

Tested against:

- 1014 templates from all projects in ~/fremby (including proapi-webapp, ember-power-select, glint, content-tag)
- 500 randomly generated templates (adversarial fuzzing)
- 27 pathological patterns (deep nesting, long content, etc.)

Results: 1473/1541 pass (byte-identical to Jison).

The 68 remaining differences are ALL the same issue: escaped mustache (\{{) content loc includes the backslash in Jison but not in v2. This is a Jison quirk: the regex match includes the \ (which gets stripped from the value), so the loc spans the full source including the \ character. The v2 parser's loc spans only the value content. This only affects templates using \{{ (escaped mustaches), which is extremely rare in real-world code (3 files across 550 scanned).

No structural differences. No crashes. No hangs.
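
The audit script is not included in this excerpt. A byte-identical comparison of the kind described could be sketched like this, where `referenceParse` and `candidateParse` are placeholders for the Jison and v2 entry points and the function name is hypothetical:

```javascript
// Hypothetical audit loop: both parse functions are supplied by the caller.
// `templates` maps a template name to its source string.
function auditParsers(templates, referenceParse, candidateParse) {
  const failures = [];
  let total = 0;
  for (const [name, source] of Object.entries(templates)) {
    total++;
    try {
      // Serialize the full AST, locations included, and compare the strings.
      const expected = JSON.stringify(referenceParse(source));
      const actual = JSON.stringify(candidateParse(source));
      if (expected !== actual) failures.push({ name, reason: 'output differs' });
    } catch (error) {
      failures.push({ name, reason: `threw: ${error.message}` });
    }
  }
  return { total, passed: total - failures.length, failures };
}
```

Comparing serialized output catches location-only differences as well as structural ones, which matches how the loc mismatches above were found. This simple version also counts a template as a failure when both parsers reject it, so error cases would need separate handling.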

github-actions · 2026-04-14T06:14:23Z

📊 Package size report `-0%↓`

| File | Before (Size / Brotli) | After (Size / Brotli) |
|---|---|---|
| Total (Includes all files) | `17.1 MB` / `3.2 MB` | -0%↓ `17.1 MB` / -0%↓ `3.2 MB` |
| Tarball size | `3.9 MB` | -0%↓ `3.9 MB` |

🤖 This report was automatically generated by pkg-size-action

johanrd added 9 commits March 16, 2026 20:01

p

7bbe305

bench full pipeline

bed7f34

johanrd added 2 commits April 14, 2026 08:31

cl

b519307

Update comment to reflect parser's purpose

0f3c8e3

johanrd force-pushed the perf/handlebars-v2-parser branch 2 times, most recently from 2535bf7 to 2cb61b0 (April 14, 2026 06:53)

johanrd added 3 commits April 14, 2026 08:55

Update file paths to real-world-project

3d682e7

Update stress test project directory path

1b44a54

Update file paths for V2 and V2_SYNTAX

ffa8d9b

johanrd force-pushed the perf/handlebars-v2-parser branch from 2cb61b0 to ffa8d9b (April 14, 2026 06:55)

johanrd mentioned this pull request Apr 14, 2026

[FEATURE rust-parser] Rust/WASM template parser using pest.rs emberjs/ember.js#21313

Closed

7 tasks

johanrd changed the title ~~Perf/handlebars v2 parser~~ POC: Perf/handlebars v2 parser Apr 14, 2026

johanrd wants to merge 14 commits into main from perf/handlebars-v2-parser
