After the lexer is emitted (the token-IR / char-code-scanner issue), the parser layer is still ~6× the official TS parser. The decomposition of the gap (PR #4) is even: lexer ≈ parser layer, each ~6× tsc.
Monogram's non-recursive rules do longest-match — try every alternative, keep the one that consumes most — and the engine backtracks (save/restore position). tsc is predictive: it commits to one production from a small lookahead with minimal backtracking. The parserharness.ts bench file shows the cost — its parser layer is ~6.8× tsc vs ~5.8× on the other three files (the backtracking signature).
Levers
- When an alternative's FIRST set uniquely identifies it, commit to it — skip the longest-match scan of the remaining alternatives.
- Strengthen first-token dispatch so provably-dead alternatives are never entered (extends the existing
canStartFT / FIRST-set pruning).
Honest ceiling
This is partly the price of a non-deterministic, grammar-derived engine vs a hand-coded predictive parser — it likely won't fully close in JS while staying grammar-derived. Beating tsc is the native-target path (the emitted lexer + the target-agnostic emitter issues), where no-GC + value types + char-code scanning raise the ceiling above V8.
After the lexer is emitted (the token-IR / char-code-scanner issue), the parser layer is still ~6× the official TS parser. The decomposition of the gap (PR #4) is even: lexer ≈ parser layer, each ~6× tsc.
Monogram's non-recursive rules do longest-match — try every alternative, keep the one that consumes most — and the engine backtracks (save/restore position). tsc is predictive: it commits to one production from a small lookahead with minimal backtracking. The
parserharness.tsbench file shows the cost — its parser layer is ~6.8× tsc vs ~5.8× on the other three files (the backtracking signature).Levers
canStartFT/ FIRST-set pruning).Honest ceiling
This is partly the price of a non-deterministic, grammar-derived engine vs a hand-coded predictive parser — it likely won't fully close in JS while staying grammar-derived. Beating tsc is the native-target path (the emitted lexer + the target-agnostic emitter issues), where no-GC + value types + char-code scanning raise the ceiling above V8.