Perf: cache newline offsets in Source for O(log n) loc conversion #21314
Merged
NullVoxPopuli merged 2 commits into emberjs:main on Apr 16, 2026
Conversation
Contributor
before we look at this, we first need a mitata benchmark for syntax parse/print <3
Contributor
there is an example of this in the ember-eslint-parser repo
johanrd added a commit to johanrd/ember.js that referenced this pull request on Apr 15, 2026
Reproducible side-by-side benchmark of preprocess, normalize (ASTv1 → ASTv2), and full precompile() against a control worktree. Run via:

pnpm bench:syntax                                            # standalone
pnpm bench:syntax -- --control-dir /path/to/main-checkout

Size ladder matches the PR emberjs#21314 body (462 / 1494 / 4482 / 32868 chars, the last sized to match Discourse's admin-user/index.gjs). Fixtures live under bench/fixtures/; large and extra-large are medium × 3 / × 22. Emits mitata's standard ms/iter + boxplot + summary output, plus a µs/char summary table across the size ladder that makes the O(n²) → O(n) flattening visible at a glance.
johanrd added a commit to johanrd/ember.js that referenced this pull request on Apr 15, 2026
96ca173 to aca513f (Compare)
Contributor
Author
@NullVoxPopuli thanks, added now with an attempt to align with ember-eslint-parser, but please tell me if it should be done differently.
johanrd added a commit to johanrd/ember.js that referenced this pull request on Apr 16, 2026
aca513f to a5c0a3f (Compare)
johanrd added a commit to johanrd/ember.js that referenced this pull request on Apr 16, 2026
a5c0a3f to e0431fc (Compare)
Contributor
thanks for adding the benchmark -- can you PR that separately so I can run it on main when reviewing this PR?
johanrd added a commit to johanrd/ember.js that referenced this pull request on Apr 16, 2026
Contributor
Author
@NullVoxPopuli see #21316
hbsPosFor() and charPosFor() were doing a fresh indexOf('\n') scan of
the source on every call, making each lookup O(lines_until_offset).
These are called once per AST node loc by the ASTv2 normalize pass, so
total cost was effectively O(n²) in template size.
CPU profile of a full precompile() showed hbsPosFor dominating at ~28%
of self-time, scattered across many call sites.
Fix: precompute an array of newline offsets on first use, binary-search
it for conversions. O(log n) per call, O(n) to build once per source.
Impact (Node 24, warmed JIT, full precompile()):
real-world template (1494 chars): 1.34ms -> 1.24ms
large template (3520 chars): 4.53ms -> 3.06ms (32% faster)
The normalize phase specifically (ASTv1 -> ASTv2) drops from ~1.72ms
to ~0.43ms on the large template — a 4x speedup in that phase.
All tests pass.
- charPosFor now returns null for out-of-range lines (lineIdx > newlineOffsets.length) and negative columns, matching its 'number | null' return type. Previously it silently returned column or source.length in those cases.
- Adds direct unit tests for Source.hbsPosFor / charPosFor covering empty source, single-char/newline, exact-newline offsets, column-past-line-end clamping, and negative/out-of-range inputs.
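As a sketch only, the cached-offset approach the commits above describe might look like the following. The field and method names (newlineOffsets, hbsPosFor, charPosFor) follow the commit messages, but the real implementation lives in packages/@glimmer/syntax/lib/source/source.ts and may differ in detail (for instance, this sketch uses 0-based line numbers and lazy table construction).

```typescript
// Hypothetical sketch of the cached-offset technique, not the actual
// glimmer source. One linear pass builds a table of '\n' offsets; each
// conversion then binary-searches it instead of rescanning the string.
class Source {
  private newlineOffsets: number[] | null = null;

  constructor(readonly source: string) {}

  // Build the offset table on first use: O(n), once per Source instance.
  private offsets(): number[] {
    if (this.newlineOffsets === null) {
      const table: number[] = [];
      for (let i = 0; i < this.source.length; i++) {
        if (this.source[i] === '\n') table.push(i);
      }
      this.newlineOffsets = table;
    }
    return this.newlineOffsets;
  }

  // offset -> { line, column }: binary-search for the number of newlines
  // strictly before `offset`. O(log n) per call.
  hbsPosFor(offset: number): { line: number; column: number } | null {
    if (offset < 0 || offset > this.source.length) return null;
    const table = this.offsets();
    let lo = 0;
    let hi = table.length;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (table[mid] < offset) lo = mid + 1;
      else hi = mid;
    }
    // `lo` newlines precede `offset`, so it sits on line `lo` (0-based).
    const lineStart = lo === 0 ? 0 : table[lo - 1] + 1;
    return { line: lo, column: offset - lineStart };
  }

  // { line, column } -> offset. Returns null for out-of-range lines and
  // negative columns, and clamps columns past the end of the line,
  // mirroring the contract described in the commit message above.
  charPosFor(line: number, column: number): number | null {
    const table = this.offsets();
    if (line < 0 || line > table.length || column < 0) return null;
    const lineStart = line === 0 ? 0 : table[line - 1] + 1;
    const lineEnd = line < table.length ? table[line] : this.source.length;
    return Math.min(lineStart + column, lineEnd);
  }
}
```

The round-trip property the original implementation asserts in DEBUG mode, charPosFor(hbsPosFor(o)) === o, holds for this sketch as well, since both directions share the same offset table.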
e0431fc to 4f15abb (Compare)
Contributor
📊 Package size report: 0.01% ↑
NullVoxPopuli approved these changes on Apr 16, 2026
This was referenced Apr 16, 2026
While investigating a faster single-pass parser, Claude found an interesting quick-win perf optimization that is parser-independent.
Co-written by Claude:
Cache newline offsets in Source

A single-file fix to an accidentally-quadratic utility in @glimmer/syntax that dominates compile time on real-world route templates.

The problem

Source.hbsPosFor(offset) converts a char offset to {line, column}. The existing implementation scans source.indexOf('\n') repeatedly from position 0 on every call: O(lines-until-offset) per invocation. charPosFor (the inverse) does the same.

The normalize phase (ASTv1 → ASTv2) calls these once per AST node location. For a template with N nodes and L lines, total work is O(N·L), effectively O(n²) in template size. This kicks in at template sizes typical of a route (1–25k chars), not at inline components.
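For illustration, the repeated-scan shape of the old lookup looks roughly like this. This is a hypothetical reconstruction from the description above, not the actual glimmer code; the function name is invented.

```typescript
// Hypothetical reconstruction of the quadratic pattern described above,
// not the real glimmer implementation. Every call restarts indexOf('\n')
// from position 0, so N lookups over an L-line source cost O(N * L).
function hbsPosForSlow(
  source: string,
  offset: number
): { line: number; column: number } {
  let line = 0;
  let lineStart = 0;
  let nl = source.indexOf('\n');
  // Walk newline by newline until we pass the requested offset.
  while (nl !== -1 && nl < offset) {
    line++;
    lineStart = nl + 1;
    nl = source.indexOf('\n', nl + 1);
  }
  return { line, column: offset - lineStart };
}
```

Each individual call is cheap for early offsets, but the normalize pass asks for positions near the end of the file over and over, so the scans compound.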
The fix

packages/@glimmer/syntax/lib/source/source.ts: precompute an array of '\n' offsets in the Source constructor, and binary-search it for conversions. O(n) build, O(log n) per call.

No API changes. The table is built once per Source instance and reused across every hbsPosFor/charPosFor call on that source.

Impact
Measured via the mitata harness landed separately in #21316. Run pnpm build && pnpm bench:precompile on each branch and diff ms/iter. Apple M1 Max, Node 24.14; control = current main, experiment = this branch. The bench loads dist/prod/ by default; the dev numbers below came from a manual swap to dist/dev/. ember-source's package exports resolve development → dev and production → prod, so both tables reflect builds real consumers actually load (vite dev gets dev and vite build gets prod by default; Vite resolves the condition via NODE_ENV).

Per-char cost (µs/char): the O(n²) → O(n log n) flattening

For the normalize phase alone (ASTv1 → ASTv2, derived as normalize − parse): before, µs/char rises sharply at large, where the O(n²) kicks in; after, it stays essentially flat across this size range. (Per-size table omitted.)

Absolute speedups on large templates
With a large Discourse-scale route template at 33374 chars:

- Prod build (vite build, production deploys): preprocess and full precompile() timings (table omitted).
- Dev build (vite dev, ember serve, local app development): preprocess and full precompile() timings (table omitted).

precompile is measured end-to-end through ember-template-compiler (the user-facing entry, which runs the core compile plus ember-specific AST transform plugins). The dev build's larger speedup reflects that DEBUG-mode assertions remain active throughout the dev compile pipeline; some of them interact with the loc machinery this fix accelerates.

Who benefits
Every consumer of @glimmer/syntax:

- ember-template-compiler.precompile(): the ember-cli / Vite build path, invoked once per .gts/.hbs file.
- .gts files: each edit re-extracts templates via content-tag and calls @glimmer/syntax.preprocess for ASTv1, then walks ASTv1 directly for type extraction. A 33k-char route template's parse step currently burns ~36 ms, mostly re-scanning newlines; after this fix, ~14 ms.

This is pipeline-agnostic: it doesn't matter which parser feeds Source; every compile goes through hbsPosFor/charPosFor.

Testing
- New unit tests in packages/@glimmer/syntax/test/source-boundary-test.ts lock in the hbsPosFor/charPosFor contract at boundary positions: empty source, single char, offset at \n, column past line-end, out-of-range line, negative inputs.
- A DEBUG-mode round-trip assertion in charPosFor (preserved from the original implementation) verifies charPosFor(hbsPosFor(o)) === o.
- bin/precompile.bench.mjs (landed in "add mitata harness for precompile (parse/normalize/precompile)" #21316) produced the numbers above.

Compat

- Source constructor behavior unchanged.
- Memory cost: a number[] of newline offsets per Source instance, typically small (one number per line, a few hundred for route templates). Built once in the constructor: a single linear pass over source.

Scope
- packages/@glimmer/syntax/lib/source/source.ts: the fix (diff +50 −55 lines).
- packages/@glimmer/syntax/test/source-boundary-test.ts: new unit tests for hbsPosFor/charPosFor boundaries.