[codex] Split parser single-branch path #376

Closed
adamziel wants to merge 1 commit into explore/lexing-parsing-10x from explore/parser-rearchitecture

@adamziel
Collaborator

What changed

This stacked draft PR builds on #375 and gives the parser a dedicated fast path for the common case where FIRST-set dispatch resolves a rule to exactly one candidate branch.

  • Splits single-branch parsing from the multi-branch fallback in `WP_Parser::parse_recursive()`.
  • Keeps the existing multi-branch backtracking path for ambiguous dispatch.
  • Changes `WP_Parser_Node::has_child()` from `count() > 0` to `! empty()` for the hot AST completion check.
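
The `has_child()` change in the last bullet can be illustrated with a minimal sketch. `Sketch_Node` and its two method names are hypothetical stand-ins, not the real `WP_Parser_Node`; the sketch assumes `$children` is always an array, which is what makes the two checks interchangeable.

```php
<?php
// Hypothetical stand-in for WP_Parser_Node; only the child check is modeled.
class Sketch_Node {
	/** @var array Always an array, so both checks below agree. */
	private $children = array();

	public function append_child( $child ) {
		$this->children[] = $child;
	}

	// Previous form described in the PR: compare the element count to zero.
	public function has_child_count() {
		return count( $this->children ) > 0;
	}

	// New form described in the PR: empty() is a language construct.
	public function has_child_empty() {
		return ! empty( $this->children );
	}
}

$node = new Sketch_Node();
$before = array( $node->has_child_count(), $node->has_child_empty() );
$node->append_child( 'token' );
$after = array( $node->has_child_count(), $node->has_child_empty() );
```

Both methods return the same boolean for an empty and a non-empty child list, so the swap changes only how the check is evaluated, not what it reports.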

Why

After #375, branch dispatch commonly resolves directly to a single branch. The parser still paid loop/index bookkeeping designed for multiple candidates. This change duplicates a small amount of branch parsing code so the dominant single-candidate path can avoid that work while keeping the generic parser architecture intact.
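
As a rough illustration of the split, here is a toy recursive-descent parser with a single-candidate fast path and a backtracking fallback. It is a sketch under stated assumptions, not the code from this PR: `Sketch_Parser`, its methods, and the toy grammar are all hypothetical, FIRST-set dispatch is computed on the fly rather than precomputed, and the grammar is assumed to have no left recursion and no empty branches.

```php
<?php
// Hypothetical toy parser. Symbols that are keys of $grammar are rules;
// every other symbol is a terminal compared directly against the token.
final class Sketch_Parser {
	private $grammar;
	private $tokens;
	private $pos = 0;

	public function __construct( array $grammar, array $tokens ) {
		$this->grammar = $grammar;
		$this->tokens  = $tokens;
	}

	public function parse( $rule ) {
		$next       = $this->tokens[ $this->pos ] ?? null;
		$candidates = $this->candidates( $rule, $next );

		// Fast path: one candidate, so no loop and no saved position.
		if ( 1 === count( $candidates ) ) {
			return $this->parse_branch( $rule, $candidates[0] );
		}

		// Fallback: try each candidate and rewind the cursor on failure.
		$start = $this->pos;
		foreach ( $candidates as $branch ) {
			$node = $this->parse_branch( $rule, $branch );
			if ( null !== $node ) {
				return $node;
			}
			$this->pos = $start;
		}
		return null;
	}

	// Branches whose first symbol can begin with the lookahead token.
	private function candidates( $rule, $token ) {
		$out = array();
		foreach ( $this->grammar[ $rule ] as $branch ) {
			if ( $this->can_start( $branch[0], $token ) ) {
				$out[] = $branch;
			}
		}
		return $out;
	}

	private function can_start( $symbol, $token ) {
		if ( ! isset( $this->grammar[ $symbol ] ) ) {
			return $symbol === $token;
		}
		foreach ( $this->grammar[ $symbol ] as $branch ) {
			if ( $this->can_start( $branch[0], $token ) ) {
				return true;
			}
		}
		return false;
	}

	private function parse_branch( $rule, array $branch ) {
		$children = array();
		foreach ( $branch as $symbol ) {
			if ( isset( $this->grammar[ $symbol ] ) ) {
				$child = $this->parse( $symbol );
				if ( null === $child ) {
					return null;
				}
				$children[] = $child;
			} elseif ( ( $this->tokens[ $this->pos ] ?? null ) === $symbol ) {
				$children[] = $symbol;
				$this->pos++;
			} else {
				return null;
			}
		}
		return array( $rule => $children );
	}
}

// Toy grammar: 'SELECT' dispatches to one branch, 'DROP' to two.
$grammar = array(
	'stmt' => array(
		array( 'SELECT', 'cols' ),
		array( 'DROP', 'TABLE', 'ID' ),
		array( 'DROP', 'INDEX', 'ID' ),
	),
	'cols' => array(
		array( 'ID', ',', 'cols' ),
		array( 'ID' ),
	),
);

$select = ( new Sketch_Parser( $grammar, array( 'SELECT', 'ID', ',', 'ID' ) ) )->parse( 'stmt' );
$drop   = ( new Sketch_Parser( $grammar, array( 'DROP', 'INDEX', 'ID' ) ) )->parse( 'stmt' );
```

In this sketch a `SELECT` statement reaches `stmt` through the fast path, while `DROP` and `cols` still go through the backtracking loop. The fast path leaves the cursor where the failure occurred and relies on an enclosing fallback loop to rewind it, which is one way the single-candidate case can skip the bookkeeping.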

This is deliberately still compact: it does not generate a large parser or expand the grammar on disk.

Performance

Benchmarks are noisy on this machine, so I compared this branch against the stacked base branch immediately before opening this PR.

The result was roughly a 6% incremental parser improvement on top of #375 in this run.

Parser size constraint

src/parser/*.php plus src/mysql/mysql-grammar.php remains under the requested 200 KB on-disk cap:

  • 95,298 bytes total
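
A byte total like the one above could be reproduced with a short script along these lines. This is only a sketch: `sketch_total_bytes()` is a hypothetical helper, the paths are taken from the description and assumed to be relative to the package root, and reading "200 KB" as 200 * 1024 bytes is an assumption.

```php
<?php
// Hypothetical helper: sums the on-disk size of the given files.
function sketch_total_bytes( array $paths ) {
	$total = 0;
	foreach ( $paths as $path ) {
		$total += filesize( $path );
	}
	return $total;
}

// Paths from the PR description; assumed relative to the package root.
$grammar_file = 'src/mysql/mysql-grammar.php';
$paths        = array_merge(
	glob( 'src/parser/*.php' ) ?: array(),
	file_exists( $grammar_file ) ? array( $grammar_file ) : array()
);

$total = sketch_total_bytes( $paths );
$cap   = 200 * 1024; // Assumption: "200 KB" means 204,800 bytes.

printf(
	"%s bytes total (%s)\n",
	number_format( $total ),
	$total <= $cap ? 'under cap' : 'over cap'
);
```

The reported 95,298 bytes is under the cap whether "200 KB" is read as 200,000 or 204,800 bytes.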

Validation

  • git diff --check
  • php -l on modified parser files
  • composer run test -- --filter 'WP_MySQL_(Lexer|Server_Suite_(Lexer|Parser))|WP_Parser_Node'
    • 143 tests
    • 1,421,037 assertions
  • composer run test
    • 667 tests
    • 1,427,673 assertions
    • 2 skipped, 2 incomplete
  • php packages/mysql-on-sqlite/tests/tools/run-parser-benchmark.php

Notes

I also tested broader parser reshapes before settling on this smaller fast path: inline terminal matching, in-memory single-fragment expansion, lightweight fragment arrays, direct branch-array dispatch, and by-reference parser cursor state. Those were slower in this codebase, so they are not included here.

@adamziel
Collaborator Author

Closing per request; scrapping this parser rearchitecture experiment.

adamziel closed this Apr 27, 2026
JanJakes added a commit that referenced this pull request Apr 28, 2026
`! empty( $this->children )` short-circuits without calling `count()`,
saving one function call per invocation.

Co-authored-by: Adam Zieliński <adam@adamziel.com>

Adapted from #376
JanJakes added a commit that referenced this pull request Apr 29, 2026
`! empty( $this->children )` short-circuits without calling `count()`,
saving one function call per invocation.

Co-authored-by: Adam Zieliński <adam@adamziel.com>

Adapted from #376