Skip to content

Add a hook for native MySQL lexer and parser implementations#384

Merged
adamziel merged 1 commit intotrunkfrom
codex/native-parser-php-facade
Apr 30, 2026
Merged

Add a hook for native MySQL lexer and parser implementations#384
adamziel merged 1 commit intotrunkfrom
codex/native-parser-php-facade

Conversation

@adamziel
Copy link
Copy Markdown
Collaborator

@adamziel adamziel commented Apr 28, 2026

Summary

Add a small entry point in load.php so an optional native (e.g. Rust) MySQL lexer/parser extension can plug itself in as WP_MySQL_Lexer / WP_MySQL_Parser. The extension pre-declares WP_MySQL_Native_Lexer / WP_MySQL_Native_Parser before this file loads; otherwise the existing pure-PHP classes ship as the implementation, unchanged.

The existing PHP files keep their original names, locations, and class names — no rename, no polyfill scaffolding. Native shims live in a separate mysql/native/ directory and are only required when the native classes are present.

Testing

  • php -l on changed files
  • Smoke: WP_MySQL_Lexer and WP_MySQL_Parser instantiate and parse SELECT 1 end-to-end with no native extension loaded
  • phpcs clean

@adamziel adamziel changed the title Add native parser PHP facade Extract MySQL lexer and parser polyfills Apr 28, 2026
Copy link
Copy Markdown
Member

@JanJakes JanJakes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamziel The changes here are clear. I just have two API/naming observations:

  1. The "polyfill" naming feels a bit misleading. In many use cases, and especially in the plugin context, the PHP classes are the source of truth and the only possible implementation.

  2. Related to 1, is the WP_MySQL_Parser extends WP_MySQL_Polyfill_Parser part necessary? Would it be possible to just require the existing classes conditionally in https://github.com/WordPress/sqlite-database-integration/blob/trunk/packages/mysql-on-sqlite/src/load.php without even moving and renaming them:

if ( class_exists( 'WP_MySQL_Native_Parser', false ) ) {
    class WP_MySQL_Parser extends WP_MySQL_Native_Parser {}
} else {
    require_once __DIR__ . '/mysql/class-wp-mysql-parser.php';
}

Or, if both have the same class name, just:

if ( ! class_exists( 'WP_MySQL_Parser', false ) ) {
    require_once __DIR__ . '/mysql/class-wp-mysql-parser.php';
}

That way, the existing code would remain largely as-is, and we'd just add some native extension entrypoint.

Btw. If we need to stick with renames, I don't know why, but GitHub diff doesn't see them as renames. Perhaps that's fixable for better history and blames?

@adamziel adamziel force-pushed the codex/native-parser-php-facade branch from 91e6d80 to 7ad17f2 Compare April 30, 2026 11:52
@adamziel adamziel marked this pull request as ready for review April 30, 2026 11:57
@adamziel adamziel added the enhancement New feature or request label Apr 30, 2026
@adamziel
Copy link
Copy Markdown
Collaborator Author

@JanJakes I like this option:

if ( class_exists( 'WP_MySQL_Native_Parser', false ) ) {
    class WP_MySQL_Parser extends WP_MySQL_Native_Parser {}
} else {
    require_once __DIR__ . '/mysql/class-wp-mysql-parser.php';
}

It's implemented now :) I'll fix PHPCS and we may be good to go here

Add a small entrypoint in load.php that swaps in WP_MySQL_Native_Lexer /
WP_MySQL_Native_Parser when a native extension has pre-declared them, and
otherwise falls back to the pure-PHP implementations shipped in this package.

The PHP lexer and parser stay where they are with their original class
names. No file moves, no polyfill renames. Native extensions plug in by
declaring WP_MySQL_Native_* before this file is loaded.
@adamziel adamziel force-pushed the codex/native-parser-php-facade branch from 7ad17f2 to 97c3ebd Compare April 30, 2026 11:59
Copy link
Copy Markdown
Member

@JanJakes JanJakes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamziel Amazing, thanks!

@adamziel adamziel changed the title Extract MySQL lexer and parser polyfills Add a hook for native MySQL lexer and parser implementations Apr 30, 2026
@adamziel adamziel merged commit ff36738 into trunk Apr 30, 2026
16 checks passed
adamziel added a commit that referenced this pull request Apr 30, 2026
## Summary
- reuse one `WP_MySQL_Parser` instance inside the SQLite driver and
reset its token stream per query
- add `reset_tokens()` to the PHP parser polyfill and the Rust native
parser
- restore native parser-node accessor fast paths in
`WP_MySQL_Native_Parser_Node`, while keeping PHP child materialization
for mutation
- fix the local native extension build helper for Nix/libclang bindgen
by undefining `__SSE2__` during binding generation

## Stack
This is the top PR in the native MySQL lexer/parser stack. The stack is
split so each GitHub diff shows one reviewable concern:

1. [#384 Extract MySQL lexer and parser
polyfills](#384)
   - `trunk` -> `codex/native-parser-php-facade`
   - extraction-only PHP refactor
- moves the existing PHP lexer/parser implementations into polyfill
classes
- keeps public `WP_MySQL_Lexer` and `WP_MySQL_Parser` as thin PHP
subclasses

2. [#385 Add optional native parser
routing](#385)
- `codex/native-parser-php-facade` ->
`codex/native-parser-class-routing`
   - adds fallback `WP_MySQL_Native_*` PHP classes
- routes the public lexer/parser classes through native classes when the
Rust extension provides them
   - adds the minimal PHP grammar-export bridge for the native parser

3. [#386 Add lazy native parser node
facade](#386)
- `codex/native-parser-class-routing` ->
`codex/native-parser-node-facade`
   - keeps `WP_Parser_Node` as the plain PHP tree node
- adds `WP_MySQL_Native_Parser_Node extends WP_Parser_Node` for
native-backed lazy AST nodes
- keeps native AST handles and native accessor delegation out of the
base node class

4. [#381 Add lazy native AST
facade](#381)
   - `codex/native-parser-node-facade` -> `codex/native-lazy-ast-facade`
- implements the Rust lexer/parser extension and lazy native AST facade
   - makes the Rust extension instantiate `WP_MySQL_Native_Parser_Node`
- adds native-extension CI coverage for the SQLite driver and WordPress
PHPUnit tests
   - includes the local SQLite facade smoke benchmark

5. [#387 Cache native grammar on parser grammar
object](#387)
- `codex/native-lazy-ast-facade` ->
`codex/native-parser-object-grammar-cache`
   - restores the object-attached native grammar cache
   - adds only `WP_Parser_Grammar::$native_grammar` on the PHP side
- removes the Rust content-hash cache that walked the whole exported
grammar on every parser construction

6. This PR, [#388 Speed up native AST
materialization](#388)
- `codex/native-parser-object-grammar-cache` ->
`codex/native-parser-bulk-materialization`
- optimizes native-to-PHP AST access after the grammar-cache performance
restoration
- reuses the SQLite driver's parser instance instead of constructing it
per query

## Why
The native lexer/parser itself is fast, but the PHP-facing path can lose
that benefit if each query repeatedly rebuilds native parser state or
forces full PHP AST materialization. On the current stack, #387 already
removes the large grammar export/hash cost. This PR removes the
remaining per-query parser construction churn and restores the native
AST accessor path for descendant-heavy SQLite driver workloads.

## Measurements
Environment: local PHP 8.2 via the native build helper, release Rust
extension, current top of this PR.

Focused constructor/reset benchmark over 5000 unique SELECT queries:

| Phase | Time |
| --- | ---: |
| native tokenize | 22.62 us/query |
| fresh native parser constructor only | 2.31 us/query |
| reusable parser `reset_tokens()` only | 0.32 us/query |
| reusable parser reset + parse + `get_descendants()` | 157.06 us/query
|
| constructor/reset ratio | 7.3x |

The previously reported ~622 us/query constructor cost does not
reproduce on this stack because #387 already caches the native grammar
on the PHP grammar object. Parser reuse still removes most of the
remaining constructor overhead.

SQLite facade smoke workload:

Command:

```bash
TMP_TEST_NATIVE_QUERY_COUNT=250 ./tmp-test-native/run.sh
```

| Workload | PHP fallback | Native extension | Speedup |
| --- | ---: | ---: | ---: |
| 250 generated queries, including 1 x 2000-row insert | 4.060s | 0.525s
| 7.73x |

## Testing
- `cargo fmt --check`
- `git diff --check`
- `composer run check-cs`
- `composer run test` from `packages/mysql-on-sqlite`
- `php -d
extension=packages/mysql-on-sqlite/ext/wp-mysql-parser/target/release/libwp_mysql_parser.so
packages/mysql-on-sqlite/vendor/bin/phpunit -c
packages/mysql-on-sqlite/phpunit.xml.dist`
- `TMP_TEST_NATIVE_QUERY_COUNT=250 ./tmp-test-native/run.sh`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants