Add a hook for native MySQL lexer and parser implementations#384
Add a hook for native MySQL lexer and parser implementations#384
Conversation
JanJakes
left a comment
There was a problem hiding this comment.
@adamziel The changes here are clear. I just have two API/naming observations:
-
The "polyfill" naming feels a bit misleading. In many use cases, and especially in the plugin context, the PHP classes are the source of truth and the only possible implementation.
-
Related to 1, is the
WP_MySQL_Parser extends WP_MySQL_Polyfill_Parserpart necessary? Would it be possible to just require the existing classes conditionally in https://github.com/WordPress/sqlite-database-integration/blob/trunk/packages/mysql-on-sqlite/src/load.php without even moving and renaming them:
if ( class_exists( 'WP_MySQL_Native_Parser', false ) ) {
class WP_MySQL_Parser extends WP_MySQL_Native_Parser {}
} else {
require_once __DIR__ . '/mysql/class-wp-mysql-parser.php';
}Or, if both have the same class name, just:
if ( ! class_exists( 'WP_MySQL_Parser', false ) ) {
require_once __DIR__ . '/mysql/class-wp-mysql-parser.php';
}That way, the existing code would remain largely as-is, and we'd just add some native extension entrypoint.
Btw. If we need to stick with renames, I don't know why, but GitHub diff doesn't see them as renames. Perhaps that's fixable for better history and blames?
91e6d80 to
7ad17f2
Compare
|
@JanJakes I like this option: if ( class_exists( 'WP_MySQL_Native_Parser', false ) ) {
class WP_MySQL_Parser extends WP_MySQL_Native_Parser {}
} else {
require_once __DIR__ . '/mysql/class-wp-mysql-parser.php';
}It's implemented now :) I'll fix PHPCS and we may be good to go here |
Add a small entrypoint in load.php that swaps in WP_MySQL_Native_Lexer / WP_MySQL_Native_Parser when a native extension has pre-declared them, and otherwise falls back to the pure-PHP implementations shipped in this package. The PHP lexer and parser stay where they are with their original class names. No file moves, no polyfill renames. Native extensions plug in by declaring WP_MySQL_Native_* before this file is loaded.
7ad17f2 to
97c3ebd
Compare
## Summary - reuse one `WP_MySQL_Parser` instance inside the SQLite driver and reset its token stream per query - add `reset_tokens()` to the PHP parser polyfill and the Rust native parser - restore native parser-node accessor fast paths in `WP_MySQL_Native_Parser_Node`, while keeping PHP child materialization for mutation - fix the local native extension build helper for Nix/libclang bindgen by undefining `__SSE2__` during binding generation ## Stack This is the top PR in the native MySQL lexer/parser stack. The stack is split so each GitHub diff shows one reviewable concern: 1. [#384 Extract MySQL lexer and parser polyfills](#384) - `trunk` -> `codex/native-parser-php-facade` - extraction-only PHP refactor - moves the existing PHP lexer/parser implementations into polyfill classes - keeps public `WP_MySQL_Lexer` and `WP_MySQL_Parser` as thin PHP subclasses 2. [#385 Add optional native parser routing](#385) - `codex/native-parser-php-facade` -> `codex/native-parser-class-routing` - adds fallback `WP_MySQL_Native_*` PHP classes - routes the public lexer/parser classes through native classes when the Rust extension provides them - adds the minimal PHP grammar-export bridge for the native parser 3. [#386 Add lazy native parser node facade](#386) - `codex/native-parser-class-routing` -> `codex/native-parser-node-facade` - keeps `WP_Parser_Node` as the plain PHP tree node - adds `WP_MySQL_Native_Parser_Node extends WP_Parser_Node` for native-backed lazy AST nodes - keeps native AST handles and native accessor delegation out of the base node class 4. [#381 Add lazy native AST facade](#381) - `codex/native-parser-node-facade` -> `codex/native-lazy-ast-facade` - implements the Rust lexer/parser extension and lazy native AST facade - makes the Rust extension instantiate `WP_MySQL_Native_Parser_Node` - adds native-extension CI coverage for the SQLite driver and WordPress PHPUnit tests - includes the local SQLite facade smoke benchmark 5. [#387 Cache native grammar on parser grammar object](#387) - `codex/native-lazy-ast-facade` -> `codex/native-parser-object-grammar-cache` - restores the object-attached native grammar cache - adds only `WP_Parser_Grammar::$native_grammar` on the PHP side - removes the Rust content-hash cache that walked the whole exported grammar on every parser construction 6. This PR, [#388 Speed up native AST materialization](#388) - `codex/native-parser-object-grammar-cache` -> `codex/native-parser-bulk-materialization` - optimizes native-to-PHP AST access after the grammar-cache performance restoration - reuses the SQLite driver's parser instance instead of constructing it per query ## Why The native lexer/parser itself is fast, but the PHP-facing path can lose that benefit if each query repeatedly rebuilds native parser state or forces full PHP AST materialization. On the current stack, #387 already removes the large grammar export/hash cost. This PR removes the remaining per-query parser construction churn and restores the native AST accessor path for descendant-heavy SQLite driver workloads. ## Measurements Environment: local PHP 8.2 via the native build helper, release Rust extension, current top of this PR. Focused constructor/reset benchmark over 5000 unique SELECT queries: | Phase | Time | | --- | ---: | | native tokenize | 22.62 us/query | | fresh native parser constructor only | 2.31 us/query | | reusable parser `reset_tokens()` only | 0.32 us/query | | reusable parser reset + parse + `get_descendants()` | 157.06 us/query | | constructor/reset ratio | 7.3x | The previously reported ~622 us/query constructor cost does not reproduce on this stack because #387 already caches the native grammar on the PHP grammar object. Parser reuse still removes most of the remaining constructor overhead. SQLite facade smoke workload: Command: ```bash TMP_TEST_NATIVE_QUERY_COUNT=250 ./tmp-test-native/run.sh ``` | Workload | PHP fallback | Native extension | Speedup | | --- | ---: | ---: | ---: | | 250 generated queries, including 1 x 2000-row insert | 4.060s | 0.525s | 7.73x | ## Testing - `cargo fmt --check` - `git diff --check` - `composer run check-cs` - `composer run test` from `packages/mysql-on-sqlite` - `php -d extension=packages/mysql-on-sqlite/ext/wp-mysql-parser/target/release/libwp_mysql_parser.so packages/mysql-on-sqlite/vendor/bin/phpunit -c packages/mysql-on-sqlite/phpunit.xml.dist` - `TMP_TEST_NATIVE_QUERY_COUNT=250 ./tmp-test-native/run.sh`
Summary
Add a small entry point in
load.phpso an optional native (e.g. Rust) MySQL lexer/parser extension can plug itself in asWP_MySQL_Lexer/WP_MySQL_Parser. The extension pre-declaresWP_MySQL_Native_Lexer/WP_MySQL_Native_Parserbefore this file loads; otherwise the existing pure-PHP classes ship as the implementation, unchanged.The existing PHP files keep their original names, locations, and class names — no rename, no polyfill scaffolding. Native shims live in a separate
mysql/native/directory and are only required when the native classes are present.Testing
php -lon changed filesWP_MySQL_LexerandWP_MySQL_Parserinstantiate and parseSELECT 1end-to-end with no native extension loadedphpcsclean