Skip to content

Conversation

adamziel
Copy link
Collaborator

@adamziel adamziel commented Oct 9, 2025

Proposes a WP_MySQL_Naive_Query_Stream to enable stream-processing large SQL files one query at a time without running out of memory.

Usage:

$stream = new WP_MySQL_Naive_Query_Stream();
$stream->append_sql( 'SELECT id FROM users; SELECT * FROM posts;' );
while ( $stream->next_query() ) {
    $sql_string = $stream->get_query();
    // Process the query.
 }

$stream->append_sql( 'CREATE TABLE users (id INT, name VARCHAR(255));' );
while ( $stream->next_query() ) {
     $sql_string = $stream->get_query();
     // Process the query.
}

$stream->mark_input_complete();
$stream->next_query(); // returns false

This class is naive because it doesn't understand what a valid query is.

We assume an invalid query if we can't get the next token and the input source is already exhausted or we have over 2MB of buffered SQL. We can't do better until the lexer provides an explicit distinction between syntax errors and incomplete input. I expect this heuristic to be sufficient in many scenarios, but it will of course fail in pathological cases such as SELECT SELECT SELECT ... without any semicolons.

Related to Automattic/wp-cli-sqlite-command#13

Remaining work

Review this PR, reformat code, add some more comments.

cc @JanJakes @sejas

@adamziel adamziel marked this pull request as draft October 9, 2025 15:21
@JanJakes
Copy link
Member

JanJakes commented Oct 9, 2025

@adamziel Thanks for sharing the draft and APIs!

This class is naive because it doesn't understand what a valid query is.
We can't do better until the lexer provides an explicit distinction between syntax errors and incomplete input.

Interesting! I was only thinking of implementing the full support in the lexer. I don't think it's very hard, but probably not very easy either.

That said, do you think it makes sense to get it in in the "naive" form? With the quick fixes I did today on the WP CLI SQLite side, covered with tests, it supports everything apart from NO_BACKSLASH_ESCAPES and DELIMITER .... We have no other use case yet, so it's likely not super urgent, but eventually necessary for sure.

@adamziel
Copy link
Collaborator Author

adamziel commented Oct 9, 2025

Yeah if there isn't a use-case, I think it's fine for this to sit here until one emerges. It likely will as a part of the streaming importer work, and we may need to explore a non-naive query stream implementation for that.

@JanJakes
Copy link
Member

@adamziel If we run into another issue with the current WP SQLite CLI parsing, I'll definitely use this in some form. Also, when moving the CLI commands to the SQLite repo, it could make sense to give this a try.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants