[YSQL] Server crash with transparent retries when executing multiple statements in ysqlsh command mode #21361
Comments
Thanks @pao214 for raising this! There are 2 bugs here: First, the The fatal isn't expected because of the following reasoning: we don't register an internal savepoint for single statement transactions i.e., those outside a transaction block in RC. RC retries such single-statement transactions by just restarting and retrying the whole transaction instead of rolling back that statement and redoing it. In other words, single statement transactions are retried by the Even if the above mentioned issue with |
RCA: A few facts:
After the above, a select on test2 doesn't return (6, 6), because both statements were considered part of the same txn. However, the following results in (6, 6) to be inserted even though the 2nd insert faces a duplicate key error:
Given this, we have the following issues in RC: Top-level query layer retries for multi-statement queries should not be performed because it will result in redoing all the queries from the start. If there are no intervening "commit;" statements this is okay since it would be one implicit transaction. Otherwise, it can result in wrong results because the previous statements which have committed will also be re-executed. So, we should block query layer retries for multi-statement queries. In future, we can try to implement retries for multi-statement queries. This will require implementing the retry logic per statement within exec_simple_query similar to |
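The hazard described in the RCA above can be illustrated with a toy model. This is only a sketch under stated assumptions, not YugabyteDB code: `ToyDB`, `ConflictError`, `conflicting_once`, and `run_with_whole_query_retry` are hypothetical names, and the "database" is a single counter. It shows that replaying the whole statement list is harmless when everything runs in one implicit transaction, but re-applies already committed work when a commit sits in the middle of the list.

```python
class ConflictError(Exception):
    """Stands in for a kConflict surfaced by a statement (hypothetical name)."""


class ToyDB:
    """Hypothetical single-counter store with committed and pending state."""

    def __init__(self):
        self.committed = 0
        self.pending = None  # uncommitted value, or None when nothing is open

    def increment(self):
        base = self.committed if self.pending is None else self.pending
        self.pending = base + 1

    def commit(self):
        if self.pending is not None:
            self.committed = self.pending
            self.pending = None

    def rollback(self):
        self.pending = None


def conflicting_once(stmt):
    """Wrap a statement so that its first execution raises ConflictError."""
    state = {"raised": False}

    def wrapped():
        if not state["raised"]:
            state["raised"] = True
            raise ConflictError()
        stmt()

    return wrapped


def run_with_whole_query_retry(db, statements, max_attempts=2):
    """Replay the whole statement list on conflict; return attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            for stmt in statements:
                stmt()
            db.commit()  # implicit commit at the end of the query message
            return attempt
        except ConflictError:
            db.rollback()  # undoes only work that has not been committed yet
    raise ConflictError()


def demo(with_intermediate_commit):
    """Two increments, the second conflicting once; return the final value."""
    db = ToyDB()
    statements = [db.increment]
    if with_intermediate_commit:
        statements.append(db.commit)
    statements.append(conflicting_once(db.increment))
    run_with_whole_query_retry(db, statements)
    return db.committed
```

In this model `demo(False)` ends with the counter at 2, as if each increment ran exactly once, while `demo(True)` ends at 3 because the first increment was committed before the conflict and then executed again by the retry.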
Summary:
Query layer retries should be blocked for multi-statement queries. This is because they might have transaction blocks within them that had committed before a statement faces a kConflict, and a query layer retry would redo the whole query and hence the transaction block too, which might not be idempotent. Something like the below where the second update faces a kConflict:
"begin ... repeatable read; update ...; commit; update ...;"
This diff also does cosmetic changes to log messages related to query layer retries and error messages that are sent back to the external client.
Jira: DB-10258
Test Plan: Jenkins
Added yb_query_layer_retries_for_a_multi_statement_query to the yb_pg_isolation_schedule
Reviewers: tfoucher, patnaik.balivada, ishan.chhangani, shubhankar.shastri
Reviewed By: tfoucher
Subscribers: yql
Differential Revision: https://phorge.dev.yugabyte.com/D33860
…tatement queries
Summary:
Query layer retries should be blocked for multi-statement queries. This is because they might have transaction blocks within them that had committed before a statement faces a kConflict, and a query layer retry would redo the whole query and hence the transaction block too, which might not be idempotent. Something like the below where the second update faces a kConflict:
"begin ... repeatable read; update ...; commit; update ...;"
This diff also does cosmetic changes to log messages related to query layer retries and error messages that are sent back to the external client.
Jira: DB-10258
Original commit: b72433e / D33860
Test Plan: Jenkins
Added yb_query_layer_retries_for_a_multi_statement_query to the yb_pg_isolation_schedule
Reviewers: tfoucher, patnaik.balivada
Reviewed By: patnaik.balivada
Subscribers: yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D34633
Jira Link: DB-10258
Description
The query layer supports multiple statements in one query message when using the simple query protocol. See https://www.postgresql.org/docs/11/protocol-flow.html#PROTOCOL-FLOW-MULTI-STATEMENT for more information.
We can trigger this mode using ./bin/ysqlsh -c "<... specify your ; separated commands ...>"
These commands are executed within an implicit transaction block, i.e. the expectation is that the commands are executed atomically. However, the commands may contain transaction control statements such as COMMIT. The implementation does not play well with the internal subtransaction mechanism used by READ COMMITTED.
Issue
Let's take a look at an example scenario where this is problematic.
Setup
Now, we execute concurrent conflicting transactions.
Expected Behavior
Notice that there is a transaction conflict at SELECT FOR UPDATE when session 1 commits. We hope to retry this statement transparently and return ten 1s.
Potential Cause
The server crash happens in debug mode because of an assert statement in our code that checks whether or not we are within the internal subtransaction started by READ COMMITTED (for the purposes of transparent restarts). Naturally, we suspect that the COMMIT command pops the internal subtxn. Moreover, we suspect that a new subtxn is not created for each statement in the batch but only at the start of the batch. This leaves us without an internal subtransaction when trying to roll back the statement for a subsequent retry.
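The suspicion above can be sketched with a toy subtransaction stack. This is a hypothetical model, not the actual server code: `ToyTxn`, `run_batch`, and the `INTERNAL` marker are illustrative names, and the statement strings are placeholders. It compares pushing the internal savepoint once per batch (the suspected behavior) with pushing it once per statement.

```python
INTERNAL = "internal_rc_savepoint"  # hypothetical marker for the RC subtxn


class ToyTxn:
    """Toy model of the subtransaction stack kept for transparent retries."""

    def __init__(self, savepoint_per_statement):
        self.savepoint_per_statement = savepoint_per_statement
        self.subtxns = []

    def start_batch(self):
        # Suspected behavior: one internal savepoint for the whole batch.
        if not self.savepoint_per_statement:
            self.subtxns.append(INTERNAL)

    def start_statement(self):
        # Alternative: a fresh internal savepoint before every statement.
        if self.savepoint_per_statement:
            self.subtxns.append(INTERNAL)

    def finish_statement(self):
        # Release the per-statement savepoint unless COMMIT already removed it.
        if self.savepoint_per_statement and self.subtxns:
            self.subtxns.pop()

    def commit(self):
        # COMMIT ends the transaction and with it every open subtransaction.
        self.subtxns.clear()

    def rollback_statement_for_retry(self):
        # Mirrors the debug-mode check: we must be inside the internal subtxn.
        if not self.subtxns or self.subtxns[-1] != INTERNAL:
            raise AssertionError("not inside the internal subtransaction")


def run_batch(txn, statements, conflict_at):
    """Run the batch; the statement at index conflict_at conflicts once.

    Returns the number of statement-level retries that were possible.
    """
    txn.start_batch()
    retries = 0
    for index, stmt in enumerate(statements):
        txn.start_statement()
        if stmt == "COMMIT":
            txn.commit()
        elif index == conflict_at:
            txn.rollback_statement_for_retry()
            retries += 1
        txn.finish_statement()
    return retries
```

With a per-batch savepoint, `run_batch(ToyTxn(False), ["UPDATE", "COMMIT", "SELECT FOR UPDATE"], 2)` raises `AssertionError` in this model, because COMMIT removed the only internal savepoint. With `ToyTxn(True)` the same batch retries the conflicting statement once.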
Other Considerations
Please look out for issue #21297 since the READ COMMITTED mode there fails for a similar reason. Our issue executes batches using the simple query protocol while the related issue #21297 uses the extended query protocol. So, instead of missed updates, we should look out for duplicate updates. Finally, it would be good to verify transparent retries in REPEATABLE READ isolation as well.
Issue Type
kind/bug