-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(spanner): inline begin transaction for ReadWriteTransactions #7149
Conversation
256de2a
to
5c184f0
Compare
765f703
to
a7410ac
Compare
da2d8b5
to
c94939b
Compare
@@ -268,7 +268,6 @@ func TestReadWriteTransaction_ErrorReturned(t *testing.T) { | |||
requests := drainRequestsFromServer(server.TestSpanner) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add test cases for known corner cases with InlineBeginTransaction, so at least for the following cases:
- Verify that everything works as expected if a read/write transaction executes two queries in parallel as the first statement. Only one should include a
Begin
, and the other should block until the first has returned a transaction. - What happens if the first statement in a transaction returns an error?
- What happens if the first statement in a transaction is a BatchDML, and the first N statements succeed, but statement N+1 fails?
- What happens if the first statement in a transaction is a Read (not Query)?
- What happens if the first statement in a transaction is a query or read operation that initially succeeds, but then later fails with an
Unavailable
error while consuming the stream ofPartialResultSet
s? The request will be retried, but it should not include aBeginTransaction
option, as a transaction was already returned by the initial request.
…allel as the first statement, transactionID used should be same and the other should block until the first has returned a transactionID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a big change. Generally, it looks good to me. My major concern is that do we understand what is the performance impact with this change? Do we have any plan to run benchmarks to compare with or without this change?
} | ||
} | ||
|
||
func TestClient_ReadWriteTransaction_BatchDmlWithErrorOnSecondStatement(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also verify the correct behavior if the first statement is invalid and the Batch DML request is the first request in the transaction. In that case the transaction should also be restarted with an explicit BeginTransaction RPC. Otherwise, the Batch DML request won't be part of the transaction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack, added the test case.
@rahul2393 It seems that we are not adding any tests for |
} | ||
} | ||
|
||
func TestClient_ReadWriteTransaction_BatchDmlWithErrorOnFirstStatement(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is good (although an explicit BeginTransaction RPC is actually not needed in these cases). My question is actually if the error is not retryable?
So you have the following (naive) transaction (pseudo code):
_, err := tx.update("insert into foo (id, value) values (1, 'one')");
if err.Code == Code.AlreadyExists {
tx.update("update foo set value='one' where id=1");
}
In this case the initial insert statement will not return a transaction id if it fails, but it is important that the statement is part of the entire transaction that we execute, as the returned error code actually tells something about the state of the database. So in this case we also need to restart the transaction with an explicit BeginTransaction.
@olavloite added the test case for this case.
Thanks
@olavloite We are not changing the StmtBasedTransaction with this change, because we expect customers to do error handling and retry // management, so if we do inline begin & retries it will be confusing to them. |
Thanks @codyoss , We run tests using -race flag in Kokoro already https://github.com/googleapis/google-cloud-go/blob/main/internal/kokoro/continuous.sh#L87 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks mostly good to me. I still have a couple of small nits/questions. Please take a look at those, but I think that those will be my last comments on this PR.
spanner/client_test.go
Outdated
var attempts int | ||
_, err := client.ReadWriteTransaction(ctx, func(ctx context.Context, tx *ReadWriteTransaction) error { | ||
attempts++ | ||
if attempts > 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this will be confusing to someone who sees this code in a year from now without any context (and I'm also not 100% sure that it works as expected if the error remains). We should not remove the error from the mock server before the retry. Instead, we should just catch and 'accept' the error. So the test transaction should be something like (and hence not remove the error on the second attempt):
updateCounts, err := tx.BatchUpdate(ctx, []Statement{{SQL: invalidStatement}, {SQL: UpdateBarSetFoo}})
if err != nil {
// We know that this statement can fail, but it is acceptable for this transaction,
// so we just continue with the next statement.
}
if _, err := tx.Update(ctx, Statement{SQL: UpdateBarSetFoo}); err != nil {
return err
}
return nil
@@ -914,6 +948,15 @@ func (s *inMemSpannerServer) ExecuteBatchDml(ctx context.Context, req *spannerpb | |||
if err != nil { | |||
return nil, err | |||
} | |||
if _, ok := req.GetTransaction().GetSelector().(*spannerpb.TransactionSelector_Begin); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this also use the helper method getResultSetWithTransactionSet
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Description
The Spanner client library will normally prepare a fraction of the sessions in the session pool with a read/write transaction. These sessions are used when an application requests a read/write transaction. This eliminates the need for a BeginTransaction RPC to be executed as part of a transaction, as the call has already been executed in advance by the session pool.
This can however be less efficient in some specific cases:
Cloud Spanner also supports starting a new transaction as part of a query or DML statement. This eliminates the need for a separate round-trip to the server for starting a transaction. This PR adds the possibility to instruct the Spanner client to start read/write transactions by including this option in the first query/update statement of a read/write transaction, instead of preparing the read/write transactions in the session pool. This can improve the overall read/write transaction execution performance.
Including the
BeginTransaction
option with the first query or update statement of a transaction does require the client to block any other statement on the same transaction until the first statement has finished. This means that this strategy could perform less well in case of a read/write transaction that includes multiple parallel reads and/or writes.With this PR we are going to make Inline begin transaction as default for ReadWriteTransactions