kvcoord: assert that the txn is clean after commit by andreimatei · Pull Request #46399 · cockroachdb/cockroach

andreimatei · 2020-03-21T18:17:17Z

This patch adds a couple of assertions about the state in which a txn
proto can be after a successful commit (explicit commit or implicit).
The assertions are about how the multiple updates to a txn done by
sub-batches of the commit batch (as split by the DistSender) must have
been collapsed in a sane way when merging their responses together.

I spent the morning thinking that we have a problem with the WriteTooOld
flag being set on the result for an implicitly-committed txn. The
scenario that I was worried about is that the
EndTxn(CanCommitAtHigherTimestamp:true) batch gets split
into two, and both halves get pushed. The EndTxn half gets pushed higher
than the other half (otherwise, we wouldn't end up in an implicit commit
state), but it performs a server-side refresh and succeeds. I was
worried that the other half can still return a WriteTooOld flag, which
would make it into the combined response, and would trick the span
refresher into retrying. This would be bad, since retrying after an
implicit commit is problematic.
However, I believe this situation is not possible. We're saved by the
fact that combining the transaction updates (Transaction.Update()) is
smart enough to not treat the WriteTooOld flag as cummulative; instead,
the response with the highest read timestamp gets to dictate the wto,
and the EndTxn response should not have it set.

Release note: None
Release justification: Only an assertion. It would be indicative of a
dangerous bug if it fires.

cockroach-teamcity · 2020-03-21T18:17:24Z

This change is

nvb

LGTM once the one comment is addressed. However, could you run a set of the following three roachtests 3x each before merging this:

tpcc/mixed-headroom/n5cpu16
version/mixed/nodes=3
version/mixed/nodes=5

Reviewed 1 of 1 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)

pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go, line 227 at r1 (raw file):

	// highest read timestamp, the response from the EndTxn sub-batch gets to
	// dictate the wto flag of the combined response (see Transaction.Update()).
	if br.Txn.WriteTooOld {

Why do we have the same assertions twice? Why don't we just use txn.Status in assertTxnCleanWhenCommitted and call that from here?

This patch adds a couple of assertions about the state in which a txn proto can be after a successful commit (explicit commit or implicit). The assertions are about how the multiple updates to a txn done by sub-batches of the commit batch (as split by the DistSender) must have been collapsed in a sane way when merging their responses together. I spent the morning thinking that we have a problem with the WriteTooOld flag being set on the result for an implicitly-committed txn. The scenario that I was worried about is that the EndTxn(CanCommitAtHigherTimestamp:true) batch gets split into two, and both halves get pushed. The EndTxn half gets pushed higher than the other half (otherwise, we wouldn't end up in an implicit commit state), but it performs a server-side refresh and succeeds. I was worried that the other half can still return a WriteTooOld flag, which would make it into the combined response, and would trick the span refresher into retrying. This would be bad, since retrying after an implicit commit is problematic. However, I believe this situation is not possible. We're saved by the fact that combining the transaction updates (Transaction.Update()) is smart enough to not treat the WriteTooOld flag as cummulative; instead, the response with the highest read timestamp gets to dictate the wto, and the EndTxn response should not have it set. Release note: None Release justification: Only an assertion. It would be indicative of a dangerous bug if it fires.

andreimatei requested a review from nvb March 21, 2020 18:17

nvb approved these changes Mar 23, 2020

View reviewed changes

andreimatei force-pushed the txn.assert-clean-after-commit branch from de7921a to 3c9e2c8 Compare March 26, 2020 19:34

tbg added the X-noremind Bots won't notify about PRs with X-noremind label May 6, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

kvcoord: assert that the txn is clean after commit#46399

kvcoord: assert that the txn is clean after commit#46399
andreimatei wants to merge 1 commit intocockroachdb:masterfrom
andreimatei:txn.assert-clean-after-commit

andreimatei commented Mar 21, 2020

Uh oh!

cockroach-teamcity commented Mar 21, 2020

Uh oh!

nvb left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

andreimatei commented Mar 21, 2020

Uh oh!

cockroach-teamcity commented Mar 21, 2020

Uh oh!

nvb left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants