@stevendanna
Collaborator

When an EndTxn(abort) encounters a transaction record in STAGING, it returns an IndeterminateCommitError so that the transaction recovery process can be started. When constructing this error, we previously returned a copy of the staging transaction whose write timestamp had been forwarded to that of the txn found on the request header.

According to the following comment, this forwarding is rather important:

// We're using existingTxn on the reply, although it can be stale
// compared to the Transaction in the request (e.g. the Sequence,
// and various timestamps). We must be careful to update it with the
// supplied ba.Txn if we return it with an error which might be
// retried, as for example to avoid client-side serializable restart.

However, a side effect is that the txnrecovery performed in response to an IndeterminateCommitError could erroneously find that the transaction was committed, because it was checking the implicit commit criteria using a higher timestamp. The actual recovery of the transaction would then hit the following assertion:

programming error: timestamp change by implicitly committed transaction
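
To make the failure mode concrete, here is a minimal sketch of the implicit commit check, assuming a heavily simplified model. The `Timestamp` and `Txn` types and the `implicitlyCommitted` helper are hypothetical and are not the real kvserver or txnrecovery API; the only point is that checking against a forwarded write timestamp can accept an intent that the staging record itself would not.

```go
package main

import "fmt"

// Timestamp is a simplified stand-in for an HLC timestamp (assumption).
type Timestamp int64

// Txn is a hypothetical, pared-down transaction record.
type Txn struct {
	WriteTimestamp Timestamp
	InFlightWrites []string
}

// implicitlyCommitted reports whether every in-flight write has an intent
// at or below the write timestamp of the txn used for the check.
func implicitlyCommitted(txn Txn, intents map[string]Timestamp) bool {
	for _, key := range txn.InFlightWrites {
		ts, ok := intents[key]
		if !ok || ts > txn.WriteTimestamp {
			return false
		}
	}
	return true
}

func main() {
	// The staging record was written at timestamp 10, but the intent on
	// "b" was laid down later at timestamp 15.
	staged := Txn{WriteTimestamp: 10, InFlightWrites: []string{"a", "b"}}
	intents := map[string]Timestamp{"a": 10, "b": 15}

	// A copy of the record with its write timestamp forwarded to 15.
	forwarded := staged
	forwarded.WriteTimestamp = 15

	fmt.Println(implicitlyCommitted(staged, intents))    // false
	fmt.Println(implicitlyCommitted(forwarded, intents)) // true
}
```

In this model the check with the forwarded copy reports an implicit commit at a timestamp that differs from the staging record, which is the mismatch the assertion guards against.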

Here, we solve this by returning the unmodified txn in the IndeterminateCommitError. While the comment warns us against this, I think that in this case the transaction has nothing (other than the EndTxn) to retry, since we are rolling back. Further, the error is handled server-side immediately, and the user then receives either the error from the recovery process or the error from the retried request.
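
As a rough illustration of the difference between the two behaviors, here is a sketch that assumes a simplified `Txn` type. The helpers `forwardedCopy` and `unmodifiedRecord` are hypothetical names used only for this example and do not appear in the change itself.

```go
package main

import "fmt"

// Txn is a hypothetical, pared-down transaction (assumption).
type Txn struct {
	ID             string
	WriteTimestamp int64
}

// forwardedCopy models the previous behavior: a copy of the staging record
// whose write timestamp is forwarded to the request header's timestamp.
func forwardedCopy(existing, header Txn) Txn {
	out := existing
	if header.WriteTimestamp > out.WriteTimestamp {
		out.WriteTimestamp = header.WriteTimestamp
	}
	return out
}

// unmodifiedRecord models the new behavior: the staging record is returned
// as found, and the request header is ignored.
func unmodifiedRecord(existing, _ Txn) Txn {
	return existing
}

func main() {
	existing := Txn{ID: "t1", WriteTimestamp: 10} // staging record on disk
	header := Txn{ID: "t1", WriteTimestamp: 15}   // txn on the request header

	fmt.Println(forwardedCopy(existing, header).WriteTimestamp)    // 15
	fmt.Println(unmodifiedRecord(existing, header).WriteTimestamp) // 10
}
```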

Fixes #156698

Release note (bug fix): Fix a bug that could result in a transaction encountering the following assertion failure:

failed indeterminate commit recovery: programming error: timestamp change by
implicitly committed transaction

@stevendanna stevendanna requested a review from a team as a code owner November 2, 2025 02:03
@miraradeva
Contributor

Looks reasonable to me but I'll defer the approval to @arulajmani.

We see this programming error failure only with serializable transactions, right? Does this new error handling also affect the weak-isolation issue we've seen in kvnemesis of the type "transaction found to be already committed" (e.g. #154510)? It seems like the new check for the existence of the txn record could help detect that case too?

@stevendanna
Collaborator Author

stevendanna commented Nov 3, 2025

@miraradeva I think this particular bug can happen at any isolation level.

I don't think that this will fix the weak isolation level issues we've seen since in those cases we've written the transaction record already. However, this fix is required for our proposed fix of running span-refresher retries at a higher timestamp.

Collaborator

@arulajmani arulajmani left a comment

:lgtm: modulo comments.

@arulajmani reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @miraradeva)


pkg/kv/kvserver/batcheval/cmd_end_transaction.go line 506 at r1 (raw file):

			// actually be in a state that _would have_ been implicitly committed IF
			// it had been able to write a transaction record with this new state.
			// However, here we are aborting the transaction and don't want to end up

Should we drop the "here we are aborting the transaction" bit? It's slightly misleading, because we've tried to abort the transaction and realized we need to perform transaction recovery. The final status of the transaction is contingent on the result of this.


pkg/kv/kvserver/batcheval/cmd_end_transaction.go line 513 at r1 (raw file):

				txnForError = existingTxn
			} else {
				txnForError = *reply.Txn

Is it possible for us to hit this branch? My read is that reply.Txn is either set to the existingTxn (if recordAlreadyExisted) or h.Txn.Clone(). h.Txn.Clone().Status should never be STAGING, so we should never be in this branch -- if this checks out to you, should we return an assertion failed error here instead?


pkg/kv/kvclient/kvcoord/dist_sender_server_test.go line 4621 at r1 (raw file):

}

// TestRollbackAfterRefreshAndFailedCommit is a regression test for #156698

nit: missing period.


pkg/kv/kvclient/kvcoord/dist_sender_server_test.go line 4625 at r1 (raw file):

// This test forces the self-recovery of a transaction after a failed parallel
// commit. It doe this via the following.
//

nit: Thanks for writing this comment to explain the test setup. One thing it's missing is how the ranges are split during setup to ensure Del key=8 is split into its own batch. Should we weave that in?

@arulajmani
Collaborator

@stevendanna great debugging here by the way! 🚀

@stevendanna stevendanna force-pushed the ssd/self-recovery-error branch from 0c6289b to ab1971b Compare November 10, 2025 18:09
Collaborator Author

@stevendanna stevendanna left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @arulajmani and @miraradeva)


pkg/kv/kvserver/batcheval/cmd_end_transaction.go line 506 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Should we drop the "here we are aborting the transaction" bit? It's slightly misleading, because we've tried to abort the transaction and realized we need to perform transaction recovery. The final status of the transaction is contingent on the result of this.

Changed the text a bit


pkg/kv/kvserver/batcheval/cmd_end_transaction.go line 513 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Is it possible for us to hit this branch? My read is that reply.Txn is either set to the existingTxn (if recordAlreadyExisted) or h.Txn.Clone(). h.Txn.Clone().Status should never be STAGING, so we should never be in this branch -- if this checks out to you, should we return an assertion failed error here instead?

Yup, I think so, just added a small assertion.
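
For readers following along, a minimal sketch of what such an assertion could look like, based on the reasoning in the quoted comment. The types, the `txnForError` helper, and the error text are all hypothetical and are not copied from `cmd_end_transaction.go`.

```go
package main

import (
	"errors"
	"fmt"
)

// Status is a simplified transaction status (assumption).
type Status int

const (
	Pending Status = iota
	Staging
)

// Txn is a hypothetical, pared-down transaction.
type Txn struct {
	Status         Status
	WriteTimestamp int64
}

// errAssertion stands in for an assertion-failed error.
var errAssertion = errors.New("assertion failed: STAGING status without an existing transaction record")

// txnForError picks the transaction to attach to the indeterminate commit
// error. A STAGING status is only expected on a record that already existed,
// so the other branch is treated as a programming error.
func txnForError(recordAlreadyExisted bool, existingTxn Txn) (Txn, error) {
	if !recordAlreadyExisted {
		return Txn{}, errAssertion
	}
	return existingTxn, nil
}

func main() {
	_, err := txnForError(false, Txn{Status: Staging})
	fmt.Println(err != nil) // true

	txn, err := txnForError(true, Txn{Status: Staging, WriteTimestamp: 10})
	fmt.Println(txn.WriteTimestamp, err) // 10 <nil>
}
```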


pkg/kv/kvclient/kvcoord/dist_sender_server_test.go line 4625 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

nit: Thanks for writing this comment to explain the test setup. One thing it's missing is how the ranges are split during setup to ensure Del key=8 is split into its own batch. Should we weave that in?

Done.

@github-actions

Potential Bug(s) Detected

The three-stage Claude Code analysis has identified potential bug(s) in this PR that may warrant investigation.

Next Steps:
Please review the detailed findings in the workflow run.

Note: When viewing the workflow output, scroll to the bottom to find the Final Analysis Summary.

After you review the findings, please tag the issue as follows:

  • If the detected issue is real or was helpful in any way, please tag the issue with O-AI-Review-Real-Issue-Found
  • If the detected issue was not helpful in any way, please tag the issue with O-AI-Review-Not-Helpful

@github-actions github-actions bot added the o-AI-Review-Potential-Issue-Detected AI reviewer found potential issue. Never assign manually—auto-applied by GH action only. label Nov 10, 2025
@stevendanna stevendanna force-pushed the ssd/self-recovery-error branch from ab1971b to 3d1337f Compare November 10, 2025 18:17
@stevendanna stevendanna removed the o-AI-Review-Potential-Issue-Detected AI reviewer found potential issue. Never assign manually—auto-applied by GH action only. label Nov 10, 2025
@stevendanna
Collaborator Author

bors r=arulajmani

@craig
Contributor

craig bot commented Nov 11, 2025

@craig craig bot merged commit f69b856 into cockroachdb:master Nov 11, 2025
24 checks passed
Development

Successfully merging this pull request may close these issues.

kv/kvnemesis.TestKVNemesisMultiNode: TestKVNemesisMultiNode failed [self-recovery failure after failed retry]