release-20.1: storage: don't corrupt unsafe value buffer in replayTransactionalWrite #52267
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Backport 1/1 commits from #52234.
/cc @cockroachdb/release
Fixes #49482.
Fixes #50386.
Fixes #50683.
This commit fixes a bug in replayTransactionalWrite that resulted in
incorrect "transaction %s with sequence %d has a different value %+v
after recomputing from what was written" errors being thrown when write
requests were reissued, often due to transaction refreshes after
partially successful batches. The bug was caused by an unsafe byte
buffer acquired through
Iterator.UnsafeValue
being used after itsiterator had been moved. Moving an iterator invalidates and potentially
corrupts unsafe memory. This caused a sanity check later in the function
to occasionally fail.
I had a hard time reproducing this in a unit test. Even with very large
keys, very large values, interspersed compactions, and a durable Pebble
instance, I couldn't create a situation where moving the iterator
actually caused corruption in value previously retrieved using
iter.UnsafeValue
. Maybe I'm missing a trick there. Regardless, thiscertainly seems to fix the failures observed in #50683, as I can no
longer reproduce it under stress.
This will need to be backported to release-20.1, release-19.2, and
release-19.1, as all three release branches are susceptible to the
spurious errors that this bug can result in.
Release note (bug fix): Large write requests no longer have a chance
of erroneously throwing a "transaction with sequence has a different
value" error.