release-19.2: storage: don't corrupt unsafe value buffer in replayTransactionalWrite #52268

Merged

nvanbenschoten
Member

Backport 1/1 commits from #52234.

/cc @cockroachdb/release


Fixes #49482.
Fixes #50386.
Fixes #50683.

This commit fixes a bug in replayTransactionalWrite that resulted in
incorrect "transaction %s with sequence %d has a different value %+v
after recomputing from what was written" errors being thrown when write
requests were reissued, often due to transaction refreshes after
partially successful batches. The bug was caused by an unsafe byte
buffer acquired through Iterator.UnsafeValue being used after its
iterator had been moved. Moving an iterator invalidates and potentially
corrupts unsafe memory. This caused a sanity check later in the function
to occasionally fail.
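
The hazard described above can be illustrated with a toy iterator. This is only a sketch under stated assumptions: `toyIter`, `newToyIter`, and `readThenMove` are hypothetical names invented for illustration, not the CockroachDB or Pebble API, and a real engine may or may not overwrite the buffer on any given move. The toy always reuses its buffer, so the aliasing is deterministic here.

```go
package main

import "fmt"

// toyIter is a hypothetical iterator that reuses a single internal
// buffer for every value it exposes (an assumption made for illustration).
type toyIter struct {
	vals [][]byte
	pos  int
	buf  []byte
}

func newToyIter(vals ...string) *toyIter {
	it := &toyIter{buf: make([]byte, 0, 64)}
	for _, v := range vals {
		it.vals = append(it.vals, []byte(v))
	}
	it.load()
	return it
}

// load copies the current value into the shared buffer, overwriting
// whatever the buffer held before.
func (it *toyIter) load() {
	it.buf = append(it.buf[:0], it.vals[it.pos]...)
}

// Next moves the iterator and therefore rewrites the shared buffer.
func (it *toyIter) Next() {
	it.pos++
	it.load()
}

// UnsafeValue returns a slice that aliases the shared buffer. It is
// only valid until the iterator is moved.
func (it *toyIter) UnsafeValue() []byte {
	return it.buf
}

// readThenMove reads a value, moves the iterator, and returns what the
// caller is left holding. With copyFirst set, the value is copied into
// caller-owned memory before the move.
func readThenMove(copyFirst bool) []byte {
	it := newToyIter("first", "OTHER")
	v := it.UnsafeValue()
	if copyFirst {
		v = append([]byte(nil), v...) // defensive copy
	}
	it.Next() // invalidates the unsafe slice
	return v
}

func main() {
	fmt.Printf("used after move without copy: %q\n", readThenMove(false))
	fmt.Printf("copied before move:           %q\n", readThenMove(true))
}
```

Without the copy, the caller ends up comparing against bytes that belong to a different key's value, which is the kind of mismatch a later sanity check would flag. Copying before the move is one way to avoid this; the sketch does not claim that this is how the actual fix is implemented.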

I had a hard time reproducing this in a unit test. Even with very large
keys, very large values, interspersed compactions, and a durable Pebble
instance, I couldn't create a situation where moving the iterator
actually caused corruption in a value previously retrieved using
iter.UnsafeValue. Maybe I'm missing a trick there. Regardless, this
certainly seems to fix the failures observed in #50683, as I can no
longer reproduce it under stress.

This will need to be backported to release-20.1, release-19.2, and
release-19.1, as all three release branches are susceptible to the
spurious errors that this bug can result in.

Release note (bug fix): Large write requests no longer have a chance
of erroneously throwing a "transaction with sequence has a different
value" error.

@cockroach-teamcity
Member

This change is Reviewable

Collaborator

@petermattis petermattis left a comment

:lgtm:

I assume this backported cleanly modulo the test.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @petermattis)

@nvanbenschoten
Member Author

TFTR!

> I assume this backported cleanly modulo the test.

Yes, this backported cleanly modulo the test.

@nvanbenschoten nvanbenschoten merged commit cc36d15 into cockroachdb:release-19.2 Aug 3, 2020
@nvanbenschoten nvanbenschoten deleted the backport19.2-52234 branch August 4, 2020 18:00