Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: don't corrupt unsafe value buffer in replayTransactionalWrite #52234

Merged

Conversation

nvanbenschoten
Copy link
Member

Fixes #49482.
Fixes #50386.
Fixes #50683.

This commit fixes a bug in replayTransactionalWrite that resulted in
incorrect "transaction %s with sequence %d has a different value %+v
after recomputing from what was written" errors being thrown when write
requests were reissued, often due to transaction refreshes after
partially successful batches. The bug was caused by an unsafe byte
buffer acquired through Iterator.UnsafeValue being used after its
iterator had been moved. Moving an iterator invalidates and potentially
corrupts unsafe memory. This caused a sanity check later in the function
to occasionally fail.

I had a hard time reproducing this in a unit test. Even with very large
keys, very large values, interspersed compactions, and a durable Pebble
instance, I couldn't create a situation where moving the iterator
actually caused corruption in value previously retrieved using
iter.UnsafeValue. Maybe I'm missing a trick there. Regardless, this
certainly seems to fix the failures observed in #50683, as I can no
longer reproduce it under stress.

This will need to be backported to release-20.1, release-19.2, and
release-19.1, as all three release branches are susceptible to the
spurious errors that this bug can result in.

Release note (bug fix): Large write requests no longer have a chance
of erroneously throwing a "transaction with sequence has a different
value" error.

Fixes cockroachdb#49482.
Fixes cockroachdb#50386.
Fixes cockroachdb#50683.

This commit fixes a bug in replayTransactionalWrite that resulted in
incorrect "transaction %s with sequence %d has a different value %+v
after recomputing from what was written" errors being thrown when write
requests were reissued, often due to transaction refreshes after
partially successful batches. The bug was caused by an unsafe byte
buffer acquired through `Iterator.UnsafeValue` being used after its
iterator had been moved. Moving an iterator invalidates and potentially
corrupts unsafe memory. This caused a sanity check later in the function
to occasionally fail.

I had a hard time reproducing this in a unit test. Even with very large
keys, very large values, interspersed compactions, and a durable Pebble
instance, I couldn't create a situation where moving the iterator
actually caused corruption in value previously retrieved using
`iter.UnsafeValue`. Maybe I'm missing a trick there. Regardless, this
certainly seems to fix the failures observed in cockroachdb#50683, as I can no
longer reproduce it under stress.

This will need to be backported to release-20.1, release-19.2, and
release-19.1, as all three release branches are susceptible to the
spurious errors that this bug can result in.

Release note (bug fix): Large write requests no longer have a chance
of erroneously throwing a "transaction with sequence has a different
value" error.
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice find! I bet that wasn't fun to track down.

Maybe I'm missing a trick there. Regardless, this
certainly seems to fix the failures observed in #50683, as I can no
longer reproduce it under stress.

Unsafe values returned by Pebble and RocksDB point into cache memory. (Unsafe keys are different in that they point to temporary buffers because the keys need to be prefix decompressed). The cache memory is very unlikely to be reused in a short time period unless the cache is small. Perhaps try a test where the cache is zero size, or a test where there is concurrent pressure on the cache. Or we could just agree that this was definitely a bug and move on.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @nvanbenschoten)


pkg/storage/mvcc.go, line 1398 at r1 (raw file):

	// calculated value on this replay is the same as the one we've previously
	// written.
	if !bytes.Equal(value, writtenValue) {

Could you instead move this error check about the if valueFn != nil block? That does change the error handling somewhat , but it would avoid having to copy writtenValue.

Copy link
Member Author

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Perhaps try a test where the cache is zero size

I tried this out but still wasn't able to get a repro. I do wonder if there is anything we can do to uncover this form of bug through our existing test suite. For instance, maybe we could intentionally poison unsafe key and value buffers whenever an iterator is moved under race builds.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @petermattis)


pkg/storage/mvcc.go, line 1398 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Could you instead move this error check about the if valueFn != nil block? That does change the error handling somewhat , but it would avoid having to copy writtenValue.

When valueFn != nil, we use valueFn to determine the value to compare against. So I don't think this is possible, unless I'm misunderstanding your suggestion.

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this out but still wasn't able to get a repro. I do wonder if there is anything we can do to uncover this form of bug through our existing test suite. For instance, maybe we could intentionally poison unsafe key and value buffers whenever an iterator is moved under race builds.

Hmm, we do that somewhere. Perhaps in Pebble. Let me track it down.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @nvanbenschoten)


pkg/storage/mvcc.go, line 1398 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

When valueFn != nil, we use valueFn to determine the value to compare against. So I don't think this is possible, unless I'm misunderstanding your suggestion.

Nope, you understood it. My suggestion was bad. I was looking for places where we set writtenValue, but neglected to notice that we set value within that block of code. I think the current change is as good as it gets.

@nvanbenschoten
Copy link
Member Author

Hmm, we do that somewhere. Perhaps in Pebble. Let me track it down.

I'm interested to see that. I wonder if applying this to CRDB's test suite could catch any misuses in CRDB itself. This might be a good knob for Pebble to expose.

bors r+

@petermattis
Copy link
Collaborator

I'm interested to see that. I wonder if applying this to CRDB's test suite could catch any misuses in CRDB itself. This might be a good knob for Pebble to expose.

I just took a look right now and couldn't find what I thought was there. Perhaps I'm misremembering. This does sound like a useful idea. I'll take another look later and if I don't find anything I'll file an issue.

@craig
Copy link
Contributor

craig bot commented Aug 3, 2020

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants