storage: don't corrupt unsafe value buffer in replayTransactionalWrite #52234

nvanbenschoten · 2020-08-03T04:25:41Z

Fixes #49482.
Fixes #50386.
Fixes #50683.

This commit fixes a bug in replayTransactionalWrite that resulted in
incorrect "transaction %s with sequence %d has a different value %+v
after recomputing from what was written" errors being thrown when write
requests were reissued, often due to transaction refreshes after
partially successful batches. The bug was caused by an unsafe byte
buffer acquired through Iterator.UnsafeValue being used after its
iterator had been moved. Moving an iterator invalidates and potentially
corrupts unsafe memory. This caused a sanity check later in the function
to occasionally fail.

I had a hard time reproducing this in a unit test. Even with very large
keys, very large values, interspersed compactions, and a durable Pebble
instance, I couldn't create a situation where moving the iterator
actually caused corruption in value previously retrieved using
iter.UnsafeValue. Maybe I'm missing a trick there. Regardless, this
certainly seems to fix the failures observed in #50683, as I can no
longer reproduce it under stress.

This will need to be backported to release-20.1, release-19.2, and
release-19.1, as all three release branches are susceptible to the
spurious errors that this bug can result in.

Release note (bug fix): Large write requests no longer have a chance
of erroneously throwing a "transaction with sequence has a different
value" error.

Fixes cockroachdb#49482. Fixes cockroachdb#50386. Fixes cockroachdb#50683. This commit fixes a bug in replayTransactionalWrite that resulted in incorrect "transaction %s with sequence %d has a different value %+v after recomputing from what was written" errors being thrown when write requests were reissued, often due to transaction refreshes after partially successful batches. The bug was caused by an unsafe byte buffer acquired through `Iterator.UnsafeValue` being used after its iterator had been moved. Moving an iterator invalidates and potentially corrupts unsafe memory. This caused a sanity check later in the function to occasionally fail. I had a hard time reproducing this in a unit test. Even with very large keys, very large values, interspersed compactions, and a durable Pebble instance, I couldn't create a situation where moving the iterator actually caused corruption in value previously retrieved using `iter.UnsafeValue`. Maybe I'm missing a trick there. Regardless, this certainly seems to fix the failures observed in cockroachdb#50683, as I can no longer reproduce it under stress. This will need to be backported to release-20.1, release-19.2, and release-19.1, as all three release branches are susceptible to the spurious errors that this bug can result in. Release note (bug fix): Large write requests no longer have a chance of erroneously throwing a "transaction with sequence has a different value" error.

cockroach-teamcity · 2020-08-03T04:25:48Z

This change is

petermattis

Nice find! I bet that wasn't fun to track down.

Maybe I'm missing a trick there. Regardless, this
certainly seems to fix the failures observed in #50683, as I can no
longer reproduce it under stress.

Unsafe values returned by Pebble and RocksDB point into cache memory. (Unsafe keys are different in that they point to temporary buffers because the keys need to be prefix decompressed). The cache memory is very unlikely to be reused in a short time period unless the cache is small. Perhaps try a test where the cache is zero size, or a test where there is concurrent pressure on the cache. Or we could just agree that this was definitely a bug and move on.

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @nvanbenschoten)

pkg/storage/mvcc.go, line 1398 at r1 (raw file):

	// calculated value on this replay is the same as the one we've previously
	// written.
	if !bytes.Equal(value, writtenValue) {

Could you instead move this error check about the if valueFn != nil block? That does change the error handling somewhat , but it would avoid having to copy writtenValue.

nvanbenschoten

TFTR!

Perhaps try a test where the cache is zero size

I tried this out but still wasn't able to get a repro. I do wonder if there is anything we can do to uncover this form of bug through our existing test suite. For instance, maybe we could intentionally poison unsafe key and value buffers whenever an iterator is moved under race builds.

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @petermattis)

pkg/storage/mvcc.go, line 1398 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Could you instead move this error check about the if valueFn != nil block? That does change the error handling somewhat , but it would avoid having to copy writtenValue.

When valueFn != nil, we use valueFn to determine the value to compare against. So I don't think this is possible, unless I'm misunderstanding your suggestion.

petermattis

I tried this out but still wasn't able to get a repro. I do wonder if there is anything we can do to uncover this form of bug through our existing test suite. For instance, maybe we could intentionally poison unsafe key and value buffers whenever an iterator is moved under race builds.

Hmm, we do that somewhere. Perhaps in Pebble. Let me track it down.

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @nvanbenschoten)

pkg/storage/mvcc.go, line 1398 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

When valueFn != nil, we use valueFn to determine the value to compare against. So I don't think this is possible, unless I'm misunderstanding your suggestion.

Nope, you understood it. My suggestion was bad. I was looking for places where we set writtenValue, but neglected to notice that we set value within that block of code. I think the current change is as good as it gets.

nvanbenschoten · 2020-08-03T16:00:17Z

Hmm, we do that somewhere. Perhaps in Pebble. Let me track it down.

I'm interested to see that. I wonder if applying this to CRDB's test suite could catch any misuses in CRDB itself. This might be a good knob for Pebble to expose.

bors r+

petermattis · 2020-08-03T16:08:46Z

I'm interested to see that. I wonder if applying this to CRDB's test suite could catch any misuses in CRDB itself. This might be a good knob for Pebble to expose.

I just took a look right now and couldn't find what I thought was there. Perhaps I'm misremembering. This does sound like a useful idea. I'll take another look later and if I don't find anything I'll file an issue.

craig · 2020-08-03T18:00:17Z

Build succeeded:

GitHub CI (Cockroach)

nvanbenschoten requested a review from itsbilal August 3, 2020 04:25

petermattis approved these changes Aug 3, 2020

View reviewed changes

nvanbenschoten commented Aug 3, 2020

View reviewed changes

petermattis approved these changes Aug 3, 2020

View reviewed changes

craig bot merged commit b9bed22 into cockroachdb:master Aug 3, 2020

nvanbenschoten deleted the nvanbenschoten/idempotentUnsafe branch August 4, 2020 18:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

storage: don't corrupt unsafe value buffer in replayTransactionalWrite #52234

storage: don't corrupt unsafe value buffer in replayTransactionalWrite #52234

nvanbenschoten commented Aug 3, 2020

cockroach-teamcity commented Aug 3, 2020

petermattis left a comment

nvanbenschoten left a comment

petermattis left a comment

nvanbenschoten commented Aug 3, 2020

petermattis commented Aug 3, 2020

craig bot commented Aug 3, 2020

storage: don't corrupt unsafe value buffer in replayTransactionalWrite #52234

storage: don't corrupt unsafe value buffer in replayTransactionalWrite #52234

Conversation

nvanbenschoten commented Aug 3, 2020

cockroach-teamcity commented Aug 3, 2020

petermattis left a comment

Choose a reason for hiding this comment

nvanbenschoten left a comment

Choose a reason for hiding this comment

petermattis left a comment

Choose a reason for hiding this comment

nvanbenschoten commented Aug 3, 2020

petermattis commented Aug 3, 2020

craig bot commented Aug 3, 2020