Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: verify roachpb.Value checksums when building snapshots #119345

Merged
merged 1 commit into from
Mar 30, 2024

Conversation

lyang24
Copy link
Collaborator

@lyang24 lyang24 commented Feb 18, 2024

This commit verify record checksum verification when processing raft snapshot records. This process prevents data corruption spread via snapshot process. Cluster setting snapshotChecksumVerification is introduced to act as a safeguard knob to turn on/off the verifcation.

Fixes: #110572

Release note: None

Copy link

blathers-crl bot commented Feb 18, 2024

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

I have added a few people who may be able to assist in reviewing:

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Feb 18, 2024
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the draft, and sorry for the late review!

Before we go any further down this path, let's check in on the status of #111001. @jbowens Do you think KV checksumming is in our near future, or is it worthwhile to implement roachpb.Value checksum verification in the interim?

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jbowens, @lyang24, and @sumeerbhola)


pkg/kv/kvserver/store_snapshot.go line 557 at r1 (raw file):

				// TODO: abstract the verification logic with engine iterator similar
				// to the implmentation of verifyingMVCCIterator.
				if snapshotChecksumVerification.Get(&s.ClusterSettings().SV) {

nit: we can pull this out of the loop and put it in a stack variable to avoid the atomic load over and over.


pkg/kv/kvserver/store_snapshot.go line 558 at r1 (raw file):

				// to the implmentation of verifyingMVCCIterator.
				if snapshotChecksumVerification.Get(&s.ClusterSettings().SV) {
					val := roachpb.MakeValueFromBytes(batchReader.Value())

This doesn't really do the right thing. This creates a new value containing the raw encoded bytes from the snapshot, and creates a new checksum for that value. This will always pass verification, because we just created the value and checksum here.

Instead, we need to decode the value inside the raw encoded bytes and check its checksum. This can be a bit tricky, because we can use several different kinds of value encoding here depending on what kind of data it is.

I think the simplest place to start is to check whether they key is an MVCC key via EngineKey.IsMVCCKey(). If it is an MVCC key, then we have to decode the MVCC value to get at the inner roachpb.Value and check its checksum, like we do here:

mvccValue, ok, err := tryDecodeSimpleMVCCValue(i.value)
if !ok && err == nil {
mvccValue, err = decodeExtendedMVCCValue(i.value)
}
if err == nil {
err = mvccValue.Value.Verify(i.key.Key)
}

There are other kinds of values too that we should probably verify, but we can start with simple MVCC values.

Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jbowens, @lyang24, and @sumeerbhola)


pkg/kv/kvserver/store_snapshot.go line 558 at r1 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

This doesn't really do the right thing. This creates a new value containing the raw encoded bytes from the snapshot, and creates a new checksum for that value. This will always pass verification, because we just created the value and checksum here.

Instead, we need to decode the value inside the raw encoded bytes and check its checksum. This can be a bit tricky, because we can use several different kinds of value encoding here depending on what kind of data it is.

I think the simplest place to start is to check whether they key is an MVCC key via EngineKey.IsMVCCKey(). If it is an MVCC key, then we have to decode the MVCC value to get at the inner roachpb.Value and check its checksum, like we do here:

mvccValue, ok, err := tryDecodeSimpleMVCCValue(i.value)
if !ok && err == nil {
mvccValue, err = decodeExtendedMVCCValue(i.value)
}
if err == nil {
err = mvccValue.Value.Verify(i.key.Key)
}

There are other kinds of values too that we should probably verify, but we can start with simple MVCC values.

PS: since those functions aren't exported, DecodeMVCCValue does the same thing and is exported. It doesn't inline, but I'm not sure we care that much about the performance here, since the network transfer and ingestion will likely dominate runtime anyway.

Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think KV checksumming is in our near future, or is it worthwhile to implement roachpb.Value checksum verification in the interim?

Unfortunately it seems like it'll be a while, so I do think it's worthwhile to do this in the interim.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @lyang24 and @sumeerbhola)

Copy link

blathers-crl bot commented Feb 24, 2024

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Copy link

blathers-crl bot commented Feb 24, 2024

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@lyang24 lyang24 force-pushed the verify_snapshot_checksum2 branch 3 times, most recently from a827a53 to 2ab2cbd Compare February 25, 2024 20:57
Copy link
Collaborator Author

@lyang24 lyang24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing, I think i have implement the suggestions correctly. Unfortunately, lots of test are timing out due to being stuck with the infinity invalid checksum loop. I picked one of the test TestStoreRangeMergeTimestampCache and looked at key
/Local/RangeID/68/r/RangeGCThreshold and value [12 04 08 00 10 00 18 00 20 00 28 00 32 05 7a 00 c9 92 03] the computed check sum from the key is cf163c61 and the first 4 bits of the value yield 12040800 not sure what went wrong.

one example of the repeated failing logs (I enhanced the verify function logging with full information locally)

=== RUN   TestStoreRangeMergeTimestampCache/disjoint-leaseholders=false/through-snapshot=false/future-read=false
    soon.go:75: encountered retriable replication change error: error sending couldn't accept range_id:68 coordinator_replica:<node_id:1 store_id:1 replica_id:1 type:VOTER_FULL > recipient_replica:<node_id:2 store_id:2 replica_id:2 type:LEARNER > delegated_sender:<node_id:1 store_id:1 replica_id:1 type:VOTER_FULL > term:6 first_index:18 descriptor_generation:4 queue_on_delegate_len:-1 snap_id:22ab3def-2c5e-4053-bb27-48b68873e7d0 : (n2,s2):2LEARNER: remote failed to apply snapshot: recv msg error: grpc: key(/Local/RangeID/68/r/RangeGCThreshold): invalid computed checksum (cf163c61) first 4 bit checksum(12040800) value [12 04 08 00 10 00 18 00 20 00 28 00 32 05 7a 00 c9 92 03] [code 2/Unknown]

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @sumeerbhola)


pkg/kv/kvserver/store_snapshot.go line 558 at r1 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

PS: since those functions aren't exported, DecodeMVCCValue does the same thing and is exported. It doesn't inline, but I'm not sure we care that much about the performance here, since the network transfer and ingestion will likely dominate runtime anyway.

Thank you for the feedback. I abstracted the suggestions with a new verify method on batchReader, currently it only verifies mvccvalue and left a todo for other value types such as locktableRecord...

@lyang24
Copy link
Collaborator Author

lyang24 commented Feb 27, 2024

I digged more into the test failure with @pav-kv, we added prints to mvccput and found the orginal value being put is a SimpleMVCCValue

key: /Local/RangeID/68/r/RangeGCThreshold
value: [7a 00 c9 92 03]
proto: 0,0
checksum: 7a00c992
computed checksum: 7a00c992

the value matches the suffix of the error log in the checksum verification
[12 04 08 00 10 00 18 00 20 00 28 00 32 05 7a 00 c9 92 03]
which leads us to think it become an ExtendedMVCCValue in the raft snaphot.

The confusion part is decode simple mvcc value would work on the [12 04 08 00 10 00 18 00 20 00 28 00 32 05 7a 00 c9 92 03] value and 12 04 08 00 leads to wrong checksum. I am expecting it to return false and redirect to decodeExtendedMVCCValue(r.Value()). I think Erik is ooo, hi @jbowens do you know a better way to decode the value from
batchreader?

@lyang24
Copy link
Collaborator Author

lyang24 commented Feb 27, 2024

Previous thread is resolved, I wrapped decoding logic under the if mvccKey.IsValue() just like here test passes, I guess mvcckey without a timestamp is another category of mvccdata.

	if mvccKey.IsValue() {
		mvccValue, ok, err := tryDecodeSimpleMVCCValue(r.Value())
		if !ok && err == nil {
			mvccValue, err = decodeExtendedMVCCValue(r.Value())
		}
		if err == nil {
			err = mvccValue.Value.Verify(mvccKey.Key)
		}
	}

@lyang24 lyang24 marked this pull request as ready for review February 27, 2024 21:24
@lyang24 lyang24 requested review from a team as code owners February 27, 2024 21:24
// All batch operations are guaranteed to be point key or range key puts.
for batchReader.Next() {
// Verify value checksum to catch data corruption.
if verifyCheckSum {
if err = batchReader.Verify(); err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if err = batchReader.Verify(); err != nil {
if err := batchReader.Verify(); err != nil {

// Verify value checksum to catch data corruption.
if verifyCheckSum {
if err = batchReader.Verify(); err != nil {
return noSnap, err
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return noSnap, err
return noSnap, errors.Wrap(err, "verifying value checksum")

var snapshotChecksumVerification = settings.RegisterBoolSetting(
settings.SystemOnly,
"kv.snapshot_receiver.checksum_verification.enabled",
"flag to enable/disable checksum verification during raft snapshot process",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"flag to enable/disable checksum verification during raft snapshot process",
"verify value checksums on receiving a raft snapshot",

Comment on lines 1614 to 1612
// snapshotChecksumVerification is the flag to enable/ disable check sum
// verification when receiving snapshots.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// snapshotChecksumVerification is the flag to enable/ disable check sum
// verification when receiving snapshots.
// snapshotChecksumVerification enables/disables value checksums verification
// when receiving snapshots.

Comment on lines 1038 to 1039
err = r.Verify()
require.NoError(t, err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
err = r.Verify()
require.NoError(t, err)
require.NoError(t, r.Verify())

if err != nil {
return err
}
if mvccKey.IsValue() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: try not indenting large blocks:

if !mvccKey.IsValue() {
	return nil
}
mvccValue, ok, err := ...

if err != nil {
return err
}
// TODO: verify values with other encoding types.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have an idea which types are missing? My understanding is that the checksums we're checking are part of MVCCValue. So it seems like we're covering it all?

nit: TODO should be named like TODO(lyang24), or reference an issue.

Copy link
Collaborator

@pav-kv pav-kv Mar 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at

i.hasPoint, i.hasRange = i.pebbleIterator.HasPointAndRange()
if i.hasPoint {
i.value, _ = i.pebbleIterator.UnsafeValue()
if i.key.IsValue() {
mvccValue, ok, err := tryDecodeSimpleMVCCValue(i.value)
if !ok && err == nil {
mvccValue, err = decodeExtendedMVCCValue(i.value)
}
if err == nil {
err = mvccValue.Value.Verify(i.key.Key)
}

Is it true that we only include a checksum for point keys? Is this because range keys don't have any value? It's possible that we don't need this TODO here, and can just mirror what verifyingMVCCIterator does.

@erikgrinaker

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have an idea which types are missing? My understanding is that the checksums we're checking are part of MVCCValue. So it seems like we're covering it all?

The checksums are in roachpb.Value, not MVCCValue. These can also be found elsewhere. See my other comment about MVCCMetadata.RawBytes.

Comment on lines 142 to 143
// Verify verifies the checksums of the current record in the batch entry
// will return error on mismatch.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Verify verifies the checksums of the current record in the batch entry
// will return error on mismatch.
// Verify ensures the checksum of the current batch entry matches the data.
// Returns an error on checksum mismatch.

if err = batchReader.Verify(); err != nil {
return noSnap, err
}
}
switch batchReader.KeyKind() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we paying twice for parsing the key above (in Verify func) and below? Is there a way to save here?

if err != nil {
return err
}
if mvccKey.IsValue() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erikgrinaker

What's with the !mvccKey.IsValue() case? As I understood, it means the timestamp is empty, and it's some kind of "meta" key/value. Do we not compute a checksum for these key/values?

What are examples of "meta" values? Intents?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intents and inline values. These contain an encoded enginepb.MVCCMetadata, with Value.RawBytes in MVCCMetadata.Value. These should be decoded to a roachpb.Value and verified too.

buf.newMeta = enginepb.MVCCMetadata{RawBytes: value.RawBytes}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intents, i.e. lock-table keys, also use MVCCMetadata with a checksum in RawBytes.

I'm not sure if we ever use PutUnversioned() with anything that isn't an MVCCMetadata in the replicated keyspace. Could possibly be something in the range-local keyspace. I guess time-series data also won't be a roachpb.Value, but I'm not sure. We'll find out if something fails once we attempt to decode inline values.

Copy link
Collaborator Author

@lyang24 lyang24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pavel's catch on decode keys twice makes me think another way to implement this once i got my approach in the ball park I will fix the rest of the comments also add verification for lock table records.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @pav-kv, and @sumeerbhola)


pkg/kv/kvserver/store_snapshot.go line 561 at r4 (raw file):

Previously, pav-kv (Pavel Kalinnikov) wrote…

Are we paying twice for parsing the key above (in Verify func) and below? Is there a way to save here?

Thats a good idea, how about i implement verify on EngineKey struct and call it from here
func (key *EngineKey) Verify(value []byte) error {


pkg/storage/batch.go line 157 at r4 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

Intents, i.e. lock-table keys, also use MVCCMetadata with a checksum in RawBytes.

I'm not sure if we ever use PutUnversioned() with anything that isn't an MVCCMetadata in the replicated keyspace. Could possibly be something in the range-local keyspace. I guess time-series data also won't be a roachpb.Value, but I'm not sure. We'll find out if something fails once we attempt to decode inline values.

The test fails on a key with rangeLocalGcThresholds previously.

I found how intents/locks are put into the storage engine we can probably verify locktable records with

  lockTableKey, err := key.ToLockTableKey()  
  if err == nil {  
   return DecodeMVCCValueAndVerify(lockTableKey.Key, value)  
  }   
  
func DecodeMVCCValueAndVerify(key roachpb.Key, value \[\]byte) error {  
  mvccValue, ok, err := tryDecodeSimpleMVCCValue(value)  
  if !ok && err == nil {  
   mvccValue, err = decodeExtendedMVCCValue(value)  
  }  
  if err == nil {  
   err = mvccValue.Value.Verify(key)  
  }  
  return err  
} 

@lyang24 lyang24 force-pushed the verify_snapshot_checksum2 branch 2 times, most recently from 4f56a97 to 74ca21a Compare March 9, 2024 21:41
@lyang24 lyang24 force-pushed the verify_snapshot_checksum2 branch 4 times, most recently from 9bd5a1a to e764ac1 Compare March 11, 2024 07:55
Copy link
Collaborator Author

@lyang24 lyang24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @pav-kv @erikgrinaker , Thanks for great suggestions. I have updated the logics of verification please take another look.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @pav-kv, and @sumeerbhola)


pkg/kv/kvserver/store_snapshot.go line 561 at r4 (raw file):

Previously, lyang24 (Lanqing Yang) wrote…

Thats a good idea, how about i implement verify on EngineKey struct and call it from here
func (key *EngineKey) Verify(value []byte) error {

implement the verify with EngineKey to avoid double decoding.


pkg/kv/kvserver/store_snapshot.go line 1615 at r4 (raw file):

Previously, pav-kv (Pavel Kalinnikov) wrote…
// snapshotChecksumVerification enables/disables value checksums verification
// when receiving snapshots.

Done.


pkg/kv/kvserver/store_snapshot.go line 1619 at r4 (raw file):

Previously, pav-kv (Pavel Kalinnikov) wrote…
	"verify value checksums on receiving a raft snapshot",

Done.


pkg/storage/batch.go line 143 at r4 (raw file):

Previously, pav-kv (Pavel Kalinnikov) wrote…
// Verify ensures the checksum of the current batch entry matches the data.
// Returns an error on checksum mismatch.

Done.


pkg/storage/batch.go line 149 at r4 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

Do we have an idea which types are missing? My understanding is that the checksums we're checking are part of MVCCValue. So it seems like we're covering it all?

The checksums are in roachpb.Value, not MVCCValue. These can also be found elsewhere. See my other comment about MVCCMetadata.RawBytes.

Thanks for the hint on MVCCMeta, my new logic should cover all the cases thus the todo is not needed anymore.


pkg/storage/batch_test.go line 1037 at r4 (raw file):

Previously, pav-kv (Pavel Kalinnikov) wrote…

This looks confusing. Consider initializing the checksum before the PutMVCC. E.g.:

mvccKey, mvccValue := pointKey("mvccKey", 1), stringValue("mvccValue")
mvccValue.Value.InitChecksum(mvccKey.Key)
require.NoError(t, b.PutMVCC(mvccKey, mvccValue))

r, err := NewBatchReader(b.Repr())
...

Done.


pkg/storage/batch_test.go line 1039 at r4 (raw file):

Previously, pav-kv (Pavel Kalinnikov) wrote…
	require.NoError(t, r.Verify())

Done.


pkg/storage/batch_test.go line 1044 at r4 (raw file):

Previously, pav-kv (Pavel Kalinnikov) wrote…
	// Throw err on checksum mismatch
	require.NotNil(t, r.Verify())

Also, consider using require.ErrorContains, to check the error message.

Done.

Copy link
Collaborator Author

@lyang24 lyang24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @pav-kv, and @sumeerbhola)


pkg/kv/kvserver/store_snapshot.go line 557 at r4 (raw file):

Previously, pav-kv (Pavel Kalinnikov) wrote…
					if err := batchReader.Verify(); err != nil {

Done.


pkg/kv/kvserver/store_snapshot.go line 558 at r4 (raw file):

Previously, pav-kv (Pavel Kalinnikov) wrote…
						return noSnap, errors.Wrap(err, "verifying value checksum")

Done.

@@ -331,6 +332,67 @@ var interestingEngineKeys = [][]byte{
}),
}

func TestEngineKeyVerifyMvcc(t *testing.T) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func TestEngineKeyVerifyMvcc(t *testing.T) {
func TestEngineKeyVerifyMVCC(t *testing.T) {

Comment on lines 346 to 348
v := MVCCValue{
Value: roachpb.MakeValueFromString("test"),
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
v := MVCCValue{
Value: roachpb.MakeValueFromString("test"),
}
v := MVCCValue{Value: roachpb.MakeValueFromString("test")}

Comment on lines 399 to 402
if err == nil {
err = mvccValue.Value.Verify(key)
}
return err
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if err == nil {
err = mvccValue.Value.Verify(key)
}
return err
if err != nil {
return err
}
return mvccValue.Value.Verify(key)

// All batch operations are guaranteed to be point key or range key puts.
for batchReader.Next() {
ek, err := batchReader.EngineKey()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bunch of func calls below (e.g. PutInternalRangeKey) will decode the EngineKey internally. I wonder if the APIs should accept EngineKey instead of the raw one. Probably not to be addressed in PR.

@jbowens jbowens requested a review from pav-kv March 21, 2024 15:27
Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Thanks for doing this!

Reviewed 1 of 4 files at r6, 1 of 1 files at r7, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @lyang24, @pav-kv, and @sumeerbhola)


pkg/storage/engine_key.go line 389 at r7 (raw file):

		return DecodeMVCCMetaAndVerify(lockTableKey.Key, value)
	}
	return DecodeMVCCMetaAndVerify(key.Key, value)

if there's an MVCCValueHeader, does this end up allocating? Even if so, I think it's okay for now, but we might leave a TODO to refactor to avoid the allocation.


pkg/storage/engine_key.go line 394 at r7 (raw file):

// DecodeMVCCValueAndVerify will try to decode the value as
// MVCCValue and then verify the checksum.
func DecodeMVCCValueAndVerify(key roachpb.Key, value []byte) error {

are we able to unexport DecodeMVCCValueAndVerify and DecodeMVCCMetaAndVerify? It looks like we only use the EngineKey.Verify entrypoint.


pkg/storage/engine_key.go line 402 at r7 (raw file):

Previously, pav-kv (Pavel Kalinnikov) wrote…
	if err != nil {
		return err
	}
	return mvccValue.Value.Verify(key)

agreed (when possible, it's idiomatic in Go to make the exceptional case the indented, early exit case)


pkg/storage/engine_key_test.go line 335 at r7 (raw file):

Previously, pav-kv (Pavel Kalinnikov) wrote…
func TestEngineKeyVerifyMVCC(t *testing.T) {

+1 (in Go acronyms and initialisms are fully capitalized)

Copy link
Collaborator Author

@lyang24 lyang24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing, i had a follow up question on the allocation values that has MVCCValueHeader.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @jbowens, @pav-kv, and @sumeerbhola)


pkg/storage/engine_key.go line 389 at r7 (raw file):

Previously, jbowens (Jackson Owens) wrote…

if there's an MVCCValueHeader, does this end up allocating? Even if so, I think it's okay for now, but we might leave a TODO to refactor to avoid the allocation.

Thanks for the feedback, i am not sure about the additional allocation. Can you please give me some hints on where the allocation could happen?

From the decode logic, we try to decode the value as if it does not have a MVCCValueHeader if that failed we reassign the allocated MVCCValue struct with decoding with header.

mvccValue, ok, err := tryDecodeSimpleMVCCValue(value)  
if !ok && err == nil {  
  mvccValue, err = decodeExtendedMVCCValue(value)  
}

are you refering to MVCCMetadata's unmarshal process could allocate extra fields we do not need in checksum verification?

	if err := protoutil.Unmarshal(value, &meta); err != nil {
		return nil
	}

pkg/storage/engine_key.go line 394 at r7 (raw file):

Previously, jbowens (Jackson Owens) wrote…

are we able to unexport DecodeMVCCValueAndVerify and DecodeMVCCMetaAndVerify? It looks like we only use the EngineKey.Verify entrypoint.

great catch those methods does not need to be Exported


pkg/storage/engine_key.go line 402 at r7 (raw file):

Previously, jbowens (Jackson Owens) wrote…

agreed (when possible, it's idiomatic in Go to make the exceptional case the indented, early exit case)

Done.


pkg/storage/engine_key_test.go line 335 at r7 (raw file):

Previously, jbowens (Jackson Owens) wrote…

+1 (in Go acronyms and initialisms are fully capitalized)

Done.


pkg/storage/engine_key_test.go line 348 at r7 (raw file):

Previously, pav-kv (Pavel Kalinnikov) wrote…
	v := MVCCValue{Value: roachpb.MakeValueFromString("test")}

Done.

Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong:

I think we can include the microbenchmark and defer resolving the allocations with a TODO. Up to you

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @erikgrinaker, @lyang24, @pav-kv, and @sumeerbhola)


pkg/storage/engine_key.go line 389 at r7 (raw file):

Previously, lyang24 (Lanqing Yang) wrote…

Thanks for the feedback, i am not sure about the additional allocation. Can you please give me some hints on where the allocation could happen?

From the decode logic, we try to decode the value as if it does not have a MVCCValueHeader if that failed we reassign the allocated MVCCValue struct with decoding with header.

mvccValue, ok, err := tryDecodeSimpleMVCCValue(value)  
if !ok && err == nil {  
  mvccValue, err = decodeExtendedMVCCValue(value)  
}

are you refering to MVCCMetadata's unmarshal process could allocate extra fields we do not need in checksum verification?

	if err := protoutil.Unmarshal(value, &meta); err != nil {
		return nil
	}

Yeah, that MVCCMetadata unmarshal may allocate. I think we could reduce some of the allocations by doing something like:

type EngineKeyVerifier struct {
    meta enginepb.MVCCMetadata
}

func (v *EngineKeyVerifier) Verify(k EngineKey, v []byte) error {
    // Unmarshal into &v.meta
}

so the *enginepb.MVCCMetadata itself doesn't get allocate each invocation (see lockTableKeyScanner.getLockTableValue) that does something similar.

Here's a microbenchmark that demonstrates the allocations:

diff --git a/pkg/storage/engine_key_test.go b/pkg/storage/engine_key_test.go
index 43b6be771e9..1f0d45a9ffc 100644
--- a/pkg/storage/engine_key_test.go
+++ b/pkg/storage/engine_key_test.go
@@ -25,6 +25,7 @@ import (
 	"github.com/cockroachdb/cockroach/pkg/storage/enginepb"
 	"github.com/cockroachdb/cockroach/pkg/util/hlc"
 	"github.com/cockroachdb/cockroach/pkg/util/leaktest"
+	"github.com/cockroachdb/cockroach/pkg/util/log"
 	"github.com/cockroachdb/cockroach/pkg/util/protoutil"
 	"github.com/cockroachdb/cockroach/pkg/util/randutil"
 	"github.com/cockroachdb/cockroach/pkg/util/uuid"
@@ -527,3 +528,91 @@ func engineKey(key string, ts int) EngineKey {
 		Version: encodeMVCCTimestamp(wallTS(ts)),
 	}
 }
+
+func BenchmarkEngineKeyVerify(b *testing.B) {
+	defer log.Scope(b).Close(b)
+
+	type testKV struct {
+		ek  EngineKey
+		val []byte
+	}
+	type testCase struct {
+		name string
+		testKV
+	}
+
+	mustDecodeEngineKey := func(raw []byte) EngineKey {
+		ek, ok := DecodeEngineKey(raw)
+		require.True(b, ok)
+		return ek
+	}
+	mustEncodeMVCC := func(key MVCCKey, v MVCCValue) testKV {
+		encodedKey := EncodeMVCCKey(key)
+		encodedVal, err := EncodeMVCCValue(v)
+		require.NoError(b, err)
+		return testKV{ek: mustDecodeEngineKey(encodedKey), val: encodedVal}
+	}
+	mustEncodeLockTable := func(key LockTableKey, meta *enginepb.MVCCMetadata, v MVCCValue) testKV {
+		encodedMVCCValue, err := EncodeMVCCValue(v)
+		require.NoError(b, err)
+		meta.RawBytes = encodedMVCCValue
+		encodedVal, err := protoutil.Marshal(meta)
+		require.NoError(b, err)
+		ek, _ := key.ToEngineKey(nil)
+		return testKV{ek: ek, val: encodedVal}
+	}
+
+	testCases := []testCase{
+		{
+			name: "SimpleMVCCValue",
+			testKV: mustEncodeMVCC(
+				MVCCKey{
+					Key:       roachpb.Key("foobar"),
+					Timestamp: hlc.Timestamp{WallTime: 1711383740550067000, Logical: 2},
+				},
+				MVCCValue{Value: roachpb.Value{RawBytes: []byte("hello world")}},
+			),
+		},
+		{
+			name: "ExtendedMVCCValue",
+			testKV: mustEncodeMVCC(
+				MVCCKey{
+					Key:       roachpb.Key("foobar"),
+					Timestamp: hlc.Timestamp{WallTime: 1711383740550067000, Logical: 2},
+				},
+				MVCCValue{
+					MVCCValueHeader: enginepb.MVCCValueHeader{
+						LocalTimestamp:   hlc.ClockTimestamp{WallTime: 1711383740550069000},
+						OmitInRangefeeds: true,
+					},
+					Value: roachpb.Value{RawBytes: []byte("hello world")},
+				},
+			),
+		},
+		{
+			name: "LockTableKey",
+			testKV: mustEncodeLockTable(
+				LockTableKey{
+					Key:      roachpb.Key("foobar"),
+					Strength: lock.Exclusive,
+					TxnUUID:  uuid.UUID{},
+				},
+				&enginepb.MVCCMetadata{
+					KeyBytes: 100,
+					ValBytes: 100,
+				},
+				MVCCValue{
+					Value: roachpb.Value{RawBytes: []byte("hello world")},
+				},
+			),
+		},
+	}
+
+	for _, tc := range testCases {
+		b.Run(tc.name, func(b *testing.B) {
+			for i := 0; i < b.N; i++ {
+				_ = tc.ek.Verify(tc.val)
+			}
+		})
+	}
+}

@lyang24
Copy link
Collaborator Author

lyang24 commented Mar 25, 2024

pkg/kv/kvserver/store_snapshot.go line 555 at r7 (raw file):

Previously, pav-kv (Pavel Kalinnikov) wrote…

A bunch of func calls below (e.g. PutInternalRangeKey) will decode the EngineKey internally. I wonder if the APIs should accept EngineKey instead of the raw one. Probably not to be addressed in PR.

left a todo on this, personally I think all the method below can take a decoded engine key as input

Copy link
Collaborator Author

@lyang24 lyang24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appreciate the benchmark and explanations. I included the bench mark and the TODO.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @erikgrinaker, @jbowens, @pav-kv, and @sumeerbhola)


pkg/storage/engine_key.go line 389 at r7 (raw file):

Previously, jbowens (Jackson Owens) wrote…

Yeah, that MVCCMetadata unmarshal may allocate. I think we could reduce some of the allocations by doing something like:

type EngineKeyVerifier struct {
    meta enginepb.MVCCMetadata
}

func (v *EngineKeyVerifier) Verify(k EngineKey, v []byte) error {
    // Unmarshal into &v.meta
}

so the *enginepb.MVCCMetadata itself doesn't get allocate each invocation (see lockTableKeyScanner.getLockTableValue) that does something similar.

Here's a microbenchmark that demonstrates the allocations:

diff --git a/pkg/storage/engine_key_test.go b/pkg/storage/engine_key_test.go
index 43b6be771e9..1f0d45a9ffc 100644
--- a/pkg/storage/engine_key_test.go
+++ b/pkg/storage/engine_key_test.go
@@ -25,6 +25,7 @@ import (
 	"github.com/cockroachdb/cockroach/pkg/storage/enginepb"
 	"github.com/cockroachdb/cockroach/pkg/util/hlc"
 	"github.com/cockroachdb/cockroach/pkg/util/leaktest"
+	"github.com/cockroachdb/cockroach/pkg/util/log"
 	"github.com/cockroachdb/cockroach/pkg/util/protoutil"
 	"github.com/cockroachdb/cockroach/pkg/util/randutil"
 	"github.com/cockroachdb/cockroach/pkg/util/uuid"
@@ -527,3 +528,91 @@ func engineKey(key string, ts int) EngineKey {
 		Version: encodeMVCCTimestamp(wallTS(ts)),
 	}
 }
+
+func BenchmarkEngineKeyVerify(b *testing.B) {
+	defer log.Scope(b).Close(b)
+
+	type testKV struct {
+		ek  EngineKey
+		val []byte
+	}
+	type testCase struct {
+		name string
+		testKV
+	}
+
+	mustDecodeEngineKey := func(raw []byte) EngineKey {
+		ek, ok := DecodeEngineKey(raw)
+		require.True(b, ok)
+		return ek
+	}
+	mustEncodeMVCC := func(key MVCCKey, v MVCCValue) testKV {
+		encodedKey := EncodeMVCCKey(key)
+		encodedVal, err := EncodeMVCCValue(v)
+		require.NoError(b, err)
+		return testKV{ek: mustDecodeEngineKey(encodedKey), val: encodedVal}
+	}
+	mustEncodeLockTable := func(key LockTableKey, meta *enginepb.MVCCMetadata, v MVCCValue) testKV {
+		encodedMVCCValue, err := EncodeMVCCValue(v)
+		require.NoError(b, err)
+		meta.RawBytes = encodedMVCCValue
+		encodedVal, err := protoutil.Marshal(meta)
+		require.NoError(b, err)
+		ek, _ := key.ToEngineKey(nil)
+		return testKV{ek: ek, val: encodedVal}
+	}
+
+	testCases := []testCase{
+		{
+			name: "SimpleMVCCValue",
+			testKV: mustEncodeMVCC(
+				MVCCKey{
+					Key:       roachpb.Key("foobar"),
+					Timestamp: hlc.Timestamp{WallTime: 1711383740550067000, Logical: 2},
+				},
+				MVCCValue{Value: roachpb.Value{RawBytes: []byte("hello world")}},
+			),
+		},
+		{
+			name: "ExtendedMVCCValue",
+			testKV: mustEncodeMVCC(
+				MVCCKey{
+					Key:       roachpb.Key("foobar"),
+					Timestamp: hlc.Timestamp{WallTime: 1711383740550067000, Logical: 2},
+				},
+				MVCCValue{
+					MVCCValueHeader: enginepb.MVCCValueHeader{
+						LocalTimestamp:   hlc.ClockTimestamp{WallTime: 1711383740550069000},
+						OmitInRangefeeds: true,
+					},
+					Value: roachpb.Value{RawBytes: []byte("hello world")},
+				},
+			),
+		},
+		{
+			name: "LockTableKey",
+			testKV: mustEncodeLockTable(
+				LockTableKey{
+					Key:      roachpb.Key("foobar"),
+					Strength: lock.Exclusive,
+					TxnUUID:  uuid.UUID{},
+				},
+				&enginepb.MVCCMetadata{
+					KeyBytes: 100,
+					ValBytes: 100,
+				},
+				MVCCValue{
+					Value: roachpb.Value{RawBytes: []byte("hello world")},
+				},
+			),
+		},
+	}
+
+	for _, tc := range testCases {
+		b.Run(tc.name, func(b *testing.B) {
+			for i := 0; i < b.N; i++ {
+				_ = tc.ek.Verify(tc.val)
+			}
+		})
+	}
+}

Thank you for the explanation and benchmarks. I get it now allocate MVCCMetadata on every call could be improved. I added the benchmark with example results and added todo on the call.

Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @erikgrinaker, @pav-kv, and @sumeerbhola)

@lyang24 lyang24 force-pushed the verify_snapshot_checksum2 branch 3 times, most recently from c2f1fe4 to 18047bb Compare March 29, 2024 17:07
This commit verify record checksum verification when processing raft snapshot
records. This process prevents data corruption spread via snapshot process.
Cluster setting snapshotChecksumVerification is introduced to act as a
safeguard knob to turn on/off the verifcation.

Fixes: cockroachdb#110572

Release note: None
@lyang24
Copy link
Collaborator Author

lyang24 commented Mar 29, 2024

unfortunately this pr experienced the same test failures will wait for the fix of 121342 then rebase and merge
#121342

@lyang24
Copy link
Collaborator Author

lyang24 commented Mar 30, 2024

bors r+

@craig craig bot merged commit 7fc4c7b into cockroachdb:master Mar 30, 2024
21 of 22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-community Originated from the community X-blathers-triaged blathers was able to find an owner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kvserver: verify roachpb.Value checksums when building snapshots
5 participants