Verify write batch checksum before WAL #10114

cbi42 · 2022-06-04T03:49:46Z

Summary:
Context: A WriteBatch carries key-value checksums when it is created with protection_bytes_per_key > 0.
This PR adds verification of those checksums before the write batch is written to the WAL.
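To make the idea concrete, here is a minimal, self-contained sketch of "verify per-entry checksums before appending to a log". It is a toy model, not RocksDB code: `ToyBatch`, `ToyChecksum`, and `ToyWriteToLog` are hypothetical names, and the FNV-1a hash stands in for whatever protection info RocksDB actually stores per key.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for a per-entry checksum (FNV-1a over key and value).
// Assumption: RocksDB's real protection info uses a different hash and layout.
inline uint64_t ToyChecksum(const std::string& key, const std::string& value) {
  uint64_t h = 1469598103934665603ULL;  // FNV-1a 64-bit offset basis
  auto mix = [&h](const std::string& s) {
    for (unsigned char c : s) {
      h ^= c;
      h *= 1099511628211ULL;  // FNV-1a 64-bit prime
    }
    h ^= 0xff;  // separator so the key/value boundary affects the hash
    h *= 1099511628211ULL;
  };
  mix(key);
  mix(value);
  return h;
}

struct ToyEntry {
  std::string key;
  std::string value;
};

struct ToyBatch {
  std::vector<ToyEntry> entries;
  std::vector<uint64_t> checksums;  // one per entry, computed when added

  void Put(const std::string& k, const std::string& v) {
    entries.push_back({k, v});
    checksums.push_back(ToyChecksum(k, v));
  }

  // Recomputes every checksum and compares it with the stored one.
  bool Verify() const {
    if (entries.size() != checksums.size()) return false;
    for (size_t i = 0; i < entries.size(); ++i) {
      if (ToyChecksum(entries[i].key, entries[i].value) != checksums[i]) {
        return false;
      }
    }
    return true;
  }
};

// Appends the batch to the toy log only if verification passes, so a
// corrupted batch never reaches the log.
inline bool ToyWriteToLog(const ToyBatch& batch, std::vector<ToyEntry>* log) {
  if (!batch.Verify()) return false;
  log->insert(log->end(), batch.entries.begin(), batch.entries.end());
  return true;
}
```

In this model, flipping a byte in an entry after `Put()` makes `ToyWriteToLog` return false and leaves the log unchanged.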

Test plan:

- Added new unit tests to db_kv_checksum_test.cc: make check -j32
- Benchmarked for performance regression: ./db_bench --benchmarks=fillrandom[-X20] -db=/dev/shm/test_rocksdb -write_batch_protection_bytes_per_key=8
- Pre-PR:
  fillrandom [AVG 20 runs] : 198875 (± 3006) ops/sec; 22.0 (± 0.3) MB/sec
- Post-PR:
  fillrandom [AVG 20 runs] : 196487 (± 2279) ops/sec; 21.7 (± 0.3) MB/sec
  Mean regressed about 1% (198875 -> 196487 ops/sec).
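For reference, the relative change can be computed as below. This is only the arithmetic on the two reported means; `RelativeDropPercent` is a hypothetical helper name. The drop is 2388 ops/sec, roughly 1.2%, which is smaller than the ± 3006 ops/sec spread reported for the pre-PR runs.

```cpp
#include <cassert>
#include <cmath>

// Percentage by which `after` is lower than `before`.
inline double RelativeDropPercent(double before, double after) {
  return (before - after) / before * 100.0;
}

// Example with the means reported above:
//   RelativeDropPercent(198875.0, 196487.0)  // roughly 1.2
```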

facebook-github-bot · 2022-06-04T04:17:55Z

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2022-06-04T18:55:26Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-06-04T19:16:13Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-06-04T19:16:26Z

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

ajkr

LGTM. It would be nice if there's a test for a write group containing two or more batches, with corruption happening before the merge.

ajkr · 2022-06-07T18:17:24Z

db/db_impl/db_impl_write.cc

@@ -512,15 +512,18 @@ Status DBImpl::WriteImpl(const WriteOptions& write_options,
  }
  PERF_TIMER_START(write_pre_and_post_process_time);

+  if (!io_s.ok()) {


The changes to the error handling in this file look good, and could perhaps go even further by moving the IOStatusCheck() next to the point io_s is assigned, and reducing the scope of io_s. My understanding is these changes aren't strictly needed for this PR. LMK if this is incorrect. It's fine to include them here either way.

I changed the error handling of io_s in this file because I was worried about the case where WriteToWAL returns corruption and w.CallbackFailed() is true: before the change in this PR, that case either hit an assert on io_s.ok() or left io_s unchecked. I'm not familiar with the writer callback, so whether the error handling change needs to be included in this PR depends on whether the above scenario is possible.

Sorry, the longer I look at the existing code, the more confusing it becomes. It might be because FinalStatus() returns any non-callback failure first. However, if the callback failed, then the callback failure is the first failure that happened so should be returned in FinalStatus(). So when WriteToWAL() and leader callback both failed, we should only record the callback failure.

Why we even proceed to WriteToWAL() after callback failure, considering rocksdb/db/write_callback.h, lines 18 to 20 in ad135f3:

    // Will be called while on the write thread before the write executes. If
    // this function returns a non-OK status, the write will be aborted and this
    // status will be returned to the caller of DB::Write().

is a mystery to me. What happens to callback failures in non-leader writers is also unclear.

Anyways, I don't want to derail this. This is setting a DB wide error when WriteToWAL() fails and that's good for me. ~~Worst case looks like a WriteToWAL() error can be returned when the actual first failure was in the leader callback, which is fine with me.~~

> Why we even proceed to WriteToWAL() after callback failure considering

I'm guessing that for a write group, the writes whose callbacks return failure will be ignored in the following WAL/memtable operations, but we still proceed to WriteToWAL() with the writes in the group whose callbacks were successful.
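The guess above can be sketched as follows. This is a toy model of the described behavior only and has not been checked against the RocksDB write path; `SketchWriter` and `BatchesForWal` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One member of a toy write group.
struct SketchWriter {
  std::string batch;  // payload this writer wants to persist
  bool callback_ok;   // whether this writer's pre-write callback succeeded
};

// Collects the payloads that would proceed to the log: writers whose
// callback failed are skipped, the rest of the group still goes through.
inline std::vector<std::string> BatchesForWal(
    const std::vector<SketchWriter>& group) {
  std::vector<std::string> out;
  for (const SketchWriter& w : group) {
    if (!w.callback_ok) continue;  // ignored in the following log write
    out.push_back(w.batch);
  }
  return out;
}
```

Under this model, a group can still reach the log write even when one member's callback failed, which would explain why the write path does not stop at the first callback failure.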


facebook-github-bot · 2022-06-08T22:15:52Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-06-08T22:31:30Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-06-08T22:42:27Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-06-09T07:01:37Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-06-09T22:37:18Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-06-10T02:54:55Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-06-10T04:43:44Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-06-10T05:41:57Z

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2022-06-10T20:49:29Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-06-10T22:13:27Z

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

cbi42 · 2022-06-11T00:35:30Z

> LGTM. It would be nice if there's a test for a write group containing two or more batches, with corruption happening before the merge.

Thanks for the suggestion! I added some tests with write groups of two batches, with corruption happening before the merge.
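As an illustration of the scenario being tested, here is a self-contained toy model of a write group whose batches are verified before they are merged. It is not the test added to db_kv_checksum_test.cc; `GroupBatch`, `RecordChecksum`, and `MergeGroup` are hypothetical names, and FNV-1a stands in for the real checksum.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy per-record checksum (FNV-1a); an assumption, not RocksDB's hash.
inline uint64_t RecordChecksum(const std::string& record) {
  uint64_t h = 1469598103934665603ULL;
  for (unsigned char c : record) {
    h ^= c;
    h *= 1099511628211ULL;
  }
  return h;
}

struct GroupBatch {
  std::vector<std::string> records;
  std::vector<uint64_t> checksums;  // one per record

  void Add(const std::string& r) {
    records.push_back(r);
    checksums.push_back(RecordChecksum(r));
  }
};

inline bool BatchIsIntact(const GroupBatch& b) {
  if (b.records.size() != b.checksums.size()) return false;
  for (size_t i = 0; i < b.records.size(); ++i) {
    if (RecordChecksum(b.records[i]) != b.checksums[i]) return false;
  }
  return true;
}

// Verifies every batch in the group first, then merges them in order.
// Returns -1 on success, or the index of the first corrupted batch, in
// which case `merged` is left untouched.
inline int MergeGroup(const std::vector<GroupBatch>& group,
                      GroupBatch* merged) {
  for (size_t i = 0; i < group.size(); ++i) {
    if (!BatchIsIntact(group[i])) return static_cast<int>(i);
  }
  for (const GroupBatch& b : group) {
    merged->records.insert(merged->records.end(), b.records.begin(),
                           b.records.end());
    merged->checksums.insert(merged->checksums.end(), b.checksums.begin(),
                             b.checksums.end());
  }
  return -1;
}
```

Corrupting a record in the second batch of a two-batch group makes `MergeGroup` return 1 without producing a merged batch, which mirrors "corruption happening before the merge".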

ajkr

LGTM, great work!


facebook-github-bot · 2022-06-14T16:49:32Z

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.


facebook-github-bot · 2022-06-14T20:15:17Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-06-14T20:26:12Z

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2022-06-15T18:30:01Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-06-15T18:30:22Z

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Summary: Update HISTORY.md for #10114: write batch checksum verification before writing to WAL.

Pull Request resolved: #10189

Reviewed By: ajkr

Differential Revision: D37226366

Pulled By: cbi42

fbshipit-source-id: cd2f076961abc35f35783e0f2cc3beda68cdb446

…n is turned on (#10201)

Summary: This bug was discovered after write batch checksum verification before WAL is added (#10114) and stress test with write batch checksum protection is turned on (#10037). In this [line](https://github.com/facebook/rocksdb/blob/d5d8920f2cfd06d1803b0976acbe8b564b88b6b1/db/write_batch.cc#L2887), the number of checksums may not be consistent with `batch->Count()`. This PR fixes this issue.

Pull Request resolved: #10201

Test Plan:
```
./db_stress --batch_protection_bytes_per_key=8 --destroy_db_initially=1 --max_key=100000 --use_txn=1
```

Reviewed By: ajkr

Differential Revision: D37260799

Pulled By: cbi42

fbshipit-source-id: ff8dce7dcce295d689333bc9d892d17a843bf0ea

facebook-github-bot added the CLA Signed label Jun 4, 2022

cbi42 requested a review from ajkr June 4, 2022 05:33

ajkr approved these changes Jun 7, 2022

cbi42 force-pushed the wal-writebatch-checksum branch from 2af6671 to 63844d2 Compare June 8, 2022 22:31

cbi42 force-pushed the wal-writebatch-checksum branch from 63844d2 to 2cd712e Compare June 8, 2022 22:42

cbi42 force-pushed the wal-writebatch-checksum branch from 0b823b7 to f90cf0a Compare June 10, 2022 04:43

cbi42 requested a review from ajkr June 10, 2022 23:07

ajkr approved these changes Jun 14, 2022

db/write_batch.cc (resolved)

riversand963 reviewed Jun 14, 2022

db/db_impl/db_impl_write.cc (outdated, resolved)

riversand963 reviewed Jun 14, 2022

db/write_batch.cc (outdated, resolved)

cbi42 added 2 commits June 15, 2022 11:29

Verify write batch checksum before WAL

2abbd2a

Added write batch kv-checksum to db_bench

5760cb1

cbi42 added 6 commits June 15, 2022 11:29

Address corruption in pre-merged batches

dde0ce0

fix bug

d5b958e

Fix CI failure

a0cfa99

Fix CI failure

c0cb5f7

Ignore op_types that are not checksum protected

461dae7

Address comments

3e3ad84

cbi42 force-pushed the wal-writebatch-checksum branch from fad96d5 to 3e3ad84 Compare June 15, 2022 18:29

facebook-github-bot closed this in 9882652 Jun 15, 2022

cbi42 added a commit to cbi42/rocksdb that referenced this pull request Jun 16, 2022

Update HISTORY.md for write batch checksum verificaiton (facebook#10114)

6654a76

cbi42 mentioned this pull request Jun 16, 2022

Update HISTORY.md for #10114 #10189

Closed

cbi42 mentioned this pull request Jun 18, 2022

Fix a bug in WriteBatchInternal::Append when write batch KV protection is turned on #10201

Closed
