Fix explanation of XOR usage in KV checksum blog post (#10392)
Summary:
Thanks pdillinger for reminding us that we are protected from swapping corruptions due to independent seeds (and for suggesting that approach in the first place).

Pull Request resolved: facebook/rocksdb#10392

Reviewed By: cbi42

Differential Revision: D37981819

Pulled By: ajkr

fbshipit-source-id: 3ed32982ae1dbc88eb92569010f9f2e8d190c962
ajkr authored and facebook-github-bot committed Jul 20, 2022
1 parent b443d24 commit a0c6308
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion docs/_posts/2022-07-18-per-key-value-checksum.markdown
@@ -37,7 +37,9 @@ Key-value pairs have multiple representations in RocksDB: in [WriteBatch](https:

Besides user key and value, RocksDB includes internal metadata in the per key-value checksum calculation. Depending on the representation, internal metadata consists of some combination of sequence number, operation type, and column family ID. Note that since timestamp (when enabled) is part of the user key it is protected as well.

-The protection info consists of the XOR’d result of the xxh3 hash for all the protected components. Using XOR introduces a risk that swapping corruptions (e.g., key becomes the value and the value becomes the key) are undetectable. However, we think this is a reasonable tradeoff for the advantage it provides: we can efficiently transform protection info for different representations.
+The protection info consists of the XOR’d result of the xxh3 hash for all the protected components. This allows us to efficiently transform protection info for different representations. See below for an example converting WriteBatch protection info to memtable protection info.
+
+A risk of using XOR is the possibility of swapping corruptions (e.g., key becomes the value and the value becomes the key). To mitigate this risk, we use an independent seed for hashing each type of component.

The following two figures illustrate how protection info in WriteBatch and memtable are calculated from a key-value’s components.

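To make the corrected explanation concrete, here is a minimal Python sketch of the two properties the changed paragraphs describe: XOR lets protection info be converted between representations without rehashing the key and value, and per-component seeds make a key/value swap change the result. Everything named here is an assumption for illustration only: the seed constants, the helper names, the integer encoding, and the choice of which components belong to each representation (WriteBatch taken as key, value, operation type, column family ID; memtable as key, value, operation type, sequence number). BLAKE2b from the standard library stands in for xxh3, with the seed passed as the salt. This is not RocksDB's implementation.

```python
import hashlib

# Hypothetical seeds, one per component type. The real values are not
# given in the post; any distinct constants illustrate the idea.
SEED_KEY = 1
SEED_VALUE = 2
SEED_OP_TYPE = 3
SEED_SEQNO = 4
SEED_CF_ID = 5


def seeded_hash(data: bytes, seed: int) -> int:
    """64-bit seeded hash. BLAKE2b is a stand-in for xxh3 (assumption)."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(data, digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little")


def hash_int(value: int, seed: int) -> int:
    """Hash an integer component using a fixed 8-byte encoding (assumption)."""
    return seeded_hash(value.to_bytes(8, "little"), seed)


def write_batch_protection(key: bytes, value: bytes, op_type: int, cf_id: int) -> int:
    """Assumed WriteBatch components: key, value, op type, column family ID."""
    return (seeded_hash(key, SEED_KEY) ^ seeded_hash(value, SEED_VALUE)
            ^ hash_int(op_type, SEED_OP_TYPE) ^ hash_int(cf_id, SEED_CF_ID))


def memtable_protection(key: bytes, value: bytes, op_type: int, seqno: int) -> int:
    """Assumed memtable components: key, value, op type, sequence number."""
    return (seeded_hash(key, SEED_KEY) ^ seeded_hash(value, SEED_VALUE)
            ^ hash_int(op_type, SEED_OP_TYPE) ^ hash_int(seqno, SEED_SEQNO))


def write_batch_to_memtable(prot: int, cf_id: int, seqno: int) -> int:
    """XOR the column family ID hash out and the sequence number hash in.

    The key and value are never rehashed, which is the efficiency gain.
    """
    return prot ^ hash_int(cf_id, SEED_CF_ID) ^ hash_int(seqno, SEED_SEQNO)


def single_seed_protection(key: bytes, value: bytes) -> int:
    """Counter-example: one shared seed, so XOR cannot see a key/value swap."""
    return seeded_hash(key, 0) ^ seeded_hash(value, 0)


key, value, op_type, cf_id, seqno = b"foo", b"bar", 1, 0, 42

wb_prot = write_batch_protection(key, value, op_type, cf_id)
converted = write_batch_to_memtable(wb_prot, cf_id, seqno)

# Conversion agrees with computing the memtable protection info from scratch.
print(converted == memtable_protection(key, value, op_type, seqno))
# With a shared seed, swapping key and value goes unnoticed.
print(single_seed_protection(key, value) == single_seed_protection(value, key))
# With independent seeds, the swapped entry yields different protection info.
print(wb_prot == write_batch_protection(value, key, op_type, cf_id))
```

The first two comparisons hold by construction, because XOR is commutative and each term cancels itself. The third relies on the seeded hashes of the same bytes differing, which holds for any reasonable seeded hash apart from a negligible collision chance.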
