Support checksum verification for values read from vlog #1052
Values are written to the value log together with their checksums, but we were not validating those checksums on read. This commit adds checksum validation for reads. If "VerifyValueChecksum" is set to true, the checksum is calculated and checked for each entry read from the value log. This option has no effect if the value is stored completely in the SST.
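To illustrate the idea, here is a minimal, standalone sketch of opt-in checksum verification on read. It is not Badger's actual code or on-disk format: the entry layout (value followed by a 4-byte CRC32), the Castagnoli polynomial, and the names `encodeEntry`, `readEntry`, and `errChecksumMismatch` are all assumptions made for the example.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// Hypothetical choice of polynomial for this sketch.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// errChecksumMismatch is a hypothetical error for this sketch.
var errChecksumMismatch = errors.New("checksum mismatch")

// encodeEntry returns value followed by a 4-byte big-endian CRC32 of value.
// This layout is assumed for illustration only.
func encodeEntry(value []byte) []byte {
	buf := make([]byte, len(value)+4)
	copy(buf, value)
	binary.BigEndian.PutUint32(buf[len(value):], crc32.Checksum(value, castagnoli))
	return buf
}

// readEntry splits an encoded entry into its value and stored checksum.
// The checksum is recomputed and compared only when verify is true,
// mirroring an opt-in flag such as VerifyValueChecksum.
func readEntry(buf []byte, verify bool) ([]byte, error) {
	if len(buf) < 4 {
		return nil, errors.New("entry too short")
	}
	value := buf[:len(buf)-4]
	if verify {
		stored := binary.BigEndian.Uint32(buf[len(buf)-4:])
		if crc32.Checksum(value, castagnoli) != stored {
			return nil, errChecksumMismatch
		}
	}
	return value, nil
}

func main() {
	entry := encodeEntry([]byte("hello"))
	entry[0] ^= 0xff // simulate corruption in the middle of the log

	_, err := readEntry(entry, false)
	fmt.Println("verify off:", err)

	_, err = readEntry(entry, true)
	fmt.Println("verify on:", err)
}
```

With verification off, the corrupted bytes are returned to the caller without an error; with it on, the read fails instead. The cost is one CRC pass over the value on every read, which is the trade-off behind the question of what the default should be.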
I am not sure whether the default behavior should be to verify the checksum of every entry read from the value log. This would significantly affect performance. Keeping it to
The chances of getting corrupted data somewhere in the middle of a value log file are pretty slim anyway. If there was some corruption, it might have affected the SSTs too. In that case, Badger would detect the corruption when the DB is opened, because we perform checksum verification for every SST file on open. So I am not sure if we should keep it enabled by default.
I would argue that the default behavior of a storage engine should be to provide consistency, not to look good on benchmarks. Users interested in performance can change the default value if they like. Users interested in consistency may not even know that, by default, Badger can sometimes lose data, even if the chances are pretty slim.