Silent data corruption in Redis #3730

Open
aganesan4 opened this Issue Jan 5, 2017 · 1 comment

@aganesan4

Redis does not checksum the entries in its append-only file (AOF). Without checksums, Redis is vulnerable to silent data corruption caused by faults in the underlying disks and the file systems atop them [1,2].

We set up a Redis cluster with three nodes. In a small test case where the underlying disk or file system corrupts a key or value in the master's append-only file, Redis can silently return corrupted user data on a read request.

Moreover, the master-slave resynchronization protocol in Redis spreads the corrupted data to the other, intact slaves. We reproduced this scenario using our testing framework.
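The silent-read case can be illustrated with a minimal Python sketch. It assumes a simplified, hypothetical `parse_command` helper (not Redis's actual AOF loader) that reads one logged command in RESP form, and it flips a single bit in the stored value to stand in for disk or file-system corruption. The entry stays well-formed, so nothing signals the change.

```python
def parse_command(buf: bytes) -> list:
    """Parse one RESP array of bulk strings (hypothetical helper, not Redis code)."""
    pos = 0

    def read_line() -> bytes:
        nonlocal pos
        end = buf.index(b"\r\n", pos)  # raises ValueError if the terminator is missing
        line = buf[pos:end]
        pos = end + 2
        return line

    header = read_line()
    if not header.startswith(b"*"):
        raise ValueError("expected array header")
    args = []
    for _ in range(int(header[1:])):
        size_line = read_line()
        if not size_line.startswith(b"$"):
            raise ValueError("expected bulk string header")
        size = int(size_line[1:])
        args.append(buf[pos:pos + size])
        if buf[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("missing bulk string terminator")
        pos += size + 2
    return args


entry = b"*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\nbar\r\n"

# Simulate corruption: flip the lowest bit of the last byte of the value "bar".
corrupted = bytearray(entry)
corrupted[entry.index(b"bar") + 2] ^= 0x01

print(parse_command(bytes(corrupted)))  # -> [b'SET', b'foo', b'bas']
```

Because the length prefixes and terminators are untouched, the corrupted entry parses cleanly and the wrong value would be loaded as if it were what the client wrote.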

Is there a reason why Redis doesn't use checksums to protect the data in the append-only file from corruption?

[1] https://research.cs.wisc.edu/wind/Publications/zfs-corruption-fast10.pdf
[2] http://www.cs.toronto.edu/~bianca/papers/fast08.pdf

@antirez
Owner
antirez commented Jan 13, 2017

Hello. Yes... this is desirable indeed. A few points to frame further discussion:

  1. Adding a checksum every few logged entries increases the AOF size; however, this could be opt-in.
  2. Starting with Redis 4.0, we have an option to rewrite the AOF as an RDB preamble followed by an AOF tail. RDB has a CRC64 checksum at the end, so at least the rewritten data is covered. The AOF tail, however, is still not protected.
  3. There are different representations for the checksum that trade size against the ability to parse the file as Redis protocol (currently the AOF is parsable as RESP, the Redis protocol format).

For instance instead of logging:

*3
$3
SET
$3
foo
$3
bar

We could log:

[ff84a122]
*3
$3
SET
$3
foo
$3
bar

Here the number between [] is a CRC32 of the RESP representation of the command that follows it.
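A minimal Python sketch of this per-command framing, for discussion only. The helper names (`encode_resp`, `frame_with_crc`, `verify_frame`) are hypothetical, and the exact header layout (eight lowercase hex digits in brackets, terminated by CRLF) is an assumption; nothing here is an existing Redis format.

```python
import zlib


def encode_resp(*args: bytes) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        out.append(b"$%d\r\n%s\r\n" % (len(arg), arg))
    return b"".join(out)


def frame_with_crc(command: bytes) -> bytes:
    """Prefix a RESP command with '[xxxxxxxx]' holding its CRC32 in hex (assumed layout)."""
    return b"[%08x]\r\n" % zlib.crc32(command) + command


def verify_frame(frame: bytes) -> bytes:
    """Return the command if its CRC32 matches the header, else raise ValueError."""
    header, _, command = frame.partition(b"\r\n")
    if not (header.startswith(b"[") and header.endswith(b"]")):
        raise ValueError("missing checksum header")
    if zlib.crc32(command) != int(header[1:-1], 16):
        raise ValueError("checksum mismatch")
    return command


frame = frame_with_crc(encode_resp(b"SET", b"foo", b"bar"))
verify_frame(frame)  # passes for an intact entry

try:
    verify_frame(frame.replace(b"bar", b"baz"))  # simulated corruption of the value
except ValueError as exc:
    print("rejected:", exc)
```

Each frame can be checked in isolation, which keeps the loader simple, but the file is no longer plain RESP because of the bracketed header line.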

Otherwise, to stay compatible with RESP:

*3
$3
SET
$3
foo
$3
bar
*2
$6
AOFCRC
$16
9133a4ffb4cc8233

In the second case, the AOFCRC command would carry the running CRC64 of the file up to that point, so it could be emitted from time to time rather than after every command. However, this means we need to keep state about the previous checksum, and it must also work after loading the AOF, after rewrites, and so forth. This solution is more global and technically more complex to implement, while the first is more local and therefore simpler. However, the local solution does not protect against, for example, out-of-order AOF lines.
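The difference between the two schemes can be sketched in Python under several stated assumptions: `zlib.crc32` stands in for CRC64 (the Python standard library has no CRC64), the log is modeled as a list of already-split entries rather than a parsed file, AOFCRC markers are themselves excluded from the running checksum, and the `every` parameter and all helper names are hypothetical.

```python
import zlib


def encode_resp(*args: bytes) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    return b"*%d\r\n" % len(args) + b"".join(
        b"$%d\r\n%s\r\n" % (len(a), a) for a in args
    )


def build_log(commands, every=2):
    """Append an AOFCRC marker with the running checksum after every `every` commands."""
    log, crc = [], 0
    for i, cmd in enumerate(commands, 1):
        log.append(cmd)
        crc = zlib.crc32(cmd, crc)  # CRC32 used as a stand-in for CRC64
        if i % every == 0:
            log.append(encode_resp(b"AOFCRC", b"%08x" % crc))
    return log


def verify_log(log) -> bool:
    """Recompute the running checksum and compare it at each AOFCRC marker."""
    crc = 0
    for entry in log:
        if entry.startswith(b"*2\r\n$6\r\nAOFCRC\r\n"):
            if int(entry.split(b"\r\n")[4], 16) != crc:
                return False
        else:
            crc = zlib.crc32(entry, crc)
    return True


def local_ok(commands, checksums) -> bool:
    """Per-command check: each entry only has to match its own CRC."""
    return all(zlib.crc32(c) == s for c, s in zip(commands, checksums))


cmds = [encode_resp(b"SET", b"foo", b"1"), encode_resp(b"SET", b"foo", b"2")]
sums = [zlib.crc32(c) for c in cmds]
log = build_log(cmds)

# Swap the two commands, moving each per-command checksum along with its entry.
print(local_ok(cmds[::-1], sums[::-1]))       # True: reordering goes unnoticed
print(verify_log([log[1], log[0], log[2]]))   # False: running checksum catches it
```

The running variant needs the checksum state to be carried across appends, restarts, and rewrites, which is the extra complexity mentioned above.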
