Description
Hi,
I would like to add a feature that lets clients verify that, in an erasure-coded setup, the data present on all bricks is correct.
To achieve this, GlusterFS needs a way for clients to specify which bricks to read from during a read operation. With that in place, we can perform two reads over different subsets of bricks and compare a checksum of each result against the checksum of the original data to detect incorrect data. For example, in an 8+4 erasure-coded setup, we can store the MD5 checksum of the file before writing it to GlusterFS. The first read uses the first 8 bricks and the second read uses the last 8 bricks, so between them the two reads touch all 12 bricks. We calculate the checksum of the data returned by each read. If both checksums match the original checksum, the data on all bricks is correct. If either one differs, we know that the data on at least one brick used in that read is incorrect.
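The two-read check described above could look roughly like the following Python sketch. It is only an illustration: `read_from_bricks` is a hypothetical callable standing in for the requested feature (GlusterFS does not expose such an API today), and `make_fake_reader` is a made-up simulation used to exercise the logic without a real volume.

```python
import hashlib
from typing import Callable, List, Sequence, Set

# Hypothetical reader signature: returns the file contents decoded
# from exactly the given brick indices. This is the feature being requested.
BrickReader = Callable[[str, Sequence[int]], bytes]


def md5_hex(data: bytes) -> str:
    """Return the MD5 checksum of data as a hex string."""
    return hashlib.md5(data).hexdigest()


def covering_brick_sets(data_bricks: int, redundancy: int) -> List[List[int]]:
    """Return two brick subsets (first N and last N) that together cover every brick."""
    if redundancy > data_bricks:
        # Two subsets of size data_bricks cannot cover all bricks in this case.
        raise ValueError("two reads cannot cover all bricks")
    total = data_bricks + redundancy
    first = list(range(0, data_bricks))
    last = list(range(total - data_bricks, total))
    return [first, last]


def verify_file(path: str, expected_md5: str, read_from_bricks: BrickReader,
                data_bricks: int = 8, redundancy: int = 4) -> bool:
    """Read the file once per brick subset and compare each checksum to the original."""
    for bricks in covering_brick_sets(data_bricks, redundancy):
        if md5_hex(read_from_bricks(path, bricks)) != expected_md5:
            return False
    return True


def make_fake_reader(payload: bytes, corrupted: Set[int]) -> BrickReader:
    """Simulated reader: any read that touches a corrupted brick returns wrong data."""
    def reader(path: str, bricks: Sequence[int]) -> bytes:
        if corrupted.intersection(bricks):
            return payload + b"\x00"  # stand-in for an incorrect decode
        return payload
    return reader


if __name__ == "__main__":
    original = b"example file contents"
    checksum = md5_hex(original)  # stored before the write
    print(verify_file("/mnt/vol/file", checksum, make_fake_reader(original, set())))
    print(verify_file("/mnt/vol/file", checksum, make_fake_reader(original, {10})))
```

In this simulation, a corrupted brick 10 is only touched by the second read (bricks 4 to 11), which is why both reads are needed to cover all 12 bricks.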
One possible way to implement this is to store an extended attribute on disk, fetch it before a read call is made, and use its value to select the bricks used in that read operation.
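As a rough illustration of what such an extended attribute value might carry, the sketch below encodes a set of brick indices as a hexadecimal bitmask. The attribute name `user.ec.read-bricks` and the encoding are assumptions made up for this example; GlusterFS defines no such attribute.

```python
from typing import Iterable, List

# Hypothetical attribute name, chosen only for this example.
READ_MASK_XATTR = "user.ec.read-bricks"


def encode_brick_mask(bricks: Iterable[int], total_bricks: int) -> bytes:
    """Encode brick indices as a hex bitmask (bit i set means brick i is used)."""
    mask = 0
    for brick in bricks:
        if not 0 <= brick < total_bricks:
            raise ValueError(f"brick index {brick} out of range")
        mask |= 1 << brick
    return format(mask, "x").encode("ascii")


def decode_brick_mask(value: bytes, total_bricks: int) -> List[int]:
    """Decode a hex bitmask back into a sorted list of brick indices."""
    mask = int(value.decode("ascii"), 16)
    return [brick for brick in range(total_bricks) if mask & (1 << brick)]
```

On Linux, a client could store such a value with `os.setxattr(path, READ_MASK_XATTR, value)` and read it back with `os.getxattr`. Whether the read path would honor it is exactly the behavior being proposed here.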