Enhancement: Ability to specify which bricks to read from in an EC setup #4494

@sbk173

Hi,

I would like to add functionality that lets clients verify that, in an erasure-coded (EC) setup, the data on every brick is correct.

To achieve this, GlusterFS needs a way for clients to specify which bricks to read from during a read operation. If that is possible, we can perform two reads over different sets of bricks and compare the checksum of each result against a checksum of the original data to detect corruption.

For example, in an 8+4 erasure-coded setup (12 bricks in total), we can compute and store the MD5 checksum of the file before writing it to GlusterFS. We then read the file using only the first 8 bricks and calculate the checksum of the data obtained. In a second read, we use only the last 8 bricks and calculate the checksum again. Together, the two reads cover all 12 bricks. If both checksums match the original checksum, the data on all bricks is correct. If either one does not, we know that the data on at least one brick is incorrect.
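The check described above can be sketched roughly as follows. This is only an illustration of the idea, not an existing GlusterFS API: `read_from_bricks` is a hypothetical callable standing in for the requested feature, and the 0-based brick indexing is an assumption.

```python
import hashlib
from typing import Callable, List, Sequence, Tuple

# Hypothetical reader: given a path and the brick indices to use, return the
# decoded file contents. GlusterFS does not expose such a call today.
BrickReader = Callable[[str, Sequence[int]], bytes]


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()


def brick_subsets(data_bricks: int, total_bricks: int) -> Tuple[List[int], List[int]]:
    """Return the first and the last `data_bricks` brick indices (0-based)."""
    if 2 * data_bricks < total_bricks:
        # Two subsets of this size could not cover every brick.
        raise ValueError("two reads cannot cover all bricks")
    first = list(range(data_bricks))
    last = list(range(total_bricks - data_bricks, total_bricks))
    return first, last


def verify_all_bricks(
    path: str,
    expected_md5: str,
    read_from_bricks: BrickReader,
    data_bricks: int = 8,
    total_bricks: int = 12,
) -> bool:
    """Read the file twice over different brick subsets and compare checksums.

    Returns True only if both reads reproduce the original checksum.
    """
    for subset in brick_subsets(data_bricks, total_bricks):
        if md5_hex(read_from_bricks(path, subset)) != expected_md5:
            return False
    return True
```

A caller would compute `md5_hex(original_bytes)` before the write and pass it as `expected_md5` later. Note that a `False` result only says that some brick in a failing subset is bad, not which one.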

One possible way to implement this is to store an extended attribute on disk, fetch it before each read call, and use its value to select which bricks are used for that read.
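As a rough sketch of the client side of that idea, the brick selection could be encoded as a bitmask and stored in an extended attribute. Everything here is an assumption for illustration: the attribute name `user.ec.read-bricks`, the hex bitmask encoding, and the helper names are hypothetical, and the EC translator would still need to be changed to honor the value. `os.setxattr` is the standard Python call and is available on Linux only.

```python
import os
from typing import Iterable, List

# Hypothetical attribute name; GlusterFS defines no such xattr today.
XATTR_NAME = "user.ec.read-bricks"


def encode_brick_mask(bricks: Iterable[int], total_bricks: int = 12) -> bytes:
    """Encode 0-based brick indices as a hex bitmask (bit i set = use brick i)."""
    mask = 0
    for brick in bricks:
        if not 0 <= brick < total_bricks:
            raise ValueError(f"brick index {brick} out of range")
        mask |= 1 << brick
    return format(mask, "x").encode("ascii")


def decode_brick_mask(value: bytes) -> List[int]:
    """Decode a hex bitmask back into a sorted list of brick indices."""
    mask = int(value.decode("ascii"), 16)
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def set_read_bricks(path: str, bricks: Iterable[int], total_bricks: int = 12) -> None:
    """Store the desired brick selection on the file (Linux only)."""
    os.setxattr(path, XATTR_NAME, encode_brick_mask(bricks, total_bricks))
```

With this encoding, the first 8 bricks of a 12-brick volume become `ff` and the last 8 become `ff0`, so a verifier could set one value, read, then set the other and read again.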
