Feature Request: mini-scrub mirrored non-checksummed data (i.e. NoCOW) #134

Open
jamespharvey20 opened this issue May 13, 2018 · 3 comments

Comments

@jamespharvey20
Contributor

On a 3-device btrfs RAID 1, I have 2 files that are marked NoCOW but are still lzo-compressed, and reading them crashes the system. They're journald files, so nothing that should crash the system. This has been discussed on the mailing list under the title '"decompress failed" in 1-2 files always causes kernel oops, check/scrub pass'. Chris Murphy remembered a discussion in the archives saying that compression can be forced on NoCOW files under certain circumstances. Bug reported here: https://bugzilla.kernel.org/show_bug.cgi?id=199707

Being marked NoCOW, they don't have checksums, so btrfs scrub doesn't look at them.

In my situation, one of the mirrored copies is valid, and the other is invalid.

Even once the kernel crash is fixed, it would be a really important data integrity feature for btrfs-progs (I'm thinking a new scrub feature) to look at any files without a checksum when running on a mirrored volume, and compare the mirrored copies. If they differ, obviously nothing can be corrected automatically without a checksum to verify against. But it's important to let the user know there's a problem. They can check which version appears valid, restore that file from backups, or at least know something is wrong.
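To make the request concrete, here is a minimal sketch of the comparison I have in mind. It is not btrfs-progs code: `find_mirror_mismatches`, `BLOCK_SIZE`, and the idea of having each mirror's copy available as a byte string are all assumptions made for illustration. It only reports where the copies disagree and makes no attempt to decide which copy is the good one.

```python
from typing import List, Sequence

BLOCK_SIZE = 4096  # assumed comparison granularity, not taken from btrfs-progs


def find_mirror_mismatches(copies: Sequence[bytes],
                           block_size: int = BLOCK_SIZE) -> List[int]:
    """Return byte offsets of blocks where the mirrored copies disagree.

    Hypothetical helper: each element of `copies` stands for the same
    extent as read from a different mirror.
    """
    if len(copies) < 2:
        return []  # a single copy has nothing to be compared against
    length = max(len(c) for c in copies)
    mismatches = []
    for offset in range(0, length, block_size):
        # Collect the distinct contents of this block across all mirrors.
        blocks = {bytes(c[offset:offset + block_size]) for c in copies}
        if len(blocks) > 1:
            mismatches.append(offset)
    return mismatches


good = b"A" * 8192
bad = b"A" * 4096 + b"B" * 4096  # second block corrupted on one mirror
print(find_mirror_mismatches([good, bad]))  # [4096]
```

Without checksums, a mismatch like this can only be reported, not repaired, which is all this request asks for.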

@jamespharvey20 changed the title from "Feature Request: mini-scrub mirrored NoCOW data" to "Feature Request: mini-scrub mirrored non-checksummed data (i.e. NoCOW)" on May 13, 2018
@adam900710
Collaborator

Pull request #135 should address the problem, although not by fixing or scrubbing such extents.

That pull request will report such extents as errors. As for the kernel-side btrfs fix, we will at least prevent such problems by never compressing any extent if NODATASUM is set.
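As a rough illustration of the kind of check described above (not the actual code in #135), the rule is: an extent that is compressed while its inode has NODATASUM set is reported as an error. The `FileExtent` record, its field names, and the string encoding of the compression type are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FileExtent:
    """Hypothetical, simplified view of a file extent and its inode flag."""
    inode: int
    offset: int
    compression: str  # "none", "zlib", "lzo", ... (assumed encoding)
    nodatasum: bool   # True if the owning inode has NODATASUM set


def find_compressed_nodatasum(extents: Iterable[FileExtent]) -> List[FileExtent]:
    """Return extents that are compressed even though NODATASUM is set."""
    return [e for e in extents if e.nodatasum and e.compression != "none"]


extents = [
    FileExtent(inode=257, offset=0, compression="lzo", nodatasum=True),
    FileExtent(inode=258, offset=0, compression="none", nodatasum=True),
    FileExtent(inode=259, offset=0, compression="lzo", nodatasum=False),
]
for e in find_compressed_nodatasum(extents):
    print(f"error: inode {e.inode} offset {e.offset}: "
          f"{e.compression}-compressed extent on NODATASUM inode")
```

Only the first extent is flagged. The uncompressed NODATASUM extent passes this check, so it still has no checksum to verify against.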

@jamespharvey20
Contributor Author

I agree with #135.

I wanted to ask whether you think #135 makes this feature request unnecessary.

I still think there's good reason for it. Although the kernel patches and #135 will prevent NODATASUM/NODATACOW data from being compressed, there can of course still be uncompressed data without checksums whose mirrored copies diverge because one copy gets corrupted. If the good copy is read, the user never gets alerted that their data is no longer actually mirrored, unless the good copy gets corrupted too. Then (or if the bad copy happens to be read before the good one) the user silently gets bad data.

If the mirrored copies are compared, the user will know something's wrong and can deal with it while it's still fixable, granted not automatically.

@Forza-tng
Contributor

This is similar to a request I made a while back: #482
