[Feature Request] Include PAR as an option #1

Open
redmop opened this issue Aug 10, 2017 · 5 comments

@redmop

redmop commented Aug 10, 2017

How does it handle corrupted backup files? Maybe include PAR as an option.

@someone1
Owner

The original intention for this application is for the files to reside on storage targets that are not prone to such issues, such as those offered by Google, Amazon, Backblaze, Azure, etc.

I did look into adding this feature (specifically, Reed-Solomon erasure encoding) but reasoned it was not required for this project.

If there is interest in using targets that don't have such file resiliency guarantees, where adding parity data to each file makes sense, I'd be happy to add this as an option in the future.
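For readers unfamiliar with the idea, here is a minimal sketch of how parity data lets a lost block be rebuilt. It uses plain XOR parity, which is much simpler than Reed-Solomon and tolerates only one missing block. It is not something zfsbackup-go implements, and the `xorParity` helper is a hypothetical name chosen for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// xorParity returns the byte-wise XOR of the given blocks.
// Assumption: blocks is non-empty and every block has the same length.
func xorParity(blocks [][]byte) []byte {
	parity := make([]byte, len(blocks[0]))
	for _, b := range blocks {
		for i := range b {
			parity[i] ^= b[i]
		}
	}
	return parity
}

func main() {
	data := [][]byte{[]byte("AAAA"), []byte("BBBB"), []byte("CCCC")}
	parity := xorParity(data)

	// Pretend block 1 was lost: XOR-ing the surviving blocks with the
	// parity block reproduces the missing one.
	rebuilt := xorParity([][]byte{data[0], data[2], parity})
	fmt.Println(bytes.Equal(rebuilt, data[1]))
}
```

Reed-Solomon generalizes this so that several blocks can be lost at once, at the cost of storing more parity blocks.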

@redmop
Author

redmop commented Aug 11, 2017

I would definitely suggest something along these lines. "file resiliency guarantees" don't help you when the file you are working with is corrupt somehow.

But then, I'm a backup nut. I use ZFS send/recv, BackupPC, Borg, and Proxmox all on the same data.

@prologic

So S3, Google, Azure and Backblaze guarantee data integrity? I'm looking at using this project to back up my ZFS pool to Backblaze.

@someone1
Owner

someone1 commented Aug 30, 2017

Yes, almost all intended targets come with some SLA on the durability of the data you store with them, and since zfsbackup-go supports multiple targets, you can increase your durability by using multiple providers. Note: I am still working on adding Azure/Backblaze targets; they should land within a week or two.

Google: 99.999999999% durability - they mention the use of erasure encodings
AWS S3: 99.999999999% durability - they mention the use of checksums on the data for integrity validation and repair (something that sounds similar to what ZFS does, though I'm sure it's more distributed and complicated than that)
Azure: Although no durability target is provided, they give an in-depth explanation of their architecture which you can read here - they use Reed-Solomon erasure encoding, and you can increase durability by increasing your redundancy options
Backblaze: 99.999999% durability - they also use Reed-Solomon erasure encoding and even open sourced their implementation of it
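To put a rough number on the point about multiple providers, here is a back-of-the-envelope sketch. It assumes that failures at different providers are fully independent, which is an idealization, and it takes the loss probability for each provider to be one minus the quoted durability. The `combinedLoss` function is a hypothetical helper, not part of zfsbackup-go.

```go
package main

import "fmt"

// combinedLoss returns the probability that every copy is lost, given the
// per-provider loss probabilities. Assumption: failures are independent,
// so the probabilities simply multiply.
func combinedLoss(lossProbs ...float64) float64 {
	p := 1.0
	for _, q := range lossProbs {
		p *= q
	}
	return p
}

func main() {
	// Loss probabilities written directly (1 - durability) to avoid
	// floating-point cancellation when subtracting from 1.
	google := 1e-11   // from the quoted 99.999999999% durability
	backblaze := 1e-8 // from the quoted 99.999999% durability

	fmt.Printf("probability of losing both copies: %.1e\n",
		combinedLoss(google, backblaze))
}
```

Under that independence assumption the loss probabilities multiply, so a second provider adds its own "nines" to the first rather than merely matching the better of the two.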

I also use the checksum features of the available targets to ensure that data is delivered intact when it is stored (e.g. CRC32C for Google, MD5 for S3, etc.). This is all computed as the ZFS send stream is chunked and made ready for uploading.

@prologic

prologic commented Sep 2, 2017

Nice! Thanks for the writeup! We should put this in the README for reference?

someone1 changed the title from "Backup file corruption" to "[Feature Request] Include PAR as an option" on Nov 3, 2017