evaluate redundancy / error correction options #225

Open
ThomasWaldmann opened this Issue Sep 30, 2015 · 29 comments

ThomasWaldmann (Member) commented Sep 30, 2015

There is some danger that bit rot and storage media defects could lead to backup data loss / repository integrity issues. Deduplicating backup systems are more vulnerable to this than non-deduplicating ones, because a defective chunk affects every backup archive that references it.

Currently, borgbackup does a lot of error detection (CRCs, hashes, HMACs), but it has no built-in support for error correction (see the FAQ about why). This could perhaps be addressed using one of these approaches:

  • use borg to keep N (N>1) independent backup repos of your data on different targets (if N-1 targets get corrupted, you still have 1 working copy left; note that there is no support for creating one non-corrupt repo from 2 corrupt repos, although that might be theoretically possible in some cases)
  • snapraid
  • par2
  • FECpp https://github.com/randombit/fecpp (BSD, C++ - make available via Cython?)
  • zfec (GPL/TGPPL, Python 2.x only, PR for Python 3.x exists)
  • RAID (and monitor and scrub the disks), ZFS mirror or RAIDZ* (better not use raid5 or raidz1)
  • zfs copies=N option (N>1)
  • specific filesystems
  • ceph librados
  • https://github.com/Bulat-Ziganshin/FastECC

If we can find some working approaches, we could add them to the documentation.
Help and feedback about this are welcome!

oderwat commented Sep 30, 2015

I think "bit rot" is real, simply because on really huge hard drives the capacity approaches the reciprocal of the bit error rate, so statistically there will be a block error on a drive once it is big enough. But that probably has to be solved by the filesystem. On the other hand, borg is a backup system and should be able to recover from some disasters. Bup uses par2 (on demand), which seems to work for them.

anarcat (Contributor) commented Oct 1, 2015

Links to the mentioned projects:

  • snapraid - backup and snapshot system with multiple redundant disk support
  • zfec - generic erasure coding (aka RAID-5, but with a customizable number of copies) in Python, used by Tahoe-LAFS
  • par2 - command-line-only tool for erasure coding; I implemented support for it for bup in bup-cron and it works well

I think zfec is the most interesting project here for our purposes, because of its Python API. We could use it to store redundant copies of the segments' chunks and double-check that in borg check.

Anyone can also already run par2 or zfec on the repository from the command line to accomplish the requested feature.

I am not sure snapraid is what we want.

So I would recommend adding par2 as a temporary solution to the FAQ and eventually implementing zfec directly in borg, especially if we can configure a redundant drive separately for backups.

tgharold (Contributor) commented Mar 8, 2016

One option, for those of us with extra disk space, would be to allow up to N copies of a chunk to be stored. This is a brute-force approach that would double, triple, or quadruple the size of the repo depending on how many copies you allow.

Another idea is to allow repair of a repository by pulling good copies of damaged chunks from another directory. So if my primary backup repository happens to get corrupted, but I have an offline copy, I could mount that offline copy somewhere and have a repair function attempt to find good copies of damaged chunks from that directory.

PAR2 is nice, but maybe a bit slow. I don't know if PAR3 (supposedly faster) ever got off the ground.
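The "pull good copies from an offline directory" idea can be sketched in a few lines, because a content-addressed chunk ID doubles as a checksum. This is a hypothetical illustration (borg has no such repair mode; `store_chunk` and `repair_from_mirror` are made-up names), and, as noted later in this thread, it only works when both repos use the same chunk IDs (e.g. the same key):

```python
import hashlib
from pathlib import Path

def store_chunk(repo: Path, data: bytes) -> str:
    """Store a chunk under its content hash; the ID doubles as a checksum."""
    chunk_id = hashlib.sha256(data).hexdigest()
    (repo / chunk_id).write_bytes(data)
    return chunk_id

def repair_from_mirror(repo: Path, mirror: Path, chunk_id: str) -> bool:
    """If the chunk in `repo` no longer matches its ID, try to pull a good
    copy of the same chunk from an offline mirror directory."""
    path = repo / chunk_id
    if path.exists() and hashlib.sha256(path.read_bytes()).hexdigest() == chunk_id:
        return True  # chunk is intact, nothing to do
    good = mirror / chunk_id
    if good.exists() and hashlib.sha256(good.read_bytes()).hexdigest() == chunk_id:
        path.write_bytes(good.read_bytes())  # overwrite the damaged copy
        return True
    return False  # no good copy found anywhere
```
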

enkore (Contributor) commented Apr 2, 2016

FEC would make sense to protect against single/few bit errors (as opposed to a typical "many sectors / half the drive / entire drive gone" scenario). It would need to sit below encryption (since a single ciphertext bit flip potentially garbles an entire plaintext block). Implementing it on the Repository layer transparently applying to all on-disk values (and keys!) would make sense. Since check is already "local" (borg serve-side) it could rewrite all key-value pairs where FEC fixed errors. No RPC API changes required => forwards/backwards compatible change.

The C code from zfec looks small. If it doesn't need a whole lot of dependencies and the LICENSE permits it (and it, of course, being a good match to our requirements[1]) we could vendor it[2] if it's not commonly available.

[1]

  • Data independent (I think this is true for all but speciality A/V FECs)
  • Doesn't require a particular data size, or if it has padding requirements they are small (512 bytes would be a little wasteful)
  • Configurable ("how much can it rot before it can't be recovered")
  • Of course: LICENSE compat, availability, tested, proven.

[2] Vendoring should be a last-resort thing, since it more or less means that we take on all the responsibility upstream has or should have regarding packaging/bugs/testing etc.
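To make "FEC below encryption at the Repository layer" concrete in the simplest possible terms, here is a toy forward error correction: a 3x repetition code with a bytewise majority vote. Real candidates (zfec, PAR2, Reed-Solomon) achieve comparable protection at far lower overhead than the 200% here; this sketch only shows the encode/decode/repair shape such a transparent layer would have (function names are made up):

```python
def fec_encode(data: bytes, copies: int = 3) -> bytes:
    """Toy FEC: store the value `copies` times, concatenated."""
    return data * copies

def fec_decode(blob: bytes, copies: int = 3) -> bytes:
    """Majority vote per byte position across the stored copies; corrects
    any corruption that hits fewer than half of the copies of a byte."""
    n = len(blob) // copies
    out = bytearray()
    for i in range(n):
        votes = [blob[i + c * n] for c in range(copies)]
        out.append(max(set(votes), key=votes.count))  # most common byte wins
    return bytes(out)
```

A check pass could decode every key-value pair this way and rewrite any pair where the vote disagreed with one of the stored copies, exactly as described above.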

ThomasWaldmann (Member) commented Apr 2, 2016

If just a single or few bits flip in a disk sector / flash block, wouldn't that be either corrected by the device's ECC mechanisms (thus be no problem) or (if too many) lead to the device giving a read error and not giving out any data (thus resulting in much more than a few bit flips for the upper layers)?

enkore (Contributor) commented Apr 2, 2016

Hard drive manufacturers have always seemed to tell the "either it reads correct data or it won't read at all" story. On the other hand: https://en.wikipedia.org/wiki/Data_corruption#SILENT

Somewhat related (but further discussion belongs in a separate issue) is how data integrity errors are handled. Currently borg extract will throw IntegrityError and abort, but that's not terribly helpful if it's just one corrupted file or chunk. Log it (like borg create does for I/O errors) and exit with status 1/2 instead?

ThomasWaldmann (Member) commented Apr 3, 2016

@enkore yes, aborting is not that helpful - open a separate ticket for it.

enkore (Contributor) commented Apr 6, 2016

Hm, this would also be interesting for the chunk metadata. We could pass a restricted subset of the chunk metadata to the (untrusted) Repository layer, to tell it what's data and what's metadata[1]. That would allow the Repo layer to apply different strategies to them. E.g. have metadata with a much higher FEC ratio than data itself.

[1] Technically that leaks that information to something untrusted, but I'd say it's fairly easy from the access patterns and chunk sizes to tell (item) metadata and actual files apart. Specifically, item metadata is written in bursts and should be mostly chunks of ~16 kB. So if an attacker sees ~1 MB of consecutive 16 kB chunks, then that's item metadata.

dfloyd888 commented May 4, 2016

This would be useful as part of a repository, in the config file. It would be nice to have a configurable option to allow for ECC metadata at a selected percentage. I know that one glitch or sync error with other backup programs can render an entire repository unreadable. I definitely hope this can pop up as a feature, as it is a hedge against bit rot for long term storage.

aljungberg commented Jul 15, 2016

The FAQ recommends using a RAID system with redundant storage. The trouble is that while such a system is geared towards recovering from a whole disk failing, it can't repair bit rot. For example, consider a RAID mirror: a scrub can show that the disks disagree, but it can't show which disk is right. Some NASes layer btrfs on top, which in theory could be used to find out which drive is right (the one with the correct btrfs checksum), but at least my Synology NAS doesn't actually do that yet.

So any internal redundancy/parity support in Borg would be great, and even just docs on sensible ways to get there with 3rd-party tools would work too. Maybe it's as simple as running the par2 command-line tool with an X% redundancy argument.
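The mirror-arbitration problem above fits in a few lines: without an independently stored checksum, a scrub can only detect that the copies disagree; with one (as btrfs/ZFS keep per block), it can also say which copy is good and repair from it. A minimal sketch (the function name and return values are made up):

```python
import hashlib

def scrub_mirror(copy_a: bytes, copy_b: bytes, stored_hash=None):
    """Compare two mirrored copies of a block. A plain mirror can only
    report disagreement; an independently stored checksum lets the scrub
    pick the good copy."""
    if copy_a == copy_b:
        return "consistent", copy_a
    if stored_hash is None:
        return "mismatch, cannot arbitrate", None  # the RAID-1 situation
    for copy in (copy_a, copy_b):
        if hashlib.sha256(copy).hexdigest() == stored_hash:
            return "repaired from good copy", copy  # the btrfs/ZFS situation
    return "both copies bad", None
```
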

ThomasWaldmann (Member) commented Jul 15, 2016

Correct, a mirror doesn't help in all cases. But in many cases, it certainly can.

The disk controller generates and stores its own CRC / ECC codes in addition to the user data. If it reports a bad sector on disk A while doing a scrub run, it can just take the sector from disk B and try to write it back to disk A (usually the write succeeds and the sector is repaired; otherwise the disk is defective). So the only unfortunate cases are when a mirrored sector does not have the same data on both disks but no CRC / ECC error is triggered on either disk (which is hopefully rather unlikely), or when both sectors give a CRC / ECC error.

It is important that scrub runs take place regularly, otherwise corruption will go undetected and, if one is unlucky, a lot of errors creep in before one notices - and if both sides of the mirror go bad in the same place, the data is lost.

The same applies to RAID5/6/10 arrays.

What can still happen is that the controller suddenly decides that more than the redundant number of disks are defective and kicks them out. Or that more disks fail while the array is rebuilding. But that is a fundamental problem, and you can't do much about it aside from having lots of redundancy to make this case unlikely.

ZFS also has its own checksums/hashes of the data, btw (and is maybe more production-ready than btrfs).

FelixSchwarz commented Mar 20, 2017

zfec is currently not compatible with Python 3, but there is a pull request. Also, having a Python API is of course much nicer than calling out to a separate binary (plus, zfec is supposed to be faster, according to zfec's PyPI page).

par2, on the other hand, is likely present in more distros, and the format seems to be widely used (other implementations/tools are available). However, the ideal solution for borg would apply the error correction internally (otherwise encrypted repos would face quite a bit of storage overhead), so external tools might not be that useful.

Even with good storage, I'd like to see some (ideally configurable) redundancy in borg repos. Deduplication is great, but I think it is more important that data is safe (even on crappy disk controllers).

Maybe a good task for 1.2?

enkore (Contributor) commented Mar 20, 2017

Maybe a good task for 1.2?

1.2 has a set of defined major goals; since this would be a major effort, it's unlikely.

ThomasWaldmann (Member) commented Mar 20, 2017

@FelixSchwarz thanks for the pointer, I just reviewed that PR.

But as @enkore already pointed out, we would rather not extend the 1.2 scope; there is already a lot to do.

Also, as I pointed out above, I don't think we should implement EC in a way that might help in some cases but fails in a lot of cases. That might just give a false sense of safety.

gour commented Apr 1, 2017

Also, as I already pointed out above, I don't think we should implement EC in a way that might help for some cases, but also fails for a lot of cases.

Does that mean EC won't be supported/implemented in Borg at all, or are you just considering what would be the proper way to do it?

enkore (Contributor) commented Apr 1, 2017

There are a lot of options in that space and evaluating them is non-trivial; fast implementations are rare as well. On a complexity scale, I see this issue at about the level of a good master's thesis (= multiple person-months of work).

Note that a lot of the "obvious" choices and papers are meant for large object-storage systems and use blocked erasure coding (essentially the equivalent of a RAID, minus the problems of RAID, for an arbitrary and variable amount of disk/shelf-level redundancy). This much we can already say: that is not an apt approach if you have only one disk.

ThomasWaldmann (Member) commented Apr 1, 2017

@gour If we find a good way to implement it (one that does not have the mentioned issues), I guess we would consider implementing it.

There is the quite fundamental issue that borg (as an application) might not have enough control over / insight into where data is located (on disk, on flash).

Also, there are existing solutions (see the top post), so one can just use them, right now, without implementing anything within borg.

anarcat (Contributor) commented Apr 30, 2017

just found out about this which might be interesting for this use case:

https://github.com/MarcoPon/SeqBox

enkore (Contributor) commented Jun 5, 2017

zfec is not suitable here. It's a straight erasure code: say you set k=94, m=100, meaning you have 100 "shares" (output blocks) of which you need >=94 to recover the original data. That means 6% redundancy, right? No! Those 94 shares must be pristine. The same is true for all "simple" erasure codes: they only handle erasure (removal) of shares; they do nothing about corrupted shares.

What we need here is a PAR-like algorithm that handles corruption within shares, i.e. you have a certain percentage of redundancy and that percentage can be corrupted in any distribution across the output. (?)

Edit: Aha, PAR2 is not magic either. It splits the input into slices and uses relatively many checksummed blocks (hence its lower performance), which increases resistance against scattered corruption. So what appears at first to be a share in PAR2 is actually not a share, but a collection of shares.

StefanBertels commented Aug 8, 2017

What about the "low-tech" solution @tgharold mentioned, i.e. just doing backups to multiple repos and having some built-in way of using "secondary" repos as a source for broken segments?

Setting up multiple backups on different hardware helps when hardware fails or gets lost. If that setup is usable against bit rot, too, that is a plus.

ThomasWaldmann (Member) commented Aug 8, 2017

@StefanBertels having stuff in multiple repos (at different places) is always a good idea.

It can't work with encrypted repos on the chunk level, as different encryption keys also mean differently cut chunks and thus different chunk IDs. So while you could restore a bad file from the other repo, borg could not just automatically fetch missing/defective chunks from the other repo.
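The "different keys mean different chunk IDs" point can be illustrated with a keyed MAC. Borg derives chunk IDs from a keyed hash (and also seeds the chunker per repo, so even the chunk boundaries differ); the sketch below uses HMAC-SHA256 as a stand-in for the exact scheme, with made-up keys:

```python
import hmac, hashlib

def chunk_id(key: bytes, data: bytes) -> str:
    """Derive a chunk ID with a keyed MAC, sketched here as HMAC-SHA256
    (the real scheme and key handling differ in detail)."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()

data = b"identical file content"
id_repo1 = chunk_id(b"key-of-repo-1", data)
id_repo2 = chunk_id(b"key-of-repo-2", data)
# Same plaintext, different IDs: repo 2 cannot be looked up by repo 1's IDs,
# so a damaged chunk in one repo cannot be located in the other by ID alone.
assert id_repo1 != id_repo2
```
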

enkore (Contributor) commented Aug 8, 2017

This may eventually be possible with replication, but to be honest, borg check is such a complicated piece of code that adding this in its current state is pretty much guaranteed to make it completely unmaintainable.

spikebike commented Aug 25, 2017

@enkore Right. With zfec or similar codes, you normally have a manifest which includes the checksum of each share. When trying to rebuild and error-correct, you only use the shares with correct checksums. This works well in practice because checksum verification is quite fast (and the vast majority of the time it is all you need), but when in dire need you use the error correction built from the valid pieces.

Of course, then you worry about the manifest, much like a filesystem superblock. Since it's small, the usual answer is to make multiple copies.
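The manifest-plus-erasure-code approach above can be sketched with the smallest possible erasure code, a single XOR parity share (k=2, m=3). The per-share checksums turn a silently corrupted share into a located erasure, which the parity can then repair. Names are illustrative, and shares are assumed to be of equal length:

```python
import hashlib

def make_shares(a: bytes, b: bytes):
    """Two equal-length data shares plus one XOR parity share (a toy
    k=2, m=3 erasure code), and a manifest of per-share checksums."""
    parity = bytes(x ^ y for x, y in zip(a, b))
    shares = [a, b, parity]
    manifest = [hashlib.sha256(s).hexdigest() for s in shares]
    return shares, manifest

def recover(shares, manifest):
    """Checksums locate the corrupted share, turning silent corruption
    into an erasure; the parity then repairs any single bad share."""
    ok = [hashlib.sha256(s).hexdigest() == h for s, h in zip(shares, manifest)]
    if ok.count(False) > 1:
        raise ValueError("more than one bad share, unrecoverable")
    a, b, p = shares
    if not ok[0]:
        a = bytes(x ^ y for x, y in zip(b, p))  # rebuild a from b XOR parity
    elif not ok[1]:
        b = bytes(x ^ y for x, y in zip(a, p))  # rebuild b from a XOR parity
    return a, b  # a bad parity share needs no repair to return the data
```
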

ThomasWaldmann (Member) commented Aug 25, 2017

@spikebike the manifest needs to be written at the end, so guess what would happen if you simply wrote it twice, after everything else?

It might make you feel better, but both copies would likely end up on nearby hard disk sectors or in the same flash block, so if there is a defect, both copies might be affected.

Even if we pre-allocated space for the 2nd copy at another time, it would still just be guessing; depending on how the fs and hardware work, it might not behave like we want.

You usually need hardware or kernel/fs-level control to safely do better, and borg does not have that.

jvgreenaway commented Dec 18, 2017

I think bup’s par2 integration is pretty great. It would be a big plus for borg to have a similar integration or a recommended application.

@ThomasWaldmann

ThomasWaldmann Mar 26, 2018

Member

Until someone convinces me otherwise, I think it is pointless to add error correction in borg (see some of my previous comments). Adding it might just create false hopes and could be perceived as a (false) promise.

One can either use lower levels to add redundancy or just do 2 backups to separate places.

@ticpu

ticpu May 17, 2018

It would be possible to add external par2 protection if one could ask borg whether a segment is good before adding redundancy. An example at a very high level:

set -e
find data/ -type f -regex '^data/[0-9]+/[0-9]+$' | while read -r F
do
  check_if_segment_has_changed_since_last_backup || continue
  borg check --segment="$F" && \
  rm -f "${F}".*par2 && \
  par2 create -s16384 -c1 -n1 "${F}.par2" "$F"
done

$ du -shc {1..10}
2.1G
$ du -shc *.par2
5.3M

Then run this after the backup job. I'm not asking to add this to borg, but wanted to give my 2 cents on a way to add redundancy on a non-redundant FS. In this case it allows recovering 16384 bytes (4 "normal" sectors) from any segment; maybe tweak the numbers for SSD erase blocks, which may be bigger.
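par2 uses Reed-Solomon codes; the single-erasure case it handles per recovery block can be illustrated with plain XOR parity, the same principle RAID5 uses. This is a toy sketch only, nothing like par2's real encoding or on-disk format:

```python
def xor_parity(blocks: list[bytes]) -> bytes:
    """Parity block: byte-wise XOR of equal-length data blocks."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            parity[i] ^= b
    return bytes(parity)

def recover(blocks: list, parity: bytes) -> bytes:
    """Rebuild the single missing block (marked None) from the rest."""
    missing = bytearray(parity)
    for block in blocks:
        if block is not None:
            for i, b in enumerate(block):
                missing[i] ^= b
    return bytes(missing)

data = [b"seg-" + bytes([n]) * 4 for n in range(4)]
parity = xor_parity(data)
lost = data[2]
data[2] = None                     # simulate an unreadable block
assert recover(data, parity) == lost
```

XOR parity recovers exactly one lost block per parity block; Reed-Solomon generalizes this to recover as many lost blocks as there are recovery blocks, which is why par2 can repair scattered bad sectors.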

@jaxankey

jaxankey Jul 12, 2018

Interesting discussion. Learning lots.

Perhaps a helpful tool would be an option to compare two repositories and list all conflicting files with the change history of each (borg diff gets close to this). Then the user could dig in and decide which to keep.

Usually the majority of my repo will not change, so if I see that a "static" file changed a week ago, I'll know immediately which has the error. Or I could manually inspect them and keep the one that looks right.
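Picking the intact copy can be mechanized whenever the expected chunk checksum is known, which is the theoretical "repair one repo from another" case mentioned at the top of the issue. A toy illustration; borg's real chunk IDs are keyed MACs, and the function name here is made up:

```python
import hashlib

def pick_valid_chunk(expected_id: bytes, candidates: list[bytes]) -> bytes:
    """Return the first candidate whose SHA-256 matches the stored chunk ID."""
    for data in candidates:
        if hashlib.sha256(data).digest() == expected_id:
            return data
    raise ValueError("no repo holds an intact copy of this chunk")

chunk = b"file contents ..."
chunk_id = hashlib.sha256(chunk).digest()
repo_a = chunk[:5] + b"\x00" + chunk[6:]   # repo A's copy suffered bitrot
repo_b = chunk                             # repo B's copy is fine
assert pick_valid_chunk(chunk_id, [repo_a, repo_b]) == chunk
```

With checksums there is no ambiguity to resolve by hand: a copy either verifies or it doesn't, so manual inspection is only needed when every copy fails.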

@JonasOlson

JonasOlson Jul 14, 2018

use borg to have N (N>1) independent backup repos of your data on different targets (if N-1 targets get corrupt then, you have still 1 working left.

Functionality for using multiple backup servers at once (either containing identical data or complementing each other in some way) would be useful also for other reasons, such as being able to create and retrieve backups when one server is stolen or inaccessible. (This is assuming that you didn't mean just doing multiple backups independently of each other.)
