Question: safe copy of repo? #5284

Grunthos · 2020-08-05T06:07:35Z

Have you checked borgbackup docs, FAQ, and open Github issues?

Yes

Is this a BUG / ISSUE report or a QUESTION?

QUESTION

The FAQ clearly highlights the dangers of copying a repository, and notes the risk of corrupt caches when accessing a copy of a repository. My proposed 'secure' backup strategy is:

**Host A**
borg-create a local backup in an existing local repo

**Host B**
rsync (pull) the repo (using with-lock) [ only needs limited read access to Host A ]
mount the latest backup
borg-backup from the mounted backup (in the rsync'd 'staging' repo) to the local 'master' repo
    (at least until we have a 'merge' function)
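The Host B steps above could be scripted roughly as follows. This is a minimal sketch rather than a tested procedure: the function name, host name, paths, and archive name are hypothetical, and it assumes Host B can open the Host A repository through borg (for example over ssh) so that `borg with-lock` holds the lock while rsync runs. The function only builds the commands; running them is left to the caller.

```python
def staging_commands(src_host, src_repo, staging_repo, master_repo, archive, mountpoint):
    """Return (argv, cwd) pairs for one pull-and-rebackup cycle on Host B.

    All arguments are illustrative placeholders, not values from this thread.
    """
    remote_repo = f"ssh://{src_host}{src_repo}"  # src_repo is an absolute path
    return [
        # 1. Mirror the source repo while borg holds its lock.
        (["borg", "with-lock", remote_repo,
          "rsync", "-a", "--delete", f"{src_host}:{src_repo}/", f"{staging_repo}/"], None),
        # 2. Mount one archive of the staging copy via FUSE.
        (["borg", "mount", f"{staging_repo}::{archive}", mountpoint], None),
        # 3. Back up the mounted tree into the master repo (run from inside the mount).
        (["borg", "create", f"{master_repo}::{archive}", "."], mountpoint),
        # 4. Unmount again.
        (["borg", "umount", mountpoint], None),
    ]
```

Each pair can be passed to `subprocess.run(argv, cwd=cwd, check=True)` in order; stopping at the first failure avoids backing up from a stale or half-synced staging copy.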

My problem is that while this seems to avoid the cache issues, accessing the repo on Host B (from Host B) will presumably create a cache, then the next time rsync is run, the repo may be radically changed, thus invalidating the cache on Host B.

Since it is a local->local copy, what is the best way of avoiding this potential problem? Just delete .cache/borg on Host B after doing the rsync?


ThomasWaldmann · 2020-08-05T13:21:34Z

When using the FUSE fs (borg mount), you might lose some metadata (e.g. ACLs and "bsdflags" / fs flags).

Also, backing up such a mount might be rather slow if you are not careful (I guess the inode numbers would not be stable, so one needs to ignore them). For big archives, resource usage (esp. RAM) might also be quite high.
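To illustrate why unstable inode numbers matter, here is a toy model of a files-cache comparison. It is not borg's actual code; the `Stat` tuple and `looks_unchanged` function are made up for this sketch. The idea is that a file is only skipped when every compared field matches the cached value, so a fresh mount that hands out new inode numbers makes every file look modified.

```python
from typing import NamedTuple

class Stat(NamedTuple):
    """Hypothetical subset of file metadata kept in a files cache."""
    ctime_ns: int
    size: int
    inode: int

# Map mode names to Stat fields (names chosen for this sketch).
_FIELDS = {"ctime": "ctime_ns", "size": "size", "inode": "inode"}

def looks_unchanged(cached, current, mode=("ctime", "size", "inode")):
    """True if all fields selected by `mode` are equal in both stats."""
    return all(getattr(cached, _FIELDS[m]) == getattr(current, _FIELDS[m]) for m in mode)

# Same file seen through two different mounts: only the inode differs.
before = Stat(ctime_ns=1_600_000_000, size=4096, inode=1001)
after = Stat(ctime_ns=1_600_000_000, size=4096, inode=2002)
full_check = looks_unchanged(before, after)                        # file gets re-read
relaxed_check = looks_unchanged(before, after, ("ctime", "size"))  # file is skipped
```

In borg 1.1, `borg create --files-cache=ctime,size` is the option that leaves the inode out of the comparison; whether ctime and size stay stable across mounts is an assumption worth checking on the actual setup.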

Considering that, would you still like to do that?

It looks a bit like you're looking for pull mode backups. We recently documented a way to do them; did you try that?

Grunthos · 2020-08-05T17:01:03Z

Thanks for the quick reply. I am not too sure if I should worry about "bsdflags"; ACLs don't worry me -- I assume the usual srwxgrwxrwx are preserved? It's a linux box.

The backup from the mount point takes a little over an hour; the data size is 100GB or so. I have not looked into RAM usage.

Pull backups are indeed what I am trying to achieve, but with the "puller" having minimal access (which this provides, since the backup is done locally on Host A). The use of SSHFS requires root access to the whole FS and was very slow in the one test I did, so it's not ideal for me.

I guess my revised concerns are: (a) is deleting the local cache on Host B sufficient to protect from corruption, or is there a better method, and (b) what are the exact implications of the loss of bsdflags (it's linux, so I assume I'm ok)?
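For concern (a), one alternative to removing `.cache/borg` by hand is to let borg drop only the cache that belongs to the staging repo, via `borg delete --cache-only`. The wrapper below is a sketch; the function names and the repo path in the usage note are hypothetical, and by default it only returns the command instead of running it.

```python
import subprocess

def drop_cache_command(repo):
    """Build the command that deletes only the local cache for `repo`."""
    return ["borg", "delete", "--cache-only", repo]

def drop_cache(repo, dry_run=True):
    """Return the command; execute it only when dry_run is False."""
    argv = drop_cache_command(repo)
    if not dry_run:
        subprocess.run(argv, check=True)
    return argv
```

Calling `drop_cache("/data/staging", dry_run=False)` right after each rsync would force the next borg command on Host B to rebuild its cache from the freshly synced repo, at the cost of a slower first access.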

Edit: as an aside, I am hoping the "merge" feature, if it appears, will remove the need for the mount/backup step on Host B.

ThomasWaldmann · 2020-08-06T19:13:47Z

The general term is (filesystem) flags, but that is hard to search for and it used to be called bsdflags in borg, because the stuff on linux was mapped onto the bsd flags. It's stuff like the immutable flag, for example.

The normal mode bits are reflected by the FUSE mount; that should be easy to check.

I currently don't see a cache (corruption) issue with what you posted.

Grunthos · 2020-08-12T03:52:10Z

My worry with cache corruption is:

On Host B (with the copied version) I do something that uses the cache (note that I never plan to update the copy except via rsync)
Later, I use rsync to update the repo; this possibly changes or deletes blocks (not sure what happens under the hood, but imagine that the source has, for example, been pruned). This will erase/update any changes that may have occurred in prior 'list' or 'mount' or 'info' operations (note that these seem to all have an effect on the repo)
Once again, I do something that uses the cache on Host B...could the cache now be invalid?

ThomasWaldmann · 2020-08-12T11:51:05Z

If you only produce (e.g. by copying, syncing) states of a repo that would occur "naturally" also (e.g. by using/updating it by multiple borg clients), there should never be any cache issue because borg checks whether its cache corresponds to the repo.

Grunthos · 2020-10-31T13:02:27Z

An update: this seems to work, except I think there IS an issue with the cache. If the source repo has a partial data file (<512MB) and that is copied to the second machine then cached some time later, it sometimes fails to recognize when the original file is later updated (and rsynced). I suspect there might be a race condition on the file date....or something that results in the cached copy being updated. Not entirely sure what is happening, but it produces corrupt block warnings, and deleting and rebuilding the cache fixes it (this is on 1.1.9).

ThomasWaldmann · 2020-10-31T13:21:29Z

With "natural states" I only meant the non-locked states. You should never copy a locked repo that might have partial files.

Grunthos · 2020-10-31T13:28:03Z

Misunderstanding: the repo is never locked or in use when it is rsynced. Does that mean the partial file is not normal? It's possible the local backup died and left it in a mess.

Edit: I have also now updated to 1.1.14.

ThomasWaldmann · 2020-10-31T13:51:55Z

You should use borg with-lock to start the copy process. The repo will be locked for the copy process then.

Even when borg is crashing / the connection breaks down, the repo should not be in a problematic state.

Grunthos · 2020-10-31T13:56:13Z

That's what it does (borg-with-lock). Pretty confident the repo was unused. I do think there is a race condition with the remote cache...or something...next time it happens (1 in 100 roughly so far), I'll take more details. Erasing the cache fixed it, which makes me think it's a cache problem.

Does the cache check both file size and date when using cached values?

ThomasWaldmann · 2020-10-31T14:11:06Z

There are multiple caches / indexes, see the code for details.

But, as a general comment: if you need to think about that, something is already wrong.

In your case, it is that you have 2 copies of the same repo (with the same repo id) and you work with both. As long as they are in same state and you only do read accesses, it will work. If not, it won't / it might cause issues.
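The point about the shared repo id can be made concrete with a short sketch. It assumes the borg 1.x layout as I understand it: the repository's `config` file is INI-style with an `id` entry under `[repository]`, and the client cache lives in a directory named after that id under `BORG_CACHE_DIR` (or `~/.cache/borg` by default). The helper names are hypothetical.

```python
import configparser
from pathlib import Path

def repo_id_from_config(config_text):
    """Extract the repository id from the text of a repo's `config` file."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(config_text)
    return cfg["repository"]["id"]

def cache_dir_for(repo_id, env):
    """Guess the client cache directory for a repo id (assumed layout)."""
    if env.get("BORG_CACHE_DIR"):
        base = Path(env["BORG_CACHE_DIR"])
    else:
        xdg = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
        base = Path(xdg) / "borg"
    return base / repo_id
```

Because the path depends only on the id and the user's environment, two copies of a repo accessed by the same user on the same machine map to one cache directory, while copies on different machines each get their own.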

Grunthos · 2020-10-31T14:14:32Z

Don't forget the two identical repos are on different machines. The only contact between those machines is rsync, which is run while the source repo is locked. And the copy is only ever read.

And yes, it looks like something is wrong, but that was on 1.1.9....so maybe not in 1.1.14.

Grunthos · 2020-11-10T05:22:48Z

OK...a lot more testing and something is very fishy.

I rsync'ed a repo to another machine (let's call the copy S).
I stopped all other access to that repo, and stopped rsync
I enabled a background task to:

scan another repo (M) for archive names
scan repo S for archive names
pick a name in S but not in M (sorted by date)
`borg mount` the S archive
borg backup from the S mount point to M
rinse and repeat
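The selection step in the loop above (pick a name in S but not in M, sorted by date) can be sketched as a small pure function. The function name and input shape are assumptions: archive listings are taken to be already available as names and ISO 8601 timestamps in a uniform format, and "sorted by date" is read as oldest first.

```python
def next_archive_to_copy(staging, master_names):
    """Pick the oldest archive present in S but missing from M.

    staging: iterable of (name, iso_timestamp) pairs for repo S
    master_names: iterable of archive names already in repo M
    Returns the archive name, or None when M is up to date.
    """
    done = set(master_names)
    # Uniformly formatted ISO 8601 strings sort chronologically as plain text.
    pending = sorted((stamp, name) for name, stamp in staging if name not in done)
    return pending[0][1] if pending else None
```

Looping until the function returns `None` gives the "rinse and repeat" behaviour without re-copying archives that already exist in M.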

After about 50 such tasks, borg mount failed with a bad chunk checksum.

I ran a borg check, and it reported the bad chunk.

I copied (via cp -a) the repository to aid in debugging. Did nothing else with it.

I re-ran borg check on S. It found no errors.

Now I am confused.

Edit: if it helps, I have the borg mount output...

ThomasWaldmann · 2020-11-10T10:02:01Z

Could be random corruption happening when accessing it at the original location, but not at the copied location.

Also, you are working with borg with 2 repo copies that have the same repo id and thus use the same clientside cache...

Grunthos · 2020-11-11T04:26:43Z

Sorry, you missed the point: the copy was not touched. I re-checked the ORIGINAL. I did not run any borg commands on the copy whatsoever...which is why I wrote "I copied (via cp -a) the repository to aid in debugging. Did nothing else with it.".

i.e. as far as borg is concerned, I have only one copy with one id.

To try to be abundantly clear: the original location now verifies, and borg knows nothing of the copy...it was made in the expectation I would need to investigate the corruption further, but it went away...

ThomasWaldmann · 2020-11-11T13:58:52Z

OK, sounds like random corruption somewhere.
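One way to narrow this down might be to compare the repository with the `cp -a` copy file by file. The sketch below hashes every file in two directory trees and lists the relative paths whose contents differ; the function names are hypothetical. If the trees match even though an earlier check failed, that would point away from the on-disk data and towards something transient (reads, RAM, or a cache), though that inference is an assumption, not a diagnosis.

```python
import hashlib
from pathlib import Path

def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h = hashlib.sha256()
            with path.open("rb") as f:
                # Read in 1 MiB blocks so large segment files are not loaded whole.
                for block in iter(lambda: f.read(1 << 20), b""):
                    h.update(block)
            digests[path.relative_to(root).as_posix()] = h.hexdigest()
    return digests

def differing_files(tree_a, tree_b):
    """Relative paths that are missing on one side or differ in content."""
    da, db = tree_digests(tree_a), tree_digests(tree_b)
    return sorted(k for k in da.keys() | db.keys() if da.get(k) != db.get(k))
```

An empty result from `differing_files(original, copy)` means both trees hold byte-identical files under the same names.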

Grunthos · 2020-11-12T05:10:50Z

Indeed, and since in the first instance deleting the cache fixed it, and in this most recent instance it "just fixed itself", one is left with the strong suspicion that the cache code may be buggy (i.e. the cache is the source of the corruption), since the underlying data verifies.

ThomasWaldmann · 2022-04-02T20:11:55Z

see updated #5830.

ThomasWaldmann added the question label Aug 15, 2020
