duplicate archiving / hardlink extraction issue #5607

Closed
ThomasWaldmann opened this issue Jan 2, 2021 · 3 comments

Comments

Member

ThomasWaldmann commented Jan 2, 2021

basics: see #5603

#5603 (giving the same recursion root two or more times) is a special case of a more generic problem:

imagine the same directory d (inode i) is mounted at multiple places a, b, ... in the filesystem.

if borg then backs up the filesystem starting at some recursion roots r, s, ..., it runs into an issue whenever a, b, ... lie somewhere in the directory trees below r, s, ...

borg will then encounter the same inode i multiple times during the recursion.

Member Author

ThomasWaldmann commented Jan 2, 2021

ideas for solving this:

as in the fix for #5603 (#5606), we could add inodes to the ignore_inodes set. once a directory inode is in that set, that directory won't be processed again.

guess we would have to add all directory inodes, which could be quite a lot.

luckily, that is only 1 tuple (inode, devno) in the set per directory.
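
a minimal sketch of that idea, not borg's actual recursion code: `walk_unique_dirs` and the `seen` set are hypothetical names (the set stands in for ignore_inodes). it walks the given roots and skips every directory whose (inode, devno) was already visited, so a root given twice, or a directory reachable at two paths with the same inode and device number, is processed only once.

```python
import os


def walk_unique_dirs(roots):
    """Yield each directory below `roots` once, keyed by (inode, devno)."""
    seen = set()  # hypothetical stand-in for the ignore_inodes set
    stack = list(reversed(roots))
    while stack:
        path = stack.pop()
        st = os.stat(path, follow_symlinks=False)
        key = (st.st_ino, st.st_dev)  # one tuple per directory
        if key in seen:
            continue  # same directory reached again: do not recurse into it
        seen.add(key)
        yield path
        with os.scandir(path) as entries:
            subdirs = sorted(
                e.path for e in entries if e.is_dir(follow_symlinks=False)
            )
        stack.extend(reversed(subdirs))
```

the memory cost is one small tuple per visited directory, which is the trade-off mentioned above.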

better ideas?

Member Author

ThomasWaldmann commented Jan 3, 2021

another idea (only addressing the extraction issue):

if hardlink processing were changed to store another hardlink master with content in the archive, extracting duplicate directory trees with hardlinks would work, because they would be extracted in exactly the same way as the first tree.

the difficulty might be deciding when to archive another hardlink master with content instead of a hardlink slave. guess borg would need to remember inode and devno for each hardlink master, as we cannot use the path to detect this.

Member Author

ThomasWaldmann commented May 17, 2022

extraction: guess this was fixed by #6703:

  • borg < 2.0: hardlink masters and contentless slaves
  • borg 2+: set of related hardlinks == regular file items with identical hlid value, all with content chunks list.
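
a rough sketch of why the borg 2+ model helps extraction, under simplifying assumptions: `Item`, its `content` field (standing in for the content chunks list) and `extract` are hypothetical and are not borg's real classes or API. since every item of a hardlink group carries its own content, whichever item is extracted first can be written from its content, and later items with the same hlid are simply linked to it.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:  # hypothetical, simplified stand-in for an archived file item
    path: str
    content: bytes  # stands in for the item's content chunks list
    hlid: Optional[str] = None  # identical for all items of one hardlink group


def extract(items, dest):
    """Extract items below dest, re-creating hardlinks via their hlid."""
    first_path_by_hlid = {}
    for item in items:
        target = os.path.join(dest, item.path)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        earlier = first_path_by_hlid.get(item.hlid) if item.hlid else None
        if earlier is not None:
            # a member of this group was already extracted: just link to it
            os.link(earlier, target)
        else:
            # first member seen: it has content, so write a regular file
            with open(target, "wb") as f:
                f.write(item.content)
            if item.hlid:
                first_path_by_hlid[item.hlid] = target
```

with this layout, extracting only a subset of the items (for example just the second of two duplicate trees) still works, because no item depends on a separately stored master.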

archiving: a careful admin can simply avoid archiving the same content multiple times (by excluding the duplicates, using --one-filesystem, etc.).
