[ENH] Ignore or manually set backup file ownership / permission metadata #621

Open
desseim opened this issue Sep 1, 2021 · 3 comments

desseim (Contributor) commented Sep 1, 2021

Is your feature request related to a problem? Please describe.

I have a local source directory which is backed up from 2 different systems (one Windows and the other Linux).
If I back up the directory under one system, then reboot into the other system and re-run the same backup, every single file in the directory gets backed up again, even if no file changed between the two backups.

From what I've found out so far, I presume this is due to a difference in metadata. I've noticed Uid/Uname, Gid/Gname and permissions being different in the rdiff-backup directory depending on the system the backup was run from.

I guess the same issue would occur when remounting the same source directory with different ownership or permissions, even within the same environment. And since the Uid and Gid recorded under Windows are 0 and the Uname and Gname are empty, the directory couldn't be mounted in Linux so as to match them anyway.

--user-mapping-file and --group-mapping-file seem not to affect recorded metadata and thus to be useless in this case.

As a side note, the --compare command seems to ignore metadata differences, while the backup command considers them relevant to decide whether to update a file or not.

Describe the solution you'd like

Assuming my understanding of the issue is correct, options to either ignore or forcibly set or map the file metadata would be useful.
In my particular use case, as the source directory is exFAT, ownership and permissions are irrelevant and an option to ignore these metadata when comparing files before backup would be enough.
An option not to register them at all, or set them to user-defined values would work as well.
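To illustrate the "ignore when comparing" variant, here is a minimal sketch. It is not rdiff-backup's actual code or data model: `FileMeta`, `OWNERSHIP_FIELDS` and `metadata_changed` are hypothetical names, and the permission values are made-up examples. Only the Windows-side Uid/Gid of 0 and empty Uname/Gname come from what I observed.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical metadata record (NOT rdiff-backup's real data model)."""
    size: int
    mtime: int
    uid: int
    uname: str
    gid: int
    gname: str
    perms: int


# Fields that are meaningless on a file system such as exFAT.
OWNERSHIP_FIELDS = frozenset({"uid", "uname", "gid", "gname", "perms"})


def metadata_changed(old, new, ignore=frozenset()):
    """Return True if any field outside `ignore` differs between the records."""
    old_d, new_d = asdict(old), asdict(new)
    return any(old_d[key] != new_d[key] for key in old_d if key not in ignore)


# The same unchanged file as seen from each system (perms are illustrative).
windows_view = FileMeta(1024, 1630454400, 0, "", 0, "", 0o777)
linux_view = FileMeta(1024, 1630454400, 1000, "user", 1000, "user", 0o755)

# Default comparison: ownership differs, so the file looks changed.
print(metadata_changed(windows_view, linux_view))
# With ownership and permissions ignored, the file is seen as unchanged.
print(metadata_changed(windows_view, linux_view, ignore=OWNERSHIP_FIELDS))
```

With such an option the second backup would only pick up files whose size or modification time actually changed.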

Describe alternatives you've considered

Somehow forcing the running environments to present the same ownership and permission values to rdiff-backup for the source directory (e.g. through mount options) seems to be the only workaround, but is often not a possibility.

Additional context

In my particular use case I experimented with backing up an exFAT source to an NTFS destination using rdiff-backup 2.0.0, but my understanding and a few other experiments with other versions indicate that the same issue should exist with the latest version and other file systems.

ikus060 (Contributor) commented Nov 8, 2021

Hello @desseim
You could try to run your backup with --no-acls on Linux and Windows to effectively ignore the ACL.

desseim (Contributor, Author) commented Nov 29, 2021

Hi @ikus060, apologies for the late reply, I was away for a while. Thanks for the suggestion.

I should have mentioned that I had already tried running the backups with the --no-acls option before everything I described in the original post, my bad.
To be sure, I ran the tests once again, specifying --no-acls on both backups (Windows and Linux). I can confirm that the backups are re-run each time and that the rdiff-backup metadata records different Uid / Uname / Gid / Gname / Permissions for each backup, just as described in my original post. This time I tried versions 2.0.0 and 2.1.0a1, running exFAT to NTFS backups from either Windows 10 or WSL Ubuntu.

clifcox commented Feb 19, 2022

This issue also came up for me in regard to Linux namespaces and snapshots, and it is somewhat related to my other enhancement request, #670.

In my recent change to using snapshots I noticed a lot of metadata changes (650MB) due to the backup machine not having the same list of users and groups as the source image. What was happening was that the uname and gname entries were being replaced with null strings, because those names were not currently available from the snapshot source I was using. This is probably the same problem the OP had when backing up through the Windows system. The old names were still valid, and erasing them because they were temporarily unavailable doesn't seem desirable. ;-)

Obviously, when one does cross-machine backups, one almost always sets --preserve-numerical-ids. However, this doesn't imply ignoring the (possibly incorrect) usernames that it has access to. Of course, when you are restoring with --preserve-numerical-ids they are ignored, because the receiving machine looks up the names in its local password and group files.

I agree that adding an option like --no-names or --ignore-names would be useful, with the intention that whatever uname and gname data is already in the metadata will be left untouched for this run. Of course, other metadata may change. At some point you may do a run from the actual source machine to refresh the correct user and group names, and then they could be updated.
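As a rough sketch of the behaviour I have in mind (not rdiff-backup code; `keep_known_names` and the dictionary layout are hypothetical, and --no-names / --ignore-names are only proposed option names): when the name lookup comes back empty but the numeric ID is unchanged, keep the name already stored in the metadata.

```python
# Maps each name field to the numeric ID field it belongs to.
NAME_TO_ID = {"uname": "uid", "gname": "gid"}


def keep_known_names(previous, current):
    """Merge two metadata dicts, keeping stored names that are unavailable now.

    A stored name is kept only if the current lookup returned nothing AND
    the numeric ID is unchanged, so a real ownership change still wins.
    """
    merged = dict(current)
    for name_key, id_key in NAME_TO_ID.items():
        same_id = previous.get(id_key) == current.get(id_key)
        if not current.get(name_key) and same_id:
            merged[name_key] = previous.get(name_key, "")
    return merged


# Stored by an earlier run on the real source machine (example values).
previous = {"uid": 1000, "uname": "alice", "gid": 1000, "gname": "staff"}
# Seen from the snapshot host, where these names cannot be resolved.
current = {"uid": 1000, "uname": "", "gid": 1000, "gname": ""}

print(keep_known_names(previous, current))
```

Because the merged record equals the stored one, no metadata change would be written for this file. A later run from the real source machine would supply non-empty names, and those would replace the stored ones as usual.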

I suppose an alternative could be to use a metadata mapping file, as suggested by the OP or described in #670, to update the names, but that is a bigger project. :-)
