Exclude files based on xattrs #1242

Open
nicolas17 opened this issue Sep 16, 2017 · 20 comments

@nicolas17

I have been using Duplicity, with a script to actually run the backup. My script has 30(!) different --exclude switches. I found this to be annoying to maintain there.

I'm considering switching to Restic, and I saw it supports cachedir.tag, which is already a big improvement. Now I won't need things like --exclude=.ccache because I already have a ~/.ccache/CACHEDIR.TAG!
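For readers unfamiliar with the tag file, here is a minimal sketch of the check, assuming the Cache Directory Tagging convention in which a directory counts as a cache when its CACHEDIR.TAG file begins with a fixed signature. The function name `is_tagged_cache_dir` is made up for illustration; this is not restic's implementation.

```python
import os

# Signature header defined by the Cache Directory Tagging convention.
CACHEDIR_SIGNATURE = b"Signature: 8a477f597d28d172789f06886806bc55"


def is_tagged_cache_dir(directory):
    """Return True if `directory` holds a CACHEDIR.TAG with the expected header."""
    tag_path = os.path.join(directory, "CACHEDIR.TAG")
    try:
        with open(tag_path, "rb") as tag_file:
            header = tag_file.read(len(CACHEDIR_SIGNATURE))
    except OSError:
        # Missing or unreadable tag file: treat the directory as not tagged.
        return False
    return header == CACHEDIR_SIGNATURE
```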

However, I think it would be useful if I could also exclude files by marking them as such in the filesystem, instead of in a configuration file or script specific to restic. There is a freedesktop.org proposal to use the user.xdg.robots.backup=false extended attribute to mark a file or directory as excluded for backups.

Using xattrs has the advantage that moving or renaming an excluded file or directory will keep it excluded, unlike a text file with a list of exclusions.
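To make the idea concrete, here is a rough sketch of the check a backup tool might perform. It assumes Linux and Python 3, where `os.getxattr` is available. The helper names are hypothetical, and treating only the literal value `false` as "exclude" is my reading of the proposal, not something any tool is known to do.

```python
import errno
import os

# Attribute name from the freedesktop.org proposal mentioned above.
BACKUP_XATTR = "user.xdg.robots.backup"


def value_means_excluded(raw_value):
    """Interpret the raw attribute value; only 'false' opts out (assumption)."""
    return raw_value.strip().lower() == b"false"


def is_excluded_by_xattr(path):
    """Return True if `path` carries user.xdg.robots.backup=false."""
    try:
        raw_value = os.getxattr(path, BACKUP_XATTR, follow_symlinks=False)
    except OSError as err:
        # ENODATA: attribute not set; ENOTSUP: filesystem has no xattr support.
        if err.errno in (errno.ENODATA, errno.ENOTSUP):
            return False
        raise
    return value_means_excluded(raw_value)
```

Because the marker lives on the inode, `is_excluded_by_xattr` keeps returning the same answer after the file is renamed or moved within the same filesystem.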

Could this be implemented in restic? I guess it wouldn't be so hard™ now that restic is already looking at xattrs for the purpose of saving them.

@fd0
Member

fd0 commented Sep 16, 2017

Hey, thanks for raising this issue. I think it would be better suited for the forum, but let's continue here while we're at it.

I'm a bit astonished that the folks over at freedesktop.org decided to formulate this proposal, as there is already a file system attribute called no dump (can be set via chattr) to specify exclusion from a backup (see https://en.wikipedia.org/wiki/Chattr#Attributes). Please note that restic does not yet support this attribute; that's a todo item.

So, I'd rather add support for the standard chattr attribute than the proposed extended attribute.
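As a rough illustration of what honoring that attribute involves, the sketch below reads the inode flags with the `FS_IOC_GETFLAGS` ioctl and tests the no dump bit (the one set by `chattr +d`). It assumes 64-bit Linux; the ioctl number is hard-coded for that platform, and the function names are hypothetical. It is not how restic does or would do it.

```python
import fcntl
import os
import struct

# Constants from <linux/fs.h>. The ioctl number is the value on 64-bit Linux
# and is an assumption on any other platform.
FS_IOC_GETFLAGS = 0x80086601
FS_NODUMP_FL = 0x00000040


def has_nodump(inode_flags):
    """Return True if the 'no dump' bit (chattr +d) is set in the flags word."""
    return bool(inode_flags & FS_NODUMP_FL)


def read_inode_flags(path):
    """Read the inode flags of `path` with the FS_IOC_GETFLAGS ioctl."""
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        buf = bytearray(8)  # Zeroed buffer; the kernel fills in the flags.
        fcntl.ioctl(fd, FS_IOC_GETFLAGS, buf)
    finally:
        os.close(fd)
    return struct.unpack_from("I", buf)[0]
```

A caller would combine the two as `has_nodump(read_inode_flags(path))`. Filesystems that do not implement the ioctl raise `OSError`, which a real tool would need to treat as "flag not set".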

By the way: Did you discover the --exclude-file switch yet? It can be used to read exclude lists from a file, so you could store them in one central place.

So, what shall we do with this issue here? Track implementing honoring the no dump attribute?

@fd0 added the "type: question/problem" (usage questions or problem reports) and "type: feature suggestion" (suggesting a new feature) labels Sep 16, 2017
@nicolas17
Author

chattr attributes are specific to ext2/ext3/ext4 filesystems. btrfs supports some of them too.

I would say xattrs are more standard than that.

@fd0
Member

fd0 commented Sep 17, 2017

Is there any program that allows setting this extended attribute?

@rawtaz
Contributor

rawtaz commented Sep 25, 2017

Perhaps put it in the --exclude-caches argument. Introducing a new argument for every feature people suggest is not a good way down the road, IMO. But the aforementioned argument is about excluding things based on a "standard", so that's where this belongs.

@nicolas17
Author

I didn't see anyone suggest that this should be yet another --option...

@bherila

bherila commented Sep 26, 2017

NTFS, FAT32, and exFAT have the Archive attribute

@nicolas17
Author

The Archive attribute is intended to mean "the file needs to be backed up because it has changed since the last backup", so Windows will automatically set it when the file is modified. I think that makes it useless for backup exclusion.

@bherila

bherila commented Oct 4, 2017

> The Archive attribute is intended to mean "the file needs to be backed up because it has changed since the last backup", so Windows will automatically set it when the file is modified. I think that makes it useless for backup exclusion.

I was thinking along the lines of "don't back up files which do not have the Archive attribute set"

Also, perhaps restic could clear the Archive attribute after backing up a file (although that would probably be best as a separate issue/work item)

@cfcs

cfcs commented Oct 4, 2017

@fd0 the easiest way to set them seems to be using setfattr, as referenced in man xattr. It comes in the attr package (on Debian at least), and is not installed by default here.

@cfcs

cfcs commented Oct 4, 2017

Python one-liner using ctypes:

$ python -Bc 'import sys,os,ctypes as c;m=c.CDLL("libc.so.6");h=os.open(sys.argv[2],0x501);f=m.fsetxattr;f.argtypes=([c.c_ulong,c.c_char_p,c.c_char_p]+[c.c_ulong]*2);n,v=sys.argv[1].split("=");print f(h,"user."+n,v,len(v),0)' rofl=copter myfile
0
$ attr -g rofl myfile
Attribute "rofl" had a 6 byte value for xoxo:
copter

@nicolas17
Author

There is an os.setxattr function, no need to go low-level into ctypes and libc.
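A minimal sketch of the same thing with `os.setxattr`, assuming Python 3 on Linux (the function is not available on every platform). The helper names `parse_assignment` and `set_user_xattr` are invented for this example.

```python
import os


def parse_assignment(assignment):
    """Split 'name=value' into the ('user.name', b'value') pair for setxattr."""
    name, value = assignment.split("=", 1)
    return "user." + name, value.encode()


def set_user_xattr(path, assignment):
    """Set an attribute in the user namespace, e.g. 'rofl=copter' on a file."""
    attribute, value = parse_assignment(assignment)
    os.setxattr(path, attribute, value)
```

With this, `set_user_xattr("myfile", "rofl=copter")` should have the same effect as the ctypes one-liner, provided the filesystem supports user extended attributes.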

@pvgoran

pvgoran commented Oct 5, 2017

> I was thinking along the lines of "don't back up files which do not have the Archive attribute set"

This looks like a bad idea. As far as I remember the DOS/Windows world, the "archive" bit is not something that you can rely on. It's considered an insignificant piece of information, and certain programs would freely set and reset it for their own purposes (unrelated to archival).

> Also, perhaps restic could clear the Archive attribute after backing up a file (although that would probably be best as a separate issue/work item)

... and if combined with this, it becomes a terrible idea. This way, after creating the first snapshot, the flag will be reset for all archived files, and subsequent snapshots will only contain changed files.
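A toy model of that failure mode, in case it helps. It is a pure simulation with made-up file names, not restic code, and it assumes the combined policy exactly as described: back up only files whose Archive bit is set, then clear the bit.

```python
def take_snapshot(archive_bits):
    """Back up files whose Archive bit is set, then clear the bit."""
    snapshot = sorted(name for name, bit in archive_bits.items() if bit)
    for name in snapshot:
        archive_bits[name] = False
    return snapshot


def modify(archive_bits, name):
    """Model Windows setting the Archive bit when a file changes."""
    archive_bits[name] = True


files = {"a.txt": True, "b.txt": True, "c.txt": True}
first = take_snapshot(files)
modify(files, "b.txt")
second = take_snapshot(files)
print(first)   # ['a.txt', 'b.txt', 'c.txt']
print(second)  # ['b.txt']
```

The second snapshot no longer contains `a.txt` and `c.txt`, even though they still exist on disk.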

@fd0
Member

fd0 commented Oct 5, 2017

@cfcs Ah, maybe I wasn't clear enough: I'm aware that users can set this attribute by hand. I was asking about software (like a file browser or so) that already uses this attribute. If there's none, I don't see value in adding the check to restic.

@hoelzro
Contributor

hoelzro commented Jun 12, 2018

Is there a chance for restic to implement the no dump attribute? I would like to use restic for local backups and this would really help slim my repository down! I'd be happy to help work on it if time is an issue.

@fd0
Member

fd0 commented Jun 12, 2018

@hoelzro the question I have right now is: Does it make sense to implement this? Is this a feature which is widely used (so it makes sense to implement and maintain it) or is it a rather unknown and obscure feature that nobody uses (so it may become a maintenance burden quickly)?

That's an honest question :)

We already have a lot of options to exclude stuff from the backup...

@hoelzro
Contributor

hoelzro commented Jun 13, 2018

@fd0 That's a fair question - I mean, I myself use it for various SQLite files I generate in my work, but I can't say I know anyone else who does! I've been using it with tarsnap with success; I could always write a little cronjob to generate another excludes file from attributes if you decide it's not a good fit.
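Something like the following is what I have in mind for that job. It is only a sketch: the function names are made up, and the `is_marked` predicate is a placeholder for whatever attribute check is wanted (no dump flag, an xattr, and so on). It also assumes the marked paths contain no characters that an exclude file would interpret as pattern syntax.

```python
import os


def collect_marked_paths(root, is_marked):
    """Walk `root` and return paths for which `is_marked(path)` is true.

    Marked directories are listed once and not descended into.
    """
    marked = []
    for dirpath, dirnames, filenames in os.walk(root):
        kept_dirs = []
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if is_marked(path):
                marked.append(path)
            else:
                kept_dirs.append(name)
        dirnames[:] = kept_dirs  # Prune marked directories from the walk.
        for name in filenames:
            path = os.path.join(dirpath, name)
            if is_marked(path):
                marked.append(path)
    return sorted(marked)


def write_exclude_file(paths, exclude_file):
    """Write one path per line, for use with --exclude-file."""
    with open(exclude_file, "w") as out:
        for path in paths:
            out.write(path + "\n")
```

The resulting file would then be passed to the backup run via the --exclude-file switch mentioned earlier in this thread.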

@cfcs

cfcs commented Jun 13, 2018

I feel that xattrs are easily lost in "transit," and that the options to ignore things (--exclude, empty file in directory with significant name) are powerful enough to accomplish this.

@alphapapa

@hoelzro You might find restic-runner helpful for configuring backup sets, including extended options like excludes.

@bkmeneguello

GNU tar provides some xattrs parameters (https://www.gnu.org/software/tar/manual/html_node/Extended-File-Attributes.html), it provides the ability to store or ignore xattrs and to include/exclude files based on xattrs expressions, like tar --xattrs --xattrs-exclude='user.*' -c a.tar ..
I think it is better to avoid using xattrs by default, for two reasons: first, it could cause weird behavior if someone has them set for another reason; second, reading xattrs requires more syscalls, which can impact performance (unless explicitly desired).

@da2x
Contributor

da2x commented Feb 22, 2019

This would also be useful on macOS, where the system default Time Machine backup system uses the com.apple.metadata:com_apple_backup_excludeItem extended attribute.
