attic does not backup UNIX domain sockets #259
I could reproduce it. An easy way to create a socket:
ncat -l -U /tmp/mysock
It won't be in an attic 0.14 backup, see below. |
from attic/archiver.py:157:
So, it is expected with current code that you won't have UNIX Domain Sockets in your backup. So the remaining questions are "why not" and "why would one want them"? I admit I have not much clue about UDS, so if someone can help with reasoning here, that would be welcome. |
Just tried the good ol' tar:
So, tar doesn't backup UDS either. Just saying, the reasoning why or why not is still todo. |
Backup / archiving tools that ignore sockets "know" that software will create or recreate a socket if it needs to and will bind or rebind it to a path name. So, backing up sockets makes no sense. |
That's curious; I must admit I did not realise that However, a better question from my point of view is, "should a backup program backup sockets?", rather than looking purely at Again, a quick Google suggests that there may be no purpose in backing up and/or restoring sockets. This is the conclusion of this thread on superuser. However, I am not sure that is the end of the question; I have two observations:
Does it do any harm to backup and restore the sockets? In balance, it seems better, from a user's point of view? And the program using it will just recreate it either way. Perhaps this is why other tools do it? |
After first implementing "socket backup/restore support", I read that superuser discussion and esp. this:
I also tried that myself and can confirm it: there is no point in re-using a restored UDS. The program that tries to listen on it will just fail. The program must be able to delete an existing UDS first to be able to listen on (and create) a fresh one. If it can't delete or create due to missing permissions, it cannot work. So, argument 1 is invalid. About argument 2: we could backup and restore UDS on a "it was there when we backed up, it will be there after restore (even if not really connected and pretty useless)" basis. An advantage of this would be that we do not have to report "ignored socket" like tar does (and confuse everybody who does not know details about how UDS work). Opinions? |
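For anyone who wants to reproduce the point about stale sockets, here is a minimal Python sketch. It assumes a Linux-like system; the helper names bind_unix and demo are made up for illustration and are not part of attic. A first server binds a path and goes away without cleaning up, which leaves the same kind of unbound socket file that a restore would produce. A second bind on that path fails until the file is unlinked.

```python
import errno
import os
import socket
import tempfile


def bind_unix(path, unlink_first=False):
    """Bind a listening UNIX domain socket at path and return it."""
    if unlink_first:
        try:
            os.unlink(path)  # remove a stale socket file, if any
        except FileNotFoundError:
            pass
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        server.bind(path)  # creates the socket file on success
    except OSError:
        server.close()
        raise
    server.listen(1)
    return server


def demo():
    """Return the errno seen when binding on top of a stale socket file."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "mysock")

    # The first server creates the file, then exits without unlinking it.
    bind_unix(path).close()

    # The file is still there, but nothing is bound to it any more.
    try:
        bind_unix(path).close()
        seen = None
    except OSError as exc:
        seen = exc.errno

    # Deleting the stale file first makes the bind succeed again.
    bind_unix(path, unlink_first=True).close()

    os.unlink(path)
    os.rmdir(workdir)
    return seen
```

Calling demo() should return errno.EADDRINUSE, which matches the observation that a program has to delete an existing UDS before it can listen on a fresh one.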
If someone wants to give it another test: |
Cool, I can't remember ever having created or used a socket file myself for anything I've worked on, so I was unsure as to some of the finer points. The superuser discussion was quite informative and convincing, and if you have tested the conditions I speculated on in my first point, I am quite happy with your conclusion. (And it sounds logical, because to delete the socket, write permissions are required on the containing folder, etc.) So I think we are all in agreement that there is no technical need to backup sockets. I have thought quite a bit about point 2. I ultimately think that sockets should be backed up. My thinking goes like this:
Of course, all this is just my personal opinion - I'm not even contributing to the codebase! So I can hardly complain if you disagree :o) However:
Regarding your patch: I will test this in the next couple of days; once I have finished my current tests with official Attic my plan is to rebuild my Debian packages based on the merge-all branch and give that a spin, at which point I will also include your UDS commit. |
Restoring a socket file is not only useless, but will possibly also overwrite an already bound socket. This is bad because applications will not be able to connect to the service that binds the socket until that service recreates it. Many applications are not coded to be able to recover from this and will need to be restarted. So, please, do not break this. |
If restoring into a location that already has content, you could say the same about other files - for instance, any PID files or files in use by flock. Restoring these files could therefore also cause problems. What would you have Attic (or any backup tool) do in this situation? I would be happy if the default behaviour were to skip restoring sockets if they already exist. This seems sensible enough and would not cause any problems. Arguably the user could/should be informed about this. I do however see no purpose in adding an option to force-overwrite sockets, as this would be pointless due to their nature. But an option to filter them out when restoring could perhaps be useful to some people. |
You will normally exclude PID files as well as any other type of files that you do not need to backup, as you do not need to restore them. The good thing with socket files is that they are special, so we can hardcode their exclusion in the backup tools.
Don't be so sure. It could happen that the service creates the socket right after the restoration tool checked for the existence of the path, but before the restoration takes place.
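One possible way around the check-then-restore race, sketched in Python under the assumption that a restore tool recreates a socket by binding and immediately closing it: bind() itself refuses to touch an existing path, so no separate existence check is needed. The function name restore_socket_if_absent is hypothetical and not something attic provides.

```python
import errno
import os
import socket


def restore_socket_if_absent(path, mode=0o600):
    """Create an unbound placeholder socket file unless path already exists.

    Returns True if the file was created, False if the path was left alone.
    """
    placeholder = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # bind() creates the file or fails in one step; it never replaces
        # an existing path, so a live socket cannot be overwritten here.
        placeholder.bind(path)
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return False
        raise
    finally:
        # Closing leaves the socket file behind, bound to nothing.
        placeholder.close()
    os.chmod(path, mode)  # reapply the permission bits from the backup
    return True
```

This only addresses the overwrite race. It does not help with the other objection, that a leftover file with old permissions can still get in the way of the service later.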
I would say the opposite. By default, do not backup socket files, but have an option if someone (like you) insists on taking backups of these. |
That's a rather sweeping statement - do you really mean that for every backup you do, you carefully check where all the PID files are and write exclusion lists for them? And that you know this list is complete, and that you keep it up-to-date? If so, that is remarkable - and I suspect not many other people would do this. I have never bothered, as I can't see the point. I would rather have a backup that gathers up everything possible. Regarding the sockets specifically, of the tools I tested, Attic was the only one that did not backup and restore sockets. My sample pool is rather small and I should probably investigate other tools, but the writer of Obnam in particular strikes me as rather pedantic about data integrity (which is a good thing) and so must have thought through this very problem.
This seems like an extremely unlikely event.
What is more intuitive for the general user - to do everything, or to miss bits out? This is certainly an interesting debate - I had no idea people were so passionate about sockets! (Myself included!) ;o) |
Thankfully, there is FHS. So, it's as simple as choosing to exclude any combination of some directories:
But here we are talking about sockets. And sockets are an easy pick, since they are special and can be handled by the tool without users specifying paths.
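To illustrate why no path list is needed, here is a small Python sketch that finds sockets purely by file type. It is not attic's code; find_sockets is a name chosen for this example.

```python
import os
import stat


def find_sockets(root):
    """Return the sorted paths of all UNIX domain sockets below root."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk lists every non-directory entry, sockets included,
        # under filenames.
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat() looks at the entry itself and does not follow symlinks.
            if stat.S_ISSOCK(os.lstat(path).st_mode):
                found.append(path)
    return sorted(found)
```

The same S_ISSOCK test would serve either policy discussed here: skipping sockets at backup time, or filtering them out at restore time.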
You never know. Better be safe than sorry. Also consider this scenario: You restore a socket file and its permission bits are set the same as they were at the time of the backup. If in the meantime, after the backup but before the restore, you had changed the configuration of the service creating the socket to run with different permissions, it will fail to recreate the restored socket file due to a permission error when trying to unlink() it.
I would say that current behavior, a logging message that sockets are ignored, is quite intuitive. If an option for including sockets in the backup is implemented, then that message could be extended with a hint for that option. But what matters here is practicality and having maximum safety in restores. If you are some kind of purist and know about the risks of restoring socket files, then use that option. If not, better be safe and let the tool do the safest thing for you. |
Actually, no, it's not quite that simple - the socket I encountered during my backup test lives in my home directory somewhere. I would not have predicted its existence. Also, I personally would not attempt to restore a root filesystem, and if I did, I would think carefully about exclusions. At this point I would perhaps/probably exclude sockets, but only at restoration time. Again, personally I only tend to back up
That's a very good point. You may also have changed the owner. Still, even though I totally agree with your statement, I feel it is off topic. The reason being, like I pointed out previously, you could do the same with PID files or indeed any file. There is always a chance that you may have changed something since you made the backup, and that restoring it could cause problems. I believe this is an exercise for the user at the point of restoration. I don't believe the backup software should make decisions like this - I think it's outside of the remit. In my very humble opinion on this matter, I think backup software should faithfully backup and restore everything, perfectly, to be an exact mirror of the original unless told otherwise.
I have not seen that behaviour. All my tests so far have used the latest code on the official master branch. I assume that this new behaviour is part of the merge or merge-all branch? I would still like the message to be a restore-time message and not a backup-time message. That just seems safer.
I think this is where we disagree - to me, the "safest" behaviour for backup software is to include everything. When it comes to backing up my data, I guess perhaps I am a purist. I can live with that :o) But I would like to be able to do the following in an automated way: Backup -> Restore elsewhere -> Diff -> Check identical. This to me is the gold standard of testing data integrity. I have plenty of files that are most likely useless and it would not matter if they were skipped. But I don't want my backup software making that decision for me. I also still believe that "safest" only applies in certain situations if the files are restored to an original location. In this instance the user should be carefully choosing the restore behaviour - whether to replace existing files, whether to restore sockets. So that would be an informed process, and one knowingly configured. Still, it seems that we have reached an agreement on whether the sockets should be backed up (yes). Our current disagreement is whether the default behaviour should be to restore them or not. I am pragmatic, and provided there is a way to achieve the restoration of sockets, and this is clearly documented in terms of how to achieve it (and even more clearly stated that the default is not to restore them) then I guess I can live with that, although my vote is definitely to have it the other way around (this would also be more in keeping with other tools). I'm not sure how such things are generally resolved; is there a community vote? Or does it come down to a decision by the people actually writing the code? Just curious. |
You shouldn't care about its existence, as far as your backup configuration is concerned. Backups of sockets are not useful. BTW, not following the FHS standard (or extensions of it) makes administration of things more difficult.
Yes. But here we are talking about the special case of socket files, which the tools can be smart enough not to bother with.
My mistake. Ignoring socket files is currently silent. This can be changed though.
There is no identicality in the case of socket files. The restored paths will not be bound to any in-memory socket objects.
It won't. It can only take a decision for sockets, which is a special, easily caught case.
Nope. I clearly said that the way attic works now is the safest. It does not backup sockets. But I am ok with an option to explicitly backup socket files.
Usually, @jborg listens to people and makes the final decisions. |
Other than the fact that I would like to be able to restore it if I want, to check that everything is an exact copy. As explained.
I totally agree. Unfortunately, not all software follows those rules. The socket file I mentioned I discovered in my home directory (which I only realised existed when Attic didn't restore it) was not created by me. That test was on an Ubuntu desktop system and I have no idea what created it or if it was in a sensible location. Potentially lots of things could create such sockets I guess. It's a little easier on servers, but there is still no guarantee.
How? I am not aware of an option for it at present (unless that's in the newer branches) so I assume you mean it can be changed in the codebase. Again, it seems better to back them up, thereby not needing any warning or option, and leave the warnings and/or options to restore-time, whichever way it is achieved.
You are incorrect, at least from the context that I am talking in. As far as I am aware, a (named) socket exists as a zero-byte file. So, it has ownership, permissions, etc. but no contents. Hence an integrity check would pass if the file were restored, regardless that the restored file is not in use or bound. Remember, I performed integrity checks (using diff, rsync, and bcompare) and both Bup and Obnam passed - they restored an identical copy of my data. Attic did not restore the socket file, hence how this discussion started.
Okay, perhaps I misunderstood. My apologies. But surely you cannot object to automatically backing-up the socket files, and having a restore-time option as to what to do with them? It would appear unnecessary to have to specify an option on both sides. After all, they are not exactly doing any harm, especially if the default behaviour is to ignore them on restoration. I'm curious as to what others think on this topic? I feel that we have covered all the ground I can think of, but perhaps we have missed angles or points. |
I would be willing to bet some amount of money that diff, rsync[1] and So if, lets say, you have ~/tmp/socket1 and ~/tmp/socket2 and they are I am with Petros here: there could be an option to back up sockets for 1: For example, rsync requires --specials to sync at sockets, which is off On Fri, Apr 3, 2015 at 1:52 PM, Dan Williams notifications@github.com
Dmitry Astapov |
Mmmm, I don't want to split hairs, but... I agree those tools do not attempt to compare the file contents, but disagree about that not meaning identical data. Sockets, like fifos, are recognised special files and it doesn't make any sense to try to look inside them for copy/backup purposes, plus they are technically zero-byte files, I believe. Hence the comparison is on the metadata (ownership, permissions, any other metadata) as well as the existence of the file. The backup/restore would also be on this basis, which is what I have experienced elsewhere, hence the data is in fact identical from a filesystem point of view. I hope this clarifies the intent of what I said earlier. So I am not expecting contents to be read. |
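A comparison on existence and metadata only, as described above, could look roughly like the following Python sketch. The names entry_signature and compare_trees are made up for illustration, and the check deliberately ignores file contents, so it is not a full integrity check for regular files.

```python
import os
import stat


def entry_signature(path):
    """Return (file type, permission bits, uid, gid) without reading contents."""
    st = os.lstat(path)
    return (stat.S_IFMT(st.st_mode), stat.S_IMODE(st.st_mode),
            st.st_uid, st.st_gid)


def index_tree(root):
    """Map each path relative to root to its metadata signature."""
    entries = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            entries[os.path.relpath(full, root)] = entry_signature(full)
    return entries


def compare_trees(left, right):
    """Return relative paths that are missing on one side or differ in metadata."""
    a, b = index_tree(left), index_tree(right)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

With this kind of check, a socket that was skipped on restore, or restored as an ordinary file, shows up in the result, while two unrelated sockets with the same owner and mode would compare as equal.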
On Tue, Apr 7, 2015 at 4:55 PM, Dan Williams notifications@github.com
Case in point - I just tried to use diff to compare directories with % mkdir /tmp/{1,2} No telling if they are same or different Also, rsync: % rsync -avi --specials /tmp/1/ /tmp/2/ sent 63 bytes received 15 bytes 156.00 bytes/sec Now, lets change one of those sockets: % cp -a /var/run/rpcbind.sock /tmp/2/a sent 63 bytes received 15 bytes 156.00 bytes/sec As you can see, even with --specials rsync does not tell me that sockets How sure you are that comparisons you ran would've actually detected any Dmitry Astapov |
Well, firstly the restore check would show me if it was restored as an ordinary file instead of a socket, for instance. But having poked far too quickly at this, I'm not sure if ownership and permissions count as differences. However, I do think the matter of existence is important on its own. Perhaps I am in the minority here, but my opinion is consistent with certain other backup tools. |
Hmm.. Nitpicking: you say "ownership and permissions of the actual socket do I also think that being consistent with some backup tools is neither here I'd personally be much happier if sockets were backed up only when an On Wed, Apr 8, 2015 at 11:58 AM, Dan Williams notifications@github.com
Dmitry Astapov |
Sorry, I posted in a hurry and did not initially realise that I had made a mistake. I edited the post fairly quickly, but I think you have replied to the alert email... To be honest I think everything that can be said has been said, so I'll leave it to the powers that be to decide :o) |
On the mailing list, Dan Williams (2015-04-01) reported an issue with attic master branch code:
"""
There was one problem restoring, however: Attic failed to restore a socket
file. Ouch! The other two [obnam and bup] restored the file just fine.
"""
Can somebody else reproduce this?