New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Socket activation & unix domain socket #71
Comments
After some pondering I decided to reopen this issue. Sure, it's an issue of the standard library, not entirely yours, but I think you guys have more expertise to rise and express it to library developers. The problem is that I need to be able to close unix domain socket, provided by systemd, without unlinking it from a file system. And using net.UnixListener API I cannot. From http://www.freedesktop.org/software/systemd/man/systemd.socket.html: "A daemon listening on an AF_UNIX socket may, but does not need to, call close(2) on the received socket before exiting. However, it must not unlink the socket from a file system". So, I have an option not to close socket, but in this case program cannot return from Server.Serve() or Listener.Accept() when receives signal to terminate and hangs till systemd forcibly kills it (not giving chance to final cleanup). Alas, my previously offered "solution" doesn't address this issue. It works in a simple test as a no-op Listener.Close() and leaves socket opened, because file descriptor get duplicated in net.FileListener(f) and though I close my descriptor, there is still an opened descriptor in the UnixListener internal field. |
I re-wrote test to better illustrate the issue, see https://github.com/nikolay-turpitko/go-systemd/commit/578612f6e0c4a1610641271f90169026e1887fd7 Notice constants in listen.go at line 43 and in listener_test.go at line 43. They allow to try different behavior - with one or several forks of child process, with or without method Close() invocation. The issue is, that with unix domain sockets second fork of the child process terminates with error "dial unix /tmp/activation-test-1.sk: no such file or directory". And without Close() process just hangs up. I think, that it hangs up in the https://golang.org/src/net/fd_unix.go, line 404, method func (fd *netFD) accept() (netfd *netFD, err error). It seems, that accept() at line 416 should invoke accept4 system call with my kernel, so the problem is to somehow break out of that endless for loop. |
Here is my attempt to handle termination signal: https://github.com/nikolay-turpitko/go-systemd/commit/5bdb401973017d432adb9f5104eca62c580de769 We cannot just kill process or exit from program in the signal handler, since in the real program we want to execute some additional cleanup and graceful-shutdown logic. We cannot close unix domain socket, but could we cause it to return EINTR, as described in http://linux.die.net/man/7/signal ? |
I tried to prevent file deletion without code modification, but it didn't lead to desired result in my app (and I don't know how to provide simple test for this attempt). My idea was to put socket file into folder with restricted access, so that only systemd could manipulate files in that folder. Code in UnixListener.Close() doesn't check the result or error of syscall.Unlink(), so I assumed, that it will not be able to delete file and will go on, close file descriptor and method http.Serve() will exit with error, so that rest of program will be able to perform cleanup logic and exit. But my service hung up at termination nevertheless along with socket unit. Don't know, whether I messed something with privileges or it just not work as I suppose it should. So, as a last resort, I tried to install golang from sources and added this method to UnixListener (the same code as in Close(), but with syscall.Unlink() completely removed):
and invoke it in my test as this:
With this fix tests seems to work as expected. Now, could someone help me by reviewing my tests and support in discussion of this issue with guys from golang team? |
I started a discussion here: https://groups.google.com/forum/#!topic/golang-nuts/UtBR4IfgaEw |
I was advised to reorganize my code to leave sockets opened and goroutines unfinished, which I managed to do. Nevertheless, I have a strong filling that code would be more accurate and straightforward if it could be possible to call something like |
Following up this old issue since I stumbled into it while working on LXD and reading @stgraber's workaround (referenced just above here). @nikolay-turpitko I think your report and analysis were quite clear and compelling, and I'm surprised that Ian Lance Taylor was so dismissive initially, when replying to your post on golang-nuts ("This means making the API more complex, which is something we only Anyway, it seems at some point the Go upstream authors have actually acknowledged the issue and also went for a fix which is basically the same proposed here (SetUnlinkOnClose): https://golang.org/pkg/net/#UnixListener.SetUnlinkOnClose This was introduced in Go 1.8. |
Actually I just realized that, although the The relevant commit to master that closes the issue is dated 5 December 2015: So anyone using >=1.6 will be safe. |
In coreos/go-systemd/activation/listeners.go, line 34 method net.FileListener() is used to create net.Listener from file. For unix domain sockets it returns UnixListener, whose method Close() unlinks it, which leads to deletion of the file in the file system. Here is method Close() from unixsock_posix.go:
I think this behavior conflicts with socket activation, because systemd doesn't anticipate this and doesn't recreate file or replace socket. So, when I kill/reload a socket activated service (for test), systemd restarts it with the same socket. but without correspondent file, which leads to immediate crash of service with "file not found" error. In my opinion, this disable important use case of socket activation.
I tried to use abstract names (starting with '@') and they work perfectly (see 'if' block in the code above). Unfortunately, server, which I use as a proxy, doesn't support them.
I am about to implement somewhat cumbersome solution in my own code, but may be you have a better idea how to deal with this issue (may be in your lib, or in it's client code). My idea is to wrap net.Listener, returned from activation.Listeners() with my implementation of net.Listener interface and replace method Close() so, that for UnixListener it do nothing, but with preceding invocation of activation.Files() I am going to obtain and remember file descriptors to be able to close them before exit. A bit better, I can store file descriptor in the wrapper as well and close it in my Close() method.
The text was updated successfully, but these errors were encountered: