Refactor UNIX domain socket #612

ryanuber · 2015-01-16T22:25:50Z

This is a follow-on from #587. There are a couple adjustments to the functionality of the feature, which I’ll outline below. Due to these changes in functionality, a good deal of code and tests have been adjusted as well.

Ownership and permissions of the socket are not managed by Consul.

After an internal discussion about this, we think the best thing for Consul to do is to simply allow the socket to be created by the underlying system calls, and leave it at that. This same approach is taken by a number of other projects, and after some research, it has become more clear why that is. Looking into the OS documentation on a standard Linux distribution (man 7 unix), we will find that sockets in Linux obey the file permissions of their parent directory, and that some BSD and other distributions ignore the permissions on the socket files completely. Because of this, we will just accept a standard UNIX domain socket path like unix:///path/to/file, which unambiguously creates the socket at that path per the OS’es implementation, and allows the operator to configure permissions at the directory above.

Consul errors when the specified socket path already exists

This was discussed in the original PR. After some more discussions and opinions, we’ve decided to err on the side of caution, and the safest thing to do in cases where the socket path already exists is to throw an error. The socket bind error from the underlying OS might be confusing to a novice user, so we’ve added a more human-friendly error message in this case explaining that it is possible the socket file was not cleaned up due to forced shutdown.

sethvargo · 2015-01-16T22:28:10Z

command/agent/agent.go

+
+	// errSocketFileExists is the human-friendly error message displayed when
+	// trying to bind a socket to an existing file.
+	errSocketFileExists = "A file exists at the requested socket path %q. " +


Can we surround the path with quotes or a special character? If the path has spaces it it, it might be difficult to distinguish.

%q's got your back, jack :) http://play.golang.org/p/V6PfAl5PWn

%q a double-quoted string safely escaped with Go syntax

sethvargo · 2015-01-16T22:29:49Z

This looks really good to me 👍 . I looked at the behavior of nginx and puma (some popular webservers) and they follow the exact same paradigm: socket is owned by the user running the process. Similarly, postgres outputs an error when you try to listen on a file that already exists.

Refactor UNIX domain socket

jefferai · 2015-01-20T15:04:10Z

@ryanuber @armon I wish you guys had included me in the discussions before unilaterally changing this around. Both of these are bad changes. I would appreciate if you reverted these and included me in the discussion.

sethvargo · 2015-01-20T16:18:02Z

Hey @jefferai

I am sorry if you feel like we went behind your back. That was certainly no one's intention. We discussed the changes at length as a team and we decided that many of the behaviors introduced were not in parallel with the Principle of Least Surprise or other tools in the industry. Please allow me to elaborate further.

When listening on a socket in nginx, nginx will create the socket as the same user:group the nginx processes is running as. If you need particular permissions for the socket, you must run nginx as a different user (or more commonly a different group) so that other services can read/write to the socket. We decided to take a similar approach with Consul. Furthermore, the standard description of a unix socket is unix:/path/to/socket. We did not feel comfortable coalescing the unix socket descriptor with the permissions on the socket file itself.
We did not think it was the appropriate UX to automatically delete the socket. If the socket was persisted on disk, that means Consul was unable to shut down cleanly (something bad happened), and a human being should make a decision of whether it is safe to remove the socket file and restart Consul or if more intervention is needed. Other tools (like postgres and nginx) both follow this paradigm.

If there are specific issues I did not cover in my response, please let me know comment on the appropriate lines in the diff of this PR and we can definitely revisit the decision(s).

jefferai · 2015-01-20T16:54:50Z

Hi @sethvargo @ryanuber @armon

First off, I want to point out that this is an example of being a terrible upstream. The syntax was proposed on the mailing list. Nobody, including Consul team members, objected. Ryan asked about it on the previous pull request, then after I explained my rationale, he thanked me for it and expressed no reservation.

You guys could have had a further discussion with me on the previous PR. You could have expressed concern and asked for community input on the mailing list. You could have poked me in email or on IRC. You could even have gotten me to make needed changes prior to you merging them so that my time wasn't wasted coding them in and writing tests for them only for you guys to waste your time reverting them less than 24 hours later.

That sucks.

So let's go over the problems with these changes.

nginx is not the only game in town. Many other services -- for instance, dovecot, clamav, and your own later example of postgres -- allow you to specify the (usually) user and (definitely) group and mode you want to use, to make it easier and -- crucially -- more secure for other services to interoperate. That way you don't need to have the primary group you are running consul under be the group that everyone else uses to communicate with it. Removing that capability reduces both flexibility and security.
You can't simply get around this by making a directory and creating the socket in that directory and having it inherit permissions. I don't know what man page you're looking at, but man 7 unix on my Ubuntu 14.04 system says nothing of the sort. And I just verified that using netcat -- permissions of the parent directory have nothing to do with anything, so the workaround you guys suggest isn't actually a workaround. Proof:

pomluser@thisisit:~/sockettest$ ls -al
total 8
drwx------  2 pomluser consul 4096 Jan 20 16:44 .
drwxr-xr-x 27 pomluser users  4096 Jan 20 15:59 ..
pomluser@thisisit:~/sockettest$ nc -l -U test.sock
^C
pomluser@thisisit:~/sockettest$ ls -al
total 8
drwx------  2 pomluser consul 4096 Jan 20 16:44 .
drwxr-xr-x 27 pomluser users  4096 Jan 20 15:59 ..
srwxr-xr-x  1 pomluser users     0 Jan 20 16:44 test.sock

Rather than removing the user/group/mode altogether, you could have simply expanded the configuration language, as Ryan and I discussed in the previous PR. As I said there, nobody expressed any issue with the configuration language since the original RFC, and if people didn't like the coalesced syntax the right thing to do would be to modify configuration to support sockets properly. You guys picked the nuclear (and time-wasting) option instead.
Leaving aside the point of what a user would do with a dead socket if Consul died in an unclean fashion, I don't know where you got your intel about removing sockets and postgres, but it's wrong. If postgres dies and you restart it, something in postgres wipes the socket and recreates it. I don't know what does it, but I can tell you that it's not Ubuntu's init script. It's one of the pg tools, or it's the daemon itself. I just tested this -- on unclean shutdown the socket remains, and upon restarting the socket is removed and recreated, which I can tell because the inode changes:

root@thisisit:/etc/postgresql/9.3/main# stat /var/run/postgresql/.s.PGSQL.5432
  File: '/var/run/postgresql/.s.PGSQL.5432'
  Size: 0           Blocks: 0          IO Block: 4096   socket
Device: 10h/16d Inode: 31942       Links: 1
Access: (0777/srwxrwxrwx)  Uid: (  118/postgres)   Gid: (  127/postgres)
Access: 2015-01-20 16:32:48.550473951 +0000
Modify: 2015-01-20 16:32:48.070233937 +0000
Change: 2015-01-20 16:32:48.070233937 +0000
 Birth: -
root@thisisit:/etc/postgresql/9.3/main# service postgresql restart
 * Restarting PostgreSQL 9.3 database server                                                                                                             [ OK ]
root@thisisit:/etc/postgresql/9.3/main# stat /var/run/postgresql/.s.PGSQL.5432
  File: '/var/run/postgresql/.s.PGSQL.5432'
  Size: 0           Blocks: 0          IO Block: 4096   socket
Device: 10h/16d Inode: 31192       Links: 1
Access: (0777/srwxrwxrwx)  Uid: (  118/postgres)   Gid: (  127/postgres)
Access: 2015-01-20 16:37:02.425358661 +0000
Modify: 2015-01-20 16:37:01.933112637 +0000
Change: 2015-01-20 16:37:01.933112637 +0000
 Birth: -

Here's another example, using everyone's favorite netcat:

pomluser@thisisit:~/sockettest$ stat test.sock
  File: 'test.sock'
  Size: 0           Blocks: 0          IO Block: 4096   socket
Device: 806h/2054d  Inode: 818915      Links: 1
Access: (0755/srwxr-xr-x)  Uid: ( 1001/pomluser)   Gid: (  100/   users)
Access: 2015-01-20 16:16:50.315555836 +0000
Modify: 2015-01-20 16:16:50.315555836 +0000
Change: 2015-01-20 16:16:50.315555836 +0000
 Birth: -
pomluser@thisisit:~/sockettest$ nc -l -U test.sock
^C
pomluser@thisisit:~/sockettest$ stat test.sock
  File: 'test.sock'
  Size: 0           Blocks: 0          IO Block: 4096   socket
Device: 806h/2054d  Inode: 787171      Links: 1
Access: (0755/srwxr-xr-x)  Uid: ( 1001/pomluser)   Gid: (  100/   users)
Access: 2015-01-20 16:40:25.754981229 +0000
Modify: 2015-01-20 16:40:25.754981229 +0000
Change: 2015-01-20 16:40:25.754981229 +0000
 Birth: -

As a contrast, let's see the workflow with nginx:

root@thisisit:/etc/nginx/sites-enabled# ps aux | grep nginx
root      8402  0.0  0.0  14044   916 pts/4    S+   16:47   0:00 grep --color=auto nginx

root@thisisit:/etc/nginx/sites-enabled# ls -al /var/run/nginx.sock
srw-rw-rw- 1 root root 0 Jan 20 16:47 /var/run/nginx.sock

# Now let's pretend I didn't just think about the fact that it's running on a socket
root@thisisit:/etc/nginx/sites-enabled# service nginx start

# Wait, what?
root@thisisit:/etc/nginx/sites-enabled# ps aux | grep nginx
root      8423  0.0  0.0  14044   916 pts/4    S+   16:48   0:00 grep --color=auto nginx

# Maybe syntax is bad
root@thisisit:/etc/nginx/sites-enabled# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

# ???
root@thisisit:/etc/nginx/sites-enabled# nginx
nginx: [emerg] bind() to unix:/var/run/nginx.sock failed (98: Address already in use)
nginx: [emerg] bind() to unix:/var/run/nginx.sock failed (98: Address already in use)
nginx: [emerg] bind() to unix:/var/run/nginx.sock failed (98: Address already in use)

# WTF

I don't know where you guys got your intel, but it's wrong. You care about principle of least surprise...so do I. The least surprise solution is not to leave a dead socket lying around that the user can't do anything with anyways and then fail to start because of it. That will just leave users pissed off when a simple restart of consul fails to work. The least surprise solution is to do what everyone else is doing, and remove the socket and then re-create it.

armon · 2015-01-20T17:42:05Z

@jefferai @ryanuber @sethvargo Okay, so how about the following:

Consul should perform a delete + re-create on the socket if it already exists. This allows it to recover from a crash by simply restarting. I don't see any reason to not just delete the socket, since this way we can re-bind to it.
I think if we split the UNIX domain socket configuration into multiple configuration variables it will be much clearer. Namely, we should still use the "unix:path" syntax to switch from a TCP to domain socket listener, but new configurations for unix_socket_user, unix_socket_group and unix_socket_perm should be added to explicitly specify the other variables. We can sanely default those of to that of the running agent, and the permissions to match the parent folder.

Thoughts?

jefferai · 2015-01-20T17:46:32Z

@armon That's fine -- as I said in the other pull request, changing configuration is the ideal option. I can see arguments as to why you'd want to be able to specify user/group individually for each socket (since RPC and HTTP have different APIs, and AFAIK not all things can be done on each). Maybe that's too advanced, but this might be the best time to implement something along those lines.

Unfortunately, the "best" solution that I can think of would require deprecating or changing -addresses to have sub-objects instead of sub-strings. Otherwise it'd be to add a separate option along the lines of -unix-addresses but then you have to have a war between which one takes precedence when both are set.

armon · 2015-01-20T17:49:49Z

@jefferai Is this a use case you guys have currently? (Having distinct permissions per-socket). If it isn't then I'd prefer to keep is simple and just have a single set of unix_* configuration. If it is a use case, then I think we can have sub-objects within a different block. e.g.

"unix_sockets": {
    "dns": {
        "user": "consul"
    }
}

jefferai · 2015-01-20T17:54:17Z

@armon It's not a current use-case, no, although I can't speak for everyone here. But, I was simply figuring that if that might be useful to anyone down the road, easier to change things before people use them than after. :-)

jefferai · 2015-01-20T17:56:35Z

@armon @sethvargo @ryanuber BTW, since nginx is a common example for you guys, nginx uses unix:/path, but most other utilities that I've seen -- including actual languages -- prefer the full URL syntax with unix:///path. Just FYI. I'm a big fan of nginx, but not of all of their decisions...

armon · 2015-01-20T17:57:37Z

Perfect. So I think we can do:

"unix_sockets": {
"user":...,
"group":...,
"perm":...,
}

Then in the future, we can add "rpc" and "dns" override blocks within "unix_sockets" if needed.
I also agree about the full URL path, seems more canonical.

jefferai · 2015-01-20T18:08:49Z

@armon I'm guessing you meant "rpc" and "http"; there's no reason for DNS to be over Unix sockets. (Or rather, not for the same reasons that drove this implementation.)

But yes, that looks great.

ryanuber added 9 commits January 16, 2015 00:45

agent: beginning refactor

4675cdf

agent: only ignore errors on IsNotExist()

bf48651

agent: fixing up tests

4c3ec24

command: style fixes for env addr reading

e2eff56

agent: test socket file overwrite

56c3e48

api: simple test for unix socket connectivity

4b63573

agent: error if binding to existing socket file

f01bb5c

website: update docs for agent socket options

7a585d7

command: fix env var reader for HTTP addr

41ec10a

sethvargo reviewed Jan 16, 2015
View reviewed changes

ryanuber added a commit that referenced this pull request Jan 16, 2015

Merge pull request #612 from hashicorp/f-refactor

df52ac6

Refactor UNIX domain socket

ryanuber merged commit df52ac6 into master Jan 16, 2015

ryanuber deleted the f-refactor branch January 16, 2015 22:34

ryanuber mentioned this pull request Jan 16, 2015

RPC and HTTP interfaces fully generically-sockified so Unix is supported... #587

Merged

ryanuber added a commit that referenced this pull request Jan 20, 2015

agent: beginning socket user/group/mode support as discussed in #612

21dd95d

ryanuber mentioned this pull request Jan 21, 2015

Unix domain sockets #622

Merged

duckhan pushed a commit to duckhan/consul that referenced this pull request Oct 24, 2021

Add ability to set extra labels on client pods (hashicorp#612)

c93b5db

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor UNIX domain socket #612

Refactor UNIX domain socket #612

ryanuber commented Jan 16, 2015

sethvargo Jan 16, 2015

ryanuber Jan 16, 2015

sethvargo commented Jan 16, 2015

jefferai commented Jan 20, 2015

sethvargo commented Jan 20, 2015

jefferai commented Jan 20, 2015

armon commented Jan 20, 2015

jefferai commented Jan 20, 2015

armon commented Jan 20, 2015

jefferai commented Jan 20, 2015

jefferai commented Jan 20, 2015

armon commented Jan 20, 2015

jefferai commented Jan 20, 2015

Refactor UNIX domain socket #612

Refactor UNIX domain socket #612

Conversation

ryanuber commented Jan 16, 2015

Ownership and permissions of the socket are not managed by Consul.

Consul errors when the specified socket path already exists

sethvargo Jan 16, 2015

Choose a reason for hiding this comment

ryanuber Jan 16, 2015

Choose a reason for hiding this comment

sethvargo commented Jan 16, 2015

jefferai commented Jan 20, 2015

sethvargo commented Jan 20, 2015

jefferai commented Jan 20, 2015

armon commented Jan 20, 2015

jefferai commented Jan 20, 2015

armon commented Jan 20, 2015

jefferai commented Jan 20, 2015

jefferai commented Jan 20, 2015

armon commented Jan 20, 2015

jefferai commented Jan 20, 2015