Added (fixed) support for abstract Unix sockets. #740

alejandro-colomar · 2022-07-28T14:47:37Z

Unix domain sockets are normally backed by files in the
filesystem. This has historically been problematic when closing
and opening again such sockets, since SO_REUSEADDR is ignored for
Unix sockets (POSIX left the behavior of SO_REUSEADDR as
implementation-defined, and most --if not all-- implementations
decided to just ignore this flag).

Many solutions are available for this problem, but all of them
have important caveats:

unlink(2) the file when it's not needed anymore.

This is not easy, because the process that controls the fd may
not be the same process that created the file, and may not have
file permissions to remove it.

Further solutions can be applied to that caveat:
- unlink(2) the file right after creation.
  
  This will remove the pathname from the filesystem without
  closing the socket (it will continue alive until the last fd
  is closed). This is not useful for us, since we need the
  pathname of the socket as its interface.
- chown(2) or chmod(2) the directory that contains the socket.
  
  For removing a file from the filesystem, a process needs
  write permissions in the containing directory. We could
  put sockets in dummy directories that can be chown(2)ed to
  nobody. This could be dangerous, though, as we don't control
  the socket names. It is our users who configure the socket
  name in their configuration, and so it's easy that they don't
  understand the many implications of not chosing an appropriate
  socket pathname. A user could unknowingly put the socket in a
  directory that is not supposed to be owned by user nobody, and
  if we blindly chown(2) or chmod(2) the directory, we could be
  creating a big security hole.
- Ask the main process to remove the socket.
  
  This would require a very complex communication mechanism with
  the main process, which is not impossible, but let's avoid it
  if there are simpler solutions.
- Give the child process the CAP_DAC_OVERRIDE capability.
  
  That is one of the most powerful capabilities. A process with
  that capability can be considered root for most practical
  aspects. Even if the capability is disabled for most of the
  lifetime of the process, there's a slight chance that a
  malicious actor could activate it and then easily do serious
  damage to the system.
unlink(2) the file right before calling bind(2).

This is dangerous because another process (for example, another
running instance of unitd(8)), could be using the socket, and
removing the pathname from the filesystem would be problematic.
To do this correctly, a lot of checks should be added before the
actual unlink(2), which is bug-prone, and difficult to do
correctly, and atomically.
Use abstract-namespace Unix domain sockets.

This is the simplest solution, as it only requires accepting a
slightly different syntax (basically a @ prefix) for the socket
name, to transform it into a string starting with a null byte
('\0') that the kernel can understand. The patch is minimal.

Since abstract sockets live in an abstract namespace, they don't
create files in the filesystem, so there's no need to remove
them later. The kernel removes the name when the last fd to it
has been closed.

One caveat is that only Linux currently supports this kind of
Unix sockets. Of course, a solution to that could be to ask
other kernels to implement such a feature.

Another caveat is that filesystem permissions can't be used to
control access to the socket file (since, of course, there's no
file). Anyone knowing the socket name can access to it. The
only method to control access to it is by using
network_namespaces(7). Since in unitd(8) we're using 0666 file
sockets, abstract sockets should be more insecure than that
(anyone can already read/write to the listener sockets).
Ask the kernel to implement an simpler way to unlink(2) socket
files when they are not needed anymore. I've suggested that to
the linux-fsdevel@vger.kernel.org mailing list, in:
<lore.kernel.org/linux-fsdevel/0bc5f919-bcfd-8fd0-a16b-9f060088158a@gmail.com/T>

In this commit, I decided to go for the easiest/simplest solution.

This closes #669 issue on GitHub.

alejandro-colomar · 2022-07-28T15:11:54Z

One detail:

Only the first @ is transformed into a \0. If the user specifies further @s, they will be treated literally.

If the user specifies \0s in the middle of the string, it will probably screw the JSON parser, so I didn't even care to add support for it, nor test it. We just don't support such broken names.

alejandro-colomar · 2022-07-28T15:14:43Z

What I've already tested is configuring the following:

{
	"listeners": {
		"unix:@magic": {
			"pass": "routes",
			"client_ip": {
				"header": "X-Forwarded-For",
				"source": "unix"
			}
		}
	},
	"routes": [{
		"action": {
			"share": "/home/alx/srv/www/$uri"
		}
	}]
}

And then:

$ sudo curl -X PUT --data-binary @- --unix-socket /usr/local/control.unit.sock localhost/config/ <sock_abstract_static.json
{
	"success": "Reconfiguration done."
}

And some sanity check:

$ ss -l | grep magic
u_str LISTEN 0      511                                         @magic@ 54141                           * 0

And also, I tested a simple C program to connect to the socket and write to it some random stuff, and it (kind of) worked: the socket name has a trailing \0, so I had to connect to \0magic\0. I'm working on fixing that.

alejandro-colomar · 2022-07-28T15:45:21Z

This is yet another even simpler alternative to #670 and #735 .

alejandro-colomar · 2022-07-28T15:59:32Z

I'm trying to test this with curl, for which I need a socket name that doesn't end in \0. For that, I'm trying to do sa->socklen--;, but for some reason, the code is completely ignoring me, and I need someone else look into my code for a moment. I'm blind.

So, I have the following debug changes, and even though I'm setting the length to 3 (and sa_family_t is already 2 bytes, so sun_path should be just one \0.

Still, I see 9 (2 + 7) bytes in the log: ALX: 9; <^@-m-a-g-i-c-^@>.

$ git diff
diff --git a/src/nxt_main_process.c b/src/nxt_main_process.c
index 0527d08f..4e1df140 100644
--- a/src/nxt_main_process.c
+++ b/src/nxt_main_process.c
@@ -1147,10 +1147,21 @@ nxt_main_listening_socket(nxt_sockaddr_t *sa, nxt_listening_socket_t *ls)
 #if (NXT_HAVE_UNIX_DOMAIN)
     if (sa->u.sockaddr.sa_family == AF_UNIX) {
         if (sa->u.sockaddr_un.sun_path[0] == '@') {
+            sa->socklen=3;
             sa->u.sockaddr_un.sun_path[0] = '\0';
         }
     }
 #endif
+fprintf(stderr, "ALX: %i; <%c-%c-%c-%c-%c-%c-%c>\n", (int) sa->socklen,
+sa->u.sockaddr_un.sun_path[0],
+sa->u.sockaddr_un.sun_path[1],
+sa->u.sockaddr_un.sun_path[2],
+sa->u.sockaddr_un.sun_path[3],
+sa->u.sockaddr_un.sun_path[4],
+sa->u.sockaddr_un.sun_path[5],
+sa->u.sockaddr_un.sun_path[6]);
 
     ret = bind(s, &sa->u.sockaddr, sa->socklen);

And ss(8) still sees also @magic@.

hongzhidao · 2022-07-28T16:05:50Z

Just take a glance.

In nxt_sockaddr_unix_parse(), there is already handling with abstract sockets.

#if (NXT_LINUX)

    /*
     * Linux unix(7):
     *
     *   abstract: an abstract socket address is distinguished by the fact
     *   that sun_path[0] is a null byte ('\0').  The socket's address in
     *   this namespace is given by the additional bytes in sun_path that
     *   are covered by the specified length of the address structure.
     *   (Null bytes in the name have no special significance.)
     */
    if (path[0] == '@') {
        path[0] = '\0';
        socklen--;
    }

#endif

Alternative check.

if (sa->u.sockaddr.sa_family == AF_UNIX) {
     && sa->u.sockaddr_un.sun_path[0] == '\0')
{

alejandro-colomar · 2022-07-28T16:09:45Z

nxt_sockaddr_unix_parse

Yeah, it seems it wasn't entering my conditional. Heh!

Well, then, we have still some issue. There's a trailing \0, even though the code also did socklen--; as I was trying to do. I'll check why we still have that trailing byte. curl(1) doesn't support that.

alejandro-colomar · 2022-07-28T16:16:06Z

Just take a glance.

1. In `nxt_sockaddr_unix_parse()`, there is already handling with abstract sockets.

#if (NXT_LINUX)

    /*
     * Linux unix(7):
     *
     *   abstract: an abstract socket address is distinguished by the fact
     *   that sun_path[0] is a null byte ('\0').  The socket's address in
     *   this namespace is given by the additional bytes in sun_path that
     *   are covered by the specified length of the address structure.
     *   (Null bytes in the name have no special significance.)
     */
    if (path[0] == '@') {
        path[0] = '\0';
        socklen--;
    }

#endif

2. Alternative check.

if (sa->u.sockaddr.sa_family == AF_UNIX) {
     && sa->u.sockaddr_un.sun_path[0] == '\0')
{

Although, that code doesn't work. The listener doesn't support abstract sockets previous to my patch. What was that code used for? Or was it dead code?

hongzhidao · 2022-07-28T16:28:24Z

It's for disabling chmod() on path which starts with \0.

ac000 · 2022-07-28T18:00:17Z

$ git diff
diff --git a/src/nxt_main_process.c b/src/nxt_main_process.c
index 0527d08f..4e1df140 100644
--- a/src/nxt_main_process.c
+++ b/src/nxt_main_process.c
@@ -1147,10 +1147,21 @@ nxt_main_listening_socket(nxt_sockaddr_t *sa, nxt_listening_socket_t *ls)
 #if (NXT_HAVE_UNIX_DOMAIN)
     if (sa->u.sockaddr.sa_family == AF_UNIX) {
         if (sa->u.sockaddr_un.sun_path[0] == '@') {
+            sa->socklen=3;
             sa->u.sockaddr_un.sun_path[0] = '\0';
         }
     }

I can't get the code inside the if (sa->u.sockaddr_un.sun_path[0] == '@') { to trigger

Doing

sa->u.sockaddr_un.sun_path[0] = '\0';
sa->socklen--;

before that if check makes things work. Although I think we may have tickled a bug somewhere as now ss only shows @magic as magic, even though the bind(2) is doing

bind(10, {sa_family=AF_UNIX, sun_path=@"magic"}, 8) = 0

and it shows up correctly in /proc/net/unix

$ grep magic /proc/net/unix
00000000c72b4d32: 00000002 00000000 00010000 0001 01 7229023 @magic

but

$ ss -l | grep magic
u_str LISTEN 0      511                                                        magic 7229392

But at least it works

$ curl --abstract-unix-socket magic http://localhost/
Hello, Python on Unit!

ac000 · 2022-07-28T18:57:14Z

Unix domain sockets are normally backed by files in the filesystem. This has historically been problematic when closing and opening again such sockets, since SO_REUSEADDR is ignored for Unix sockets (POSIX left the behavior of SO_REUSEADDR

s/behavior/behaviour/

  * unlink(2) the file right after creation.
    This will remove the pathname from the filesystem without
    closing the socket (it will continue alive until the last fd

s/alive/to live/

* unlink(2) the file right before calling bind(2).
  This is dangerous because another process (for example, another
  running instance of unitd(8)), could be using the socket, and
  removing the pathname from the filesystem would be problematic.
  To do this correctly, a lot of checks should be added before the
  actual unlink(2), which is bug-prone, and difficult to do

s/bug/error/

* Use abstract-namespace Unix domain sockets.
  This is the simplest solution, as it only requires accepting a
  slightly different syntax (basically a @ prefix) for the socket
  name, to transform it into a string starting with a null byte
  ('\0') that the kernel can understand.  The patch is minimal.
  Since abstract sockets live in an abstract namespace, they don't
  create files in the filesystem, so there's no need to remove
  them later.  The kernel removes the name when the last fd to it
  has been closed.
  One caveat is that only Linux currently supports this kind of
  Unix sockets.  Of course, a solution to that could be to ask
  other kernels to implement such a feature.
  Another caveat is that filesystem permissions can't be used to
  control access to the socket file (since, of course, there's no
  file).  Anyone knowing the socket name can access to it.  The
  only method to control access to it is by using
  network_namespaces(7).  Since in unitd(8) we're using 0666 file
  sockets, abstract sockets should be more insecure than that

s/be more/be no more/

You can also use SO_PEERCRED on Unix Domain Sockets.

  (anyone can already read/write to the listener sockets).

* Ask the kernel to implement an simpler way to unlink(2) socket

s/an/a/

lcrilly · 2022-07-28T19:06:46Z

s/behavior/behaviour/

Unfortunately, we default to US English :(

alejandro-colomar · 2022-07-29T10:47:08Z

before that if check makes things work. Although I think we may have tickled a bug somewhere as now ss only shows @magic as magic, even though the bind(2) is doing
[...]

But at least it works
$ curl --abstract-unix-socket magic http://localhost/
Hello, Python on Unit!

That's very weird. A bug in ss(8)? Could you please try my cli_a.c program to try to write to that magic socket? I'd like to make sure that the kernel has it right (as \0magic). Then maybe we can report a ss(8) bug.

Another test could be to try to run srv_a.c, which should fail to run if \0magic is already bound.

I don't trust curl(1) maybe doing weird stuff. It's too complex of a program.

ac000 · 2022-07-29T13:05:19Z

It's a bug, in xterm!

I've seen this before, and it just occurred to me again now.

Lets see

xterm at 80 chars wide.

$ ss -l | grep @ | grep magic
u_str LISTEN 0      511                                                        magic 7869871                         * 0

The 'm' is at the 80 char mark.

So we got magic even though there is apparently no '@'!

Now lets extend xterm out a little (even just by 1 char).

$ ss -l | grep @ | grep magic
u_str LISTEN 0      511                                                        @magic 7869871                         * 0

The '@' is now at the 80 char mark.

There we go. Previously I'd already had xterm wider for the ss output so didn't notice this then.

I've noticed this problem in xterm before, particularly with gcc output.

ac000 · 2022-08-02T02:07:14Z

#if (NXT_LINUX)

/*
 * Linux unix(7):
 *
 *   abstract: an abstract socket address is distinguished by the fact
 *   that sun_path[0] is a null byte ('\0').  The socket's address in
 *   this namespace is given by the additional bytes in sun_path that
 *   are covered by the specified length of the address structure.
 *   (Null bytes in the name have no special significance.)
 */
if (path[0] == '@') {
    path[0] = '\0';
    socklen--;
}

This code doesn't seem to be executed, at least not before we call bind(2).

2. Alternative check.

if (sa->u.sockaddr.sa_family == AF_UNIX) {
     && sa->u.sockaddr_un.sun_path[0] == '\0')
{

Yeah, that's required. Here's a minimal patch on top of master to make it work.

diff --git a/src/nxt_main_process.c b/src/nxt_main_process.c
index 03761a10..a8de8f89 100644
--- a/src/nxt_main_process.c
+++ b/src/nxt_main_process.c
@@ -1143,6 +1143,12 @@ nxt_main_listening_socket(nxt_sockaddr_t *sa, nxt_listening_socket_t *ls)
 
 #endif
 
+    if (sa->u.sockaddr.sa_family == AF_UNIX
+        && *sa->u.sockaddr_un.sun_path == '\0')
+    {
+        sa->socklen--;
+    }
+
     if (bind(s, &sa->u.sockaddr, sa->socklen) != 0) {
         err = nxt_errno;
 
@@ -1187,7 +1193,9 @@ nxt_main_listening_socket(nxt_sockaddr_t *sa, nxt_listening_socket_t *ls)
 
 #if (NXT_HAVE_UNIX_DOMAIN)
 
-    if (sa->u.sockaddr.sa_family == AF_UNIX) {
+    if (sa->u.sockaddr.sa_family == AF_UNIX
+        && *sa->u.sockaddr_un.sun_path != '\0')
+    {
         char     *filename;
         mode_t   access;

NOTE: I'm not saying this is correct, but it shows the code as is, is not far off (though the socket name does show up in the config as \u0000magic)

hongzhidao · 2022-08-02T07:56:04Z

This code doesn't seem to be executed, at least not before we call bind(2).

Correct, the related APIs have not been used yet.

alejandro-colomar · 2022-08-03T14:33:11Z

Digging, I found a lot of dead code that was accidentally left in some of Igor's refactors. I removed it all.

alejandro-colomar · 2022-08-03T23:51:44Z

I found more or less why we are getting the extra length. See the following debug log:

alx@asus5775:/usr/local$ sudo cat unit.log | tr '\0' '#' | strings | grep ALX
ALX: nxt_sockaddr_text():293: 1
 ALX:  <#><
ALX: nxt_sockaddr_text():305: 1
 ALX:  <u><n><i><x><:><
ALX: nxt_sockaddr_unix_parse():631: 7
 ALX:  <@><m><a><g><i><c>
ALX: nxt_sockaddr_unix_parse():639: 6
 ALX:  <#><m><a><g><i><c>
ALX: nxt_sockaddr_text():293: 6
 ALX:  <#><m><a><g><i><c>
ALX: nxt_sockaddr_text():305: 6
 ALX:  <u><n><i><x><:><@>
ALX: nxt_sockaddr_unix_parse():631: 7
 ALX:  <#><m><a><g><i><c>
ALX: nxt_sockaddr_unix_parse():639: 7
 ALX:  <#><m><a><g><i><c>
ALX: nxt_sockaddr_text():293: 7
 ALX:  <#><m><a><g><i><c>
ALX: nxt_sockaddr_text():305: 7
 ALX:  <u><n><i><x><:><@>

Got that with the following debug code:

diff --git a/src/nxt_sockaddr.c b/src/nxt_sockaddr.c
index 8220590e..75c1efde 100644
--- a/src/nxt_sockaddr.c
+++ b/src/nxt_sockaddr.c
@@ -141,6 +141,10 @@ nxt_sockaddr_create(nxt_mp_t *mp, struct sockaddr *sockaddr, socklen_t length,
 
 #endif
     }
+fprintf(stderr, "ALX: %s():%i: %i\n ALX:  ", __func__, __LINE__, (int) size-2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", sockaddr->sa_data[i]);
+fprintf(stderr, "\n");
 
 #endif  /* NXT_HAVE_UNIX_DOMAIN */
 
@@ -286,6 +290,10 @@ nxt_sockaddr_text(nxt_sockaddr_t *sa)
 
 #if (NXT_LINUX)
 
+fprintf(stderr, "ALX: %s():%i: %i\n ALX:  ", __func__, __LINE__, (int) sa->socklen - 2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", p[i]);
+fprintf(stderr, "\n");
         if (p[0] == '\0') {
             size_t  length;
 
@@ -294,8 +302,16 @@ nxt_sockaddr_text(nxt_sockaddr_t *sa)
 
             p = nxt_sprintf(start, end, "unix:@%*s", length - 1, p + 1);
 
+fprintf(stderr, "ALX: %s():%i: %i\n ALX:  ", __func__, __LINE__, (int) length);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", start[i]);
+fprintf(stderr, "\n");
         } else {
             p = nxt_sprintf(start, end, "unix:%s", p);
+fprintf(stderr, "ALX: %s():%i:\n ALX:  ", __func__, __LINE__);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", start[i]);
+fprintf(stderr, "\n");
         }
 
 #else  /* !(NXT_LINUX) */
@@ -612,10 +628,18 @@ nxt_sockaddr_unix_parse(nxt_mp_t *mp, nxt_str_t *addr)
      *   are covered by the specified length of the address structure.
      *   (Null bytes in the name have no special significance.)
      */
+fprintf(stderr, "ALX: %s():%i: %i\n ALX:  ", __func__, __LINE__, (int) socklen - 2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", path[i]);
+fprintf(stderr, "\n");
     if (path[0] == '@') {
         path[0] = '\0';
         socklen--;
     }
+fprintf(stderr, "ALX: %s():%i: %i\n ALX:  ", __func__, __LINE__, (int) socklen - 2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", path[i]);
+fprintf(stderr, "\n");
 
 #endif

The explanation is that we're going twice through nxt_sockaddr_unix_parse(). What is not clear to me is why are we going twice through that function. The solution seems also easy (although I'm not still convinced by it; it looks like something's wrong).

The easy fix:

     if (path[0] == '@') {
         path[0] = '\0';
         socklen--;
     }

should be transformed into

     if (path[0] == '@') {
         path[0] = '\0';
     }
     if (path[0] == '\0') {
         socklen--;
     }

But, as I hinted, the fix should probably be to make sure that we only run through that function once.

I configured with the following config file:

$ cat sock_static.json 
{
	"listeners": {
		"unix:@magic": {
			"pass": "routes",
			"client_ip": {
				"header": "X-Forwarded-For",
				"source": "unix"
			}
		}
	},
	"routes": [{
		"action": {
			"share": "/home/alx/srv/www/$uri"
		}
	}]
}

alejandro-colomar · 2022-08-04T10:19:20Z

s/behavior/behaviour/

Wontfix :)

s/alive/to live/
s/bug/error/
s/be more/be no more/
s/an/a/

Fixed.

Thanks!

alejandro-colomar · 2022-08-04T16:10:05Z

I found the following:

ALX: nxt_conf_vldt_listener():1484
ALX: nxt_sockaddr_parse():544
ALX: nxt_sockaddr_parse_optport():566
ALX: nxt_sockaddr_unix_parse():635: 7
 ALX:  <@><m><a><g><i><c>
ALX: nxt_sockaddr_unix_parse():643: 6
 ALX:  <^@><m><a><g><i><c>
ALX: nxt_sockaddr_text():293: 6
 ALX:  <^@><m><a><g><i><c>
ALX: nxt_sockaddr_text():305: 6
 ALX:  <u><n><i><x><:><@>
ALX: nxt_sockaddr_parse_optport():585
ALX: nxt_sockaddr_parse():555
ALX: nxt_conf_vldt_listener():1497
ALX: nxt_router_conf_data_handler():739
ALX: nxt_router_conf_create():1483
ALX: nxt_router_socket_conf():2432
ALX: nxt_sockaddr_parse():544
ALX: nxt_sockaddr_parse_optport():566
ALX: nxt_sockaddr_unix_parse():635: 7
 ALX:  <^@><m><a><g><i><c>
ALX: nxt_sockaddr_unix_parse():643: 7
 ALX:  <^@><m><a><g><i><c>
ALX: nxt_sockaddr_text():293: 7
 ALX:  <^@><m><a><g><i><c>
ALX: nxt_sockaddr_text():305: 7
 ALX:  <u><n><i><x><:><@>
ALX: nxt_sockaddr_parse_optport():585
ALX: nxt_sockaddr_parse():555
ALX: nxt_router_socket_conf():2500
ALX: nxt_router_conf_create():1968
ALX: nxt_router_conf_data_handler():822

with the following debug code:

diff --git a/src/nxt_conf_validation.c b/src/nxt_conf_validation.c
index 3e89d775..9e5ca414 100644
--- a/src/nxt_conf_validation.c
+++ b/src/nxt_conf_validation.c
@@ -1481,6 +1481,7 @@ nxt_conf_vldt_listener(nxt_conf_validation_t *vldt, nxt_str_t *name,
 {
     nxt_int_t       ret;
     nxt_sockaddr_t  *sa;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 
     sa = nxt_sockaddr_parse(vldt->pool, name);
     if (nxt_slow_path(sa == NULL)) {
@@ -1493,6 +1494,7 @@ nxt_conf_vldt_listener(nxt_conf_validation_t *vldt, nxt_str_t *name,
     if (ret != NXT_OK) {
         return ret;
     }
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 
     return nxt_conf_vldt_object(vldt, value, nxt_conf_vldt_listener_members);
 }
@@ -1752,6 +1754,7 @@ nxt_conf_vldt_proxy(nxt_conf_validation_t *vldt, nxt_conf_value_t *value,
     nxt_str_t       name;
     nxt_sockaddr_t  *sa;
 
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
     nxt_conf_get_string(value, &name);
 
     if (nxt_str_start(&name, "http://", 7)) {
@@ -1760,9 +1763,11 @@ nxt_conf_vldt_proxy(nxt_conf_validation_t *vldt, nxt_conf_value_t *value,
 
         sa = nxt_sockaddr_parse(vldt->pool, &name);
         if (sa != NULL) {
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
             return NXT_OK;
         }
     }
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 
     return nxt_conf_vldt_error(vldt, "The \"proxy\" address is invalid \"%V\"",
                                &name);
@@ -3068,6 +3073,7 @@ nxt_conf_vldt_server(nxt_conf_validation_t *vldt, nxt_str_t *name,
 {
     nxt_int_t       ret;
     nxt_sockaddr_t  *sa;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 
     ret = nxt_conf_vldt_type(vldt, name, value, NXT_CONF_VLDT_OBJECT);
 
@@ -3081,6 +3087,7 @@ nxt_conf_vldt_server(nxt_conf_validation_t *vldt, nxt_str_t *name,
         return nxt_conf_vldt_error(vldt, "The \"%V\" is not valid "
                                    "server address.", name);
     }
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 
     return nxt_conf_vldt_object(vldt, value,
                                 nxt_conf_vldt_upstream_server_members);
diff --git a/src/nxt_http_proxy.c b/src/nxt_http_proxy.c
index 6aa3aabb..3dc59b26 100644
--- a/src/nxt_http_proxy.c
+++ b/src/nxt_http_proxy.c
@@ -58,6 +58,7 @@ nxt_http_proxy_init(nxt_mp_t *mp, nxt_http_action_t *action,
     nxt_upstream_t        *up;
     nxt_upstream_proxy_t  *proxy;
 
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
     sa = NULL;
     nxt_conf_get_string(acf->proxy, &name);
 
@@ -96,6 +97,7 @@ nxt_http_proxy_init(nxt_mp_t *mp, nxt_http_action_t *action,
         action->handler = nxt_http_proxy;
     }
 
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
     return NXT_OK;
 }
 
diff --git a/src/nxt_http_request.c b/src/nxt_http_request.c
index a16340de..10dd5d48 100644
--- a/src/nxt_http_request.c
+++ b/src/nxt_http_request.c
@@ -468,6 +468,7 @@ nxt_http_request_client_ip_sockaddr(nxt_http_request_t *r, u_char *start,
 {
     nxt_str_t       addr;
     nxt_sockaddr_t  *sa;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 
     addr.start = start;
     addr.length = len;
@@ -498,6 +499,7 @@ nxt_http_request_client_ip_sockaddr(nxt_http_request_t *r, u_char *start,
             return NULL;
     }
 
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
     return sa;
 }
 
diff --git a/src/nxt_router.c b/src/nxt_router.c
index e1f4f6da..7f0b0832 100644
--- a/src/nxt_router.c
+++ b/src/nxt_router.c
@@ -736,6 +736,7 @@ nxt_router_conf_data_handler(nxt_task_t *task, nxt_port_recv_msg_t *msg)
     nxt_int_t               ret;
     nxt_port_t              *port;
     nxt_router_temp_conf_t  *tmcf;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 
     port = nxt_runtime_port_find(task->thread->runtime,
                                  msg->port_msg.pid,
@@ -818,6 +819,7 @@ cleanup:
         nxt_fd_close(msg->fd[0]);
         msg->fd[0] = -1;
     }
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 }
 
 
@@ -1478,6 +1480,7 @@ nxt_router_conf_create(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
     nxt_app_lang_module_t       *lang;
     nxt_router_app_conf_t       apcf;
     nxt_router_listener_conf_t  lscf;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 
     static nxt_str_t  http_path = nxt_string("/settings/http");
     static nxt_str_t  applications_path = nxt_string("/applications");
@@ -1962,6 +1965,7 @@ nxt_router_conf_create(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
 
     nxt_queue_add(&deleting_sockets, &router->sockets);
     nxt_queue_init(&router->sockets);
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 
     return NXT_OK;
 
@@ -2425,6 +2429,7 @@ nxt_router_socket_conf(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
     nxt_sockaddr_t       *sa;
     nxt_socket_conf_t    *skcf;
     nxt_listen_socket_t  *ls;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 
     sa = nxt_sockaddr_parse(tmcf->mem_pool, name);
     if (nxt_slow_path(sa == NULL)) {
@@ -2492,6 +2497,7 @@ nxt_router_socket_conf(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
         nxt_memcpy(skcf->sockaddr, sa, size);
     }
 
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
     return skcf;
 }
 
diff --git a/src/nxt_runtime.c b/src/nxt_runtime.c
index 46955f1c..67a5fa6a 100644
--- a/src/nxt_runtime.c
+++ b/src/nxt_runtime.c
@@ -760,6 +760,7 @@ nxt_runtime_conf_init(nxt_task_t *task, nxt_runtime_t *rt)
     nxt_sockaddr_t               *sa;
     nxt_file_name_str_t          file_name;
     const nxt_event_interface_t  *interface;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 
     rt->daemon = 1;
     rt->engine_connections = 256;
@@ -904,6 +905,7 @@ nxt_runtime_conf_init(nxt_task_t *task, nxt_runtime_t *rt)
     if (nxt_runtime_controller_socket(task, rt) != NXT_OK) {
         return NXT_ERROR;
     }
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 
     return NXT_OK;
 }
diff --git a/src/nxt_sockaddr.c b/src/nxt_sockaddr.c
index 8220590e..6f476e8a 100644
--- a/src/nxt_sockaddr.c
+++ b/src/nxt_sockaddr.c
@@ -141,6 +141,10 @@ nxt_sockaddr_create(nxt_mp_t *mp, struct sockaddr *sockaddr, socklen_t length,
 
 #endif
     }
+fprintf(stderr, "ALX: %s():%i: %i\n ALX:  ", __func__, __LINE__, (int) size-2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", sockaddr->sa_data[i]);
+fprintf(stderr, "\n");
 
 #endif  /* NXT_HAVE_UNIX_DOMAIN */
 
@@ -286,6 +290,10 @@ nxt_sockaddr_text(nxt_sockaddr_t *sa)
 
 #if (NXT_LINUX)
 
+fprintf(stderr, "ALX: %s():%i: %i\n ALX:  ", __func__, __LINE__, (int) sa->socklen - 2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", p[i]);
+fprintf(stderr, "\n");
         if (p[0] == '\0') {
             size_t  length;
 
@@ -294,8 +302,16 @@ nxt_sockaddr_text(nxt_sockaddr_t *sa)
 
             p = nxt_sprintf(start, end, "unix:@%*s", length - 1, p + 1);
 
+fprintf(stderr, "ALX: %s():%i: %i\n ALX:  ", __func__, __LINE__, (int) length);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", start[i]);
+fprintf(stderr, "\n");
         } else {
             p = nxt_sprintf(start, end, "unix:%s", p);
+fprintf(stderr, "ALX: %s():%i:\n ALX:  ", __func__, __LINE__);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", start[i]);
+fprintf(stderr, "\n");
         }
 
 #else  /* !(NXT_LINUX) */
@@ -525,6 +541,7 @@ nxt_sockaddr_parse(nxt_mp_t *mp, nxt_str_t *addr)
 {
     nxt_sockaddr_t  *sa;
 
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
     sa = nxt_sockaddr_parse_optport(mp, addr);
 
     if (sa != NULL
@@ -535,6 +552,7 @@ nxt_sockaddr_parse(nxt_mp_t *mp, nxt_str_t *addr)
                              "The address \"%V\" must specify a port.", addr);
         return NULL;
     }
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 
     return sa;
 }
@@ -545,6 +563,7 @@ nxt_sockaddr_parse_optport(nxt_mp_t *mp, nxt_str_t *addr)
 {
     nxt_sockaddr_t  *sa;
 
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
     if (addr->length == 0) {
         nxt_thread_log_error(NXT_LOG_ERR, "socket address cannot be empty");
         return NULL;
@@ -563,6 +582,7 @@ nxt_sockaddr_parse_optport(nxt_mp_t *mp, nxt_str_t *addr)
     if (nxt_fast_path(sa != NULL)) {
         nxt_sockaddr_text(sa);
     }
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 
     return sa;
 }
@@ -612,10 +632,18 @@ nxt_sockaddr_unix_parse(nxt_mp_t *mp, nxt_str_t *addr)
      *   are covered by the specified length of the address structure.
      *   (Null bytes in the name have no special significance.)
      */
+fprintf(stderr, "ALX: %s():%i: %i\n ALX:  ", __func__, __LINE__, (int) socklen - 2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", path[i]);
+fprintf(stderr, "\n");
     if (path[0] == '@') {
         path[0] = '\0';
         socklen--;
     }
+fprintf(stderr, "ALX: %s():%i: %i\n ALX:  ", __func__, __LINE__, (int) socklen - 2);
+for (int i = 0; i < 6; i++)
+fprintf(stderr, "<%c>", path[i]);
+fprintf(stderr, "\n");
 
 #endif
 
diff --git a/src/nxt_upstream_round_robin.c b/src/nxt_upstream_round_robin.c
index 31e2f48a..20771e88 100644
--- a/src/nxt_upstream_round_robin.c
+++ b/src/nxt_upstream_round_robin.c
@@ -51,6 +51,7 @@ nxt_upstream_round_robin_create(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
     nxt_sockaddr_t              *sa;
     nxt_conf_value_t            *servers_conf, *srvcf, *wtcf;
     nxt_upstream_round_robin_t  *urr;
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
 
     static nxt_str_t  servers = nxt_string("servers");
     static nxt_str_t  weight = nxt_string("weight");
@@ -115,6 +116,7 @@ nxt_upstream_round_robin_create(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
     upstream->proto = &nxt_upstream_round_robin_proto;
     upstream->type.round_robin = urr;
 
+fprintf(stderr, "ALX: %s():%i\n", __func__, __LINE__);
     return NXT_OK;
 }

So, we first parse the string when we validate it, and then again in the router when we use it. Knowing this, I don't think it can hurt to do the simple patch I suggested yesterday. I'm worried that some path of the existing code might not be prepared to handle a '\0' in the middle of the socket name. However, since that code already existed (i.e., we wouldn't be introducing new bugs or vulnerabilities), I think it's simpler to leave it untouched (assume it still works), and only change it if we find a bug, rather than trying to make sure that it's correct (which would be much harder). As long as my tests don't detect anything weird, I'm happy with it.

What do you think about it?

alejandro-colomar · 2022-08-04T20:54:01Z

A general inspection of the code looks good to me. In the conf validation, we parse the socket address to check it's a valid one, and discard the result. Later in the router, we parse it again to use it. Nothing very weird. We could maybe keep the parsed data to not have to parse it twice... but that would require work that may be unnecessary, and is not vital, I think; there are no obvious places that would be unreasonably slow in that code.

alejandro-colomar · 2022-08-04T21:09:12Z

Tested:

$ ss -l | grep magic
u_str LISTEN 0      511                                          @magic 32357                           * 0          
$ curl --abstract-unix-socket magic http://localhost/
idx

alejandro-colomar · 2022-08-04T21:10:52Z

@echolimazulu

This code should work for you, I guess. Now I'll try to make NGINX work with abstract sockets too.

ac000 · 2022-08-04T22:05:19Z

src/nxt_sockaddr.c

     */
-    if (path[0] == '@') {
+    switch (path[0]) {
+    case '@':
        path[0] = '\0';
+        /* fall through */
+    case '\0':
        socklen--;
    }


Nice use of switch/case.

The only thing I'd point out is that the socket name is stored in the config as 'unix:\u0000magic'. Not saying that's a problem, just pointing it out...

Nice use of switch/case.

It looks soo nice! =)

The only thing I'd point out is that the socket name is stored in the config as 'unix:\u0000magic'. Not saying that's a problem, just pointing it out...

I checked that JSON allows that: https://www.rfc-editor.org/rfc/rfc7159#section-7

The commit title is Fixed support for abstract Unix sockets., but the changelog is supporting abstract UNIX sockets..
Is it ok?

The first commit just fixes the small bugs in the old code that already intended to support abstract sockets. That's why the commit message says "Fixed support ...". However, since we have never officially supported abstract sockets, I used "supporting ..." in the changelog.

ac000

This is certainly the simplest and safest fix and if you wanted to have it stored in the config as @name than that could be done separately as it's likely to be more intrusive, so I'd say this was good to go.

alejandro-colomar · 2022-08-05T11:47:20Z

This is certainly the simplest and safest fix and if you wanted to have it stored in the config as @name than that could be done separately as it's likely to be more intrusive, so I'd say this was good to go.

Yep, I have no preference between \u0000 and @ in the state file. If we find that it causes trouble someday, we can fix it. For now, it's not broken, so let's keep the patch to the very minimum :)

ac000 · 2022-08-08T10:53:22Z

This would be the code to reject \0:

diff --git a/src/nxt_sockaddr.c b/src/nxt_sockaddr.c
index 63d3ffb3..c34363f8 100644
--- a/src/nxt_sockaddr.c
+++ b/src/nxt_sockaddr.c
@@ -615,9 +615,13 @@ nxt_sockaddr_unix_parse(nxt_mp_t *mp, nxt_str_t *addr)
     switch (path[0]) {
     case '@':
         path[0] = '\0';
-        /* fall through */
-    case '\0':
         socklen--;
+        break;
+
+    case '\0':
+        nxt_thread_log_error(NXT_LOG_ERR,
+                             "unix domain socket \"%V\" name starts with \\0");
+        return NULL;
     }
 
 #endif

Interesting, so it's not actually a lot of code, but at the same time, I just don't see the need...

alejandro-colomar · 2022-08-08T10:59:03Z

Interesting, so it's not actually a lot of code, but at the same time, I just don't see the need...

Yeah, I expected it to be a diff larger than the implementation, but it seems smaller. Still, it's comparable to the whole implementation diff, so I'd say it's big. :)

The two patches were (ignoring changelog):

$ git show -2 --stat -- src/

[...]

 src/nxt_conf_validation.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

[...]

 src/nxt_main_process.c | 4 +++-
 src/nxt_sockaddr.c     | 5 ++++-
 2 files changed, 7 insertions(+), 2 deletions(-)

hongzhidao · 2022-08-08T11:32:34Z

Which commit are we talking about here? The way I see it, the current commit split seems right. Each one is doing a separate logical change.

I'm still confused about the feature. If we have supported the feature of abstract sockets by the first commit, why still need the second commit? The second one seems like a fixing. Since storing @ is required, isn't it?

ac000 · 2022-08-08T11:52:21Z

I'm still confused about the feature. If we have supported the feature of abstract sockets by the first commit, why still need the

The first commit simply makes abstract sockets work, you can specify '\u0000' or '@' in the config file and both work.

second commit? The second one seems like a fixing. Since storing @ is required, isn't it?

However with just the first commit, if a user puts '@' in the config then when they read the config back from unit, they get '\u0000' instead.

This is what the second commit changes, they can still use '\u0000' or '@', but when they read back the config it will show them whatever they used. So if they put '@', then it will show back as '@' (rather than '\u0000').

hongzhidao · 2022-08-08T12:11:22Z

However with just the first commit, if a user puts '@' in the config then when they read the config back from unit, they get '\u0000' instead.

That’s my question, if putting ‘@‘ in the configuration, they should get the same ‘@‘, right?

alejandro-colomar · 2022-08-08T12:49:21Z

However with just the first commit, if a user puts '@' in the config then when they read the config back from unit, they get '\u0000' instead.

That’s my question, if putting ‘@‘ in the configuration, they should get the same ‘@‘, right?

Yes. See the test here:
#740 (comment)

ac000 · 2022-08-08T12:50:59Z

However with just the first commit, if a user puts '@' in the config then when they read the config back from unit, they get '\u0000' instead.

That’s my question, if putting ‘@‘ in the configuration, they should get the same ‘@‘, right?

After the second commit, yes.

The first commit simply fixes the existing code to actually make abstract sockets work.

alejandro-colomar · 2022-08-08T12:55:49Z

Which commit are we talking about here? The way I see it, the current commit split seems right. Each one is doing a separate logical change.

I'm still confused about the feature. If we have supported the feature of abstract sockets by the first commit, why still need the second commit? The second one seems like a fixing.

The first commit just fixes the small bugs in the old code that already intended to support abstract sockets. That's why the commit message says "Fixing ...". However, since we have never officially supported abstract sockets, I used "Added support ..." in the changes file.

Since storing @ is required, isn't it?

It's not required. Abstract sockets would work with both \0 (\u0000) and @ after the first commit.

The second commit changes the way we store the socket name in the internal state file (which is observable by a user that queries the control socket). But it doesn't affect in any way the creation of the abstract socket.

alejandro-colomar · 2022-08-08T13:03:55Z

I think we should also add the following code to the first commit:

diff --git a/src/nxt_sockaddr.c b/src/nxt_sockaddr.c
index 63d3ffb3..a90bb392 100644
--- a/src/nxt_sockaddr.c
+++ b/src/nxt_sockaddr.c
@@ -601,8 +601,6 @@ nxt_sockaddr_unix_parse(nxt_mp_t *mp, nxt_str_t *addr)
 
     socklen = offsetof(struct sockaddr_un, sun_path) + length + 1;
 
-#if (NXT_LINUX)
-
     /*
      * Linux unix(7):
      *
@@ -618,9 +616,12 @@ nxt_sockaddr_unix_parse(nxt_mp_t *mp, nxt_str_t *addr)
         /* fall through */
     case '\0':
         socklen--;
-    }
-
+#if !(NXT_LINUX)
+        nxt_thread_log_error(NXT_LOG_ERR,
+                             "abstract unix domain sockets are not supported");
+        return NULL;
 #endif
+    }
 
     sa = nxt_sockaddr_alloc(mp, socklen, addr->length);

What do you think?

Otherwise, we would probably be silently creating files starting with @ in non-Linux systems, and we clearly don't want that.

andrey-zelenkov · 2022-08-08T13:14:40Z

My 2 cents:

Can we document it as "abstract Unix sockets are supported" without specifying the symbols used for that? And just give an example with @ (if we want to do it at all). We well get rid ourself from work and following the updates in this topic.
I would suggest do not substitute/guess symbols that user wanted to specify in configuration (for this case at least). If \0 is not handled by some languages it's not our problem.

alejandro-colomar · 2022-08-08T13:23:46Z

@andrey-zelenkov

My 2 cents:

1. Can we document it as "abstract Unix sockets are supported" without specifying the symbols used for that? And just give an example with `@` (if we want to do it at all). We well get rid ourself from work and following the updates in this topic.

Sure, we can. In fact, my suggestion for the changelog is just supporting abstract UNIX sockets. Regarding the documentation in the website, I'll way for Artem to come back to discuss about it.

2. I would suggest do not substitute/guess symbols that user wanted to specify in configuration (for this case at least). If `\0` is not handled by some languages it's not our problem.

For me, that would be fine. I'd be happy accepting abstract sockets exactly as the kernel wants them (that is \0name). In JSON, the way to express that is \u0000name; which we support.

@lcrilly suggested using the @ syntax because there's much existing software that uses that symbol, for example ss(8), or Envoy. I like the idea of supporting that symbol, because we're not inventing it. Igor's old code also did this transformation, so I just fixed it, instead of removing it.

So that's why my patch set supports both syntaxes.

alejandro-colomar · 2022-08-08T13:30:11Z

I wonder what happens if you pass \u0000name to Envoy. @lcrilly it would be interesting if you would try that.

hongzhidao · 2022-08-08T13:32:23Z

Just to confirm again before reviewing.

Finally, the decision of whether \u0000name is abstract sockets is YES?
There will be two commits?

alejandro-colomar · 2022-08-08T13:37:14Z

Just to confirm again before reviewing.
1. Finally, the decision of whether `\u0000name` is abstract sockets is YES?

Yes. \u0000name cannot possibly be a normal Unix socket file. It has to be either an abstract socket, or reject it as an invalid value. I prefer interpreting it as a valid abstract socket.

2. There will be two commits?

I prefer two commits.

BTW, all of the commits are on RB too, if you want to review there.

alejandro-colomar · 2022-08-08T13:47:22Z

@lcrilly

I checked Envoy's source code, and I predict that they reject \u0000 (for no good reason, actually):

PipeInstance::PipeInstance(const std::string& pipe_path, mode_t mode,
                           const SocketInterface* sock_interface)
    : InstanceBase(Type::Pipe, sockInterfaceOrDefault(sock_interface)) {
  if (pipe_path.size() >= sizeof(pipe_.address_.sun_path)) {
    throw EnvoyException(
        fmt::format("Path \"{}\" exceeds maximum UNIX domain socket path size of {}.", pipe_path,
                    sizeof(pipe_.address_.sun_path)));
  }
  memset(&pipe_.address_, 0, sizeof(pipe_.address_));
  pipe_.address_.sun_family = AF_UNIX;
  if (pipe_path[0] == '@') {
    // This indicates an abstract namespace.
    // In this case, null bytes in the name have no special significance, and so we copy all
    // characters of pipe_path to sun_path, including null bytes in the name. The pathname must also
    // be null terminated. The friendly name is the address path with embedded nulls replaced with
    // '@' for consistency with the first character.
#if !defined(__linux__)
    throw EnvoyException("Abstract AF_UNIX sockets are only supported on linux.");
#endif
    if (mode != 0) {
      throw EnvoyException("Cannot set mode for Abstract AF_UNIX sockets");
    }
    pipe_.abstract_namespace_ = true;
    pipe_.address_length_ = pipe_path.size();
    // The following statement is safe since pipe_path size was checked at the beginning of this
    // function
    memcpy(&pipe_.address_.sun_path[0], pipe_path.data(), pipe_path.size()); // NOLINT(safe-memcpy)
    pipe_.address_.sun_path[0] = '\0';
    pipe_.address_.sun_path[pipe_path.size()] = '\0';
    friendly_name_ = friendlyNameFromAbstractPath(
        absl::string_view(pipe_.address_.sun_path, pipe_.address_length_));
  } else {
    // Throw an error if the pipe path has an embedded null character.
    if (pipe_path.size() != strlen(pipe_path.c_str())) {
      throw EnvoyException("UNIX domain socket pathname contains embedded null characters");
    }
    StringUtil::strlcpy(&pipe_.address_.sun_path[0], pipe_path.c_str(),
                        sizeof(pipe_.address_.sun_path));
    friendly_name_ = pipe_.address_.sun_path;
  }
  pipe_.mode_ = mode;
}

ac000 · 2022-08-09T17:10:10Z

src/nxt_sockaddr.c

+#if !(NXT_LINUX)
+        nxt_thread_log_error(NXT_LOG_ERR,
+                          "abstract unix domain socket \"%V\" is not supported",
+                          addr);
+        return NULL;
 #endif
+    }


OK, so this enables improved error messages in !Linux, e.g testing on FreeBSD, using either '@magic' or '\u0000magic' for the socket name would result in

{ "error": "Failed to apply new configuration." }

Now we get the likes of

{ "error": "Invalid configuration.", "detail": "The listener address \"unix:@magic\" is invalid." }

So LGTM.

Hmmm, did FreeBSD result in an error before this addition? I was worried that without this, FreeBSD might try to create a file with a leading @ or \0. If it's not the case, I prefer not having this extra code. Especially since using abstract sockets in FreeBSD is nonsense, so I don't want to add much code to make it nicer. I just want to make sure it fails hard, instead of doing weird stuff silently

Could/did you try?

OK, scrub that. If I revert that error check

It will create a socket called '@magic' in the processes current working directory (if it can).

It does error out with "Failed to apply new configuration." with '\u0000magic'.

alejandro-colomar · 2022-08-17T11:06:16Z

Tested with the following:

alx@asus5775:~/etc/unitd$ cat abs_sock.json
{
	"listeners": {
		"unix:@unitd.static.sock": {
			"pass": "routes",
			"client_ip": {
				"header": "X-Forwarded-For",
				"source": "unix"
			}
		},
		"unix:@unitd.perl.sock": {
			"pass": "applications/perl",
			"client_ip": {
				"header": "X-Forwarded-For",
				"source": "unix"
			}
		},
		"unix:@unitd.php.sock": {
			"pass": "applications/php",
			"client_ip": {
				"header": "X-Forwarded-For",
				"source": "unix"
			}
		},
		"unix:@unitd.asgi.sock": {
			"pass": "applications/asgi",
			"client_ip": {
				"header": "X-Forwarded-For",
				"source": "unix"
			}
		},
		"unix:@unitd.python.sock": {
			"pass": "applications/python",
			"client_ip": {
				"header": "X-Forwarded-For",
				"source": "unix"
			}
		},
		"unix:@unitd.ruby.sock": {
			"pass": "applications/ruby",
			"client_ip": {
				"header": "X-Forwarded-For",
				"source": "unix"
			}
		}
	},
	"applications": {
		"perl": {
			"type": "perl",
			"working_directory": "/home/alx/srv/www/perl/",
			"script": "./sock.psgi"
		},
		"php": {
			"type": "php",
			"root": "/home/alx/srv/www/php/",
			"index": "sock.php"
		},
		"asgi": {
			"type": "python",
			"path": "/home/alx/srv/www/asgi/",
			"module": "sock"
		},
		"python": {
			"type": "python",
			"path": "/home/alx/srv/www/python/",
			"module": "sock"
		},
		"ruby": {
			"type": "ruby",
			"working_directory": "/home/alx/srv/www/ruby",
			"script": "sock.ru"
		}
	},
	"routes": [{
		"action": {
			"share": "/home/alx/srv/www/$uri"
		}
	}]
}

alx@asus5775:~/etc/unitd$ sudo curl -X PUT -d @abs_sock.json --unix-sock /opt/local/unit/control.unit.sock localhost/config
{
	"success": "Reconfiguration done."
}

$ ss -l | grep @unitd
u_str LISTEN 0      511                                  @unitd.php.sock 523347                           * 0          
u_str LISTEN 0      511                                 @unitd.perl.sock 523346                           * 0          
u_str LISTEN 0      511                               @unitd.python.sock 523348                           * 0          
u_str LISTEN 0      511                               @unitd.static.sock 523345                           * 0          
u_str LISTEN 0      511                                 @unitd.asgi.sock 524535                           * 0          
u_str LISTEN 0      511                                 @unitd.ruby.sock 523349                           * 0

$ curl --abstract-unix-sock unitd.python.sock localhost
Hello, Python on Unit!
$ curl --abstract-unix-sock unitd.asgi.sock localhost
Hello, ASGI
$ curl --abstract-unix-sock unitd.static.sock localhost
idx
$ curl --abstract-unix-sock unitd.perl.sock localhost
Hello, Perl on Unit!
$ curl --abstract-unix-sock unitd.php.sock localhost
Hello, PHP on Unit!

(I cheated a bit and added newlines to the output above, since some apps weren't producing a trailing newline, but that's fair.)

ac000 · 2022-08-18T13:32:40Z

Just to confirm that with "Removed trailing \0 in abstract sockets." split out from "Fixed support for abstract Unix sockets." that if you specify a socket as unix:@magic that you get an abstract socket of '\0magic\0' aka '@magic@' which is somewhat unusable, is expected behaviour?

If so, then perhaps that commit should come immediately after the first one, otherwise there's a small window where abstract sockets can work but need to be specified with a trailing \0' to where they are just specified as normal.

Was just thinking if this could potentially cause an issue with git-bisect(1) where you could end up in the middle of this patch set?... if you're debugging an issue (any issue) between the version that fixed abstract sockets and say the following version and you're actually using abstract sockets...

alejandro-colomar · 2022-08-18T13:45:24Z

Just to confirm that with "Removed trailing \0 in abstract sockets." split out from "Fixed support for abstract Unix sockets." that if you specify a socket as unix:@magic that you get an abstract socket of '\0magic\0' aka '@magic@' which is somewhat unusable, is expected behaviour?

That is fixed now (indirectly) by 739cef5 "Storing abstract sockets with @ internally.".
The pytests seem to work after it. Could you confirm?

alejandro-colomar · 2022-08-18T13:47:48Z

If so, then perhaps that commit should come immediately after the first one, otherwise there's a small window where abstract sockets can work but need to be specified with a trailing \0' to where they are just specified as normal.

Was just thinking if this could potentially cause an issue with git-bisect(1) where you could end up in the middle of this patch set?... if you're debugging an issue (any issue) between the version that fixed abstract sockets and say the following version and you're actually using abstract sockets...

I wouldn't mind merging both patches. But currently, the fix is consecutive (any of the 2nd and 4th patches do fix that, in different ways), so I guess a bisect would be fine too.

ac000 · 2022-08-18T15:46:05Z

Yes. I had missed that the 2nd patch also fixes the trailing '\0' in the '@socket' case.

So my only suggestion would be to perhaps add a little wording to the first two commits about the trailing '0' issue and it being fixed (in the '@socket' case).

ac000 · 2022-08-18T16:01:56Z

That is fixed now (indirectly) by 739cef5 "Storing abstract sockets with @ internally.". The pytests seem to work after it. Could you confirm?

Yes, both manual testing and pytests pass with just the first two commits.

alejandro-colomar · 2022-08-18T16:53:51Z

Great. Although in FreeBSD pytests fail, because they try to use @foo as an abstract socket (and expect a failure), when it's interpreted as a file socket. So in the end we'll require the first 3 patches.

Unix domain sockets are normally backed by files in the filesystem. This has historically been problematic when closing and opening again such sockets, since SO_REUSEADDR is ignored for Unix sockets (POSIX left the behavior of SO_REUSEADDR as implementation-defined, and most --if not all-- implementations decided to just ignore this flag). Many solutions are available for this problem, but all of them have important caveats: - unlink(2) the file when it's not needed anymore. This is not easy, because the process that controls the fd may not be the same process that created the file, and may not have file permissions to remove it. Further solutions can be applied to that caveat: - unlink(2) the file right after creation. This will remove the pathname from the filesystem without closing the socket (it will continue to live until the last fd is closed). This is not useful for us, since we need the pathname of the socket as its interface. - chown(2) or chmod(2) the directory that contains the socket. For removing a file from the filesystem, a process needs write permissions in the containing directory. We could put sockets in dummy directories that can be chown(2)ed to nobody. This could be dangerous, though, as we don't control the socket names. It is our users who configure the socket name in their configuration, and so it's easy that they don't understand the many implications of not chosing an appropriate socket pathname. A user could unknowingly put the socket in a directory that is not supposed to be owned by user nobody, and if we blindly chown(2) or chmod(2) the directory, we could be creating a big security hole. - Ask the main process to remove the socket. This would require a very complex communication mechanism with the main process, which is not impossible, but let's avoid it if there are simpler solutions. - Give the child process the CAP_DAC_OVERRIDE capability. That is one of the most powerful capabilities. A process with that capability can be considered root for most practical aspects. Even if the capability is disabled for most of the lifetime of the process, there's a slight chance that a malicious actor could activate it and then easily do serious damage to the system. - unlink(2) the file right before calling bind(2). This is dangerous because another process (for example, another running instance of unitd(8)), could be using the socket, and removing the pathname from the filesystem would be problematic. To do this correctly, a lot of checks should be added before the actual unlink(2), which is error-prone, and difficult to do correctly, and atomically. - Use abstract-namespace Unix domain sockets. This is the simplest solution, as it only requires accepting a slightly different syntax (basically a @ prefix) for the socket name, to transform it into a string starting with a null byte ('\0') that the kernel can understand. The patch is minimal. Since abstract sockets live in an abstract namespace, they don't create files in the filesystem, so there's no need to remove them later. The kernel removes the name when the last fd to it has been closed. One caveat is that only Linux currently supports this kind of Unix sockets. Of course, a solution to that could be to ask other kernels to implement such a feature. Another caveat is that filesystem permissions can't be used to control access to the socket file (since, of course, there's no file). Anyone knowing the socket name can access to it. The only method to control access to it is by using network_namespaces(7). Since in unitd(8) we're using 0666 file sockets, abstract sockets should be no more insecure than that (anyone can already read/write to the listener sockets). - Ask the kernel to implement a simpler way to unlink(2) socket files when they are not needed anymore. I've suggested that to the <linux-fsdevel@vger.kernel.org> mailing list, in: <lore.kernel.org/linux-fsdevel/0bc5f919-bcfd-8fd0-a16b-9f060088158a@gmail.com/T> In this commit, I decided to go for the easiest/simplest solution, which is abstract sockets. In fact, we already had partial support. This commit only fixes some small bug in the existing code so that abstract Unix sockets work: - Don't chmod(2) the socket if it's an abstract one. This fixes the creation of abstract sockets, but doesn't make them usable, since we produce them with a trailing '\0' in their name. That will be fixed in the following commit. This closes #669 issue on GitHub.

We accept both "\u0000socket-name" and "@socket-name" as abstract unix sockets. The first one is passed to the kernel pristine, while the second is transformed '@'->'\0'. The commit that added support for unix sockets accepts both variants, but we internally stored it in the same way, using "\u0000..." for both. We want to support abstract sockets transparently to the user, so that if the user configures unitd with '@', if we receive a query about the current configuration, the user should see the same exact thing that was configured. So, this commit avoids the transformation in the internal state file, storing user input pristine, and we only transform the '@' for a string that will be used internally (not user-visible). This commit (indirectly) fixes a small bug, where we created abstract sockets with a trailing '\0' in their name due to calling twice nxt_sockaddr_parse() on the same string. By calling that function only once with each copy of the string, we have fixed that bug.

The previous commit added/fixed support for abstract Unix domain sockets on Linux with a leading '@' or '\0'. To be consistent in all platforms, treat those prefixes as markers for abstract sockets in all platforms, and fail if abstract sockets are not supported by the platform. That will avoid mistakes when copying a config file from a Linux system and using it in non-Linux, which would surprisingly create a normal socket.

Sockets starting with '\0' had a trailing '\0' character. That made it hard to use tools like curl(1) with them.

alejandro-colomar · 2022-09-05T12:03:53Z

Merged (except for Removed trailing \0 in abstract sockets.).

alejandro-colomar requested review from ac000, hongzhidao and tippexs July 28, 2022 14:47

alejandro-colomar mentioned this pull request Jul 28, 2022

Socket: replacement of called functions, permissions, reusability #670

Closed

alejandro-colomar changed the title ~~Added support for abstract Unix sockets.~~ Added (fixed) support for abstract Unix sockets. Aug 4, 2022

ac000 reviewed Aug 4, 2022

View reviewed changes

ac000 approved these changes Aug 4, 2022

View reviewed changes

ac000 approved these changes Aug 9, 2022

View reviewed changes

alejandro-colomar mentioned this pull request Aug 10, 2022

unlink(2) Unix sockets after use. #735

Closed

alejandro-colomar added 4 commits August 18, 2022 18:55

Removed trailing \0 in abstract sockets.

e3e4efd

Sockets starting with '\0' had a trailing '\0' character. That made it hard to use tools like curl(1) with them.

alejandro-colomar closed this Sep 5, 2022

Added (fixed) support for abstract Unix sockets. #740

Added (fixed) support for abstract Unix sockets. #740

Uh oh!

Conversation

alejandro-colomar commented Jul 28, 2022

Uh oh!

alejandro-colomar commented Jul 28, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alejandro-colomar commented Jul 28, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alejandro-colomar commented Jul 28, 2022

Uh oh!

alejandro-colomar commented Jul 28, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hongzhidao commented Jul 28, 2022

Uh oh!

alejandro-colomar commented Jul 28, 2022

Uh oh!

alejandro-colomar commented Jul 28, 2022

Uh oh!

hongzhidao commented Jul 28, 2022

Uh oh!

ac000 commented Jul 28, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ac000 commented Jul 28, 2022

Uh oh!

lcrilly commented Jul 28, 2022

Uh oh!

alejandro-colomar commented Jul 29, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ac000 commented Jul 29, 2022

Uh oh!

ac000 commented Aug 2, 2022

Uh oh!

hongzhidao commented Aug 2, 2022

Uh oh!

alejandro-colomar commented Aug 3, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alejandro-colomar commented Aug 3, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alejandro-colomar commented Aug 4, 2022

Uh oh!

alejandro-colomar commented Aug 4, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alejandro-colomar commented Aug 4, 2022

Uh oh!

alejandro-colomar commented Aug 4, 2022

Uh oh!

alejandro-colomar commented Aug 4, 2022

Uh oh!

ac000 Aug 4, 2022

Choose a reason for hiding this comment

Uh oh!

alejandro-colomar Aug 5, 2022

Choose a reason for hiding this comment

Uh oh!

hongzhidao Aug 8, 2022

Choose a reason for hiding this comment

Uh oh!

alejandro-colomar Aug 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ac000 left a comment

Choose a reason for hiding this comment

Uh oh!

alejandro-colomar commented Aug 5, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ac000 commented Aug 8, 2022

Uh oh!

alejandro-colomar commented Aug 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

alejandro-colomar commented Jul 28, 2022 •

edited

Loading

alejandro-colomar commented Jul 28, 2022 •

edited

Loading

alejandro-colomar commented Jul 28, 2022 •

edited

Loading

ac000 commented Jul 28, 2022 •

edited

Loading

alejandro-colomar commented Jul 29, 2022 •

edited

Loading

alejandro-colomar commented Aug 3, 2022 •

edited

Loading

alejandro-colomar commented Aug 3, 2022 •

edited

Loading

alejandro-colomar commented Aug 4, 2022 •

edited

Loading

alejandro-colomar Aug 8, 2022 •

edited

Loading

alejandro-colomar commented Aug 5, 2022 •

edited

Loading

alejandro-colomar commented Aug 8, 2022 •

edited

Loading

alejandro-colomar commented Aug 8, 2022 •

edited

Loading

alejandro-colomar commented Aug 8, 2022 •

edited

Loading

alejandro-colomar commented Aug 8, 2022 •

edited

Loading

alejandro-colomar commented Aug 8, 2022 •

edited

Loading

alejandro-colomar commented Aug 8, 2022 •

edited

Loading

alejandro-colomar Aug 9, 2022 •

edited

Loading

alejandro-colomar commented Aug 17, 2022 •

edited

Loading

alejandro-colomar commented Aug 18, 2022 •

edited

Loading

alejandro-colomar commented Aug 18, 2022 •

edited

Loading

ac000 commented Aug 18, 2022 •

edited

Loading