Socket: replacement of called functions, permissions, reusability #670
Conversation
Due to the mandatory presence of error messages: nginx#670 (comment)
…me fixes Due to the mandatory presence of error messages: - nginx#670 (comment) - nginx#670 (comment)
    nxt_uint_t level, nxt_uint_t sockopt);
NXT_EXPORT nxt_int_t nxt_socket_setsockopt(nxt_task_t *task, nxt_socket_t s,
    nxt_uint_t level, nxt_uint_t sockopt, int val);
NXT_EXPORT nxt_int_t nxt_socket_bind(nxt_task_t *task, nxt_socket_t s,
Note that you can use nxt_socket_errno after calling nxt_socket_bind(), so it's not a good idea to add one more *errnum parameter. It's better to keep the API, including its parameters, as minimal as possible.
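For illustration, the usual pattern is to capture the error immediately after the failing call, before anything else can overwrite errno (a plain POSIX sketch, not the Unit code; the function name is made up):

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* errno is only reliable right after the failing call, so it is saved into
 * a local variable before any logging call that might itself change it. */
static int
bind_or_report(int s, const struct sockaddr *sa, socklen_t len)
{
    if (bind(s, sa, len) == -1) {
        int  err = errno;                         /* capture first */

        fprintf(stderr, "bind(%d) failed: %s\n", s, strerror(err));
        return -1;
    }

    return 0;
}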
I did that before, but my concern is that any other function call (such as nxt_alert) could theoretically change errno. I think this is probably why nxt_listen_socket_create uses plain listen and connect where errno matters.
I mean (see "Notes"):
https://linux.die.net/man/3/errno
@hongzhidao, what do you think about what I described above?
It makes sense, but the decision about where listen should be performed affects these changes.
I want to draw your attention to the fact that in this conversation we are discussing errno, not listen. Sorry, I just don't see the relationship between the two.
@hongzhidao Let's get this straight. Is it still safe to call nxt_socket_errno after nxt_alert and other calls?
    nxt_sockaddr_text(sa);

    (void) unlink(sa->u.sockaddr_un.sun_path);
Well, I think the previous approach is good enough.
If you're worried that the temp file might already exist (e.g. created manually), that happens very rarely.
We can simply use unlink(sa->u.sockaddr_un.sun_path).
My thinking is that we will need to do more work on this feature in the future to clean up socket files anyway.
So we can just keep the previous code here.
Your approach may be useful in the future, but for now I prefer not to introduce it.
This improvement was motivated by this comment:
#649 (comment)
Yes, this function does not check whether the socket has been abandoned. But that, as is generally recommended for similar cases on Linux, requires a lock file, and I don't think we will get to that soon.
In addition to my previous answer, I want to note that ignoring the error returned by (void) unlink here may cause bind to fail later with an error (EADDRINUSE) that is unrelated to the real cause of the failure (e.g. EACCES on the unlink itself), but is merely a consequence of calling bind after the failed unlink.
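To illustrate the point (a plain POSIX sketch, not the Unit code; the helper name is invented): checking the result of unlink and ignoring only the "no such file" case keeps the real cause visible instead of the later, misleading EADDRINUSE from bind:

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Remove a stale socket file before bind(); only ENOENT is ignored, so a
 * failure such as EACCES is reported here, where it actually happened. */
static int
remove_stale_socket(const char *path)
{
    if (unlink(path) == -1 && errno != ENOENT) {
        fprintf(stderr, "unlink(%s) failed: %s\n", path, strerror(errno));
        return -1;
    }

    return 0;
}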
Good morning @mar0x,
Since you suggested this feature earlier, please let me know what you think of my proposed implementation. Thank you!
static nxt_int_t
nxt_main_listening_socket(nxt_sockaddr_t *sa, nxt_listening_socket_t *ls)
It seems you didn't notice that the router process is different from the controller process.
The controller has the privilege to bind its listening socket, and it then performs the listen operation itself.
The router has no such privilege, so it has to ask the main process to bind the listening sockets from the configuration. The main process does not call listen; the router process does.
There are two operations: bind and listen.
controller process: bind + listen
router process: ask main to bind, then listen by itself.
So it's wrong to reuse nxt_listen_socket_create().
Sorry, I didn't really notice that difference.
I asked @VBart earlier about the need to keep the separation between nxt_listen_socket and nxt_main_listening_socket.
My question and his answer:
#649 (comment)
The listening socket in the controller process is in charge of configuration management.
The listening sockets in the router process are in charge of request traffic.
So the listen operation, which depends on the listener options, needs to be done in the router process.
Thanks for your explanation!
The sequence and reasons for calling via RPC between main <-> router are clear to me.
Please explain: I don't understand why nxt_listen_socket is executed in the router, in nxt_router_listen_socket_ready. In fact, this happens immediately after the socket is created and after the response from main -> router. If an error occurs in the RPC handler in main, the router simply will not receive the fd from main and will not interact with that socket. From the point of view of parameters (backlog, TCP_DEFER_ACCEPT and so on), the controller and router sockets look identical to me.
In both cases:
controller -> nxt_listen_socket_create
router -> main (nxt_main_listening_socket) -> router (nxt_router_listen_socket_ready)
the same socket setup happens; in the router case it is simply split into two parts and finishes in nxt_router_listen_socket_ready. Is the only point that the socket must be put into passive mode only after the RPC exchange between the processes has completed successfully in the router? Does that make any real difference?
If so, then I suggest keeping the current split and just adding unlink before bind in nxt_main_listening_socket, solving this problem in the same way as it is already solved in nxt_listen_socket_create. In the future we can replace unlink with a more secure solution, if nxt_socket_release_by_path does not suit us at the moment for one reason or another.
As far as I know, listen affects the socket itself and is not tied to the state of a particular process. In my understanding, listen simply switches the socket to passive mode, so that incoming connections are queued for processing until they are picked up by accept. So I can't see a direct relationship between the traffic (which the router handles), the router process itself, and the listen call, which simply changes the socket's behavior.
You are right, listen only changes the bound TCP socket into a listening state.
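For reference, a minimal self-contained illustration of that statement (plain POSIX, not Unit code; the socket path is just an example):

#include <err.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int
main(void)
{
    int                 s;
    struct sockaddr_un  sun = {0};

    s = socket(AF_UNIX, SOCK_STREAM, 0);
    if (s == -1)
        err(EXIT_FAILURE, "socket");

    sun.sun_family = AF_UNIX;
    strncpy(sun.sun_path, "/tmp/demo.sock", sizeof(sun.sun_path) - 1);

    if (bind(s, (struct sockaddr *) &sun, sizeof(sun)) == -1)
        err(EXIT_FAILURE, "bind");

    /* listen() only marks the already bound socket as passive; nothing
     * about the calling process is stored in the socket itself. */
    if (listen(s, 5) == -1)
        err(EXIT_FAILURE, "listen");

    close(s);
    unlink("/tmp/demo.sock");
    exit(EXIT_SUCCESS);
}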
If so, then I suggest keeping the current split and just adding unlink before bind in nxt_main_listening_socket, thus solving this problem in the same way as it is already implemented in nxt_listen_socket_create. And in the future, we can replace unlink with a more secure solution, if at the moment nxt_socket_release_by_path does not suit us for one reason or another.

I agree to use unlink now, but we need to reuse the logic in nxt_listen_socket_create(): create a temp file, then bind. Otherwise, if the path is bound by another instance such as nginx, it's bad to unlink it.

Then we need to split nxt_listen_socket_create into the code before and after the listen call, right?
The logic of "create a temp path for the Unix domain socket" and "test connecting after listening" should run in the same process. If bind happens in the main process but listen happens in the router process, that logic is broken. We'll discuss it first and tell you once it's clear.
Now I don't understand at all. This is the current version of the existing code in the origin master branch. Do you think it's incorrect?
I think it's right. It makes sense that the router process performs the listen operation once all the listeners have been bound in the main process.
What I'm not sure about is whether the listen operation can be moved from the router process to the main process. I hadn't noticed this background before.
nxt_alert(task, "listen(%d, %d) failed %E",
          s, ls->backlog, nxt_socket_errno);
goto listen_fail;

if (ls->start == NULL) {
@hongzhidao, how do you like this solution? It is fully consistent with the current logic of where listen is called and how the router works.
I think the current patch is based on the logic of the controller process, since you used nxt_listen_socket_create(), but it seems to require that the main process perform the listen operation, and right now I don't know whether that is reasonable.
My suggestion is to make this clear first. Please be patient with the review.
The point I noted is that nxt_listen_socket_create does not perform listen when it is called from the router, but it does perform it for the controller. That is, this logic matches the logic that already exists (unchanged) in the master branch.
I produce new variants based on your suggestions and comments as they come in, so that we can immediately see the result and converge on the best solution.
- If we perform listen in the main process before all sockets are bound, clients can be accepted while there is not yet any process to serve them. That can happen in the real world.
- If we perform listen in the router process, the logic in nxt_listen_socket_create is broken, because we don't check the listening connection.
// The below logic is required after `listen`.
#if (NXT_HAVE_UNIX_DOMAIN)

    if (orig_sa != NULL) {
        ts = nxt_socket_create(task, AF_UNIX, SOCK_STREAM, 0, 0);
        if (ts == -1) {
            goto listen_fail;
        }

        ret = connect(ts, &orig_sa->u.sockaddr, orig_sa->socklen);
        err = nxt_socket_errno;

        nxt_socket_close(task, ts);

        if (ret == 0) {
            nxt_alert(task, "connect(%d, %*s) succeed, address already in use",
                      ts, (size_t) orig_sa->length,
                      nxt_sockaddr_start(orig_sa));
            goto listen_fail;
        }
I sincerely appreciate your corrections, thank you for this!
You could also move this connect check into a separate function, something like nxt_listen_socket, and call it separately from the router and from nxt_listen_socket_create when ls->start == NULL.
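A possible shape for such a helper (the name and signature here are hypothetical, only meant to illustrate the suggestion; it simply factors out the post-listen connect() probe quoted above):

static nxt_int_t
nxt_listen_socket_check(nxt_task_t *task, nxt_sockaddr_t *orig_sa)
{
    int           ret;
    nxt_socket_t  ts;

    /* Try connecting to the original address; a successful connect()
     * means another listener already owns it. */
    ts = nxt_socket_create(task, AF_UNIX, SOCK_STREAM, 0, 0);
    if (ts == -1) {
        return NXT_ERROR;
    }

    ret = connect(ts, &orig_sa->u.sockaddr, orig_sa->socklen);

    nxt_socket_close(task, ts);

    if (ret == 0) {
        nxt_alert(task, "connect(%d, %*s) succeed, address already in use",
                  ts, (size_t) orig_sa->length, nxt_sockaddr_start(orig_sa));
        return NXT_ERROR;
    }

    return NXT_OK;
}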
Sorry, but personally I can't agree with this new approach.
- It seems the logic is broken:
static void
nxt_router_conf_error(nxt_task_t *task, nxt_router_temp_conf_t *tmcf)
{
    ...

    for (qlk = nxt_queue_first(&creating_sockets);
         qlk != nxt_queue_tail(&creating_sockets);
         qlk = nxt_queue_next(qlk))
    {
        skcf = nxt_queue_link_data(qlk, nxt_socket_conf_t, link);
        s = skcf->listen->socket;

        if (s != -1) {
            nxt_socket_close(task, s);
        }

        nxt_free(skcf->listen);
    }
- It's strange that the router process has to maintain one more piece of socket information, about rename. I think it makes things complicated.
We think the ideal solution is to clean up all the unused unix domain socket files.
This is our internal plan for the future.
Thanks for your correction.
I'll fix nxt_router_conf_error; thanks for bringing this to my attention.
But rename cannot be performed directly from the router, as it does not have sufficient permissions for this operation. I have tested this personally.
The whole difficulty is that you want the controller and the router to behave the same way on top of a single nxt_listen_socket_create, depending on where in the code listen is called, while also providing all the necessary checks and operations after listen (connect, rename).
Please tell me more about your ideal solution. I want to note that cleaning up unused (abandoned) sockets and the problem of reusing sockets are, at the moment, completely different problems.
Please tell me more about your ideal solution. I want to note that cleaning up unused (abandoned) sockets and the problem of reusing sockets at the moment are completely different problems.
If we can clean up unused sockets, then we don't need the approach borrowed from the controller, even though it's more robust. To clean up unused sockets, we need to collect the sockets pending deletion in the router process and then ask the controller/main process to delete them.
But rename cannot be performed directly from router, as it does not have sufficient permissions to perform this operation. This moment is tested by me personally.
Right, I agree that this approach works, but it makes the router more complicated. The router code is already complicated; I'd prefer not to make it bigger...
Please don't take this as excessive persistence or obsession; I'm trying to solve a real problem I'm facing in my projects.
The problem is that removing abandoned router sockets during a successful (soft) shutdown may not solve the problem of reusing sockets when the program terminates unexpectedly.
Removing existing socket files has to be done somewhere in the socket creation process, since the socket may have been abandoned by a previous unexpected or early termination of the program.
Previously (#670 (comment)) I suggested not using nxt_listen_socket_create and adding unlink before bind in nxt_main_listening_socket, but you (#670 (comment)) said that this should be abandoned in favor of nxt_listen_socket_create, which involves communication with the controller and the need to separate the logic of listen, connect and rename.
When using cloud platforms, the orchestration layer sends a termination signal to the container and expects it to complete all the necessary procedures to close existing processes, connections, sockets, etc. If the application does not manage to do this, the container is forcefully terminated within about 10 seconds of the signal.
I understand that in the case of Unit, 10 seconds is more than enough to terminate all processes; however, unexpected terminations may not be caused by a signal from the orchestration layer at all.
Sorry that personally, I can't agree this new way.
- It seems the logic is broken.
Seems to be fixed now: 1497d19
Access permissions for unix sockets should be 0666 and controlled by the permissions set on the directory. Discussion: nginx#647
Hello Alex (@alejandro-colomar), Glad I was able to answer your question!
I'm using
Thank you for the detailed explanation of how
As @VBart previously mentioned, I see
This is my vision and use of unit; of course, it may differ from that of the vast majority.
Sorry for highlighting the text, I just wanted to emphasize what I think are the most important thoughts and improve readability.
I am always glad to have any discussions and to work with you and your team.
Regards,
Ok, so Linux.
But normal Unix sockets are already restricted to the same host (AFAIK). I don't think there are extra restrictions in this regard by using abstract Unix sockets. Do you have an example application that can work with normal Unix sockets in a multi-host environment but would stop working if the socket were an abstract one?
Usually it is. I don't understand how you could grant access to connect to an abstract socket from outside the host (virtualization unit) on which it is running. I consider it correct to treat multiple connections to a file socket and actual external access to an abstract socket as different problems. Perhaps I'm missing something; I admit it.

With a file socket, this option is definitely possible (shared), and multiple connections to such a socket are not required in most standard scenarios; if they are required and technically unavailable (limited), then multiple connections can be provided via a TCP/IP socket.

Earlier, in discussions with @VBart, I said that I do not understand the logic of the F5 Networks & Nginx Inc R&D teams, which boils down to not using file sockets in the Unit product regardless of the reasons, such as the one you indicated (as I read it): since access is 666 for file sockets, we can only use abstract ones. But no, we can't (in my humble opinion). If you take nginx, it has supported all possible connection types from time immemorial, not to mention the larger number of protocols implemented in it.

I would like to recall the whole history of this problem:
Also, special thanks to @tippexs for leading the process.

As a result, as I understand it, the unit team has come to question whether file sockets need to be implemented and maintained at all, which seems a little strange to me, as does the fact that I repeatedly give detailed usage scenarios (it's not difficult for me, I'm glad to help), which I have never seen from the key developers of unit (@VBart), apart from statements that real users do not need this. Apparently, in their opinion, I am not real or I am not a user of the unit product. I don't blame anyone, but from the outside it looks, to put it mildly, like ordinary sabotage of the project's development by individual developers, leading them to the decision to refuse to implement file sockets for reasons other than the impossibility of implementation or the absence of real demand.

I did not see answers in this PR from @mar0x and @VBart, who participated in the discussions of this problem and its solution at the very beginning and set the direction, but subsequently completely avoided discussing it. I also want to note that after several months of waiting, I never received the promised answer with the vision of the future of the Unit project, which @lcrilly promised me, asking me to write a letter to his mail: liam@nginx.com, which he kindly provided earlier and to which I wrote on 03/31/2022. Nevertheless, I respect his busy schedule and understand that my letter could simply have been lost.

I want to clarify that I am for the development and bright future of the unit product, which is why I have spent a significant amount of time trying to help this project and continue to do so. Perhaps I am wasting my time, as I have no desire to resist the unit team's overall vision of product development, which changes due to circumstances unknown to me. Regardless of the above, I am full of hopes and expectations for the unit project. I have great respect for absolutely all of the project's contributors, and I am glad to interact with each of them regardless of the circumstances, as I believe the future of the unit project is the key thing in this whole story, and we should probably all concentrate on joining forces and reaching consensus when working on it rather than developing disagreements over it.
I'm not saying we're not going to implement Unix sockets. We are. And I'm about to commit the changes into the master branch (I wanted to clear up a few discussions before that, but the code is ready). What I'm saying is that I want to support both variants of the Unix domain socket: filesystem and abstract. Both are equivalent from a user point of view: filesystem sockets would be used as

The other difference between filesystem sockets and abstract sockets is that with filesystem sockets you'll have the problem that your filesystem will have files that you'll have to clean up manually, while abstract sockets are cleaned up by the kernel automatically (a good thing). You shouldn't notice any other differences (as long as the application that wants to talk to the socket also supports abstract sockets; NGINX still doesn't, but I'm discussing with them the possibility of adding support for it).
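For what it's worth, the only programmatic difference between the two variants is how the address is filled in (a plain POSIX sketch; the paths and names are just examples): a filesystem socket uses a NUL-terminated path, while an abstract one uses a leading NUL byte in sun_path.

#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

static void
fill_addresses(struct sockaddr_un *fs, struct sockaddr_un *abst)
{
    memset(fs, 0, sizeof(*fs));
    memset(abst, 0, sizeof(*abst));

    /* Filesystem socket: a regular path, visible with ls(1). */
    fs->sun_family = AF_UNIX;
    strncpy(fs->sun_path, "/var/run/unit/listen.sock", sizeof(fs->sun_path) - 1);

    /* Abstract socket: first byte of sun_path stays '\0'; the name exists
     * only in the kernel's abstract namespace and needs no cleanup. */
    abst->sun_family = AF_UNIX;
    memcpy(abst->sun_path + 1, "unit/listen.sock", sizeof("unit/listen.sock") - 1);
}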
Thanks for your detailed answer. Sorry, maybe I misread or misunderstood your answer (I fully admit it). But I think you and I understand perfectly well that the implementation of abstract and file sockets you describe is, for the most part, already available at the current time (only through

My question about the file socket implementation you describe is: why does the user need to manually figure out where the abandoned file socket is, track the state of unit, and delete the file socket that unit automatically created at startup and simply abandoned when the program ended? Moreover, without deleting the file socket, unit cannot be restarted.
I didn't solve the problem with Unix domain filesystem sockets because:
So I don't see much benefit in cleaning the files within unit, considering the complexity it imposes, and the available alternatives. It's not fully discarded, but I don't like the idea.
Thanks for the options provided.
Please forget about abstract sockets in the context of this PR; I didn't raise this question at all until you first did.
You are asking me to show the environment that I have already described at least twice before (in this PR and before that in a discussion with @VBart in another PR). I'm talking about two Pods with roles (nginx and unit) inside a Kubernetes cluster communicating over a file socket (shared). How could you implement this exchange (communication between two applications) through an abstract socket? Please explain this to me.
I'm relying on Google's advice, which is a good reflection of why using managers like Moreover, I don't use
I think it doesn't matter what we personally like and what we don't when we are talking about the needs of the business, expressed in the requirements for the product that the end user needs.
Exactly as you would with unix sockets. See an example here. Please feel free to ask any questions if you don't understand something.

alx@asus5775:~/tmp$ cat srv_a.c
#include <err.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#define MY_SOCK_NAME "@magic"
#define MY_LISTEN_BACKLOG 50

int main(void)
{
    int sfd, cfd;
    char buf[7];
    socklen_t len;
    struct sockaddr_un sun = {0}, cun;

    sfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sfd == -1)
        err(EXIT_FAILURE, "socket(2)");

    sun.sun_family = AF_UNIX;
    strncpy(sun.sun_path, MY_SOCK_NAME, sizeof(sun.sun_path) - 1);
    sun.sun_path[0] = '\0';
    len = sizeof(sa_family_t) + strlen(MY_SOCK_NAME);

    if (bind(sfd, (struct sockaddr *) &sun, len) == -1)
        err(EXIT_FAILURE, "bind(2)");

    if (listen(sfd, MY_LISTEN_BACKLOG) == -1)
        err(EXIT_FAILURE, "listen(2)");

    len = sizeof(cun);
    cfd = accept(sfd, (struct sockaddr *) &cun, &len);
    if (cfd == -1)
        err(EXIT_FAILURE, "accept(2)");

    if (read(cfd, buf, sizeof(buf)) != sizeof(buf))
        err(EXIT_FAILURE, "read(2)1");

    if (write(STDOUT_FILENO, buf, sizeof(buf)) == -1)
        err(EXIT_FAILURE, "write(2)");

    if (read(cfd, buf, 1) != 0)
        err(EXIT_FAILURE, "read(2)2");

    if (close(cfd) == -1)
        err(EXIT_FAILURE, "close(cfd)");

    if (close(sfd) == -1)
        err(EXIT_FAILURE, "close(sfd)");

    exit(EXIT_SUCCESS);
}

alx@asus5775:~/tmp$ cat cli_a.c
#include <err.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#define MY_SOCK_NAME "@magic"

int main(void)
{
    int sfd;
    socklen_t len;
    struct sockaddr_un sun = {0};

    sfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sfd == -1)
        err(EXIT_FAILURE, "socket(2)");

    sun.sun_family = AF_UNIX;
    strncpy(sun.sun_path, MY_SOCK_NAME, sizeof(sun.sun_path) - 1);
    sun.sun_path[0] = '\0';
    len = sizeof(sa_family_t) + strlen(MY_SOCK_NAME);

    if (connect(sfd, (struct sockaddr *) &sun, len) == -1)
        err(EXIT_FAILURE, "connect(2)");

    if (write(sfd, "hello\n", 7) != 7)
        err(EXIT_FAILURE, "write(2)");

    if (close(sfd) == -1)
        err(EXIT_FAILURE, "close(sfd)");

    exit(EXIT_SUCCESS);
}

alx@asus5775:~/tmp$ cc -Wall -Wextra srv_a.c -o srv_a
alx@asus5775:~/tmp$ cc -Wall -Wextra cli_a.c -o cli_a
alx@asus5775:~/tmp$ ./srv_a &
[1] 2885
alx@asus5775:~/tmp$ ./cli_a
hello
[1]+ Done ./srv_a

(Thanks to @ac000, who pointed out a bug in the above program. I fixed it.)
Thanks for your explanation. But in the previous sentence you refer to, I gave a specific example that involves interacting with a socket from two independent containers.
As far as I remember, the |
That picture helps a lot.

So, I've been testing isolating my server and client programs in Linux namespaces, and I found that the only thing that causes the socket to be invisible is having both processes in different network namespaces. That's consistent with what network_namespaces(7) describes:

Are both pods in the same network Linux namespace? If not, is it something that you can change?

(The following sentence was incorrect, after investigation. Different pods are in different net namespaces, and although that can be relaxed, it would be a security issue.)
Hello Alex, I apologize for the wording "interprocessor communication" - I meant communication between two processes within the same host. I formulated it poorly, since I wrote the answer and drew the picture late at night. As far as I remember, the attached filesystem driver (attached file system, shared) is provided by the Kubernetes tooling and is not connected to the network in any way. This works in much the same way as mounted volumes in Docker, although in my humble opinion it is more efficient.
Don't worry, with the picture I got a perfect idea of what you have. So, I've been investigating kubernetes, and you're right, pods are isolated from each other regarding networking (kubernetes implements this by putting the processes running in them in different Linux network namespaces). So, if you have nginx and unit in different pods, it's impossible to use abstract sockets. But if you put the two containers in the same pod (that's possible), they will share a network namespace, and they'll be able to communicate that way between them. Would you be able to use a single pod with two containers? As you've explained it to me, it seems that the relation of nginx containers to unit containers will be 1:1, right? If that's the case, using the same pod would make sense in terms of scalability, since you could just increase the multi-container pod instances easily. It would also remove the need for using a shared filesystem mountpoint. I think it would be even better in terms of isolation.
@echolimazulu good to see you again hanging around! Thank you very much for all the work you put into this topic! We really appreciate it! The final infrastructure is clear to me now, but could you explain what the use case is for the NGINX (OSS) proxy in front of the Unit instance? I just want to understand what feature is missing. Looking forward to your answer.
Hello @tippexs, Thank you for your attention to this issue and for joining the discussion with @alejandro-colomar and @hongzhidao. Glad to work with you all on this issue! Specifically, I use and I really like the way it is implemented in nginx:
Hello @alejandro-colomar, Thank you for taking the time to explore Kubernetes features and suggestions. Unfortunately, the abstraction option you described: two programs (
Here I give my preference and follow Google's guidelines for Kubernetes containers, namely:
As for scalability, I prefer to consider its possibility at the
Dear Evgenii, I totally agree with you that placing two processes in the same container would be a terrible idea. I am suggesting instead a totally different solution: placing one single process per container, but two containers in the same pod. As you may already know, a pod is a group of containers that share the same network namespace (they are isolated as a whole, so no networking to the outside, but they can network between them). Thank you very much.
I'm referring to this: https://www.mirantis.com/blog/multi-container-pods-and-container-communication-in-kubernetes/ which, you'll notice, mentions as a use case having nginx as a reverse proxy in front of the actual app (unit in this case).
With my srv_a and cli_a programs, the YAML would look something like the following:

apiVersion: v1
kind: Pod
metadata:
  name: server_and_client
spec:
  containers:
  - name: server
    image: srv_a
    command: ["/bin/sh", "-c"]
    args:
    - /usr/local/bin/srv_a
  - name: client
    image: cli_a
    command: ["/bin/sh", "-c"]
    args:
    - sleep 10 && /usr/local/bin/cli_a
Wow, Alex (@alejandro-colomar), this is incredibly useful information, sincerely thank you for this! And yes, it seems to me that this will allow the use of a common namespace to work with
I overlooked this, since I mainly work in abstraction:
But the main guest in this PR is still
I also know of people who use these roles (
If we abandon file sockets altogether, then in my humble opinion it looks like a degradation of the project. If we leave everything as it is and file sockets have to be deleted manually after being created automatically at unit startup, then this is also, in my humble opinion, a degradation of the project.
In fact, when we talk about
I am sincerely glad and grateful to you personally and @tippexs for suggesting options to find a solution, because this is how we can come to a successful and best solution to this problem.
Regards,
I'm happy that this is so useful to you! :-)
Yeah, I know there are scenarios where file sockets might be necessary.
Yes, please share any updates about this; tell me if it works well for you!
When/if we receive a report that can't be adapted to use abstract sockets, this PR will be reconsidered. But for now, we have a queue of difficult things to implement, and will try to keep it simple for now.
No, it's not abandoned. It's just delayed until we really need it.
Still, I agree with you that Unix sockets are a big thing for Unit, and in fact we're right now discussing more uses of unix sockets within Unit (and NGINX too). Cheers, Alex
Good news! Thanks for sharing this! Once again, I want to note that I do not insist on merging this PR. It seems to me that I am standing on the sidelines, acting in the interests of those users who are used to, love and want to use unix sockets in unit, for whom the lack of a proper way to use them can be critical, along with the lack of a proper level of support and implementation on the developer's side at the moment they decide to choose unit as a product for their production solutions.
And isn't this an example of use with the inability to use
Sorry, but the position on your part and that of your colleagues is constantly changing. Sometimes (1) you call it implementation complexity, sometimes (2) a lack of real need on the part of customers, sometimes (3) a lack of real scenarios where it is required (offering workarounds), sometimes (4) useless project legacy, sometimes (5) internal prioritization and a lack of internal implementation capacity. Am I forgetting something here? I wonder what else could be invented to justify not doing this in the months since I reported it and even suggested a possible solution? This is a rhetorical question.
UPDATE: Personally, I am grateful to you for all your attempts to solve the problems I found and the solutions you provided; from my point of view, you are the most motivated and effective employee along with @tippexs and @hongzhidao. Please note that I do not insist on solving this problem! If you do not consider it necessary to solve it, or do not want to do it now, that's fine; I will accept any decision on it. The project is yours, so it's up to you!
I'd like to know a bit more about the case. I admit that I have no idea about ESXi (I for example had some idea about k8s, which is why I could help you). Maybe after some investigation, we can conclude that there's an easier way. I don't know.
Certainly not. I don't even know if you have a subscription to nginx. I care about every user of Unit. And if a feature makes sense to me, balancing the implementation complexity, and its benefits, I'll implement it, independently of who requested/suggested it. Just saying that the complexity of this one is huge, so I want to make sure that we really need it, before taking the time to add it.
I work for NGINX (F5), but my comments here are my own. I don't speak for my colleagues, nor do I share their opinion in some cases. As you've noticed in the past, I have a strong tendency to be minimalist in my patches (even if sometimes I'm criticized for trying to apply too-big patches). You can see that even after having merged several important features into the Unit code, my net line count in the project is negative (+785 -878). I'm just being myself here, and I'm very averse to adding such complex code to a project whose source code I still don't know very well (for some meaning of "very well").
Certainly, it is complex. You wrote the code, so I guess you have an idea. It will be even more complex for me to review your code, since it's completely new code for me.
In your case, we've agreed it's probably simpler for you to use abstract sockets, right?
I know there are scenarios where it is required, but I would like to have some certainty about that necessity from real users (the colleague you mentioned might be one of them).
I don't remember having said that.
This certainly is the biggest issue we have right now.
We're less than a handful of C programmers in the Unit team. I've been in the team just for a few months, and still don't know very well how Unit is implemented internally, so it's hard for me to review this code; please understand that. We've also had important changes, so that's why at some points we've been more silent than other times.
Thanks! Really!
My priority right now will be related to unix sockets for the next release, so I'll be learning how Unit uses them. As I said, for the moment, I'll defer the review of this PR indefinitely. After some more work with unix sockets, I may reconsider it. Please tell your colleagues to open an issue if they want to share a description of their use case, and maybe you or they can convince me that it's necessary enough that I can't avoid it. Cheers, Alex
I am very grateful to you!
I don't separate F5 Networks and Nginx Inc. from their individual employees, just as I do not separate the unit project from other F5 Networks projects. I chose unit because of my good attitude towards the nginx product and its well-deserved reputation, so I expect the same from unit, taking its young age into account. I previously described my vision to your Sr Director of Product Management, pointing out unit's current competitive advantages and disadvantages, but this seems to have been ignored or gone unnoticed. However, after so much time and opposition from the team (not all of its members), providing evidence and discussions, I am still here; but in the near future I plan to close this PR, since unfortunately I will not be able to work on it anymore due to circumstances not related to this PR.
This was not said by you, but by @VBart in other PRs. |
capabilities(7) might be a nice way of solving the unlink(2)ing of the socket files problem, on Linux at least. I don't know if they ever made it to other Unix systems. It never got past the POSIX draft stage (POSIX.1e). Or maybe they have a similar mechanism. The router process could gain the CAP_DAC_OVERRIDE capability, do the unlink(2), then relinquish the capability until next time. I did use capabilities(7) in a project many years ago...
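Roughly, what that could look like with libcap (just a sketch, not Unit code; it assumes the process keeps CAP_DAC_OVERRIDE in its permitted set and links with -lcap):

#include <sys/capability.h>
#include <unistd.h>

/* Raise CAP_DAC_OVERRIDE only around the unlink(2), then drop it again. */
static int
unlink_with_cap(const char *path)
{
    cap_t        caps;
    cap_value_t  cap = CAP_DAC_OVERRIDE;
    int          ret = -1;

    caps = cap_get_proc();
    if (caps == NULL)
        return -1;

    if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap, CAP_SET) == 0
        && cap_set_proc(caps) == 0)
    {
        ret = unlink(path);

        /* Relinquish the capability again, as suggested above. */
        cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap, CAP_CLEAR);
        cap_set_proc(caps);
    }

    cap_free(caps);

    return ret;
}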
I'm not sure it's a safe one. See the rationale in #740. CAP_DAC_OVERRIDE is basically root (together with CAP_SYS_ADMIN, one of the most powerful caps). I'd prefer a safer one, I think.
Heh, I think that's a little extreme. It allows a process to override file system permissions, yes, but it's not as bad as CAP_SYS_ADMIN (that's the "basically root" one), and it would only be held for the duration of the unlink(2).
But then maybe in the end all these schemes aren't much simpler than just sending a message to the main process to and that would work for everybody.
But then nxt_main_process_cleanup() gets called when an application terminates (perhaps not quite what we want)... and it looks like the listen sockets should be tucked away in task->thread->runtime->listen_sockets, but the array holding those sockets always seems to be empty. Even looking in the router process where I'd expect them to be:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007f9633111bfe in epoll_wait (epfd=3, events=0x15e5970, maxevents=32, timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
30 return SYSCALL_CANCEL (epoll_wait, epfd, events, maxevents, timeout);
(gdb) frame 2
#2 0x00000000004134ba in nxt_event_engine_start (engine=0x15dcbd0)
at src/nxt_event_engine.c:549
549 engine->event.poll(engine, timeout);
(gdb) p *task->thread->runtime->listen_sockets
$1 = {
elts = 0x15dc918,
nelts = 0,
size = 48,
nalloc = 1,
mem_pool = 0x15dc060
}
(gdb) p *(nxt_listen_socket_t *)task->thread->runtime->listen_sockets->elts
$2 = {
socket = 0,
backlog = 0,
work_queue = 0x0,
handler = 0x0,
sockaddr = 0x0,
count = 0,
flags = 0 '\000',
read_after_accept = 0 '\000',
ipv6only = 0 '\000',
socklen = 0 '\000',
address_length = 0 '\000'
}
(gdb)
Closing this pull-request. There have been some commits in relation to this issue: ccaad38 & 2e3e1c7. Also, the docker image from https://github.com/echolimazulu/unit-bind-aaiu now runs without failure. (It required the following patch.)

diff --git a/Dockerfile b/Dockerfile
index dd2e87e..d7f2a6b 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -8,7 +8,7 @@ RUN set -ex \
&& mkdir -p /usr/lib/unit/modules /usr/lib/unit/debug-modules \
&& hg clone https://hg.nginx.org/unit \
&& cd unit \
- && hg up 1.26.1 \
+ && hg up 1.31.0 \
&& NCPU="$(getconf _NPROCESSORS_ONLN)" \
&& DEB_HOST_MULTIARCH="$(dpkg-architecture -q DEB_HOST_MULTIARCH)" \
&& CC_OPT="$(DEB_BUILD_MAINT_OPTIONS="hardening=+all,-pie" DEB_CFLAGS_MAINT_APPEND="-Wp,-D_FORTIFY_SOURCE=2 -fPIC" dpkg-buildflags --get CFLAGS)" \
@@ -25,11 +25,11 @@ RUN set -ex \
--libdir=/usr/lib/$DEB_HOST_MULTIARCH" \
&& ./configure $CONFIGURE_ARGS --cc-opt="$CC_OPT" --ld-opt="$LD_OPT" --modules=/usr/lib/unit/debug-modules --debug \
&& make -j $NCPU unitd \
- && install -pm755 build/unitd /usr/sbin/unitd-debug \
+ && install -pm755 build/sbin/unitd /usr/sbin/unitd-debug \
&& make clean \
&& ./configure $CONFIGURE_ARGS --cc-opt="$CC_OPT" --ld-opt="$LD_OPT" --modules=/usr/lib/unit/modules \
&& make -j $NCPU unitd \
- && install -pm755 build/unitd /usr/sbin/unitd \
+ && install -pm755 build/sbin/unitd /usr/sbin/unitd \
&& make clean \
&& ./configure $CONFIGURE_ARGS --cc-opt="$CC_OPT" --modules=/usr/lib/unit/debug-modules --debug \
&& ./configure \

Solution to the problem:
#669
Repository to reproduce the issue:
https://github.com/echolimazulu/unit-bind-aaiu
Hello @VBart, @mar0x, @hongzhidao, @alejandro-colomar, @tippexs.
Please tell me what you think. While testing this fix, I no longer observe the problem I encountered previously.
I tried to take into account all the previously voiced comments regarding the code, commits, indentation and other points.
I did not propose implementing locks, socket cleanup and locking on program shutdown, or cleanup on configuration updates, since that would be a much larger set of changes, addressing several different issues and extending functionality that is not needed for #669.