Skip to content
This repository was archived by the owner on Oct 8, 2025. It is now read-only.

Conversation

@echolimazulu
Copy link

@echolimazulu echolimazulu commented Feb 17, 2022

This patch solves the problem of reusing unix sockets.

Changes:

  • added removal of socket file before "bind" operation. necessary in case of unexpected termination of the program and subsequent restart without errors;
  • added removal of a socket after it is closed in case of errors when creating a socket;
  • minor optimizations and fixes.

Issue #643:
bind(\"unix:/var/run/unit.sock\") failed (98: Address already in use)

Notice:
This PR does not solve the problem of removing unix sockets when receiving program termination signals, such as SIGTERM, SIGQUIT and others. This patch only affects the behavior of unix sockets when starting the program.

- added removal of socket file before "bind" operation. necessary in case of unexpected termination of the program and subsequent restart without errors;
- added removal of a socket after it is closed in case of errors when creating a socket;
- minor optimizations and fixes.

Issue:
bind(\"unix:/var/run/unit.sock\") failed (98: Address already in use)
@echolimazulu
Copy link
Author

echolimazulu commented Feb 17, 2022

Hello @mar0x!
I will be glad to your review, questions and comments. Thanks.

@mar0x
Copy link
Contributor

mar0x commented Feb 17, 2022

Hello,
I'm afraid this approach is unsafe. Imagine one's "mistake" to configure listener's unix socket /usr/sbin/unitd or /etc/passwd... Another bad scenario is removing socket created by other running Unit instance.

Before removing file, Unit need to make sure it's a unix domain socket and it is abandoned now. To achieve this, it is possible to use file locks, try to connect to socket or something else.

@VBart
Copy link
Contributor

VBart commented Feb 18, 2022

There's already a function that does all the necessary checks in order to avoid various unix socket collisions: https://hg.nginx.org/unit/rev/0a8840921fd0, but at the moment it's only used for the control socket.

@mar0x
Copy link
Contributor

mar0x commented Feb 18, 2022

There's already a function that does all the necessary checks in order to avoid various unix socket collisions: https://hg.nginx.org/unit/rev/0a8840921fd0, but at the moment it's only used for the control socket.

Correct. But the control socket creation procedure does not check the file is a socket (which is Ok for control socket). Perhaps for other listener sockets we want to avoid regular files removal to create unix socket, removing old sockets is Ok.

@echolimazulu
Copy link
Author

echolimazulu commented Feb 18, 2022

Hello, I'm afraid this approach is unsafe. Imagine one's "mistake" to configure listener's unix socket /usr/sbin/unitd or /etc/passwd... Another bad scenario is removing socket created by other running Unit instance.

Before removing file, Unit need to make sure it's a unix domain socket and it is abandoned now. To achieve this, it is possible to use file locks, try to connect to socket or something else.

Good morning @mar0x,
Accepted.

Totally agree with you. I have taken note of all your comments. I'm going to make the necessary corrections. Thank you for your comments, it helps me provide a better solution to the problem.

There's already a function that does all the necessary checks in order to avoid various unix socket collisions: https://hg.nginx.org/unit/rev/0a8840921fd0, but at the moment it's only used for the control socket.

Good morning @VBart,
I'm glad you joined us.

The problem of not being able to reopen a unix socket only affects the socket as a listener. What @mar0x mentioned does not apply to the control socket, but user mistake is minimized if it is used. From my point of view, listening sockets require more fine-grained control and maintenance.

@VBart
Copy link
Contributor

VBart commented Feb 18, 2022

Anyway, I suggest to avoid code duplication and add the necessary check (i.e. checking file type after ECONNREFUSED) to the function mentioned above, then reuse it for listening sockets.

@echolimazulu
Copy link
Author

echolimazulu commented Feb 18, 2022

Anyway, I suggest to avoid code duplication and add the necessary check (i.e. checking file type after ECONNREFUSED) to the function mentioned above, then reuse it for listening sockets.

@VBart Healthy thought. I already thought about this, but initially I thought that you were forking the code in these two different files on purpose. The labels in nxt_listen_socket_create will have to be reworked a little, as they currently look like potentially causing an error (nxt_file_delete before nxt_socket_close - without return in listen_fail they are executed sequentially), and passing the access mode to configure a socket with permissions other than 600. I am in the process of solving this problem. If you have any other ideas please let me know. Everything you and @mar0x have suggested is really valuable ideas.

@VBart
Copy link
Contributor

VBart commented Feb 19, 2022

@echolimazulu That split wasn't made by intention. The whole Unit project was forked out of some other project and as a result there are still many leftovers of old code originally intended for different purpose.

@echolimazulu echolimazulu deleted the branch nginx:master March 24, 2022 15:49
@echolimazulu echolimazulu deleted the master branch March 24, 2022 15:49
@echolimazulu
Copy link
Author

echolimazulu commented Mar 25, 2022

@VBart, @mar0x Good morning!

I closed this PR, as in the future I will open a new one, taking into account all the changes.

I have carefully studied and implemented all your suggestions, such as:

  • checking for the necessary permissions when creating a socket file;
  • check for file type matching socket when deleting;
  • creating and checking for the presence of a lock file;
  • use of the functional part of the program from listen_socket with the implementation of the error system when creating a socket from main_process (rpc);
  • deleting socket/lock files when closing the SIGQUIT, SIGKILL program;
  • deleting socket/lock files when changing the program configuration file;
  • setting the correct values ​​of the necessary access rights to the specified files: controller (0600), listeners (0666), locks (0666).

At the moment, I am testing the modified functionality.

The changes will affect the following list of files:

  • nxt_array.h / nxt_array.c
  • nxt_controller.h / nxt_controller.c
  • nxt_errno.h
  • nxt_file.h / nxt_file.c
  • nxt_listen_socket.h / nxt_listen_socket.c
  • nxt_main_process.h / nxt_main_process.c
  • nxt_port.h
  • nxt_process.c
  • nxt_router.h / nxt_router.c
  • nxt_runtime.h / nxt_runtime.c
  • nxt_sockaddr.h
  • nxt_socket.h / nxt_socket.c
  • other files in which the new functionality is implemented

Thanks!

echolimazulu added a commit to echolimazulu/unit that referenced this pull request Apr 1, 2022
Previously, @VBart suggested replacing the socket creation function to avoid code duplication.

nginx#649 (comment)
nginx#649 (comment)
echolimazulu added a commit to echolimazulu/unit that referenced this pull request Apr 1, 2022
Issue solution: nginx#669.

Earlier, @VBart suggested to implement a check in case of ECONNREFUSED.

The problem happened a little earlier for bind() and threw an EADDRINUSE error.

It was also suggested by @mar0x to check if the file type matches the socket before deleting it.

Discussion:
nginx#649
echolimazulu added a commit to echolimazulu/unit that referenced this pull request Apr 17, 2022
echolimazulu added a commit to echolimazulu/unit that referenced this pull request Apr 18, 2022
echolimazulu added a commit to echolimazulu/unit that referenced this pull request Apr 20, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants