drivers/unix_socket: Let clients to freely connect/disconnect #613
sangjinhan merged 12 commits into NetSys:master from llvilanova:fix-unix-socket
In order to make external controllers (connected through a UNIX socket) truly independent of bessd, client disconnection must always be detected, regardless of whether the port is in use. This is done by keeping a thread around (using epoll). Previously, this driver would only reliably detect disconnects when the port was used to receive packets. When used to send packets, the port would only detect a disconnect on an actual send, which depends on the packets being processed; as a result, only some instances accepted new clients, which made it impossible for an external program to reconnect.
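For readers following along, here is a minimal sketch of detecting a hangup without sending or receiving any payload. It is not the code from this patch: it uses plain `poll()` rather than the epoll mentioned in the description to stay short, and the names `PeerDisconnected` and `DemoDisconnectDetection` are made up for illustration. `POLLRDHUP` is Linux-specific.

```cpp
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Returns true if the peer of `fd` has hung up. No payload is read or
// written, so this works whether or not the port carries traffic.
bool PeerDisconnected(int fd, int timeout_ms) {
  struct pollfd pfd = {};
  pfd.fd = fd;
  pfd.events = POLLRDHUP;  // Linux-specific; POLLHUP is always reported
  int ret = poll(&pfd, 1, timeout_ms);
  return ret > 0 && (pfd.revents & (POLLRDHUP | POLLHUP)) != 0;
}

// Hypothetical demo: a connected pair stands in for the port and its client.
bool DemoDisconnectDetection() {
  int sv[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return false;
  bool before = PeerDisconnected(sv[0], 0);   // client still connected
  close(sv[1]);                               // client goes away
  bool after = PeerDisconnected(sv[0], 100);  // hangup is now visible
  close(sv[0]);
  return !before && after;
}
```

A dedicated thread running a check like this in a loop would notice the disconnect even if the port never sends a packet.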
|
I'm not sure how this patch affects the failed test. Any ideas? |
|
Seems like the failure happens with the 32-bit build, and it is reproducible. Trying to debug it...
Codecov Report
@@ Coverage Diff @@
## master #613 +/- ##
==========================================
+ Coverage 69.29% 69.32% +0.02%
==========================================
Files 204 204
Lines 13037 13061 +24
==========================================
+ Hits 9034 9054 +20
- Misses 4003 4007 +4
|
|
Is there some way for me to easily test the failing experiment on that build? Also, is it possible that |
|
I just did a cleanup to ensure there are no threads or FDs left behind when a module is destroyed, but it still shows the same failure. If you give me a command line to reproduce the failing build and the specific test, I'll give it a shot in the debugger.
|
I don't know how to reliably reproduce the issue locally. What I did was use Travis with my private fork of this repo. The issue seems to happen only with 32-bit builds. What I realized is that
- return [VLANSplit(), 1, 150, expected]
+ return [VLANSplit(), 1, 30, expected]
- OUTPUT_TEST_INPUTS.append(output_test([1, 100, 77, -1, 149, 50, 100, -1]))
- OUTPUT_TEST_INPUTS.append(output_test([100, 77, -1, 149, 50, 100, -1, 33, 70]))
- OUTPUT_TEST_INPUTS.append(output_test([100, 77, -1, 149, 50, 100, -1, 33, 70], True))
+ OUTPUT_TEST_INPUTS.append(output_test([1, 17, -1, 29, 10, 13, 7]))
+ OUTPUT_TEST_INPUTS.append(output_test([1, 17, -1, 29, 10, 13, 7], True))
|
I'm on it. |
It seems that 32-bit builds reach the limit of open file descriptors.
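As a debugging aid, the per-process limit on open file descriptors can be queried with `getrlimit()`. The helper name `GetOpenFdLimit` is hypothetical and this is only a sketch; it does not explain why the 32-bit build in particular hits the limit.

```cpp
#include <sys/resource.h>

// Stores the soft limit on open file descriptors in *soft.
// Returns false if the limit cannot be queried.
bool GetOpenFdLimit(rlim_t *soft) {
  struct rlimit lim;
  if (getrlimit(RLIMIT_NOFILE, &lim) != 0) {
    return false;
  }
  *soft = lim.rlim_cur;  // the hard ceiling is lim.rlim_max
  return true;
}
```

Comparing this value with the number of FDs each port instance holds gives a rough upper bound on how many instances a test can create.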
|
Two requests regarding file descriptor usage:
|
|
Can't use Can't use So, is it OK to keep it with |
|
I am not sure if I understood correctly. You can add Maybe I am missing something here. If you find |
Using signals instead of pipes minimizes the number of open file descriptors (two FDs per instance).
Using ppoll() instead of epoll_pwait() saves us from using one additional file descriptor per instance, without introducing any measurable performance overhead.
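To illustrate the idea behind these two commits, the sketch below wakes a thread that is blocked in `ppoll()` by sending it a signal, with no pipe or epoll descriptor involved. It is an assumption about the general mechanism, not a copy of the driver code; `WakeHandler` and `WakeThreadWithSignal` are made-up names, and `SIGUSR1` is an arbitrary choice.

```cpp
#include <poll.h>
#include <pthread.h>
#include <signal.h>

#include <atomic>
#include <cerrno>
#include <thread>

static std::atomic<bool> g_signalled{false};

// Omitting the parameter name marks the signal number as unused.
static void WakeHandler(int) { g_signalled.store(true); }

// Returns true if a thread blocked in ppoll() was woken by SIGUSR1.
bool WakeThreadWithSignal() {
  struct sigaction sa = {};
  sa.sa_handler = WakeHandler;
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGUSR1, &sa, nullptr) != 0) return false;

  // Block SIGUSR1 here; the worker inherits this mask, so the signal can
  // only be delivered while the worker is inside ppoll().
  sigset_t blocked;
  sigemptyset(&blocked);
  sigaddset(&blocked, SIGUSR1);
  pthread_sigmask(SIG_BLOCK, &blocked, nullptr);

  bool interrupted = false;
  std::thread worker([&interrupted]() {
    sigset_t during_poll;
    pthread_sigmask(SIG_SETMASK, nullptr, &during_poll);  // read current mask
    sigdelset(&during_poll, SIGUSR1);  // unblocked only during ppoll()

    struct pollfd pfd = {};
    pfd.fd = -1;  // placeholder; a real port would poll its sockets here
    int ret = ppoll(&pfd, 1, nullptr, &during_poll);
    interrupted = (ret < 0 && errno == EINTR);
  });

  pthread_kill(worker.native_handle(), SIGUSR1);
  worker.join();
  return interrupted && g_signalled.load();
}
```

Because the signal stays blocked outside `ppoll()`, a wake-up sent before the worker reaches the call is left pending and delivered as soon as the call swaps in the mask, so it is not lost.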
|
Ooops, sorry. I read For the record, the limitations on |
Only one client connection is supported now, so ignore new ones as long as the current connection is not severed.
sangjinhan left a comment:
Thank you so much for the update. I'll merge once those minor comments are addressed.
recv_skip_cnt_ = 0;
while (true) {
  fds[1].fd = client_fd_;
Would you add a comment that negative file descriptors are ignored by poll(), for clarification?
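A small sketch of the behavior in question: `poll()` skips entries whose `fd` is negative and returns zero in their `revents`. The function name `PollWithEmptySlot` is hypothetical, and the two-slot layout only mirrors the `fds[1]` usage quoted above.

```cpp
#include <poll.h>

// Polls a two-entry array whose second slot is "not connected" (fd < 0).
// Returns poll()'s result and stores the revents of the empty slot.
int PollWithEmptySlot(int listen_fd, short *empty_slot_revents) {
  struct pollfd fds[2] = {};
  fds[0].fd = listen_fd;
  fds[0].events = POLLIN;
  fds[1].fd = -1;  // negative: poll() ignores this entry
  fds[1].events = POLLIN | POLLRDHUP;
  int ret = poll(fds, 2, 0);  // timeout 0: return immediately
  *empty_slot_revents = fds[1].revents;
  return ret;
}
```

This is what lets the array keep a fixed size while the client slot is toggled between a real descriptor and a sentinel such as `kNotConnectedFd`.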
if (errno == EINTR) {
  continue;
} else {
  PLOG(ERROR) << "[UnixSocketPort]:epoll_wait()";
Feel free to ignore this: you don't need to add [UnixSocketPort]: to the log message since the filename will be automatically added by PLOG. Also, epoll_wait() -> ppoll()
} else if (fds[1].revents & (POLLRDHUP | POLLHUP)) {
  // connection dropped by client
  int fd = client_fd_;
  client_fd_ = -1;
kNotConnectedFd in place of -1
// Relaunch the accept thread.
std::thread accept_thread(AcceptThreadMain, reinterpret_cast<void *>(this));
accept_thread.detach();
static void AcceptThreadHandler(int sig) {
Minor: You can just omit the argument name (just static void AcceptThreadHandler(int) {) to indicate the argument will be unused.
}

accept_thread_ = std::thread([&]() {
[this] can be used in place of [&].
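A minimal illustration of the suggestion, using a made-up `Acceptor` class rather than the real port class. The thread body only touches members, so capturing `this` states the dependency exactly.

```cpp
#include <thread>

// Hypothetical stand-in for a port that owns an accept thread.
class Acceptor {
 public:
  void Start() {
    // [this] captures only the object pointer. [&] would also allow
    // capturing locals by reference, which could dangle after Start() returns.
    thread_ = std::thread([this]() { Run(); });
  }

  void Join() {
    if (thread_.joinable()) {
      thread_.join();
    }
  }

  int runs() const { return runs_; }

 private:
  void Run() { ++runs_; }

  std::thread thread_;
  int runs_ = 0;
};
```

Calling `Start()` followed by `Join()` runs the body once on the new thread; the object must outlive the thread in either capture style.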
// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
// POSSIBILITY OF SUCH DAMAGE.

#include <signal.h>
#include <poll.h> should appear first.
#define BESS_DRIVERS_UNIXSOCKET_H_

#include <assert.h>
#include <atomic>
#include <poll.h>
#include <atomic>
#include <cassert>
#include <cerrno>
#include <cstdio>
The Google C++ Style Guide says all C++ headers should appear after C headers. Each group needs to be alphabetically sorted.
Do we need cassert and cstdio? Also, poll.h and cerrno seem only necessary in the .cc file, not in the header.
