You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I was hit by massive deadlocks in unbound listening for tcp connections. Sometimes it arrives to state when number of idle tcp connections is maxed at incoming-num-tcp. New connections are not accepted, listening socket is excluded in select().
But idle unused connections never die. So, listening stops forever.
Not sure, but it looks like I found the reason. When incoming-num-tcp is about to be full the last added sockets are added
with c->tcp_timeout_msec==0. The first time in setup_tcp_handler timeout is forced to be TCP_QUERY_TIMEOUT_MINIMUM.
But if the socket did some io, the further comm_point_start_listening() are done with real 0(!) and timer is not started anymore. After several expirations of old sockets with non-zero timeout all the sockets remain with c->tcp_timeout_msec==0 and unbound stucks forever.
It is just a theory. I added capping c->tcp_timeout_msec to TCP_QUERY_TIMEOUT_MINIMUM in setup_tcp_handler().
Waiting for results. But I think more experienced developer could figure out the problem w/o waiting.
The text was updated successfully, but these errors were encountered:
Actually, I think the approach with reducing idle timeout when we are congested is deemed to fail. I have some experience with this in linux kernel. I used similar solution there around 2000th, it was cheap and looked nice but soon it became obvious it will never sustain even minor overloads. In single thread unbound the simplest and rock solid solution would be lru list per listening socket, when we remove socket from list when start to do something and put it to the tail when socket enters idle state. Then, when we !c->tcp_free we just close head of lru and proceed. It is not even so chumbersome, is going to be nice and clear. And we do not need redundant tcp-idle-timeout parameter anymore, established connections are closed when it is necessary.
Hi Thanks for the detailed reasoning about the deadlock and TCP streams. I fixed it by applying the minimum so the no timeout zero value is not used, but the minimum value. This should no longer deadlock the TCP streams.
- Fix missing prototypes in the code.
Changelog note for NLnetLabs#373 - Merge PR NLnetLabs#373 from fobser: Warning: arithmetic on a pointer to void is a GNU extension.
Changelog note for NLnetLabs#335 - Merge PR NLnetLabs#335 from fobser: Sprinkle in some static to prevent missing prototype warnings.
Warning: arithmetic on a pointer to void is a GNU extension.
- Fix to squelch permission denied and other errors from remote host, they are logged at higher verbosity but not on low verbosity.
- FixNLnetLabs#371: unbound-control timeout when Unbound is not running.
- iana portlist updated.
- make depend.
Code repo continues for 1.13.1 in development.
- Fix update, with write event check with streamreuse and fastopen.
- Fix for NLnetLabs#283: fix stream reuse and tcp fast open.
- Fix on windows to ignore connection failure on UDP, unless verbose.
- Fix unbound-dnstap-socket to not use log routine from interrupt handler and not print so frequently when invoked in sequence.
- FixNLnetLabs#356: deadlock when listening tcp.
- FixNLnetLabs#360: for the additionally reported TCP Fast Open makes TCP connections fail, in that case we print a hint that this is happening with the error in the logs.
Sprinkle in some static to prevent missing prototype warnings.