ServerHandler: reconnect to the server if it is not responding to TCP pings #2622
This is implemented via a 'max in-flight TCP pings' counter, and
The idea is simple: ServerHandler keeps a counter of the number
When ServerHandler sends a ping, it increments the counter.
Once a pong or response for the ping has been received, it
But most importantly, ServerHandler now checks the amount
One scenario where this is useful is IPv6 privacy addresses.
One would naturally expect it to be possible to observe the fact
Because of this behavior, this TCP ping check was implemented.
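To make the counter idea concrete, here is a minimal sketch of a 'max in-flight TCP pings' counter. It is not the code from this PR: the class name `InFlightPingCounter`, its methods, and the limit passed to the constructor are all hypothetical, and the real ServerHandler may structure the check differently.

```cpp
#include <cstdint>

// Hypothetical illustration only; none of these names come from the Mumble source.
class InFlightPingCounter {
public:
    // maxInFlight is an assumed, caller-chosen limit.
    explicit InFlightPingCounter(std::uint32_t maxInFlight) : m_max(maxInFlight) {}

    // Call before sending a ping. Returns false when the limit of unanswered
    // pings has been reached, meaning the caller should reconnect rather
    // than send yet another ping.
    bool trySendPing() {
        if (m_inFlight >= m_max) {
            return false;
        }
        ++m_inFlight;
        return true;
    }

    // Call when a pong (or other response to a ping) arrives:
    // one outstanding ping has now been answered.
    void onPong() {
        if (m_inFlight > 0) {
            --m_inFlight;
        }
    }

    std::uint32_t inFlight() const { return m_inFlight; }

private:
    std::uint32_t m_max;
    std::uint32_t m_inFlight = 0;
};
```

In this sketch the caller would treat a `false` return from `trySendPing()` as the signal that the server is not responding and start a reconnect.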
When thinking about the design: what would happen on a lossy connection where every 100th ping got lost? Wouldn't that cause an auto-reconnect after the 200th (i.e. the 2nd lost) ping, even though the connection is alive and there is a constant stream of pings arriving? IMHO what you really want is for the last received ping to be no more than X seconds old. Or use something like the sequence numbers in ICMP echo request/response.
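The "last received ping not older than X seconds" alternative could look roughly like the sketch below. This only illustrates the suggestion and is not something the PR implements. `PongDeadline` and its members are hypothetical names, and time points are passed in explicitly so the logic can be exercised without a real clock.

```cpp
#include <chrono>

// Hypothetical illustration of an age-based liveness check.
class PongDeadline {
public:
    using Clock = std::chrono::steady_clock;

    // start: when the connection was established (counts as the first "pong").
    // maxAge: assumed threshold X after which the connection is considered dead.
    PongDeadline(Clock::time_point start, Clock::duration maxAge)
        : m_lastPong(start), m_maxAge(maxAge) {}

    // Record the arrival time of a pong.
    void onPong(Clock::time_point now) { m_lastPong = now; }

    // True when the most recent pong is older than maxAge.
    bool isStale(Clock::time_point now) const {
        return now - m_lastPong > m_maxAge;
    }

private:
    Clock::time_point m_lastPong;
    Clock::duration m_maxAge;
};
```

With this approach, any single pong makes the connection count as alive again, regardless of how many earlier pings went unanswered.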
@bmwiedemann It's worth noting that the pings in question are simply messages in a TCP stream, so a "lost" ping would delay the stream, not really get lost. If it got truly "lost", that would suggest a server bug.
(The Murmur behavior for responding to a TCP ping is to respond ASAP: https://github.com/mumble-voip/mumble/blob/master/src/murmur/Messages.cpp#L1429)
But I think what should probably happen here is that the "max in-flight pings" counter is reset on every ping that's received, not just decremented.
That way, the solution is "X consecutive pings without an answer".
While the end result would probably be the same for a TCP socket that's actually dead, perhaps it's more readable and easier to reason about.
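A sketch of the "reset instead of decrement" variant, under the same caveats: `ConsecutiveMissCounter` and its methods are hypothetical names, and the threshold is an assumed parameter rather than a value from the PR.

```cpp
#include <cstdint>

// Hypothetical illustration of "X consecutive pings without an answer".
class ConsecutiveMissCounter {
public:
    // maxUnanswered is the assumed threshold X.
    explicit ConsecutiveMissCounter(std::uint32_t maxUnanswered)
        : m_max(maxUnanswered) {}

    // Call each time a ping is sent.
    void onPingSent() { ++m_unanswered; }

    // Any pong resets the counter, so unanswered pings only accumulate
    // while the server stays completely silent.
    void onPong() { m_unanswered = 0; }

    // True once X pings in a row have gone unanswered.
    bool shouldReconnect() const { return m_unanswered >= m_max; }

private:
    std::uint32_t m_max;
    std::uint32_t m_unanswered = 0;
};
```

Because every pong clears the count, an occasional missing response cannot slowly add up over a long session, which is the situation the lossy-connection question above was concerned about.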
Also, I think that we should be careful with the current logic as @Kissaki mentions.
I had Mumble open in a VM, trying to connect to an IP that no longer hosted a server.
Then, after 5-6 of them, I began getting "unable to send TCP pings to the server".
So yes, we do start pinging and continue to ping in inadequate cases. We should fix that (in a separate PR).
I created #2627 for further analysis/work.