Skip to content

Conversation

@oliverhausler
Copy link

There were some strange wirings and wrong calculations of pingTimeout which I cleaned up. The sequence is now onPacket -> emitHeartbeat -> setPing -> setHeartbeat (previously onHeartbeat).

Whatever packet is received, it must reset pingInterval AND pingTimeout, but pingInterval must repeat, so it is not possible to set both within the scheduled Runnable within setPing.

@oliverhausler
Copy link
Author

I have commented all fixes with // FIX. Just remove these after you agree with the change. I manually picked code changes into this repo, if anything shouldn't work as supposed, have a look at my fork please.

@oliverhausler oliverhausler changed the title fixed wrong pingInterval and pingHeartbeat fixed wrong pingInterval and pingTimeout Nov 17, 2015
@oliverhausler
Copy link
Author

fixes #38

@nkzawa
Copy link
Contributor

nkzawa commented Dec 12, 2015

Can you explain how is the original code wrong?

@oliverhausler
Copy link
Author

It has been a long time since I posted this and I'll do my best to recount this error-free.

What I think to remember is that the total timeout was timeout + pingInterval, so by default 60 + 25 seconds, instead of 60. Also I think to remember that ping was not correctly repeated under certain circumstances. As only a single ping was scheduled, when the network froze, there were situations where the second ping did not happen, e.g. if the first ping request after 25 seconds did not return, there was no second ping scheduled after 25 + 25 seconds. This was partly mitigated by the 10 sec connect timeout, which would start the second ping after 25 + 10 seconds, but then the next ping would already be too later and after the timeout 25 + 10 + 25 + a short response time > 60 and the connection disconnected.

I believe default values are set in a way, where 2 x pingInterval < timeout to ensure that if one ping fails, the second saves the connection from disconnecting, so the second ping must be scheduled no matter what happens elsewhere. On the other hand, scheduling pings in interval does not hurt, as the TimerTask is canceled at certain points, no matter if a single ping was scheduled or a series.

@nkzawa
Copy link
Contributor

nkzawa commented Dec 12, 2015

Just to make the things clear, ping/pong and heartbeat work the same with JS client.

What I think to remember is that the total timeout was timeout + pingInterval, so by default 60 + 25 seconds, instead of 60.

Maybe this is the intended behavior, like waiting for sendPing (pingInterval) and timeout (pingTimeout) after that.

As only a single ping was scheduled, when the network froze, there were situations where the second ping did not happen, e.g. if the first ping request after 25 seconds did not return, there was no second ping scheduled after 25 + 25 seconds.

Basically, ping/pong is for detecting inactive client/server including network froze. So I think we don't need to think about second ping.

@oliverhausler
Copy link
Author

Timeout should be an absolute value after which something times out. I don't see any reason why the heartbeat interval should be added to that value. If the reference implementation does it like that, then they probably do it wrong. Also the second ping is necessary, otherwise the timeout could be reduced to slightly above 35 seconds. There would be no reason for waiting that long before reconnecting.

@nkzawa I do understand that you want to stick to the reference implementation, but more important is that it works. And you can see from the many complaints about disconnects that it doesn't.

@nkzawa
Copy link
Contributor

nkzawa commented Dec 13, 2015

About the timeout, still not sure, but I agree it might be no reason actually.
(btw server also works the same https://github.com/socketio/engine.io/blob/master/lib/socket.js#L133)

About the second packet, I'd like to be more careful and discuss for better solutions. For example, can't we just make the reconnection cycle fast with less timeout and interval settings? Or it might be better to retry sending ping rather than sending the second packet in interval?

Anyway it would be great if we can fix JS client and server as well.

@oliverhausler
Copy link
Author

can't we just make the reconnection cycle fast with less timeout and interval settings

I tried that, but to achieve being reconnected within 5 secs would require a pingInterval of 1 sec and connect timeout of 3 sec, which is not feasible.

Or it might be better to retry sending ping

This won't work because you don't know that the first ping has failed, as long as it hasn't timed out yet. Exactly the reason why I am scheduling consecutive pings without further intervention. But that still wouldn't speed up reconnection, because even with a short connect timeout of 10 secs you would need 2 x 10 secs to realize that something is wrong.

I tried a lot of things, like canceling requests from the client and retrying, returning noop from the server after a certain time, sending 2 consecutive pings one after the other, etc. But all solutions I came up with are too expensive and therefore not feasible in production. For the moment I gave up on this.

@oliverhausler
Copy link
Author

Main problem also is that we are on Android. And as we know versions up to 4.3 are having network issues. The problem is that the server responds to the ping, while Android receives the response on the network layer but does not bubble it up. At least on ASUS devices this seems to be an issue. socketio/socket.io-client-java#250. So we can't do much on the server-side, either. The server thinks all is good.

@darrachequesne
Copy link
Member

Closed due to inactivity, please reopen if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants