Skip to content
This repository has been archived by the owner on Feb 10, 2024. It is now read-only.

Fix lag time #1320

Closed
wants to merge 2 commits into from
Closed

Fix lag time #1320

wants to merge 2 commits into from

Conversation

pstratem
Copy link

prefs.hex_net_ping_timeout is currently more or less entirely useless.

Timeout should be based on when the last successful recv() occurred, not based on ping round trip times.

remove broken hex_net_ping_timeout logic

use timestamps in lag_check instead of relying on the function being called every 30 seconds

use prefs.hex_net_ping_timeout if available

fix whitespace

fix whitespace
@Arnavion
Copy link
Contributor

If client sends lagcheck ping at t = 0, receives a chat line at t = 1, and receives lagcheck pong at t = 2, the lag is 2 - 0 = 2, not 1 - 0 = 1 nor 2 - 1 = 1. The current code is correct.

Also, sending a lagcheck ping every second is insane.

@pstratem
Copy link
Author

You've misinterpretted the patch set.

This doesn't change the logic for calculating lagtime as indicated in the GUI.

This only changes the lagtime calculation for purposes of detecting a disconnect/timeout event.

And pings are sent at most once every ping_interval seconds.

The lag_check function no longer sends a PING everytime it's called.

@Arnavion
Copy link
Contributor

This doesn't change the logic for calculating lagtime as indicated in the GUI.
This only changes the lagtime calculation for purposes of detecting a disconnect/timeout event.

The two are identical.

@pstratem
Copy link
Author

This patch separates them.

lagtime based on rtt of PING/PONG makes sense for the gui indicator

but it's entirely useless for detecting a network disconnect

@Arnavion
Copy link
Contributor

Incorrect. Reread my first post.

@pstratem
Copy link
Author

So you're saying that you dont want to change the logic.

That you want to use the ping time lag indicator to detect a tcp socket disconnect?

(I just want to be 100% clear on this)

@Arnavion
Copy link
Contributor

Again, you cannot detect a TCP socket timeout without detecting that a pong for a corresponding ping is taking longer than expected. You cannot use non-pong traffic to detect this, because you don't know when the traffic originated. Such non-pong traffic might have originated 5 minutes ago and you may be only receiving it now, which still means the socket has timed out and needs to be reset. However when you receive a pong, you know it could only have been sent after you sent the corresponding ping, which gives you an upper bound on the timeout.

Thus if you don't receive a pong within the timeout period, that means the socket has timed out, even if you're receiving non-pong traffic during this period.

Now, looking at the current code, lag_check only seems to run if the GUI lagometer is enabled. That definitely looks like a bug.

@pstratem
Copy link
Author

You're not taking into account various queuing effects.

There's client side throttling, server side recv() throttling, and server side send() throttling.

In practice the only reasonably way to detect a network distruption is to assume that anything received means forward progress.

@pstratem
Copy link
Author

You're right the lag_check function should be called regardless of the gui lag indicator setting.

@Arnavion
Copy link
Contributor

I'm aware what throttling is, thanks. I'm afraid it doesn't matter how stubbornly you refuse to understand what I'm saying - it will not change the fact that the code in this pull request is wrong, for reasons I've explained twice now.

@pstratem
Copy link
Author

That's fine, I can just maintain my own branch that has sane behaviour.

@pstratem pstratem closed this Mar 16, 2015
@CyberShadow
Copy link

Again, you cannot detect a TCP socket timeout without detecting that a pong for a corresponding ping is taking longer than expected. You cannot use non-pong traffic to detect this, because you don't know when the traffic originated. Such non-pong traffic might have originated 5 minutes ago and you may be only receiving it now, which still means the socket has timed out and needs to be reset.

I've written so many network applications and this makes no sense to me. @Arnavion Do you have any sources or concrete examples to back this up? I'm fairly sure @pstratem is in the right here.

@Arnavion
Copy link
Contributor

If you think he's right then reread all of my posts in this thread until you've convinced yourself otherwise. It is adequately explained, including in the snippet you quoted.

@CyberShadow
Copy link

I did, and I think you're wrong.

To be specific, you have an unusual definition of "timeout". Most importantly, as far as I can see, this definition is simply not useful.

So far I have never encountered a situation where receiving any traffic on a socket not implying that the socket is alive, and should not be timed out. Can you provide an example of one?

until you've convinced yourself otherwise

Not very open-minded are we... why are you so convinced you're right?

@CyberShadow
Copy link

So far I have never encountered a situation where receiving any traffic on a socket not implying that the socket is alive, and should not be timed out.

Sorry, that's wrong. I just remembered one, Slow-Loris. But it doesn't apply to IRC.

@Arnavion
Copy link
Contributor

My definition of timeout is the same as every IRC network's. If you don't believe me, open a TCP connection to an IRC network, play out the same IRC conversation as a real client would, but don't respond to any PINGs. Tell me if it doesn't disconnect you after some time (eg 240 seconds) because it thinks you've timed out.

@CyberShadow
Copy link

OK, but 1) HexChat is not an IRC server 2) Do you know what the reason for this is? Could be simply a technical limitation. 3) This test will fail on the IRC servers I wrote :) I never noticed any issues with this behavior of course. I'll investigate a little...

@Arnavion
Copy link
Contributor

HexChat is not an IRC server

But it is a client connecting to an IRC server. There's no sense for the client to pretend the connection is alive if the server doesn't think it is. Breaking the connection by the same rules ensures that the connection breaks as soon as possible and can be reestablished.

Could be simply a technical limitation.

The behavior exists in multiple ircds, atleast Rizon's and Freenode's. So even if it was an implementation detail, it's a de-facto one.

@CyberShadow
Copy link

There's no sense for the client to pretend the connection is alive if the server doesn't think it is.

I'm not following, what does what the server think have to do with this?

This behavior exists in multiple ircds, atleast Rizon's and Freenode's.

Actually I just tried Freenode and it's not even sending PINGs at all as long as I keep sending traffic.

@CyberShadow
Copy link

If you don't believe me, open a TCP connection to an IRC network, play out the same IRC conversation as a real client would, but don't respond to any PINGs. Tell me if it doesn't disconnect you after some time (eg 240 seconds) because it thinks you've timed out.

OK, I idled until I got a PING and started sending traffic again (PRIVMSG, not PONG). It's not timing me out.

Have you tested this on Freenode before and got another result? I'm guessing something must've changed? I was connected to rajaniemi.freenode.net.

Edit: after idling again I got a second PING (with no PONG in-between) which just proves that Freenode accepts any traffic as a PONG.

@Arnavion
Copy link
Contributor

Yes you're right. My testing with both reveals the same. I'm sure it used to be that the networks sent PINGs always but atleast they don't do that now...

@CyberShadow
Copy link

You might be thinking of GameSurge, which sends a PING as soon as it receives a NICK, and has a rather low timeout (at least for registration or that initial PING).

So in light of this, do you still think the current behaviour is preferable?

@Arnavion
Copy link
Contributor

So since networks don't have this behavior, it doesn't make sense for HC to have this behavior either. I'll let TingPing assess the PR.

@Arnavion Arnavion reopened this Mar 13, 2016
@pstratem pstratem closed this Mar 13, 2016
@Arnavion
Copy link
Contributor

#1631

@Arnavion Arnavion mentioned this pull request Mar 29, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants