Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation Request about Ping Pong #265

Open
Zamiell opened this issue Nov 25, 2020 · 28 comments
Open

Documentation Request about Ping Pong #265

Zamiell opened this issue Nov 25, 2020 · 28 comments
Labels
Milestone

Comments

@Zamiell
Copy link
Contributor

Zamiell commented Nov 25, 2020

Hello nhooyr,

In gorilla/websocket, end-users must explicitly handle pings and pongs. From what I am reading on the issues forum, nhooyr/websocket handles this automatically. This is also implied by reading the source code from the chat example, which doesn't seem to have any ping pong code.

Can this please be documented front and center in the README? Right now, the README simply states that there is a "Ping pong API", but this doesn't communicate to the end user whether or not they should be writing pong pong code in their application or not.

Maybe the text of "Ping pong API" should be changed to "Automatic handling of pings and pongs, with an API exposing functionality for advanced users", or something along those lines. I would write a PR, but it's a one line change and you can probably word it the way you want yourself.

The reason that this seems important is that I expect typical writers of Go software will have used gorilla/websocket in the past, and are used to handling ping/pongs, so this will be a repeated point of confusion as more and more users use this library.

In closing, thanks for the library.

@nhooyr
Copy link
Contributor

nhooyr commented Nov 25, 2020

Hi @Zamiell

I've been a little busy but I did actually add a ping/pong example into the README dev branch but it hasn't been merged into master yet.

See https://github.com/nhooyr/websocket/blob/fdc407913d18e6fff8feacf9bc50f8545c234d9d/example_test.go#L138-L161

The library responds to pongs automatically as long as you have a goroutine in read but you must send pings yourself if you need them.

Also see https://pkg.go.dev/nhooyr.io/websocket#Conn.Ping and https://pkg.go.dev/nhooyr.io/websocket#Conn.CloseRead

@nhooyr
Copy link
Contributor

nhooyr commented Nov 25, 2020

Sorry, I meant I added the example to the dev branch.

@nhooyr nhooyr added the docs label Nov 25, 2020
@Zamiell
Copy link
Contributor Author

Zamiell commented Nov 25, 2020

Thanks for the quick response.

The library responds to pongs automatically as long as you have a goroutine in read

Forgive me, but I am new to this library. What does a "goroutine in read" mean? I opened up the chat example, and it doesn't seem to have any goroutines that are reading stuff.

you must send pings yourself if you need them.

Can you go into a bit more detail here? Apologies, I have not read the WebSocket RFC myself, but presumably a WebSocket server needs to send ping messages to the client after some interval. Which means that presumably everybody would need to send pings, right?

@nhooyr
Copy link
Contributor

nhooyr commented Nov 25, 2020

Forgive me, but I am new to this library. What does a "goroutine in read" mean? I opened up the chat example, and it doesn't seem to have any goroutines that are reading stuff.

A goroutine that is currently in a c.Reader call. i.e it's actively reading frames from the connection.

Can you go into a bit more detail here? Apologies, I have not read the WebSocket RFC myself, but presumably a WebSocket server needs to send ping messages to the client after some interval. Which means that presumably everybody would need to send pings, right?

There's no requirement to send ping messages to the client. Most of the time TCP keep alives are enough. If you need to send pings, you have to start a goroutine that calls c.Ping at some interval but also ensure there's some other goroutine actively calling c.Reader or c.Read.

If that doesn't make sense, I can update the chat example for you to see what it's like.

@nhooyr
Copy link
Contributor

nhooyr commented Nov 25, 2020

So yea to be clear, the chat example at the moment does not do a ping/pong because it doesn't need to. Go's HTTP server has TCP keep alives enabled by default.

As part of #227 I was thinking of adding a second example demonstrating ping/pongs and given this has popped up on the issue tracker multiple times now, it's definitely a good idea.

Also give #226 a read as well, I clarified the behaviour in that thread as well.

@Zamiell
Copy link
Contributor Author

Zamiell commented Nov 25, 2020

Thanks.
So it sounds like by default, WebSocket server applications should not be sending pings to clients.
I was not aware of that when opening this issue, so that kind of changes the premise of my original request.

This is all a little confusing because the gorilla/websocket chat example has code for doing pings, and it looks to be important for managing a proper read deadline, correct me if I am wrong.
Additionally, the melody WebSocket framework lists that one of its features is "automatic handling of ping/pong and session timeouts", which implies that gorilla/websocket does not have these features out of the box, and that these features are necessary for a good WebSocket connection.
(Does nhooyr/websocket have automatic session timeouts?)

Perhaps a short discussion of this in the README.md might be useful for those who are shopping around for Golang WebSocket frameworks.

@ghost
Copy link

ghost commented Nov 26, 2020

@Zamiell The following features are the same between the current package and gorilla:

  • Both packages automatically respond to ping with a pong.
  • Both packages require the application to read the connection to process control messages.
  • The decision to rely on TCP keep alive or to use ping and pong is the same.
  • The application must explicitly send pings. The packages do not send pings automatically.

The difference is in the API for detecting the pong. In the current package, the Ping method sends a ping and returns on receipt of the corresponding pong or cancelation of the context. If the error returned from Ping is nil, then the pong was received.

The gorilla package has a lower-level API. The API to write a ping does not wait for the pong. Instead, the application registers a pong handler to receive pongs. The API in the current package can be written using the gorilla API, but not vice versa.

An application cannot use ping/pong to manage the read deadline in the current package because the package does not expose an API for setting the read deadline.

@nhooyr
Copy link
Contributor

nhooyr commented Nov 26, 2020

Everything @TaylorRex has said is correct except for this:

An application cannot use ping/pong to manage the read deadline in the current package because the package does not expose an API for setting the read deadline.

It doesn't directly but you can still do it! See #87
That's how the net.Conn adapter for example works and implements SetReadDeadline.

I documented it at some point but I guess I accidentally removed it when refactoring the docs.

@ghost
Copy link

ghost commented Nov 26, 2020

@nhooyr I didn't mean to imply that the application cannot put a time bound on reading data.

Let me rephrase the last point: The gorilla examples use ping/pong and the read deadline on the underlying network connection to detect dead clients. That approach is not possible with the current package because the current package does not have an API for setting the read deadline on the underlying network connection.

Applications using the current package can use ping/pong to handle dead clients, but the code is different.

@Zamiell
Copy link
Contributor Author

Zamiell commented Nov 26, 2020

Thanks for the discussion.

Applications using the current package can use ping/pong to handle dead clients,

Presumably, when using nhooyr/websocket, dead clients will automatically be cleaned up and have the corresponding Close() method called when the TCP keepalive expires.

Can you elaborate on why application-authors would want to write extra code to handle ping pongs in order to handle dead clients? What application use-case would not be able to rely on a TCP keepalive?

@ghost
Copy link

ghost commented Nov 26, 2020

Ping/pong more reliably detects dead peer applications when proxies are in play. Also, ping/pong helps with proxies that close inactive connections.

@Zamiell
Copy link
Contributor Author

Zamiell commented Nov 26, 2020

Both of those sound like general enough & common enough things such that every application would want to have ping pongs. Or is that not true?

Maybe some specialized application would want to omit ping/pong if they were optimizing for the minimum amount of network packets sent, or something?

@Zamiell Zamiell changed the title Documentation Request Documentation Request about Ping Pong Nov 26, 2020
@Zamiell
Copy link
Contributor Author

Zamiell commented Nov 26, 2020

Is the tradeoff here simply more reliable connections through proxies versus a little bit of extra network traffic? If so, it seems like a no brainer to take that tradeoff.

If it is indeed the case that the vast majority of non-specialized Golang applications would want to handle the relatively-common case of handling WebSocket users behind proxies, then why doesn't the nhooyr/websocket library just handle sending pings to clients for me? Ideally, I would not want to stick ping pong code in my app - those are the kind of low-level details that I would prefer a WebSocket library abstract away from me. Or is nhooyr/websocket meant to be only be a lower level library, mimicking the functionality of gorilla?

(Hopefully it doesn't sound like I am complaining, I am just legitimately curious. =p)

@nhooyr
Copy link
Contributor

nhooyr commented Nov 26, 2020

You're posing great questions. These are things I considered writing the library.

You can see the full archive of my research/discussions/issue links in #1

I'm quite honestly still not sure if it was the right decision to not build in some keep alive loop that does ping/pong for you. There really isn't much data I could find on how much it really matters.

Theoretically use TCP keep alives are not guaranteed to work between proxies but in practice, it doesn't seem to matter much. There's plenty of sites that use persistent HTTP connections without a ping/pong and rely on TCP keep alives and things seem to work fine. And I've personally never seen any such issues.

Any app should be somewhat resilient to disconnections and reconnections anyway as servers will go down and connections will break.

The Go HTTP/2 library also by default only uses TCP keep alives.

And, it's very very easy to implement a ping loop this library. Looks like this:

package main

import "time"
import "context"
import "nhooyr.io/websocket"

func main() {
	// setup conn

	go heartbeat(context.Background(), nil, time.Minute)

	// read/write to conn
}

func heartbeat(ctx context.Context, c *websocket.Conn, d time.Duration) {
	t := time.NewTimer(d)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-t.C:
		}

		err := c.Ping(ctx)
		if err != nil {
			return
		}

		t.Reset(time.Minute)
	}
}

Maybe we should add that to the library as c.Heartbeat(ctx, time.Minute)?

While much of the web is moving to HTTP/2 which also has a protocol level ping/pong, they remain optional. I'm not sure if there are any statistics on how many HTTP/2 servers have a ping/pong loop implemented.

My theory is a ping/pong api in the protocol is mainly for being able to easily check the peer's latency. But I'm not really sure.
Could also be to make the protocol independent of the transport. If the transport does not have a similar mechanism, we can use the protocol level heartbeat but if the transport does, it's much cheaper to use the transport mechanism like it is with TCP keep alives.

@nhooyr
Copy link
Contributor

nhooyr commented Nov 26, 2020

(edited my previous comment quite a bit to make things clearer)

@nhooyr
Copy link
Contributor

nhooyr commented Nov 28, 2020

Yea I think it's time to add c.Heartbeat as described above. There have been enough issues/confusion about it that the current API isn't enough.

@nhooyr
Copy link
Contributor

nhooyr commented Nov 28, 2020

Opened #267

@Zamiell
Copy link
Contributor Author

Zamiell commented Dec 1, 2020

Thanks again nhooyr for the discussion.
Maybe some of this should go in the README.md, in a new section called "Ping/Pong". Do you want me to make a pull request?

@nhooyr
Copy link
Contributor

nhooyr commented Dec 1, 2020

Maybe some of this should go in the README.md, in a new section called "Ping/Pong". Do you want me to make a pull request?

Fully agreed! I've been lacking a bit in maintenance on this repository this year but I'm hoping to pick back up during the holidays.

But yea, I'm good with adding a link to this issue from the README for people who want to understand the Ping/Pong API better.

@Zamiell
Copy link
Contributor Author

Zamiell commented Dec 6, 2020

#269

@nhooyr
Copy link
Contributor

nhooyr commented Dec 9, 2020

So I checked out your PR and started reading https://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html

It turns out, TCP Keepalives aren't very useful by default, at least on Linux.

Even though Go sets the period to 15 seconds, the initial idle time before the keepalive loop begins is 2 hours!!! And then the keep alive packet must be unacknowledged by the peer 9 times for a total of another 2 minutes.

By default anyway, you can configure the kernel to lower the times. It's documented in the above URL!

Before adding any new docs, I think we need to do a survey. See how many cloud vendors have TCP keepalives enabled and at what configuration.

If many don't or they do but at subpar configurations, then I think we add c.Heartbeat or even make it default and have it be opt out via configuration in the options structures.

@Zamiell
Copy link
Contributor Author

Zamiell commented Dec 9, 2020

Ok, sounds good.

@sashker
Copy link

sashker commented Dec 14, 2020

Sorry, I interject here because my problem is related to Ping/Pong functionality. I'm trying to implement a client that connects to a python server. So, when the server sends pings to me, we reply back with a pong. (I can see the reply on a WebSocket dump) But for some reason, the server doesn't accept my pong and breaks the connection.

My question is - what would be the way to debug this (to send process/send pongs manually somehow)?

@nhooyr
Copy link
Contributor

nhooyr commented Dec 14, 2020

But for some reason, the server doesn't accept my pong and breaks the connection.

If the pong is being sent correctly and you can see it in the dump, there shouldn't be anything additional you need to debug on the client side. I'd check the server.

But if you really need to, I'd use gohack to create a modifiable source and then add a log statement at https://github.com/nhooyr/websocket/blob/c9f314abd11b749d43bb61fd214171f8bb4e4173/read.go#L150

You'd want to log the variable h to see all frames being passed on the connection.

To manually send pongs, you can export writeControl, see https://github.com/nhooyr/websocket/blob/c9f314abd11b749d43bb61fd214171f8bb4e4173/write.go#L232

Using something like wireshark to see the frames being passed is probably easier.

Perhaps the server expects a specific pong payload? The library as of now just echos the ping payload.

@nhooyr
Copy link
Contributor

nhooyr commented Dec 14, 2020

But I'll keep the above in mind for future additions to the library. A dump of all frames read/written would be quite nice when debugging and it's something I've had to use all the time when developing the library itself.

@sashker
Copy link

sashker commented Dec 14, 2020

Thank you for the quick response!
The pong response I see in Wireshark looks perfectly fine to me. We respond back with the payload we get in the ping (and the python client does the same).

So, the problem is somewhere deeper in the pythong library. And it's an excellent opportunity to try gohack 👍 to debug it.

@nhooyr nhooyr added this to the v1.9.0 milestone Jan 9, 2021
@philipjscott
Copy link

To make matters more complicated, some web servers like NGINX have default timeouts that are reliant on PINGs/PONGs [1]:

By default, the connection will be closed if the proxied server does not transmit any data within 60 seconds. This timeout can be increased with the proxy_read_timeout directive. Alternatively, the proxied server can be configured to periodically send WebSocket ping frames to reset the timeout and check if the connection is still alive.

Furthermore, keep-alive pings won't mitigate this [2]:

Keep-alive pings aren’t usable for working around the above mentioned timeout as they work on the TCP level are are just empty packets. They aren’t reported to the application and thus the application will never respond to them meaning that proxy_read_timeout will still trigger.

[1] http://nginx.org/en/docs/http/websocket.html
[2] https://blog.martinfjordvald.com/websockets-in-nginx/

This means that if somebody isn't explicitly sending out pings with nhooyr/websocket and is running the application behind NGINX, NGINX will kill the connection.

I created a little demo showcasing NGINX proxy timeouts using nhooyr/websocket: https://github.com/ScottyFillups/nginx-ws

@Zamiell
Copy link
Contributor Author

Zamiell commented Mar 17, 2021

@nhooyr If what Scotty says is true, then should issue #267 should just be included and working out-of-the-box for all nhooyr/websocket users? nginx is an extremely common web server.

Milerius added a commit to KomodoPlatform/go-binance that referenced this issue Jul 28, 2021
- KeepAlive heartbeat implemented using coder/websocket#265 (comment)
- Only spot websocket API switched to nhooyr websocket atm
- Close the websocket using StatusNormalClosure
adshao pushed a commit to adshao/go-binance that referenced this issue Jul 29, 2021
* feat(websocket): switch to nhooyr.io/websocket for wasm support

- KeepAlive heartbeat implemented using coder/websocket#265 (comment)
- Only spot websocket API switched to nhooyr websocket atm
- Close the websocket using StatusNormalClosure

* feat(websocket): apply same change for delivery and futures

* feat(websocket): update go.mod

* feat(websocket): update go.mod & go.sum (go get -u && go mod.tidy)
softdev87 added a commit to softdev87/go-binance that referenced this issue Sep 8, 2022
* feat(websocket): switch to nhooyr.io/websocket for wasm support

- KeepAlive heartbeat implemented using coder/websocket#265 (comment)
- Only spot websocket API switched to nhooyr websocket atm
- Close the websocket using StatusNormalClosure

* feat(websocket): apply same change for delivery and futures

* feat(websocket): update go.mod

* feat(websocket): update go.mod & go.sum (go get -u && go mod.tidy)
XCr-9 added a commit to XCr-9/binance that referenced this issue Jan 9, 2024
* feat(websocket): switch to nhooyr.io/websocket for wasm support

- KeepAlive heartbeat implemented using coder/websocket#265 (comment)
- Only spot websocket API switched to nhooyr websocket atm
- Close the websocket using StatusNormalClosure

* feat(websocket): apply same change for delivery and futures

* feat(websocket): update go.mod

* feat(websocket): update go.mod & go.sum (go get -u && go mod.tidy)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants