Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: broken pipe errors starting from Go 1.12 when using a deadline #35373

Open
AmirSimon opened this issue Nov 5, 2019 · 7 comments
Milestone

Comments

@AmirSimon
Copy link

@AmirSimon AmirSimon commented Nov 5, 2019

What version of Go are you using (go version)?

1.12

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

linux amd64

What did you do?

My code has been running well until I have decided to upgrade the go version in my project from 1.11.9 to 1.12.
updating to Go version 1.12, resulted in multiple integration tests in my project failing by the same error:
write tcp 10.131.1.55:47494->34.251.62.18:443: write: broken pipe

Its important to add I am using Gorilla package, v1.2.0 https://github.com/gorilla/websocket

webSocket connection is made by a test client connecting to a remote machine :

How I establish a connection:

Dialer = &Dialer{
	Proxy:            http.ProxyFromEnvironment,
	HandshakeTimeout: 45 * time.Second,
}

requestHeader := http.Header{}
requestHeader.Add("Cookie", cookie)
requestHeader.Add(XSRF_TOKEN_KEY, token)
websocket.Dialer.TLSClientConfig = &tls.Config{
	InsecureSkipVerify: true,
	MinVersion:         tls.VersionTLS12,
	MaxVersion:         tls.VersionTLS12,
}
ws, _, err := websocket.Dialer.Dial(url, requestHeader)

how the server listens:

ws.websocketConn.SetReadLimit(Max)
ws.websocketConn.SetReadDeadline(time.Now().Add(60 * time.Second))
ws.websocketConn.SetPongHandler(
		func(string) error { ws.websocketConn.SetReadDeadline(time.Now().Add(60 * time.Second)); return nil })
	for {
		_, msg, err := ws.websocketConn.ReadMessage()
		if err != nil {
			CLOSE CONNECTION
			break
		}

after playing with the tests I found out that when setting the read deadline via SetReadDeadline to 0 - no deadline at all, tests seems to work again. (net.go)
(20 minutes deadline still resulted in failed tests)

which led me to think that something after the connection established takes way to long.
went over all of the release notes of 1.12 in an attempt to see what changed that could affect my code, but with no luck. (specifically net/http and net changes)

Tried with multiple Go versions, and it looks like it's failing from 1.12 up to 1.13.

I'm not able to reproduce this issue with a simple connection from my machine to a remote one.
It happens mostly in my test environment where two remote machines establish a connection and passing information with a websocket

This issue might have something to do with gorilla package, however, I have been using v1.2 for years. (also tried upgrading gorilla package to their latest 1.4)

If you think of any more information I could provide, please tell me.

What did you expect to see?

Expected connection to stay stable.

What did you see instead?

Connection disconnected

@ianlancetaylor ianlancetaylor changed the title write: broken pipe errors starting from Go 1.12 net: broken pipe errors starting from Go 1.12 when using a deadline Nov 5, 2019
@ianlancetaylor ianlancetaylor added this to the Backlog milestone Nov 5, 2019
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Nov 5, 2019

Is there a way that we can recreate the problem ourselves?

You say that with no deadline, the tests pass, but with a deadline, they fail. When you set the deadline to 20 minutes, how long does it take the test to fail?

@AmirSimon

This comment has been minimized.

Copy link
Author

@AmirSimon AmirSimon commented Nov 12, 2019

@ianlancetaylor , Sorry for the late reply.

Did some more research - its indeed happening when trying to close a web socket from the client side which was already closed by the server because of timeout.

I stand correct on what I said before - setting the deadline to 20 minutes does make the tests pass, its all a matter of if the test runs before the deadline is met.

however, after setting the deadline to about a minute, I added timestamps to a specific test and notice something strange - on version 1.11 and version 1.12 the test takes about 5 minutes, but only the web-socket connection of version 1.12 is being closed beforehand for some reason - making the test failed when both runs call ws.Close() at the end.

am I missing something in terms of why is my connection at 1.11 staying alive until the end (ws.Close() called) compared to 1.12?

@networkimprov

This comment has been minimized.

Copy link

@networkimprov networkimprov commented Nov 12, 2019

1.12 implements TCP keepalive timeouts on client side by default. When they occur, they appear to be deadline events.

See also #31490 & #23459

@AmirSimon

This comment has been minimized.

Copy link
Author

@AmirSimon AmirSimon commented Nov 12, 2019

Tried to set the KeepAlive of the Dialer to a negative value, but saw no change.
will investigate further on this matter and update here.
Thanks @networkimprov.

@AmirSimon

This comment has been minimized.

Copy link
Author

@AmirSimon AmirSimon commented Nov 13, 2019

First of all, its important to add that this does not happen on MAC os, and it does fails on CentOS

@networkimprov I tried disabling the default enabled KeepAlive at the client (as stated in the documentation - setting KeepAlive to a negative value.
Also tried sending every pingPeriod a ping control message from the client to the server (similar to the gorilla package example: https://play.golang.org/p/V2OvXyAQzh)
but it still fails. (where pingPeriod varied from millisecond to 54 seconds)

What did work is setting the client's Dialer timeout to
so Basically the Server close the connection with i/o timeout( timeout which was received by the client) and then when the client comes to close the web socket connection, it fails with broken pipe.

what causes my client to throw timeout?

@networkimprov

This comment has been minimized.

Copy link

@networkimprov networkimprov commented Nov 13, 2019

First, update to latest Gorilla-websocket; 1.2 is 2y old. https://github.com/gorilla/websocket/releases

Then if you can provide two complete, simple programs that reproduce the problem with Go 1.13 (but are OK on 1.11), the Go team can investigate.

@AmirSimon

This comment has been minimized.

Copy link
Author

@AmirSimon AmirSimon commented Nov 13, 2019

As stated above - "also tried upgrading gorilla package to their latest 1.4"

I just want to add that if the client OR the server is running on go 1.11 (one of them is enough), everything works.

I'll try to come up with two simple programs that reproduces the problem, thanks for your help

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.