Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: Conn hangs with SetReadDeadline #34385

Open
harshavardhana opened this issue Sep 19, 2019 · 15 comments

Comments

@harshavardhana
Copy link
Contributor

commented Sep 19, 2019

What version of Go are you using (go version)?

$ go version
1.13

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/harsha/.cache/go-build"
GOENV="/home/harsha/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/harsha/mygo"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/harsha/.gimme/versions/go1.13.linux.amd64"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/harsha/.gimme/versions/go1.13.linux.amd64/pkg/tool/linux_amd64"
GCCGO="/usr/bin/gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/harsha/mygo/src/github.com/minio/minio/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build110074343=/tmp/go-build -gno-record-gcc-switches"

What did you do?

https://play.golang.org/p/Xpw5O5EsXQv
https://play.golang.org/p/C-Jdtg6Wr9M

$ NO_TIMEOUT=0 go run /tmp/http.go
<?php
$homepage = file_get_contents('http://localhost:1234/');
echo $homepage;
?>
~ for i in $(seq 1 5); do /usr/bin/time -f "%E" php7.2 http.php 1>/dev/null; done
0:05.04
0:00.06
0:05.02
0:05.05
0:05.04

What did you expect to see?

No hangs with following code

$ NO_TIMEOUT=1 go run /tmp/http.go
<?php
$homepage = file_get_contents('http://localhost:1234/');
echo $homepage;
?>
~ for i in $(seq 1 5); do /usr/bin/time -f "%E" php7.2 http.php 1>/dev/null; done
0:00.03
0:00.01
0:00.01
0:00.01
0:00.01

What did you see instead?

Hangs up to the SetReadDeadline timeout

$ NO_TIMEOUT=0 go run /tmp/http.go
<?php
$homepage = file_get_contents('http://localhost:1234/');
echo $homepage;
?>
~ for i in $(seq 1 5); do /usr/bin/time -f "%E" php7.2 http.php 1>/dev/null; done
0:05.04
0:00.06
0:05.02
0:05.05
0:05.04

This is another variant of the same issue #21133 and the work-around presented works still.

import (
	"net"
	"sync/atomic"
	"time"
)

// QuirkConn - similar to golang net.Conn struct, but contains a workaround of the
// following the go bug reported here https://github.com/golang/go/issues/21133.
// Once the bug will be fixed, we can remove this structure and replaces it with
// the standard net.Conn
type QuirkConn struct {
	net.Conn
	hadReadDeadlineInPast int32 // atomic
}

// SetReadDeadline - implements a workaround of SetReadDeadline go bug
func (q *QuirkConn) SetReadDeadline(t time.Time) error {
	inPast := int32(0)
	if t.Before(time.Now()) {
		inPast = 1
	}
	atomic.StoreInt32(&q.hadReadDeadlineInPast, inPast)
	return q.Conn.SetReadDeadline(t)
}

// canSetReadDeadline - returns if it is safe to set a new
// read deadline without triggering golang/go#21133 issue.
func (q *QuirkConn) canSetReadDeadline() bool {
	return atomic.LoadInt32(&q.hadReadDeadlineInPast) != 1
}

The question I have is why this workaround is needed at all. Is it not expected to SetReadDeadline() right before the read operation? but why does it work some times? It also looks like there may be some performance implications on doing something like this.

harshavardhana added a commit to harshavardhana/minio that referenced this issue Sep 19, 2019
This commits fixes a bug introduced in af6c6a2.

This workaround is needed to fix the upstream Go issue
golang/go#34385 related to Deadlines

Fixes minio#7852
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Sep 19, 2019

The program you showed us doesn't look at NO_TIMEOUT at all. Are you showing us the exact code that you are testing?

@ianlancetaylor ianlancetaylor changed the title net.Conn hangs with SetReadDeadline net: Conn hangs with SetReadDeadline Sep 19, 2019
@ianlancetaylor ianlancetaylor added this to the Go1.14 milestone Sep 19, 2019
@harshavardhana

This comment has been minimized.

Copy link
Contributor Author

commented Sep 19, 2019

@harshavardhana

This comment has been minimized.

Copy link
Contributor Author

commented Sep 19, 2019

Had to update the example, updated the right link.

harshavardhana added a commit to harshavardhana/minio that referenced this issue Sep 20, 2019
This commits fixes a bug introduced in af6c6a2.

Setting deadlines in Go results in arbitrary hangs as reported here
golang/go#34385

Fixes minio#7852
harshavardhana added a commit to harshavardhana/minio that referenced this issue Sep 20, 2019
This commits fixes a bug introduced in af6c6a2.

Setting deadlines in Go results in arbitrary hangs as reported here
golang/go#34385

Fixes minio#7852
harshavardhana added a commit to harshavardhana/minio that referenced this issue Sep 20, 2019
This commits fixes a bug introduced in af6c6a2.

Setting deadlines in Go results in arbitrary hangs as reported here
golang/go#34385

Fixes minio#7852
@av86743

This comment has been minimized.

Copy link

commented Sep 20, 2019

If c.SetReadDeadline(time.Now().Add(c.readTimeout)) in your example is commented out, deadlock still occurs. Consider resolving the problem with your customization of net.Listener and/or net.Conn first.

@harshavardhana

This comment has been minimized.

Copy link
Contributor Author

commented Sep 20, 2019

If c.SetReadDeadline(time.Now().Add(c.readTimeout)) in your example is commented out, deadlock still occurs. Consider resolving the problem with your customization of net.Listener and/or net.Conn first.

@av86743 I don't think so it does here is how I tested the shared code.

package main

import (
	"log"
	"net"
	"net/http"
	"os"
	"time"
)

// timeoutableConn
type timeoutableConn struct {
	readTimeout time.Duration
	net.Conn
}

func (c *timeoutableConn) Read(b []byte) (n int, err error) {
	if c.readTimeout > 0 {
		//c.SetReadDeadline(time.Now().Add(c.readTimeout))
	}
	n, err = c.Conn.Read(b)
	return n, err
}

// customListener returns timeoutable net.Conn
type customListener struct {
	net.Listener
}

func (l *customListener) Accept() (conn net.Conn, err error) {
	c, e := l.Listener.Accept()
	if e != nil {
		return nil, e
	}
	return &timeoutableConn{Conn: c, readTimeout: 5 * time.Second}, nil
}

func main() {
	var l net.Listener
	l, err := net.Listen("tcp", ":1234")
	if err != nil {
		log.Fatal(err)
	}

	if os.Getenv("NO_TIMEOUT") != "1" {
		l = &customListener{l}
	}

	_ = http.Serve(l, http.HandlerFunc(
		func(w http.ResponseWriter, req *http.Request) {
			buf := []byte(`127.0.0.1	localhost
127.0.1.1	backspace

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

172.31.59.24    ceph-1
172.31.57.119    ceph-2
172.31.57.137    ceph-3
172.31.63.24    ceph-4
172.31.63.204    ceph-5
172.31.51.188    ceph-6
172.31.54.140    ceph-7
172.31.49.145    ceph-8


`)
			w.Write(buf)
			w.(http.Flusher).Flush()
		}))
}
~ go run /tmp/t.go
~ for i in $(seq 1 10); do /usr/bin/time -f "%E" php7.2 http.php 1>/dev/null; done
0:00.03
0:00.01
0:00.01
0:00.01
0:00.01
0:00.01
0:00.01
0:00.01
0:00.01
0:00.01

So no hangs if that code is commented out. Add the code back.

~ for i in $(seq 1 10); do /usr/bin/time -f "%E" php7.2 http.php 1>/dev/null; done
0:05.02
0:00.04
0:00.01
0:05.01
0:05.04
0:05.01
0:00.01
0:00.01
0:05.01
0:05.01

So there is no issue in the code itself, I am not sure how are you testing it.

@av86743

This comment has been minimized.

Copy link

commented Sep 20, 2019

... I am not sure how are you testing it.

Run on playground.

@harshavardhana

This comment has been minimized.

Copy link
Contributor Author

commented Sep 20, 2019

Run on playground.

Playground doesn't run HTTP service @av86743 its a nacl container :-) any long-running go-routines such as HTTP service are rejected.

That is not how you would reproduce this issue IMHO, run it locally on your laptop.

@av86743

This comment has been minimized.

Copy link

commented Sep 20, 2019

Playground doesn't run HTTP service etc.

First: Why put your example in playground if it is not runnable from there?

Second: An original example from #21133 with your setting of port :1234 is runnable from playground all right.

Third: #21133 was closed without any intention to fix anything in net/http. Why do you want to present it differently as:

// QuirkConn - similar to golang net.Conn struct, but contains a workaround of the
// following the go bug reported here https://github.com/golang/go/issues/21133.
// Once the bug will be fixed, we can remove this structure and replaces it with
// the standard net.Conn
kannappanr added a commit to minio/minio that referenced this issue Sep 20, 2019
This commit fixes a bug introduced in af6c6a2.

Setting deadlines in Go results in arbitrary hangs as reported here
golang/go#34385

Fixes #7852
@harshavardhana

This comment has been minimized.

Copy link
Contributor Author

commented Sep 20, 2019

@av86743 because original was never addressed, what I wanted to shed light on is that the workaround provided was also not correct because.

It basically disabled SetReadDeadline() altogether and that exactly what shouldn't happen. Setting deadlines repeatedly shouldn't cause hangs in the first place and the workaround is also wrong because it basically just disabled the SetReadDeadline() code.

First: Why put your example in playground if it is not runnable from there?

This is an odd argument, the intention is to show code how to reproduce it. This issue also talks about how to reproduce it using the command line using PHP code. I don't know what made you think that you can run in the playground and also run PHP code to reproduce this.

Second: An original example from #21133 with your setting of port :1234 is runnable from playground all right.

The original example added client code, which I didn't want to add because you need to use PHP to reproduce this issue not Go HTTP client. Please follow what I pasted above on how to reproduce it.

@av86743

This comment has been minimized.

Copy link

commented Sep 20, 2019

Second: An original example from #21133 with your setting of port :1234 is runnable from playground all right.

This is not a question and as such, it does not require your response. This is statement of a fact contrary to what you have earlier stated (I quote):

Playground doesn't run HTTP service @av86743 its a nacl container :-) any long-running go-routines such as HTTP service are rejected.


... This issue also talks about how to reproduce it using the command line using PHP code. ... you need to use PHP to reproduce this issue not Go HTTP client. ...

If you are not able to reproduce whatever it is without involving PHP, why do you claim that problem is on the go side and not on the PHP side?

@networkimprov

This comment has been minimized.

Copy link

commented Sep 20, 2019

@harshavardhana It's documented in #21133 why your code doesn't work, and what the workaround is. If the workaround has problems, you should open a new issue with code that demonstrates them.

Note curl is a better test client; not that many folks have PHP installed.

P.S. you don't have to respond to every comment you get on an issue :-)

@harshavardhana

This comment has been minimized.

Copy link
Contributor Author

commented Sep 20, 2019

If you are not able to reproduce whatever it is without involving PHP, why do you claim that problem is on the go side and not on the PHP side?

Right now its just PHP - I haven't explored enough to show you if there are other clients out there which reproduce this issue - the reason is that without deadlines the code works fine. That makes me wonder that the issue is in Go - either deadlines don't work the way we want or we are using it wrong. Some explanation that would help in the documentation.

@harshavardhana It's documented in #21133 why your code doesn't work, and what the workaround is. If the workaround has problems, you should open a new issue with code that demonstrates them.

Fundamental reasoning is why does one need a workaround in the first place, the issue is closed as if the bug is in our custom net.Conn implementation which doesn't seem to be the case unless our usage of deadlines are dead wrong - if that is the case I am happy to change the implementation instead of applying a workaround. So any guidance on that area would be much appreciated.

@networkimprov

This comment has been minimized.

Copy link

commented Sep 20, 2019

EDIT: Please re-read #21133 (comment).

@av86743

This comment has been minimized.

Copy link

commented Sep 21, 2019

... without deadlines the code works fine. ... either deadlines don't work the way we want or we are using it wrong. ...

When read deadline is not set in customizedConn.Read, current implementation of net/http uses very short implicit deadline and time-outs almost instantly. What you call "code works fine" is actually "problems are not given sufficient time to manifest".

@harshavardhana

This comment has been minimized.

Copy link
Contributor Author

commented Sep 22, 2019

When read deadline is not set in customizedConn.Read, current implementation of net/http uses very short implicit deadline and time-outs almost instantly. What you call "code works fine" is actually "problems are not given sufficient time to manifest".

Okay, understood @av86743 - with that analogy it would certainly also mean that writing custom wrappers for net.Conn with deadlines is generally not advisable and can lead to issues - also come in the way.

Would it be possible to document such a thing? Because we shot ourselves in the foot without realizing this would come in the way as per net.Conn documentation.

@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.