Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/http2: Connection reset incorrectly in high load performance test #26956

Open
crowfrog opened this issue Aug 13, 2018 · 8 comments
Open
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@crowfrog
Copy link

What version of Go are you using (go version)?

golang:1.10.0 linux

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

container: rhel
kubenet:

What did you do?

func (sc *serverConn) wroteFrame(res frameWriteResult) {
...
			sc.closeStream(st, errHandlerComplete)                  <=== _code1_
		}
...
	wr.replyToWriter(res.err)                                               <=== _code2_
...
}

In some cases, the server goroutine will be switch out execution context between code1 and code2.
If the handler goroutine switch back to execution context in these cases, that will make writeDataFromHandler() (writeHeaders() also?) failed and set responseWriterState to dirty.

func (sc *serverConn) writeDataFromHandler(stream *stream, data []byte, endStream bool) error {
...
	case <-stream.cw:
		// If both ch and stream.cw were ready (as might
		// happen on the final Write after an http.Handler
		// ends), prefer the write result. Otherwise this
		// might just be us successfully closing the stream.
		// The writeFrameAsync and serve goroutines guarantee
		// that the ch send will happen before the stream.cw
		// close.
		select {
		case err = <-ch:
			frameWriteDone = true
		default:
			return errStreamClosed
		}
	}
...
}
func (rws *responseWriterState) writeChunk(p []byte) (n int, err error) {
...
		if err := rws.conn.writeDataFromHandler(rws.stream, p, endStream); err != nil {
			rws.dirty = true
			return 0, err
		}
...
}

This issue can be reproduced easily if add sc.logf() between code1 and code2 in lower load performace test.
PS: I think in wroteFrame(), we shouldn'd call "wr.replyToWriter(res.err)" when closing stream with error. It will cause writeDataFromHandler() exit with no error. But in fact this stream/responseWriterState is already on incorrect status.

What did you expect to see?

no connection reset in load test

What did you see instead?

connecttion reset and traffic failed beofre new connection set up

@crowfrog crowfrog changed the title x/net/http2: responseWriterState set to dirty incorrectly in high load performance test x/net/http2: connection reset incorrectly in high load performance test Aug 13, 2018
@crowfrog crowfrog changed the title x/net/http2: connection reset incorrectly in high load performance test x/net/http2: Connection reset incorrectly in high load performance test Aug 13, 2018
@andybons
Copy link
Member

@bradfitz @tombergan

@andybons andybons added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 13, 2018
@andybons andybons added this to the Unplanned milestone Aug 13, 2018
@bradfitz
Copy link
Contributor

Got a repro?

@bradfitz
Copy link
Contributor

/cc @fraenkel

@fraenkel
Copy link
Contributor

I have tried a variety of things with h2load as the client and a server that returns 100 - 16k bytes of data in the response. I have not been able to cause any connection reset issue regardless of debug on the server side to cause significant slow downs.

@crowfrog
Copy link
Author

hi Fraenkel,

This issue can be reproduced easily if add sc.logf() between code1 and code2 in lower load performace test.
You need lots of streams, which with less data to transport, in one connection to reproduce it.
Such as, one connection, 3000 or more streams per second created, transported then closed.

B.R

@fraenkel
Copy link
Contributor

fraenkel commented Nov 3, 2018

@crowfrog I have added sc.logf() in multiple places between your code1 and code2 blocks. I still don't hit any error when driving with h2load.
It shows that I am achieving 7150 req/s.

I have also tried with and without debug to slow things down but that too doesn't cause any error.

@crowfrog
Copy link
Author

@fraenkel Does 7150 req/s belong to one connections to golang http2 server? Does the 7150 response/s have some contant in body?

@fraenkel
Copy link
Contributor

@crowfrog Yes. There were over 10k streams created. I tried smallish and larger payloads but all constant sizes.
If you can capture http2debug, that might prove helpful to narrow down the issue since I cannot seem to do it locally on the latest code base.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants