Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade shutting down early for http2 connections but not http1.1 #12

Closed
chrispassas opened this issue Oct 22, 2018 · 10 comments
Closed
Labels
question Further information is requested

Comments

@chrispassas
Copy link

I'm running a web process on FreeBSD. Using curl to connect over http2 when I call SIGHUP on the process it shuts down and drops any connections without waiting for them to complete.

If I do the same but use the curl flag '--http1.1' the running connections complete before shutdown is called.

Any idea why http2 would not wait for the connections to complete? While http1.1 connections would?

Thank you,

@chrispassas
Copy link
Author

For what its worth I tested using a golang client and could not reproduce the issue. So many its something to do with how curl negotiates the connection?

@lmb
Copy link
Contributor

lmb commented Oct 23, 2018

What do you mean when you say "drops connections"? Command line output would be useful.

A quick google points towards http/2 having support for a message which indicates that a connection should not be used anymore (GOAWAY frame). HTTP/1 doesn't have such a thing (sadly) so might take longer to close.

@lmb lmb added the question Further information is requested label Oct 25, 2018
@chrispassas
Copy link
Author

I've got a simple example that reproduces something similar.

  1. Build and start go process
  2. curl -k https://localhost:8080/test
  3. kill -s sighup PROCESSID

This is attempting to start a request that will take 20 seconds. While that request is running we run a sighup to trigger the upgrade. If this is done over http2/ssl using curl it interrupts the curl to localhost:8080/test.

If you do this without TLS or without curl it works.

I've only tried it using curl and Go as the clients. Go worked but curl did not.

curl error: "curl: (56) Unexpected EOF"

My curl version if it matters

curl -V
curl 7.54.0 (x86_64-apple-darwin18.0) libcurl/7.54.0 LibreSSL/2.6.4 zlib/1.2.11 nghttp2/1.24.1
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz HTTP2 UnixSockets HTTPS-proxy

`

var pidFile = "pid"
var listenAddr = "localhost:8080"

flag.Parse()
log.SetPrefix(fmt.Sprintf("%d ", os.Getpid()))

upg, err := tableflip.New(tableflip.Options{
	PIDFile: pidFile,
})
if err != nil {
	panic(err)
}
defer upg.Stop()

// Do an upgrade on SIGHUP
go func() {
	sig := make(chan os.Signal, 1)
	signal.Notify(sig, syscall.SIGHUP)
	for range sig {
		err := upg.Upgrade()
		if err != nil {
			log.Println("Upgrade failed:", err)
		}
	}
}()

ln, err := upg.Fds.Listen("tcp", listenAddr)
if err != nil {
	log.Fatalln("Can't listen:", err)
}

server := http.Server{
	// Set timeouts, etc.
}

http.HandleFunc("/test", func(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "Hello\n")
	time.Sleep(20 * time.Second)
})

log.Printf("pid:%d", os.Getpid())

go func() {
	err := server.ServeTLS(ln, "/Users/cpassas/go/src/silkproxy_releases/http.crt", "/Users/cpassas/go/src/silkproxy_releases/http.key")
	if err != http.ErrServerClosed {
		log.Println("HTTP server:", err)
	}
}()

log.Printf("ready")
if err := upg.Ready(); err != nil {
	panic(err)
}
<-upg.Exit()

// Make sure to set a deadline on exiting the process
// after upg.Exit() is closed. No new upgrades can be
// performed if the parent doesn't exit.
time.AfterFunc(30*time.Second, func() {
	log.Println("Graceful shutdown timed out")
	os.Exit(1)
})

// Wait for connections to drain.
server.Shutdown(context.Background())

`

@lmb
Copy link
Contributor

lmb commented Oct 26, 2018

I see the same behaviour as well, on Go 1.11.

[  0.006] Connected
The negotiated protocol: h2
[  0.022] recv SETTINGS frame <length=24, flags=0x00, stream_id=0>
          (niv=4)
          [SETTINGS_MAX_FRAME_SIZE(0x05):1048576]
          [SETTINGS_MAX_CONCURRENT_STREAMS(0x03):250]
          [SETTINGS_MAX_HEADER_LIST_SIZE(0x06):1048896]
          [SETTINGS_INITIAL_WINDOW_SIZE(0x04):1048576]
[  0.022] send SETTINGS frame <length=12, flags=0x00, stream_id=0>
          (niv=2)
          [SETTINGS_MAX_CONCURRENT_STREAMS(0x03):100]
          [SETTINGS_INITIAL_WINDOW_SIZE(0x04):65535]
[  0.022] send SETTINGS frame <length=0, flags=0x01, stream_id=0>
          ; ACK
          (niv=0)
[  0.022] send PRIORITY frame <length=5, flags=0x00, stream_id=3>
          (dep_stream_id=0, weight=201, exclusive=0)
[  0.022] send PRIORITY frame <length=5, flags=0x00, stream_id=5>
          (dep_stream_id=0, weight=101, exclusive=0)
[  0.022] send PRIORITY frame <length=5, flags=0x00, stream_id=7>
          (dep_stream_id=0, weight=1, exclusive=0)
[  0.022] send PRIORITY frame <length=5, flags=0x00, stream_id=9>
          (dep_stream_id=7, weight=1, exclusive=0)
[  0.022] send PRIORITY frame <length=5, flags=0x00, stream_id=11>
          (dep_stream_id=3, weight=1, exclusive=0)
[  0.022] send HEADERS frame <length=43, flags=0x25, stream_id=13>
          ; END_STREAM | END_HEADERS | PRIORITY
          (padlen=0, dep_stream_id=11, weight=16, exclusive=0)
          ; Open new stream
          :method: GET
          :path: /test
          :scheme: https
          :authority: localhost:8080
          accept: */*
          accept-encoding: gzip, deflate
          user-agent: nghttp2/1.30.0
[  0.022] recv WINDOW_UPDATE frame <length=4, flags=0x00, stream_id=0>
          (window_size_increment=983041)
[  0.022] recv SETTINGS frame <length=0, flags=0x01, stream_id=0>
          ; ACK
          (niv=0)

I think the server should send a GOAWAY frame in this case. Adding a sleep after Shutdown() has returned does not fix this, so it's not just a case of not waiting long enough.

@chrispassas
Copy link
Author

@lmb do you believe this might be a bug in tableflip? Or something that can be addressed in the example code the readme provides?

@lmb
Copy link
Contributor

lmb commented Oct 29, 2018

I don't think this is a tableflip bug, mainly because there is nothing specific to http2 in the library. I've run the test case a few more times, most of the time I get a GOAWAY, but sometimes the connection just drops. This seems like a race condition in net/http2 of some sort. I need to find a reliable way to reproduce this to figure out what is going on.

@chrispassas
Copy link
Author

It may not be a tableflip bug but the example uses the http service. I think most would infer that exact code will work.

Maybe the example can be updated so that it will work with HTTP 2 but I’m not sure if that’s possible.

It does limit the usefulness if this can’t be used with go HTTP processes especially since you can’t force it to use HTTP 1 all the time.

@lmb
Copy link
Contributor

lmb commented Oct 29, 2018

I've repeated the test using curl and snooped the traffic using wireshark. Curl says:

$ SSLKEYLOGFILE=tlskeys curl -kv --http2-prior-knowledge https://localhost:8080/test
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=AU; ST=Some-State; O=Internet Widgits Pty Ltd; CN=localhost
*  start date: Oct 26 14:59:20 2018 GMT
*  expire date: Oct 26 14:59:20 2019 GMT
*  issuer: C=AU; ST=Some-State; O=Internet Widgits Pty Ltd; CN=localhost
*  SSL certificate verify result: self signed certificate (18), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55b2823fc8e0)
> GET /test HTTP/2
> Host: localhost:8080
> User-Agent: curl/7.58.0
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
* TLSv1.2 (IN), TLS alert, Client hello (1):
* Unexpected EOF
* Connection #0 to host localhost left intact
curl: (56) Unexpected EOF

From wireshark it's clear that GOAWAY is sent:

HyperText Transfer Protocol 2
    Stream: GOAWAY, Stream ID: 0, Length 8
        Length: 8
        Type: GOAWAY (7)
        Flags: 0x00
        0... .... .... .... .... .... .... .... = Reserved: 0x0
        .000 0000 0000 0000 0000 0000 0000 0000 = Stream Identifier: 0
        0... .... .... .... .... .... .... .... = Reserved: 0x0
        .000 0000 0000 0000 0000 0000 0000 0001 = Promised-Stream-ID: 1
        Error: NO_ERROR (0)

Seems like this is a bug in curl, which was fixed in curl/curl#2510 / 7.60. Go net/http in combination with tableflip is doing the right thing here.

Re-reading your first comment it seems like you expect Go to wait 20s until exiting, the same as with HTTP/1?

@chrispassas
Copy link
Author

My example code the 20 seconds is just to simulate an HTTP request doing some work and finishing even after the sighup signal.

If this is just a curl bug that would be great. I was just testing tableflip using curl and ran into this issue. Similar to you I tried my test using the Go http client and could not reproduce the problem.

My local curl version is "curl 7.54.0" so it would fall into the version affected by the bug you posted.

I think we can close this unless someone comes up with another way to reproduce using a newer curl or go client. Agree?

@lmb
Copy link
Contributor

lmb commented Oct 30, 2018

Well, I think your test points out an interesting difference between H1 and H2. H1 will block until the handler has exited, while H2 "aborts" the handler. This is potentially a bug in the go standard library, do you want to raise an issue with them?

In the meantime I'll close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants