Timing issue with /v3alpha/lease/keepalive that is trivially reproducible with telnet #8237

Closed
jamesyonan opened this issue Jul 9, 2017 · 22 comments

@jamesyonan

I'm trying to stream lease keepalives via grpc-gateway. When I send a full HTTP header before each request, it works fine. However, if I try to stream the keepalive messages inside a single HTTP request by sending a newline-delimited JSON message every few seconds inside an HTTP chunk, etcd silently ignores the streamed messages (after the first one in the request) and expires the lease.

Not sure if I'm doing this right, though I did see a line in the docs indicating that grpc-gateway supports newline-delimited JSON streaming. I had no problem streaming replies (for example with /v3alpha/watch), but I can't seem to figure out how to stream requests.

Thanks for any guidance.

etcd Version: 3.2.0+git

POST /v3alpha/lease/keepalive HTTP/1.1
Host: 127.0.0.1
User-Agent: Me
Content-Type: application/json
Transfer-Encoding: chunked
Connection: keep-alive
Accept: */*

1d
{"ID":"7587823380406781187"}

1d
{"ID":"7587823380406781187"}

1d
{"ID":"7587823380406781187"}

. . .

Thanks,
James

@jamesyonan
Author

jamesyonan commented Jul 10, 2017

Looking into this further, it seems that etcd/grpc-gateway may have a timing/buffering issue with streamed requests.

This is very easy to reproduce with telnet.

If I paste 10 keepalive messages at once (as chunked, newline-delimited JSON) as a request into the telnet session, I get the expected 10 replies:

$ telnet 127.0.0.1 2379
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
POST /v3alpha/lease/keepalive HTTP/1.1
Host: 127.0.0.1
Transfer-Encoding: chunked

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

0

HTTP/1.1 200 OK
Content-Type: application/json
Date: Mon, 10 Jul 2017 07:47:19 GMT
Transfer-Encoding: chunked

9c
{"result":{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"286","raft_term":"42"},"ID":"7587823380406781187"}}

9c
{"result":{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"286","raft_term":"42"},"ID":"7587823380406781187"}}

9c
{"result":{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"286","raft_term":"42"},"ID":"7587823380406781187"}}

9c
{"result":{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"286","raft_term":"42"},"ID":"7587823380406781187"}}

9c
{"result":{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"286","raft_term":"42"},"ID":"7587823380406781187"}}

9c
{"result":{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"286","raft_term":"42"},"ID":"7587823380406781187"}}

9c
{"result":{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"286","raft_term":"42"},"ID":"7587823380406781187"}}

9c
{"result":{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"286","raft_term":"42"},"ID":"7587823380406781187"}}

9c
{"result":{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"286","raft_term":"42"},"ID":"7587823380406781187"}}

9c
{"result":{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"286","raft_term":"42"},"ID":"7587823380406781187"}}

0

Now retry the experiment, but instead of pasting the entire request at once, paste the individual chunks/messages into the telnet session one by one, for example one chunk per second.

Now the reply only acknowledges 2 of the messages, even though 10 messages were sent.

$ telnet 127.0.0.1 2379
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
POST /v3alpha/lease/keepalive HTTP/1.1
Host: 127.0.0.1
Transfer-Encoding: chunked

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

1e
{"ID":"7587823380406781187"}

0

HTTP/1.1 200 OK
Content-Type: application/json
Date: Mon, 10 Jul 2017 07:57:18 GMT
Transfer-Encoding: chunked

9c
{"result":{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"286","raft_term":"42"},"ID":"7587823380406781187"}}

9c
{"result":{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"286","raft_term":"42"},"ID":"7587823380406781187"}}

0

Thanks,
James

@jamesyonan jamesyonan changed the title Trying to stream lease keepalives via grpc-gateway Timing issue with /v3alpha/lease/keepalive that is trivially reproducible with telnet Jul 10, 2017
@heyitsanthony
Contributor

Can repro with this script:

#!/bin/bash

set -euo pipefail

function keepalive {
        echo -e -n "POST /v3alpha/lease/keepalive HTTP/1.1\x0d\x0a"
        echo -e -n "Host: 127.0.0.1\x0d\x0a"
        echo -e -n "Transfer-Encoding: chunked\x0d\x0a"
        echo -e -n "\x0d\x0a"
        msg="{\"ID\":$id}"
        len=`printf "%x" $(echo -e -n "$msg" | wc -c)`
        for a in `seq 1 10`; do
                echo -e -n "$len\x0d\x0a"
                echo -e -n "$msg\x0d\x0a"
                sleep 0.1s
        done
        echo -e -n "0\x0d\x0a"
        echo -e -n "\x0d\x0a"
}

resp=`curl -s -l http://localhost:2379/v3alpha/lease/grant -XPOST -d'{"TTL": 120}'`
id=`echo $resp | tr ',' '\n' | grep ID | cut -f2 -d':'`
echo "got id $id..."
keepalive $id | nc localhost 2379

The output is:

HTTP/1.1 200 OK
Content-Type: application/json
Date: Tue, 11 Jul 2017 21:09:17 GMT
Transfer-Encoding: chunked

a7
{"result":{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"13","raft_term":"12"},"ID":"7587823422460437052","TTL":"120"}}

a7
{"result":{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"13","raft_term":"12"},"ID":"7587823422460437052","TTL":"120"}}

89
{"error":{"grpc_code":1,"http_code":408,"message":"rpc error: code = Canceled desc = context canceled","http_status":"Request Timeout"}}

0

So a context is getting canceled somewhere in the server and prematurely closing the stream.

@hexfusion
Contributor

@jamesyonan, @heyitsanthony I would be interested in taking a stab at resolving this unless you were planning on it.

@yudai
Contributor

yudai commented Jul 12, 2017

I think I know the cause. grpc-gateway doesn't support real bidirectional streams, as mentioned in its README.

In the etcd process, when you run the script provided by @heyitsanthony above, you can see log lines like the one below complaining about a closed body.

 etcdserver/api/v3rpc: Failed to decode request: http: invalid Read on closed Body

This error is raised at:

https://github.com/coreos/etcd/blob/master/etcdserver/etcdserverpb/gw/rpc.pb.gw.go#L189

The problem is that this code tries to read the request body multiple times, even after the handler has started sending the response. However, the documentation of the http package states that this is not supported (https://golang.org/pkg/net/http/#ResponseWriter):

        // Depending on the HTTP protocol version and the client, calling
        // Write or WriteHeader may prevent future reads on the
        // Request.Body. For HTTP/1.x requests, handlers should read any
        // needed request body data before writing the response. Once the
        // headers have been flushed (due to either an explicit Flusher.Flush
        // call or writing enough data to trigger a flush), the request body
        // may be unavailable. For HTTP/2 requests, the Go HTTP server permits
        // handlers to continue to read the request body while concurrently
        // writing the response. However, such behavior may not be supported
        // by all HTTP/2 clients. Handlers should read before writing if
        // possible to maximize compatibility.

So if you Write or Flush once, you can no longer read the body.

The first Flush happens here:
https://github.com/grpc-ecosystem/grpc-gateway/blob/master/runtime/handler.go#L41

Once the first Flush() has happened, the next Read fails.
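
For illustration, here is a minimal, self-contained Go sketch (not etcd or grpc-gateway code) of the pattern in question: a handler that keeps reading a streamed request body after it has already written and flushed part of the response. Per the net/http documentation quoted above, on HTTP/1.x the later reads may fail with the same "http: invalid Read on closed Body" error.

package main

import (
	"bufio"
	"fmt"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/stream", func(w http.ResponseWriter, r *http.Request) {
		flusher, _ := w.(http.Flusher)
		scanner := bufio.NewScanner(r.Body)
		for scanner.Scan() {
			// Echo each newline-delimited message, then flush so the client
			// sees it immediately (the per-message flush grpc-gateway performs).
			fmt.Fprintf(w, "ack: %s\n", scanner.Text())
			if flusher != nil {
				flusher.Flush()
			}
		}
		// After the first flush, further reads of r.Body may fail on HTTP/1.x
		// with "http: invalid Read on closed Body", the error seen above.
		if err := scanner.Err(); err != nil {
			log.Printf("request body read error: %v", err)
		}
	})
	log.Fatal(http.ListenAndServe("127.0.0.1:8080", nil))
}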

I removed all Write and Flush calls on the ResponseWriter from the handler, and it then seems to work without any error; you can see the responses in the stdout of the etcd process, like this:

https://gist.github.com/yudai/fa4d66dd1ae66caf65e8c09eeb2be6b9

@yudai
Contributor

yudai commented Jul 12, 2017

The error itself is defined here:
https://github.com/golang/go/blob/a48998beb571ce03650da75326a052991134e5ca/src/net/http/transfer.go#L759-L763

// ErrBodyReadAfterClose is returned when reading a Request or Response
// Body after the body has been closed. This typically happens when the body is
// read after an HTTP Handler calls WriteHeader or Write on its
// ResponseWriter.
var ErrBodyReadAfterClose = errors.New("http: invalid Read on closed Body")

And here is an issue about that:
golang/go#4637

We could use https://github.com/tmc/grpc-websocket-proxy, as mentioned in grpc-ecosystem/grpc-gateway#170 (comment).
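
As a rough sketch of the idea (not etcd's actual integration; service registration on the mux is omitted and the address is arbitrary), the proxy could be wired in front of the grpc-gateway mux like this:

package main

import (
	"log"
	"net/http"

	"github.com/grpc-ecosystem/grpc-gateway/runtime"
	"github.com/tmc/grpc-websocket-proxy/wsproxy"
)

func main() {
	// gwmux is the grpc-gateway mux; in etcd it would serve the /v3alpha/*
	// routes (gRPC service registration omitted in this sketch).
	gwmux := runtime.NewServeMux()

	// WebsocketProxy upgrades websocket requests and bridges websocket
	// messages to the request/response streams of the wrapped handler,
	// giving clients bidirectional streaming over one connection; plain
	// HTTP requests pass through to gwmux unchanged.
	handler := wsproxy.WebsocketProxy(gwmux)

	log.Fatal(http.ListenAndServe("127.0.0.1:8080", handler))
}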

@heyitsanthony
Contributor

The websocket proxy looks like an OK fix since it's all opt-in.

@yudai
Contributor

yudai commented Jul 13, 2017

Created a PR to give it a try.

https://github.com/lafikl/telsocket is an easy way to test the websocket endpoint.
You can try it out like this:

#!/bin/bash

set -euo pipefail

function keepalive {
        msg="{\"ID\":$1}"
        for a in `seq 1 10`; do
                echo -e -n "$msg\x0d\x0a"
                sleep 0.1s
        done
}

resp=`curl -s -l http://localhost:2379/v3alpha/lease/grant -XPOST -d'{"TTL": 120}'`
id=`echo $resp | tr ',' '\n' | grep ID | cut -f2 -d':'`
echo "got id $id..."
keepalive "$id" | telsocket -url ws://127.0.0.1:2379/v3alpha/lease/keepalive

It seems to work fine.

@yudai
Contributor

yudai commented Jul 13, 2017

I'm still not sure whether users will actually want to use websockets, since they require non-traditional HTTP requests, so I'd love to hear opinions from @hexfusion as well.
Another option could be hijacking requests ourselves.
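
For illustration only, here is a hypothetical Go sketch (not proposed etcd code) of what hijacking would mean, using net/http's Hijacker: the handler takes over the raw connection and can then interleave reads and writes freely, at the cost of producing the HTTP framing itself.

package main

import (
	"log"
	"net/http"
)

// hijackHandler illustrates the "hijack the request ourselves" idea: take over
// the raw connection so the handler, not net/http's response machinery,
// decides when to read from and write to the stream.
func hijackHandler(w http.ResponseWriter, r *http.Request) {
	hj, ok := w.(http.Hijacker)
	if !ok {
		http.Error(w, "hijacking not supported", http.StatusInternalServerError)
		return
	}
	conn, rw, err := hj.Hijack()
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	defer conn.Close()

	// From here on the handler owns the connection and must write the HTTP
	// response (status line, chunked framing, ...) itself, but it can freely
	// interleave reads of incoming chunks (rw.Reader) with writes (rw.Writer).
	rw.WriteString("HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n")
	rw.Flush()
}

func main() {
	http.HandleFunc("/stream", hijackHandler)
	log.Fatal(http.ListenAndServe("127.0.0.1:8080", nil))
}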

@hexfusion
Contributor

hexfusion commented Jul 16, 2017

@yudai thank you for the details here, great stuff! FWIW I would like to try to support this directly via grpc-gateway if possible as well. When you removed the manual Flush calls, the gateway fell back to the internal "flush after chunk" handling of "net/http", which, as you noted, works in this circumstance. But these manual Flush calls are required to allow, for example, Watch streams to work. So, as an opt-in mechanism, I was thinking of passing a custom header key to facilitate disabling the manual Flush. See details:

grpc-ecosystem/grpc-gateway#435

The upside would be no code changes to etcd if the idea were accepted.
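
Purely as an illustration of the opt-in idea (the header name and wrapper below are invented for this sketch and are not what the linked PR implements), suppressing manual flushes behind a client-supplied header could look roughly like this:

package main

import (
	"log"
	"net/http"
)

// noFlushWriter wraps a ResponseWriter and turns explicit Flush calls into
// no-ops, so the response falls back to net/http's internal buffering.
type noFlushWriter struct {
	http.ResponseWriter
}

func (noFlushWriter) Flush() {}

// withOptionalFlush is hypothetical middleware: when the client sends the
// invented X-No-Flush header, manual flushing by the wrapped handler is
// suppressed; otherwise behaviour is unchanged.
func withOptionalFlush(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("X-No-Flush") != "" {
			w = noFlushWriter{w}
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	streaming := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Placeholder for a streaming handler such as the gateway's keepalive route.
		w.Write([]byte("ok\n"))
	})
	log.Fatal(http.ListenAndServe("127.0.0.1:8080", withOptionalFlush(streaming)))
}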

@yudai
Contributor

yudai commented Jul 17, 2017

@hexfusion Hi, thanks for the comment.
I have a question about how to deal with the buffer size. Even when you don't flush manually, once the internal buffer gets full, its contents are sent to the client. Your client could send some KeepAlive messages until the buffer is full, but after that, wouldn't you get the same error?

@hexfusion
Contributor

hexfusion commented Jul 18, 2017

@yudai yes, you are right, it could be either much more complicated or much simpler. I need to test this more completely, but look at what happens when you pass the Connection: close header. https://tools.ietf.org/html/rfc2616 has a few references to this. More testing is welcome :)

#!/bin/bash

set -euo pipefail

function keepalive {
        echo -e -n "POST /v3alpha/lease/keepalive HTTP/1.1\x0d\x0a"
        echo -e -n "Host: 127.0.0.1\x0d\x0a"
        echo -e -n "Transfer-Encoding: chunked\x0d\x0a"
        echo -e -n "Connection: close\x0d\x0a"
        echo -e -n "\x0d\x0a"
        msg="{\"ID\":$id}"
        len=`printf "%x" $(echo -e -n "$msg" | wc -c)`
        for a in `seq 1 10`; do
                echo -e -n "$len\x0d\x0a"
                echo -e -n "$msg\x0d\x0a"
                sleep 0.1s
        done
        echo -e -n "0\x0d\x0a"
        echo -e -n "\x0d\x0a"
}

resp=`curl -s -l http://localhost:2379/v3alpha/lease/grant -XPOST -d'{"TTL": 120}'`
id=`echo $resp | tr ',' '\n' | grep ID | cut -f2 -d':'`
echo "got id $id..."
keepalive $id | nc localhost 2379

I ran 1000 iterations without error.

@heyitsanthony heyitsanthony added this to the unplanned milestone Jul 18, 2017
@yudai
Contributor

yudai commented Jul 20, 2017

@hexfusion I'm guessing that if you change "seq 1 10" to "seq 1 10000" in the loop, you would get a context error?

@jamesyonan
Author

@yudai, @hexfusion, thanks for looking into this.

@yudai, the websocket streaming support looks promising; I've moved over to using it instead of long-polling standard HTTP requests/replies -- it seems like a cleaner approach that offers true bidirectional streaming. Hope to see this merged.

@hexfusion
Contributor

@yudai yeah, this is a good example of where grpc-gateway might not be a viable, complete solution.

@yudai
Contributor

yudai commented Aug 9, 2017

I'll update the PR when I have time, hopefully next week.

@hexfusion
Contributor

hexfusion commented Aug 15, 2017

@yudai I ran this for 10000 iterations without failure; did your testing produce a failure? I am going to try a more robust test.


#!/bin/bash

set -euo pipefail

function keepalive {
        echo -e -n "POST /v3alpha/lease/keepalive HTTP/1.1\x0d\x0a"
        echo -e -n "Host: 127.0.0.1\x0d\x0a"
        echo -e -n "Transfer-Encoding: chunked\x0d\x0a"
        echo -e -n "Connection: close\x0d\x0a"
        echo -e -n "\x0d\x0a"
        msg="{\"ID\":$id}"
        len=`printf "%x" $(echo -e -n "$msg" | wc -c)`
        for a in `seq 1 10000`; do
                echo -e -n "$len\x0d\x0a"
                echo -e -n "$msg\x0d\x0a"
                sleep 0.1s
        done
        echo -e -n "0\x0d\x0a"
        echo -e -n "\x0d\x0a"
}

resp=`curl -s -l http://localhost:2379/v3alpha/lease/grant -XPOST -d'{"TTL": 120}'`
id=`echo $resp | tr ',' '\n' | grep ID | cut -f2 -d':'`
echo "got id $id..."
keepalive $id | nc localhost 2379

This is using etcd master.

@hexfusion
Contributor

hexfusion commented Aug 16, 2017

Just FYI, the 80k-loop test went fine; now going to randomize the sleep for a 100k run, then add this to the documentation with an e2e test. It's not pretty but could be worse, requiring only a single extra header in curl. This is not meant to take away from websocket streaming, which is a great idea on its own, but is rather a workaround for grpc-gateway that is already implemented.

@yudai
Contributor

yudai commented Aug 16, 2017

@hexfusion The script above did not fail in my environment either.

@jamesyonan
Author

@yudai -- hope you can merge the websocket patch. It's still working well for us.

Thanks,
James

@xiang90
Contributor

xiang90 commented Oct 5, 2017

@yudai what is the latest progress on this one? Hopefully we can get this issue resolved and promote the HTTP gateway proxy from alpha to beta.

@yudai
Contributor

yudai commented Oct 5, 2017

Sorry for the delay. I updated the PR, and it should be ready to merge (aside from tests, as mentioned in the PR).

@xiang90
Contributor

xiang90 commented Oct 5, 2017

fixed by #8257

@xiang90 xiang90 closed this as completed Oct 5, 2017