Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

non-go client failed to receive "a lot of messages" from go server #1262

Closed
olomix opened this issue May 26, 2017 · 9 comments
Closed

non-go client failed to receive "a lot of messages" from go server #1262

olomix opened this issue May 26, 2017 · 9 comments

Comments

@olomix
Copy link

olomix commented May 26, 2017

If you stream a lot of messages fast (around 30000 messages) from go server to python or js client, then python client is terminating with error

Traceback (most recent call last):
  File "pysrc/main.py", line 18, in <module>
    main()
  File "pysrc/main.py", line 13, in main
    for i in response:
  File "/Users/alek/Documents/src/grpc-bug/.pyenv/lib/python2.7/site-packages/grpc/_channel.py", line 366, in next
    return self._next()
  File "/Users/alek/Documents/src/grpc-bug/.pyenv/lib/python2.7/site-packages/grpc/_channel.py", line 357, in _next
    raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, Failed parsing HTTP/2)>
make: *** [py-client] Error 1

OS is macOS 10.12.5.
go version go1.8.3 darwin/amd64
I have put a demo of client/server that implements this bug to https://github.com/olomix/grpc-bug/
To run server: make go-server && ./main
To run client: make py-client

If server/client is GO/GO, then the same implementation works fine. If the client/server is python/python, it also works fine.

If in go server implementation put time.Sleep(100*time.Microseconds) between each send, then python also will complete receive successfully.

@dfawley
Copy link
Member

dfawley commented May 26, 2017

@mehrdada - I wonder if this is the same root cause as #1192? Slowing down sending could mean it's an issue with flow control.

@mehrdada
Copy link
Member

@dfawley I agree. It looks very much so.

@olomix Can you please run the python server like the following and then run the go client and see what the error message it prints out?

GRPC_TRACE=all GRPC_VERBOSITY=ERROR python client.py

(if you don't see anything relevant, try GRPC_VERBOSITY=INFO or DEBUG as well.)

In particular, does it say something along the lines of "flow control of X incoming window of Y"?

@olomix
Copy link
Author

olomix commented May 27, 2017

@mehrdada

run the python server like the following and then run the go client

you mean go server & python client.

Sure, I've uploaded logs (error, info & debug) from python client to google drive https://drive.google.com/open?id=0B-irbkAqk9i7YmVkSy1BTW91TlU

Unfortunately those envs do not affect go server. All logs I can see from go server is via x/net/trace package

2017/05/27 06:11:10.723813	10.635269	/Greeter/SayHello
06:11:10.723899	 .    86	... RPC: from [::1]:58962 deadline:none
06:11:10.724006	 .   107	... recv: name:"you"
06:11:10.724162	 .   156	... sent: message:"i=0"
06:11:10.724189	 .    27	... sent: message:"i=1"
06:11:21.358981	 .    18	... (32622 events discarded)
06:11:21.358999	 .    19	... sent: message:"i=32624"
06:11:21.359043	 .    44	... sent: message:"i=32625"
06:11:21.359065	 .    21	... rpc error: code = Canceled desc = context canceled
06:11:21.359078	 .    14	... context canceled
06:11:21.359081	 .     3	... rpc error: code = Canceled desc = context canceled

@mehrdada mehrdada self-assigned this May 30, 2017
@mehrdada
Copy link
Member

@olomix Thanks for the repro case and logs. The symptoms indeed look very similar to #1192.

@mehrdada
Copy link
Member

I managed to narrow the initial revision where the repro case seems to be broken down to commit grpc/grpc@80ccebb on the C implementation. (PR: grpc/grpc#9230), which, as expected, is a flow control-related change. I used to think that the go implementation is doing something wrong, however, based on my observations, while still inconclusive, I am more inclined to point to the C direction at this point.

I'll look a bit more and probably will end up filing a bug on grpc/grpc C implementation and closing this one.

@mehrdada
Copy link
Member

mehrdada commented May 30, 2017

This patch on the C core (simply removing 1 + from https://github.com/grpc/grpc/blob/master/src/core/ext/transport/chttp2/transport/chttp2_transport.c#L2297) seems to work around the problem (though I can't reproduce it on current HEAD—i.e. it may be already fixed—and it might not be a fix for the root cause)...

@olomix
Copy link
Author

olomix commented May 30, 2017

I have implemented a client based on grpc-java for my go-server. It works fine. You are right. Seems only c-based grpc implementations are affected.

@mehrdada
Copy link
Member

I am now quite certain this is the same issue and I'm closing this bug as a duplicate of #1192. @olomix please hop on that thread to track the status.

@mehrdada
Copy link
Member

The issue seems to be caused by mishandling of flow control arithmetic due to an integer overflow. #1268 fixes this.

@lock lock bot locked as resolved and limited conversation to collaborators Sep 26, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants