
gRpc "context canceled" error closed stream connection between client and server leading to lost data #3039

Closed
khsahaji opened this issue Sep 25, 2019 · 8 comments

Comments

@khsahaji

The gRPC stream server gets a "context canceled" rpc error in the streamServerInstance.Recv() call, which leads to the stream connection being dropped between the server and client. On the client side I am using streamClientInstance.Send("data"), which is losing data because it does not get the error back, as the stream is already closed. Is this a bug, or am I missing something? This happens when I send a burst of data over the stream connection, usually after approximately 1:30 minutes, or roughly between 40k and 60k messages sent through the stream. The context was not created with the WithCancel option on the client side.

@khsahaji
Author

khsahaji commented Sep 25, 2019

Client Side

ctx := context.Background()

stream, err := client.RaiseEventStream(ctx)
if err != nil {
	log.Println("Error while calling RaiseEventStream.", client, err)
	conn.Close()
	time.Sleep(time.Duration(retryTimeSec) * time.Second)
	return nil, nil, err
}

for {
	if err := stream.Send("data"); err != nil {
		log.Println("Send failed:", err)
		break
	}
}

Server side

for {
	inRequest, err := stream.Recv()
	if err == io.EOF {
		tracelog.Info("efsmonitor", "getVnfDetails", "Reached EOF")
		return nil
	}
	if err != nil {
		// <-- this error is getting hit with message
		// "rpc error: code = Canceled desc = context canceled".
		tracelog.Errorf(err, "eventreceiver", "RaiseEventStream", "Error in stream:")
		return err
	}
}

@khsahaji khsahaji changed the title gRpc context cancel error closed stream connection between client and server leading to lost adta gRpc "context canceled" error closed stream connection between client and server leading to lost adta Sep 25, 2019
@khsahaji
Author

khsahaji commented Sep 25, 2019

This is related to this issue -
#2159
which is closed, so I cannot comment there.
This comment section is from stream.go code
// Listen on cc and stream contexts to cleanup when the user closes the
// ClientConn or cancels the stream context. In all other cases, an error
// should already be injected into the recv buffer by the transport, which
// the client will eventually receive, and then we will cancel the stream's
// context in clientStream.finish.
This says the context will be canceled for any error at the transport level. That is what is happening in my case, and it is leading to data loss.
Since stream.Send() buffers the data in an internal buffer, as explained in the link above, I have no indication on the client side of this failure. If I check the context for an error before every stream.Send() call, will that guarantee no data loss?

@khsahaji
Author

@dfawley Can you help here ?

@khsahaji khsahaji changed the title gRpc "context canceled" error closed stream connection between client and server leading to lost adta gRpc "context canceled" error closed stream connection between client and server leading to lost data Sep 25, 2019
@dfawley
Member

dfawley commented Sep 25, 2019

Since stream.Send() buffers the data in internal buffer as explained in the link above I have no indication at client side about this failure. If I check the context for error before stream.Send() is called every time will it guarantee no data loss ?

You can't know whether the server has processed the data sent by the client until the client observes a successful stream end. I.e. stream.Recv() is called until it gets an io.EOF error. Any other non-nil error means the server may not have received all the data the client attempted to send. The client should retry the RPC in this case.

For long-lived streams, you could build ACKs into your protocol, but this is not something supported by grpc itself.

Hope that helps.

@khsahaji
Author

Hi dfawley,
this definitely helps. I have a few questions about the implementation, though. Since the client is waiting on stream.Recv() (a blocking call), the server has to call SendAndClose() after it has received the data in its own Recv() loop, right? Also, does Recv() block indefinitely? What happens if the server never receives the data, i.e. it is also waiting on its own Recv() call, or it has called SendAndClose() but the response is lost in transport due to an issue? Does Recv() have a timeout of some sort? If I have to implement a timeout for Recv(), how do I unblock the Recv() call on the client side?

@dfawley
Member

dfawley commented Sep 26, 2019

When the server's method handler returns, the RPC ends and any client Recv calls will unblock. These will return io.EOF or the RPC status error at that time.

If the connection between the client and server is lost, the transport should eventually fail all its streams, or the deadline of the RPC will be reached, and Recv will also return in those scenarios. There are no timeouts for individual operations on a stream. To help detect the situation where the network has disconnected faster, you can set keepalive on the client (or server), e.g.: https://godoc.org/google.golang.org/grpc#WithKeepaliveParams.
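As a sketch of the keepalive suggestion above (a config fragment; addr, the insecure credentials, and the interval values are placeholders, not recommendations):

```go
import (
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

// Dial with client-side keepalive so a dead network is detected sooner.
conn, err := grpc.Dial(addr,
	grpc.WithInsecure(), // placeholder; use real transport credentials
	grpc.WithKeepaliveParams(keepalive.ClientParameters{
		Time:                30 * time.Second, // ping the server when idle this long
		Timeout:             10 * time.Second, // wait this long for a ping ack
		PermitWithoutStream: true,             // ping even with no active RPCs
	}),
)
```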

@dfawley
Member

dfawley commented Sep 26, 2019

There is a way of doing per-operation timeouts if you wish (killing the stream if the timeout is reached). See #1229 (comment) for an example. This is written assuming server-side but you can do something similar on the client side and cancel the stream's context when the per-operation deadline is reached.

@khsahaji
Author

Thanks a lot for the explanation and guidance. This really clarifies some of the doubts I was having. I shall try it, and I will come back and comment here if I get stuck anywhere.

@dfawley dfawley closed this as completed Sep 27, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Mar 27, 2020