"io.grpc.StatusRuntimeException: UNKNOWN: channel closed" on network disconnect #10120
Comments
I agree that disconnects are generally UNAVAILABLE. The problem in this case is "you shouldn't have seen that error." It could have been normal, or it could have been an internal error. 9bdb8f0 is working, and we have some stacktrace information to go on. But it is rather strange that we learn that the connection is closed when trying to write headers. You are using plaintext (no TLS)?
Yes, this is plaintext.
I think there's enough here to make educated guesses about ways to improve it. But it will be troublesome without a reproduction. It is timing-based, so a reproduction will be probabilistic. @cocreature, could you share a toxiproxy configuration that kills the connection? I'm not already familiar with it, but it seems great for trying to trigger this. Are you using
Let me describe our toxiproxy config as best I can: We're creating the proxy within our tests through Our client (which is throwing the error) connects to the listen address of the proxy. The actual server is behind the upstream address. To break the connection we use That calls https://github.com/shopify/toxiproxy#down. We don't use reset_peer.
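For anyone trying to reproduce this, here is a minimal sketch of taking a proxy "down" through toxiproxy's HTTP API rather than a client library. The comment above does not show the exact client calls, so this is not the reporter's code. It assumes the toxiproxy server exposes its API at the default `localhost:8474`, and the proxy name `grpc_server` is hypothetical. It only builds the request that sets `enabled` to `false`, so the connection drop can be timed by the test itself.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of toxiproxy or gRPC-Java.
final class ToxiproxyDown {
    // Assumption: toxiproxy's HTTP API is reachable at its default address.
    static final String DEFAULT_API = "http://localhost:8474";

    private ToxiproxyDown() {}

    // Builds (but does not send) the request that disables the named proxy.
    // Disabling a proxy closes its listener and existing connections,
    // which is what the "down" section of the toxiproxy README describes.
    static HttpRequest disableRequest(String apiBase, String proxyName) {
        return HttpRequest.newBuilder(URI.create(apiBase + "/proxies/" + proxyName))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{\"enabled\": false}"))
                .build();
    }
}
```

The request can be sent with `java.net.http.HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. Because the failure is timing-based, it would probably need to be fired repeatedly while RPCs are being started.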
Looks like
Getting same exception.
@marx-freedom, your issue seems unrelated to this. File a separate issue if you need to discuss it. In your case gRPC did know why the connection was closed: "Connection reset by peer."
What version of gRPC-Java are you using?
1.44.0
What is your environment?
Ubuntu 22.04
What did you expect to see?
An UNAVAILABLE status code or something similar
What did you see instead?
We saw things fail with this exception and stacktrace:
Interestingly, it does look like the channel recovered from this once the connection was established again.
Steps to reproduce the bug
In our test setup, we kill the connection with toxiproxy and then see this failure, but only relatively rarely. Unfortunately, I don't have a reliable reproduction (nor one that I can make public).
Is that expected? Given that it recovers, should we just retry on UNKNOWN: channel closed like we do on UNAVAILABLE?
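As an illustration of the workaround being asked about, here is a minimal sketch of a retry predicate that treats `UNKNOWN: channel closed` the same as `UNAVAILABLE`. The class `RetryClassifier` is hypothetical and not part of gRPC-Java. It takes the status code name and description as plain strings (for example from `Status.Code.name()` and `Status.getDescription()`) so it stays self-contained. Matching on the description text is an assumption based only on the message quoted in this issue, and the maintainers have not confirmed that retrying here is the recommended approach.

```java
import java.util.Locale;

// Hypothetical helper, not part of gRPC-Java.
final class RetryClassifier {
    private RetryClassifier() {}

    /**
     * Decides whether a failed call looks like a transient connection loss.
     *
     * @param codeName    status code name, e.g. "UNAVAILABLE" or "UNKNOWN"
     * @param description status description; may be null
     */
    static boolean isRetryable(String codeName, String description) {
        // UNAVAILABLE is the status normally expected for a disconnect.
        if ("UNAVAILABLE".equals(codeName)) {
            return true;
        }
        // Assumption: the description contains "channel closed", as in the
        // exception reported in this issue. Other UNKNOWN errors are not retried.
        if ("UNKNOWN".equals(codeName) && description != null) {
            return description.toLowerCase(Locale.ROOT).contains("channel closed");
        }
        return false;
    }
}
```

A caller would catch `StatusRuntimeException`, pass `e.getStatus().getCode().name()` and `e.getStatus().getDescription()` to `isRetryable`, and retry with backoff only for calls that are safe to repeat.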