Error { kind: Failed, description: "Received a new call on in-use question id 0" } #359
Comments
Is this a timeout deliberately triggered by you, for testing purposes? How are you causing the timeout? |
I have a sleep on the E component, and on the F side I'm using |
I have observed more CI runs that produce this same error. Usually the pattern is: F calls E and passes a capability |
I should add that I'm using |
Just adding a few more details. I sprinkled a few debug statements in This is a normal interaction that finishes successfully:
Now this is an interaction that includes a timeout:
When the timeout happens I reuse the runtime object to create and send a new request. It seems the other party (pycapnp) reuses the same question_id when calling us back on the same connection while a previous question is still in play, which I guess is a bug. What I'm trying to do now is destroy the current connection and set up a new one whenever a timeout happens. For this I'm using the disconnector object we can get from RpcSystem, but the RpcSystem future itself never returns, even after the disconnector finds the state as "Disconnected". I inspected the TaskSet and it appears to have in_progress Pending futures that never complete. I thought the disconnector would force the termination of whatever questions were still running. Is this not the case, @dwrensha? |
Yeah, that sounds like a bug. I expect that once the |
I'm trying to extract some code to replicate this issue, but it's not been easy. I think it's related to losing some packets due to the way I was reading the socket on the Python side. Meanwhile, by manually dropping some capnp responses on the Python side I was able to trigger a panic on the Rust side:
Apparently the |
Ok, I finally tracked this down. The The issue with the So, I guess it was all my fault all along. 😅 |
I've eliminated those panics on the Rust side: 6087383 |
Hello.
I'm seeing an error happen sporadically when I run my tests on CI, and I've been trying to reproduce it locally with little luck.
Basically I have two components: F (capnproto-rust) and E (pycapnp). F calls E and passes it, among other things, a capability that E can use to call F back. The test where this error happens is one where the F->E call times out and needs to be retried. RPC calls start throwing the error "Error { kind: Failed, description: "Received a new call on in-use question id 0" }", and shortly after the RpcSystem promise also returns with the same error.
I know I don't have a lot of information to give, but I was wondering if someone could explain in which circumstances I can obtain the above error, so that I can reproduce it more easily. It seems like a bug in the rpc library that should be fixed. From looking at the code, I get the impression that what causes the error is the E->F call, even though I only notice the issue afterwards, when I try to make more RPC calls from F.
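For readers trying to reason about when this error can appear, here is a minimal sketch of the condition the message describes. It is not the capnp-rpc implementation: the AnswerTable type and its on_call and on_finish methods are hypothetical names made up for illustration. The only assumptions are that the receiving side tracks incoming calls by question id until the caller finishes them, and that a second call arriving on an id that is still tracked is rejected.

```rust
use std::collections::HashMap;

/// Hypothetical model of the receiver's bookkeeping for incoming calls.
/// This is an illustration only, not the actual capnp-rpc data structure.
#[derive(Default)]
pub struct AnswerTable {
    // Maps a question id to the (illustrative) name of the call still in flight.
    in_flight: HashMap<u32, &'static str>,
}

impl AnswerTable {
    /// Record an incoming call. Fails if the peer reuses an id that has not
    /// been finished yet, mirroring the error text from this issue.
    pub fn on_call(&mut self, question_id: u32, method: &'static str) -> Result<(), String> {
        if self.in_flight.contains_key(&question_id) {
            return Err(format!(
                "Received a new call on in-use question id {}",
                question_id
            ));
        }
        self.in_flight.insert(question_id, method);
        Ok(())
    }

    /// The caller signalled that it is done with this question,
    /// so the id becomes available for reuse.
    pub fn on_finish(&mut self, question_id: u32) {
        self.in_flight.remove(&question_id);
    }
}
```

Under this model, a peer that sends a second call with id 0 before finishing the first one gets the error, while the same sequence with a finish in between succeeds. Messages that are dropped or read incorrectly on one side could plausibly put the two ends out of sync about which ids are still in use.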