Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrent calls return Scale codec error: No such variant in enum Phase #186

Closed
sander2 opened this issue Oct 26, 2020 · 13 comments · Fixed by #189
Closed

Concurrent calls return Scale codec error: No such variant in enum Phase #186

sander2 opened this issue Oct 26, 2020 · 13 comments · Fixed by #189
Labels
bug Something isn't working

Comments

@sander2
Copy link
Contributor

sander2 commented Oct 26, 2020

When two clients concurrently call the same RPC function, they both return XtError(Codec(Error("No such variant in enum Phase"))). I'm able to reliably reproduce this with the code below. The do_accept_replace function just calls an RPC function and prints the return value. When I uncomment the delay, both of these functions behave correctly.

let task1 = do_accept_replace(client1.clone());
let task2 = do_accept_replace(client2.clone());
tokio::spawn(async move {
    task1.await;
});
// tokio::time::delay_for(tokio::time::Duration::from_secs(10)).await;
tokio::spawn(async move {
    task2.await;
});

Versions from Cargo.toml:

sp-core = "2.0.0"
frame-system = "2.0.0"
substrate-subxt-proc-macro = { git = "https://github.com/paritytech/substrate-subxt", branch = "master" }
substrate-subxt = { git = "https://github.com/paritytech/substrate-subxt", branch = "master" }
substrate-subxt-client = "0.4.0"
@ascjones ascjones added the bug Something isn't working label Oct 27, 2020
@ascjones
Copy link
Contributor

Possibly an issue in the jsonrpsee Client

@gregdhill
Copy link
Contributor

Some relevant logs:

[2020-10-30T12:33:24Z INFO  substrate_subxt::rpc] Found block 0xa31009f676caa6036750a6da1b50321fced28a85194cec78003e61387db7ca5d, with 3 extrinsics
decoding event 0: "00000000000000482d7c0900000000020000000100000000010300028837e62c0000000000000000020000000705d43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d640000000000000000000000000000000000020000000a00d43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d640000000000000000000000000000000000020000000e038eaf04151687736326c9fea17e25fc5287613693c912909cb226aa4794f26a48a08601000000000000000000000000000000020000001000a73f5047d04a883795ab6bb0ea149cad5f50d7344bf19ec4803b074f6d17ee38d43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27da08601000000000000000000000000008eaf04151687736326c9fea17e25fc5287613693c912909cb226aa4794f26a48b348688cd069447fda142950c268126f7ee712da0000020000000000a08f593a00000000000000"
decoding event 1: "000100000000010300028837e62c0000000000000000020000000705d43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d640000000000000000000000000000000000020000000a00d43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d640000000000000000000000000000000000020000000e038eaf04151687736326c9fea17e25fc5287613693c912909cb226aa4794f26a48a08601000000000000000000000000000000020000001000a73f5047d04a883795ab6bb0ea149cad5f50d7344bf19ec4803b074f6d17ee38d43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27da08601000000000000000000000000008eaf04151687736326c9fea17e25fc5287613693c912909cb226aa4794f26a48b348688cd069447fda142950c268126f7ee712da0000020000000000a08f593a00000000000000"
decoding event 2: "8837e62c0000000000000000020000000705d43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d640000000000000000000000000000000000020000000a00d43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d640000000000000000000000000000000000020000000e038eaf04151687736326c9fea17e25fc5287613693c912909cb226aa4794f26a48a08601000000000000000000000000000000020000001000a73f5047d04a883795ab6bb0ea149cad5f50d7344bf19ec4803b074f6d17ee38d43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27da08601000000000000000000000000008eaf04151687736326c9fea17e25fc5287613693c912909cb226aa4794f26a48b348688cd069447fda142950c268126f7ee712da0000020000000000a08f593a00000000000000"
thread 'http.worker00' panicked at 'called `Result::unwrap()` on an `Err` value: XtError(Codec(Error("No such variant in enum Phase")))', vault/src/api.rs:135:10

The event record it fails to decode appears to be RequestIssueEvent - defined here which corresponds to this module (index 16).

@dvdplm
Copy link
Contributor

dvdplm commented Oct 30, 2020

cc @niklasad1

@gregdhill
Copy link
Contributor

At first I thought the issue was due to the missing Initialized field on Phase (substrate) but the module_variant this library would decode in this case is also wrong.

@niklasad1
Copy link
Member

niklasad1 commented Oct 30, 2020

It indeed sounds like a jsonrpsee bug but I can't really understand how it could happen because each request gets its own ID and each request should be handled separately by ID.

@sander2

The client(s) that you clone is that substrate_subxt::Client? Would be good if could elaborate in your example how you instantiate the client(s) it's not clear if it's the same client or separate clients are used and which transport you are using i.e, http or ws?

In addition, is it possible for you to fetch the log from the node that produced the block i.e., so we could diff the block/extrinsics from both client and server side in order to determine what's the underlying issue is?

@sander2
Copy link
Contributor Author

sander2 commented Oct 30, 2020

They are separate clients, using different signer accounts. Both use ws.

I uploaded logs for the parachain here and for the client here

@gregdhill
Copy link
Contributor

I was able to reproduce the issue here: https://gitlab.com/gregdhill/jsonrpsee-test

See the README for setup instructions.

@niklasad1
Copy link
Member

Ok, thanks for the info.

I could reproduce it, there's quite some queries and subscriptions that occur there. I'll let you know when I figure out what's going on.

@niklasad1
Copy link
Member

niklasad1 commented Nov 3, 2020

So, I actually managed to debug this:

From wireshark I could capture that the following rpc message was sent from the node (btc-parachain) to the client:

{"jsonrpc":"2.0","method":"state_storage","params":{"result":{"block":"0x159f94d7b0b0343b5c904e8f54c30f839068af7ebd2b309ba94e0c8e409a9899","changes":[["0x26aa394eea5630e07c48ae0c9558cef780d41e5e16056765bc8461851072c9d7","0x0c00000000000000482d7c0900000000020000000100000000010300035884723300000000000000000200000000010300035884723300000000000000"]]},"subscription":"a5PSaZKb3Df7CTmk"}}~º{"jsonrpc":"2.0","method":"author_extrinsicUpdate","params":{"result":{"inBlock":"0x159f94d7b0b0343b5c904e8f54c30f839068af7ebd2b309ba94e0c8e409a9899"},"subscription":"KVuXpY7bTO597oLy"}}

Then 84723300000000000000000200000000010300035884723300000000000000 fails to be decoded as Phase which is part of the changes: 0x0c00000000000000482d7c0900000000020000000100000000010300035884723300000000000000000200000000010300035884723300000000000000 but doesn't seem to jsonrpsee related at all.

[2020-11-03T13:29:24Z ERROR substrate_subxt::events] decode: 84723300000000000000000200000000010300035884723300000000000000
thread 'main[2020-11-03T13:29:24Z TRACE soketto::connection] 7b5b0a20: send: (Binary (fin 1) (rsv 000) (mask (1 d118843b)) (len 130))
' panicked at 'Error: Scale codec error: No such variant in enum Phase', src/main.rs:42:21

FWIW, you still get an error when only using one substrate_subxt::Client: Runtime error: Runtime module error: NonDefaultComposite from System but I guess that occurs later that's why it's only seen with one client.

//cc @ascjones I don't know really what's going here but sounds like a node/subxt-runtime issue to me, but maybe you do? :)

@ascjones
Copy link
Contributor

ascjones commented Nov 3, 2020

Thanks for the investigation @niklasad1, I'll take a look

@gregdhill
Copy link
Contributor

Is it possible that there is a bug in the encoding / decoding? I can see in the prior iteration of the decode_events loop (before failing to decode Phase) that we retrieve an ExtrinsicFailed event. I can then successfully manually decode this appended data as DispatchInfo - though that may just be luck.

@gregdhill
Copy link
Contributor

gregdhill commented Nov 5, 2020

@ascjones I think I see the problem. The library tries to decode a ExtrinsicFailedEvent but immediately returns the DispatchError which means it doesn't decode the DispatchInfo part of the event. Subsequently, since this data has not been read, decoding the next event fails. I'll work on a fix.

@ascjones
Copy link
Contributor

ascjones commented Nov 5, 2020

Awesome thanks @gregdhill!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants