
Improved light client requests networking protocol #9

Status: Open · wants to merge 5 commits into base `main`
Conversation

tomaka
Contributor

@tomaka tomaka commented Jul 19, 2023

Rendered

This is a continuation of w3f/PPPs#10

Compared to w3f/PPPs#10, I've changed my mind about truncated proofs. I think that it's better if the client can fully verify the proofs that are returned even when the response size limit would be exceeded. This also means that the proof no longer needs to be deterministic, and I've removed the paragraph about this.

cc @cheme

@tomaka
Contributor Author

tomaka commented Aug 22, 2023

@cheme Do you have any opinion? I think that you and I are pretty much the only people who have opinions/interest on this topic.

Implementers of the replier side should be careful to detect early on when a reply would exceed the maximum reply size, rather than unconditionally generating the full reply, as doing so could consume a very large amount of CPU, disk I/O, and memory. Existing implementations might currently be accidentally protected from such an attack by the fact that requests have a maximum size, and thus that the list of keys in a query is bounded. After this proposal, this accidental protection would no longer exist.
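As a hedged illustration of the point above (all names here are hypothetical, not part of the proposal), a replier could track a running size budget while walking the trie and stop as soon as the next node would overflow it, instead of materializing the whole reply first:

```rust
/// Hypothetical trie node produced while walking the trie for a proof.
struct ProofNode {
    encoded: Vec<u8>,
}

/// Assembles at most `max_reply_size` bytes of proof nodes, stopping early
/// instead of generating the whole (possibly huge) reply up front.
fn build_bounded_proof(
    nodes: impl Iterator<Item = ProofNode>,
    max_reply_size: usize,
) -> Vec<ProofNode> {
    let mut total = 0usize;
    let mut out = Vec::new();
    for node in nodes {
        // Abort before the budget is exceeded; the querier is expected to
        // re-request whatever items are missing from the truncated proof.
        if total + node.encoded.len() > max_reply_size {
            break;
        }
        total += node.encoded.len();
        out.push(node);
    }
    out
}
```

The key property is that the cost paid by the server is proportional to what it actually sends, not to what the full reply would have been.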

Malicious server nodes might truncate Merkle proofs even when they don't strictly need to, and it is not possible for the client to (easily) detect this situation. However, malicious server nodes can already do undesirable things such as throttle down their upload bandwidth or simply not respond. There is no need to handle unnecessarily truncated Merkle proofs any differently than a server simply not answering the request.


Agree with the remark about malicious nodes.
One thing the client can detect as malicious is a truncated proof being restarted at the wrong node level (e.g. if the proof is nodes a-b-c and the first reply is a-b, then a second reply of b-c is malicious, since it should start at c).
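A hedged sketch of such a check (hypothetical names; string labels stand in for actual trie nodes): a continuation reply whose first node was already received is restarting at the wrong level, and the client can reject it without knowing the full proof:

```rust
/// Returns true if `reply` is a bogus continuation of a truncated proof,
/// i.e. it re-sends a node the client already received (or is empty).
/// E.g. after receiving a-b out of the proof a-b-c, a reply of b-c is
/// invalid: a valid continuation must begin at c.
fn resumes_at_wrong_level(already_received: &[&str], reply: &[&str]) -> bool {
    reply
        .first()
        .map_or(true, |node| already_received.contains(node))
}
```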


This protocol keeps the same maximum response size limit as currently exists (16 MiB). It is not possible for the querier to know in advance whether its query will lead to a reply that exceeds the maximum size. If the reply is too large, the replier should send back only a limited number (but at least one) of requested items in the proof. The querier should then send additional requests for the rest of the items. A response containing none of the requested items is invalid.

The server is allowed to silently discard some keys of the request if it judges that the number of requested keys is too high. This is in line with the fact that the server might truncate the response.
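The querier's side of the retry logic described above could be sketched as follows (a minimal sketch with hypothetical types; `send_request` stands in for the actual networking call). A response that answers none of the pending keys is rejected as invalid, per the text above:

```rust
use std::collections::HashSet;

/// Hypothetical response: the subset of requested keys the proof answers.
struct ProofResponse {
    answered_keys: Vec<Vec<u8>>,
}

/// Keeps re-requesting until every key has been answered. Returns the number
/// of round trips, or an error if a response answers none of the pending
/// keys (such a response is invalid).
fn fetch_all(
    mut pending: HashSet<Vec<u8>>,
    mut send_request: impl FnMut(&HashSet<Vec<u8>>) -> ProofResponse,
) -> Result<usize, &'static str> {
    let mut round_trips = 0;
    while !pending.is_empty() {
        let response = send_request(&pending);
        round_trips += 1;
        let before = pending.len();
        for key in &response.answered_keys {
            pending.remove(key);
        }
        // Each round trip must make progress on at least one requested item.
        if pending.len() == before {
            return Err("response contains none of the requested items");
        }
    }
    Ok(round_trips)
}
```

Because each response must contain at least one requested item, this loop is guaranteed to terminate after at most one round trip per key.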

In my implementation I do not allow restarting a proof from the wrong trie node. E.g. we query k1, k2 and k3; the proof for all three is nodes a-b-c-d-e, while the proof for k1 and k3 alone is only a-b-d-e. If the first reply is a-b we get k1, but if the second reply is d-e we don't have k2, and verification fails.
So the only valid second reply is one starting with node c, followed by any useful subsequent nodes.
So I'm not really sure whether the intention was to allow discarding a key in the middle of a query, but I think it would be simpler to word it as "allowed to discard all following keys of the request and terminate the reply".

For the purpose of this networking protocol, it should be considered as if the main trie contained an entry for each default child trie whose key is `concat(":child_storage:default:", child_trie_hash)` and whose value is equal to the trie root hash of that default child trie. This behavior is consistent with what the host functions observe when querying the storage. This behavior is present in the existing networking protocol; in other words, this proposal doesn't change the existing situation, but it is worth mentioning.
Also note that child tries aren't considered as descendants of the main trie when it comes to the `includeDescendants` flag. In other words, if the request concerns the main trie, no content coming from child tries is ever sent back.
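As a small illustration of the key format described above (the function name is made up for this sketch), the implied main-trie entry for a default child trie can be built as:

```rust
/// Builds the implied main-trie key for a default child trie:
/// `concat(":child_storage:default:", child_trie_hash)`.
fn default_child_trie_key(child_trie_hash: &[u8]) -> Vec<u8> {
    let mut key = b":child_storage:default:".to_vec();
    key.extend_from_slice(child_trie_hash);
    key
}
```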

This protocol keeps the same maximum response size limit as currently exists (16 MiB). It is not possible for the querier to know in advance whether its query will lead to a reply that exceeds the maximum size. If the reply is too large, the replier should send back only a limited number (but at least one) of requested items in the proof. The querier should then send additional requests for the rest of the items. A response containing none of the requested items is invalid.

Here, in my implementation the replier is allowed to return a single trie node even if it does not answer any requested key (e.g. a big branch node), as long as it makes the query reply progress.
(I can implement the constraint at the Substrate level though; at the trie level it makes sense to me to allow a chunk containing any non-empty trie content that participates in the query's progress.)
