
Improved light client requests networking protocol #9

Status: Open · wants to merge 5 commits into base `main`
Conversation

tomaka
Contributor

@tomaka tomaka commented Jul 19, 2023

Rendered

This is a continuation of w3f/PPPs#10

Compared to w3f/PPPs#10, I've changed my mind about truncated proofs. I think that it's better if the client can fully verify the proofs that are returned even when the response size limit would be exceeded. This also means that the proof no longer needs to be deterministic, and I've removed the paragraph about this.

cc @cheme

@tomaka
Contributor Author

tomaka commented Aug 22, 2023

@cheme Do you have any opinion? I think that you and I are pretty much the only people who have opinions/interest on this topic.

Implementers of the replier side should be careful to detect early on when a reply would exceed the maximum reply size, rather than unconditionally generating the full reply, as doing so could consume a very large amount of CPU, disk I/O, and memory. Existing implementations might currently be accidentally protected from such an attack by the fact that requests have a maximum size, and thus that the list of keys in a query is bounded. After this proposal, this accidental protection would no longer exist.
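As a hedged illustration of the point above (all names here are hypothetical, not part of the proposal), a replier could track a running size budget while walking the trie and stop as soon as the next node would overflow it, instead of materializing the whole reply first:

```rust
/// Hypothetical trie node produced while walking the trie for a proof.
struct ProofNode {
    encoded: Vec<u8>,
}

/// Assembles at most `max_reply_size` bytes of proof nodes, stopping early
/// instead of generating the whole (possibly huge) reply up front.
fn build_bounded_proof(
    nodes: impl Iterator<Item = ProofNode>,
    max_reply_size: usize,
) -> Vec<ProofNode> {
    let mut total = 0usize;
    let mut out = Vec::new();
    for node in nodes {
        // Abort before the budget is exceeded; the querier is expected to
        // re-request whatever items are missing from the truncated proof.
        if total + node.encoded.len() > max_reply_size {
            break;
        }
        total += node.encoded.len();
        out.push(node);
    }
    out
}
```

The key property is that the cost paid by the server is proportional to what it actually sends, not to what the full reply would have been.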

Malicious server nodes might truncate Merkle proofs even when they don't strictly need to, and it is not possible for the client to (easily) detect this situation. However, malicious server nodes can already do undesirable things such as throttle down their upload bandwidth or simply not respond. There is no need to handle unnecessarily truncated Merkle proofs any differently than a server simply not answering the request.


Agree with the remark about malicious nodes.
One thing the client can detect as malicious is a truncated proof being restarted at the wrong node level (e.g. if the proof is nodes a-b-c and the first reply is a-b, then a second reply of b-c is malicious, since it should start at c).
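A hedged sketch of such a check (hypothetical names; string labels stand in for actual trie nodes): a continuation reply whose first node was already received is restarting at the wrong level, and the client can reject it without knowing the full proof:

```rust
/// Returns true if `reply` is a bogus continuation of a truncated proof,
/// i.e. it re-sends a node the client already received (or is empty).
/// E.g. after receiving a-b out of the proof a-b-c, a reply of b-c is
/// invalid: a valid continuation must begin at c.
fn resumes_at_wrong_level(already_received: &[&str], reply: &[&str]) -> bool {
    reply
        .first()
        .map_or(true, |node| already_received.contains(node))
}
```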


This protocol keeps the same maximum response size limit as currently exists (16 MiB). It is not possible for the querier to know in advance whether its query will lead to a reply that exceeds the maximum size. If the reply is too large, the replier should send back only a limited number (but at least one) of requested items in the proof. The querier should then send additional requests for the rest of the items. A response containing none of the requested items is invalid.

The server is allowed to silently discard some keys of the request if it judges that the number of requested keys is too high. This is in line with the fact that the server might truncate the response.
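The querier's side of the retry logic described above could be sketched as follows (a minimal sketch with hypothetical types; `send_request` stands in for the actual networking call). A response that answers none of the pending keys is rejected as invalid, per the text above:

```rust
use std::collections::HashSet;

/// Hypothetical response: the subset of requested keys the proof answers.
struct ProofResponse {
    answered_keys: Vec<Vec<u8>>,
}

/// Keeps re-requesting until every key has been answered. Returns the number
/// of round trips, or an error if a response answers none of the pending
/// keys (such a response is invalid).
fn fetch_all(
    mut pending: HashSet<Vec<u8>>,
    mut send_request: impl FnMut(&HashSet<Vec<u8>>) -> ProofResponse,
) -> Result<usize, &'static str> {
    let mut round_trips = 0;
    while !pending.is_empty() {
        let response = send_request(&pending);
        round_trips += 1;
        let before = pending.len();
        for key in &response.answered_keys {
            pending.remove(key);
        }
        // Each round trip must make progress on at least one requested item.
        if pending.len() == before {
            return Err("response contains none of the requested items");
        }
    }
    Ok(round_trips)
}
```

Because each response must contain at least one requested item, this loop is guaranteed to terminate after at most one round trip per key.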

In my implementation I do not allow restarting a proof from the wrong trie node. E.g. we query k1, k2 and k3; the proof for all three is nodes a-b-c-d-e, while the proof for k1 and k3 alone is only a-b-d-e. If the first reply is a-b we get k1, but if the second reply is d-e we don't have k2, and verification fails.
So the only valid second reply is one starting with node c, followed by any useful subsequent nodes.
So I'm not really sure whether the intention was to allow discarding a key in the middle of a query, but I think it would be simpler to word it as "allowed to discard all following keys of the request and terminate the reply".

For the purpose of this networking protocol, it should be considered as if the main trie contained an entry for each default child trie whose key is `concat(":child_storage:default:", child_trie_hash)` and whose value is equal to the trie root hash of that default child trie. This behavior is consistent with what the host functions observe when querying the storage. This behavior is present in the existing networking protocol; in other words, this proposal doesn't change the existing situation, but it is worth mentioning.
Also note that child tries aren't considered as descendants of the main trie when it comes to the `includeDescendants` flag. In other words, if the request concerns the main trie, no content coming from child tries is ever sent back.
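As a small illustration of the key format described above (the function name is made up for this sketch), the implied main-trie entry for a default child trie can be built as:

```rust
/// Builds the implied main-trie key for a default child trie:
/// `concat(":child_storage:default:", child_trie_hash)`.
fn default_child_trie_key(child_trie_hash: &[u8]) -> Vec<u8> {
    let mut key = b":child_storage:default:".to_vec();
    key.extend_from_slice(child_trie_hash);
    key
}
```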

This protocol keeps the same maximum response size limit as currently exists (16 MiB). It is not possible for the querier to know in advance whether its query will lead to a reply that exceeds the maximum size. If the reply is too large, the replier should send back only a limited number (but at least one) of requested items in the proof. The querier should then send additional requests for the rest of the items. A response containing none of the requested items is invalid.

Here, in my implementation the replier is allowed to return a single trie node even if it does not answer any requested key (e.g. a big branch node), as long as it makes the query reply progress.
(I can implement the constraint at the Substrate level though; at the trie level it makes sense to me to allow a chunk containing any non-empty trie content that participates in the query's progress.)
