
Migrate chunk state witness pre validation logic from orphan witness handler to partial witness actor #11193

Closed
5 tasks done
Tracked by #46
shreyan-gupta opened this issue May 1, 2024 · 8 comments

@shreyan-gupta
Contributor

shreyan-gupta commented May 1, 2024

The first point of contact a node has with the network layer when receiving the chunk state witness is crucial for doing proper validation.

Before the introduction of partial state witnesses, the first point of contact was the ChunkStateWitnessMessage from the network adapter. After the introduction of partial witnesses, however, we have to shift all of the validation functionality to the partial_witness_actor.

This mainly involves shifting the validation defined in partially_validate_state_witness and handle_orphan_state_witness in Client.

The first part of this has already been implemented in validate_partial_encoded_state_witness, but we need eyes on the rest of the validation.

@shreyan-gupta shreyan-gupta added the A-stateless-validation Area: stateless validation label May 1, 2024
@shreyan-gupta shreyan-gupta self-assigned this May 1, 2024
@shreyan-gupta
Contributor Author

@pugachAG @jancionear have previously worked on the validation side.

@walnut-the-cat
Contributor

Currently, both @pugachAG and @shreyan-gupta are listed as assignees of this task. Please update it accordingly as we start working on this.

@pugachAG
Contributor

pugachAG commented May 6, 2024

We need to do the following:

  1. Reject partial witnesses for chunks before the last final block:
     • not sure LRU cache cleanup is required; it is probably OK to just let those entries get evicted as we process new witnesses
     • requires access to the chain head from the partial witness actor
  2. Keep track of already processed witnesses and reject their witness parts: similar to feat: avoid processing state witness multiple times #11111
  3. Ensure that the chunk witness corresponds to the metadata from PartialEncodedStateWitnessInner: validate epoch_id, shard_id and height_created.
  4. Add EpochId as part of the parts_cache key to have (ShardId, BlockHeight, EpochId). (A sketch of checks 3 and 4 follows below.)
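
For illustration, a minimal sketch of checks 3 and 4 in Rust. The type definitions and field names here are assumptions; the real PartialEncodedStateWitnessInner in nearcore may differ.

```rust
// Assumed type definitions for illustration only.
type ShardId = u64;
type BlockHeight = u64;

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct EpochId([u8; 32]);

struct PartialEncodedStateWitnessInner {
    epoch_id: EpochId,
    shard_id: ShardId,
    height_created: BlockHeight,
    // ... part data, encoded length, signature, etc.
}

// Check 4: extend the parts_cache key with EpochId so witnesses for the same
// (shard, height) on forks across an epoch boundary do not collide.
type PartsCacheKey = (ShardId, BlockHeight, EpochId);

fn parts_cache_key(inner: &PartialEncodedStateWitnessInner) -> PartsCacheKey {
    (inner.shard_id, inner.height_created, inner.epoch_id)
}

// Check 3: after decoding, the witness must match the metadata its parts were
// tracked under. The `decoded_*` arguments stand in for fields of the decoded
// ChunkStateWitness.
fn witness_matches_metadata(
    inner: &PartialEncodedStateWitnessInner,
    decoded_epoch_id: EpochId,
    decoded_shard_id: ShardId,
    decoded_height_created: BlockHeight,
) -> bool {
    inner.epoch_id == decoded_epoch_id
        && inner.shard_id == decoded_shard_id
        && inner.height_created == decoded_height_created
}
```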

@walnut-the-cat
Contributor

Adding an arbitrary timeline for now. Please adjust it with the correct estimate.

@shreyan-gupta
Contributor Author

Notes from a conversation Anton and I had offline:

  • We need to add an additional check for the size of the partial witness. That would look something along the lines of max_state_witness_size * encoding_size_increase_factor / num_validators, approx 10 MB * 1.25 / ~50 validators (see the sketch below).
  • We need epoch_id in the LRU cache key of the partial state witness tracker, as we can potentially have valid (height, shard_id) pairs in different epochs based on forks and where the epoch boundary is.
  • Reading the chain head directly from RocksDB shouldn't be too bad. This is a better alternative to passing it around and fetching it from the client.
  • The set of checks we need to have in place should be at a level similar to those we have for chunk distribution, in order to avoid over-engineering things.
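
A minimal sketch of the size check from the first bullet, assuming illustrative constant names and values (the actual limits and configuration in nearcore may differ):

```rust
// Assumed values for illustration only; not the actual nearcore configuration.
const MAX_STATE_WITNESS_SIZE: usize = 10 * 1024 * 1024; // ~10 MB witness size cap
const ENCODING_SIZE_INCREASE_FACTOR: f64 = 1.25; // Reed-Solomon encoding overhead

/// Upper bound on the size of a single partial witness part, given the number
/// of chunk validators the encoded witness is split across.
fn max_part_size(num_validators: usize) -> usize {
    ((MAX_STATE_WITNESS_SIZE as f64 * ENCODING_SIZE_INCREASE_FACTOR)
        / num_validators as f64)
        .ceil() as usize
}

/// Reject a partial witness part that exceeds the per-part bound,
/// e.g. ~10 MB * 1.25 / 50 validators ≈ 256 KiB per part.
fn is_part_size_valid(part_len: usize, num_validators: usize) -> bool {
    num_validators > 0 && part_len <= max_part_size(num_validators)
}
```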

@shreyan-gupta
Contributor Author

> keep track of the already processed witnesses and reject witness parts: similar to #11111

Quick comment on this: I think one good way of ensuring this would be to have a separate LRU cache that just keeps track of "seen" or "observed" witnesses.

Currently we have the parts_cache in partial_witness_tracker, which is potentially exposed to overflow and cache eviction. In the current implementation, we mark an entry as is_decoded once we have enough parts and send the decoded state witness to the client.

I'm slightly concerned about what would happen if this cache entry gets evicted and a malicious chunk producer sends over all the parts again. In such a case we would have no memory of having already processed this witness.

A quick solution is to have a tiny LRU cache for all witnesses we have decoded and check against it first (a sketch follows below).
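
A minimal sketch of that idea, assuming the `lru` crate and an illustrative key type and capacity:

```rust
use std::num::NonZeroUsize;
use lru::LruCache;

// Illustrative key; in practice this would likely be the same
// (ShardId, BlockHeight, EpochId) tuple used by parts_cache.
type WitnessKey = (u64, u64);

struct DecodedWitnessTracker {
    // Small, separate cache that only remembers which witnesses were already
    // decoded, so it survives eviction pressure on the larger parts_cache.
    seen: LruCache<WitnessKey, ()>,
}

impl DecodedWitnessTracker {
    fn new() -> Self {
        Self { seen: LruCache::new(NonZeroUsize::new(100).unwrap()) }
    }

    /// Returns true if this witness was already decoded and forwarded to the
    /// client, so any further parts for it can be rejected early.
    fn is_already_processed(&self, key: &WitnessKey) -> bool {
        self.seen.contains(key)
    }

    fn mark_processed(&mut self, key: WitnessKey) {
        self.seen.put(key, ());
    }
}
```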

@shreyan-gupta
Contributor Author

Potential discussion point: is it fine to have hard checks based on "distance from head"? For example, can we safely say, "we should not be processing any state witnesses that are a distance of 50 away from our current head"?

Can there be issues with respect to skipped blocks etc.?

@pugachAG
Contributor

pugachAG commented May 8, 2024

After looking into this for quite some time now, I think it makes sense to consolidate partial witness tracking with the orphan pool. All of the validation logic from the orphan pool is applicable when validating witness parts. This can be implemented by adding an actix message sent from the Client to the PartialWitnessActor notifying it about each processed block. This way the partial witness tracker doesn't immediately send the witness for processing when the prev block is not present, but retains it until the corresponding block is received from the Client.
Also, I think we can safely restrict the range of considered state witness heights to (last_final_height..chain_head_height + 5] (the same as the front horizon MAX_HEIGHTS_AHEAD constant from chunk_cache.rs). A sketch of this range check follows below.
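
A minimal sketch of that range check, with the constant value assumed from the description above:

```rust
// Mirrors the front horizon MAX_HEIGHTS_AHEAD idea from chunk_cache.rs;
// the value here is assumed for illustration.
const MAX_HEIGHTS_AHEAD: u64 = 5;

/// Accept a witness only if its height lies in
/// (last_final_height, chain_head_height + MAX_HEIGHTS_AHEAD].
fn is_witness_height_in_range(
    witness_height: u64,
    last_final_height: u64,
    chain_head_height: u64,
) -> bool {
    witness_height > last_final_height
        && witness_height <= chain_head_height + MAX_HEIGHTS_AHEAD
}
```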
