Support read quorum for P2P #332

Closed
7 of 8 tasks
ghost opened this issue May 19, 2022 · 3 comments · Fixed by #369
Labels
core team Assigned to the core team feature New feature request P2P Involve P2P networking

Comments

@ghost

ghost commented May 19, 2022

Problem to solve

In a distributed system where several replicas share the same information, we want reads to return up-to-date data and avoid stale data as much as possible.

The current implementation queries the nearest node most of the time, but this can cause consistency issues and return inaccurate data.

Solution

First proposal

We can implement a read quorum policy: query a few nodes and expect a set of them to respond according to the policy.

In distributed systems, the usual quorum rule is: reads + writes > replicas

Where:

  • reads is the number of nodes which reply to the read request
  • writes is the number of nodes which confirm the write
  • replicas is the total number of replicas

In Archethic's case, the writes are the confirmations that come from the atomic commitment, seen from a validation node's point of view, where the beacon chain and the client receive a replication attestation (a subtree of the replication tree).

We can then deduce the number of reads from the election algorithm of the replicas and the validation nodes: reads > replicas - writes
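
As a minimal sketch of the quorum rule above (reads + writes > replicas), the smallest number of read replies can be computed as follows. The function name `min_read_quorum` is hypothetical and not part of the Archethic codebase.

```python
def min_read_quorum(replicas: int, writes: int) -> int:
    """Smallest number of read replies satisfying reads + writes > replicas."""
    if not 1 <= writes <= replicas:
        raise ValueError("writes must be between 1 and replicas")
    # reads > replicas - writes, so the smallest integer is replicas - writes + 1
    return replicas - writes + 1


# Example: 10 replicas and 7 write confirmations need 4 read replies,
# because any 4 readers must overlap with at least one of the 7 writers.
print(min_read_quorum(10, 7))  # prints 4
```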

Sometimes we also want to apply acceptance rules from the client request, so we should support extensibility by taking an acceptance function for a set of replies (the winner can be the longest chain, or the latest update).
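
A pluggable acceptance function could look like the sketch below. It is only an illustration: the reply shape (a dict with `chain_length` and `timestamp` keys) and the names `longest_chain`, `latest_update` and `resolve` are assumptions made for this example, not existing APIs.

```python
from typing import Callable, Sequence

# A reply is modeled as a dict; "chain_length" and "timestamp" are
# hypothetical field names chosen for this example.
Reply = dict
AcceptanceFn = Callable[[Sequence[Reply]], Reply]


def longest_chain(replies: Sequence[Reply]) -> Reply:
    """Acceptance rule: the reply with the longest chain wins."""
    return max(replies, key=lambda r: r["chain_length"])


def latest_update(replies: Sequence[Reply]) -> Reply:
    """Acceptance rule: the most recently updated reply wins."""
    return max(replies, key=lambda r: r["timestamp"])


def resolve(replies: Sequence[Reply], required_reads: int, accept: AcceptanceFn) -> Reply:
    """Check that the read quorum is reached, then let the caller's rule pick a winner."""
    if len(replies) < required_reads:
        raise RuntimeError("read quorum not reached")
    return accept(replies)
```

The caller chooses the rule per request, for example `resolve(replies, 2, longest_chain)`.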

Second proposal

While the first approach seems logical for most distributed systems and databases, the number of replicas will be high, so the performance cost of consistent reads becomes high (it can be N/2 or N/3).

The integration also becomes more complex, as we have to define conflict resolution functions and set up a way to determine the number of replication attestations for each piece of information. However, we sometimes issue N requests because there are no transaction validations for some queries.

Also, because Archethic leverages atomic commitment, we can use it in queries as well.
We can then introduce atomic reads: all the requested nodes have to return the same result, otherwise we pick other nodes or return a failure.

With this approach we can also leverage the network coordinates to load the data from the nearest nodes.
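
The atomic read idea can be sketched as follows, assuming the nodes are already sorted from nearest to farthest (for instance using the network coordinates). The names `atomic_read`, `fetch` and `batch_size` are hypothetical, and `fetch` stands in for whatever P2P request is actually sent.

```python
from typing import Callable, Optional, Sequence, TypeVar

Node = TypeVar("Node")
Result = TypeVar("Result")


def atomic_read(
    nodes_by_distance: Sequence[Node],
    fetch: Callable[[Node], Result],
    batch_size: int,
) -> Optional[Result]:
    """Query nodes in batches, nearest first, until a whole batch agrees.

    Returns the agreed result, or None (failure) when no batch is unanimous.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(nodes_by_distance), batch_size):
        batch = nodes_by_distance[start:start + batch_size]
        if len(batch) < batch_size:
            break  # not enough nodes left to form a full batch
        results = [fetch(node) for node in batch]
        # Atomic condition: every requested node returned the same result.
        if all(r == results[0] for r in results):
            return results[0]
    return None
```

In this sketch a disagreeing batch is simply skipped and the next nearest nodes are tried; a real implementation might instead replace only the disagreeing nodes.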

Remarks:

The problem with this approach is that we can have a huge number of failures, as consistency is not guaranteed in a distributed system because of latency.

Third proposal

Another interesting approach is monotonic quorum reads: across two successive quorum reads, the second one is guaranteed not to return something older than the first one.
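
One simple way to get this monotonic guarantee on the reader side is to remember the newest result returned so far. The sketch below assumes each reply carries a comparable `version` (for example a chain length or a timestamp); the class name `MonotonicReader` and the `(version, value)` reply shape are hypothetical.

```python
from typing import Any, Iterable, Optional, Tuple

Versioned = Tuple[int, Any]  # (version, value); the version ordering is an assumption


class MonotonicReader:
    """Never returns a result older than one it has already returned."""

    def __init__(self) -> None:
        self._last: Optional[Versioned] = None  # newest result returned so far

    def read(self, replies: Iterable[Versioned]) -> Versioned:
        replies = list(replies)
        if not replies:
            raise ValueError("no replies received")
        best = max(replies, key=lambda r: r[0])
        # Only move forward: keep the previous result if the quorum is behind it.
        if self._last is None or best[0] >= self._last[0]:
            self._last = best
        return self._last
```

If a later quorum only reaches lagging replicas, the reader returns its remembered result instead of going back in time.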

Integrations:

In all cases, the function must be integrated into the following modules:

  • replication module
  • mining module
  • self repair module
  • beacon chain module
  • transaction chain module
  • top module
  • live chain explorers

Finally we need to:

  • Ensure non regression
@ghost ghost added P2P Involve P2P networking feature New feature request core team Assigned to the core team labels May 19, 2022
@ghost ghost changed the title Read quorum Support read quorum for P2P May 25, 2022
@ghost ghost mentioned this issue May 25, 2022
@apoorv-2204
Contributor

What do you think about Mnesia?

Maybe, I think, we dropped it because it stores data in tables, which get fragmented when they grow too big?

@ghost
Author

ghost commented Jun 1, 2022

I don't think Mnesia will help here. Mnesia is a distributed database in Erlang; here we just want to improve P2P read queries by aggregating the answers and applying a client-side choice among them, to avoid stale data.

@internet-zero
Member

Hey team! Please add your planning poker estimate with ZenHub @apoorv-2204 @blackode @imnik11 @Neylix @roychowdhuryrohit-dev @samuel-uniris

@ghost ghost self-assigned this Jun 3, 2022
@ghost ghost mentioned this issue Jun 8, 2022
@ghost ghost closed this as completed Jun 14, 2022
This issue was closed.