🧩 566 & 645, Aggregate Fix & Fix self repair notifier #645 #671

Conversation

@apoorv-2204 (Contributor) commented on Nov 7, 2022

Description

Self-repair:

  • Determine which shards went down.
  • Determine which transactions those shards were in charge of.
  • Elect new shards for those transactions (a sketch of these steps follows the list).
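A minimal sketch of the three steps above. This is purely illustrative: the module name, function names, and data shapes below are assumptions for the sketch, not the actual Archethic API.

```elixir
defmodule SelfRepairSketch do
  # Step 1: diff the previously available nodes against the currently
  # available ones to find the shards (storage nodes) that went down.
  def down_nodes(previous_nodes, current_nodes) do
    MapSet.difference(MapSet.new(previous_nodes), MapSet.new(current_nodes))
  end

  # Step 2: keep only the transaction chains for which a down node
  # was one of the elected storage nodes.
  def impacted_chains(chains, down_nodes) do
    Enum.filter(chains, fn %{storage_nodes: nodes} ->
      Enum.any?(nodes, &MapSet.member?(down_nodes, &1))
    end)
  end

  # Step 3: elect replacement shards among the remaining nodes
  # (a trivial placeholder election: take the first `count` nodes).
  def elect_new_shards(available_nodes, count) do
    Enum.take(available_nodes, count)
  end
end
```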

Proposed Solution:

  • Use minimal information to inform the new shard:
    • a %ShardRepair{} message with two fields, the genesis address and the last address.
  • The new shard collects the data on its own:
    • it fetches the transactions and replicates/validates the transaction chain.
  • Handle concurrent repair requests with one FSM per genesis address, acting as a deliberate bottleneck:
    • as a result, the same job is never deployed more than once concurrently (see the sketch after this list).
  • Fixes "Fix self-repair notifier" #645
  • Fixes "Notify beacon summary in case of network topology change" #566
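A sketch of what the message and the per-genesis-address deduplication could look like. Only the two struct fields come from the description above; the Registry-based worker, and all names in it, are assumptions made for illustration.

```elixir
defmodule ShardRepair do
  # Minimal repair message: only the two fields named above.
  @enforce_keys [:genesis_address, :last_address]
  defstruct [:genesis_address, :last_address]
end

defmodule RepairWorker do
  use GenServer

  # Registering under the genesis address makes this process the single
  # entry point (the deliberate bottleneck) for that chain: starting a
  # second worker for the same genesis address returns
  # {:error, {:already_started, pid}}, i.e. "job already in progress".
  def start_link(%ShardRepair{genesis_address: genesis} = msg) do
    GenServer.start_link(__MODULE__, msg, name: via(genesis))
  end

  defp via(genesis), do: {:via, Registry, {RepairRegistry, genesis}}

  @impl true
  def init(%ShardRepair{} = msg), do: {:ok, msg, {:continue, :repair}}

  @impl true
  def handle_continue(:repair, %ShardRepair{last_address: last} = msg) do
    # Placeholders for the real "fetch transactions, then validate and
    # replicate the chain" work done by the new shard on its own.
    last |> fetch_chain() |> replicate_chain()
    {:stop, :normal, msg}
  end

  defp fetch_chain(_last_address), do: []
  defp replicate_chain(chain), do: chain
end
```

The Registry would need to be started with `{Registry, keys: :unique, name: RepairRegistry}` in the supervision tree. A full FSM (e.g. gen_statem) would add explicit states on top of this, but the one-process-per-genesis-address property is the part that prevents duplicate concurrent jobs.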

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

  • Unit tests

To test in real conditions, we need to tweak a few settings to limit the number of nodes required. Here is the patch to apply; it makes the following changes (an illustrative sketch follows the list):

  • the number of storage nodes always returns 3
  • the minimum geo patch is set to 1 and the minimum avg_availability to 0.0
  • the burning fee address is removed from the IO storage nodes
  • the CoinGecko API returns a static value (just to avoid spamming the API and getting errors)
  • a static geo patch is used when the node is a local one
    patch.txt
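For illustration only, the kind of override such a test patch typically applies. The actual changes live in patch.txt; every module and function below is hypothetical.

```elixir
defmodule TestOverrides do
  # Force every election to return exactly 3 storage nodes.
  def storage_node_count(_chain), do: 3

  # Relax the availability constraints so 5 local nodes are enough.
  def min_geo_patch, do: 1
  def min_avg_availability, do: 0.0

  # Static price instead of calling the CoinGecko API.
  def usd_price, do: 0.05

  # Deterministic geo patch when the node is a local one.
  def geo_patch(%{ip: {127, 0, 0, 1}}), do: "AAA"
  def geo_patch(node), do: compute_geo_patch(node)

  defp compute_geo_patch(_node), do: "000"
end
```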

Then you can run 5 nodes:

  • wait for them to be authorized
  • send a transaction
  • look up which nodes stored it (normally 3 nodes)
  • stop 1 or 2 of those nodes

After the next self-repair, when they go unavailable, the notifier of each node should be triggered. The last remaining storage node of the transaction should send the ShardRepair message to the 1 or 2 new nodes, and those nodes should replicate the chain. Other nodes should also replicate a SummaryAggregate if a disconnected node was one of its storage nodes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

@apoorv-2204 apoorv-2204 self-assigned this Nov 9, 2022
@apoorv-2204 apoorv-2204 added the "bug" (Something isn't working) and "self repair" (Involve SelfRepair mechanism) labels on Nov 9, 2022
@apoorv-2204 apoorv-2204 marked this pull request as ready for review November 10, 2022 07:17
@apoorv-2204 apoorv-2204 changed the title from "[WIP] Fix self repair notifier #645" to "[Testing] Fix self repair notifier #645" on Nov 10, 2022
@apoorv-2204 apoorv-2204 changed the title from "[Testing] Fix self repair notifier #645" to "Fix self repair notifier #645" on Nov 15, 2022
@apoorv-2204 apoorv-2204 reopened this Nov 22, 2022
@apoorv-2204 apoorv-2204 changed the title from "Fix self repair notifier #645" to "566 & 635, Aggregate Fix & Fix self repair notifier #645" on Nov 23, 2022
@apoorv-2204 apoorv-2204 changed the title from "566 & 635, Aggregate Fix & Fix self repair notifier #645" to "566 & 645, Aggregate Fix & Fix self repair notifier #645" on Nov 23, 2022
@apoorv-2204 apoorv-2204 changed the title from "566 & 645, Aggregate Fix & Fix self repair notifier #645" to "🔢 566 & 645, Aggregate Fix & Fix self repair notifier #645" on Nov 23, 2022
@apoorv-2204 apoorv-2204 changed the title from "🔢 566 & 645, Aggregate Fix & Fix self repair notifier #645" to "🧩 566 & 645, Aggregate Fix & Fix self repair notifier #645" on Nov 23, 2022
@Neylix Neylix force-pushed the fix_self-repair_notifier_#645 branch 2 times, most recently from 8df78dd to a0aca17 on November 23, 2022 13:10
@samuelmanzanera samuelmanzanera added the "core team" (Assigned to the core team) label on Nov 28, 2022
@Neylix Neylix force-pushed the fix_self-repair_notifier_#645 branch from 497d951 to 9f03fdd on November 28, 2022 14:38
@Neylix Neylix assigned Neylix and unassigned apoorv-2204 Nov 29, 2022
@Neylix Neylix requested review from samuelmanzanera and removed request for Neylix November 29, 2022 12:26
@Neylix Neylix force-pushed the fix_self-repair_notifier_#645 branch from 12fc395 to 4e5ec54 on December 1, 2022 16:42
@samuelmanzanera samuelmanzanera merged commit 091d8e8 into archethic-foundation:develop Dec 2, 2022
@Neylix Neylix mentioned this pull request Dec 2, 2022
@apoorv-2204 apoorv-2204 deleted the fix_self-repair_notifier_#645 branch December 22, 2022 11:49
@apoorv-2204 apoorv-2204 restored the fix_self-repair_notifier_#645 branch December 22, 2022 11:49
@apoorv-2204 apoorv-2204 deleted the fix_self-repair_notifier_#645 branch December 22, 2022 11:49