Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bootsrap p2p view #1093

Merged

Conversation

Neylix
Copy link
Member

@Neylix Neylix commented Jun 13, 2023

Description

Improve the way a node load the nodes into the memtable.

Currently when a node bootstraps, it request the node view from the bootstrapping nodes and add this view into the memtable. But if the node needs to do a bootstrap sync to repair the missed beacon summaries, this view may not be the good one and the self repair verification crashes. Also the node may have into its memtable some nodes that doesn't exists according to the last beacon summaries the node has synchronized.

To fix this I assumed that a node or its informations should be in the memtable only if the according transaction has been replicated. So when a node bootstrap it now request the current node view to be able to download what it needs, but only connect to these node but doesn't add them into the memtable. The only needed node in the memtable is the first enrolled node (to pass the self repair verification)

Here is major changes:

  • During all bootstrap we now pass a list of connected node and doesn't use P2P.authorized_and_available_nodes (since the memtable is empty)
  • When getting closest node we also get the first enrolled node and add it into memtable (used to synchronize first summaries)
  • The message ListNodes now has an parameter to return only the authorized and available nodes (do not need other node when bootstrapping)
  • A node always restore its last P2P view from the DB doesn't matter if it missed a self repair or not

/!\ Attention: This PR breaks a control that avoid an already connected node to overwrite the node connection informations and restart the connection with old informations. This will be fixed in other PR (to simplify review)

Type of change

  • Bug fix (non-breaking change which fixes an issue)

  • Starts 4 nodes and wait authorized and available

  • Stop one in beginning of summary (the transaction of this summary will be signed by the other 3 nodes)

  • At beginning of next summary stop another node

  • When the other node goes globally unavailable, restart the first stopped node

Without the fix the restarted node cannot do its bootstrap_sync as it has only 2 available nodes in it's view (the 2 remaining up node) but the summary contains transaction signed by 3 nodes

With the fix it restart using it's last known P2P view (3 nodes) but use only the 2 remaining nodes to download thing. And the self repair verification works well

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

@Neylix Neylix added bug Something isn't working bootstrap labels Jun 13, 2023
@Neylix Neylix added this to the 1.1.2 milestone Jun 13, 2023
Copy link
Member

@bchamagne bchamagne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested with 4 nodes:
1st node boot ✔️
Node joining network ✔️
Node restart ✔️
Node reset ✔️

@samuelmanzanera samuelmanzanera merged commit cd70584 into archethic-foundation:1.1.2 Jun 14, 2023
1 check passed
@Neylix Neylix deleted the Fix-bootsrap-P2P-view branch June 14, 2023 16:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bootstrap bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants