[Discussion] Reject sync request if too many peers are syncing #3147

Closed
MaksymZavershynskyi opened this issue Aug 12, 2020 · 3 comments
Labels
A-network Area: Network

Comments

@MaksymZavershynskyi
Contributor

Motivation

When all network nodes are rebooted after an update, they all try to sync at the same time, which makes booting slow.

Proposed design 1

Node A should reject sync requests with a structured error X when it already has more than Y nodes syncing from it. The number Y should be determined by benchmarking the syncing code. When node B receives structured error X during sync, it should try other nodes and retry the peers that returned structured error X after some delay. This will naturally line nodes up into a queue.

This also makes such a network easier to monitor, since it will be easier to observe why a node has not synced yet.
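
The mechanism in design 1 can be sketched as follows. This is only an illustration, not nearcore code: `SyncRejected` stands in for structured error X, `max_syncing` for the limit Y, and the 5-second `retry_after_s` value, the class names, and `pick_sync_peer` are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SyncRejected:
    """Hypothetical structured error X: the peer is at its sync capacity."""
    peer_id: str
    retry_after_s: float


@dataclass
class SyncServer:
    """Node A: admits at most `max_syncing` (Y) concurrently syncing peers."""
    node_id: str
    max_syncing: int
    syncing: set = field(default_factory=set)

    def request_sync(self, requester_id: str):
        # Reject with the structured error once Y peers are already syncing.
        if requester_id not in self.syncing and len(self.syncing) >= self.max_syncing:
            return SyncRejected(peer_id=self.node_id, retry_after_s=5.0)
        self.syncing.add(requester_id)
        return None  # None means the request was admitted

    def finish_sync(self, requester_id: str) -> None:
        self.syncing.discard(requester_id)


def pick_sync_peer(requester_id, peers, now, retry_at):
    """Node B: try each peer in turn, skipping peers still in their retry delay.

    `retry_at` maps peer id -> earliest time at which that peer may be retried.
    Returns the peer that admitted the request, or None if all were busy.
    """
    for peer in peers:
        if now < retry_at.get(peer.node_id, 0.0):
            continue  # still backing off from an earlier rejection
        result = peer.request_sync(requester_id)
        if result is None:
            return peer
        retry_at[peer.node_id] = now + result.retry_after_s
    return None
```

A node that gets `None` back waits and calls `pick_sync_peer` again later, which is what lines nodes up into a queue. The `retry_at` map also gives a direct answer to "why has this node not synced yet": it lists exactly which peers rejected the request and until when.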

Proposed design 2

@evgenykuzyakov proposed that nearup could add a random delay before it starts the node. @nearmax's argument against it is that it will not work universally, e.g. it won't work when nodes are upgraded by the community and the NEAR Foundation does not have full control over when and how people start them. Besides, randomization is a heuristic that adds to the maintenance burden of the system.
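
For comparison, the random-delay idea in design 2 amounts to something like the sketch below. This is not nearup's actual code or interface: the function name `startup_delay_s` and the `max_delay_s` parameter are hypothetical, and a uniform distribution is just one possible choice.

```python
import random


def startup_delay_s(max_delay_s: float, rng: random.Random) -> float:
    """Pick a uniform random delay in [0, max_delay_s] before starting the node."""
    return rng.uniform(0.0, max_delay_s)
```

A launcher would sleep for the returned number of seconds and then start the node. Note that this only spreads out start times for operators who actually use such a launcher, which is the objection raised above.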

@bowenwang1996
Collaborator

Let's not conflate nearup (which is a tool to manage the node) with the behavior of the node itself. Whatever we do with nearup should be separate from nearcore. As for syncing, since a node has a limited number of peers, the number of peers that are syncing is naturally limited. Also, other than state sync (for which we already have limits), syncing is not very resource intensive, so I think we probably don't need to impose extra restrictions, although I do agree that limiting the number of syncing peers is a way to help prevent eclipse attacks.

@MaksymZavershynskyi
Contributor Author

> Let's not conflate nearup (which is a tool to manage the node) with the behavior of the node itself. Whatever we do with nearup should be separate from nearcore.

I agree: let's not add hacks to nearup, such as a randomized timer, to work around node issues. The node's inability to communicate efficiently with its peers and decide when and how to sync is a node issue.

@MaksymZavershynskyi
Contributor Author

I think there is some mutual misunderstanding here.

I see, you were talking about rolling release. Closing this issue.


3 participants