Unable to sync a mainnet nearcore node from a public snapshot #6095

Closed
ashpool37 opened this issue Jan 17, 2022 · 2 comments
Labels: C-bug (Category: This is a bug), T-SRE (Team: issues relevant to the SRE team)

ashpool37 commented Jan 17, 2022

Describe the bug

As of 16 Jan 2022, a mainnet nearcore node cannot sync properly when an official node data snapshot is used as its initial state. Every attempt ends with the node in an inconsistent state and unable to catch up on blocks.

  • Last reproduced: approximately 17 Jan 2022, 15:00 UTC.
  • First discovered: 16 Jan 2022, 08:00 UTC.
  • sha256sum of one offending snapshot: d95e42c5c833d9647c410884bf48b059c890c17153574afb5df288898d0031118 data.tar
  • Snapshot size at the time of downloading: a little over 320 GiB.
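
For anyone who wants to check whether their own download matches a reported digest, here is a minimal sketch in Python. The function names and the chunk size are illustrative assumptions, not part of nearcore or of the snapshot tooling. The file is read in chunks because a snapshot of this size will not fit in memory.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it chunk by chunk."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with an expected one, ignoring case and
    surrounding whitespace in the expected value."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Note that this only applies to a snapshot saved to disk first; the curl-to-tar pipeline in the reproduction steps never stores data.tar, so there is nothing to hash afterwards.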

To Reproduce

  • Initialize the node by running neard init --download-genesis --download-config
  • cd ~/.near/data
  • curl https://near-protocol-public.s3.ca-central-1.amazonaws.com/backups/mainnet/rpc/data.tar | tar xf -
  • Start the node (neard run) and wait for it to start downloading headers
  • Observe the node log the message: ERROR client: Error processing sync blocks: Invalid Next BP Hash and ban a lot of peers
  • Upon finishing sync, observe the node get stuck at one block hash.
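
The last step ("stuck at one block hash") can be made less subjective with a small check like the one below. This is only a sketch: `is_stuck` and its `window` parameter are hypothetical, and it assumes you have already collected the node's latest block hash at regular intervals by whatever means you prefer.

```python
def is_stuck(block_hashes, window=5):
    """Return True if the last `window` sampled block hashes are identical.

    `block_hashes` is a list of hashes sampled over time, oldest first.
    With fewer than `window` samples (or a window below 2) there is not
    enough evidence, so the function returns False.
    """
    if window < 2 or len(block_hashes) < window:
        return False
    recent = block_hashes[-window:]
    return all(h == recent[0] for h in recent)
```

The sampling interval matters more than the window size: samples taken a few seconds apart can repeat on a healthy node, so the interval should be comfortably longer than the expected block time.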

Expected behavior
The node is expected to undergo the usual sync process of downloading headers, then downloading blocks, and eventually catching up with the network.

Logs
These logs were captured during three separate attempts to sync a node.

Notice the common error message (ERROR client: Error processing sync blocks: Invalid Next BP Hash) and the excessive peer banning behaviour.
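
To compare attempts, it may help to count how often the error appears in each captured log. A minimal sketch follows; the marker string is taken from the message quoted above, while the function name is an assumption. Peer bans are not counted because the exact wording of those log lines is not shown in this report.

```python
ERROR_MARKER = "Invalid Next BP Hash"


def count_marker(lines, marker=ERROR_MARKER):
    """Count the log lines that contain the given marker substring."""
    return sum(1 for line in lines if marker in line)
```

Usage: open a saved log with `open(path, encoding="utf-8", errors="replace")` and pass the file object straight in, since iterating over it yields one line at a time.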

Version (please complete the following information):

  • neard: Version: 1.23.1, Build: 1.23.0-9-gc0551c84b-modified, Latest Protocol: 50
  • Docker CE 20.10.12 (e91ed57) on linux/amd64
  • mainnet

Additional context
We're fairly certain this isn't a nearcore issue, but rather a "snapshots" one. We were able to properly sync the node from another snapshot source.

@frol added the T-SRE (Team: issues relevant to the SRE team) and C-bug (Category: This is a bug) labels on Jan 18, 2022
@bowenwang1996
Collaborator

I believe the snapshot is fixed. @ashpool37 could you confirm this?

@ashpool37
Author

Yes, as of now the snapshots are working, thanks.
