Can't Sync from 0 on latest release [MAINNET] #1564

Closed
LordKurono opened this issue Jul 4, 2023 · 6 comments
@LordKurono

LordKurono commented Jul 4, 2023

Describe the bug
Starting a new pocket node on version RC-0.10.0 using this command:
pocket start --seeds="7c0d7ec36db6594c1ffaa99724e1f8300bbd52d0@seed1.mainnet.pokt.network:26662,cdcf936d70726dd724e0e6a8353d8e5ba5abdd20@seed2.mainnet.pokt.network:26663,74b4322a91c4a7f3e774648d0730c1e610494691@seed3.mainnet.pokt.network:26662,b3235089ff302c9615ba661e13e601d9d6265b15@seed4.mainnet.pokt.network:26663" --mainnet

The node crashes after processing the first block:

I[2023-07-04|12:14:16.187] Received                                     module=blockchain src="Peer{MConn{50.171.22.146:26656} 5595f94dc0f9e205544abaf03b6488dfab0d8f79 out}" height=19
I[2023-07-04|12:14:16.187] Received                                     module=blockchain src="Peer{MConn{50.171.22.146:26656} 5595f94dc0f9e205544abaf03b6488dfab0d8f79 out}" height=38
I[2023-07-04|12:14:16.198] Committed state                              module=state height=1 txs=0 appHash=44AE48F34E908F8E387DA34DEFBD099FDC4B956BA8F0CDF28FF719E4DB35481B
I[2023-07-04|12:14:16.198] Indexed block                                module=state height=1
I[2023-07-04|12:14:16.203] Indexed block                                module=txindex height=1
I[2023-07-04|12:14:16.205] makeNextRequests will make following requests module=blockchain number=1 heights=[65]
I[2023-07-04|12:14:16.205] assigned request to peer                     module=blockchain peer=b82236fda3ceaffc52c82328ac80c6f13601f444 height=65
panic: failed to process committed block (2:0379FFD028E6A0D08B2C082B85B3E98C504B1EBDBC7DDAA431391486F3D77CB7): wrong Block.Header.AppHash.  Expected 44AE48F34E908F8E387DA34DEFBD099FDC4B956BA8F0CDF28FF719E4DB35481B, got 6C81DAD888A261A1CD41C1AA7A639F22D0A2A31668E002910403FB74CD573CBB

goroutine 69 [running]:
github.com/tendermint/tendermint/blockchain/v1.(*BlockchainReactor).processBlock(0x1400189a000)
        /Users/kurono/go/pkg/mod/github.com/pokt-network/tendermint@v0.32.11-0.20230405220629-96c095f0058d/blockchain/v1/reactor.go:476 +0x4ac
github.com/tendermint/tendermint/blockchain/v1.(*BlockchainReactor).processBlocksRoutine(0x1400189a000, 0x14001c82060)
        /Users/kurono/go/pkg/mod/github.com/pokt-network/tendermint@v0.32.11-0.20230405220629-96c095f0058d/blockchain/v1/reactor.go:335 +0x16c
created by github.com/tendermint/tendermint/blockchain/v1.(*BlockchainReactor).poolRoutine
        /Users/kurono/go/pkg/mod/github.com/pokt-network/tendermint@v0.32.11-0.20230405220629-96c095f0058d/blockchain/v1/reactor.go:374 +0xdc

Using version RC-0.9.2 yields the expected behavior: the node syncs.
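
The panic message above carries the failing height and both hashes, which is usually the first thing to extract when comparing nodes. A minimal sketch, assuming the panic line has been copied out of the log into a shell variable (the variable names are illustrative and not part of pocket):

```sh
#!/bin/sh
# Panic line copied verbatim from the log above; in practice it would be read from a saved log file.
panic_line='panic: failed to process committed block (2:0379FFD028E6A0D08B2C082B85B3E98C504B1EBDBC7DDAA431391486F3D77CB7): wrong Block.Header.AppHash.  Expected 44AE48F34E908F8E387DA34DEFBD099FDC4B956BA8F0CDF28FF719E4DB35481B, got 6C81DAD888A261A1CD41C1AA7A639F22D0A2A31668E002910403FB74CD573CBB'

# Height of the block that failed validation (the number before the colon).
height=$(printf '%s\n' "$panic_line" | sed -n 's/.*committed block (\([0-9][0-9]*\):.*/\1/p')
# Hash following "Expected" in the message.
expected=$(printf '%s\n' "$panic_line" | sed -n 's/.*Expected \([0-9A-F]*\), got.*/\1/p')
# Hash following "got" in the message.
got=$(printf '%s\n' "$panic_line" | sed -n 's/.*, got \([0-9A-F]*\).*/\1/p')

echo "height=$height"
echo "expected=$expected"
echo "got=$got"
```

Here the `Expected` value matches the `appHash` the node logged when it committed height 1, so the disagreement is between the locally computed state and the hash carried by the block at height 2.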

Expected behavior
The node syncs from block 1 to the top of the chain.

@LordKurono LordKurono added the bug Something isn't working label Jul 4, 2023
@LordKurono LordKurono changed the title Can't Sync from 0 on latest release Can't Sync from 0 on latest release [MAINNET] Jul 4, 2023
@nodiesBlade
Contributor

I noticed the same issue when syncing from scratch. I thought it was just a POKT issue the entire time, but now that you mention it, I was also using the latest staging branch. Interesting that RC v0.9.2 is working just fine for you; definitely worth sanity-checking before we upgrade to v0.10.0.

When syncing from scratch, I also received `wrong Block.Header.AppHash`. I also tried a snapshot around block ~60,000 and received the same `wrong Block.Header.AppHash` error, but that one might have been caused by some other unrelated testing.

FYI: We are not on RC 0.10.0 yet on mainnet, but I'm not sure why that would make syncing fail. Best case, this is not an issue; worst case, it was flagged before the network was upgraded. @LordKurono thanks for the report.

@Olshansk
Member

Olshansk commented Jul 4, 2023

@PoktBlade you likely have more experience syncing nodes from scratch, so I would appreciate your help here:

  1. Does this happen every time or ephemerally?

  2. Have you hit it in the past?

  3. What do you recommend we sanity check?

  4. Which snapshot are you using?

  5. Would you be willing to take on this investigation?

@nodiesBlade
Contributor

> @PoktBlade you likely have more experience syncing nodes from scratch, so I would appreciate your help here:
>
>   1. Does this happen every time or ephemerally?
>   2. Have you hit it in the past?
>   3. What do you recommend we sanity check?
>   4. Which snapshot are you using?
>   5. Would you be willing to take on this investigation?

I actually don't have much experience syncing from scratch; however, I do know it's pretty important that syncing from a point in time still works.

  1. When syncing from scratch, pretty much every time.
  2. No, I haven't synced from scratch in a very long time. I did hear reports that this was an issue in the past, but the details are fuzzy.
  3. I would start with a couple of things:
    1. Try to sync from a more recent snapshot (e.g. block 85xxx) and see if it crashes, or whether this is specific to syncing from scratch or from very old snapshots.
    2. The only thing that explicitly changed in the state machine in v0.10.0 was this: https://github.com/pokt-network/pocket-core/pull/1534/files. So we can start there while debugging the app state of the validator as well. The other thing I think we should look into is whether the upgraded app version has any significance when syncing.
  4. https://link.us1.storjshare.io/raw/jwfbmq6ar3vsyzeqkconsiz24sja/pocket-public-blockchains/pocket-network-data-0026-rc-0.6.3.6.tar was the snapshot I was trying. I was using an old snapshot for some other testing purposes unrelated to this issue.
  5. I can help triage the issue a bit more for you and report back here. For example, I can try to revert the above PR and see if the issue still persists. But if it is due to a recent change in v0.10.0, I'd prefer to get the right code owners to apply a fix once triaged.

@Olshansk
Member

Olshansk commented Jul 5, 2023

> When syncing from scratch, pretty much every time.

🤔

> No, I haven't actually synced from scratch in a very long long time. I did hear reports this was an issue before in the past, but the details are really fuzzy.

Q1: @msmania @okdas Have either of you seen this?

> Try to sync from not so far snapshot (i.e. block 85xxx) and see if it crashes, or if this is specific to scratch only or really historical snapshots only.

R1: @PoktBlade Please give it a shot and report back! You can look at the snapshots @Andrew-Pohl's team has set up in #1565

> The only thing that really changed with the state machine explicitly in V0.10.0 was this #1534 (files). So we can start there while debugging the app state of the validator as well. The other thing I think we should look into is the upgraded app version if that has any significance when syncing.

I looked at the code and don't fully understand why it would affect syncing, given that we haven't upgraded the protocol, but I trust your judgment. R2: Please report back once you've given it a shot!

> link.us1.storjshare.io/raw/jwfbmq6ar3vsyzeqkconsiz24sja/pocket-public-blockchains/pocket-network-data-0026-rc-0.6.3.6.tar was the snapshot I was trying. I was trying an old snapshot for some other testing purposes unrelated to this issue.

Repeating the comment above: take a look at #1565 to see if it helps.

> I can help triage the issue a bit more for you and report back here. I.E - I can try to revert the above PR and see if the issue still persists. But preferably if it is due to a recent change in v0.10.0, let's get the right code owners to apply a fix once triaged.

R2: Can you revert (i.e. hard reset) to 410e12c3b40106e6e9c98edd4f17ed1234a7d435 and see if the same issue occurs?

R3: Can you rm 0aa12e71a33fca29c8e8ad5f60dcccae775a8045 and see if the same issue occurs?

Thanks in advance for your help! CC'ing PNF (@jacklaing) for visibility, so they know that community members are helping debug this.


Legend:

  • Q(N): Question N
  • R(N): Request N
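
The hard reset requested in R2 could be scripted roughly as follows. This is a sketch only: the `run` and `check_commit` helpers are hypothetical, the build path is an assumption about the repository layout, and `SEEDS` is a placeholder for the seed list from the original report. With `DRY_RUN=1` the script only prints the commands it would run.

```sh
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: reset the working tree to a commit, rebuild, and start a sync.
check_commit() {
  commit=$1
  run git fetch origin
  run git reset --hard "$commit"
  # Assumed build path; adjust to the repository's actual layout.
  run go build -o pocket ./app/cmd/pocket_core
  # Same invocation as in the report; SEEDS is a placeholder for the seed list.
  run ./pocket start --seeds="$SEEDS" --mainnet
}

DRY_RUN=1
SEEDS="<seed list from the report>"
check_commit 410e12c3b40106e6e9c98edd4f17ed1234a7d435
```

To reproduce a sync from scratch, the node would also need to start from an empty data directory. Note that `git reset --hard` discards local changes, so a throwaway clone is the safer place to run this.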

@nodiesBlade
Contributor

Triaged and fixed in #1566.

Olshansk pushed a commit that referenced this issue Jul 5, 2023
## Description

The mainnet genesis file was changed as part of PR #1535 when adding misc changes for localnet. I believe the genesis file is generated and loaded via `InitGenesis` and `InitChainer`, so this resulted in a genesis that differed from the rest of the network and, ultimately, a generated app hash that differed from the one the other peers are synced to. As a result, syncing failed from the first block onward, as reported in #1564.

This change reverts the changes made in the genesis json in #1535

Tested by applying this change and running:
```sh
pocket start --seeds="7c0d7ec36db6594c1ffaa99724e1f8300bbd52d0@seed1.mainnet.pokt.network:26662,cdcf936d70726dd724e0e6a8353d8e5ba5abdd20@seed2.mainnet.pokt.network:26663,74b4322a91c4a7f3e774648d0730c1e610494691@seed3.mainnet.pokt.network:26662,b3235089ff302c9615ba661e13e601d9d6265b15@seed4.mainnet.pokt.network:26663" --mainnet
```
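
To illustrate why a changed genesis file leads to a different app hash, here is a toy sketch. It uses SHA-256 over a JSON string as a stand-in for the real app hash computation, which it is not; the JSON fields, URLs, and the `hash_of` helper are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: SHA-256 of a string, using whichever tool is installed.
hash_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    printf '%s' "$1" | sha256sum | cut -d' ' -f1
  else
    printf '%s' "$1" | shasum -a 256 | cut -d' ' -f1
  fi
}

# Two made-up genesis documents that differ only in one service URL.
genesis_network='{"validators":[{"service_url":"https://node1.example.com:443"}]}'
genesis_local='{"validators":[{"service_url":"https://node1.dev.example.com:443"}]}'

h_net=$(hash_of "$genesis_network")
h_loc=$(hash_of "$genesis_local")

if [ "$h_net" != "$h_loc" ]; then
  echo "hashes differ: a one-field change in genesis changes the digest"
fi
```

The point is only that any byte-level difference in the initial state propagates into whatever hash is derived from it, so a node built with a modified genesis cannot agree with peers on the app hash.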

<!-- reviewpad:summarize:start -->
### Summary generated by Reviewpad on 05 Jul 23 07:43 UTC
This pull request reverts the mainnet genesis file from #1535. It updates the service URLs for several nodes in the mainnetGenesis variable in the genesis.go file. The URLs are changed from the dev environment to the production environment for the poktnodes.com and pokt.net domains. Additionally, the URLs for the pokt.foundation domain are updated as well.
<!-- reviewpad:summarize:end -->


Co-authored-by: poktblade <baaspoolsllc@gmail.com>
@Olshansk
Member

Olshansk commented Jul 5, 2023

Resolved in @PoktBlade's fix!

@Olshansk Olshansk closed this as completed Jul 5, 2023