You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Fast Catchup not working on MainNet with v2.1.5 - catchpoint catchup stage error : processStageBlocksDownload failed after multiple blocks download attempts
#1558
Closed
Thireus opened this issue
Sep 26, 2020
· 5 comments
Dedicated server with 1Gbps connectivity (up/down)
Fresh install of algorand with v2.1.5 (no prior install)
Steps to reproduce
Before:
$ goal node status -d /var/lib/algorand -w 2000
Last committed block: 1983270
Time since last block: 0.0s
Sync Time: 44769.4s
Last consensus protocol: https://github.com/algorandfoundation/specs/tree/5615adc36bad610c7f165fa2967f4ecfa75125f0
Next consensus protocol: https://github.com/algorandfoundation/specs/tree/5615adc36bad610c7f165fa2967f4ecfa75125f0
Round for next consensus protocol: 1983271
Next consensus protocol supported: true
Last Catchpoint:
Genesis ID: mainnet-v1.0
Genesis hash: wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8=
The node processes the following additional steps:
Catchpoint accounts processed...
Catchpoint total blocks...
During the last few minutes of fast catchup:
$ goal node status -d /var/lib/algorand -w 2000
Last committed block: 1983548
Sync Time: 1351.4s
Catchpoint: 9290000#PQRSAA3T7USCIGCVRH7MEXWCEXTASVKPZNCH4UMSLYEXXHHK2D3Q
Catchpoint total accounts: 5877716
Catchpoint accounts processed: 5877716
Catchpoint total blocks: 1000
Catchpoint downloaded blocks: 907
Genesis ID: mainnet-v1.0
Genesis hash: wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8=
Once "Catchpoint downloaded" is "done", fast catchup fails silently and the node returns to syncing from where it stopped before the fast catchup attempt:
$ goal node status -d /var/lib/algorand -w 2000
Last committed block: 1983720
Time since last block: 0.4s
Sync Time: 9.2s
Last consensus protocol: https://github.com/algorandfoundation/specs/tree/5615adc36bad610c7f165fa2967f4ecfa75125f0
Next consensus protocol: https://github.com/algorandfoundation/specs/tree/5615adc36bad610c7f165fa2967f4ecfa75125f0
Round for next consensus protocol: 1983721
Next consensus protocol supported: true
Last Catchpoint:
Genesis ID: mainnet-v1.0
Genesis hash: wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8=
Expected behaviour
Fast Catchup should work.
Actual behaviour
Fast Catchup fails silently after the "Catchpoint total blocks" process.
Those are the logs during the exact moment the fast catchup silently failed:
{"file":"fetcher.go","function":"github.com/algorand/go-algorand/catchup.(*NetworkFetcher).FetchBlock","level":"info","line":219,"msg":"networkFetcher.FetchBlock: asking client r-ag.algorand-mainnet.network:4160 for block 9289094","time":"2020-09-26T12:41:23.370661+02:00"}
{"file":"fetcher.go","function":"github.com/algorand/go-algorand/catchup.(*NetworkFetcher).FetchBlock","level":"info","line":219,"msg":"networkFetcher.FetchBlock: asking client r-he.algorand-mainnet.network:4160 for block 9289093","time":"2020-09-26T12:41:23.589744+02:00"}
{"file":"fetcher.go","function":"github.com/algorand/go-algorand/catchup.(*NetworkFetcher).FetchBlock","level":"info","line":219,"msg":"networkFetcher.FetchBlock: asking client r-co.algorand-mainnet.network:4160 for block 9289092","time":"2020-09-26T12:41:24.530497+02:00"}
{"file":"logger.go","function":"github.com/algorand/go-algorand/daemon/algod/api/server/lib/middlewares.(*LoggerMiddleware).handler.func1","level":"info","line":56,"msg":"127.0.0.1:60808 - - [2020-09-26 12:41:25.328691883 +0200 CEST m=+47488.546827919] \"GET /v2/status HTTP/1.1\" 200 637 \"Go-http-client/1.1\" 27.616µs","time":"2020-09-26T12:41:25.328746+02:00"}
{"file":"logger.go","function":"github.com/algorand/go-algorand/daemon/algod/api/server/lib/middlewares.(*LoggerMiddleware).handler.func1","level":"info","line":56,"msg":"127.0.0.1:60808 - - [2020-09-26 12:41:25.32911893 +0200 CEST m=+47488.547254967] \"GET /versions HTTP/1.1\" 200 0 \"Go-http-client/1.1\" 18.427µs","time":"2020-09-26T12:41:25.329158+02:00"}
{"callee":"github.com/algorand/go-algorand/ledger.(*CatchpointCatchupAccessorImpl).ResetStagingBalances.func1","caller":"github.com/algorand/go-algorand/ledger/catchupaccessor.go:180","file":"dbutil.go","function":"github.com/algorand/go-algorand/util/db.(*Accessor).atomic","level":"warning","line":344,"msg":"dbatomic: tx surpassed expected deadline by 382.724586ms","name":"","readonly":false,"time":"2020-09-26T12:41:26.173825+02:00"}
{"Context":"sync","file":"service.go","function":"github.com/algorand/go-algorand/catchup.(*Service).periodicSync","level":"info","line":422,"msg":"network ready","name":"","time":"2020-09-26T12:41:26.174094+02:00"}
{"Context":"sync","details":{"StartRound":1983548},"file":"telemetry.go","function":"github.com/algorand/go-algorand/logging.(*telemetryState).logTelemetry","instanceName":"/nRykeA74vF7XHoV","level":"info","line":213,"msg":"/ApplicationState/CatchupStart","name":"","session":"","time":"2020-09-26T12:41:26.174162+02:00"}
{"file":"fetcher.go","function":"github.com/algorand/go-algorand/catchup.NetworkFetcherFactory.NewOverGossip","level":"info","line":127,"msg":"no gossip peers for NewOverGossip","time":"2020-09-26T12:41:26.174198+02:00"}
{"file":"node.go","function":"github.com/algorand/go-algorand/node.(*AlgorandFullNode).SetCatchpointCatchupMode.func1","level":"info","line":935,"msg":"Indexer is not available - indexer is not active","name":"","time":"2020-09-26T12:41:26.174256+02:00"}
{"file":"catchpointService.go","function":"github.com/algorand/go-algorand/catchup.(*CatchpointCatchupService).run","level":"warning","line":201,"msg":"catchpoint catchup stage error : processStageBlocksDownload failed after multiple blocks download attempts","name":"","time":"2020-09-26T12:41:26.174291+02:00"}
{"file":"persistence.go","function":"github.com/algorand/go-algorand/agreement.restore.func3","level":"info","line":146,"msg":"restore (agreement): crash state not found (n = 0)","time":"2020-09-26T12:41:26.174387+02:00"}
{"file":"persistence.go","function":"github.com/algorand/go-algorand/agreement.restore.func3.1","level":"info","line":127,"msg":"restore (agreement): resetting crash state","time":"2020-09-26T12:41:26.174416+02:00"}
{"file":"fetcher.go","function":"github.com/algorand/go-algorand/catchup.(*NetworkFetcher).FetchBlock","level":"info","line":219,"msg":"networkFetcher.FetchBlock: asking client r-rh.algorand-mainnet.network:4160 for block 1983549","time":"2020-09-26T12:41:26.174662+02:00"}
{"file":"fetcher.go","function":"github.com/algorand/go-algorand/catchup.(*NetworkFetcher).FetchBlock","level":"info","line":219,"msg":"networkFetcher.FetchBlock: asking client r-ti.algorand-mainnet.network:4160 for block 1983553","time":"2020-09-26T12:41:26.174693+02:00"}
{"file":"fetcher.go","function":"github.com/algorand/go-algorand/catchup.(*NetworkFetcher).FetchBlock","level":"info","line":219,"msg":"networkFetcher.FetchBlock: asking client r-db.algorand-mainnet.network:4160 for block 1983550","time":"2020-09-26T12:41:26.174753+02:00"}
This is the failure message exctracted from the logs above that caused the fast catchup to fail: {"file":"catchpointService.go","function":"github.com/algorand/go-algorand/catchup.(*CatchpointCatchupService).run","level":"warning","line":201,"msg":"catchpoint catchup stage error : processStageBlocksDownload failed after multiple blocks download attempts","name":"","time":"2020-09-26T12:41:26.174291+02:00"}
Thireus
changed the title
Fast Catchup not working on MainNet with v2.1.6 - catchpoint catchup stage error : processStageBlocksDownload failed after multiple blocks download attempts
Fast Catchup not working on MainNet with v2.1.5 - catchpoint catchup stage error : processStageBlocksDownload failed after multiple blocks download attempts
Sep 28, 2020
Subject of the issue
Despite several attempts to fast catchup using the last checkpoint from https://algorand-catchpoints.s3.us-east-2.amazonaws.com/channel/mainnet/latest.catchpoint, my non-relay node is unable to fast catchup.
catchpoint catchup stage error : processStageBlocksDownload failed after multiple blocks download attempts
Your environment
Steps to reproduce
Before:
Checkpoint set:
goal node catchup -d /var/lib/algorand 9290000#PQRSAA3T7USCIGCVRH7MEXWCEXTASVKPZNCH4UMSLYEXXHHK2D3Q
The node processes the following additional steps:
During the last few minutes of fast catchup:
Once "Catchpoint downloaded" is "done", fast catchup fails silently and the node returns to syncing from where it stopped before the fast catchup attempt:
Expected behaviour
Fast Catchup should work.
Actual behaviour
Fast Catchup fails silently after the "Catchpoint total blocks" process.
Those are the logs during the exact moment the fast catchup silently failed:
This is the failure message exctracted from the logs above that caused the fast catchup to fail:
{"file":"catchpointService.go","function":"github.com/algorand/go-algorand/catchup.(*CatchpointCatchupService).run","level":"warning","line":201,"msg":"catchpoint catchup stage error : processStageBlocksDownload failed after multiple blocks download attempts","name":"","time":"2020-09-26T12:41:26.174291+02:00"}
Related forum post: https://forum.algorand.org/t/fast-catchup-not-working-for-our-node-on-mainnet/1950/46
The text was updated successfully, but these errors were encountered: