Current Behavior
I came across an issue while running a validator node.
Every so often, the node finds a checksum mismatch in a block and crashes with this error:
"Corruption on data-block checksum mismatch error".
All the obvious things have been tried: deleting the DB, re-syncing, starting a new validator, creating new accounts, reinstalling dependencies, etc.
The error keeps recurring.
The blocks are different each time, and the head block that the chain has synced up to is much higher than the block with the mismatch.
In fact, the validator works perfectly for a while before failing.
NOTE: If I leave it running, the chain OFTEN resumes syncing (6 - 12 hours later), but it then crashes again.
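Because the mismatch lands somewhere different on each crash, it may help to pull the position, checksums, and file name out of every failure line and compare them across crashes. The sketch below is only an illustration: the regular expression is written against the single error line quoted in this report, and the `corruption` type and `parseCorruption` function are hypothetical names, not part of Tendermint or goleveldb.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// corruption holds the fields of one checksum-mismatch message.
// (Hypothetical helper type, introduced only for this sketch.)
type corruption struct {
	Pos  int64  // byte offset reported as pos=...
	Want uint32 // value reported as want=...
	Got  uint32 // value reported as got=...
	File string // table file reported as file=...
}

// Pattern modelled on the one error line quoted in this issue.
var corruptionRe = regexp.MustCompile(
	`corruption on data-block \(pos=(\d+)\): checksum mismatch, ` +
		`want=0x([0-9a-fA-F]+) got=0x([0-9a-fA-F]+) \[file=([^\]]+)\]`)

// parseCorruption extracts the mismatch details from a log line.
// It returns false if the line does not contain such a message.
func parseCorruption(line string) (corruption, bool) {
	m := corruptionRe.FindStringSubmatch(line)
	if m == nil {
		return corruption{}, false
	}
	pos, err := strconv.ParseInt(m[1], 10, 64)
	if err != nil {
		return corruption{}, false
	}
	want, err := strconv.ParseUint(m[2], 16, 32)
	if err != nil {
		return corruption{}, false
	}
	got, err := strconv.ParseUint(m[3], 16, 32)
	if err != nil {
		return corruption{}, false
	}
	return corruption{Pos: pos, Want: uint32(want), Got: uint32(got), File: m[4]}, true
}

func main() {
	line := `CONSENSUS FAILURE!!! module=consensus err="leveldb/table: corruption on data-block (pos=399680): checksum mismatch, want=0xcf6de1ec got=0x99ba8252 [file=97839418.ldb]"`
	if c, ok := parseCorruption(line); ok {
		fmt.Printf("file=%s pos=%d want=%#x got=%#x\n", c.File, c.Pos, c.Want, c.Got)
	}
}
```

Running this over the logs of several crashes and listing the distinct `file` values would show whether the same table file fails repeatedly or a different one each time.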
Expected Behavior
The chain should sync stably and continuously.
Reproduction
Not sure if it's possible to reproduce on purpose.
But similar errors have been reported in one form or another for the databases of other chains, e.g. BTC, ETH:
Log
This is what the error itself looks like when the chain crashes; the block number differs from crash to crash:
CONSENSUS FAILURE!!! module=consensus err="leveldb/table: corruption on data-block (pos=399680): checksum mismatch, want=0xcf6de1ec got=0x99ba8252 [file=97839418.ldb]" stack="goroutine 1022538 [running]:\nruntime/debug.Stack(0xc0f3301870, 0xfd53c0, 0xc0578403c0)\n\t/usr/local/go/src/runtime/debug/stack.go:24 +0x9d\ngithub.com/tendermint/tendermint/consensus.
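For context, here is a minimal sketch of the kind of check that produces a message like this. It assumes the LevelDB table layout as I understand it: each block is followed by a one-byte compression type and a four-byte little-endian masked CRC32C, and the reader recomputes the checksum on every read. The function names are hypothetical and this is not goleveldb's actual code. Presumably `want` is the value stored on disk and `got` the recomputed one, but that is also an assumption.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// maskedCRC32C computes CRC32C over b and applies the rotate-and-add
// mask that LevelDB is assumed to use for stored checksums.
func maskedCRC32C(b []byte) uint32 {
	c := crc32.Checksum(b, castagnoli)
	return (c>>15 | c<<17) + 0xa282ead8
}

// sealBlock appends the assumed 5-byte trailer to a payload:
// compression type, then the masked CRC32C of payload+type.
func sealBlock(payload []byte, compression byte) []byte {
	out := append(append([]byte{}, payload...), compression)
	var sum [4]byte
	binary.LittleEndian.PutUint32(sum[:], maskedCRC32C(out))
	return append(out, sum[:]...)
}

// verifyBlock returns the stored checksum, the recomputed checksum,
// and whether they match.
func verifyBlock(block []byte) (want, got uint32, ok bool) {
	if len(block) < 5 {
		return 0, 0, false
	}
	n := len(block) - 4
	want = binary.LittleEndian.Uint32(block[n:])
	got = maskedCRC32C(block[:n])
	return want, got, want == got
}

func main() {
	block := sealBlock([]byte("example block payload"), 0)
	_, _, ok := verifyBlock(block)
	fmt.Println("intact block verifies:", ok)

	block[3] ^= 0x01 // simulate a single flipped bit on disk or in RAM
	want, got, ok := verifyBlock(block)
	fmt.Printf("after bit flip: ok=%v want=%#x got=%#x\n", ok, want, got)
}
```

The point of the sketch is that a single flipped bit anywhere in the stored block is enough to fail the check, so the error alone does not tell you whether the bytes were damaged on disk, in memory, or in transit between them.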
This is what the log looks like when the node tries to sync with the mismatch already in place:
(This is from a different crash than the one above, but the output looks exactly the same.)
E[2019-10-01|07:01:45.455] Connection failed @ sendRoutine module=p2p peer=561ac562a79db5c7aebc4dbefd2d728836ce412e@0.0.0.0:26656 conn=MConn{93.125.26.210:26656} err="pong timeout"
E[2019-10-01|07:01:45.455] Stopping peer for error module=p2p peer="Peer{MConn{93.125.26.210:26656} 561ac562a79db5c7aebc4dbefd2d728836ce412e out}" err="pong timeout"
E[2019-10-01|07:01:45.539] Connection failed @ sendRoutine module=p2p peer=b34bcaa7536d0f7e09f775d56ceced3c29ba62c0@95.216.244.235:46656 conn=MConn{95.216.244.235:46656} err="pong timeout"
E[2019-10-01|07:01:45.539] Stopping peer for error module=p2p peer="Peer{MConn{95.216.244.235:46656} b34bcaa7536d0f7e09f775d56ceced3c29ba62c0 out}" err="pong timeout"
E[2019-10-01|07:02:00.651] Failed Sanity Check! Cant add old address to new bucket module=p2p book=/root/.cyberd/config/addrbook.json ka="&{Addr:b34bcaa7536d0f7e09f775d56ceced3c29ba62c0@95.216.244.235:46656 Src:6a0fb53aeedbd6882963413ad6cc5bd52cf01cdb@0.0.0.0:26656 Attempts:0 LastAttempt:2019-09-30 16:23:23.102398865 +0000 UTC m=+13558.026448007 LastSuccess:2019-09-30 16:23:23.102398865 +0000 UTC m=+13558.026448007 BucketType:2 Buckets:[50]}" bucket=102
E[2019-10-01|07:02:05.332] Error on broadcastTxCommit module=rpc err="Timed out waiting for tx to be included in a block
Additional Information
System (local machine):
Some information from Tendermint users (no one actually has a solution):
I have opened a similar issue on the tendermint git: