"Head state missing, repairing chain" I lost many blocks , how can I get them back? #19124

Closed
relaxbao opened this issue Feb 19, 2019 · 9 comments

Comments

@relaxbao

relaxbao commented Feb 19, 2019

Hi,

I run geth under supervisor on Ubuntu as a private chain. Geth crashed yesterday because generating the DAG file needed more CPU than our server could spare.

When it restarted, we had lost many blocks: the block number went from 1110002 back to 422522.
I need the lost blocks. Can I get them back?

Looking forward to your response, thank you!

System information

Geth version: 1.8.11
OS & Version: Linux Ubuntu

Expected behaviour

The block height should still be 1110002.

Actual behaviour

Now the block number is 422522, but it should be 1110002. How can I get back to 1110002?
Here is the log:

Head state missing, repairing chain      number=1110002 hash=3610a2…5c104d
INFO [02-18|16:07:13] Rewound blockchain to past state         number=422522  hash=e6875a…a56025
INFO [02-18|16:07:13] Loaded most recent local header          number=1110002 hash=7620a2…5c104d td=875201842768
INFO [02-18|16:07:13] Loaded most recent local full block      number=422522  hash=e6875a…a56025 td=460514461017
INFO [02-18|16:07:13] Loaded most recent local fast block      number=1110002 hash=7620a2…5c104d td=875201842768

Steps to reproduce the behaviour

I restarted geth, but the result is just the same.

@karalabe
Member

karalabe commented Feb 19, 2019

Geth keeps the state in memory (and garbage collects in memory) and only flushes every hour or so. If Geth crashes, whatever was in memory is lost.

In your case the block data is still there, just the historical states got lost, so the chain rolled back. Normally this is not that big of an issue, because when you reconnect to the network, Geth reprocesses from a past block. If you run a single node, however, there might be no remote peer with the data.

Long term I think we should fix Geth so that it reprocesses the blocks locally instead of reaching out to the network. Short term that won't help you, but you could try to do a `geth export chain.rlp 0 1110002` and then import into a different datadir (to make sure you don't lose any data).
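
Roughly, that could look like the following sketch (the datadir paths here are placeholders for your own setup, and the node must be stopped before exporting so the database is not locked):

# export the blocks that are still on disk from the old datadir
geth --datadir /path/to/old/datadir export chain.rlp 0 1110002

# initialise a fresh datadir from the same genesis file, then re-import;
# importing re-executes the blocks and rebuilds the state
geth --datadir /path/to/new/datadir init genesis.json
geth --datadir /path/to/new/datadir import chain.rlp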

@tsujp

tsujp commented Mar 7, 2019

This only affects blocks, right? Not the keystore etc.?

@relaxbao
Author

relaxbao commented Mar 7, 2019

This only affects blocks, right? Not the keystore etc.?

Yes, it only affects blocks.

Actually, I can still get the transactions in the higher blocks, but when it starts mining, the block number increases from the lower number, 422522.

On the other hand, the contract storage went back to its state at block 422522, while I need the state at block 1110002.

Is there some way to solve this problem?

@karalabe
Member

karalabe commented Mar 7, 2019

I wrote in my previous comment that you could have exported your chain and fixed it that way. If you started mining on top, it's probably way too messy now to try and extract the correct blocks.

@hito This only affects the state, yes.

@relaxbao
Author

relaxbao commented Mar 14, 2019

Geth keeps the state in memory (and garbage collects in memory) and only flushes every hour or so. If Geth crashes, whatever was in memory is lost.

In your case the block data is still there, just the historical states got lost, so the chain rolled back. Normally this is not that big of an issue, because when you reconnect to the network, Geth reprocesses from a past block. If you run a single node, however, there might be no remote peer with the data.

Long term I think we should fix Geth so that it reprocesses the blocks locally instead of reaching out to the network. Short term that won't help you, but you could try to do a `geth export chain.rlp 0 1110002` and then import into a different datadir (to make sure you don't lose any data).

Thank you so much, I think it's a great way to save all the data.
But after exporting my data and importing it into a new datadir, I got an error.

INFO [03-14|11:23:04] Imported new chain segment               blocks=2500 txs=43   mgas=2.321  elapsed=2.286s mgasps=1.015  number=420000 hash=ce1ed8…6dc233 cache=1.12mB
INFO [03-14|11:23:06] Imported new chain segment               blocks=2500 txs=5    mgas=1.446  elapsed=2.098s mgasps=0.689  number=422500 hash=0b6e6d…117d9e cache=1.12mB
ERROR[03-14|11:23:06] Non contiguous block insert              number=423619 hash=100756…a3b36b parent=ff2b53…f13d44 prevnumber=423618 prevhash=8d7930…bcebb6
ERROR[03-14|11:23:06] Import error                             err="invalid block 423619: non contiguous insert: item 1117 is #423618 [8d793034…], item 1118 is #423619 [10075641…] (parent [ff2b53e8…])"
INFO [03-14|11:23:06] Writing cached state to disk             block=422500 hash=0b6e6d…117d9e root=e6191e…df4f85

Here are my blocks 423618 and 423619, and the parent block of 423619. Is there something wrong with them?

> eth.getBlock(423618)
{
  difficulty: 941215,
  extraData: "0xd88301080b846765746888676f312e31302e32856c696e7578",
  gasLimit: 4294967295,
  gasUsed: 0,
  hash: "0x8d79303491e8384dedb57812e5c8eefd83d8125e5c287a7009000ed292bcebb6",
  logsBloom: "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
  miner: "0xa969f32fcdc83a6039286f267f2e7a246b4b030a",
  mixHash: "0xad860354cf2e2b2a9e2c13bb72d1abef7606ea6e1d60f4d995e0e0e09f159f22",
  nonce: "0x230828cb4887e0b0",
  number: 423618,
  parentHash: "0xe2025cb6ddf79f3dc2414301b715b54a9aad10b0f25e494882133c2551377493",
  receiptsRoot: "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421",
  sha3Uncles: "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347",
  size: 540,
  stateRoot: "0xce0241488e1af373ffb9ae91eaf74cbeacb7984c0a3e293c64f676eed1c36fc1",
  timestamp: 1550546650,
  totalDifficulty: 460584013838,
  transactions: [],
  transactionsRoot: "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421",
  uncles: []
}

> eth.getBlock(423619)
{
  difficulty: 1000444,
  extraData: "0xd88301080b846765746888676f312e31302e32856c696e7578",
  gasLimit: 4294967295,
  gasUsed: 0,
  hash: "0x10075641add742f1447a67c9fc1136a5492a9b622e883042f38864ba33a3b36b",
  logsBloom: "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
  miner: "0xd271baa1ed277c3730ad5b88ef97a5921d7a8c77",
  mixHash: "0x59f649b3123e348d84c889ae2e50f9240f1b826f65d2818c803ef94c15e6c84e",
  nonce: "0x0d7db510f26a7025",
  number: 423619,
  parentHash: "0xff2b53e8424ddaa724a9ab3561ef44c8dfee4d260b938b5212813cc379f13d44",
  receiptsRoot: "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421",
  sha3Uncles: "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347",
  size: 540,
  stateRoot: "0xba6860c90a95997f28b4da3c7f42cbf8d2e48e9f288dfeefd7e374b164eff745",
  timestamp: 1536726254,
  totalDifficulty: 460585640739,
  transactions: [],
  transactionsRoot: "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421",
  uncles: []
}

> eth.getBlock("0xff2b53e8424ddaa724a9ab3561ef44c8dfee4d260b938b5212813cc379f13d44")
{
  difficulty: 999952,
  extraData: "0xd88301080b846765746888676f312e31302e32856c696e7578",
  gasLimit: 4294967295,
  gasUsed: 0,
  hash: "0xff2b53e8424ddaa724a9ab3561ef44c8dfee4d260b938b5212813cc379f13d44",
  logsBloom: "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
  miner: "0x10592c5155ad6655189bac1a61af49083e37152c",
  mixHash: "0xc52cd8c7110438d401e37dd359e31dfd719ec8ff20bfc7827697b40d9afc3ad4",
  nonce: "0x5e46c9d131d5a90d",
  number: 423618,
  parentHash: "0xc8a26455ee1781826047fe913824669fd3f30646b59fb226f917869f3cbcafb1",
  receiptsRoot: "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421",
  sha3Uncles: "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347",
  size: 540,
  stateRoot: "0xddbf4b764d59a7c11cf453a5bfaa90ad5b45ced46c159acacd82e550da59e55d",
  timestamp: 1536726248,
  totalDifficulty: 460584640295,
  transactions: [],
  transactionsRoot: "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421",
  uncles: []
}

I used the following steps to recover my data. Is there something wrong with them?

geth --datadir /home/workspace/recoverdatas/rdata export /home/workspace/recoverdatas/rdata/chain423619.rlp 0 423619

geth --datadir "/home/workspace/recoverdatas/datanew" init "/home/workspace/data/conf/genesis.json"

geth --datadir /home/workspace/recoverdatas/datanew import /home/workspace/recoverdatas/rdata/chain423619.rlp
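
Judging from the output above, eth.getBlock(423618) and the parent of 423619 are two different blocks at the same height, so the old database's number-to-hash index no longer seems to follow one contiguous chain. A quick way to see how far that mismatch extends would be to walk the range around the failure and compare each block's hash with the next block's parentHash, for example (a sketch, run with the old node still running on that datadir; the IPC path assumes the default geth.ipc location inside it):

# report every height N where block N+1 does not point at the canonical block N
geth --exec '
  for (var n = 423600; n < 423619; n++) {
    var cur = eth.getBlock(n), next = eth.getBlock(n + 1);
    if (next.parentHash != cur.hash)
      console.log("mismatch above block", n, ":", cur.hash, "vs", next.parentHash);
  }
' attach ipc:/home/workspace/recoverdatas/rdata/geth.ipc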

Shadowfiend added a commit to keep-network/keep-core that referenced this issue Apr 15, 2019
…th-persistent-disks

ETH Deployment: kill StatefulSets

Every now and again we were losing the entire chain state in our
`keep-dev` env. After tracking things down (see thread), we determined it
happens when all of our participating eth nodes get shot by Kube.

This shouldn't matter, since we're using persistent disks with the eth
`datadir` set there. It turns out this isn't enough. We lose state in
memory when the geth process is terminated. Per ethereum/go-ethereum#19124.
This is normally fine because there's at least one other node on the
real networks that can fill in the blanks when you come back up.

In our case there's not always another node, we were running only 2.

Now we're running 6 to try and hedge against all getting shot at once. 
Here we've removed the `StatefulSet` deployments because they're doing
us no good.
@BeOleg

BeOleg commented Apr 28, 2019

I get this as well, every time I restart the node via docker-compose:

root@nexwallet-eth3:/mnt/STORAGE/WALLETS# cat /var/lib/docker/containers/23406ad9bc38a5b9ab3c8e342295c91f2550d3b5c16f00d681f181aed9721c0d/23406ad9bc38a5b9ab3c8e342295c91f2550d3b5c16f00d681f181aed9721c0d-json.log | grep 'Head state missing'
{"log":"WARN [04-24|12:13:48.884] Head state missing, repairing chain      number=7623347 hash=b6c254…3a0dad\n","stream":"stderr","time":"2019-04-24T12:13:48.885446409Z"}
{"log":"WARN [04-25|19:19:10.008] Head state missing, repairing chain      number=7638271 hash=c73676…151eb8\n","stream":"stderr","time":"2019-04-25T19:19:10.008881355Z"}
{"log":"WARN [04-26|08:52:25.125] Head state missing, repairing chain      number=7641791 hash=c263d9…dfbe79\n","stream":"stderr","time":"2019-04-26T08:52:25.125447127Z"}
{"log":"WARN [04-28|09:21:28.698] Head state missing, repairing chain      number=7655058 hash=f11663…507ae5\n","stream":"stderr","time":"2019-04-28T09:21:28.698886159Z"}

Or if it restarts due to some fault, I lose a day or two of blocks.
How do I solve this? How do I restart properly?
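
Part of the answer may simply be giving geth time to shut down cleanly: on a graceful SIGINT/SIGTERM shutdown it writes its cached state to disk, whereas docker's default 10-second stop timeout can kill the process before that flush finishes. A sketch of a gentler restart (the service name and the 300-second timeout are placeholders):

# allow a generous grace period so geth can flush its in-memory state
docker-compose stop -t 300 geth
docker-compose start geth

# or, for a plain container
docker stop -t 300 <container>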

@holiman
Contributor

holiman commented Apr 29, 2019

@BeOleg I see that you've opened #19504, so let's continue over there.

@relaxbao yes, something is wrong with it! It seems to have lost track of the canonical chain, and there's a discrepancy in the chain. This is very interesting; however, since you're on a very old version, 1.8.11, I doubt we'll be able to get to the bottom of it.

@holiman
Contributor

holiman commented May 21, 2019

@relaxbao your scenario was fixed in #19514

@relaxbao
Author

relaxbao commented Jul 4, 2019

@relaxbao your scenario was fixed in #19514

@holiman Thank you very much, but I still have two questions:

  1. Can I get the blocks back if I stay on version 1.8.11?
  2. Is there some way to avoid this happening again?

Shadowfiend added a commit to keep-network/keep-common that referenced this issue Sep 4, 2019
…th-persistent-disks

ETH Deployment: kill StatefulSets
@fjl fjl removed the status:triage label Aug 27, 2020