Generating snapshots makes geth slow #23873
Found a related issue: #23114. Compared to that (and if I got it right), in my case the blocks complete faster. I set cache to 4096 and now the log looks slightly different: no crazy estimated times anymore, but block sync still lags behind.
EDIT I take that back, same error still:
|
@elsampsa Looks like you are running an archive node. Since you are upgrading from a legacy database (by enabling the snapshot), regeneration is extremely slow. To speed up snapshot generation, one trick you can apply is:
After that your snapshot will be rebuilt. However, with the help of the existing snapshot data, regeneration can take a lot less time. |
My geth cli arguments are:
According to geth:
So I'd need to use "archive" to have an archive node..? I think my node is just a regular "full" node, not an archive node..? But anyway, thanks for the tip. I will try it for sure. |
Ah right. It's a normal full node. My bad. |
This is interesting: block import times are 12-13 seconds per block. There is a thrashing component when snapshot generation runs at the same time as block import, which I tried to address in #23728. However, that PR is incomplete and would not help you. So blocks are being imported really slowly. A few things would be interesting to know:
Also, @rjl493456442 's trick is good. Another thing you might try is to run it in A third thing we could test is to hook it up to our graphing endpoint, and see there if we can detect what the anomaly comes from. |
Chaindata in Read speed test:
Write speed test I'd rather not do.. Disk usage:
It's a "full" node that's been running for a few years. I never did a re-fast-sync or anything like that. Maybe that's the problem? It's now doing the offline snapshot generation as you suggested. Will let you know. |
Yes, the size has grown to |
I wish I could connect with my own node to my own node for a fast re-sync (a slight pun intended). If I do that snapshot import trick as discussed above, that will solve all my performance issues I guess..? |
Unfortunately not. |
So, the only way to keep the db "non-bloated" & fast is to periodically perform a "fast sync". Hmm.. if everyone does that, then it means that eventually the global network will only consist of fast-synced nodes. I guess that's not a problem..? I guess somebody has the complete state trie somewhere.. eeh.. what do I know. I'm just an old guy with an old laptop. :-) |
Just to clarify -- it doesn't matter if anybody does. If I want to know what the state was at block 5M, then I can always re-execute from genesis to obtain it. Might take me a few days (or a week for later >12M blocks), but it's not 'gone'. State is derived from block progression |
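To illustrate the point that state is derived from block progression, here is a minimal toy sketch in Go. It is not geth code: the `transfer`, `block`, `state` types and the `replay` function are hypothetical names made up for this example, and the "state transition" is just balance transfers. It only shows that any historical state can be recomputed by re-executing blocks from genesis.

```go
package main

import "fmt"

// transfer and block are hypothetical, simplified stand-ins for real
// transactions and blocks.
type transfer struct {
	from, to string
	amount   uint64
}

type block struct {
	transfers []transfer
}

// state maps an account name to its balance.
type state map[string]uint64

// replay copies the genesis state and applies the first n blocks in order.
// The genesis map itself is never modified.
func replay(genesis state, chain []block, n int) state {
	s := make(state, len(genesis))
	for acct, bal := range genesis {
		s[acct] = bal
	}
	for i := 0; i < n && i < len(chain); i++ {
		for _, t := range chain[i].transfers {
			if s[t.from] < t.amount {
				continue // toy rule: ignore transfers that would overdraw
			}
			s[t.from] -= t.amount
			s[t.to] += t.amount
		}
	}
	return s
}

func main() {
	genesis := state{"alice": 10}
	chain := []block{
		{transfers: []transfer{{from: "alice", to: "bob", amount: 4}}},
		{transfers: []transfer{{from: "bob", to: "carol", amount: 1}}},
	}
	// State "as of block 1" and "as of block 2", both rebuilt from genesis.
	fmt.Println(replay(genesis, chain, 1))
	fmt.Println(replay(genesis, chain, 2))
}
```

Nothing here stores intermediate states: throwing them away loses nothing as long as the genesis state and the blocks are kept, at the price of re-execution time.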
@elsampsa Yes, accumulating "junk" states all the time is not good. However, I have recently been working on a new state scheme; with the new scheme, all stale states will be pruned by default. It means you can enjoy a full node with a low state growth rate. Unfortunately this big change still needs time to be finished/reviewed/tested. Hopefully we can ship it as soon as possible. |
@elsampsa In case you want to try my trick, don't forget to sync your new node with the mode "syncmode=snap", not the legacy fast sync! Also, I don't think your node can serve the new snap sync, since your snapshot is not generated yet. But snap sync is faster than fast sync, so even syncing against the public network should work. |
Thank you for taking the time to explain the basics to me. Now I (sort of) get it. Slowdown: when a new block arrives, it must be verified. For verification, we need the state. (i) Geth needs to crunch through the whole database to get the (partial?) state for the accounts that are modified by the new block. However, if there is a snapshot of the complete state, there is no need to crunch through the whole database (=fast). Or (ii), alternatively to (i), for a full node that's been running for ages, there is a complete state trie, but it's been constructed in a suboptimal way..? So geth needs more time to crunch through it. Looking forward to that new state scheme update..! Please feel free to close this ticket now. :) (whoops.. I closed it accidentally myself) |
My explanation was not fully correct. Whenever we want to process a new block, we need to read a lot of state. Historically, we've looked up e.g. an account by traversing the trie, meaning that each account lookup takes maybe ~7 db lookups. With snapshots, it's more of an O(1) db lookup, so that's faster. However, when the trie root is calculated, we still need to look up a lot of trie nodes. So snapshots make it faster, but if you have a very large DB, it may still be too slow even with fewer lookups. The db basically becomes less efficient: the trie keys are spread all over the disk, and the caches/bloom filters aren't as effective as they would be on a smaller dataset. |
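The difference between walking the trie and reading from a flat snapshot can be sketched with a toy model in Go. This is an assumption-laden illustration, not geth's implementation: `countingDB`, `insertAccount`, `trieLookup`, `snapshotLookup` and `readsFor` are hypothetical names, the "trie" stores one node per key prefix instead of hash-addressed Merkle Patricia nodes, and all paths are assumed to have the same length. It only counts how many key-value reads each lookup style needs.

```go
package main

import "fmt"

// countingDB is a hypothetical stand-in for the key-value store.
// It counts every read so the two lookup styles can be compared.
type countingDB struct {
	data  map[string][]byte
	reads int
}

func newToyDB() *countingDB {
	return &countingDB{data: make(map[string][]byte)}
}

func (db *countingDB) get(key string) ([]byte, bool) {
	db.reads++
	v, ok := db.data[key]
	return v, ok
}

// insertAccount writes one toy trie node per prefix of path (the leaf
// holds the value) and one flat snapshot entry keyed by the full path.
func insertAccount(db *countingDB, path string, value []byte) {
	for i := 0; i < len(path); i++ {
		db.data["node:"+path[:i]] = []byte{} // interior node marker
	}
	db.data["node:"+path] = value
	db.data["snap:"+path] = value
}

// trieLookup reads every node from the root down to the leaf:
// one db read per level.
func trieLookup(db *countingDB, path string) ([]byte, bool) {
	for i := 0; i < len(path); i++ {
		if _, ok := db.get("node:" + path[:i]); !ok {
			return nil, false
		}
	}
	return db.get("node:" + path)
}

// snapshotLookup reads the flat entry directly: a single db read.
func snapshotLookup(db *countingDB, path string) ([]byte, bool) {
	return db.get("snap:" + path)
}

// readsFor reports how many db reads one lookup performed.
func readsFor(lookup func(*countingDB, string) ([]byte, bool), db *countingDB, path string) int {
	before := db.reads
	lookup(db, path)
	return db.reads - before
}

func main() {
	db := newToyDB()
	insertAccount(db, "a1b2c3", []byte("balance=1"))
	fmt.Println("trie reads:    ", readsFor(trieLookup, db, "a1b2c3"))
	fmt.Println("snapshot reads:", readsFor(snapshotLookup, db, "a1b2c3"))
}
```

With a six-character path the toy trie needs seven reads (root plus six levels) while the flat lookup needs one. The sketch says nothing about computing the trie root, which still has to touch trie nodes as described above.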
"the trie keys are spread all over the disk".. this sounds similar to normal filesystem fragmentation.. could one do a "defrag" on the trie..? |
Same issue: my |
I'm experiencing a slow-down due to snapshot generation. Snapshot generation takes ages (the estimated time just keeps growing) and the blockchain sync slows down as well.
System information
Geth version:
Geth/v1.10.10-stable-bb74230f/linux-amd64/go1.15.6
OS & Version: Linux
Commit hash : (not develop)
Expected behaviour
Actual behaviour
Steps to reproduce the behaviour
Run geth on a laptop with:
Backtrace
ETA for snapshots keeps on growing, so does the "age" for the chain segments.