Tips for running a BSC full node #502
Comments
@guagualvcha thank you. Could you please elaborate on what exactly |
ERROR[11-02|06:02:55.001] Failed to open snapshot tree err="head doesn't match snapshot: have 0x5c17a8fc0164dabedd446e954b64e8a54fc7c8b4fee1bbd707c3cc3ed1e45fff, want 0x431565cee8b7f3d7bbdde1265304fa4574dc3531e511e9ffe43ae79d28e431d6" |
@guagualvcha |
I don't want to be rude here, but BSC is in real danger. These past couple of weeks were a nightmare for me: I can't resync. I started digging, got in touch with admins, etc., and it's not just me. Geth 1.1.3 was a nightmare and 1.1.4 is not helping much; the solution you give here doesn't solve anything. If you guys don't figure out the syncing issue with a proper patch, we don't have a bright future. Yesterday I tried to download the EU snapshot and it was corrupted, and today's full state seems corrupted too (retrying the download). |
Agreed, it's a total mess. |
Ethereum is a grid network, while BSC is an areatus network. This means that transactions flow from different full nodes all around the world to the 21 validators. Validators are usually guarded by a sentry node that joins the network directly. As the transaction volume on BSC is much larger, the sentry nodes are under pressure from the transaction exchange protocol. We extended the protocol so that any full node can declare that it is not interested in pending transactions because it is not a validator/miner, which saves a lot of network and computation resources. It can be enabled by adding |
No need to wait. |
Sorry about that. As far as I know, ops is uploading a new snapshot now that they are aware of the problem, along with some monitors to ensure data integrity. For the syncing issue, would you open the pprof port of your nodes and do |
Is v1.1.3 or v1.1.4 recommended now? |
I have upgraded to 1.1.4 and started the node with snapshot data. |
@guagualvcha Is the pruning done?
|
Well, the mirrors are gone, and for now I don't have a node, as I can't sync from genesis (and can't download snapshots). I tried your method again and got (tried twice): To be honest, it's very confusing. I used to fast-resync from genesis; it took 6-9 hours to import state entries, then 3 days to download everything. |
I totally agree with that, as we face exactly the same scenario. What is worst is the total lack of communication in comparison with the size of the project. |
@Lajoix @jcaffet If you're constantly having sync issues and can't catch up to network - review your hardware setup, maybe spin up a different instance on a different region, maybe you just got a busy host, who knows. |
Thanks for your feedback. We have aws i3en.xlarge instances with xfs ... but no raid0 yet. |
I'm running 24 cores, 64 GB RAM and a 1 TB NVMe SSD on Ubuntu with a 1 Gbps connection. I'm on ext4, but I'm not sure changing to xfs would really make a difference. I used to resync from genesis with no issue.
I'm very confused, to be honest. |
This timeline lines up with my experience as well. Node ran fine for months until about a week ago. Current server is ryzen 5950 with 128gb ram + nvme ssd on raid1. Seen others have no issues with lesser specs, but I have been unable to keep up with the current block, always about 50 behind. |
yes |
Would it be possible to have some guidance on what tools, if any, we can use to analyse the profile_60s.out file ourselves? |
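One option for looking at profile_60s.out yourself, assuming a Go toolchain is installed on the machine where you inspect it, is the standard `go tool pprof` viewer. The sketch below is only an illustration: the function name `analyze_profile` and the node count of 25 are arbitrary choices, not something from this thread.

```shell
# Print the hottest functions recorded in a CPU profile captured from the
# node's pprof endpoint. Assumes `go` is on PATH; the function name is
# hypothetical.
analyze_profile() {
  profile="${1:-profile_60s.out}"

  # Refuse to run on a missing or empty file (e.g. a failed curl download).
  if [ ! -s "$profile" ]; then
    echo "profile not found or empty: $profile" >&2
    return 1
  fi

  # -top prints a text table sorted by CPU time; -nodecount limits its length.
  go tool pprof -top -nodecount=25 "$profile"
}
```

Running `analyze_profile profile_60s.out` prints a plain-text table. For an interactive view, `go tool pprof -http=localhost:8081 profile_60s.out` serves a web UI (the graph view additionally needs Graphviz).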
So, does this mean I cannot subscribe to pendingTransactions? |
After upgrading to version v1.1.3 and running the service with your suggested settings, I am stuck in syncing. The error log, in brief, is shown below:
|
Totally agree |
I'm following all of these recommendations, and I'm using the recommended AWS hardware for a validator node, even though I'm only operating a regular node. My node still can't catch up after starting from the latest snapshot this morning. The likely problem is that there are not enough healthy nodes in the network providing blocks to nodes that go out of sync. Do you have any suggestions for this problem? Can Binance provide healthy nodes to ensure that others can sync? |
It's probably more of a problem of too many blocks and/or too large a size and it takes too long to process them versus any propagation issues. |
No, it's not. I've regularly seen higher processing throughput on the same node hardware under healthier network conditions. It's reported as mgasps, gas units per second, in the "imported new chain segment" message. Increasing the block size wouldn't change the node's rate of processing gas units. The bottleneck is that the node likely isn't receiving enough new blocks to reach its throughput capacity. I'm using the recommended validator hardware. |
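For anyone who wants to track that figure over time, a rough sketch for pulling mgasps out of a node log follows. It assumes the value appears as a `mgasps=<number>` field on lines containing "Imported new chain segment"; the function names are made up for illustration, so adjust the pattern if your log format differs.

```shell
# Read log lines on stdin and print one mgasps value per import message.
# Assumes a key=value log format with a "mgasps=<number>" field.
extract_mgasps() {
  grep -i 'imported new chain segment' \
    | sed -n 's/.*mgasps=\([0-9.]*\).*/\1/p'
}

# Average the extracted values; prints "n/a" when no import lines were seen.
avg_mgasps() {
  extract_mgasps \
    | awk '{ sum += $1; n++ } END { if (n) printf "%.1f\n", sum / n; else print "n/a" }'
}
```

Something like `tail -n 5000 bsc.log | avg_mgasps` (the log file name is a placeholder) then gives a quick average to compare against the suggested minimum of 50.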
Hi guys, Is the pruning command line still the same? According to this release, there is a new data prune tool introduced. I am trying to prune on my BSC full node but i'm unsure of the command. Any help will be appreciated. Thanks! |
We've successfully started a full node from snapshot. The snapshot started at 1.5T - is this the minimal size at this time? |
BNB48Club/bsc-snapshots contributing another dump of snapshot (543.61G) |
Cool! How do you make these? Could you share any links to docs or explanation? @du5 |
How can it be that small? I have pruned my node and it only shrank to 1.3TB |
follow the official documentation |
I've successfully set up and run my full node using BNB48Club's snapshots and have been able to get it to sync. Is this due to the snapshot that was downloaded? Or could it be the configuration of the node? For reference, my startup parameters are:
IOPS and mgasps exceed the recommendations (averaging around 300 mgasps, so synchronization is not the issue). Likewise, my config file is here:
Any advice on how to fix this issue? The only possible thing I could think of was the snapshot not having the correct indexes - which doesn't make a ton of sense and probably wouldn't even allow sync. |
@tpalaz the reason for the small size is that the historical transactions are pruned |
@tpalaz If the transaction you need to retrieve is from a long time ago, it is recommended to use erigon and turn off pruning; it will surprise you with its efficiency |
I got it running very quickly. Thanks for sharing. For our purpose we only need recent blocks so it fits very well. |
Dear @guagualvcha, the mgasps value of one of our nodes is below 50. Attached is the profile file; please help check it. |
Excuse me, my node is not fully synced. Can I restart a full sync from block 0? How do I set it to start from 0? |
BAD BLOCK with version v1.1.12. I installed the new version and it ran fine for a day. At block 20,023,433 it stopped with a BAD BLOCK message and cannot move forward. It was running with --diffsync. I have tried the solution recommended in #628. I was trying to restart it with --snapshot=false and then with --snapshot=false. It did not help. Any suggestions on how to fix it? Is there a technical Discord forum related to BSC? The invite https://discord.com/invite/binancesmartchain does not work anymore. |
After syncing from snapshot geth-20220816.tar.lz4, I can only get block data after block number "19144096". All eth.getBlock calls from block 1 to 19144096 return null, whereas from 19144097 to lastBlockNumber they return data. Geth: 1.1.12 running command: ./geth_linux --config ./config.toml --datadir ./mainnet --cache 100000 --rpc.allow-unprotected-txs --txlookuplimit 0 --http --maxpeers 100 --ws --syncmode=full --snapshot=true --diffsync Is the snapshot a full copy of all blocks (from genesis to now), or just a copy of the latest blocks? |
snapshot probably contains only the last 128 blocks. It's for sure not the archive (all the data) snapshot, only the last blocks |
Since release v1.1.8, the official BSC snapshots are block-pruned; before that they were only state-pruned. That's the reason block details are missing. |
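To see where a block-pruned snapshot actually begins on your own node, you can probe individual block numbers over JSON-RPC. This is only a sketch: it assumes HTTP-RPC is enabled (--http) and reachable at http://127.0.0.1:8545, that the node returns compact JSON, and the helper names `block_request` and `has_block` are invented for this example.

```shell
# Build the JSON-RPC body for eth_getBlockByNumber from a decimal block number.
block_request() {
  printf '{"jsonrpc":"2.0","id":1,"method":"eth_getBlockByNumber","params":["0x%x",false]}' "$1"
}

# Succeed if the node returns a block object for the given number.
# The default URL is an assumption; pass your own endpoint as the second argument.
has_block() {
  url="${2:-http://127.0.0.1:8545}"
  curl -s -H 'Content-Type: application/json' \
       -d "$(block_request "$1")" "$url" \
    | grep -q '"result":{'
}
```

With the numbers reported above, `has_block 19144096` would be expected to fail and `has_block 19144097` to succeed on a node restored from that snapshot.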
Snapshots start from the latest snapshot and are not started from the original snapshot.
|
replaced by: #1947 |
Some of the enhancements below can address the existing challenges with running a BSC full node:
Binary
All clients are advised to upgrade to the latest release, which is expected to be more stable and to perform better.
Storage
According to our tests, the performance of a full node degrades when the storage size exceeds 1.5 TB. We suggest that a full node always keep its storage light by pruning it.
To prune, run:
nohup geth snapshot prune-state --datadir {the data dir of your bsc node} &
It will take 3-5 hours to finish. Maintainers should always have a few backup nodes, so that they can switch to a backup while one node is pruning.
The hardware is also important. Make sure the SSD meets: 2 TB of free disk space, solid-state drive (SSD), gp3, 8k IOPS, 250 MB/s throughput, read latency <1 ms.
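A small check like the following can tell you when a data directory has grown past the 1.5 TB mark mentioned above. It is a sketch, not an official tool: the helper names are hypothetical, 1.5 TB is approximated as 1536 GiB, and `du` on a multi-terabyte directory can itself take a while.

```shell
# Size of a directory in KiB, as reported by du.
datadir_kib() {
  du -sk "$1" | cut -f1
}

# Succeed if used_kib is strictly greater than limit_gib (given in GiB).
exceeds_gib() {
  used_kib="$1"
  limit_gib="$2"
  [ "$used_kib" -gt $(( limit_gib * 1024 * 1024 )) ]
}

# Succeed if the directory is larger than the limit (default: 1536 GiB).
datadir_exceeds() {
  exceeds_gib "$(datadir_kib "$1")" "${2:-1536}"
}
```

For example, `datadir_exceeds /path/to/datadir && echo "consider pruning"` could run from cron; the path is a placeholder for your own data directory.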
Light Storage
When the node crashes or is force-killed, it resumes syncing from a block that is a few minutes or a few hours old. This is because the in-memory state is not persisted to the database in real time, so the node needs to replay blocks from the last checkpoint. The replay time depends on the TrieTimeout setting in config.toml. We suggest raising it if you can tolerate a long replay time, so that the node can keep its storage light.
Performance Tuning
In the logs, mgasps indicates the block processing ability of the full node; make sure the value is above 50.
The node can enable profiling with the --pprof flag. Capture a profile with:
curl -s "http://127.0.0.1:6060/debug/pprof/profile?seconds=60" > profile_60s.out
The dev community can then help analyze the performance.
New Node
If you build a new BSC node, please fetch a snapshot from: https://github.com/binance-chain/bsc-snapshots