
Stuck syncing #105

Closed
Lodimup opened this issue Sep 5, 2023 · 10 comments

Comments


Lodimup commented Sep 5, 2023

The node always stays at around:

Latest synced block behind by: 724 minutes

CPU: 13900K
RAM: 16 GB
Disk: NVMe
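
For reference, the "behind by" figure can be reproduced by comparing the latest synced block's timestamp against the wall clock. A minimal sketch, assuming the node's JSON-RPC endpoint is on localhost:8545 and jq is installed:

LATEST_TS_HEX=$(curl -s -X POST -H 'Content-Type: application/json' \
  --data '{"jsonrpc":"2.0","method":"eth_getBlockByNumber","params":["latest",false],"id":1}' \
  http://localhost:8545 | jq -r '.result.timestamp')
NOW=$(date +%s)
# bash arithmetic accepts the 0x-prefixed hex timestamp directly
echo "Latest synced block behind by: $(( (NOW - LATEST_TS_HEX) / 60 )) minutes"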


ism commented Sep 12, 2023

I can confirm this issue. We started our node 14 days ago and it still cannot finish a full sync. Any tips to troubleshoot and finish syncing?


wbnns (Member) commented Sep 18, 2023

@Lodimup @ism

Heya, what are you all using for the L1 RPC? Also, how much disk space do you all have available? Are you all syncing mainnet or testnet?


ism commented Sep 19, 2023

> @Lodimup @ism
>
> Heya, what are you all using for the L1 RPC? Also, how much disk space do you all have available? Are you all syncing mainnet or testnet?

Thank you for the reply! :)

I am using:
OP_NODE_L1_ETH_RPC=https://ethereum.publicnode.com
Disk size is 1.8 TB SSD.

Are those delays related to the L1 network?
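
One way to sanity-check whether a public L1 endpoint is the bottleneck is to time a simple request against it (public endpoints are often rate-limited). A rough sketch, assuming curl is installed:

# A healthy endpoint should answer eth_blockNumber well under a second.
time curl -s -X POST -H 'Content-Type: application/json' \
  --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
  https://ethereum.publicnode.com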


laptrinhbockchain commented Sep 19, 2023

I also have the same problem: the sync speed seems to be slower than the rate at which new blocks are produced. If this continues, synchronization will never complete.
Here is the log I measured:

Time to check: 2023-09-19 10:43:45
Latest synced block behind by: 1032 minutes
-------------------------------------
Time to check: 2023-09-19 10:58:45
Latest synced block behind by: 1020 minutes
-------------------------------------
Time to check: 2023-09-19 11:13:46
Latest synced block behind by: 1019 minutes
-------------------------------------
Time to check: 2023-09-19 11:28:46
Latest synced block behind by: 1023 minutes
-------------------------------------
Time to check: 2023-09-19 11:43:46
Latest synced block behind by: 1030 minutes
-------------------------------------
Time to check: 2023-09-19 11:58:51
Latest synced block behind by: 1038 minutes
-------------------------------------
Time to check: 2023-09-19 12:13:55
Latest synced block behind by: 1047 minutes
-------------------------------------
Time to check: 2023-09-19 12:28:56
Latest synced block behind by: 1057 minutes
-------------------------------------
Time to check: 2023-09-19 12:44:00
Latest synced block behind by: 1060 minutes
-------------------------------------
Time to check: 2023-09-19 12:59:00
Latest synced block behind by: 1064 minutes
-------------------------------------
Time to check: 2023-09-19 13:14:00
Latest synced block behind by: 1066 minutes
-------------------------------------
Time to check: 2023-09-19 13:29:02
Latest synced block behind by: 1069 minutes
-------------------------------------
Time to check: 2023-09-19 13:44:07
Latest synced block behind by: 1073 minutes
-------------------------------------
Time to check: 2023-09-19 13:59:10
Latest synced block behind by: 1074 minutes
-------------------------------------
Time to check: 2023-09-19 14:14:10
Latest synced block behind by: 1070 minutes

I also see the following error appear frequently in the log. I have switched to several different Ethereum RPCs, but it's still the same:

base-node-node-1  | t=2023-09-19T10:40:46+0000 lvl=warn msg="Failed to share forkchoice-updated signal" state="&{HeadBlockHash:0x2dd80513c3f4c954b469ac3935feea5c9f9bb74e0381ab2817e7c604989351a6 SafeBlockHash:0x2dd80513c3f4c954b469ac3935feea5c9f9bb74e0381ab2817e7c604989351a6 FinalizedBlockHash:0x504af924e026880c56992f4efde5e016a8493375645eb635b2dd5f47f11280e8}" attr="&{Timestamp:0x6508881f PrevRandao:0xca73e67ad270d886396e27f89d1fb4143a68c53b8606856b22cc90b2710e12ee SuggestedFeeRecipient:0x4200000000000000000000000000000000000011 Transactions:[0x7ef90159a0ddf3d36766b23572ac0ca45729cff3aa438b912fc879f70a639c47772ada633f94deaddeaddeaddeaddeaddeaddeaddeaddead00019442000000000000000000000000000000000000158080830f424080b90104015d8eb90000000000000000000000000000000000000000000000000000000001152a3d0000000000000000000000000000000000000000000000000000000065088797000000000000000000000000000000000000000000000000000000073bf3882e69f036d5d59619b9d9b0c00df8112f9c3eac226113d3d503c7718d6b109c1b8600000000000000000000000000000000000000000000000000000000000000000000000000000000000000005050f69a9786f081509234f1a7f4684b5e5b76c900000000000000000000000000000000000000000000000000000000000000bc00000000000000000000000000000000000000000000000000000000000a6fe0 0x7ef90227a003b15add2088cd77f276da3d15b6b3fabed5e5bbc7cfe7f22aa5c7cd247279e094977f82a600a1414e583f7f13623f1ac5d58b1c0b944200000000000000000000000000000000000007876a94d74f430000876a94d74f4300008304658880b901c4d764ad0b0001000000000000000000000000000000000000000000000000000000018d8c0000000000000000000000003154cf16ccdb4c6d922629664174b904d80f2c350000000000000000000000004200000000000000000000000000000000000010000000000000000000000000000000000000000000000000006a94d74f430000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000c000000000000000000000000000000000000000000000000000000000000000c41635f5fd000000000000000000000000f8ce08466c86e90600bb0f5263b610fb9e5e7152000000000000000000000000a541b05f878329b3d10bed978d63a5fe32cecb20000000000000000000000000000000000000000000000000006a94d74f4300000000000000000000000000000000000000000000000000000000000000000080000000000000000000000000000000000000000000000000000000000000000b77281eaecb20000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 
0xf9026f83103ffd8407839ac38302362794aab5a48cfc03efa9cc34a2c1aacccb84b4b770e480b902046c459a2800000000000000000000000038de71124f7a447a01d67945a51edce9ff4912510000000000000000000000000000000000000000000000000000000000000080000000000000000000000000000000000000000000000000000000006508dc7d00000000000000000000000000000000000000000000000000000000000001400000000000000000000000000000000000000000000000000000000000000084704316e5000000000000000000000000000000000000000000000000000000000000006f93ab8d764cac5e1372b261a90865a8b2e0b8cda9e3d95bd8387c27347a34318e000000000000000000000000000000000000000000000000000000000000001493ab8d764cac5e1372b261a90865a8b2e0b8cda9e3d95bd8387c27347a34318e0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000821110eb9e10e520159839642804323d30e2236542fabb87cb30a1d42b6a2b057873cf849e83210583c3aaf1505a41a93a32a9a2eed712a12ead91b2aaed208e5c1c0df5ba627cb650abcac992f263c62ab57af4432e78d5c190e31cf345d8f41bd31ee2df3b323eb15c6a569ddd40f186ece1f80d4d04c7a5dc187c8d825c2cebba1c00000000000000000000000000000000000000000000000000000000000082422ea08d045faec4ebdc63d0347afb63035cdad671bc106b2f209abd0af1fb5a2f6e35a052aac62a178b1d18372d6a74ceef4d841bc9cc1e28b91e9ba774b6d6b3d4676f 0x02f8bb82210582022f8405f5e100845a0562d8830156ef94cf205808ed36593aa40a44f10c7f7c2f67d4a4d487a51cc429efb000b8446945b123000000000000000000000000a63e466faaa29e7288f290d2e81e21587b9b98db0000000000000000000000000000000000000000000000000000000000000001c001a0ea6ec514394a532e90be47b59ccb487001230fb5f130a977d500c48d253c49c1a041529d4455534f36469f193a5eb6d909e1a7b831b862ae017fc3fd1b52180653 0x02f8b2822105648405f5e100845a0562d88301313594cf205808ed36593aa40a44f10c7f7c2f67d4a4d480b844b51d053400000000000000000000000005f92318b0d0b588a237ec49a7179ab9c58864000000000000000000000000000000000000000000000000000000000000000001c001a01da56e42b5793fe8e384890495a92237f832b1910d5be2fbf62bcdd3880cd9b0a0622174bf32e35ba0e2157f5e18e31971cfa29dff40ae85ae46fbabb41786ec7c 0x02f8bb8221058205dd8405f5e100845a0562d8830156ef94cf205808ed36593aa40a44f10c7f7c2f67d4a4d487186cc6acd4b000b8446945b123000000000000000000000000ea7b37c517998e28acca2184b386a75f79991d920000000000000000000000000000000000000000000000000000000000000001c001a0739eb7a6c35c2178c6445184cdfc21111ba40652878c6cd9618cb35b56784c41a071166624a253d9bdb843766d7e9199cf986d64d220cba46794c856ea9ab262ee 0x02f8d6822105830102168405f5e0ff8502540be400830249f094ff231524719db94a0ef08a88fc4e6939134eade880b864eca0ca01000000000000000000000000b1964a4a2672b6bbc3c6358ec70a14c1468139b60000000000000000000000000000000000000000000000000000000000000028000000000000000000000000000000000000000000000000d02ab486cedc0000c001a06785418b6eb17c13a146fe3f72a87649a436c0160e6ba4a6dbd074f8dd062d59a05eb9938491ee69d1e54595723df87b26c0b38b9c741206f24c36ffcbc8dd559d 0x02f8908221058281ef8405f5e0ff8502540be400830557309488e6aeb90795f586542b4062cb9f853a5582966c80a0000000000000000000001e28b1964a4a2672b6bbc3c6358ec70a14c1468139b6c001a0d1086d202b6c6efd78824c17ae91cabce5c8b0c115788cb8a0fec174bcf5c53fa071a4e4d4190e247eb1d5777d3f5659fea1d02d8cd6f461421eeabb9a5f1a5fb7 
0x02f8f4822105822639839c3cdc839c3cdc830830d7940000000000cc2f88402b31ca4045f37bf79f3e5980b8860000000003000006d47267f8757b251a2d088964a3ec466762d6e0eb42000000000000000000000000000000000000060003304da90d2bd672c2fd4a7aad55176ab27417f71eeb466342c4d449bc9f53a865d5cb90586f40521500154c36388be6f416a29c8d8eee81c771ce6be14b18d9aaec86b65d86f6a7b5b1b0c42ffa531710b6ca0015c001a0cde42beede839201228bf1912e067e7b4eea2e31496f50b9363f6cc9d04b6c66a01db57802c6cb613f242afbb1ffafc4f8789880da307b7a42f85c627c7e32d601 0x02f8b282210581b1830186a0834c4b408401318230940000000000771a79d0fc7f3b7fe270eb4498f20b80b8448973e2cb000000000000000000000000000000000000000000000000000000000000006400000000000000000000000000000000000000000000000000000000000000b0c001a00e7acef7ba8076eadeacc8435a87cac7f06d1e264e0fc5b0861d71d834a4e36ca07d24230a4833f08f5bd4616b13b4e70d4b53f2c95cfc51374582217171d9a33a] NoTxPool:true GasLimit:0x1c9c380}" err="Post \"http://geth:8551\": context deadline exceeded"
base-node-node-1  | t=2023-09-19T10:40:46+0000 lvl=warn msg="Derivation process temporary error"     attempts=1 err="engine stage failed: temp: temporarily cannot insert new safe block: failed to create new block via forkchoice: Post \"http://geth:8551\": context deadline exceeded"
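
The "context deadline exceeded" on Post "http://geth:8551" means op-node's request to geth's Engine API timed out. A quick way to tell a network problem apart from an overloaded geth, a sketch assuming the compose service names from the log above and that curl is available in the node container:

# A fast 401 (auth required) means geth is reachable and the timeout
# comes from geth being too slow to respond (often disk-bound);
# hitting the 5s limit means geth is not answering at all.
docker compose exec node curl -s -o /dev/null -w '%{http_code}\n' \
  --max-time 5 http://geth:8551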

Server information is as follows:

  • Type: AWS EC2 t2.large
  • 2 vCPUs, 8 GB RAM, 1.6 TB SSD

Hard drive information is as follows:

Filesystem     Type   Size  Used Avail Use% Mounted on
/dev/root      ext4   1.6T  572G  980G  37% /
tmpfs          tmpfs  3.9G     0  3.9G   0% /dev/shm
tmpfs          tmpfs  1.6G  1.1M  1.6G   1% /run
tmpfs          tmpfs  5.0M     0  5.0M   0% /run/lock
/dev/xvda15    vfat   105M  6.1M   99M   6% /boot/efi
tmpfs          tmpfs  795M  4.0K  795M   1% /run/user/1000

I tried the L1 RPCs https://rpc.ankr.com/eth, https://ethereum.publicnode.com/, and https://ethereum.blockpi.network/v1/rpc/public; all have the same problem.

@khanh-ld

Any update on this issue? Same issue here.


MrFrogoz commented Sep 23, 2023

Hi guys, you can probably fix the problem with my last comment in issue #104 (comment), plus: upgrade to 4 cores, create a non-root EBS gp3 volume (1 TB, 3,000 IOPS, 125 MB/s), and mount the geth data directory on it, for example at /data/geth. See the sketch below.
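
A sketch of formatting and mounting such a volume, assuming the device shows up as /dev/nvme1n1 (check with lsblk):

sudo mkfs.ext4 /dev/nvme1n1
sudo mkdir -p /data/geth
sudo mount /dev/nvme1n1 /data/geth
# persist the mount across reboots
echo '/dev/nvme1n1 /data/geth ext4 defaults,nofail 0 2' | sudo tee -a /etc/fstab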


laptrinhbockchain commented Sep 24, 2023

I found the cause: it is related to storage, because the "Timing buffered disk reads" speed is too slow.
Below is the storage information for the two servers:

# Server 01 (Sync is OK)
# Using storage GP3
sudo hdparm -Tt /dev/xvda
------------------------------------
/dev/xvda:
 Timing cached reads:   19108 MB in  1.99 seconds = 9612.10 MB/sec
 Timing buffered disk reads: 466 MB in  3.00 seconds = 155.21 MB/sec
------------------------------------

# Server 02 (Sync is not OK)
# Using storage GP3
sudo hdparm -Tt /dev/xvda
------------------------------------
/dev/xvda:
 Timing cached reads:   18140 MB in  1.99 seconds = 9120.76 MB/sec
 Timing buffered disk reads:  42 MB in  3.56 seconds =  11.79 MB/sec

When I changed the storage from GP3 to GP2, synchronization was OK, even though my server is an AWS EC2 t2.large (2 vCPUs, 8 GB RAM, 1.3 TB GP2 storage). Initially the delay was 9 hours, then it increased to 15 hours, then it began to gradually decrease, and after more than 30 hours the synchronization completed.

If you use AWS EC2, you should pay attention to the following:

  • Select the Region appropriately.
  • Select GP2 or GP3 storage depending on the Region.
  • Test the storage speed before running the node (see the sketch below).
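
A sketch of such a test with fio, assuming it is installed and the node's data disk is mounted at /data; chain sync is dominated by small random reads, which is what this measures:

fio --name=randread --filename=/data/fio-test --size=1G --direct=1 \
    --rw=randread --bs=4k --ioengine=libaio --iodepth=64 \
    --runtime=30 --time_based --group_reporting
rm /data/fio-test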


ism commented Sep 26, 2023

@laptrinhbockchain thanks!
I started to dig in, and it turned out I have a 2 TB HDD, not an SSD. I was provided with the wrong box; I'm pretty sure that's the issue.


Lodimup (Author) commented Sep 27, 2023

I used the snapshot method found in their Discord to solve the issue.

It's pretty frustrating that the devs don't spend time updating the docs.
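
For anyone landing here, a sketch of that approach; the snapshot URL is a placeholder (the real one is published in the Base Discord), and /data/geth is an assumed data directory:

docker compose down
wget -O snapshot.tar.gz '<snapshot-url>'
tar -xzf snapshot.tar.gz -C /data/geth
docker compose up -d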


wbnns (Member) commented Oct 3, 2023

Thanks all for the feedback on this. I've opened base-org/web#41 to add a callout in the Guide for Node Operators on Base.org to hopefully assist others in the future who may encounter this.

If you all have any other thoughts or suggestions here, please don't hesitate to let us know!

wbnns closed this as completed Oct 3, 2023
zencephalon pushed a commit to base-org/web that referenced this issue Oct 4, 2023
…guide (#41)

* run-a-base-node: Add section regarding snapshots

* run-a-base-node: Add hardware requirements

* run-a-base-node: Add callout for EBS users

Context: base-org/node#105