
Explorer nodes can't catch up with latest blocks #3740

Closed
LeoHChen opened this issue Jun 7, 2021 · 11 comments
Labels: high priority (high priority issue with customer impact)

LeoHChen (Contributor) commented Jun 7, 2021

Describe the bug
Recently, with a higher volume of transactions happening on-chain, the explorer nodes, whether archival or non-archival, can't catch up with the latest blocks.

To Reproduce
We've seen this problem with build v6999-v4.0.0-66-g343dbe89 on mainnet.

The explorer nodes lag a few minutes behind the latest blocks, yet there is no sign of CPU/IO overload on the nodes. After I restarted the harmony process, the node caught up very quickly. So I don't think it is a resource issue; some logic is likely slow to catch up.

Expected behavior
Explorer nodes should be able to catch up with the latest block when there is no resource overload.

LeoHChen added the high priority label on Jun 7, 2021
LeoHChen pinned this issue on Jun 7, 2021
rlan35 (Contributor) commented Jun 7, 2021

On one explorer node I checked (35.81.82.117), the CPU is overloaded.
[screenshot attached]

rlan35 (Contributor) commented Jun 7, 2021

Another node (34.222.182.47) definitely isn't catching up with the latest block and keeps triggering the block sync logic to catch up (which has a 60s delay). It's very likely caused by too much CPU/memory load, where the blocks broadcast from consensus aren't processed in time (due to a missing parent block). So the lastMile block logic likely needs some optimization. @JackyWYX, you can check the log on 34.222.182.47; the issue is clear.

LeoHChen (Contributor, Author) commented Jun 8, 2021

@rlan35, 382% CPU is not overloaded; this node has 8 cores. You may use htop to take a look; the CPU load is under 50%.

LeoHChen (Contributor, Author) commented Jun 8, 2021

But I agree with you: I suspect the last-mile block sync should be a bit more aggressive in order to catch up, since we do have enough CPU power. Some parameters may need to be tuned in the last-mile catch-up. @JackyWYX

JackyWYX (Contributor) commented Jun 8, 2021

My current guess is as follows: the explorer node is heavy in p2p message handling, so the committed blocks handled at AddNewBlockForExplorer arrive out of order, which results in unknown-ancestor problems. This issue only affects explorer nodes; validator nodes already have a caching mechanism in the consensus module that handles this situation.

I will first try to confirm my guess, then apply the fix.
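
For illustration, a minimal Go sketch of the failure mode this guess describes; the Block type and function names are hypothetical stand-ins, not Harmony's actual code:

```go
package explorer

import "fmt"

// Block is an illustrative stand-in for the node's block type.
type Block struct {
	Number uint64
}

// insertForExplorer shows the suspected failure shape: a committed block
// that reaches the explorer before its parent has been inserted cannot be
// attached to the chain, so a strictly sequential insert must reject it.
func insertForExplorer(headNumber uint64, b *Block) error {
	if b.Number != headNumber+1 {
		// On a validator the consensus module caches such a block; the
		// explorer path has no cache, hence the unknown-ancestor error.
		return fmt.Errorf("unknown ancestor: block %d arrived while head is %d",
			b.Number, headNumber)
	}
	// ...normal chain insertion would happen here...
	return nil
}
```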

LeoHChen (Contributor, Author) commented Jun 8, 2021

Heavy p2p message load is reasonable, since those explorer nodes are also RPC endpoints: they receive a lot of RPC requests from users and send p2p messages to the network.

JackyWYX (Contributor) commented Jun 9, 2021

The issue was observed on one custom explorer node. It appears there is no caching mechanism for last-mile blocks on explorer nodes, so under heavy p2p message load the processing order of last-mile blocks is not guaranteed. I am currently working on a fix to add a cache to the explorer and sort the last-mile blocks before block insertion.
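
Reusing the Block stand-in from the sketch above, here is a minimal sketch of such a cache, assuming blocks are buffered by number and drained in strictly consecutive order (which subsumes sorting); all names are hypothetical, not the actual fix:

```go
package explorer

import "sync"

// lastMileCache buffers committed blocks that arrive out of order so they
// can be inserted strictly sequentially, avoiding unknown-ancestor errors.
type lastMileCache struct {
	mu      sync.Mutex
	pending map[uint64]*Block
}

func newLastMileCache() *lastMileCache {
	return &lastMileCache{pending: make(map[uint64]*Block)}
}

// Add stores a block until all of its predecessors have been inserted.
func (c *lastMileCache) Add(b *Block) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.pending[b.Number] = b
}

// Drain removes and returns consecutive blocks starting at next, in order.
// A gap stops the drain, so insertion never skips a missing parent.
func (c *lastMileCache) Drain(next uint64) []*Block {
	c.mu.Lock()
	defer c.mu.Unlock()
	var out []*Block
	for {
		b, ok := c.pending[next]
		if !ok {
			break
		}
		out = append(out, b)
		delete(c.pending, next)
		next++
	}
	return out
}
```

On each incoming block, the handler would call Add, then Drain(head+1) and insert whatever comes back, so a late-arriving parent releases all of its buffered children at once.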

JackyWYX (Contributor) commented
After some more in-depth debugging, I found that the issue actually happens when writing explorer DB data.

On an explorer node, the Dump of explorer data takes 7~15s per block (api/service/explorer/storage.go:85). The reason is that the entire transaction history of each address is encoded and dumped into the DB on every block, which is far from efficient. I will focus on the following to try to fix it:

  1. Optimize the multi-threading logic with the current DB data structure to see how much that helps performance.
  2. Do not write an entry on every block; buffer several blocks and write once.
  3. Migrate the explorer DB data structure so we can use an iterator with the prefix addr_{OneAccr}_{txHash} (see the sketch after this list).
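
As a rough sketch of items 2 and 3, assuming a goleveldb-backed store (github.com/syndtr/goleveldb); the key layout follows the addr_{OneAccr}_{txHash} template above, while the function names and batching scheme are illustrative assumptions, not the actual implementation:

```go
package explorer

import (
	"fmt"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/util"
)

// txKey builds a per-transaction key, addr_{OneAccr}_{txHash}. One small
// record per tx means a new transaction appends one entry instead of
// re-encoding the address's whole history on every block.
func txKey(oneAddr, txHash string) []byte {
	return []byte(fmt.Sprintf("addr_%s_%s", oneAddr, txHash))
}

// flushBlocks writes the buffered entries of several blocks in a single
// batch (item 2), cutting per-block write amplification.
func flushBlocks(db *leveldb.DB, entries map[string][]byte) error {
	batch := new(leveldb.Batch)
	for k, v := range entries {
		batch.Put([]byte(k), v)
	}
	return db.Write(batch, nil)
}

// txHistory reads an address's transactions with a prefix scan (item 3)
// instead of decoding one monolithic history blob.
func txHistory(db *leveldb.DB, oneAddr string) ([][]byte, error) {
	iter := db.NewIterator(util.BytesPrefix(txKey(oneAddr, "")), nil)
	defer iter.Release()
	var keys [][]byte
	for iter.Next() {
		// Copy the key: the iterator reuses its buffers between calls.
		keys = append(keys, append([]byte{}, iter.Key()...))
	}
	return keys, iter.Error()
}
```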

LeoHChen (Contributor, Author) commented
Inspection of the system load on non-archival explorer nodes shows the disk IOPS can't keep up. So we will migrate the non-archival explorer nodes to m5d.2xlarge. Progress will be tracked in the following issue:
https://github.com/harmony-one/harmony-ops-priv/issues/37

LeoHChen (Contributor, Author) commented
We have upgraded the non-archival explorer nodes to m5d.2xlarge; iostat tps can now go beyond 300.

iostat
Linux 4.14.232-176.381.amzn2.x86_64 (ip-172-31-61-224.us-west-2.compute.internal)       06/16/2021      _x86_64_        (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          23.89    0.00    2.10    0.39    0.00   73.61

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
nvme1n1         473.66        92.76     57667.85   14056667 8738529518
nvme0n1           0.73         2.33        16.87     352652    2556108

LeoHChen (Contributor, Author) commented
iostat on each RPC endpoint can go up to 750 tps, so m5d.2xlarge is the right choice.
