Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lodestar sends same UNAVAILABLE block to EL at-infinitum #4251

Closed
dapplion opened this issue Jul 6, 2022 · 5 comments
Closed

Lodestar sends same UNAVAILABLE block to EL at-infinitum #4251

dapplion opened this issue Jul 6, 2022 · 5 comments
Labels
prio-low This is nice to have. scope-ux Issues for CLI UX or general consumer UX.

Comments

@dapplion
Copy link
Contributor

dapplion commented Jul 6, 2022

Describe the bug

In gc-merge-devnet-1 incident, Lodestar spent hours hammering the same block over and over to Nethermind. @g11tech is this expected behaviour?

Also metrics like lodestar_execution_engine_http_client_request_errors_total are not collected, review

Longer chunk of logs: lodestar_logs_47316_block.txt

Jul-03 06:31:46.129[CHAIN]           error: Block error slot=47311, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=UNAVAILABLE, errorMessage=Timeout request
Jul-03 06:31:58.417[CHAIN]           error: Block error slot=47311, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=UNAVAILABLE, errorMessage=Timeout request
Jul-03 06:32:04.396[CHAIN]           error: Block error slot=47311, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=ELERROR, errorMessage=JSON RPC error: Authentication e
rror, engine_newPayloadV1
Jul-03 06:32:25.161[CHAIN]           error: Error pushing notifyForkchoiceUpdate() headBlockHash=0x9d150824683912b2dcffee609d70892ca0e254eb5d0a95fbb3af760b81a6615c, finalizedBlockHash=0xc8d43bc8019581a21
77a99cbbee1e6764d4d5551a9f0b8051dca756d9916e334 Timeout request
Jul-03 06:32:26.511[CHAIN]           error: Block error slot=47313, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=UNAVAILABLE, errorMessage=Timeout request
Jul-03 06:32:38.737[CHAIN]           error: Block error slot=47313, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=UNAVAILABLE, errorMessage=Timeout request
Jul-03 06:32:51.170[CHAIN]           error: Block error slot=47313, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=UNAVAILABLE, errorMessage=Timeout request
Jul-03 06:33:04.369[CHAIN]           error: Error pushing notifyForkchoiceUpdate() headBlockHash=0x1d65020475ccd4e6ce36034300c12d6e499f82019bbf1fb462a495d4df58f0b1, finalizedBlockHash=0xc8d43bc8019581a21
77a99cbbee1e6764d4d5551a9f0b8051dca756d9916e334 Timeout request
Jul-03 06:33:05.428[CHAIN]           error: Block error slot=47314, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=UNAVAILABLE, errorMessage=Timeout request
Jul-03 06:33:19.460[CHAIN]           error: Block error slot=47315, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=ELERROR, errorMessage=JSON RPC error: Authentication e
rror, engine_newPayloadV1
Jul-03 06:33:41.260[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=UNAVAILABLE, errorMessage=Timeout request
Jul-03 06:33:53.806[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=UNAVAILABLE, errorMessage=Timeout request
Jul-03 06:34:06.049[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=UNAVAILABLE, errorMessage=Timeout request
Jul-03 06:34:18.298[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=UNAVAILABLE, errorMessage=Timeout request
Jul-03 06:34:30.721[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=UNAVAILABLE, errorMessage=Timeout request
Jul-03 06:34:42.963[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=UNAVAILABLE, errorMessage=Timeout request
Jul-03 06:34:47.195[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=INVALID, errorMessage=Block 47941 (0x8fcc12fc15a72e45f
e3a2b909bf867542a27fafd9394be540f95be3c7e136c3e) is known to be a part of an invalid chain.
Jul-03 06:34:53.023[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=ELERROR, errorMessage=JSON RPC error: Authentication e
rror, engine_newPayloadV1
Jul-03 06:34:58.874[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=INVALID, errorMessage=Block 47941 (0x8fcc12fc15a72e45f
e3a2b909bf867542a27fafd9394be540f95be3c7e136c3e) is known to be a part of an invalid chain.
Jul-03 06:35:01.544[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=INVALID, errorMessage=Block 47941 (0x8fcc12fc15a72e45f
e3a2b909bf867542a27fafd9394be540f95be3c7e136c3e) is known to be a part of an invalid chain.
Jul-03 06:35:08.373[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=INVALID, errorMessage=Block 47941 (0x8fcc12fc15a72e45f
e3a2b909bf867542a27fafd9394be540f95be3c7e136c3e) is known to be a part of an invalid chain.
Jul-03 06:35:13.809[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=INVALID, errorMessage=Block 47941 (0x8fcc12fc15a72e45f
e3a2b909bf867542a27fafd9394be540f95be3c7e136c3e) is known to be a part of an invalid chain.
Jul-03 06:35:25.997[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=ELERROR, errorMessage=JSON RPC error: Authentication e
rror, engine_newPayloadV1
Jul-03 06:35:35.114[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=ELERROR, errorMessage=JSON RPC error: Authentication e
rror, engine_newPayloadV1
Jul-03 06:35:37.610[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=INVALID, errorMessage=Block 47941 (0x8fcc12fc15a72e45f
e3a2b909bf867542a27fafd9394be540f95be3c7e136c3e) is known to be a part of an invalid chain.
Jul-03 06:35:49.983[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=UNAVAILABLE, errorMessage=Timeout request
Jul-03 06:35:51.604[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=INVALID, errorMessage=Block 47941 (0x8fcc12fc15a72e45f
e3a2b909bf867542a27fafd9394be540f95be3c7e136c3e) is known to be a part of an invalid chain.
Jul-03 06:35:59.460[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=ELERROR, errorMessage=JSON RPC error: Authentication e
rror, engine_newPayloadV1
Jul-03 06:36:01.934[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=INVALID, errorMessage=Block 47941 (0x8fcc12fc15a72e45f
e3a2b909bf867542a27fafd9394be540f95be3c7e136c3e) is known to be a part of an invalid chain.
Jul-03 06:36:04.614[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=INVALID, errorMessage=Block 47941 (0x8fcc12fc15a72e45f
e3a2b909bf867542a27fafd9394be540f95be3c7e136c3e) is known to be a part of an invalid chain.
Jul-03 06:36:13.623[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=ELERROR, errorMessage=JSON RPC error: Authentication e
rror, engine_newPayloadV1
Jul-03 06:36:16.322[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=INVALID, errorMessage=Block 47941 (0x8fcc12fc15a72e45f
e3a2b909bf867542a27fafd9394be540f95be3c7e136c3e) is known to be a part of an invalid chain.
Jul-03 06:36:19.097[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=INVALID, errorMessage=Block 47941 (0x8fcc12fc15a72e45f
e3a2b909bf867542a27fafd9394be540f95be3c7e136c3e) is known to be a part of an invalid chain.
Jul-03 06:36:24.209[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=INVALID, errorMessage=Block 47941 (0x8fcc12fc15a72e45f
e3a2b909bf867542a27fafd9394be540f95be3c7e136c3e) is known to be a part of an invalid chain.
Jul-03 06:36:27.143[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=INVALID, errorMessage=Block 47941 (0x8fcc12fc15a72e45f
e3a2b909bf867542a27fafd9394be540f95be3c7e136c3e) is known to be a part of an invalid chain.
Jul-03 06:36:30.091[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=INVALID, errorMessage=Block 47941 (0x8fcc12fc15a72e45f
e3a2b909bf867542a27fafd9394be540f95be3c7e136c3e) is known to be a part of an invalid chain.
Jul-03 06:36:32.123[CHAIN]           error: Block error slot=47316, errCode=BLOCK_ERROR_EXECUTION_ERROR code=BLOCK_ERROR_EXECUTION_ERROR, execStatus=INVALID, errorMessage=Block 47941 (0x8fcc12fc15a72e45f
e3a2b909bf867542a27fafd9394be540f95be3c7e136c3e) is known to be a part of an invalid chain.

Expected behavior

Steps to Reproduce

Screenshots

Desktop (please complete the following information):

  • OS:
  • Version:
  • Branch:
  • Commit hash:
@g11tech
Copy link
Contributor

g11tech commented Jul 7, 2022

@dapplion yes this is the current expected behavior, if EL is unavailable or not responding or throwing error on a block, process block/process segment fails and it gets retried., and will continue to be retried especially if it's a part of canonical chain.

I think we introduced some retry logic which has some wait between retries and hence should pile less, but that PR by @dadepo has never got merged.
But it will only reduce pileup.

@g11tech
Copy link
Contributor

g11tech commented Jul 7, 2022

Metrics i will review and add 👍
I think that PR (retry PR) also had metrics regarding the requests.

@dapplion
Copy link
Contributor Author

dapplion commented Jul 7, 2022

but that PR by @dadepo has never got merged.

Which PR?

@g11tech
Copy link
Contributor

g11tech commented Jul 7, 2022

but that PR by @dadepo has never got merged.

Which PR?

this one: #3854, i think @dadepo was waiting for your final ✔️ . will rebase and push

@dapplion dapplion added prio-low This is nice to have. scope-ux Issues for CLI UX or general consumer UX. labels Jul 13, 2022
@dapplion
Copy link
Contributor Author

Closing issues for specific incidents on old versions of Lodestar, please re-open if it happens with latest stable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
prio-low This is nice to have. scope-ux Issues for CLI UX or general consumer UX.
Projects
None yet
Development

No branches or pull requests

2 participants