You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
I run several instances of ssv. standalone or inside ethdocker. All as docker containers. Two times in the last days i have seen that different nodes dit not reboot. ( ssv crashes after loosing connection to CL. no big deal if it would restart, but it does not ).
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d36ac75e8957 bloxstaking/ssv-node:latest "make BUILD_PATH=/go…" 4 days ago Exited (2) 28 hours ago ssv_node
docker logs ssv_node
2024-02-19T11:16:11.109446Z ERROR node is not healthy {"node": "consensus client", "error": "failed to request syncing: failed to call GET endpoint: Get \"http://rocketpool2.nl.arpa:5052/eth/v1/node/syncing\": dial tcp 192.168.131.180:5052: connect: connection refused", "errorVerbose": "Get \"http://rocketpool2.nl.arpa:5052/eth/v1/node/syncing\": dial tcp 192.168.131.180:5052: connect: connection refused\nfailed to call GET endpoint\ngithub.com/attestantio/go-eth2-client/http.(*Service).get\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/http.go:66\ngithub.com/attestantio/go-eth2-client/http.(*Service).NodeSyncing\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/nodesyncing.go:30\ngithub.com/bloxapp/ssv/beacon/goclient.(*goClient).Healthy\n\t/go/src/github.com/bloxapp/ssv/beacon/goclient/goclient.go:203\ngithub.com/bloxapp/ssv/nodeprobe.(*Prober).probe.func1\n\t/go/src/github.com/bloxapp/ssv/nodeprobe/nodeprobe.go:96\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598\nfailed to request syncing\ngithub.com/attestantio/go-eth2-client/http.(*Service).NodeSyncing\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/nodesyncing.go:32\ngithub.com/bloxapp/ssv/beacon/goclient.(*goClient).Healthy\n\t/go/src/github.com/bloxapp/ssv/beacon/goclient/goclient.go:203\ngithub.com/bloxapp/ssv/nodeprobe.(*Prober).probe.func1\n\t/go/src/github.com/bloxapp/ssv/nodeprobe/nodeprobe.go:96\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"}
2024-02-19T11:16:11.109643Z ERROR not all nodes are healthy
2024-02-19T11:16:11.109662Z FATAL ethereum node(s) are either out of sync or down. Ensure the nodes are healthy to resume.
make: *** [Makefile:102: start-node] Error 1
make: *** wait: No child process. Stop.
journalctl -u docker
Feb 19 12:16:11 ssv2 dockerd[15728]: time="2024-02-19T12:16:11.134730871+01:00" level=info msg="ignoring event" container=d36ac75e89576178dc74d91cf2397afb0aeff2aae83f59f3bb115ea0c702abbe module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Feb 19 12:16:11 ssv2 dockerd[15728]: time="2024-02-19T12:16:11.141990788+01:00" level=warning msg="ShouldRestart failed, container will not be restarted" container=d36ac75e89576178dc74d91cf2397afb0aeff2aae83f59f3bb115ea0c702abbe daemonShuttingDown=false error="restart canceled" execDuration=69h56m0.401903873s exitStatus="{2 2024-02-19 11:16:11.>
Feb 19 12:16:11 ssv2 dockerd[15728]: time="2024-02-19T12:16:11.142553134+01:00" level=warning msg="failed to close stdin: task d36ac75e89576178dc74d91cf2397afb0aeff2aae83f59f3bb115ea0c702abbe not found: not found"
log ethdocker
v1.2.3-e5a6d711958f5043615bf8f6a95005a6083e714f
prysm-cl-only.yml
besu.yml
docker ps -a
bb2eb0dda72a bloxstaking/ssv-node:latest "make BUILD_PATH=/go…" 5 days ago Exited (2) 17 hours ago eth-docker-ssv-node-1
ethd logs ssv-node
ssv-node-1 | 2024-02-23T02:09:30.684605Z error node is not healthy {"node": "consensus client", "error": "failed to request syncing: failed to call GET endpoint: Get \"http://consensus:5052/eth/v1/node/syncing\": context deadline exceeded", "errorVerbose": "Get \"http://consensus:5052/eth/v1/node/syncing\": context deadline exceeded\nfailed to call GET endpoint\ngithub.com/attestantio/go-eth2-client/http.(*Service).get\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/http.go:66\ngithub.com/attestantio/go-eth2-client/http.(*Service).NodeSyncing\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/nodesyncing.go:30\ngithub.com/bloxapp/ssv/beacon/goclient.(*goClient).Healthy\n\t/go/src/github.com/bloxapp/ssv/beacon/goclient/goclient.go:203\ngithub.com/bloxapp/ssv/nodeprobe.(*Prober).probe.func1\n\t/go/src/github.com/bloxapp/ssv/nodeprobe/nodeprobe.go:96\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598\nfailed to request syncing\ngithub.com/attestantio/go-eth2-client/http.(*Service).NodeSyncing\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/nodesyncing.go:32\ngithub.com/bloxapp/ssv/beacon/goclient.(*goClient).Healthy\n\t/go/src/github.com/bloxapp/ssv/beacon/goclient/goclient.go:203\ngithub.com/bloxapp/ssv/nodeprobe.(*Prober).probe.func1\n\t/go/src/github.com/bloxapp/ssv/nodeprobe/nodeprobe.go:96\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"}
ssv-node-1 | 2024-02-23T02:09:30.687323Z error not all nodes are healthy
ssv-node-1 | 2024-02-23T02:09:30.687356Z fatal ethereum node(s) are either out of sync or down. Ensure the nodes are healthy to resume.
ssv-node-1 | make: *** [Makefile:102: start-node] Error 1
ssv-node-1 | make: go: No such file or directory
ssv-node-1 | make: go: No such file or directory
ssv-node-1 | make: go: No such file or directory
ssv-node-1 | Build /go/bin/ssvnode
ssv-node-1 | Build /config/config.yaml
ssv-node-1 | Build
ssv-node-1 | Command --config=/config/config.yaml
ssv-node-1 | Running node on address: *.*.*.*)
ssv-node-1 | 2024-02-23T02:09:52.464863Z info starting SSV-Node:v1.2.3-e5a6d711958f5043615bf8f6a95005a6083e714f
ssv-node-1 | 2024-02-23T02:09:52.465278Z info setting ssv network {"network": "mainnet", "domain": "00000000", "nodeType": "light", "builderProposals(MEV)": "enabled", "beaconNetwork": "mainnet", "genesisEpoch": 218450, "registryContract": "0xDD9BC35aE942eF0cFa76930954a156B3fF30a4E1"}
ssv-node-1 | 2024-02-23T02:09:52.508341Z info applying migrations {"count": 4}
ssv-node-1 | 2024-02-23T02:09:52.508951Z info applied migrations successfully {"count": 0}
ssv-node-1 | 2024-02-23T02:09:52.848530Z info successfully setup operator keys {"pubkey": "***"}
ssv-node-1 | 2024-02-23T02:10:53.966163Z info consensus client: connecting {"address": "http://consensus:5052", "network": "mainnet"}
ssv-node-1 | 2024-02-23T02:11:12.449409Z fatal failed to create beacon go-client {"error": "failed to create http client: failed to confirm node connection: failed to fetch genesis: failed to request genesis: failed to call GET endpoint: Get \"http://consensus:5052/eth/v1/beacon/genesis\": context deadline exceeded", "errorVerbose": "Get \"http://consensus:5052/eth/v1/beacon/genesis\": context deadline exceeded\nfailed to call GET endpoint\ngithub.com/attestantio/go-eth2-client/http.(*Service).get\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/http.go:66\ngithub.com/attestantio/go-eth2-client/http.(*Service).Genesis\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/genesis.go:45\ngithub.com/attestantio/go-eth2-client/http.(*Service).fetchStaticValues\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/service.go:143\ngithub.com/attestantio/go-eth2-client/http.New\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/service.go:118\ngithub.com/bloxapp/ssv/beacon/goclient.New\n\t/go/src/github.com/bloxapp/ssv/beacon/goclient/goclient.go:154\ngithub.com/bloxapp/ssv/cli/operator.setupConsensusClient\n\t/go/src/github.com/bloxapp/ssv/cli/operator/node.go:545\ngithub.com/bloxapp/ssv/cli/operator.glob..func2\n\t/go/src/github.com/bloxapp/ssv/cli/operator/node.go:151\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:944\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992\ngithub.com/bloxapp/ssv/cli.Execute\n\t/go/src/github.com/bloxapp/ssv/cli/cli.go:27\nmain.main\n\t/go/src/github.com/bloxapp/ssv/cmd/ssvnode/main.go:20\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598\nfailed to request genesis\ngithub.com/attestantio/go-eth2-client/http.(*Service).Genesis\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/genesis.go:47\ngithub.com/attestantio/go-eth2-client/http.(*Service).fetchStaticValues\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/service.go:143\ngithub.com/attestantio/go-eth2-client/http.New\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/service.go:118\ngithub.com/bloxapp/ssv/beacon/goclient.New\n\t/go/src/github.com/bloxapp/ssv/beacon/goclient/goclient.go:154\ngithub.com/bloxapp/ssv/cli/operator.setupConsensusClient\n\t/go/src/github.com/bloxapp/ssv/cli/operator/node.go:545\ngithub.com/bloxapp/ssv/cli/operator.glob..func2\n\t/go/src/github.com/bloxapp/ssv/cli/operator/node.go:151\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:944\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992\ngithub.com/bloxapp/ssv/cli.Execute\n\t/go/src/github.com/bloxapp/ssv/cli/cli.go:27\nmain.main\n\t/go/src/github.com/bloxapp/ssv/cmd/ssvnode/main.go:20\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598\nfailed to fetch genesis\ngithub.com/attestantio/go-eth2-client/http.(*Service).fetchStaticValues\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/service.go:144\ngithub.com/attestantio/go-eth2-client/http.New\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/service.go:118\ngithub.com/bloxapp/ssv/beacon/goclient.New\n\t/go/src/github.com/bloxapp/ssv/beacon/goclient/goclient.go:154\ngithub.com/bloxapp/ssv/cli/operator.setupConsensusClient\n\t/go/src/github.com/bloxapp/ssv/cli/operator/node.go:545\ngithub.com/bloxapp/ssv/cli/operator.glob..func2\n\t/go/src/github.com/bloxapp/ssv/cli/operator/node.go:151\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:944\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992\ngithub.com/bloxapp/ssv/cli.Execute\n\t/go/src/github.com/bloxapp/ssv/cli/cli.go:27\nmain.main\n\t/go/src/github.com/bloxapp/ssv/cmd/ssvnode/main.go:20\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598\nfailed to confirm node connection\ngithub.com/attestantio/go-eth2-client/http.New\n\t/go/pkg/mod/github.com/attestantio/go-eth2-client@v0.16.3/http/service.go:119\ngithub.com/bloxapp/ssv/beacon/goclient.New\n\t/go/src/github.com/bloxapp/ssv/beacon/goclient/goclient.go:154\ngithub.com/bloxapp/ssv/cli/operator.setupConsensusClient\n\t/go/src/github.com/bloxapp/ssv/cli/operator/node.go:545\ngithub.com/bloxapp/ssv/cli/operator.glob..func2\n\t/go/src/github.com/bloxapp/ssv/cli/operator/node.go:151\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:944\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992\ngithub.com/bloxapp/ssv/cli.Execute\n\t/go/src/github.com/bloxapp/ssv/cli/cli.go:27\nmain.main\n\t/go/src/github.com/bloxapp/ssv/cmd/ssvnode/main.go:20\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598\nfailed to create http client", "address": "http://consensus:5052"}
ssv-node-1 | make: *** [Makefile:102: start-node] Error 1
journalctl -u docker
Feb 23 03:10:26 ethdocker2 dockerd[798]: time="2024-02-23T03:10:26.552099138+01:00" level=warning msg="Health check for container 35628ea1e4c062ae421aabc6ee308f425fda6e4bbab49eba8df7f08219b61cb3 error: timed out starting health check for container 35628ea1e4c062ae421aabc6ee308f425fda6e4bbab49eba8df7f08219b61cb3"
Feb 23 03:10:26 ethdocker2 dockerd[798]: time="2024-02-23T03:10:26.552377870+01:00" level=error msg="stream copy error: reading from a closed fifo"
Feb 23 03:10:26 ethdocker2 dockerd[798]: time="2024-02-23T03:10:26.552398011+01:00" level=error msg="stream copy error: reading from a closed fifo"
Feb 23 03:11:14 ethdocker2 dockerd[798]: time="2024-02-23T03:11:14.098400837+01:00" level=info msg="ignoring event" container=bb2eb0dda72a35b0ccaa4b2f412967b3e864c2d31f1ef2d3a0ea32e3f64acfbc module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Feb 23 03:12:34 ethdocker2 dockerd[798]: time="2024-02-23T03:11:28.995205827+01:00" level=warning msg="Health check for container 35628ea1e4c062ae421aabc6ee308f425fda6e4bbab49eba8df7f08219b61cb3 error: timed out starting health check for container 35628ea1e4c062ae421aabc6ee308f425fda6e4bbab49eba8df7f08219b61cb3"
Feb 23 03:12:34 ethdocker2 dockerd[798]: time="2024-02-23T03:11:28.995544380+01:00" level=error msg="stream copy error: reading from a closed fifo"
Feb 23 03:12:34 ethdocker2 dockerd[798]: time="2024-02-23T03:11:28.995557231+01:00" level=error msg="stream copy error: reading from a closed fifo"
Feb 23 03:12:34 ethdocker2 dockerd[798]: time="2024-02-23T03:12:18.680412876+01:00" level=warning msg="Health check for container 35628ea1e4c062ae421aabc6ee308f425fda6e4bbab49eba8df7f08219b61cb3 error: timed out starting health check for container 35628ea1e4c062ae421aabc6ee308f425fda6e4bbab49eba8df7f08219b61cb3"
Feb 23 03:12:34 ethdocker2 dockerd[798]: time="2024-02-23T03:12:18.680774258+01:00" level=error msg="stream copy error: reading from a closed fifo"
Feb 23 03:12:34 ethdocker2 dockerd[798]: time="2024-02-23T03:12:18.681322233+01:00" level=warning msg="Health check for container c520d5a80f2f8fe97bc13862feef84a64e45804058abab6ea7f03db302af9c6f error: timed out starting health check for container c520d5a80f2f8fe97bc13862feef84a64e45804058abab6ea7f03db302af9c6f"
Feb 23 03:12:34 ethdocker2 dockerd[798]: time="2024-02-23T03:12:18.681615175+01:00" level=error msg="stream copy error: reading from a closed fifo"
Feb 23 03:12:34 ethdocker2 dockerd[798]: time="2024-02-23T03:12:18.681685706+01:00" level=error msg="stream copy error: reading from a closed fifo"
Feb 23 03:12:34 ethdocker2 dockerd[798]: time="2024-02-23T03:12:18.681985778+01:00" level=error msg="stream copy error: reading from a closed fifo"
Feb 23 03:13:29 ethdocker2 dockerd[798]: time="2024-02-23T03:13:29.537949336+01:00" level=warning msg="Health check for container 35628ea1e4c062ae421aabc6ee308f425fda6e4bbab49eba8df7f08219b61cb3 error: timed out starting health check for container 35628ea1e4c062ae421aabc6ee308f425fda6e4bbab49eba8df7f08219b61cb3"
Feb 23 03:13:32 ethdocker2 dockerd[798]: time="2024-02-23T03:13:29.538173078+01:00" level=error msg="stream copy error: reading from a closed fifo"
The text was updated successfully, but these errors were encountered:
but "node is not healthy" is getting worse for me. one reboot every hour.. problem is on several nodes.. i updated ethdocker and all images to lastest version..
it might somehow be CPU usage related.. the problem got worse in the last two days when other tasks on my server push cpu usage above 80%
ssv-node-1 | {"level":"error","time":"2024-05-14T15:17:42.267391Z","msg":"node is not healthy","node":"consensus client","error":"failed to obtain node syncing status: failed to call GET endpoint: Get \"http://consensus:5052/eth/v1/node/syncing\": context deadline exceeded"}
ssv-node-1 | {"level":"error","time":"2024-05-14T15:17:42.267512Z","msg":"not all nodes are healthy"}
ssv-node-1 | {"level":"fatal","time":"2024-05-14T15:17:42.267523Z","msg":"ethereum node(s) are either out of sync or down. Ensure the nodes are healthy to resume."}
Describe the bug
I run several instances of ssv. standalone or inside ethdocker. All as docker containers. Two times in the last days i have seen that different nodes dit not reboot. ( ssv crashes after loosing connection to CL. no big deal if it would restart, but it does not ).
To Reproduce
Fresh Ubuntu 22
Install as outlined here https://docs.ssv.network/operator-user-guides/operator-node/installation
log standalone ssv
log ethdocker
The text was updated successfully, but these errors were encountered: