Node restart breaks p2p connectivity #9347

Open
k1rill-fedoseev opened this issue Aug 9, 2021 · 6 comments
Labels
Bug Something isn't working Good First Issue Good for newcomers Help Wanted Extra attention is needed Networking P2P related items

Comments

@k1rill-fedoseev
Contributor

k1rill-fedoseev commented Aug 9, 2021

🐞 Bug Report

Description

When a node is restarted with the --p2p-priv-key option pointing to the same private key as before, it can no longer connect to the existing nodes in the network because the handshake fails.

Has this worked before in a previous version?

No, AFAIK

🔬 Minimal Reproduction


  • Set up a private chain with N nodes with fixed private keys (via --p2p-priv-key) and 1 bootnode.
  • Set up and run the necessary validators to test that the chain is operating.
  • Stop one of the beacon nodes and restart it after some time passes.
  • Observe debug and trace level logs.

🔥 Error

The node cannot sync with the rest of the chain; it remains in the "Waiting for enough suitable peers before syncing" state with zero suitable peers.

2021-08-09T07:30:58.545Z DEBUG swarm2 [limiter] freeing FD token; waiting: 0; consuming: 1
2021-08-09T07:30:58.545Z DEBUG swarm2 [limiter] freeing peer token; peer 16Uiu2HAm68etebwYHpMZTTiTxEa7zehC9jQkVxxDMXo1LGkMaEpT; addr: /ip4/134.122.127.194/tcp/13000; active for peer: 1; waiting on peer limit: 0
2021-08-09T07:30:58.545Z DEBUG swarm2 [limiter] clearing all peer dials: 16Uiu2HAm68etebwYHpMZTTiTxEa7zehC9jQkVxxDMXo1LGkMaEpT
2021-08-09T07:30:58.545Z DEBUG swarm2 [16Uiu2HAmBKp1bVEpuePbM9JizW2r7fUGrP4qamTreXrRps4HvVa2] opening stream to peer [16Uiu2HAm68etebwYHpMZTTiTxEa7zehC9jQkVxxDMXo1LGkMaEpT]
2021-08-09T07:30:58.545Z DEBUG swarm2 [16Uiu2HAmBKp1bVEpuePbM9JizW2r7fUGrP4qamTreXrRps4HvVa2] swarm dialing peer [16Uiu2HAm68etebwYHpMZTTiTxEa7zehC9jQkVxxDMXo1LGkMaEpT]
2021-08-09T07:30:58.545Z DEBUG swarm2 [16Uiu2HAmBKp1bVEpuePbM9JizW2r7fUGrP4qamTreXrRps4HvVa2] opening stream to peer [16Uiu2HAm68etebwYHpMZTTiTxEa7zehC9jQkVxxDMXo1LGkMaEpT]
2021-08-09T07:30:58.545Z DEBUG swarm2 [16Uiu2HAmBKp1bVEpuePbM9JizW2r7fUGrP4qamTreXrRps4HvVa2] swarm dialing peer [16Uiu2HAm68etebwYHpMZTTiTxEa7zehC9jQkVxxDMXo1LGkMaEpT]
2021-08-09T07:30:58.545Z DEBUG net/identify error opening identify stream {"error": "read tcp4 172.18.0.2:13000->134.122.127.194:13000: read: connection reset by peer"}
2021-08-09T07:30:58.545Z DEBUG net/identify error opening identify stream {"error": "read tcp4 172.18.0.2:13000->134.122.127.194:13000: read: connection reset by peer"}
2021-08-09T07:30:58.545Z DEBUG basichost host 16Uiu2HAmBKp1bVEpuePbM9JizW2r7fUGrP4qamTreXrRps4HvVa2 finished dialing 16Uiu2HAm68etebwYHpMZTTiTxEa7zehC9jQkVxxDMXo1LGkMaEpT
time="2021-08-09 07:30:58" level=trace msg="Handshake failed" error="max dial attempts exceeded" prefix=p2p
2021-08-09T07:30:58.545Z DEBUG pubsub opening new stream to peer: max dial attempts exceeded16Uiu2HAm68etebwYHpMZTTiTxEa7zehC9jQkVxxDMXo1LGkMaEpT
2021-08-09T07:30:58.545Z DEBUG pubsub PEERDOWN: Remove disconnected peer 16Uiu2HAm68etebwYHpMZTTiTxEa7zehC9jQkVxxDMXo1LGkMaEpT
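
To make logs like the ones above easier to compare between a healthy node and a restarted one, a small triage helper can count the failure signatures that appear in them. This is only a sketch: `summarize` is a hypothetical helper, not part of Prysm or libp2p, the matched substrings are copied from the log lines above, and the peer ID pattern assumes IDs start with `16Uiu2` as they do in this log.

```python
import re
from collections import Counter

# Assumption: peer IDs look like the ones in the log above
# (base58 text starting with "16Uiu2").
PEER_ID = re.compile(r"16Uiu2[1-9A-HJ-NP-Za-km-z]+")

# Substrings copied verbatim from the log excerpt.
RESET = "connection reset by peer"
DIAL_LIMIT = "max dial attempts exceeded"
PEER_DOWN = "PEERDOWN"


def summarize(lines):
    """Count failure signatures and collect peers reported as down."""
    counts = Counter()
    peers_down = set()
    for line in lines:
        if RESET in line:
            counts["connection_reset"] += 1
        if DIAL_LIMIT in line:
            counts["max_dial_attempts"] += 1
        if PEER_DOWN in line:
            match = PEER_ID.search(line)
            if match:
                peers_down.add(match.group(0))
    return counts, peers_down
```

Calling `summarize(open(path))` on a saved log file (where `path` is wherever the node's output was captured) returns the counters and the set of peers that pubsub removed, which can then be compared across nodes.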

🌍 Your Environment

Operating System:

Ubuntu 18.04

What version of Prysm are you running? (Which release)

v1.4.3

Anything else relevant (validator index / public key)?

@nisdas
Member

nisdas commented Aug 10, 2021

Hey, thanks for opening the issue @k1rill-fedoseev. Do you mind stating which flags you are running the beacon nodes with?

@nisdas nisdas added Networking P2P related items Bug Something isn't working Help Wanted Extra attention is needed Good First Issue Good for newcomers labels Aug 10, 2021
@k1rill-fedoseev
Contributor Author

Sure, I was using docker-compose for setting up the network, here is the snippet from my compose file:

version: '3.8'
services:
  node:
    image: gcr.io/prysmaticlabs/prysm/beacon-chain:v1.4.3
    command: |
      --accept-terms-of-use
      --contract-deployment-block 9068790
      --http-web3provider https://rinkeby.infura.io/v3/$PROJECT_ID
      --bootstrap-node $BOOTNODE
      --config-file /sbc_test/config/config.yml
      --chain-config-file /sbc_test/config/config.yml
      --rpc-host 0.0.0.0
      --p2p-priv-key /sbc_test/config/nodekey.txt
      --p2p-local-ip 0.0.0.0
      --p2p-host-ip $IP_NODE
    ports:
      - '12000:12000/udp'
      - '13000:13000'
      - '4000:4000'

All nodes had different IPs and nodekey files.
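
For anyone retrying the reproduction against a compose setup like this one, the stop-and-restart step could be scripted roughly as follows. This is a sketch under assumptions: `restart_node` is a hypothetical helper, the service name `node` is taken from the snippet above, and the 60-second downtime is an arbitrary stand-in for "after some time passes". It relies only on the standard `docker-compose stop` and `docker-compose start` subcommands.

```python
import subprocess
import time


def restart_node(service="node", downtime_s=60.0,
                 run=subprocess.run, sleep=time.sleep):
    """Stop one compose service, wait, then start it again.

    `run` and `sleep` are injectable so the sequence can be
    dry-run without Docker installed.
    """
    stop_cmd = ["docker-compose", "stop", service]
    start_cmd = ["docker-compose", "start", service]
    run(stop_cmd, check=True)   # raises if the stop command fails
    sleep(downtime_s)           # assumed downtime; adjust as needed
    run(start_cmd, check=True)
    return [stop_cmd, start_cmd]
```

Running `restart_node()` from the directory containing the compose file performs the restart; the node's debug and trace logs can then be inspected for the handshake errors described in the report.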

@wangyifan

Hi, I will take a look at this as my first Prysm issue. Could someone assign the ticket to me?

@prestonvanloon
Member

Thanks @wangyifan. This is quite an old issue so it may not be relevant. If you could attempt the reproduction steps and find that it is no longer an issue, please add that information here and we can close the ticket. Thanks

@wangyifan

Sure, I will post whatever I find.

@prestonvanloon
Member

@wangyifan did you make any progress on this?

Development

No branches or pull requests

4 participants