Skip to content
This repository has been archived by the owner on May 26, 2022. It is now read-only.

Peers Cannot Successfully Handshake #70

Closed
nisdas opened this issue Apr 13, 2020 · 5 comments · Fixed by libp2p/go-libp2p#1041
Closed

Peers Cannot Successfully Handshake #70

nisdas opened this issue Apr 13, 2020 · 5 comments · Fixed by libp2p/go-libp2p#1041

Comments

@nisdas
Copy link

nisdas commented Apr 13, 2020

In local testing with NOISE enabled, I get failed to negotiate security protocol: error reading handshake message: noise: message is too short often. For a background on this issue and more context: prysmaticlabs/prysm#5372 .

This happens often enough with two peers trying to connect and performing a NOISE handshake with each other. On further exploration this was because each peer regarded themselves as the initiator of the handshake in the same connection, which lead to malformed handshake responses being produced. This was because each peer read the initiating payload from the other peer as a response to their initial handshake. I am not sure how this happens, this would point to the issue being in Swarm rather than Noise. To clarify this only happens when each peer dials each other at the same time.

I explored this issue in SECIO too, and did notice that there were instances where both peers dialed each other the same time and shared the same connection, instead of having separate connections for each respective dial. However SECIO differs from NOISE in the fact that there is no fixed initiator or responder. Each peer is expected to write the initial proposal payload into the stream at the same time, and also read the corresponding peer's proposal from the same stream. Therefore SECIO wouldn't fail from the same issue as NOISE.

To reproduce this you can run the following test:

func createHost(t *testing.T, port int) (host.Host, net.IP) {
	ipAddr = net.ParseIP("127.0.0.1")
	listen, err := multiaddr.NewMultiaddr(fmt.Sprintf("/ip4/%s/tcp/%d", ipAddr, port))
	if err != nil {
		t.Fatalf("Failed to p2p listen: %v", err)
	}
	h, err := libp2p.New(context.Background(), []libp2p.Option{libp2p.ListenAddrs(listen), libp2p.Security(noise.ID, noise.New)}...)
	if err != nil {
		t.Fatal(err)
	}
	return h, ipAddr
}

func TestPeer_FailedHandshake(t *testing.T) {
	h1, ipaddr := createHost(t, 5000)
	defer h1.Close()

	h1Addr, err := multiaddr.NewMultiaddr(fmt.Sprintf("/ip4/%s/tcp/%d/p2p/%s", ipaddr, 5000, h1.ID()))
	if err != nil {
		t.Fatal(err)
	}
	addrInfo1, err := peer.AddrInfoFromP2pAddr(h1Addr)
	if err != nil {
		t.Fatal(err)
	}

	h2, ipaddr := createHost(t, 5001)
	defer h2.Close()

	h2Addr, err := multiaddr.NewMultiaddr(fmt.Sprintf("/ip4/%s/tcp/%d/p2p/%s", ipaddr, 5001, h2.ID()))
	if err != nil {
		t.Fatal(err)
	}
	addrInfo2, err := peer.AddrInfoFromP2pAddr(h2Addr)
	if err != nil {
		t.Fatal(err)
	}
	wg := new(sync.WaitGroup)
	go func() {
		wg.Add(1)
		for i := 0; i < 20; i++ {
			if err := h1.Connect(context.Background(), *addrInfo2); err != nil {
				t.Fatal(err)
			}
			if len(h1.Network().Peers()) != 1 {
				t.Error("Peer not connected, has 0 peers")
			}
			if err := h1.Network().ClosePeer(h2.ID()); err != nil {
				t.Fatal(err)
			}
			if len(h1.Network().Peers()) != 0 {
				t.Error("Peer not disconnected, has 1 peer")
			}
		}
		wg.Done()
	}()

// use another peer to constantly connect/disconnect with first peer.
	for i := 0; i < 20; i++ {
		if err := h2.Connect(context.Background(), *addrInfo1); err != nil {
			t.Fatal(err)
		}
		if len(h2.Network().Peers()) != 1 {
			t.Error("Peer not connected, has 0 peers")
		}
		if err := h2.Network().ClosePeer(h1.ID()); err != nil {
			t.Fatal(err)
		}
		if len(h2.Network().Peers()) != 0 {
			t.Error("Peer not disconnected, has 1 peer")
		}
	}
	wg.Wait()
}

You can split out noise for secio, and you will notice that the error disappears

@yusefnapora
Copy link
Contributor

@nisdas thank you for diagnosing and reporting this! We should be able to detect when a simultaneous connection is happening and automatically assign the initiator / responder role based on some deterministic criteria, e.g. by comparing the hash of the initial messages.

I'll start working on that today and keep this issue updated as things progress.

@vyzo
Copy link

vyzo commented Apr 13, 2020

This is not sufficient, this has to be propagated to the muxer as well -- yamux in particular.

@vyzo
Copy link

vyzo commented Apr 13, 2020

see also libp2p/go-libp2p#712

@nisdas
Copy link
Author

nisdas commented Apr 18, 2020

Hey @yusefnapora , any update on this ?

@d1eselboy
Copy link

We face the same problem in Anytype. Do you have any update on this? @yusefnapora @jbenet

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants