Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When one side connect actively, the other side will always keep trying a new connection #1123

Closed
Tommo-L opened this issue Sep 27, 2019 · 14 comments · Fixed by #1685
Closed

Comments

@Tommo-L
Copy link
Contributor

Tommo-L commented Sep 27, 2019

Describe the bug

If neo node A and B both listen port 40333, and A connected B firstly. At this time, B will always try to create a new connection from B to A with timer.
But when the new connection created, and B sent the VersionPlayload to A, A will close the connection at once, as A have connected B.

if (LocalNode.Singleton.RemoteNodes.Values.Where(p => p != this).Any(p => p.Remote.Address.Equals(Remote.Address) && p.Version?.Nonce == version.Nonce))
{
Disconnect(true);
return;
}
SendMessage(Message.Create(MessageCommand.Verack));

Assumption: The number of connected nodes is less than MinDesiredConnections.

To Reproduce
Steps to reproduce the behavior:

Evn.: 4 four nodes in different hosts.

  1. add log in RemoteNode
 public RemoteNode(NeoSystem system, object connection, IPEndPoint remote, IPEndPoint local)
            : base(connection, remote, local)
        {
            System.Console.WriteLine("add RemoteNode " + remote.ToString() + "  remotes: " + LocalNode.Singleton.RemoteNodes.Count);  // <---- here

            this.system = system;
            this.protocol = Context.ActorOf(ProtocolHandler.Props(system));
            LocalNode.Singleton.RemoteNodes.TryAdd(Self, this);
         ...............................

add log in RemoteNode.PostStop

        protected override void PostStop()
        {
            LocalNode.Singleton.RemoteNodes.TryRemove(Self, out _);
            base.PostStop();

            System.Console.WriteLine("remove RemoteNode " + Remote.ToString() + "  remotes: " + LocalNode.Singleton.RemoteNodes.Count);  // <------- 
        }

add log in Peer.OnTimer

 private void OnTimer()
        {
            if (ConnectedPeerActors.Count >= MinDesiredConnections) return;
            if (UnconnectedPeers.Count == 0)
                NeedMorePeers(MinDesiredConnections - ConnectedPeerActors.Count);
            foreach(IPEndPoint unconnected in UnconnectedPeers)
            {
                Console.WriteLine("--- unconnected " + unconnected.ToString());  // <--- here
            }
            foreach (IPEndPoint connected in ConnectedPeers.Values)
            {
                Console.WriteLine("--- connected " + connected.ToString());   // <--- here
            }

            IPEndPoint[] endpoints = UnconnectedPeers.Take(MinDesiredConnections - ConnectedPeerActors.Count).ToArray();
            ImmutableInterlocked.Update(ref UnconnectedPeers, p => p.Except(endpoints));
            foreach (IPEndPoint endpoint in endpoints.Except(ConnectedPeers))
            {
                Console.WriteLine("start to connect " + endpoint.ToString() + " ..."); // <--- here
                ConnectToPeer(endpoint);
            }
        }
  1. setup a privatenet and watch the logs

Expected behavior

When A connected B, B dose not need to create a new connection between B and A unless the original connection crashed.

Screenshots

image

B's ip is 192.168.1.147, and A's ip is 192.168.1.149. A connected B firstly. As you can see, B will try to create a connection with timer.

Platform:

  • OS: Centos 7.0
  • Version neo-cli 3.0.0 preview1

How to solve?

The reason is that A dosen't use the listener port 40333 but other port to connect B, but B will treat the {A's ip}:40333 as a new node, and try to connect it.

When A connect B, B must also add the {A's ip}:{A's listener port} in its connected peers.

@vncoelho
Copy link
Member

It is great to see you are investigating this. I will take a carefull look.

@Tommo-L
Copy link
Contributor Author

Tommo-L commented Sep 27, 2019

fixed by #1124

@erikzhang
Copy link
Member

I think this is as expected. Why do we want to allow duplicate connections?

@Tommo-L
Copy link
Contributor Author

Tommo-L commented Sep 27, 2019

I mean, there is not need for B to connect A again, when A has connected B.
Currently, the problem is that B is always trying to connect A.

@shargon
Copy link
Member

shargon commented Sep 29, 2019

If neo node A and B both listen port 40333, and A connected B firstly. At this time, B will always try to create a new connection from B to A with timer.
But when the new connection created, and B sent the VersionPlayload to A, A will close the connection at once, as A have connected B.

I think that we should use Reject message is certain cases, and this is one of them.

Why A don't send a Reject message to say B, stop do this?

@erikzhang
Copy link
Member

Reject is for transactions or blocks.

@Tommo-L
Copy link
Contributor Author

Tommo-L commented Sep 29, 2019

I think there is no need to send a Reject message, as B has already knew A's listener port, which is contained in A's VersionPayload.

@shargon
Copy link
Member

shargon commented Sep 30, 2019

Why we don't use unique Id Guid instead of Ports or nonce?

@shargon
Copy link
Member

shargon commented Oct 1, 2019

I am configuring the scenario and the text is wrong @Tommo-L

foreach (IPEndPoint connected in ConnectedPeers)
            {
                Console.WriteLine("--- connected " + connected.ToString());   // <--- here
            }

should be

foreach (IPEndPoint connected in ConnectedPeers.Values)
            {
                Console.WriteLine("--- connected " + connected.ToString());   // <--- here
            }

@shargon
Copy link
Member

shargon commented Oct 1, 2019

This is my log

image

@Tommo-L
Copy link
Contributor Author

Tommo-L commented Oct 1, 2019

Two nodes aren't enough, you need 3 nodes at least. In my environment, I have 4 nodes in different hosts.

Why two nodes aren't enough, the work flow as following:
Step1: A ---- connected ----> B
Step2: B timeout#1, B will try to connect A, but A will close it at once. And B will also remove A from UnconnectedPeers.

neo/neo/Network/P2P/Peer.cs

Lines 229 to 230 in c4023a4

IPEndPoint[] endpoints = UnconnectedPeers.Take(MinDesiredConnections - ConnectedPeers.Count).ToArray();
ImmutableInterlocked.Update(ref UnconnectedPeers, p => p.Except(endpoints));

Step3: B timeout#2, but this time, UnconnectedPeers is empty, then B will broadcast GetAddr message to get new unconnected peers.

neo/neo/Network/P2P/Peer.cs

Lines 224 to 228 in c4023a4

private void OnTimer()
{
if (ConnectedPeers.Count >= MinDesiredConnections) return;
if (UnconnectedPeers.Count == 0)
NeedMorePeers(MinDesiredConnections - ConnectedPeers.Count);

protected override void NeedMorePeers(int count)
{
count = Math.Max(count, 5);
if (ConnectedPeers.Count > 0)
{
BroadcastMessage(MessageCommand.GetAddr);
}

Step4: However, when A received GetAddr message, A will send the addresses of connected nodes, which is the address of B.

private void OnGetAddrMessageReceived()
{
Random rand = new Random();
IEnumerable<RemoteNode> peers = LocalNode.Singleton.RemoteNodes.Values
.Where(p => p.ListenerTcpPort > 0)
.GroupBy(p => p.Remote.Address, (k, g) => g.First())
.OrderBy(p => rand.Next())
.Take(AddrPayload.MaxCountToSend);
NetworkAddressWithTime[] networkAddresses = peers.Select(p => NetworkAddressWithTime.Create(p.Listener.Address, p.Version.Timestamp, p.Version.Capabilities)).ToArray();
if (networkAddresses.Length == 0) return;
Context.Parent.Tell(Message.Create(MessageCommand.Addr, AddrPayload.Create(networkAddresses)));

Step5: B received the address list from A, it's only the address of B. And it will be filter by B.

private void OnAddrMessageReceived(AddrPayload payload)
{
system.LocalNode.Tell(new Peer.Peers
{
EndPoints = payload.AddressList.Select(p => p.EndPoint).Where(p => p.Port > 0)
});
}

protected void AddPeers(IEnumerable<IPEndPoint> peers)
{
if (UnconnectedPeers.Count < UnconnectedMax)
{
peers = peers.Where(p => p.Port != ListenerTcpPort || !localAddresses.Contains(p.Address));
ImmutableInterlocked.Update(ref UnconnectedPeers, p => p.Union(peers));
}
}

Finally, the UnconnectedPeers of B is still empty. This is why two nodes works fine.

@Tommo-L
Copy link
Contributor Author

Tommo-L commented Oct 1, 2019

Another way is to add a ConnectFailedPeers list?

@shargon
Copy link
Member

shargon commented Oct 1, 2019

I think that is better with ConnectFailedPeers every 10 minutes we can purge this cache.

@Tommo-L
Copy link
Contributor Author

Tommo-L commented Oct 11, 2019

fixed in #1154, plz review it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants