Local nodes keep disconnecting. #4272
Comments
For what it's worth, disabling the discovery mechanism fixes the issue. It's still unclear to me what is happening.
What seems to happen is the following:
This sequence more or less repeats ad infinitum, since Node 2's discovery will continue to encounter these peer IDs differing from that of Node 1 under the same address. If I'm not mistaken, this seems to be an interesting way for a node C to directly influence the connectivity between some nodes A and B, e.g. by advertising public addresses of A in the DHT under its own peer ID. When node B picks these up during lookups, it would disturb the connection between A and B in the above manner. This may be primarily a pitfall of the single-connection-per-node policy, though it may possibly be prevented by establishing a preference for the old (existing) connection if a node receives a second connection as a listener while it is already the listener on the existing connection, but I'm not entirely clear about all the possible consequences at the moment.
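For illustration, a minimal sketch of that tie-break idea, i.e. keeping the old connection when the node is already the listener on it; the `Role`/`Conn` types and `resolve_duplicate` are hypothetical and not part of rust-libp2p's actual API:

```rust
// Hypothetical sketch of the tie-break suggested above. Not rust-libp2p code.

#[derive(Clone, Copy)]
enum Role {
    Dialer,
    Listener,
}

struct Conn {
    role: Role,
}

/// Returns `true` if a newly established duplicate connection should replace
/// the existing one, `false` if the existing connection should be kept.
fn resolve_duplicate(existing: &Conn, new: &Conn) -> bool {
    match (existing.role, new.role) {
        // Already the listener on the existing connection and again the
        // listener on the new one: keep the old connection, so a third party
        // advertising this peer's address cannot disturb it.
        (Role::Listener, Role::Listener) => false,
        // Otherwise fall back to preferring the newer connection, on the
        // assumption that the old one may be dead.
        _ => true,
    }
}
```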
The reason is that typically, when a node opens a new connection, it's because the old one is dead. Example situation: Node 1 and Node 2 are connected. Node 2 loses its Internet connection, realizes it, and kills all existing sockets. No FIN is actually sent, because there is no Internet access. Node 2 then gains back its Internet connection and tries to re-connect to Node 1. Node 1 isn't aware that the previous connection is dead.
In my opinion the solution is to handle multiple simultaneous connections per node (libp2p/rust-libp2p#912).
Doesn't libp2p have a keep-alive or ping protocol to handle this?
It does, but it takes something like 30 seconds to trigger. I'm not actually sure that my scenario above is realistic, but the general idea is that we expect that when a node opens a second connection, it is because the existing one is unusable.
Note though that even when permitting multiple connections per peer, which I'm currently looking into, you will want to have a configurable limit (per peer). In a sense, the current single-connection-per-peer policy can be seen as a hard-coded limit of 1. Whatever the limit, I don't think it is a good idea to enforce it by dropping existing connections in favor of new ones at the lower networking layers. Rather, timely detection of broken connections is up to the application protocols (or to timeouts configured on a lower-layer protocol), and in particular, what "timely" means exactly is defined by the requirements of the protocol. The ping protocol can be aptly configured and used for this purpose, if desired.
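As an illustration of that last point, here is a rough sketch of a tightened ping configuration so that dead connections are detected well before the ~30 seconds mentioned earlier. The `PingConfig` builder methods reflect the rust-libp2p API of roughly this era; the exact names and defaults may differ between versions, and the concrete values are arbitrary:

```rust
use std::{num::NonZeroU32, time::Duration};

use libp2p::ping::{Ping, PingConfig};

/// Build a ping behaviour that notices dead connections within a few seconds.
fn make_ping() -> Ping {
    Ping::new(
        PingConfig::new()
            .with_interval(Duration::from_secs(5)) // send a ping every 5 s
            .with_timeout(Duration::from_secs(5)) // consider a ping failed after 5 s
            .with_max_failures(NonZeroU32::new(2).unwrap()) // close after 2 consecutive failures
            .with_keep_alive(true), // keep otherwise idle connections alive
    )
}
```

The point being that how quickly a broken connection must be noticed is an application/protocol decision, rather than something the connection manager should guess at by replacing connections.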
Instead of trying to enforce a single connection per peer, which involves quite a bit of additional complexity e.g. to prioritise simultaneously opened connections and can have other undesirable consequences [1], we now make multiple connections per peer a feature. The gist of these changes is as follows:

The concept of a "node" with an implicit 1-1 correspondence to a connection has been replaced with the "first-class" concept of a "connection". The code from `src/nodes` has moved (with varying degrees of modification) to `src/connection`. A `HandledNode` has become a `Connection`, a `NodeHandler` a `ConnectionHandler`, the `CollectionStream` was the basis for the new `connection::Pool`, and so forth.

Conceptually, a `Network` contains a `connection::Pool` which in turn internally employs the `connection::Manager` for handling the background `connection::manager::Task`s, one per connection, as before. These are all considered implementation details. On the public API, `Peer`s are managed as before through the `Network`, except now the API has changed with the shift of focus to (potentially multiple) connections per peer. The `NetworkEvent`s have accordingly also undergone changes.

The Swarm APIs remain largely unchanged, except for the fact that `inject_replaced` is no longer called. It may now practically happen that multiple `ProtocolsHandler`s are associated with a single `NetworkBehaviour`, one per connection. If implementations of `NetworkBehaviour` rely somehow on communicating with exactly one `ProtocolsHandler`, this may cause issues, but it is unlikely.

[1]: paritytech/substrate#4272
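For `NetworkBehaviour` implementations that previously assumed exactly one connection (and thus one `ProtocolsHandler`) per peer, the practical consequence is that per-peer state has to tolerate several connections. A minimal, hypothetical bookkeeping sketch, not part of the libp2p API, could look like this:

```rust
use std::collections::HashMap;

use libp2p::PeerId;

/// Hypothetical per-peer connection bookkeeping for a behaviour that must not
/// treat a peer as disconnected while it still has other live connections.
#[derive(Default)]
struct PeerConnections {
    /// Number of currently established connections per peer.
    connections: HashMap<PeerId, usize>,
}

impl PeerConnections {
    /// Call when a connection to `peer` has been established.
    fn on_connection_established(&mut self, peer: PeerId) {
        *self.connections.entry(peer).or_insert(0) += 1;
    }

    /// Call when a connection to `peer` has closed; returns `true` if this was
    /// the last connection, i.e. the peer is now fully disconnected.
    fn on_connection_closed(&mut self, peer: &PeerId) -> bool {
        let remaining = self.connections.get(peer).copied().unwrap_or(1) - 1;
        if remaining == 0 {
            self.connections.remove(peer);
            true
        } else {
            self.connections.insert(peer.clone(), remaining);
            false
        }
    }
}
```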
How exactly would allowing multiple connections solve this issue? This is a fairly straightforward scenario, where none of the peers misbehaves or loses connectivity. Why would we want multiple connections here?
It seems to me that it should not attempt dialing an address that's already connected in the first place.
As I explained in an earlier comment, the immediate cause of this issue is that the "listener" closes its existing connection, preferring the new over the old connection in the attempt to enforce a single connection per peer (the "dialer" then later, upon discovering the peer ID mismatch, closes the new connection as well, and the connect/disconnect dance begins in this way). While my first reaction was that it doesn't seem right that new connections are preferred over old ones in this scheme, and I'd rather swap that around, @tomaka had some concerns about doing that. In any case, removing the single-connection-per-peer policy is a strictly more general and desirable solution, not just in light of this issue. In this particular scenario the "listener" then no longer has to make a choice between these connections: the old connection remains unaffected, and the "dialer" eventually closes its new connection attempt upon discovering the peer ID mismatch.
In general, and at the level of libp2p, I don't think it is desirable to disallow multiple connections to the same address. Of course - and I think that is what you are referring to - in the context of a specific protocol, like Kademlia, one could argue that connections should be uniquely identified by such an address. However, the (logical) overlay network of Kademlia only operates opaquely on a uniformly distributed keyspace, which also contains the node/peer IDs. Kademlia only uniquely identifies peers by these IDs; addresses to connect to are a secondary implementation artifact. In the scenario here, Kademlia sees a different peer ID for the same address, i.e. another peer that supposedly also has that address (among others, possibly). While you could argue that it should disregard the peer ID in this case, seeing that it already has a connection to the same address, even though with a different peer ID, I'm really not sure this is a good idea. If multiple peers are seen advertising the same address, who is to decide which is "right", i.e. which one to connect to, and which connection to keep and which others to ignore?
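To make that distinction concrete, here is a much simplified, hypothetical model of the routing information: it is keyed by peer ID, with addresses as secondary attributes, so nothing in this model prevents two peer IDs from (legitimately or maliciously) advertising the same multiaddress. This is not Kademlia's actual data structure:

```rust
use std::collections::{HashMap, HashSet};

use libp2p::{Multiaddr, PeerId};

/// Hypothetical view of discovered routing information, keyed by peer identity.
#[derive(Default)]
struct RoutingView {
    /// Known addresses per peer.
    addresses: HashMap<PeerId, HashSet<Multiaddr>>,
}

impl RoutingView {
    /// Record an address advertised for `peer`.
    fn add_address(&mut self, peer: PeerId, addr: Multiaddr) {
        self.addresses.entry(peer).or_default().insert(addr);
    }

    /// All peers currently advertising a given address; this may yield more
    /// than one peer, and nothing here decides which of them is "right".
    fn peers_for_address<'a>(
        &'a self,
        addr: &'a Multiaddr,
    ) -> impl Iterator<Item = &'a PeerId> + 'a {
        self.addresses
            .iter()
            .filter(move |(_, addrs)| addrs.contains(addr))
            .map(|(peer, _)| peer)
    }
}
```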
I'd argue that a new connection should not replace an existing one. Dead connections will eventually drop because we have keep-alive or ping protocols. Waiting for 30 seconds to restore connectivity is fine for substrate.
You don't drop existing connections. Otherwise it's an attack vector. I'd like to clarify that the "multiple connections" being discussed are to support connections to the same address with different node IDs. Multiple connections to the same address/node_id still won't be allowed, right? Regarding multiple connections to the same node ID, we probably don't want that in substrate/polkadot. The proposed use case, "let's allow the second connection because the first one might actually be dead", sounds like a hack. What if the first one never closes after all? It looks like we are struggling with managing connections even now, when duplicates are not allowed. This looks like it will introduce a lot of unneeded complexity for no good reason. Additionally, could an additional authentication mechanism be introduced at the Kademlia layer? Devp2p discovery would not propagate unconfirmed addresses. "Confirmed" here means that there was a signed UDP ping/pong exchange with that address first.
I agree, hence my first reaction was to change that, as I mentioned at the end of my first comment. The same thing (i.e. not dropping the existing connection) also happens with
Sure, I hinted at the same thing at the end of my first comment. I think we are on the same page here.
I see no reason for a general-purpose networking library like
It's fine if substrate/polkadot do not intentionally make use of multiple connections per peer. Indeed, in libp2p/rust-libp2p#1440 even
That would need to be laid out in more detail in order for me to make an informed comment. In general, I have expressed a desire in the past to allow better curation of Kademlia's k-buckets through the public API offered by
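As for what such curation combined with the "confirmed address" idea above could look like, here is a purely hypothetical sketch (the `AddressBook` type is not an existing rust-libp2p API): an address learned from discovery is only promoted, and only then used for routing, after an authenticated dial to it has succeeded with a matching peer ID.

```rust
use std::collections::{HashMap, HashSet};

use libp2p::{Multiaddr, PeerId};

/// Hypothetical two-stage address book: addresses learned from discovery are
/// kept aside until an authenticated connection confirms them.
#[derive(Default)]
struct AddressBook {
    unconfirmed: HashMap<PeerId, HashSet<Multiaddr>>,
    confirmed: HashMap<PeerId, HashSet<Multiaddr>>,
}

impl AddressBook {
    /// Record an address learned from discovery; it is not used for routing yet.
    fn learn(&mut self, peer: PeerId, addr: Multiaddr) {
        self.unconfirmed.entry(peer).or_default().insert(addr);
    }

    /// Promote an address once a dial to `addr` completed the authenticated
    /// handshake and the reported peer ID matched `peer`.
    fn confirm(&mut self, peer: PeerId, addr: Multiaddr) {
        if let Some(set) = self.unconfirmed.get_mut(&peer) {
            set.remove(&addr);
        }
        self.confirmed.entry(peer).or_default().insert(addr);
    }

    /// Only confirmed addresses would be used to populate k-buckets or be
    /// advertised to other peers.
    fn routable<'a>(&'a self, peer: &PeerId) -> impl Iterator<Item = &'a Multiaddr> + 'a {
        self.confirmed.get(peer).into_iter().flatten()
    }
}
```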
We are seeing this behavior in our private network as well, due to IP reuse of the nodes; here's a way of reproducing it using Docker:
* Allow multiple connections per peer in libp2p-core. Instead of trying to enforce a single connection per peer, which involves quite a bit of additional complexity e.g. to prioritise simultaneously opened connections and can have other undesirable consequences [1], we now make multiple connections per peer a feature. The gist of these changes is as follows: The concept of a "node" with an implicit 1-1 correspondence to a connection has been replaced with the "first-class" concept of a "connection". The code from `src/nodes` has moved (with varying degrees of modification) to `src/connection`. A `HandledNode` has become a `Connection`, a `NodeHandler` a `ConnectionHandler`, the `CollectionStream` was the basis for the new `connection::Pool`, and so forth. Conceptually, a `Network` contains a `connection::Pool` which in turn internally employs the `connection::Manager` for handling the background `connection::manager::Task`s, one per connection, as before. These are all considered implementation details. On the public API, `Peer`s are managed as before through the `Network`, except now the API has changed with the shift of focus to (potentially multiple) connections per peer. The `NetworkEvent`s have accordingly also undergone changes. The Swarm APIs remain largely unchanged, except for the fact that `inject_replaced` is no longer called. It may now practically happen that multiple `ProtocolsHandler`s are associated with a single `NetworkBehaviour`, one per connection. If implementations of `NetworkBehaviour` rely somehow on communicating with exactly one `ProtocolsHandler`, this may cause issues, but it is unlikely. [1]: paritytech/substrate#4272
* Fix intra-rustdoc links.
* Update core/src/connection/pool.rs Co-Authored-By: Max Inden <mail@max-inden.de>
* Address some review feedback and fix doc links.
* Allow responses to be sent on the same connection.
* Remove unnecessary remainders of inject_replaced.
* Update swarm/src/behaviour.rs Co-Authored-By: Pierre Krieger <pierre.krieger1708@gmail.com>
* Update swarm/src/lib.rs Co-Authored-By: Pierre Krieger <pierre.krieger1708@gmail.com>
* Update core/src/connection/manager.rs Co-Authored-By: Pierre Krieger <pierre.krieger1708@gmail.com>
* Update core/src/connection/manager.rs Co-Authored-By: Pierre Krieger <pierre.krieger1708@gmail.com>
* Update core/src/connection/pool.rs Co-Authored-By: Pierre Krieger <pierre.krieger1708@gmail.com>
* Incorporate more review feedback.
* Move module declaration below imports.
* Update core/src/connection/manager.rs Co-Authored-By: Toralf Wittner <tw@dtex.org>
* Update core/src/connection/manager.rs Co-Authored-By: Toralf Wittner <tw@dtex.org>
* Simplify as per review.
* Fix rustdoc link.
* Add try_notify_handler and simplify.
* Relocate DialingConnection and DialingAttempt. For better visibility constraints.
* Small cleanup.
* Small cleanup. More robust EstablishedConnectionIter.
* Clarify semantics of `DialingPeer::connect`.
* Don't call inject_disconnected on InvalidPeerId. To preserve the previous behavior and ensure calls to `inject_disconnected` are always paired with calls to `inject_connected`.
* Provide public ConnectionId constructor. Mainly needed for testing purposes, e.g. in substrate.
* Move the established connection limit check to the right place.
* Clean up connection error handling. Separate connection errors into those occurring during connection setup or upon rejecting a newly established connection (the `PendingConnectionError`) and those errors occurring on previously established connections, i.e. for which a `ConnectionEstablished` event has been emitted by the connection pool earlier.
* Revert change in log level and clarify an invariant.
* Remove inject_replaced entirely.
* Allow notifying all connection handlers. Thereby simplify by introducing a new enum `NotifyHandler`, used with a single constructor `NetworkBehaviourAction::NotifyHandler`.
* Finishing touches. Small API simplifications and code deduplication. Some more useful debug logging.

Co-authored-by: Max Inden <mail@max-inden.de>
Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
Co-authored-by: Toralf Wittner <tw@dtex.org>
Should have been fixed by #5278, although I didn't verify that it actually is.
It is now much worse.
The peer is immediately reported as disconnected after connecting. After that, the TCP connection stays open, and block requests from that peer come through, so they still consider the connection to be active. Also, there are multiple disconnect notifications. Update: apparently this is resolved by #5595.
Does this affect nodes behind NAT? E.g. if we have two nodes running behind a NAT router, they share the same IP address, but of course with different port numbers and peer IDs. Is there any reference on how the DHT stores the peer ID?
Closing as stale and probably resolved.
Two nodes running on localhost keep disconnecting from each other. The disconnect is apparently initiated by libp2p and not by sync or a reputation change. The logs have no useful information on why the disconnect happened.
Node 1 started as:
Node 2 started as:
The nodes stay connected for about 10 seconds before a disconnect and reconnect happens.