feat(ping): don't close connections upon failures #3947

Merged
26 commits merged on May 24, 2023

Conversation

thomaseizinger
Contributor

@thomaseizinger thomaseizinger commented May 15, 2023

Description

Previously, the libp2p-ping module came with a policy to close a connection after X failed pings. This is only one of many possible policies on how users would want to do connection management.

We remove this policy without a replacement. Users who wish to restore this functionality can easily implement such a policy themselves: the default value of max_failures was 1, so closing the connection upon the first received ping error restores the previous behaviour.
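A minimal sketch of what such a user-side policy could look like, written without any libp2p dependency. `ConnId`, `Decision` and `ClosePolicy` are hypothetical names introduced for illustration, and the sketch assumes that `max_failures` counted consecutive failures; acting on `Decision::Close` (actually closing the connection) is left to the application.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a connection identifier; not a libp2p type.
type ConnId = u64;

/// What the application should do with a connection after a ping result.
#[derive(Debug, PartialEq, Eq)]
enum Decision {
    Keep,
    Close,
}

/// Application-level policy: close after `max_failures` consecutive failed pings.
struct ClosePolicy {
    max_failures: u32,
    failures: HashMap<ConnId, u32>,
}

impl ClosePolicy {
    fn new(max_failures: u32) -> Self {
        Self { max_failures, failures: HashMap::new() }
    }

    /// Record one ping outcome (`true` = success) and decide what to do.
    fn on_ping(&mut self, conn: ConnId, success: bool) -> Decision {
        if success {
            // A successful ping resets the failure streak for this connection.
            self.failures.remove(&conn);
            return Decision::Keep;
        }
        let count = self.failures.entry(conn).or_insert(0);
        *count += 1;
        if *count >= self.max_failures {
            // The connection is about to be closed, so forget its state.
            self.failures.remove(&conn);
            Decision::Close
        } else {
            Decision::Keep
        }
    }
}
```

With `ClosePolicy::new(1)`, the first failed ping on a connection yields `Decision::Close`, which corresponds to the old default described above.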

In this same patch, we also simplify the API of ping::Event by removing the layer of ping::Success and instead reporting the RTT to the peer directly.
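To illustrate the API simplification, here is a rough sketch comparing the two result shapes using local stand-in types rather than the real `libp2p-ping` ones. The variants of `Success` and `Failure` shown here, and the helper names `rtt_old` and `rtt_new`, are assumptions made for the example; the PR description only states that the `Success` layer is removed and the RTT is reported directly.

```rust
use std::time::Duration;

/// Simplified stand-in for the ping failure type (assumed variants).
#[derive(Debug)]
enum Failure {
    Timeout,
    Unsupported,
}

/// Simplified stand-in for the removed success layer (assumed variants).
#[derive(Debug)]
enum Success {
    Pong,
    Ping { rtt: Duration },
}

/// Old shape: the RTT is nested inside `Success`.
type OldResult = Result<Success, Failure>;
/// New shape: the RTT is the success value itself.
type NewResult = Result<Duration, Failure>;

/// Extracting the RTT from the old shape needs a nested match.
fn rtt_old(result: &OldResult) -> Option<Duration> {
    match result {
        Ok(Success::Ping { rtt }) => Some(*rtt),
        Ok(Success::Pong) | Err(_) => None,
    }
}

/// With the new shape, the RTT is available directly on success.
fn rtt_new(result: &NewResult) -> Option<Duration> {
    result.as_ref().ok().copied()
}
```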

Related: #3591.

Notes & open questions

Patch-by-patch review is recommended.

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

@mxinden
Member

mxinden commented May 16, 2023

Do I understand correctly that we expect all Transport implementations, e.g. TCP and QUIC, to close a malfunctioning connection?

If yes, fine with me to proceed with this pull request.

If not, I would expect all users to want some mechanism along the lines of what libp2p-ping provides today, namely to close malfunctioning connections, and thus I would advocate for keeping the mechanism in-place here.

@thomaseizinger
Contributor Author

> If not, I would expect all users to want some mechanism along the lines of what libp2p-ping provides today, namely to close malfunctioning connections, and thus I would advocate for keeping the mechanism in-place here.

Define malfunctioning?

I don't think libp2p-ping is a suitable way of identifying a malfunctioning connection:

  1. A remote peer is not guaranteed to support libp2p-ping, hence we cannot rely on it actually operating.
  2. A remote peer may deprioritize ping messages, i.e. not treat them with the highest priority and thus run into a timeout. That doesn't mean that the underlying connection is faulty.
  3. A remote peer may have a bug in the ping implementation but implement other protocols correctly.

I already tried to make the point that equating a working ping with a working connection is a policy, and I believe that users should be in charge of policy. Do you not agree with that?

For example, another policy could be to disconnect all peers with a latency higher than 500ms or all that don't have a latency in the 95th percentile of all currently active connections.
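As a rough sketch, latency-based policies like these could be expressed as plain functions over the RTTs an application has collected from ping events. `Peer`, `peers_over_limit` and `percentile` are hypothetical names, and reading the percentile policy as "disconnect peers slower than the 95th-percentile RTT" is an assumption about the intent.

```rust
use std::time::Duration;

/// Hypothetical peer label; a real application would use its own peer identifier.
type Peer = &'static str;

/// Return the peers whose most recent RTT exceeds `limit`.
fn peers_over_limit(rtts: &[(Peer, Duration)], limit: Duration) -> Vec<Peer> {
    rtts.iter()
        .filter(|(_, rtt)| *rtt > limit)
        .map(|(peer, _)| *peer)
        .collect()
}

/// Nearest-rank percentile of the observed RTTs (`pct` in 1..=100).
/// Returns `None` when there are no observations.
fn percentile(rtts: &[(Peer, Duration)], pct: usize) -> Option<Duration> {
    let mut sorted: Vec<Duration> = rtts.iter().map(|(_, rtt)| *rtt).collect();
    sorted.sort();
    if sorted.is_empty() {
        return None;
    }
    // Nearest rank: the ceil(pct / 100 * n)-th smallest value, 1-indexed.
    let rank = (pct * sorted.len() + 99) / 100;
    sorted.get(rank.max(1) - 1).copied()
}
```

A fixed threshold would be `peers_over_limit(&rtts, Duration::from_millis(500))`; a relative one would pass the result of `percentile(&rtts, 95)` as the limit instead.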

> Do I understand correctly that we expect all Transport implementations, e.g. TCP and QUIC, to close a malfunctioning connection?

If they can reliably detect a malfunctioning connection, then yes absolutely.

@thomaseizinger
Contributor Author

> For example, another policy could be to disconnect all peers with a latency higher than 500ms or all that don't have a latency in the 95th percentile of all currently active connections.

I am happy to add some more docs to libp2p-ping to explain that.

Member

@mxinden mxinden left a comment

Fine with proceeding here. Thanks for the details. Just one thing on how the user should close a specific connection.

Member

@mxinden mxinden left a comment

Good to merge from my end.

Comment on lines +278 to +279
// Note: For backward-compatibility the first failure is always "free"
// and silent. This allows peers who use a new substream
Member

Is this backwards compatibility still relevant? This implementation should be compatible with any recent other implementation adhering to the specification, right?

Contributor Author

We could reword it but the functionality still needs to be there. JS for example uses a new stream per ping and we should not report that as an error.
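The "first failure is free" behaviour discussed here can be sketched as a small counter. This is not the actual handler code: `FailureTracker` and its methods are hypothetical names, and the sketch assumes that a successful ping resets the counter, so a peer that opens a new stream per ping never accumulates more than one failure between successes.

```rust
/// Hypothetical stand-in for the per-connection failure bookkeeping.
#[derive(Default)]
struct FailureTracker {
    failures: u32,
}

impl FailureTracker {
    /// A successful ping resets the failure streak.
    fn on_success(&mut self) {
        self.failures = 0;
    }

    /// Record a failure. Returns `true` if it should be reported to the
    /// application, `false` if it is the silent first failure of a streak.
    fn on_failure(&mut self) -> bool {
        self.failures += 1;
        // Only the second and later failures in a row are reported.
        self.failures > 1
    }
}
```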

@mergify mergify bot merged commit 25bc30f into master May 24, 2023
63 checks passed
@mergify mergify bot deleted the feat/no-close-connection-ping-failures branch May 24, 2023 12:33