Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use TCP keepalive to detect broken sockets #681

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

joeyoravec
Copy link

@joeyoravec joeyoravec commented Apr 25, 2024

fixes #674

vsomeip does not enforce a single socket between endpoints; it does not actively close an existing socket after a new one is established.

It is possible that:

  • socket is established from nodeA to nodeB
  • ethernet gets unplugged
  • nodeA tries to transmit, times out, closes the socket
  • ethernet is reconnected
  • nodeA establishes a new socket to node B

nodeB never received an RST while the cable was unplugged, still thinks the old socket is established, and effectively "leaks" until TCP keepalive detects the break. At default values this takes 2 hr 10 min. This can be mitigated by reducing the keepalive settings.

The implementation of TCP keepalive is different between OS:

  • Linux can configure all 3 settings on a per-socket basis
  • QNX can configure 1 setting per-socket and 2 settings globally

QNX would need to set the global values using sysctl, perhaps at startup.

vsomeip does not enforce a single socket between endpoints; it does not
actively close an existing socket after a new one is established.

It is possible that:
- socket is established from nodeA to nodeB
- ethernet gets unplugged
- nodeA tries to transmit, times out, closes the socket
- ethernet is reconnected
- nodeA establishes a new socket to node B

nodeB never received an RST while the cable was unplugged, still thinks
the old socket is established, and effectively "leaks" until TCP keepalive
detects the break. At default values this takes 2 hr 10 min. This can be
mitigated by reducing the keepalive settings.

The implementation of TCP keepalive is different between OS:
- Linux can configure all 3 settings on a per-socket basis
- QNX can configure 1 setting per-socket and 2 settings globally

QNX would need to set the global values using sysctl, perhaps at startup.
@goncaloalmeida
Copy link
Collaborator

@joeyoravec is this ready to review or are you still working on this?

@joeyoravec
Copy link
Author

@joeyoravec is this ready to review or are you still working on this?

I've left this in draft so it's clear that this is not ready for merge. Yes this "works" and is worth reviewing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG]: vsomeip leaks TCP sockets
2 participants