Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor and add socket activity check for internet #312

Merged
merged 6 commits into from
Aug 17, 2021
Merged

Conversation

fhunleth
Copy link
Member

This commit series deals with improving internet connectivity checking. This is
not the last PR on this topic, but it's an improvement to where VintageNet has
been on this topic.

Internet connectivity checking is important for multi-homed devices so that the
routing tables can be set up to prefer faster and cheaper interfaces. When those
interfaces lose internet connectivity, the routing tables should be updated so
reconnects go to the new best interface.

The first two commits move code around so that it's easier to improve internet
connectivity checking. All of the feedback on connectivity checking has been
nontrivial enough that mixing it in with the general network interface code is
just too much. The "ping" code is now called TCPPing to accurately reflect
what it does. Moving the "3-strikes" ping failure state machine to its own
module was also needed since it was complicated enough that I was worried about
messing it up accidentally. Now the state machine is completely pure code and
easier to test.

The third and fourth commits add a socket inspector. The socket inspector looks
at TCP connections going through the network interface of interest and going
somewhere off-LAN. If the tx and rx counters on those sockets both go up between
checks, then there was internet connectivity over the past sampling period. This
means that a ping of an external host can be skipped. The keepalive messages on
MQTT and websocket connections will satisfy this check. This has two advantages:
less bandwidth usage and it works on tightly firewalled installs.

The final commit makes the internet connectivity checker try the socket
inspector first and if that fails, it tries to connect to an external host like
before.

These updates are backwards compatible, but they'll log a warning to update code
that refers to VintageNet.Interface.InternetConnectivityChecker to
VintageNet.Connectivity.InternetChecker.

  • Move connectivity check code to its own folder
  • Move check handling logic into its own module
  • Add internet connectivity inspector
  • inspector: support the socket API too
  • Use the connectivity inspector as a first internet check

Deprecation warnings are emitted for anyone using the old locations.
They'll still work.

The reason for moving the code out of interface is that connectivity
checking is becoming complicated and will have some additions soon. It
feels easier to manage outside of the already fairly complicated
interface handling code.
This also simplifies what happens when the connectivity check fails. All
errors are handled the same and there are no shortcuts to `:lan` even if
it looks like the interface is removed. The expectation is that the
interface will be properly removed shortly if it's for real and
everything else is suspect.

Moving the check handling logic into its own module is intended to make
it easier to adjust it and to simplify addng the Inspector check to the
InternetChecker.
@fhunleth
Copy link
Member Author

FYI - I need to fix some warnings and test failures. Hopefully we'll have green lights soon. I've been verifying this on test devices. It's a big update, so depending on the comments, I'm ready to break it up into smaller chunks to help reviews if needed.

@fhunleth fhunleth force-pushed the better-icc branch 2 times, most recently from 72ee7cd to 67475f1 Compare August 17, 2021 01:21
Copy link
Member

@jjcarstens jjcarstens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

lib/vintage_net/connectivity/inspector.ex Outdated Show resolved Hide resolved
This is a new module that can determine whether a network interface is
connected to the internet by watching traffic on TCP sockets known to
Erlang/OTP. If a device is using long-lived TCP connections (like for
MQTT) that send keepalive messages, then it can declare
internet-connectivity just by watching to the tx and rx counters on the
long-lived connection. This helps reduce use of metered connections and
benefits devices behind restrictive firewalls that drop the pings.

This commit only adds the module. It is hooked up in a subsequent commit.
This majorly reduces internet traffic for checking status on those
network interfaces with long-lived TCP connections. Basically, if a TCP
connection on the interface that's to connecting to a host on a
different LAN has sent and received bytes since the last internet check,
then the check is skipped. Internet is available. AWS IoT and NervesHub
connections are sufficient since their keepalive messages will satisfy
this check.
The connectivity checker is pickier now on whether the interface is up
or down so the unit tests can't take as many shortcuts.

This also fixes an issue where `ifup` would be called and then a
connectivity check done immediately afterwards. The intended behavior is
for the CheckLogic to decide rather than this to be assumed. The result
of this is that the check will be done 500 ms later. This seems good
since it's short and 500 ms allows for a little time for the interface
to settle after it just declared that it was up.
@fhunleth
Copy link
Member Author

@jjcarstens confirmed that this makes a very tightly firewalled device report internet connectivity now.

@fhunleth fhunleth merged commit 596a33f into main Aug 17, 2021
@fhunleth fhunleth deleted the better-icc branch August 17, 2021 19:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants