Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix issue with false internet connection test #2305

Merged
merged 2 commits into from Sep 17, 2019

Conversation

@chrisveilleux
Copy link
Member

commented Sep 13, 2019

Description

The ability to reach a name server was used to determine internet connectivity. This occasionally returned false positives. Changed to check ability to get to google.com

How to test

This issue shows up in boot only occasionally it is not easy to reproduce.

Contributor license agreement signed?

CLA [yes]

Copy link
Contributor

left a comment

imports!

@forslund

This comment has been minimized.

Copy link
Member

commented Sep 14, 2019

Just connecting to google isn't a strong test. To handle capture portals the actual data received need to be verified.

If there is a problem with the dns test causing false positives (internet connected but no DNS yet?) I suggest we just use ncsi test where the retrieved test data is compared to a known entity.

Also please don't do refactoring in the same commit as logic changes.

@davidwagnerkc

This comment has been minimized.

Copy link
Contributor

commented Sep 16, 2019

Just using the ncsi test sounds reasonable to me. @chrisveilleux ?

@davidwagnerkc

This comment has been minimized.

Copy link
Contributor

commented Sep 16, 2019

@forslund @chrisveilleux

What about replacing 8.8.8.8 check with something like this.

It is faster and more efficient to only request headers rather than content with GET (https://stackoverflow.com/a/29854274/2395133)

and it checks for valid header response content. From what I have read no header fields are strictly required by the standard, but reasonable that one of these most common fields will be returned (all of them were returned in request I made)

res = requests.head('http://www.google.com', timeout=2)
connected = any(field in res.headers for field in ('Server', 'Date', 'Content-Type', 'Content-Length'))
@davidwagnerkc davidwagnerkc self-requested a review Sep 16, 2019
@davidwagnerkc davidwagnerkc dismissed their stale review Sep 16, 2019

Need to reach consensus on solution first.

@forslund

This comment has been minimized.

Copy link
Member

commented Sep 16, 2019

The problem would still be capture portals, they either use dns manipulation or directly intercepts HTTP packages.

The dns check circumvents those issues by using direct socket connections and uses an IP. the ncis test validates the data received to make sure the target uri is what's expected.

Using the dns_connection check together with the headers only check should be relatively fast and even more reliable... Except for networks blocking external dns connections... (which is why the ncis check was added).

    (is_connected_dns() and is_connected_google_header()) or is_connected_ncis()` 

maybe?

@chrisveilleux

This comment has been minimized.

Copy link
Member Author

commented Sep 16, 2019

Why are we concerned about captive portals? Could you give me a use case @forslund?

It occurs to me that what we really need is access to our API. Even if there are skills that require the internet, you aren't going to get very far in your Mycroft experience without hitting our API. Is there any reason we can't change this to hit a new API endpoint that returns a string value we can test? Is there another reason we need to be able to test internet connectivity independently of connectivity to our API?

The problem we are trying to fix here is that this function indicates that we are connected to the internet and then API requests fail because they cannot connect to our API. This breaks the boot up process. Maybe there is another place to address the issue with connecting to our API but if our code says we are connected to the internet, we'd better be connected to the internet.

@forslund

This comment has been minimized.

Copy link
Member

commented Sep 16, 2019

What's the use of not falsely getting a connected state?

People trying out core on a network with a captive portal shouldn't get the impression that they're connected when in fact they're just connected to the captive portal. It was originally prepared for the handling of the captive portal case...which is still on the todo list...

I've had use for it several times when working on the road.

The endpoint was suggested a couple of years back and I'm not at all against it.

chrisveilleux and others added 2 commits Sep 13, 2019
Change logic that checked for an internet connection after an issue occurred during initial boot where the old connection check gave a false positive.

Reordered imports to be PEP8 compliant. Minor refactoring to remove issues identified in PyCharm
This handles capture portals as well.

The standard connection logic is now

Check outside ip is reachable and after that check that www.google.com
is resolvable and connectable.

TODO: create endpoint on backend with known response to perform check
upon.
@forslund forslund force-pushed the bug/false-internet-connection branch from bea7a83 to 29dbbe1 Sep 17, 2019
@forslund

This comment has been minimized.

Copy link
Member

commented Sep 17, 2019

Merging this with the addition of checking the connect to DNS as well. The time for the connection to DNS server is insignificant compared to the time the google check takes.

We need to revisit this when we have a dedicated endpoint to check this.

@forslund forslund merged commit 7ba1da5 into dev Sep 17, 2019
3 checks passed
3 checks passed
:-) Build finished.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.01%) to 53.252%
Details
@forslund forslund deleted the bug/false-internet-connection branch Sep 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.