Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: run unit tests without networking #6932

Closed
wants to merge 5 commits into from

Conversation

zyga
Copy link
Collaborator

@zyga zyga commented May 29, 2019

While preparing the 2.39.1 package update for openSUSE I noticed that
some unit tests fail in non-obvious ways. Some further debugging and
code changes uncovered that the tests are attempting to talk to
api.snapcraft.io

As a quick way to see what is broken, run all unit tests with the
network namespace unshared. This gives us a system without any network
interfaces, apart from lo, where any attempt to use TCP/IP or even
plain DNS will fail.

Currently there are two test failures:

FAIL: retry_test.go:406: retrySuite.TestRetryDoesNotFailForPermanentDNSErrors

retry_test.go:424:
    // we try exactly once, a non-existing server is a permanent error
    c.Assert(n, Equals, 1)
... obtained int = 5
... expected int = 1
FAIL: managers_test.go:3450: mgrsSuite.TestHappyDeviceRegistrationWithPrepareDeviceHook

managers_test.go:3512:
    c.Assert(err, IsNil)
... value *errors.errorString = &errors.errorString{s:"Settle is not converging"} ("Settle is not converging")

The branch also contains fixes for said issues, it was confirmed to pass in "osc build"

Signed-off-by: Zygmunt Krynicki me@zygoon.pl

@zyga zyga force-pushed the fix/tests-using-network branch from e36f060 to 5b8876b Compare May 29, 2019 15:04
tests/unit/go/task.yaml Outdated Show resolved Hide resolved
@zyga zyga force-pushed the fix/tests-using-network branch 2 times, most recently from 40d2fa5 to 3d47147 Compare May 29, 2019 15:09
While preparing the 2.39.1 package update for openSUSE I noticed that
some unit tests fail in non-obvious ways. Some further debugging and
code changes uncovered that the tests are attempting to talk to
api.snapcraft.io

As a quick way to see what is broken, run all unit tests with the
network namespace unshared. This gives us a system without any network
interfaces, where any attempt to use TCP/IP or even plain DNS will fail.

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
@zyga zyga force-pushed the fix/tests-using-network branch from 3d47147 to bdb8c28 Compare May 29, 2019 15:18
@pedronis
Copy link
Collaborator

This should cover the 2nd one I think:

diff --git a/overlord/managers_test.go b/overlord/managers_test.go
index e95c4c059..afc594544 100644
--- a/overlord/managers_test.go
+++ b/overlord/managers_test.go
@@ -3448,6 +3448,10 @@ version: 2.0`
 }
 
 func (ms *mgrsSuite) TestHappyDeviceRegistrationWithPrepareDeviceHook(c *C) {
+       // just to 404 locally eager account-key requests
+       mockStoreServer := ms.mockStore(c)
+       defer mockStoreServer.Close()
+
        model := ms.brands.Model("my-brand", "my-model", modelDefaults, map[string]interface{}{
                "gadget": "gadget",
        })

zyga added 2 commits May 30, 2019 12:20
The patch itself is from Samuele, this prevents the unit tests from
hitting the snapcraft web APIs. In certain environments unit tests run
without any network and using anything but local services is disallowed.

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
Running our test suite on an offline system, or more conveniently, in a
network namespace with only loopback device, I ran across this error:

    FAIL: retry_test.go:406: retrySuite.TestRetryDoesNotFailForPermanentDNSErrors

    retry_test.go:424:
	// we try exactly once, a non-existing server is a permanent error
	c.Assert(n, Equals, 1)
    ... obtained int = 5
    ... expected int = 1

The retry test would attempt to perform a network operation five times
even though the system has no chance of resolving any names because it
is offline.

Inspection of the returned errors indicates the following error chain:

    outer:  &url.Error{Op:"Get", URL:"http://nonexistingserver909123.com/", Err:(*net.OpError)(0xc000200140)}
    nested: &net.OpError{Op:"dial", Net:"tcp", Source:net.Addr(nil), Addr:net.Addr(nil), Err:(*net.DNSError)(0xc0001b8080)}
    nexted: &net.DNSError{Err:"dial udp 172.16.124.2:53: connect: network is unreachable", Name:"nonexistingserver909123.com", Server:"172.16.124.2:53", IsTimeout:false, IsTemporary:true}

Crucially the innermost error is marked as temporary.

The logic we have first checks if an error is temporary, and if so,
immediately schedules retry, and only then checks if the system is
offline or if DNS is unavailable.

I admit that the retry logic is complex but on a quick look swapping
those checks around is the right thigh to do. When we're offline there's
no point in retrying.

The test now passes and there are no regressions in any systems in
spread.

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
@zyga zyga marked this pull request as ready for review May 30, 2019 10:53
zyga added 2 commits May 30, 2019 13:36
When we detect if the DNS server is unavailable, in response to a HTTP
request error, we unpack the returned error object, hoping to reach
net.DNSError. When the error message contained therein contains the text

    Temporary failure in name resolution

We understand that this really conveys the fact we cannot talk to the
DNS sever. This error message, however, depends on the golang DNS
backend, either native or cgo. Similar check in ShouldRetryError
carries a separate test for another message:

    connection refused

When running unit tests inside the open build service the network setup
is such that we hit this case exactly, with the returned errors being:

    [   44s] nesting 2: &net.DNSError{Err:"read udp [::1]:58819->[::1]:53: read: connection refused", Name:"nonexistingserver909123.com", Server:"[::1]:53", IsTimeout:false, IsTemporary:true}

This patch extends the check to handle both situations.

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
GOPATH=/tmp/static-unit-tests ./run-checks --unit
SCRIPT
chmod +x script
unshare -n sh -c "ifconfig lo up && su -l -c $(pwd)/script test"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On 14.04:

+ unshare -n sh -c 'ifconfig lo up && su -l -c /home/gopath/src/github.com/snapcore/snapd/tests/unit/go/script test'
unshare: invalid option -- 'c'

This should probably be unshare -n -- sh -c ..

@@ -178,7 +178,7 @@ func isDnsUnavailable(err error) bool {
// not exposed in net.DNSError and in go-1.10 it is not even
// a temporary error so there is no way to distiguish it other
// than a fugly string compare on a (potentially) localized string
return strings.Contains(dnsErr.Err, "Temporary failure in name resolution")
return strings.Contains(dnsErr.Err, "connection refused") || strings.Contains(dnsErr.Err, "Temporary failure in name resolution")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change we now have the exact same conditional as the one inside shouldRetryError(); that means that with the re-ordered conditionals below we now treat this as a parmanent error and not an error that should be retried - therefore the network-retry spread test now fails. I'm investigating further how to untangle this, we basically need to make our mind here..

@pedronis pedronis self-requested a review May 31, 2019 10:42
@zyga
Copy link
Collaborator Author

zyga commented Jun 3, 2019

I agree this requires more work. I think the unshared -n tests and the fix from @pedronis are low hanging fruit here. I'm more worried that the retry logic is going to drag on and this won't be mergeable for a while.

@zyga
Copy link
Collaborator Author

zyga commented Jun 5, 2019

I think this can be closed, I will reopen with just the networking tweak to tests after Paweł's fix from #6949 is merged.

@zyga zyga closed this Jun 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants