New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tests: run unit tests without networking #6932
Conversation
e36f060
to
5b8876b
Compare
40d2fa5
to
3d47147
Compare
While preparing the 2.39.1 package update for openSUSE I noticed that some unit tests fail in non-obvious ways. Some further debugging and code changes uncovered that the tests are attempting to talk to api.snapcraft.io As a quick way to see what is broken, run all unit tests with the network namespace unshared. This gives us a system without any network interfaces, where any attempt to use TCP/IP or even plain DNS will fail. Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
3d47147
to
bdb8c28
Compare
This should cover the 2nd one I think:
|
The patch itself is from Samuele, this prevents the unit tests from hitting the snapcraft web APIs. In certain environments unit tests run without any network and using anything but local services is disallowed. Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
Running our test suite on an offline system, or more conveniently, in a network namespace with only loopback device, I ran across this error: FAIL: retry_test.go:406: retrySuite.TestRetryDoesNotFailForPermanentDNSErrors retry_test.go:424: // we try exactly once, a non-existing server is a permanent error c.Assert(n, Equals, 1) ... obtained int = 5 ... expected int = 1 The retry test would attempt to perform a network operation five times even though the system has no chance of resolving any names because it is offline. Inspection of the returned errors indicates the following error chain: outer: &url.Error{Op:"Get", URL:"http://nonexistingserver909123.com/", Err:(*net.OpError)(0xc000200140)} nested: &net.OpError{Op:"dial", Net:"tcp", Source:net.Addr(nil), Addr:net.Addr(nil), Err:(*net.DNSError)(0xc0001b8080)} nexted: &net.DNSError{Err:"dial udp 172.16.124.2:53: connect: network is unreachable", Name:"nonexistingserver909123.com", Server:"172.16.124.2:53", IsTimeout:false, IsTemporary:true} Crucially the innermost error is marked as temporary. The logic we have first checks if an error is temporary, and if so, immediately schedules retry, and only then checks if the system is offline or if DNS is unavailable. I admit that the retry logic is complex but on a quick look swapping those checks around is the right thigh to do. When we're offline there's no point in retrying. The test now passes and there are no regressions in any systems in spread. Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
When we detect if the DNS server is unavailable, in response to a HTTP request error, we unpack the returned error object, hoping to reach net.DNSError. When the error message contained therein contains the text Temporary failure in name resolution We understand that this really conveys the fact we cannot talk to the DNS sever. This error message, however, depends on the golang DNS backend, either native or cgo. Similar check in ShouldRetryError carries a separate test for another message: connection refused When running unit tests inside the open build service the network setup is such that we hit this case exactly, with the returned errors being: [ 44s] nesting 2: &net.DNSError{Err:"read udp [::1]:58819->[::1]:53: read: connection refused", Name:"nonexistingserver909123.com", Server:"[::1]:53", IsTimeout:false, IsTemporary:true} This patch extends the check to handle both situations. Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
GOPATH=/tmp/static-unit-tests ./run-checks --unit | ||
SCRIPT | ||
chmod +x script | ||
unshare -n sh -c "ifconfig lo up && su -l -c $(pwd)/script test" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On 14.04:
+ unshare -n sh -c 'ifconfig lo up && su -l -c /home/gopath/src/github.com/snapcore/snapd/tests/unit/go/script test'
unshare: invalid option -- 'c'
This should probably be unshare -n -- sh -c ..
@@ -178,7 +178,7 @@ func isDnsUnavailable(err error) bool { | |||
// not exposed in net.DNSError and in go-1.10 it is not even | |||
// a temporary error so there is no way to distiguish it other | |||
// than a fugly string compare on a (potentially) localized string | |||
return strings.Contains(dnsErr.Err, "Temporary failure in name resolution") | |||
return strings.Contains(dnsErr.Err, "connection refused") || strings.Contains(dnsErr.Err, "Temporary failure in name resolution") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this change we now have the exact same conditional as the one inside shouldRetryError()
; that means that with the re-ordered conditionals below we now treat this as a parmanent error and not an error that should be retried - therefore the network-retry spread test now fails. I'm investigating further how to untangle this, we basically need to make our mind here..
I agree this requires more work. I think the unshared -n tests and the fix from @pedronis are low hanging fruit here. I'm more worried that the retry logic is going to drag on and this won't be mergeable for a while. |
I think this can be closed, I will reopen with just the networking tweak to tests after Paweł's fix from #6949 is merged. |
While preparing the 2.39.1 package update for openSUSE I noticed that
some unit tests fail in non-obvious ways. Some further debugging and
code changes uncovered that the tests are attempting to talk to
api.snapcraft.io
As a quick way to see what is broken, run all unit tests with the
network namespace unshared. This gives us a system without any network
interfaces, apart from lo, where any attempt to use TCP/IP or even
plain DNS will fail.
Currently there are two test failures:
The branch also contains fixes for said issues, it was confirmed to pass in "osc build"
Signed-off-by: Zygmunt Krynicki me@zygoon.pl