net: TestLookupPTR failures on windows-amd64-longtest builder #38111

Open
bcmills opened this issue Mar 27, 2020 · 13 comments
Labels: Builders, NeedsInvestigation, OS-Windows

@bcmills
Member

bcmills commented Mar 27, 2020

#!watchflakes
post <- builder == "windows-amd64-longtest" && pkg == "net" && test == "TestLookupPTR"

It's not obvious to me whether the test is flaky, the builder has a flaky network connection, or the test is detecting something that is actually wrong.

CC @mikioh @bradfitz @ianlancetaylor (for net), @dmitshur, @toothrot, @cagedmantis, @andybons (for builders)

2020-03-26T17:48:20-918d4d4/windows-amd64-longtest
2020-03-06T14:41:22-2b0f481/windows-amd64-longtest
2020-01-09T18:00:06-8ac98e7/windows-amd64-longtest
2019-12-30T09:34:53-4b43700/windows-amd64-longtest
2019-11-13T08:07:51-bf5f641/windows-amd64-longtest

bcmills added the Builders, NeedsInvestigation, and OS-Windows labels Mar 27, 2020
bcmills added this to the Backlog milestone Mar 27, 2020
@bcmills
Member Author

bcmills commented Mar 27, 2020

There are at least two different failure modes:

--- FAIL: TestLookupPTR (3.92s)
    lookup_windows_test.go:167: skipping failed lookup 1.1.1.1 test: exit status 1: 
        Pinging one.one.one.one [1.1.1.1] with 32 bytes of data:
        Request timed out.
        
        Ping statistics for 1.1.1.1:
            Packets: Sent = 1, Received = 0, Lost = 1 (100% loss),
    lookup_windows_test.go:172: different results 1.1.1.1:	exp:null	got:["one.one.one.one."]
FAIL
FAIL	net	66.044s
--- FAIL: TestLookupPTR (0.07s)
    lookup_windows_test.go:160: failed 1.1.1.1: lookup 1.1.1.1: dnsquery: DNS server failure.
    lookup_windows_test.go:163: no results
    lookup_windows_test.go:172: different results 1.1.1.1:	exp:[]	got:null
FAIL
FAIL	net	57.951s

@bcmills
Member Author

bcmills commented Jun 15, 2022

Still ongoing at a low rate:

greplogs -l -e 'FAIL: TestLookupPTR .*(?:\n .*)*different results 1.1.1.1:\s+exp:null\s+got:\["one\.one\.one\.one\."\]' --since=2021-02-20
2021-11-18T02:16:39-353cb71/windows-amd64-longtest
2021-04-09T09:01:07-4d7d7a4/windows-amd64-longtest
2021-03-17T17:13:50-0bd308f/windows-amd64-longtest

greplogs -l -e 'FAIL: TestLookupPTR .*(?:\n .*)*DNS server failure' --since=2021-02-20
2022-06-13T19:06:49-4703546/windows-amd64-longtest
2022-06-13T16:53:11-7eeec1f/windows-amd64-longtest
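
The two greplogs patterns above can double as a classifier for the two failure modes. The sketch below assumes the patterns behave the same under Go's regexp package as they do in greplogs; `classify` and the sample logs (trimmed from the failures quoted earlier in this issue) are hypothetical and for illustration only.

```go
package main

import "regexp"

var (
	// pingFlake matches the mode where ping failed but LookupAddr succeeded.
	pingFlake = regexp.MustCompile(`FAIL: TestLookupPTR .*(?:\n .*)*different results 1.1.1.1:\s+exp:null\s+got:\["one\.one\.one\.one\."\]`)
	// dnsFailure matches the mode where the lookup itself failed.
	dnsFailure = regexp.MustCompile(`FAIL: TestLookupPTR .*(?:\n .*)*DNS server failure`)
)

// Trimmed sample logs; the indentation matters because the patterns
// require every continuation line to start with a space.
const pingLog = "--- FAIL: TestLookupPTR (3.92s)\n" +
	"    lookup_windows_test.go:167: skipping failed lookup 1.1.1.1 test: exit status 1: \n" +
	"        Request timed out.\n" +
	"    lookup_windows_test.go:172: different results 1.1.1.1:\texp:null\tgot:[\"one.one.one.one.\"]\n"

const dnsLog = "--- FAIL: TestLookupPTR (0.07s)\n" +
	"    lookup_windows_test.go:160: failed 1.1.1.1: lookup 1.1.1.1: dnsquery: DNS server failure.\n" +
	"    lookup_windows_test.go:163: no results\n" +
	"    lookup_windows_test.go:172: different results 1.1.1.1:\texp:[]\tgot:null\n"

// classify reports which failure mode, if any, a log excerpt shows.
func classify(log string) string {
	switch {
	case pingFlake.MatchString(log):
		return "ping-flake"
	case dnsFailure.MatchString(log):
		return "dns-server-failure"
	}
	return ""
}
```

Calling `classify(pingLog)` and `classify(dnsLog)` separates the two modes; a log that matches neither pattern yields the empty string.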

@bcmills
Member Author

bcmills commented Jun 27, 2022

One more this week. The mode=go line in the failure log makes me wonder whether this recent spate of failures is related to #33097, although this most recent failure is after CL 413458.

@ianlancetaylor, @bradfitz: could you double-check whether something relating to the PreferGo and netdns=go changes in 1.19 could be causing an increase in TestLookupPTR failures?

greplogs -l -e 'FAIL: TestLookupPTR .*(?:\n .*)*DNS server failure' --since=2022-06-14
2022-06-22T23:27:17-2e773a3/windows-amd64-longtest

@ianlancetaylor
Contributor

ianlancetaylor commented Jun 27, 2022

Note that the mode=go appears on a likely unrelated failure of TestLookupDotsWithRemoteSource. It does not appear in a failure of TestLookupPTR. It's just that both tests happened to fail in that run.

@ianlancetaylor
Contributor

ianlancetaylor commented Jun 27, 2022

Most of the errors are of the form

--- FAIL: TestLookupPTR (3.90s)
    lookup_windows_test.go:167: skipping failed lookup 1.1.1.1 test: exit status 1: 
        Pinging one.one.one.one [1.1.1.1] with 32 bytes of data:
        Request timed out.
        
        Ping statistics for 1.1.1.1:
            Packets: Sent = 1, Received = 0, Lost = 1 (100% loss),
    lookup_windows_test.go:172: different results 1.1.1.1:	exp:null	got:["one.one.one.one."]

The test actually runs the ping program. If ping fails, the test logs "skipping failed lookup" but doesn't actually skip: it goes on to compare the results anyway and reports "different results". All of those errors are flakes.
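
A minimal sketch of the pattern described above, not the actual code in lookup_windows_test.go: a loop that logs "skipping" when ping fails, yet still falls through to the comparison unless it also executes `continue`. The names `compareAll`, `flakyPing`, and `reallySkip` are hypothetical and exist only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// flakyPing is a fake ping that fails for 1.1.1.1 only,
// imitating the "Request timed out" case from the logs.
func flakyPing(addr string) error {
	if addr == "1.1.1.1" {
		return errors.New("exit status 1")
	}
	return nil
}

// compareAll returns the addresses whose results would be compared.
// With reallySkip == false it reproduces the bug: the message says
// "skipping" but the address is compared anyway.
func compareAll(addrs []string, ping func(string) error, reallySkip bool) []string {
	var compared []string
	for _, addr := range addrs {
		if err := ping(addr); err != nil {
			fmt.Printf("skipping failed lookup %s test: %v\n", addr, err)
			if reallySkip {
				continue // the missing step: actually skip this address
			}
		}
		compared = append(compared, addr)
	}
	return compared
}
```

With `reallySkip` set to false, 1.1.1.1 is still compared even though the log line says it was skipped; with true, only 8.8.8.8 reaches the comparison.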

There are also some errors in which the DnsQuery Windows function fails with DNS_ERROR_RCODE_SERVER_FAILURE. The documentation for that error is "DNS server failure." I don't know what is going on there. As far as I can tell the code that is failing is unchanged since 1.18. This case only happens when using netdns=cgo, the default in 1.18 and earlier on Windows.

@gopherbot

gopherbot commented Jun 27, 2022

Change https://go.dev/cl/414634 mentions this issue: net: really skip Windows TestLookupPTR test if we say we are skipping it

gopherbot pushed a commit that referenced this issue Jun 27, 2022
For #38111

Change-Id: I2651687367af68ee070ea91106f4bc18adab2762
Reviewed-on: https://go-review.googlesource.com/c/go/+/414634
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@bcmills
Member Author

bcmills commented Sep 2, 2022

https://storage.googleapis.com/go-build-log/a47db80f/windows-amd64-longtest_2ecbeb5f.log:

--- FAIL: TestLookupPTR (0.08s)
    lookup_windows_test.go:174: failed 1.1.1.1: lookup 1.1.1.1: dnsquery: DNS server failure.
    lookup_windows_test.go:177: no results
    lookup_windows_test.go:187: different results 1.1.1.1:	exp:[]	got:null
FAIL
FAIL	net	47.405s

@bcmills
Member Author

bcmills commented Sep 2, 2022

(CC @golang/windows)

@gopherbot

gopherbot commented Sep 2, 2022

Change https://go.dev/cl/427915 mentions this issue: net: skip TestLookupPTR when LookupAddr fails with "DNS server failure"

gopherbot pushed a commit that referenced this issue Sep 3, 2022
For #38111.

Change-Id: I43bdd756bde0adcd156cf9750b49b3b989304df7
Reviewed-on: https://go-review.googlesource.com/c/go/+/427915
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
@mknyszek
Contributor

mknyszek commented Sep 6, 2022

Another recent flake:

2022-09-02T02:09:03-a31c062/windows-amd64-longtest

@gopherbot

gopherbot commented Sep 21, 2022

Found new dashboard test flakes for:

#!watchflakes
post <- builder == "windows-amd64-longtest" && pkg == "net" && test == "TestLookupPTR"
2022-07-29 22:09 windows-amd64-longtest go@9a2001a8 net.TestLookupPTR (log)
--- FAIL: TestLookupPTR (0.10s)
    lookup_windows_test.go:174: failed 1.1.1.1: lookup 1.1.1.1: dnsquery: DNS server failure.
    lookup_windows_test.go:177: no results
    lookup_windows_test.go:187: different results 1.1.1.1:	exp:[]	got:null
2022-08-17 16:26 windows-amd64-longtest go@014f0e82 net.TestLookupPTR (log)
--- FAIL: TestLookupPTR (0.09s)
    lookup_windows_test.go:174: failed 1.1.1.1: lookup 1.1.1.1: dnsquery: DNS server failure.
    lookup_windows_test.go:177: no results
    lookup_windows_test.go:187: different results 1.1.1.1:	exp:[]	got:null
2022-08-26 18:28 windows-amd64-longtest go@bf812b32 net.TestLookupPTR (log)
--- FAIL: TestLookupPTR (0.08s)
    lookup_windows_test.go:160: failed 1.1.1.1: lookup 1.1.1.1: dnsquery: DNS server failure.
    lookup_windows_test.go:163: no results
    lookup_windows_test.go:172: different results 1.1.1.1:	exp:[]	got:null
2022-08-31 15:30 windows-amd64-longtest go@d01200e7 net.TestLookupPTR (log)
--- FAIL: TestLookupPTR (0.09s)
    lookup_windows_test.go:174: failed 1.1.1.1: lookup 1.1.1.1: dnsquery: DNS server failure.
    lookup_windows_test.go:177: no results
    lookup_windows_test.go:187: different results 1.1.1.1:	exp:[]	got:null
2022-08-31 16:59 windows-amd64-longtest go@0d6a7f9d net.TestLookupPTR (log)
--- FAIL: TestLookupPTR (0.18s)
    lookup_windows_test.go:174: failed 1.1.1.1: lookup 1.1.1.1: dnsquery: DNS server failure.
    lookup_windows_test.go:177: no results
    lookup_windows_test.go:187: different results 1.1.1.1:	exp:[]	got:null
2022-09-02 02:09 windows-amd64-longtest go@a31c062c net.TestLookupPTR (log)
--- FAIL: TestLookupPTR (0.06s)
    lookup_windows_test.go:174: failed 8.8.8.8: lookup 8.8.8.8: dnsquery: DNS server failure.
    lookup_windows_test.go:177: no results
    lookup_windows_test.go:187: different results 8.8.8.8:	exp:[]	got:null

watchflakes
