cmd/go: occasional network and server errors in tests that connect to rsc.io #49954

Open
bcmills opened this issue Dec 3, 2021 · 10 comments
Labels
NeedsInvestigation: Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Testing: An issue that has been verified to require only test changes, not just a test failure.

@bcmills
Member

bcmills commented Dec 3, 2021

#!watchflakes
post <- pkg ~ `cmd/go` && `(reading |Get ")https://(swtch.com|rsc.io).*(connection reset by peer|network is unreachable|Internal Server Error)`

greplogs --dashboard -md -l -e 'Get "https://rsc\.io/.*read: connection reset by peer' --since=2021-01-01

2021-12-01T10:19:34-0e1d553/linux-386-longtest
2021-11-24T20:51:25-5d8c49a/linux-386-longtest
2021-11-16T17:13:33-29ec902/linux-386-longtest
2021-11-15T19:24:28-9265558/linux-386-longtest
2021-09-30T18:10:18-205640e/linux-amd64-longtest
2021-02-24T19:25:49-bf48163/linux-386-longtest
2021-02-23T21:49:20-0458d8c/linux-amd64-longtest
2021-02-13T15:15:13-66c2709/linux-amd64-longtest
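
As a rough illustration of what the watchflakes rule above selects, the sketch below compiles the same regular expression with Go's `regexp` package and tests it against log lines. Using `regexp` directly, and the helper name `matchesFlake`, are assumptions made for illustration; the first two sample lines are taken from failures reported later in this thread, and the `example.com` line is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// flakePattern is the regular expression from the watchflakes rule,
// copied verbatim. Compiling it with Go's regexp package is an
// assumption made for this sketch.
var flakePattern = regexp.MustCompile(`(reading |Get ")https://(swtch.com|rsc.io).*(connection reset by peer|network is unreachable|Internal Server Error)`)

// matchesFlake (hypothetical helper) reports whether a log line
// matches the pattern for this issue.
func matchesFlake(line string) bool {
	return flakePattern.MatchString(line)
}

func main() {
	lines := []string{
		// From the govcs failure reported in this thread.
		`Get "https://rsc.io/sampler?go-get=1": read tcp 10.128.1.43:53588->216.239.32.21:443: read: connection reset by peer`,
		// From the TestCodeRepo failure reported in this thread.
		`reading https://swtch.com/testmod?go-get=1: 500 Internal Server Error`,
		// Hypothetical line for a host the rule does not cover.
		`Get "https://example.com/x?go-get=1": read: connection reset by peer`,
	}
	for _, line := range lines {
		fmt.Println(matchesFlake(line), line)
	}
}
```

The first two lines match and the third does not, because the pattern only accepts `swtch.com` or `rsc.io` immediately after `https://`.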

@bcmills bcmills added the NeedsInvestigation and Testing labels Dec 3, 2021
@bcmills bcmills added this to the Backlog milestone Dec 3, 2021
@rsc
Contributor

rsc commented Jan 10, 2022

rsc.io is just about the most trivial App Engine server ever. Except for App Engine taking it down for being too old a Go version, it's never had any appreciable downtime. It's possible that App Engine really is hanging up on the connection, but it seems equally possible that the VM is hanging up on the connection. I wonder if we should respond to ECONNRESET by trying one more time?

@bcmills
Member Author

bcmills commented Jan 11, 2022

> It's possible that App Engine really is hanging up on the connection, but it seems equally possible that the VM is hanging up on the connection. I wonder if we should respond to ECONNRESET by trying one more time?

It's curious that this failure mode happens so much more frequently for rsc.io than for other services (like gopkg.in, golang.org/x, or vcs-test.golang.org).

Scanning the failures over the last year:

greplogs --dashboard -md -l -e 'https://.*: connection reset by peer(?:.*\n)*FAIL\s+cmd/go' --since=2021-01-01

2021-12-08T15:30:52-9fe77de/linux-386-longtest (rsc.io)
2021-12-01T10:19:34-0e1d553/linux-386-longtest (rsc.io)
2021-11-24T20:51:25-5d8c49a/linux-386-longtest (rsc.io)
2021-11-16T17:13:33-29ec902/linux-386-longtest (rsc.io)
2021-11-15T19:24:28-9265558/linux-386-longtest (rsc.io)
2021-09-30T18:10:18-205640e/linux-amd64-longtest (rsc.io)
2021-09-02T13:45:48-90ed541/linux-386-longtest (github.com, and gopkg.in reporting its own error connecting to github.com, both coinciding with https://www.githubstatus.com/incidents/lfvwm5qydw6r)
2021-02-24T19:25:49-bf48163/linux-386-longtest (rsc.io)
2021-02-23T21:49:20-0458d8c/linux-amd64-longtest (rsc.io)
2021-02-13T15:15:13-66c2709/linux-amd64-longtest (rsc.io)

To me that suggests a problem specific to the App Engine service, perhaps either a bug in the App Engine version of the Go runtime, or flakiness in some network infrastructure sitting in front of the service.
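
The per-host annotation in the list above could be approximated mechanically. The sketch below is an assumption-laden illustration rather than how greplogs works: it pulls the first `https://` host out of each error line and tallies the hosts. The names `hostPattern` and `countHosts` are hypothetical, and the sample lines are copied from failures reported in this thread.

```go
package main

import (
	"fmt"
	"regexp"
)

// hostPattern captures the host name that follows "https://".
var hostPattern = regexp.MustCompile(`https://([A-Za-z0-9.-]+)`)

// countHosts (hypothetical helper) tallies the first https host
// found on each line; lines without an https URL are ignored.
func countHosts(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		if m := hostPattern.FindStringSubmatch(line); m != nil {
			counts[m[1]]++
		}
	}
	return counts
}

func main() {
	lines := []string{
		`go: unrecognized import path "rsc.io/sampler": https fetch: Get "https://rsc.io/sampler?go-get=1": read tcp 10.128.1.43:53588->216.239.32.21:443: read: connection reset by peer`,
		`unrecognized import path "rsc.io/quote": https fetch: Get "https://rsc.io/quote?go-get=1": read tcp 10.128.1.86:37196->216.239.34.21:443: read: connection reset by peer`,
		`unrecognized import path "swtch.com/testmod": reading https://swtch.com/testmod?go-get=1: 500 Internal Server Error`,
	}
	counts := countHosts(lines)
	fmt.Println("rsc.io:", counts["rsc.io"])
	fmt.Println("swtch.com:", counts["swtch.com"])
}
```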

@bcmills
Member Author

bcmills commented May 6, 2022

greplogs -l -e 'https://.*: connection reset by peer(?:.*\n)*FAIL\s+cmd/go' --since=2021-12-09
2022-05-05T18:17:08-7b1f8b6/linux-amd64-longtest (rsc.io, but with a failed git pull to github.com/rsc/pdfxxx in another test)
2022-04-26T22:08:00-79db59d/linux-386-longtest (rsc.io)
2022-03-30T18:02:38-a7e76b8/linux-386-longtest (rsc.io)
2022-03-28T16:29:15-b10164b/linux-386-longtest (rsc.io)
2022-02-18T18:46:44-d93cc8c/linux-386-longtest (rsc.io)
2022-02-09T21:52:21-2e2ef31/linux-386-longtest (rsc.io)
2022-01-26T17:03:32-0f4bd92/linux-amd64-longtest (swtch.com)
2022-01-24T17:07:30-f88c3b9/linux-386-longtest (rsc.io)

@heschi
Contributor

heschi commented Jul 6, 2022

@bcmills
Member Author

bcmills commented Sep 14, 2022

@gopherbot

gopherbot commented Sep 20, 2022

Found new matching dashboard test flakes for:

#!watchflakes
post <- pkg ~ `cmd/go` && `Get "https://(swtch.com|rsc.io).*(connection reset by peer|network is unreachable)`
2022-08-22 23:36 linux-386-longtest go@0f42e35f cmd/go.TestScript (log)
go test proxy running at GOPROXY=http://127.0.0.1:33351/mod
--- FAIL: TestScript (0.01s)
    --- FAIL: TestScript/govcs (10.53s)
        script_test.go:282: 
            # (2022-08-23T00:09:02Z)
            # GOVCS stops go get (0.057s)
            # public pattern works (0.012s)
            # private pattern works (0.030s)
            # other patterns work (for more patterns, see TestGOVCS) (0.035s)
            # bad patterns are reported (for more bad patterns, see TestGOVCSErrors) (0.064s)
            # bad GOVCS patterns do not stop commands that do not need to check VCS (0.117s)
            # svn is disallowed by default (0.032s)
            # fossil is disallowed by default (0.041s)
            # bzr is disallowed by default (0.052s)
            # git is OK by default (10.088s)
            > env GOVCS=
            > env GONOSUMDB='*'
            > [net] [exec:git] [!short] go get rsc.io/sampler
            [stderr]
            go: unrecognized import path "rsc.io/sampler": https fetch: Get "https://rsc.io/sampler?go-get=1": read tcp 10.128.1.43:53588->216.239.32.21:443: read: connection reset by peer
            [exit status 1]
            FAIL: testdata/script/govcs.txt:70: unexpected command failure
2022-08-29 19:17 linux-amd64-longtest go@37cedd26 cmd/go.TestScript (log)
go test proxy running at GOPROXY=http://127.0.0.1:44621/mod
--- FAIL: TestScript (0.01s)
    --- FAIL: TestScript/mod_sumdb_golang (37.22s)
        script_test.go:259: 
            # Test default GOPROXY and GOSUMDB (0.214s)
            # Download direct from github. (20.030s)
            # Download from proxy.golang.org with go.sum entry already.
            # Use 'go list' instead of 'go get' since the latter may download extra go.mod
            # files not listed in go.sum. (2.045s)
            # Download again.
...
            > stderr github.com/rsc
            > go list -mod=mod -x rsc.io/quote  # Download module source.
            [stderr]
            go: downloading rsc.io/quote v1.5.2
            # get https://proxy.golang.org/rsc.io/quote/@v/v1.5.2.zip: 404 testing (0.000s)
            # get https://rsc.io/quote?go-get=1
            # get https://rsc.io/quote?go-get=1: Get "https://rsc.io/quote?go-get=1": read tcp 10.128.1.86:37196->216.239.34.21:443: read: connection reset by peer
            unrecognized import path "rsc.io/quote": https fetch: Get "https://rsc.io/quote?go-get=1": read tcp 10.128.1.86:37196->216.239.34.21:443: read: connection reset by peer
            [exit status 1]
            FAIL: testdata/script/mod_sumdb_golang.txt:74: unexpected command failure
2022-09-06 21:53 linux-386-longtest go@86f8b8d3 cmd/go.TestIssue11457 (log)
go test proxy running at GOPROXY=http://127.0.0.1:38493/mod
--- FAIL: TestIssue11457 (0.19s)
    go_test.go:1004: running testgo [get -d -u rsc.io/go-get-issue-11457]
    go_test.go:1004: standard error:
    go_test.go:1004: package rsc.io/go-get-issue-11457: unrecognized import path "rsc.io/go-get-issue-11457": https fetch: Get "https://rsc.io/go-get-issue-11457?go-get=1": dial tcp [2001:4860:4802:32::15]:443: connect: network is unreachable

    go_test.go:1004: go [get -d -u rsc.io/go-get-issue-11457] failed unexpectedly in /workdir/go/src/cmd/go: exit status 1
2022-09-07 05:37 linux-386-longtest go@c011270f cmd/go.TestScript (log)
go test proxy running at GOPROXY=http://127.0.0.1:33035/mod
--- FAIL: TestScript (0.01s)
    --- FAIL: TestScript/govcs (0.38s)
        script_test.go:282: 
            # (2022-09-07T06:09:41Z)
            # GOVCS stops go get (0.048s)
            # public pattern works (0.015s)
            # private pattern works (0.007s)
            # other patterns work (for more patterns, see TestGOVCS) (0.026s)
            # bad patterns are reported (for more bad patterns, see TestGOVCSErrors) (0.035s)
            # bad GOVCS patterns do not stop commands that do not need to check VCS (0.124s)
            # svn is disallowed by default (0.011s)
            # fossil is disallowed by default (0.007s)
            # bzr is disallowed by default (0.006s)
            # git is OK by default (0.097s)
            > env GOVCS=
            > env GONOSUMDB='*'
            > [net] [exec:git] [!short] go get rsc.io/sampler
            [stderr]
            go: unrecognized import path "rsc.io/sampler": https fetch: Get "https://rsc.io/sampler?go-get=1": dial tcp [2001:4860:4802:32::15]:443: connect: network is unreachable
            [exit status 1]
            FAIL: testdata/script/govcs.txt:70: unexpected command failure
2022-09-15 20:02 linux-amd64-longtest go@a2973914 cmd/go.TestScript (log)
go test proxy running at GOPROXY=http://127.0.0.1:45665/mod
--- FAIL: TestScript (0.01s)
    --- FAIL: TestScript/govcs (0.46s)
        script_test.go:282: 
            # (2022-09-15T20:25:50Z)
            # GOVCS stops go get (0.023s)
            # public pattern works (0.025s)
            # private pattern works (0.015s)
            # other patterns work (for more patterns, see TestGOVCS) (0.049s)
            # bad patterns are reported (for more bad patterns, see TestGOVCSErrors) (0.023s)
            # bad GOVCS patterns do not stop commands that do not need to check VCS (0.074s)
            # svn is disallowed by default (0.021s)
            # fossil is disallowed by default (0.016s)
            # bzr is disallowed by default (0.017s)
            # git is OK by default (0.189s)
            > env GOVCS=
            > env GONOSUMDB='*'
            > [net] [exec:git] [!short] go get rsc.io/sampler
            [stderr]
            go: unrecognized import path "rsc.io/sampler": https fetch: Get "https://rsc.io/sampler?go-get=1": dial tcp [2001:4860:4802:38::15]:443: connect: network is unreachable
            [exit status 1]
            FAIL: testdata/script/govcs.txt:70: unexpected command failure

watchflakes

@gopherbot

gopherbot commented Sep 23, 2022

Found new dashboard test flakes for:

#!watchflakes
post <- pkg ~ `cmd/go` && `Get "https://(swtch.com|rsc.io).*(connection reset by peer|network is unreachable)`
2022-09-23 01:07 linux-386-longtest go@c2ede92a cmd/go.TestScript (log)
go test proxy running at GOPROXY=http://127.0.0.1:44891/mod
--- FAIL: TestScript (0.01s)
    --- FAIL: TestScript/govcs (1.25s)
        script_test.go:282: 
            # (2022-09-23T02:39:21Z)
            # GOVCS stops go get (0.153s)
            # public pattern works (0.041s)
            # private pattern works (0.045s)
            # other patterns work (for more patterns, see TestGOVCS) (0.130s)
            # bad patterns are reported (for more bad patterns, see TestGOVCSErrors) (0.185s)
            # bad GOVCS patterns do not stop commands that do not need to check VCS (0.333s)
            # svn is disallowed by default (0.044s)
            # fossil is disallowed by default (0.046s)
            # bzr is disallowed by default (0.063s)
            # git is OK by default (0.209s)
            > env GOVCS=
            > env GONOSUMDB='*'
            > [net] [exec:git] [!short] go get rsc.io/sampler
            [stderr]
            go: unrecognized import path "rsc.io/sampler": https fetch: Get "https://rsc.io/sampler?go-get=1": dial tcp [2001:4860:4802:36::15]:443: connect: network is unreachable
            [exit status 1]
            FAIL: testdata/script/govcs.txt:70: unexpected command failure

watchflakes

@bcmills bcmills changed the title from 'cmd/go: occasional "connection reset by peer" errors in tests that connect to rsc.io' to 'cmd/go: occasional network and server errors in tests that connect to rsc.io' Sep 23, 2022
@gopherbot

gopherbot commented Sep 23, 2022

Found new dashboard test flakes for:

#!watchflakes
post <- pkg ~ `cmd/go` && `(reading |Get ")https://(swtch.com|rsc.io).*(connection reset by peer|network is unreachable|Internal Server Error)`
2022-09-23 15:03 linux-amd64-longtest go@65deb9c3 cmd/go/internal/modfetch.TestCodeRepo (log)
--- FAIL: TestCodeRepo (0.00s)
    --- FAIL: TestCodeRepo/swtch.com_testmod/v1.0.0 (10.28s)
        coderepo_test.go:607: repo.Stat("v1.0.0"): unrecognized import path "swtch.com/testmod": reading https://swtch.com/testmod?go-get=1: 500 Internal Server Error
2022-09-23 15:03 linux-amd64-longtest go@65deb9c3 cmd/go/internal/modfetch.TestCodeRepoVersions (log)
--- FAIL: TestCodeRepoVersions (0.00s)
    --- FAIL: TestCodeRepoVersions/parallel (0.00s)
        --- FAIL: TestCodeRepoVersions/parallel/swtch.com_testmod (0.00s)
            coderepo_test.go:832: Versions(""): unrecognized import path "swtch.com/testmod": reading https://swtch.com/testmod?go-get=1: 500 Internal Server Error
2022-09-23 15:03 linux-amd64-longtest go@65deb9c3 cmd/go/internal/modfetch.TestLatest (log)
--- FAIL: TestLatest (0.65s)
    --- FAIL: TestLatest/parallel (0.00s)
        --- FAIL: TestLatest/parallel/swtch.com_testmod (0.00s)
            coderepo_test.go:913: Latest(): unrecognized import path "swtch.com/testmod": reading https://swtch.com/testmod?go-get=1: 500 Internal Server Error

watchflakes

@gopherbot

gopherbot commented Sep 27, 2022

Found new dashboard test flakes for:

#!watchflakes
post <- pkg ~ `cmd/go` && `(reading |Get ")https://(swtch.com|rsc.io).*(connection reset by peer|network is unreachable|Internal Server Error)`
2022-09-27 16:50 linux-amd64-longtest go@7d157fd0 cmd/go/internal/modfetch.TestCodeRepo (log)
--- FAIL: TestCodeRepo (0.00s)
    --- FAIL: TestCodeRepo/rsc.io_quote/v1.0.0 (0.09s)
        coderepo_test.go:607: repo.Stat("v1.0.0"): unrecognized import path "rsc.io/quote": https fetch: Get "https://rsc.io/quote?go-get=1": dial tcp [2001:4860:4802:36::15]:443: connect: network is unreachable

watchflakes

Projects
Status: Active
4 participants