Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: mod_lazy_new_import and mod_multirepo flaky failed on linux-wsl and linux-s390x builders #53269

Closed
mengzhuo opened this issue Jun 7, 2022 · 13 comments
Labels
FrozenDueToAge GoCommand cmd/go NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker
Milestone

Comments

@mengzhuo
Copy link
Contributor

mengzhuo commented Jun 7, 2022

Does this issue reproduce with the latest release?

maybe, it failed at 77d9252

What did you expect to see?

All test passed

What did you see instead?

https://build.golang.org/log/8f84b20723e5bdae8362924b3a103ced90a515c6
https://build.golang.org/log/87edf847a20c338f407a5183c1fd6c80e070555e

go test proxy running at GOPROXY=http://127.0.0.1:45001/mod
--- FAIL: TestScript (0.01s)
    --- FAIL: TestScript/mod_lazy_new_import (0.11s)
        script_test.go:272: 
            # (2022-06-07T15:13:22Z)
            # This test illustrates the use of a deepening scan to resolve transitive
            # imports of imports of new packages from within existing dependencies.
            # The package import graph used in this test looks like:
            #
            # lazy ---- a/x ---- b
            #     \
            #      ---- a/y (new) ---- c
            #
            # Where a/x and a/y are disjoint packages, but both contained in module a.
            #
            # The module dependency graph initially looks like:
            #
            # lazy ---- a.1 ---- b.1
            #              \
            #               c.1 (0.019s)
            # Before adding a new import, the go.mod file should
            # enumerate modules for all packages already imported. (0.016s)
            # When we add a new import of a package in an existing dependency,
            # and that dependency is already tidy, its transitive dependencies
            # should already be present. (0.047s)
            # Now, we repeat the test with a lazy main module. (0.000s)
            # Before adding a new import, the go.mod file should
            # enumerate modules for all packages already imported. (0.008s)
            # When a new import is found, we should perform a deepening scan of the existing
            # dependencies and add a requirement on the version required by those
            # dependencies — not re-resolve 'latest'. (0.015s)
            > cp lazy.go.new lazy.go
            > ! go list all
            [stdout]
            example.com/a/x
            example.com/b
            example.com/lazy
            FAIL: testdata/script/mod_lazy_new_import.txt:52: unexpected command success
            
@cherrymui cherrymui added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 7, 2022
@cherrymui cherrymui added this to the Backlog milestone Jun 7, 2022
@cherrymui
Copy link
Member

cc @bcmills

@cherrymui
Copy link
Member

cherrymui commented Jun 7, 2022

Also fails on the S390X builder
2022-06-07T17:36:56-81033fb/linux-s390x-ibm
2022-06-07T16:39:37-429a404/linux-s390x-ibm
2022-06-07T14:47:46-77d9252/linux-amd64-wsl
2022-06-06T15:42:19-47e34ca/linux-amd64-wsl
2022-06-04T18:42:07-8659114/linux-s390x-ibm
2022-06-03T20:15:20-162b882/linux-s390x-ibm

TestScript/mod_multirepo is also flaky on these two builders. Not sure if there is any connection.

go test proxy running at GOPROXY=http://127.0.0.1:38219/mod
--- FAIL: TestScript (0.01s)
    --- FAIL: TestScript/mod_multirepo (0.03s)
        script_test.go:272: 
            # (2022-06-07T17:04:43Z)
            # initial standalone module should use no downloaded modules (0.005s)
            # v2 import should use a downloaded module
            # both without an explicit go.mod entry ... (0.025s)
            > cp tmp/use_v2.go x.go
            > go get .
            > go list -deps -f {{.Dir}}
            [stdout]
            $WORK/gopath/src
            [stderr]
            x.go:2:8: no required module provides package rsc.io/quote/v2; to add it:
            	go get rsc.io/quote/v2
            [exit status 1]
            FAIL: testdata/script/mod_multirepo.txt:11: unexpected command failure
            
FAIL
FAIL	cmd/go	19.200s

2022-06-07T16:57:52-835a946/linux-s390x-ibm
2022-06-06T20:48:03-a71ca3d/linux-s390x-ibm
2022-06-06T19:53:39-c1e2ecb/linux-s390x-ibm
2022-06-06T19:03:15-95547ae/linux-s390x-ibm
2022-06-06T18:50:50-d43ddc1/linux-s390x-ibm
2022-06-06T18:24:19-07eca49/linux-s390x-ibm
2022-06-06T16:26:54-770146d/linux-s390x-ibm
2022-06-04T21:17:03-a32a592/linux-amd64-wsl
2022-06-03T20:16:45-54bd44e/linux-s390x-ibm

@jonathan-albrecht-ibm
Copy link
Contributor

I'm able to reproduce this on a local s390x machine. I've been running one of the failing tests go test -count=100 cmd/go -run=Script/^mod_multirepo$ and it fails about 2 - 4 times in 100 runs.

@matloob, it started failing soon after https://go-review.googlesource.com/c/go/+/403778. I changed GOINDEX=true -> GOINDEX=false in src/cmd/go/script_test.go and then did not see any failures in over 1000 runs. So maybe related?

The failure in the mod_multirepo test seems to be because the go get . doesn't download anything. When that command is successful it has output like:

... SNIP ...
        > go get .
        [stderr]
        go: downloading rsc.io/quote/v2 v2.0.1
... SNIP ...

I tried adding -v -x options, ie. go get -v -x ., to see if there was any output at all when it failed but there wasn't.

Not sure what to try next so happy to try any suggestions. I can also provide access to an s390x VM if it would help.

@cherrymui cherrymui modified the milestones: Backlog, Go1.19 Jun 8, 2022
@cherrymui
Copy link
Member

@jonathan-albrecht-ibm thanks for looking into this!

Tentatively mark as a release blocker, as this is new failure. @matloob and @bcmills can decide. Thanks.

@cherrymui cherrymui added the GoCommand cmd/go label Jun 8, 2022
@cherrymui cherrymui changed the title cmd/go: mod_lazy_new_import flaky failed on linux-wsl builder cmd/go: mod_lazy_new_import and mod_multirepo flaky failed on linux-wsl and linux-s390x builders Jun 8, 2022
@cherrymui cherrymui added the okay-after-beta1 Used by release team to mark a release-blocker issue as okay to resolve either before or after beta1 label Jun 8, 2022
@matloob
Copy link
Contributor

matloob commented Jun 8, 2022

Hi, access to a s390x VM would be super helpful, thanks!

@jonathan-albrecht-ibm
Copy link
Contributor

@matloob no problem. Please send me an email at the address on my profile page https://github.com/jonathan-albrecht-ibm and I'll send you the VM access details.

@gopherbot gopherbot removed the okay-after-beta1 Used by release team to mark a release-blocker issue as okay to resolve either before or after beta1 label Jun 10, 2022
@jonathan-albrecht-ibm
Copy link
Contributor

@matloob, I did a git bisect starting from before the first modindex changes were merged and got: ee87cd1 is the first bad commit.

I tested with the -testwork option and it when the test fails if I run go get . in the test tmp dir it keeps failing to get anything. But if I touch either of the x.go or go.mod src files, it will start working (ie. download the packages from the import in x.go).

I have tried debugging it a bit but haven't figured out exactly what's going on. I think its not building a new mod index in the failure case and some how using a default empty index. In the failure case, on modindex/read.go:133: data, _, err := cache.Default().GetMmap(id) doesn't return an error so data, err = indexModule(modroot) doesn't get called so there's nothing to do.

Hope that helps

@matloob
Copy link
Contributor

matloob commented Jun 14, 2022

It looks like the moduleHash isn't changing in TestScript/mod_multirepo after new contents of those files are copied in.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/412394 mentions this issue: cmd/go/internal/modindex: disable indexing for modules outside GOROOT and the module cache

gopherbot pushed a commit that referenced this issue Jun 15, 2022
… and the module cache

Since CL 410821 we were indexing these modules with a cache key based
on the mtimes of the files within the module. However, that seems to
be causing test failures (#53269 and maybe #53371).

In addition, indexing these modules caused a potentially-expensive
operation (re-indexing a whole module) whenever any individual file
within the module is changed, even if it isn't relevant to the
package(s) being loaded from that module. In some cases, that could
cause a significant performance regression for 'go' commands invoked
on a small subset of the packages in the module (such as running 'go
test' on a single changed package — a common case during development).

Instead, we now index only those modules found within the module cache
and within GOROOT.

In addition, we now check mtimes when indexing GOROOT modules if the
Go version begins with the string "devel ", which indicates a
non-released Go version that may include local file edits within GOROOT.

For #53371.
For #53269.

Change-Id: Id3aa81b55ecfc478e47dd420148d39d2cf476f2d
Reviewed-on: https://go-review.googlesource.com/c/go/+/412394
Reviewed-by: Michael Matloob <matloob@golang.org>
Auto-Submit: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Bryan Mills <bcmills@google.com>
@bcmills bcmills self-assigned this Jun 15, 2022
@bcmills bcmills added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jun 15, 2022
@bcmills
Copy link
Contributor

bcmills commented Jun 15, 2022

I think this should be fixed by CL 412394, but I'd like to see more test runs on these builders to confirm.

@bcmills
Copy link
Contributor

bcmills commented Jun 17, 2022

Does appear to be fixed by CL 412394, but that also had a typo that I missed and needs CL 412779 as a followup.

@bcmills bcmills removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jun 17, 2022
@jonathan-albrecht-ibm
Copy link
Contributor

FWIW, I've been running the cmd/go tests on a local s390x machine at CL 412779 and no failures in ~500 runs. Before either CL 412394 or CL 412779 it would fail about once every five runs on this machine.

@bcmills
Copy link
Contributor

bcmills commented Jun 22, 2022

CL 412779 is in, so we should be all clear on these failures now.

@bcmills bcmills closed this as completed Jun 22, 2022
@bcmills bcmills moved this to Done in Release Blockers Jun 22, 2022
@rsc rsc unassigned bcmills and matloob Jun 22, 2022
@golang golang locked and limited conversation to collaborators Jun 22, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge GoCommand cmd/go NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker
Projects
Status: Done
Development

No branches or pull requests

6 participants