Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/build/cmd/release: darwin-amd64 release tests aren't passing, cmd/go's TestScript/list_std_stale fails #46292

Closed
dmitshur opened this issue May 20, 2021 · 7 comments
Labels
Builders x/build issues (builders, bots, dashboards) FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. OS-Darwin release-blocker
Milestone

Comments

@dmitshur
Copy link
Contributor

dmitshur commented May 20, 2021

As detected early by a #29205-like process, Go 1.17 tests that will be run during the release process are not passing when using a recent Go tip commit (tested with f07e4da) on the darwin/amd64 release target, even with 3 retries. The same test is passing on other targets.

Specifically, CL 297869 (CC @bcmills, @jayconrod) made a change for Go 1.17 to fix #44725 and added a cmd/go/testdata/script/list_std_stale.txt test case. On darwin/amd64 that new test case fails with:

go test proxy running at GOPROXY=http://127.0.0.1:52037/mod
--- FAIL: TestScript (0.01s)
    --- FAIL: TestScript/list_std_stale (0.42s)
        script_test.go:252: 
            # https://golang.org/issue/44725: packages in std should not be reported as stale,
            # regardless of whether they are listed from within or outside GOROOT/src.
            # Control case: net should not be stale at the start of the test,
            # and should depend on vendor/golang.org/… instead of golang.org/…. (0.416s)
            > ! stale net
            FAIL: testdata/script/list_std_stale.txt:7: net is unexpectedly stale
            
            
FAIL
FAIL	cmd/go	112.819s

This test failure can be reproduced with cmd/release (can only be done by someone with gomote access) like so:

# (The release command uses builders to create a release artifact locally.
#  It does not publish anything, so it's safe to run for testing needs.)
$ release -target=darwin-amd64 -version=go1.17beta100 -watch \
          -rev=f07e4dae3c5cb608b4f0b9db57d1562d2125243b  # or -rev=master for latest commit

This does not seem related to #46161, because it happens both with darwin-amd64-11_0 and darwin-amd64-10_15 builders. (CC @cagedmantis, @heschi.)

It also doesn't happen for the darwin/arm64 build, only amd64.

(It's not happening in post-submit testing, the build dashboard is green on all darwin amd64 builders.)

This is possibly connected to issues like #46239 and #33598 and may be a builder issue, or a problem with how release tests are run (as one example of a difference, they are run with GOROOT_FINAL set, while normal tests are not). There's also a possibility it's a problem in cmd/go that happens to only be caught by the darwin/amd64 builders in release test mode. This needs investigation.

The Go 1.17 darwin/amd64 release artifacts cannot be made without some sort of resolution to make this test pass, so this is a release blocking issue.

CC @golang/release.

@dmitshur dmitshur added OS-Darwin NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker labels May 20, 2021
@dmitshur dmitshur added this to the Go1.17 milestone May 20, 2021
@gopherbot gopherbot added the Builders x/build issues (builders, bots, dashboards) label May 20, 2021
@bcmills
Copy link
Member

bcmills commented May 20, 2021

(as one example of a difference, they are run with GOROOT_FINAL set, while normal tests are not)

FWIW, TestScript explicitly propagates GOROOT_FINAL here:

"GOROOT_FINAL=" + os.Getenv("GOROOT_FINAL"), // causes spurious rebuilds and breaks the "stale" built-in if not propagated

Just to verify: does the release script re-run make.bash with GOROOT_FINAL set before it runs the tests?

But, given that the detected-stale package is net, I'm inclined to suspect some kind of difference in the detected cgo configuration — maybe something to do with which C compilers are in PATH?

@dmitshur
Copy link
Contributor Author

dmitshur commented May 20, 2021

The sequence of steps the release script performs is in the Build.make method.

An environment with GOROOT_FINAL set (here) is used both during make.bash (here) and later, when all.bash is run (here).

@bcmills
Copy link
Member

bcmills commented May 22, 2021

This section may be the problem for this test:
https://github.com/golang/build/blob/d0819edf598a7f08d0fa642ca67f9d2c094e775b/cmd/release/release.go#L427-L431

I would bet that changing CGO_CFLAGS invalidates the build cache for anything that depends on cgo, which the net package probably does. However, we don't currently propagate any CGO_ variables in TestScript:

ts.env = []string{
"WORK=" + ts.workdir, // must be first for ts.abbrev
"PATH=" + testBin + string(filepath.ListSeparator) + os.Getenv("PATH"),
homeEnvName() + "=/no-home",
"CCACHE_DISABLE=1", // ccache breaks with non-existent HOME
"GOARCH=" + runtime.GOARCH,
"GOCACHE=" + testGOCACHE,
"GODEBUG=" + os.Getenv("GODEBUG"),
"GOEXE=" + cfg.ExeSuffix,
"GOOS=" + runtime.GOOS,
"GOPATH=" + filepath.Join(ts.workdir, "gopath"),
"GOPROXY=" + proxyURL,
"GOPRIVATE=",
"GOROOT=" + testGOROOT,
"GOROOT_FINAL=" + os.Getenv("GOROOT_FINAL"), // causes spurious rebuilds and breaks the "stale" built-in if not propagated
"GOTRACEBACK=system",
"TESTGO_GOROOT=" + testGOROOT,
"GOSUMDB=" + testSumDBVerifierKey,
"GONOPROXY=",
"GONOSUMDB=",
"GOVCS=*:all",
"PWD=" + ts.cd,
tempEnvName() + "=" + filepath.Join(ts.workdir, "tmp"),
"devnull=" + os.DevNull,
"goversion=" + goVersion(ts),
":=" + string(os.PathListSeparator),
}

var extraEnvKeys = []string{
"SYSTEMROOT", // must be preserved on Windows to find DLLs; golang.org/issue/25210
"WINDIR", // must be preserved on Windows to be able to run PowerShell command; golang.org/issue/30711
"LD_LIBRARY_PATH", // must be preserved on Unix systems to find shared libraries
"CC", // don't lose user settings when invoking cgo
"GO_TESTING_GOTOOLS", // for gccgo testing
"GCCGO", // for gccgo testing
"GCCGOTOOLDIR", // for gccgo testing
}

@dmitshur
Copy link
Contributor Author

Well spotted @bcmills. I can reproduce with just:

$ export CGO_CFLAGS='-mmacosx-version-min=10.13'
$ ./make.bash # etc.
$ go test -run=TestScript/list_std_stale cmd/go

Filed #46347 as a cmd/go issue. Depending on whether it's considered a bug and fixed, or deemed working as intended, we can decide whether cmd/release needs changes.

@dmitshur
Copy link
Contributor Author

The decision in #46347 is to fix it, so after that's done, the current behavior of cmd/release should result in a passing test.

I've thought about whether it'd make sense to modify cmd/release behavior so that release-specific adjustments like setting CGO_CFLAGS are only applied during ./make.bash and not ./all.bash, to reduce the delta between testing done in TryBots/post-submit builders, and release testing. However, there is some advantage to current behavior of cmd/release: it may catch problems that happen only when the environment is configured as it is during the release, which would otherwise not be caught at all. We may want to rethink this, but it should likely be done holistically as part of #40279, not right now.

@dmitshur
Copy link
Contributor Author

Issue #46347 has been resolved (thank you @bcmills). I'm re-running this test to confirm it's working as expected, and that there weren't other failures masked by this one.

@dmitshur
Copy link
Contributor Author

All release tests for the darwin-amd64 target are passing at Go tip commit 950fa11. This issue is resolved.

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels May 27, 2021
@golang golang locked and limited conversation to collaborators May 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Builders x/build issues (builders, bots, dashboards) FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. OS-Darwin release-blocker
Projects
None yet
Development

No branches or pull requests

3 participants