-
Notifications
You must be signed in to change notification settings - Fork 18.6k
Description
What version of Go are you using (go version)?
$ go version go version devel go1.21-518889b35c Thu Feb 16 02:02:00 2023 +0000 linux/amd64
Does this issue reproduce with the latest release?
Yes, verified with Go 1.20.
What operating system and processor architecture are you using (go env)?
go env Output
$ go env GO111MODULE="" GOARCH="amd64" GOBIN="" GOCACHE="/home/mvdan/.cache/go-build" GOENV="/home/mvdan/.config/go/env" GOEXE="" GOEXPERIMENT="" GOFLAGS="" GOHOSTARCH="amd64" GOHOSTOS="linux" GOINSECURE="" GOMODCACHE="/home/mvdan/go/pkg/mod" GONOPROXY="github.com/cue-unity" GONOSUMDB="github.com/cue-unity" GOOS="linux" GOPATH="/home/mvdan/go" GOPRIVATE="github.com/cue-unity" GOPROXY="https://proxy.golang.org,direct" GOROOT="/home/mvdan/tip" GOSUMDB="sum.golang.org" GOTMPDIR="" GOTOOLDIR="/home/mvdan/tip/pkg/tool/linux_amd64" GOVCS="" GOVERSION="devel go1.21-518889b35c Thu Feb 16 02:02:00 2023 +0000" GCCGO="gccgo" GOAMD64="v3" AR="ar" CC="gcc" CXX="g++" CGO_ENABLED="1" GOMOD="/home/mvdan/src/nowt/go.mod" GOWORK="" CGO_CFLAGS="-O2 -g" CGO_CPPFLAGS="" CGO_CXXFLAGS="-O2 -g" CGO_FFLAGS="-O2 -g" CGO_LDFLAGS="-O2 -g" PKG_CONFIG="pkg-config" GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3213309622=/tmp/go-build -gno-record-gcc-switches"
What did you do?
Many of the projects I contribute to use go test ./... on GitHub Actions. A majority of those also use GitHub's https://github.com/actions/cache to persist Go's GOCACHE and GOMODCACHE, so that consecutive CI runs should avoid much of the work if little code has changed.
This works for the most part, but I've noticed that commits which do not change any code at all (e.g. only modify the README) only show some packages as (cached), whereas other packages are always run from scratch. For example, see cue-lang/cue#2253.
After investigating quite a bit with GODEBUG=gocachehash=1 and its testInputs hashes, and reading through src/cmd/go/internal/test/test.go, I think the cause is modified times. Whenever a test opens a file, Go's automatic test caching tracks that file's path, file info, and contents to know whether it should re-run the test in the future. The file info includes the modified time:
func hashWriteStat(h io.Writer, info fs.FileInfo) {
fmt.Fprintf(h, "stat %d %x %v %v\n", info.Size(), uint64(info.Mode()), info.ModTime(), info.IsDir())
}
Including the modified time may be needed for the test cache's own purposes, and in general it's not a problem when continuously developing on a machine. However, CI is a bit special: it's quite common to run in a temporary environment and start by fetching the code. In particular, the way GitHub Actions is used as CI tends to start with https://github.com/actions/checkout, which uses git clone to download the repository to a path like /home/runner/work/foobar/foobar. Note that the clone directory location does not change between runs; I first suspected that was the problem, but it is not.
The problem is that git does not store modified times in any way, and git clone creates new files when it's checking out a repository, so the cloned files end up having the current time as their modified time.
As a reproducer and demo, I've written https://github.com/mvdan/nowt/tree/master/test-cache-open-modtime. It contains a single Go test which opens a testdata file, and a script which reproduces the situation:
$ ./repro.bash
Clone and test twice in a row, using the same directory and -trimpath. The test caching does not work.
-rw-r--r-- 1 mvdan mvdan 8 Feb 16 22:14 test-cache-open-modtime/testdata/foo.txt
ok mvdan.cc/nowt/test-cache-open-modtime 0.001s
-rw-r--r-- 1 mvdan mvdan 8 Feb 16 22:14 test-cache-open-modtime/testdata/foo.txt
ok mvdan.cc/nowt/test-cache-open-modtime 0.001s
The same again, but this time resetting the testdata file's mtime, since git clone otherwise sets it to the time it ran.
-rw-r--r-- 1 mvdan mvdan 8 Jan 1 2000 test-cache-open-modtime/testdata/foo.txt
ok mvdan.cc/nowt/test-cache-open-modtime 0.001s
-rw-r--r-- 1 mvdan mvdan 8 Jan 1 2000 test-cache-open-modtime/testdata/foo.txt
ok mvdan.cc/nowt/test-cache-open-modtime (cached)
What did you expect to see?
I expected modified times to not affect Go's test caching hit vs miss decisions. Particularly since Go largely allows its developers to not be aware of modified times for incremental builds to work well, unlike other build systems like make.
What did you see instead?
The way most people use CI on popular platforms like GitHub Actions causes unnecessary Go test cache misses, and it's caused by modified times.
I could fix this for my projects by following each actions/checkout step with a script to walk all cloned files and reset their modified times to a static timestamp. However, that feels very unfortunate, and I imagine many Go users might not even notice that their CI is slower than it should be to begin with.
We could also try to convince third parties like git clone or actions/checkout to leave reproducible modified times on files, but I don't think that is likely to happen.