Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: allow declaring additional test dependencies #22593

Closed
rsc opened this issue Nov 5, 2017 · 20 comments
Closed

cmd/go: allow declaring additional test dependencies #22593

rsc opened this issue Nov 5, 2017 · 20 comments
Milestone

Comments

@rsc
Copy link
Contributor

@rsc rsc commented Nov 5, 2017

Quoting src/cmd/go/internal/test/test.go:

// TODO(rsc): How to handle other test dependencies like environment variables or input files?
// We could potentially add new API like testing.UsedEnv(envName string)
// or testing.UsedFile(inputFile string) to let tests declare what external inputs
// they consulted. These could be recorded and rechecked.
// The lookup here would become a two-step lookup: first use the binary+args
// to fetch the list of other inputs, then add the other inputs to produce a
// second key for fetching the results.
// For now, we'll assume that users will use -count=1 (or "go test") to bypass the test result
// cache when modifying those things.

See also #22583.

@rsc rsc added this to the Go1.10 milestone Nov 5, 2017
@rsc
Copy link
Contributor Author

@rsc rsc commented Nov 5, 2017

One downside to the specific API in the TODO is that it doesn't handle tests that glob a directory: if a file appears or disappears, the test should be rerun, but listing specific files to watch does not account for that.

@myitcv
Copy link
Member

@myitcv myitcv commented Nov 6, 2017

Thanks for pointing out your TODO, Russ; I'd totally missed that in my glancing through the code/CLs.

Using our test base an example, the "second key" you refer to would need to be a function of:

  1. Environment variables
  2. Directory contents (glob-able):
  3. Files are in the git index (glob-able)

Filepaths are either relative or resolved via environment variables.

Such a "second key" applies to both individual tests but also a number of tests (not necessarily in the same file or package).

Could we therefore follow some sort of declarative pre-amble in a test (or test file in order to apply to all tests in that file)?

import "testing"

func TestExtDeps(t *testing.T) {
    t.Deps(
        t.DepsDir("testdata"), // relative
        t.DepsGitFiles("$ENV_VAR/**/testdata/blah.txt"), // glob env-resolved git files
        // ...
    )

    // ...
}

For now, we'll assume that users will use -count=1 (or "go test") to bypass the test result cache when modifying those things.

As I mentioned in my golang-dev response, -count=1 has the unfortunate side effect of bypassing the result cache for all tests in package list mode, despite (in our case) that only representing ~5% of tests. What are your thoughts on flagging tests to bypass the cache?

import "testing"

func TestRegular(t *testing.T) {
    // will be part of the result cache

    // ...
}

func TestThatBypasses(t *testing.T) {
    // flag that this test's result should not be cached, i.e. re-run each time
    t.BypassCache()

    // ...
}

Or if we had something like the the .Deps approach then you could easily have a random dep:

import "testing"

func TestExtDeps(t *testing.T) {
    t.Deps(
        t.DepsRandom(),
    )

    // ...
}

@dlsniper
Copy link
Contributor

@dlsniper dlsniper commented Nov 6, 2017

This issue / proposed API doesn't deal with the external dependencies such as data in non-filesystems, such as databases.

@hdonnay
Copy link

@hdonnay hdonnay commented Nov 8, 2017

Tests that use testing/quick also need some thinking about. I'd certainly expect tests that use randomness to be re-run, or perhaps resume the same rng state and run fewer iterations.

@rsc
Copy link
Contributor Author

@rsc rsc commented Nov 9, 2017

Obviously we need to fix this in some way for Go 1.10, for test caching to make a good first impression. Equally obviously, we need to fix it soon, since we are into the freeze.

The main problem with "just assume all of testdata" is that it's incomplete: we have tests even in the standard library that read files outside testdata (for example compress/flate reads compress/testdata, not compress/flate/testdata), and I'm sure others do too. Environment variables were also raised, although I think they are clearly less common than testdata.

The main problem with adding separate new API (testing.ThisTestDependsOnThatThing) is that it means people's first interaction with test caching is finding out that test caching breaks their testing workflow and then are forced to go through and add redundant annotations to their code to unbreak it.

Because neither of those really seems tenable to me, my inclination now is to take @bradfitz's suggestion on #22583 and arrange via a new internal-only package for package os and package testing to conspire to report which environment variables and which files and directories (restricted to within $GOROOT and $GOPATH) are consulted by a particular test execution. That handles all tests that read testdata, tests that name testdata something else (not that I care a lot about supporting that case), tests that read from nearby testdata, tests that are sensitive to environment variables, and so on, all without redundant annotations that will not stay up to date.


It is not a goal to address every last possible incidental input to a test that might change its behavior. Tests that depend on network servers or databases are still typically not worth rerunning if the code in the test binary has not changed: if I modify encoding/json then it's still useful to be able to 'go test all' and only actually re-run the networked tests that use json. Of course if you are actively changing the records in the database accessed by the test, then you will not want the test to be cached, but we have clearly documented ways to disable the test cache in that case ("go test" with no package arguments, or "go test -count=1 mypkg"). The same goes for tests that use pseudo-random generators or query the system time or the system randomness source. If the tests are good tests, then these incidental inputs should not change the outcome, even if they do make each test invocation explore slightly different test cases. If your goal is to run the test many times precisely because you know it might vary each time, then again it's easy to avoid the caching. (And in fact you might naturally reach for something like -count=1000, which will disable the caching without you even thinking about it.)

@myitcv
Copy link
Member

@myitcv myitcv commented Nov 9, 2017

Because neither of those really seems tenable to me, my inclination now is to take @bradfitz's suggestion

👍

Is it possible to do something clever with calls to path/filepath.Glob? Or would, as you pointed out before, globs be a known and documented exception?

... restricted to within $GOROOT and $GOPATH ...

Just to confirm (selfishly, given that it's our setup) that multi-value GOPATH will be covered by this?

And just to pick up on @cespare's comment (because this the case in our situation too):

In a few cases we have test fixtures that are shared between a few different projects (not all of which are Go). In these cases the fixtures are symlinked into testdata.

Can the containment test be pre-symlink evaluation?

It is not a goal to address every last possible incidental input to a test that might change its behavior...

👍

I think with environment variables and files we have at least two mediums by which a coordinating process that is mutating the state of a database/server/other can communicate a hash of that state to a Go test, so to my mind most cases here are covered too.

@spenczar
Copy link
Contributor

@spenczar spenczar commented Nov 10, 2017

Could tests explicitly opt out of caching? t.DoNotCache() would be clear and handle many cases.

This could mean "you may cache this test's compilation, but re-execute it even if you think it hasn't changed."

@aronatkins
Copy link

@aronatkins aronatkins commented Nov 10, 2017

Would syscall tracing follow spawned processes? We are often testing process coordination (Go code calling another binary that spawns R running some specific interpreted script).

@rsc
Copy link
Contributor Author

@rsc rsc commented Dec 1, 2017

@spenczar I addressed that in the second half of #22593 (comment). Whether caching is valid is never a property of the test itself. It's a property of your model for what you've changed recently and want to test.

@spenczar
Copy link
Contributor

@spenczar spenczar commented Dec 1, 2017

@rsc I am thinking of cases in which the Go tool is not capable of knowing whether something changed. Particularly bulky integration tests that use the network: transient network blips can make a test fail; I'd like to retry that test, not assume it's always going to fail because my code hasn't changed.

@spenczar
Copy link
Contributor

@spenczar spenczar commented Dec 3, 2017

I realize you spoke to my concern above, @rsc, when you said this:

Of course if you are actively changing the records in the database accessed by the test, then you will not want the test to be cached, but we have clearly documented ways to disable the test cache in that case ("go test" with no package arguments, or "go test -count=1 mypkg")

I don't like that this would mean go1.10 requires that I execute my tests differently. Most of my tests are run automatically by build systems, smoke testing systems, and so on. Requiring that the tests be executed in a different way means that I'll need to change code. It's really the same problem as you mentioned with an API expansion:

The main problem with adding separate new API (testing.ThisTestDependsOnThatThing) is that it means people's first interaction with test caching is finding out that test caching breaks their testing workflow and then are forced to go through and add redundant annotations to their code to unbreak it.

It seems like we're going to need to change some code somewhere to unbreak tests if caching is enabled. It's either in the test runners or its in the tests themselves. I think changing the tests themselves sounds much more obvious to the reader, much clearer, and much harder to mess up when bringing new people onto a codebase. I don't want to have to educate people on how to run tests for my particular project - I want go test ./... to just work regardless of what project they're working on. Annotating tests emphasizes clarity over cleverness, which seems right to me.

@rsc
Copy link
Contributor Author

@rsc rsc commented Dec 4, 2017

@spenczar I missed in your most recent message why "go test ./..." does not just work for people working on your project. Perhaps the reason is in your previous message:

Particularly bulky integration tests that use the network: transient network blips can make a test fail; I'd like to retry that test, not assume it's always going to fail because my code hasn't changed.

See the documentation (go help test): "In package list mode, go test also caches successful package test results."

@spenczar
Copy link
Contributor

@spenczar spenczar commented Dec 4, 2017

@rsc Thank you, I did miss that very important fact! That is enough to satisfy me.

@gopherbot
Copy link

@gopherbot gopherbot commented Dec 5, 2017

Change https://golang.org/cl/81895 mentions this issue: cmd/go: invalidate cached test results if env vars or files change

@gopherbot
Copy link

@gopherbot gopherbot commented Dec 11, 2017

Change https://golang.org/cl/83256 mentions this issue: cmd/go: don't pass -test.testlogfile on NaCl

@gopherbot
Copy link

@gopherbot gopherbot commented Dec 11, 2017

Change https://golang.org/cl/83257 mentions this issue: cmd/go: remove script.sh in TestTestCacheInputs

gopherbot pushed a commit that referenced this issue Dec 11, 2017
It causes every test to fail as the log file is on the local file system,
not the NaCl file system.

Updates #22593

Change-Id: Iee3d8307317bd792c9c701baa962ebbbfa34c147
Reviewed-on: https://go-review.googlesource.com/83256
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
gopherbot pushed a commit that referenced this issue Dec 11, 2017
Updates #22593

Change-Id: I76e52dc8b874da13ae9e2d80e5c0d6d8424b67db
Reviewed-on: https://go-review.googlesource.com/83257
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@gopherbot
Copy link

@gopherbot gopherbot commented Dec 11, 2017

Change https://golang.org/cl/83395 mentions this issue: cmd/go: remove file created by test

@gopherbot
Copy link

@gopherbot gopherbot commented Dec 12, 2017

Change https://golang.org/cl/83455 mentions this issue: os: don't use test logger for Getwd

gopherbot pushed a commit that referenced this issue Dec 12, 2017
Otherwise, on systems for which syscall does not implement Getwd,
a lot of unnecessary files and directories get added to the testlog,
right up the root directory. This was causing tests on such systems
to fail to cache in practice.

Updates #22593

Change-Id: Ic8cb3450ea62aa0ca8eeb15754349f151cd76f85
Reviewed-on: https://go-review.googlesource.com/83455
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Dec 12, 2017
The file cmd/go/testdata/src/testcache/script.sh was accidentally
committed with CL 83256. Sorry about that.

Updates #22593

Change-Id: Id8f07587ea97015ed75439db220560a5446e53e6
Reviewed-on: https://go-review.googlesource.com/83395
Reviewed-by: Russ Cox <rsc@golang.org>
@gopherbot
Copy link

@gopherbot gopherbot commented Dec 12, 2017

Change https://golang.org/cl/83475 mentions this issue: cmd/go: don't pass -test.testlogfile on android

@gopherbot
Copy link

@gopherbot gopherbot commented Dec 13, 2017

Change https://golang.org/cl/83578 mentions this issue: cmd/go: don't use a testlog if there is an exec command

gopherbot pushed a commit that referenced this issue Dec 13, 2017
An exec command is normally used on platforms were the test is run in
some unusual way, making it less likely that the testlog will be useful.

Updates #22593

Change-Id: I0768f6da89cb559d8d675fdf6d685db9ecedab9e
Reviewed-on: https://go-review.googlesource.com/83578
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@golang golang locked and limited conversation to collaborators Dec 13, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants