Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: cmd/go: Need an easy/obvious way to copy testdata from module cache to a test-running directory #50312

Closed
dr2chase opened this issue Dec 22, 2021 · 21 comments
Labels
FrozenDueToAge fuzz Issues related to native fuzzing support Proposal
Milestone

Comments

@dr2chase
Copy link
Contributor

To test, benchmark, or fuzz-test downloaded software modules, the current recipe (at least, the one that I learned) is

mkdir foo
cd foo
go mod init foo
go get -d -t -v somepackage@someversion
go test -c -o foo somepackage
./foo -test.bench=BenchmarkSomething

The problem here is that test and benchmarking often depend on files in testdata. It is often possible to run the test in the module cache, but this depends on the test/benchmark not writing to the testdata directory, and also depends on knowing the directory in the module cache (see workaround below). Fuzzing that finds a failure does write to the testdata directory, so this just doesn't work for new fuzzing.

One workaround is

cp -rp `go list -f {{.Dir}} somepackage`/testdata .

but this is not something that people find or figure out easily.

I propose adding an option to go get or a subcommand to go mod to copy testdata, if it exists in the downloaded package, into the current directory. For example, go get -d -t -T -v somepackage@someversion or go mod testdata.

@gopherbot gopherbot added this to the Proposal milestone Dec 22, 2021
@prattmic
Copy link
Member

Rather than building the test binary (go test -c) and running it separately, I presume if you used go test -bench=BenchmarkSomething somepackage then cmd/go would just Do The Right Thing.

Could you elaborate on why you don't do that?

My guess would be to exclude time taken to build the test from running the test. If so, then perhaps (just throwing out ideas) an alternate approach would be for go test -c to save the test binary in the cache directory so that subsequent go test runs don't need to build.

@dr2chase
Copy link
Contributor Author

I don't do that (in bent) because

  1. Bent does repeated runs (25) of A, B, A, B, benchmarking of baseline and tip
  2. Bent does repeated timed builds (sometimes 25), also A, B, A, B (optionally completely randomized, I need to do the same for benchmark running) to get a benchmark of the time to build the binary.
  3. To guard against user-error when compiler people (e. g. me) are running benchmarks of compiler modifications and with experimental flags, bent repeatedly clears the cache.

On the other hand, bent already deals with the testdata problem. I definitely need at least the interleaved run order unless benchmarks are run on a very controlled machine, "stuff" just happens sometimes. (I.e., the purpose of bent is to get rid of all my benchmarking and testing speedbumps and footguns -- and immutable testdata in the module cache is one of those speedbumps).

That also doesn't help fuzzing, and some benchmarks/tests assume (incorrectly, and I sometimes submit pull requests to fix them) that they can write to testdata.

An alternate approach for fuzzing (because bent has solved the problem in its case) might be to have fuzzing copy testdata out of the module cache into the current directory?

@katiehockman katiehockman added the fuzz Issues related to native fuzzing support label Dec 23, 2021
@katiehockman
Copy link
Contributor

/cc @golang/fuzzing

@josharian
Copy link
Contributor

//go:embed seems like a useful tool here. Maybe cmd/go and/or package testing could make that easier. As a straw man, cmd/go could embed testdata by default and package testing could expose a fs.FS with their contents.

@FiloSottile
Copy link
Contributor

Strong +1 on providing an fs.FS with the contents of testdata. go test -c would use embed, while go test would just use the filesystem. I think I would use that even if not moving the binary, as it's more obviously correct than relying on the cwd.

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Jan 5, 2022
@rsc rsc moved this from Incoming to Active in Proposals (old) Jan 12, 2022
@rsc
Copy link
Contributor

rsc commented Jan 12, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Jan 19, 2022

When you run go test foo it runs in the module directory.
Why not do that?
Why copy the data out of the module directory?
In general it's not clear exactly what to copy.
For example compress/gzip etc use the compress/testdata directory.
So in general it's not 1:1.

You can do what go test does using cd $(go list -f '{{.Dir}}' pkg).

@josharian
Copy link
Contributor

When you run go test foo it runs in the module directory. Why not do that?

A common workflow for me (to catch flaky tests) is to go test -c and then run, often with a wrapper. And it's an annoyance that I then have to move around that binary to the right place.

In general it's not clear exactly what to copy.

Sure, but that doesn't mean we can't make the common case (testdata subdir) easy.

@ianlancetaylor
Copy link
Contributor

cp -r $(go list -f '{{.Dir}}' pkg)/testdata .

@josharian
Copy link
Contributor

Yes, you can also copy the testdata instead of moving the binary. This is a question of "just works"/"batteries included", vs not.

@ianlancetaylor
Copy link
Contributor

Sorry, I thought this issue was about copying the testdata.

Oh, now I see that @dr2chase suggested the cp approach earlier. Are we just talking about a more convenient way to do that?

@dr2chase
Copy link
Contributor Author

The current way of doing it requires knowing the where's-my-module-cache incantation, which I think is a too-large speedbump.

With the benefit of additional hindsight, -ALSO_PUT_A_COPY_IN_THE_CURRENT_DIRECTORY, which also places a copy of all the files and subdirectories for the referenced package into the current directory, might be a good choice.

mkdir foo
cd foo
go mod init foo
go get -d -t -v -ALSO_PUT_A_COPY_IN_THE_CURRENT_DIRECTORY somepackage@someversion
go test -c -o foo somepackage
./foo -test.bench=BenchmarkSomething

The reason for this is that tests already exist that variously

  1. write files into testdata
  2. read files from other subdirectories
  3. read files from the source directory
  4. fuzzing writes into testdata

As far as I know, none of our tools will complain at people for doing any of these things, therefore, they have already done it, and will continue to do it. I am open to suggestions about a better name for the option. The benchmark runner (bent) contains (or will contain, once CLs are submitted) workarounds for all these things.

For the commands above, where the current directory is treated as containing "build instructions", the go files in the current directory are ignored. Building "." instead of somepackage@someversion generates a request to "go mod tidy" and then builds a same-behaving binary that does not compare equal. (I am generating these answers by running experiments, figuring that there would be questions).

@rsc
Copy link
Contributor

rsc commented Jan 26, 2022

Regarding the enumeration in the previous comment:

(1) Writing files during tests is disallowed. People may do it, but it makes tests break and we are actively discouraging it by doing things like making the module cache read-only.

(2) Reading files from other subdirectories or parent directories within the same module is allowed; tests can assume they run in the directory where the test sources live, since that's what go test does.

(3) Same.

(4) That was a bug and I believe it is being fixed or is now fixed.

It sounds like maybe what you would like better is "extract this module into a directory so I can cd into it and pretend to be a developer of that module". I wonder how we should address that. We could make go mod download have a -o flag, I suppose. The next problem will be that you can't use version control commands in that directory, and we won't fix that though. Maybe doing

go mod download -json mod@vers 

which will tell you the Dir where it is unpacked, and then just cp -a the entire dir to go into it?

/cc @bcmills @matloob

@bcmills
Copy link
Member

bcmills commented Jan 26, 2022

(4) That was a bug and I believe it is being fixed or is now fixed.

That was #48495, and should be fixed as of CL 359414.
(go test -fuzz should now fail for packages outside of the user's workspace; the packages within the workspace are assumed to be writable.)

@dr2chase
Copy link
Contributor Author

This "extract this module into a directory so I can cd into it and pretend to be a developer of that module" is a plausible description of the sort of thing I'm after, but I think "go test" should not just be for developers of a particular module.
For example:

  • I want to run other people's benchmarks to see how changes to the compiler affect their code
  • I am inclined to want to run their tests to get early warning of problems from compiler changes
  • I might try to run their tests with code coverage turned on to see how much I should (not) trust their code

@bcmills
Copy link
Member

bcmills commented Jan 26, 2022

It sounds like maybe what you would like better is "extract this module into a directory so I can cd into it and pretend to be a developer of that module". I wonder how we should address that.

We have lots of related proposals for that already:

Some combination of those would allow for a tool that accepts a Go module path and version, resolves it to a VCS origin and revision, verifies that the module contents are the same, and materializes a clone of the repo checked out to that revision.

@rsc
Copy link
Contributor

rsc commented Feb 2, 2022

It sounds like this specific fix "copy testdata out" is probably not the answer to working with other people's modules. Given that, perhaps we should decline this proposal and focus on others that help obtaining the whole module?

For the record, 'go test' of dependencies is expected to work. If a test is writing to its own testdata directory during a test, that is a bug in their code, same as the code not compiling on a particular system or any other kind of bug, and it should be fixed.

@martin-sucha
Copy link
Contributor

For the record, 'go test' of dependencies is expected to work. If a test is writing to its own testdata directory during a test, that is a bug in their code, same as the code not compiling on a particular system or any other kind of bug, and it should be fixed.

(1) Writing files during tests is disallowed. People may do it, but it makes tests break and we are actively discouraging it by doing things like making the module cache read-only.

(2) Reading files from other subdirectories or parent directories within the same module is allowed; tests can assume they run in the directory where the test sources live, since that's what go test does.

Perhaps we should explicitly document this in testing package's documentation and in the output of go help test?

@bcmills
Copy link
Member

bcmills commented Feb 3, 2022

@martin-sucha, it is documented as of Go 1.18, in the same place where we mention where the test binary is run. Per https://pkg.go.dev/cmd/go@master#hdr-Testing_flags:

When 'go test' runs a test binary, it does so from within the corresponding package's source code directory. Depending on the test, it may be necessary to do the same when invoking a generated test binary directly. Because that directory may be located within the module cache, which may be read-only and is verified by checksums, the test must not write to it or any other directory within the module unless explicitly requested by the user (such as with the -fuzz flag, which writes failures to testdata/fuzz).

@rsc
Copy link
Contributor

rsc commented Feb 9, 2022

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@rsc rsc moved this from Active to Likely Decline in Proposals (old) Feb 9, 2022
@rsc rsc moved this from Likely Decline to Declined in Proposals (old) Feb 16, 2022
@rsc
Copy link
Contributor

rsc commented Feb 16, 2022

No change in consensus, so declined.
— rsc for the proposal review group

@rsc rsc closed this as completed Feb 16, 2022
@golang golang locked and limited conversation to collaborators Feb 16, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge fuzz Issues related to native fuzzing support Proposal
Projects
Status: No status
Development

No branches or pull requests

10 participants