Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: TOOLEXEC_IMPORTPATH is ambiguous for test packages #44963

Open
mvdan opened this issue Mar 12, 2021 · 7 comments
Open

cmd/go: TOOLEXEC_IMPORTPATH is ambiguous for test packages #44963

mvdan opened this issue Mar 12, 2021 · 7 comments

Comments

@mvdan
Copy link
Member

@mvdan mvdan commented Mar 12, 2021

#15677 was fixed, so now tools called via commands like go build -toolexec= can easily find out what package is currently being assembled/compiled/linked. That's great, and I'm already using it in a tool with great success.

There's just one exception; for test packages, it's ambiguous, as it contains less information than go list -test. Let me show an example, run with cmd/testscript:

https://play.golang.org/p/A4UaS8kkVdI

Here are the output lines I care about, from the stderr of go test -toolexec=mytool when running like testscript -v repro.txt:

# foo
TOOLEXEC_IMPORTPATH=foo
# foo [foo.test]
TOOLEXEC_IMPORTPATH=foo
# foo_test [foo.test]
TOOLEXEC_IMPORTPATH=foo_test

Note that we compile three packages at the root, excluding the foo.test main package. Those are:

  • foo, the regular package
  • foo [foo.test], the regular package built with test files, built for foo.test
  • foo_test [foo.test], the external foo_test package, built for foo.test

All these three show up in go list -test, too:

> go list -test .
[stdout]
foo
foo.test
foo [foo.test]
foo_test [foo.test]

Now, the problem is that TOOLEXEC_IMPORTPATH is lacking the components in brackets, so it's ambiguous. For example, ignoring the external test package for a second:

# foo
TOOLEXEC_IMPORTPATH=foo
# foo [foo.test]
TOOLEXEC_IMPORTPATH=foo

If mytool sees TOOLEXEC_IMPORTPATH=foo, it can't possibly know which of the two packages is actually being compiled. Another way to say that is that, if it builds an index of packages from go list -json -test -deps, using the ImportMap field as the key, it will incorrectly think that it's building foo twice, instead of each of the two packages once.

There's currently one workaround; since this generally only happens for the compiler, mytool can check the arguments to see if any Go file arguments are test files, and then add the required bracket suffix. This is the kind of hack I'm doing in a tool right now: https://github.com/burrowers/garble/blob/06a7a21e17a79c6d6717434207fa40b65ea1fe47/main.go#L284-L308

I think we should fix this in cmd/go. Whatever package import path it writes in lines like # foo [foo.test], which is also what go list -f {{.ImportPath}} reports, it should also include in TOOLEXEC_IMPORTPATH. Right now, there's this ambiguity and inconsistency with test packages.

I'm happy to attempt a fix. I would argue that this is worth backporting; there's a workaround, but it's very hacky and it only sort of works for toolexec-wrapped calls to the compiler. The other option is to admit that TOOLEXEC_IMPORTPATH is broken in 1.16.x for go test -toolexec=, and fix it for 1.17.

cc @bcmills @jayconrod @matloob @bradfitz @josharian

@bcmills
Copy link
Member

@bcmills bcmills commented Mar 12, 2021

It's unfortunate that the ImportPath field output from go list is not always an actual import path.

I think the orthogonal basis of these names is:

  • import path (example.com/foo)
  • package name (foo, foo.test, or foo_test)
  • test variant (foo.test or none)

That gives:

  • foo
    • {path: "example.com/foo", name: "foo"}
  • foo.test
    • {path: "example.com/foo", name: "foo.test"}
  • foo [foo.test]
    • {path: "example.com/foo", name: "foo", forTest: "foo.test"}
  • foo_test [foo.test]
    • {path: "example.com/foo", name: "foo_test", forTest: "foo.test"}

I wonder if it would make sense to provide those as separate variables. (Perhaps TOOLEXEC_PACKAGE and TOOLEXEC_FOR_TEST?)

@mvdan
Copy link
Member Author

@mvdan mvdan commented Mar 12, 2021

I have mixed feelings about your suggestion.

On one hand, I agree that ImportPath fields not being actual import paths is weird and surprising. I'd love to clean that up, too.

On the other hand, if you make that refactor, then go list -test -json loses unique fields by which to globally identify and index packages. Right now, ImportPath works because it includes all three components (original path + _test separation + test variant). I guess you could rely on the user's code to stitch together unique strings based on three fields (ImportPath + Name + ForTest), but that would be easy to get wrong.

Maybe simplify ImportPath in go list -json as per the above, and add another field like UniqueIdentifier that's not guaranteed to be a valid import path.

In any case, though, I think that's a separate cleanup that should happen for all of cmd/go at once. Right now, TOOLEXEC_IMPORTPATH mostly mirrors ImportPath, with the exception of the ForTest part, so I think we should just fix that.

@mvdan
Copy link
Member Author

@mvdan mvdan commented Mar 12, 2021

To clarify, when I say "uniquely and globally identify a package", I mean within a single build list. For example, to build a map[string]Package out of go list -json -test -deps.

@bcmills
Copy link
Member

@bcmills bcmills commented Mar 12, 2021

To be clear: I don't think we should change the contents of the ImportPath field reported by go list -json at this point. That seems much too invasive, even if it is defensibly correct.

(I wouldn't be opposed to adding a new field for the actual import path, though.)

@mvdan
Copy link
Member Author

@mvdan mvdan commented Mar 21, 2021

Ah, gotcha, I may have misunderstood. Do you agree with just making TOOLEXEC_IMPORTPATH agree with go list -f {{.ImportPath}}? I can try to send a patch for that this cycle.

I don't object to other env vars like TOOLEXEC_PACKAGE or one for the actual import path, but those can always be added at a later time.

@bcmills
Copy link
Member

@bcmills bcmills commented Mar 22, 2021

I'm not sure that I really understand what use-cases TOOLEXEC_IMPORTPATH was added to address, so I don't know whether making it agree with go list -f {{.ImportPath}} would be the right choice.

That said, I really don't like the behavior of go list -f {{.ImportPath}}, so I have a hard time thinking that changing anything else to agree with it would be the right direction. 😅

@mvdan
Copy link
Member Author

@mvdan mvdan commented Mar 22, 2021

The use case is uniquely identifying a package. I use that to be able to locate it in the output of go list -json -test.

go list -f {{.ImportPath}} is weird, but it is unique. TOOLEXEC_IMPORTPATH is already not just a plain old import path, as I showed in the original post, as in the third case it prints foo_test instead of foo. So I do think the easiest option is to make the two align, even if it's confusing to call that string "import path" :)

The alternative is more env vars like TOOLEXEC_FOR_TEST, but then it's not clear how I can map "real import path + package name + for test" back into go list -json -test. At least, not until go list -json gains a "real import path" field.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants