Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Go does not download toolchains when the cachi2 Go version is newer than the specified toolchain #501

Closed
taylormadore opened this issue Mar 19, 2024 · 4 comments

Comments

@taylormadore
Copy link
Contributor

taylormadore commented Mar 19, 2024

Cachi2 does not provide a toolchain in GOMODCACHE when the installed version of Go in the cachi2 image is the same or newer than specified by the toolchain directive in go.mod. This behavior of Go is noted in the documentation for Go toolchains [1]:

the go command uses its own bundled toolchain when that toolchain is at least as new as the go or toolchain lines in the main module or workspace.

This may cause an issue when performing a hermetic build from the cachi2-output where the GOTOOLCHAIN environment variable is set to auto. If the builder image does not have a version of Go matching or exceeding that specified by the toolchain directive, the build may fail when trying to fetch the toolchain.

References

[1] https://go.dev/doc/toolchain
[2] #419 (Go 1.21 design doc)

@eskultety
Copy link
Member

Hmm, I wonder if we can ultimately provide a generic fix that would cover all user scenarios as consumers must not blindly assume we'll deal with everything, they need to cooperate by selecting an image that on its own can attain what they want, i.e. building the project. Consider the following scenarios and go.mod contents:

# go.mod
   go 1.21
   toolchain 1.21.5

# project base image
$ go version
go version go1.21.4 linux/amd64

# cachi2
$ go version
go version go1.21.6 linux/amd64

Because cachi2 has a newer toolchain, 1.21.5 won't be downloaded, we set GOTOOLCHAIN=auto and the project build will fail because the builder image will try installing the newer toolchain which we didn't prefetch. In this case it may be reasonable for us to compare the versions and fallback to GOTOOLCHAIN=local and pass handling of the build problems, e.g. stemming from potential version mismatch in the builder image to the user. Since toolchain isn't enforced it seems like any 1.21 toolchain would successfully build the project with GOTOOLCHAIN=local and hence this would be a reasonable tradeoff for us.

# go.mod
   go 1.21.5

# project base image
$ go version
go version go1.21.4 linux/amd64

# cachi2
$ go version
go version go1.21.6 linux/amd64

This is a user problem, because they strictly mandate go 1.21.5 , yet the base image they build with doesn't even comply with cachi2 out of the picture (note that cachi2 in this case has a good argument of using whatever the latest Go version is). This would fail either way (unless we employ one of the suggestions below), so I wonder whether it's okay for us to do nothing and let it fail with a more confusing error about not being able to download the toolchain because of GOTOOLCHAIN=auto or again set it to GOTOOLCHAIN=local and make it a user error, but likely with a more descriptive error.

Despite the above, there are options for us to go more than one step further and basically walk the consumers by the hand through the whole process taking care of their own problems:

change our base image and be in full control of the Go version in it

This is already being discussed here, so a TL;DR is that we'd install the first minor release of all supported Go SDKs (e.g. go1.21.0) and always set GOTOOLCHAIN=auto for container-based environments since any 1.21 toolchain would meet the criteria and hence we'd prefetch anything newer. The downside to this (aside from maintenance and Containerfile changes) is that we'd potentially be using vulnerable code for the prefetch as I really don't know what toolchain is used for go mod download with GOTOOLCHAIN=auto as this is quite opaque. If all operations are performed by the new toolchain, then the security aspect essentially becomes a non-issue (or at least not that pressing), but if it's only used for actually building the project, but the old toolchain is used for module fetching, then it may become an aspect to consider.
That all said, keep in mind that ^this would only handle container-based environments and for local host deployments we'd still need do it differently, e.g. that variable handling mentioned earlier.

download any toolchain explicitly

We do have a code foundation to download toolchains already, but we decided to only ever fall back to using that code path with local environments rather than container based environments. The arguments against using this solution in container-based envs are:

  • downloading whole Go SDKs poses a non-negligible performance penalty
  • we're not in charge of all environment variables exposed in the container so one of the default locations either go download or go install uses might not be actually writable to the default user
  • SDKs downloaded via go download are different from the behaviour of GOTOOLCHAIN=auto in that the SDKs are not stored in GOMODCACHE as the latter would do so we'd still need to figure out how to get it there manually
    While definitely an option I'd personally really like to avoid this one if possible.

This was referenced Mar 20, 2024
@eskultety
Copy link
Member

  1. ...
    In this case it may be reasonable for us to compare the versions and fallback to GOTOOLCHAIN=local and pass handling of the build problems, e.g. stemming from potential version mismatch in the builder image to the user. Since toolchain isn't enforced it seems like any 1.21 toolchain would successfully build the project with GOTOOLCHAIN=local and hence this would be a reasonable tradeoff for us.

In retrospect, ^this won't work with submodules or workspaces if each submodule requires a different version of Go by the means of the go line, because we'd only ever set GOTOOLCHAIN once (if at all).

So I think as discussed via private channels the only reasonable fix for this is to always use GOTOOLCHAIN=auto for fetching only (using 1.2X.0) as the base toolchain for the job to make sure we always prefetch whatever toolchain a project/module requires.

At the same time, we'd stop forcing any GOTOOLCHAIN setting onto the users' container build process by the means of env variables generation and let them either define it explicitly or let them used whatever the default is in the respective base image (now that we'd use GOTOOLCHAIN=auto on our own for the prefetch) they wish to use (will update the design with this remark).

@taylormadore any objections to the paragraph above^?

@taylormadore
Copy link
Contributor Author

any objections to the paragraph above^?

Nope, looks good to me 👍. Thank you for all of your efforts on this feature ❤️

@eskultety
Copy link
Member

Resolved with e0732c7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants