proposal: cmd/go: introduce a build configurations file #39005
We describe a file format for specifying a list of build configurations, where build configurations are characterized by environment variables and command-line arguments for the build system.
Go has the notion of build tags, which control the set of files that make up a package under a given configuration. Tags can be user-defined and specified with the
Due to these tags, a single import path may effectively refer to a set of packages, each package differentiated by the active tags. While referring to a single build configuration is straightforward (by specifying the correct tags and environment variables), it is much more difficult to explore all relevant build configurations.
Many tools, however, would like to know the list of relevant build configurations, either for correctness reasons (static analysis) or for UI reasons (IDEs, …). A CI pipeline should execute the tests of all relevant build configurations, not just one. Static analysis tools such as staticcheck should analyze all relevant build configurations to detect issues under all viable code paths. Detecting unused functions needs to observe function calls under all relevant build configurations, not just one. A language server such as gopls needs to be able to provide accurate code intelligence and offer the user a list of build configurations to choose from. The list goes on.
Naively iterating through all unique combinations of tags quickly leads to combinatorial explosion. Go supports a dozen operating systems on a dozen CPU architectures, can be used with and without cgo support, and makes use of tags such as
In practice, however, only a small fraction of possible build configurations are actually relevant to the user. For example, a project may only be interested in actively supporting Linux and Windows on amd64, never use any of the standard library's tags, and only differentiate their build based on whether it is a debug build or not. This reduces thousands of build configurations down to four.
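The arithmetic behind "four" can be sketched in Go: two operating systems, one architecture, and a debug/non-debug switch multiply out to four configurations. The `Config` type, the `relevantConfigs` helper, and the `debug` tag name are assumptions made for illustration; the proposal does not define them.

```go
package main

import "fmt"

// Config is a hypothetical description of one build configuration:
// a target platform plus an optional user-defined build tag.
type Config struct {
	GOOS, GOARCH string
	Tags         string // empty means no extra tags
}

// relevantConfigs enumerates the cross product of the given dimensions.
func relevantConfigs(oses, arches, tagSets []string) []Config {
	var out []Config
	for _, goos := range oses {
		for _, goarch := range arches {
			for _, tags := range tagSets {
				out = append(out, Config{GOOS: goos, GOARCH: goarch, Tags: tags})
			}
		}
	}
	return out
}

func main() {
	cfgs := relevantConfigs(
		[]string{"linux", "windows"},
		[]string{"amd64"},
		[]string{"", "debug"}, // "debug" is an assumed, user-defined tag
	)
	for _, c := range cfgs {
		fmt.Printf("GOOS=%s GOARCH=%s tags=%q\n", c.GOOS, c.GOARCH, c.Tags)
	}
	fmt.Println(len(cfgs), "configurations")
}
```

Adding a single further dimension with two values doubles the count, which is why the unrestricted cross product grows so quickly.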
Since many tools would benefit from knowing the list of relevant build configurations, and because it cannot be determined automatically, it is desirable to be able to explicitly list relevant build configurations in a format that can be shared between different tools.
We propose a file format, as well as best practices for using files in this format.
The format is line-based, with each non-empty line describing a build configuration. A build configuration consists of a name, a (possibly empty) set of environment variable assignments, followed by a (possibly empty) set of command-line arguments.
Names are separated from environment variables and command-line arguments by a colon followed by a space. Names can consist of Unicode letters, Unicode numbers, dashes (
Quoted strings may be used for elements containing whitespace. The specific format for quoted strings will match that of GOFLAGS, which is currently TBD (see #26849). Names must not use quoted strings.
Syntactically valid examples include
A line is split into environment variables and command-line arguments at the first element that is not a valid environment variable assignment. Usually, this would be an element that begins with a dash, or one that does not contain an equal sign.
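A minimal sketch of the splitting rule in Go follows. It is not a reference implementation: quoted strings are ignored because their syntax is still TBD, name validation is reduced to "non-empty and free of whitespace", and the `BuildConfig` type, the helper names, and the sample line in `main` are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// BuildConfig is a hypothetical in-memory form of one line of the file.
type BuildConfig struct {
	Name string
	Env  []string // KEY=value assignments
	Args []string // command-line arguments, kept verbatim
}

// isEnvAssignment reports whether s looks like KEY=value. As a simplifying
// assumption, a key consists of ASCII letters, digits and underscores and
// does not start with a digit, so "-tags=debug" is not an assignment.
func isEnvAssignment(s string) bool {
	key, _, ok := strings.Cut(s, "=")
	if !ok || key == "" {
		return false
	}
	for i, r := range key {
		switch {
		case r == '_', r >= 'A' && r <= 'Z', r >= 'a' && r <= 'z':
		case r >= '0' && r <= '9' && i > 0:
		default:
			return false
		}
	}
	return true
}

// parseLine splits a line into name, environment variables and arguments.
// Everything from the first element that is not a valid assignment onward
// is treated as a command-line argument.
func parseLine(line string) (BuildConfig, error) {
	name, rest, ok := strings.Cut(line, ": ")
	if !ok {
		// Assumption: a configuration without variables or arguments
		// may be written as just "name:".
		trimmed := strings.TrimSpace(line)
		if !strings.HasSuffix(trimmed, ":") {
			return BuildConfig{}, fmt.Errorf("missing %q separator in %q", ": ", line)
		}
		name, rest = strings.TrimSuffix(trimmed, ":"), ""
	}
	if name == "" || strings.ContainsAny(name, " \t") {
		return BuildConfig{}, fmt.Errorf("invalid name %q", name)
	}
	fields := strings.Fields(rest) // no quoting support in this sketch
	i := 0
	for i < len(fields) && isEnvAssignment(fields[i]) {
		i++
	}
	// The three-index slice keeps appends to Env from overwriting Args.
	return BuildConfig{Name: name, Env: fields[:i:i], Args: fields[i:]}, nil
}

func main() {
	// The line below is an invented example, not one taken from the proposal.
	cfg, err := parseLine("linux-debug: GOOS=linux GOARCH=amd64 -tags=debug")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("name=%s env=%v args=%v\n", cfg.Name, cfg.Env, cfg.Args)
}
```

Note that once the first argument has been seen, later elements that happen to contain an equal sign stay arguments; in this sketch `x: -v FOO=bar` yields no environment variables and two arguments.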
The process environment described in a build configuration is merged with the existing environment, with the existing environment taking precedence. Command-line arguments will be passed to the build system verbatim, but tools are free to add additional arguments, and it is not specified whether tools pass their own arguments before or after the arguments specified in a build configuration.
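The merge rule can be sketched as follows, assuming both environments are lists of `KEY=value` strings in the shape that `os.Environ` returns. The `mergeEnv` helper is a hypothetical name; the only behaviour taken from the proposal is that a variable already present in the existing environment wins over the one in the build configuration.

```go
package main

import (
	"fmt"
	"strings"
)

// mergeEnv returns the existing environment plus every assignment from
// configured whose key is not already set in existing.
func mergeEnv(existing, configured []string) []string {
	seen := make(map[string]bool, len(existing))
	merged := make([]string, 0, len(existing)+len(configured))
	for _, kv := range existing {
		key, _, _ := strings.Cut(kv, "=")
		seen[key] = true
		merged = append(merged, kv)
	}
	for _, kv := range configured {
		key, _, _ := strings.Cut(kv, "=")
		if !seen[key] {
			seen[key] = true
			merged = append(merged, kv)
		}
	}
	return merged
}

func main() {
	existing := []string{"PATH=/usr/bin", "GOOS=darwin"}
	configured := []string{"GOOS=linux", "GOARCH=amd64"}
	// GOOS stays "darwin" because the existing environment takes precedence.
	fmt.Println(mergeEnv(existing, configured))
	// prints "[PATH=/usr/bin GOOS=darwin GOARCH=amd64]"
}
```

A tool could assign the result to the `Env` field of an `exec.Cmd` before invoking the build system.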
The format itself puts no restrictions on allowed environment variables or command-line arguments. However, it is strongly advised not to modify the workspace itself. That is, variables such as
Different tools have different requirements and may make use of files in this format in different ways, but they should keep the following points in mind.
Tools should allow specifying a file, but they may look for a default file name.
There are various reasons why a project may use more than one build configuration file. For example, it may want to build binary releases for only a small set of first-class platforms, while still running static analysis for more platforms, to future-proof their code. Or, a parameter that meaningfully differentiates binary builds does not contribute anything to static analysis: compiling with and without
Therefore, tools should allow selection of the file to use.
It is, however, desirable to agree on a default file name to look for, so that every tool needn't be configured manually, especially for projects that can make do with a single file, and so that tools can use build configuration files by default. The default file is located at the top of the project, for example the top of a Go module. For build systems that do not have a notion of projects, such as Go in GOPATH mode, we don't define a default location at this moment.
Most tools should deduplicate build configurations to avoid unnecessary work
For most tools, it makes no sense to execute duplicate configurations. However, duplicate configurations may occur from concatenating files, or from on-the-fly generators that do not deduplicate configurations themselves. Therefore, tools should only execute unique configurations.
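One possible way to deduplicate is sketched below. The proposal does not define when two configurations count as duplicates, so this sketch makes an assumption: two configurations are equal when they have the same set of environment assignments (in any order) and the same arguments in the same order, regardless of their names. The `BuildConfig` type and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// BuildConfig is a hypothetical in-memory form of one line of the file.
type BuildConfig struct {
	Name string
	Env  []string
	Args []string
}

// configKey builds a comparison key. Environment assignments are sorted
// because their order is assumed not to matter; argument order is kept.
func configKey(c BuildConfig) string {
	env := append([]string(nil), c.Env...) // copy so the input is not reordered
	sort.Strings(env)
	return fmt.Sprintf("%q %q", env, c.Args)
}

// dedupe keeps the first occurrence of every distinct configuration.
func dedupe(cfgs []BuildConfig) []BuildConfig {
	seen := make(map[string]bool, len(cfgs))
	var out []BuildConfig
	for _, c := range cfgs {
		k := configKey(c)
		if seen[k] {
			continue
		}
		seen[k] = true
		out = append(out, c)
	}
	return out
}

func main() {
	cfgs := []BuildConfig{
		{Name: "a", Env: []string{"GOOS=linux", "GOARCH=amd64"}},
		{Name: "b", Env: []string{"GOARCH=amd64", "GOOS=linux"}}, // same as "a"
		{Name: "c", Env: []string{"GOOS=windows", "GOARCH=amd64"}},
	}
	for _, c := range dedupe(cfgs) {
		fmt.Println(c.Name)
	}
}
```

Keeping the first occurrence means that, after concatenating files, the name from the earlier file is the one a UI would show.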
Tools should allow using the current build configuration
While tools may use existing build configuration files by default, they should also allow executing the active build configuration as specified by the user's current environment. In its simplest form this would be by ignoring build configuration files and operating as tools did before implementing this proposal. It may also take the form of manually or automatically appending the current configuration to the list of configurations to execute. For example, when executing staticcheck, the user would assume that their active configuration will be used, regardless of other configurations that may be used as well.
Tools may allow using specific build configurations
Depending on the tool, it may be useful to allow selection of individual build configurations, for example by their name.
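Selection by name could look like the sketch below. The `BuildConfig` type and `selectByName` are hypothetical, and returning the first match when a name occurs more than once is an assumption; the proposal does not say whether names must be unique.

```go
package main

import "fmt"

// BuildConfig is a hypothetical in-memory form of one line of the file.
type BuildConfig struct {
	Name string
	Env  []string
	Args []string
}

// selectByName returns the first configuration with the given name and
// reports whether one was found.
func selectByName(cfgs []BuildConfig, name string) (BuildConfig, bool) {
	for _, c := range cfgs {
		if c.Name == name {
			return c, true
		}
	}
	return BuildConfig{}, false
}

func main() {
	cfgs := []BuildConfig{
		{Name: "linux", Env: []string{"GOOS=linux"}},
		{Name: "windows", Env: []string{"GOOS=windows"}},
	}
	if c, ok := selectByName(cfgs, "windows"); ok {
		fmt.Println(c.Name, c.Env)
	}
}
```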
The line-based file format
The proposed format is the simplest imaginable format for describing a list of build configurations: it contains one line per configuration, with the configurations explicitly spelled out. Notably absent are any form of scripting, conditionals, or maths. For example, there is no automatic way of expressing all builds
This simplicity provides several benefits:
Most users will be content typing these files by hand. Projects with many dozens of similar build configurations, however, may opt to generate them instead, which is easy via
The line-based nature of the format makes it easy to manipulate with standard UNIX tools. Most notably, multiple files can simply be concatenated. They can also be sorted, grepped, and so on. This suggests the possibility of preprocessors. For example, a simple script could process a CI log file and filter a list of builds down to those that have failed.
We include mandatory build configuration names to aid the implementation of good UIs. An editor may display these names instead of the actual configuration, and command-line tools may support executing build configurations specified by name.
Alternatives for specifying names
We explored two other ways of specifying names:
The first way loses the nice attribute that each build is described by a single line, and introduces ambiguities such as
or the issue that a file may end with a name, which affects how the concatenation of two files is interpreted.
Even without these issues, users may confuse this syntax with general comments and attempt writing something like this:
The second way lost simply because names were no longer aligned.
Preferring the user's environment
Given that the build configuration specifies environment variables, there are three ways in which they can be applied:
Option 1 is not viable. The user environment contains many important variables that cannot be discarded and will not be defined by the build configuration, such as
Options 2 and 3 differ only in which value has higher precedence: the one in the file, or the one in the user’s environment. We believe that option 3 is overall the better option. It matches the common understanding that the environment is the most specific to a single invocation of a program, more specific than a configuration file. It also allows users to use a build configuration but change details of it, such as using the
build configuration, but changing
Both options 2 and 3 mean that the build configuration is not pure, since it is affected by the user’s environment. This is not a problem. This is already the case for all invocations of Go tools, and well-designed CI environments already account for important variables. Additionally, most environment variables that are worth setting in a build configuration are not normally defined in the user’s environment, unless the user explicitly wishes to override a default.
No restrictions on variables and command-line arguments
We do not restrict the format to only specifying environment variables and tags; instead, all command-line arguments are permitted. This makes the format useful for more tools than just static analysis. For example, a tool that builds binary distributions of projects might benefit from flags such as
We do not attempt to implement a whitelist of environment variables, as different build systems use different environment variables. Even the list of environment variables that affect Go is so long that it would be easy to miss some of them, such as
We do not restrict command-line arguments to valid flags because, again, we do not know what the build system considers valid flags, nor what syntax it uses for passing flags. We count on users not to abuse this mechanism. For example, one concern is that someone might use the
The primary open issue is finding a name for the default build configuration file. Lacking a concrete suggestion at this point in time, we impose the following requirements for deciding a name:
Additional desirable attributes that have been asked for:
I don't really like the idea of having a file that causes tools to run with arbitrary environments and I don't really understand why it's needed here. ISTM that by putting flags into the build configuration, we already make that configuration specific to a certain build system - bazel or others don't really have a
The other comment I have would be: Why not Make? I'm not a huge fan of needing Makefiles to build code, but ISTM that all this format needs to be a Makefile would be some linebreaks and tabs and a "go build" in the right spot. So, if we have this file anyway, why not converge on something existing?
A given file would be specific to a build system, yes. Projects don't usually use multiple build systems. But the file format would be generic enough that any project could use it, regardless of chosen build system.
Because executing a program multiple times isn't always a viable solution. For example, staticcheck wants to know the different combinations of GOOS, GOARCH and tags, in a single invocation. And
This proposal is much more about tools being able to discover build configurations than it is about spawning executables.
I'm ashamed to say that I did not consider the security implications of this. We have similar problems with changing GOPATH or GOBIN or GOPROXY/GOSUMDB and probably many others. I don't have a solution to this.
I still don't understand this. You mention an example file:
To clarify: Would that file look exactly the same if you'd use bazel, for instance? If so, how would I, as a project maintainer, use this to build my project?
Or would the file in this case look more like
and in that case, how would a tool that has never heard of bazel translate that into useful information?
I realize that I come off as intensely negative here, so I want to clarify: I'd really like having a mechanism like this. As elegant as I find the tags- and GOARCH/GOOS approach to conditional compilation the go tool has taken, the exact issues you are mentioning and are trying to solve have always bugged me. I'm all for explicitly listing a set of valid build configurations. I just don't see how build configurations can be listed agnostic to the build system used, while staying even remotely declarative.
If, OTOH, we were to restrict this to the go tool itself, this would open up possibilities for a purely declarative format. If, say, you could only specify combinations of
Of course, that would also mean that this file can't be used if your project can't be built with the go tool. Personally, I find that a small price to pay for far more well-defined semantics; at the end of the day, I don't really see it as the job of the go project to define its interactions with any and all third-party tools out there. And it would still be possible for projects like bazel or the like to provide a way to programmatically generate this new build format from a
So, anyway, it's because of how much I'd like a solution to this problem that I'm trying to hammer the suggested solution into something I'd consider more workable :)
The file would look different depending on what build system your project uses. This is the same approach that https://pkg.go.dev/golang.org/x/tools/go/packages takes, if you look at its
The main use case here is tooling, so it makes sense to me that the design should be compatible with
If you use
This is all a bit theoretical at the moment, as
For what it's worth, we did consider a more constrained and less generic format at first, where we could interpret each "build configuration" statically. However, that didn't make sense to us because it would mean only supporting a single build system. It could make sense for
Also, I concur with the sentiment that this is a problem really worth solving, but it's also really hard to solve well. I also worry about potentially malicious or costly files being picked up automatically. Personally speaking, I don't care about build systems other than Go's own, but I don't think it's good long-term planning for tooling.
A few loosely related thoughts:
@mvdan Ah, so the intended way for tools to consume this file is to basically pass it through as an input for