cmd/go: stamp git/vcs current HEAD hash/commit hash/dirty bit in binaries #37475
Comments
This may be a duplicate of #29814. |
Maybe we'll auto-bump this with a bot over time. See golang/go#37475 & golang/go#29814 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We should figure out exactly what we want to record. It would be helpful to time how much overhead this would be in 'go build'. @bcmills, do you have any numbers about how much time this would add? |
BTW I agree it's a duplicate of #29814 but I'll keep using this one because it is marked as a proposal and already appeared in the minutes. |
Hmm, I realized that I didn't account for checking tags in the above calculations. Still, I expect those costs will be order-of-magnitude similar to any other |
I'd like to point out a (maybe small) problem with this approach: changing the version of the source code, but not the code itself, will cause the binary to change. Let me explain the use case I have that will be broken by this change:
If the Helm chart contents and the binaries don't change, then no upgrades are performed by Kubernetes. If the version of the checkout is stamped into every Go binary, then this scheme crumbles and:
|
@dottedmag, what if we made it conditional on importing a new package, say Would that work for your use case? |
I'm in a practically identical case to @dottedmag. Inevitably, somewhere in the monorepo we will (perhaps unintentionally) bring in a dependency that depends on a magic package that breaks deterministic builds. I think for most common cases, having this proposal enabled by default would be preferable. For my use case, I would be satisfied if there was a documented way to opt out of it. We already are using |
I agree with @mark-rushakoff: relying on imports will be brittle unless this import is considered only for It's not an author of some recursively included library, but a builder of a final binary who is in a position to decide whether to put versioning information into the binary or not. |
@dottedmag, note that many functionally-equivalent builds will already produce slightly different binaries due to the version-stamping for This proposal would cause more of the same sort of version churn, but it is fundamentally the same churn. That suggests that we may want to provide an option to disable version stamping in general. IMO, that should be a separate proposal. |
True. In practice, it is not a problem as changing the versions of dependencies nearly always changes the code of dependencies — nobody is updating versions of dependencies endlessly for no reason, usually, they only get updated to get a new feature or a bugfix. Filed #37693. |
The discussion above about reproducible builds sounds like it would be satisfied by having the version embedded by default but also having an opt-out command-line flag; no special package needed. Do I have that right, @dottedmag and @mark-rushakoff? |
Yes, I think a flag to opt out of embedding version details would suffice for reproducible builds. It would be nice if there was a single flag like I don't care about reproducible builds when I'm at the command line building something for my own use; I care about reproducible builds when I am writing build scripts that run as part of a CI/CD pipeline, so it is not a big deal if I need to look up the whole collection of settings to make those builds reproducible. |
@rsc Correct. |
OK, it sounds like everyone agrees about doing this by default, with a flag to turn it off.
That buildid step could install the git version info too. I'm confident git will be faster than the link. Based on the discussion, then, this seems like a likely accept, although we may not be able to implement it until the next release (Go 1.16). |
Will this include just the commit hash or also a (any?) version tag? |
Version tags introduce many sharp edges, at least for git… since you can have a git repo cloned without having fetched all tags, or can have different local tags, or can add a tag to a SHA at any point in time, using the nearest tag (e.g. |
@liggitt, note that that problem only occurs in one direction. (It is fine in general to have N names for one commit. The important property is that we resolve only one commit for a given name.) The I realize that that algorithm makes things awkward for the |
Since tags can be local, preferring a tag over a sha would mean two users could have identical local tags associated with different SHAs. Or are you suggesting the mapping of shas to tags would not be determined by consulting the local VCS, but the remote canonical VCS? |
That seems like a reasonable motivation to include both the commit hash and semantic version, rather than just one or the other. |
No change in consensus, so accepted. We may still need to work out exactly what to include, but everyone seems to agree that this is worth doing (barring some discovery about it being more expensive than we think). |
Temporary fix until golang#37475 is done.
Tentatively marking this for 1.16. We'll try to get this in, but we have a lot planned, and I can't promise it will get done. If someone is interested in working on this, I'll sketch out what needs to be done. This is complicated though, so probably not a good first issue.
|
The `$Id$` which can be auto-expanded in files via the `ident` attribute does not function the same as the old CVS `$Id$` keyword. In CVS, the keyword expansion was updated on every commit, to contain the current commit id. In git, it is expanded with the identifier of the _blob it is found in_. That is, previous to this commit, the `$Id$` (and hence the reported "version hash") was f220e479c5d8d85c7b753e95dc5fe0b67bbfbd38 -- and had been since the file was changed last, in f386360.

Remove the misleading hash, and attendant git attributes file. It will frequently mislead callers that the version has not changed when, in reality, it has.

Git itself does not have a way to embed "the current commit hash" into a file in a way that is updated whenever the commit changes. Nor does Go natively have a way to embed it into the binary at build time, though this may change in the future[1]. As alluded to in that ticket, most projects elect to pass in the build-time commit information via something like:

```
GIT_COMMIT=$(git rev-parse HEAD)
go build -ldflags "-X main.gitCommit=${GIT_COMMIT}"
```

However, smokescreen does not currently have any build system external to `go build` which could embed the above logic. Rather than choose a build system and introduce a new dependency, remove the misleading hash entirely.

[1] golang/go#37475 and also golang/go#29228
As per the discussion in golang/go#41145, it turns out that we don't need special support for build caching in -toolexec. We can simply modify the behavior of "[...]/compile -V=full" and "[...]/link -V=full" so that they include garble's own version and options in the build cache key. We add a number of things to the -V=full output. First, "+garble" since that is the relatively unique name of the Go program. Second, the version of Garble itself. Since we can't do this via modules until golang/go#37475, we instead use the hex-encoded sha256 of our own binary. Finally, we need to add the garble options which modify how we obfuscate code, since each should result in different build cache keys. GOPRIVATE also affects caching, since a different GOPRIVATE value means that we might have to garble a different set of packages. This feature works, with only a minor regression in the ldflags test since the -X linker flag is now broken with private names. The following commit will fix that.
(Related but different than #35667)
cmd/go currently embeds all the module dep information in binaries and it's readable with e.g. https://godoc.org/rsc.io/goversion/version but it does not include any information about the top-level module's version.
I propose that cmd/go look at {git,svn,etc} state and include in the binary:
Currently many projects do this by hand with a `build-program.sh` and stamping it manually with `--ldflags=-X foo=bar`, but that means programs built the normal Go way lack that information, and people end up with non-portable (shell, often) build scripts.
I've hit this enough times with my own projects that it's actively frustrating me. It's worse when programs are clients that want to report their version number to a server (which might want to do analytics, build horizon enforcement, protocol version negotiation, etc) and then can't. There are alternative ways to do all that, but they're tedious.
Mostly I'm concerned that people have bespoke, often non-portable build scripts.
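For context, the manual stamping workaround usually looks something like the sketch below. The variable name `gitCommit` and the helper `versionString` are hypothetical; the only real mechanism is the linker's `-X importpath.name=value` flag, which overwrites a package-level string variable at link time.

```go
package main

import "fmt"

// gitCommit is a hypothetical variable name. It keeps this default
// unless the linker overrides it via -ldflags "-X main.gitCommit=...".
var gitCommit = "unknown"

// versionString reports the stamped commit, or the default if the
// binary was built with a plain `go build`.
func versionString() string {
	return "commit " + gitCommit
}

func main() {
	fmt.Println(versionString())
}
```

Building with `go build -ldflags "-X main.gitCommit=$(git rev-parse HEAD)"` stamps the hash, while a plain `go build` or `go install` leaves the default in place, which is exactly the gap described above.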