cmd/go: allow package authors to mark older package versions as insecure #24031
In order to achieve reproducible builds vgo keeps using specific package versions until an explicit upgrade is done. IMHO this is an excellent default but I'm worried about insecure package versions as currently vgo can't detect if the build contains an insecure package version.
Can vgo be changed so that a package author is able to specify that every version below X is deemed insecure and if an insecure package version is used during a build that the build will fail (with a flag to override)?
@michael-schaller I'm not sure what new functionality you are asking for.
Right now vgo will not choose a version "below" a version you specify. So if there is an insecure package version, put the minimum version selector in your package or another package and it will not choose it. For modules that build main packages, you can also specify version ranges to exclude. Maybe I'm missing something?
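As a rough illustration of the behavior described above, here is a simplified sketch of picking one module's version from several minimum requirements and an exclusion list. It is not vgo's actual implementation: `pick`, `less`, `mustParse`, and the version lists are hypothetical, and pre-release and build metadata are ignored.

```go
package main

import "fmt"

// mustParse turns "v1.2.3" into its three numeric parts.
// Hypothetical helper: it panics on malformed input and ignores anything
// after the patch number.
func mustParse(v string) [3]int {
	var p [3]int
	if _, err := fmt.Sscanf(v, "v%d.%d.%d", &p[0], &p[1], &p[2]); err != nil {
		panic(fmt.Sprintf("malformed version %q: %v", v, err))
	}
	return p
}

// less reports whether version a sorts before version b.
func less(a, b string) bool {
	pa, pb := mustParse(a), mustParse(b)
	for i := range pa {
		if pa[i] != pb[i] {
			return pa[i] < pb[i]
		}
	}
	return false
}

// pick returns the version a build would use for one module: the highest of
// the requested minimums, moved up past any excluded versions.
// available must be sorted in ascending order.
func pick(requested, available []string, excluded map[string]bool) (string, bool) {
	if len(requested) == 0 {
		return "", false
	}
	best := requested[0]
	for _, r := range requested[1:] {
		if less(best, r) {
			best = r
		}
	}
	for _, v := range available {
		if !less(v, best) && !excluded[v] {
			return v, true
		}
	}
	return "", false
}

func main() {
	available := []string{"v1.0.5", "v1.1.2", "v1.2.0", "v1.2.5"}
	requested := []string{"v1.0.5", "v1.1.2"} // minimums from two dependents

	v, _ := pick(requested, available, nil)
	fmt.Println("without exclusions:", v)

	v, _ = pick(requested, available, map[string]bool{"v1.1.2": true})
	fmt.Println("excluding v1.1.2:", v)
}
```

Note that the sketch never jumps to the newest release on its own, which is the reproducibility property the thread is discussing, and also why an insecure version stays selected until someone raises a minimum or adds an exclusion.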
This only works if you know that a certain version is insecure. I think what he's asking for is a mechanism for package authors to broadcast to the world that a certain version is insecure; so that every time a user pulls it, they'll be warned that that version is deprecated and they'll know to update their mod file.
@ALTree correct. :-)
One naive idea would be that 'vgo build' could check the 'go.mod' (or another machine readable file) of the latest package versions for security information. This would also be great for Continuous Integration as then a package author could notify of security issues via CI build failures that are (hopefully) monitored.
@rsc mentioned Deprecated Versions (as part of the Defining Go Modules article) which is similar to this issue. He proposed to append +deprecated to a version tag which would also be a viable solution for this issue if +insecure would be honored by vgo.
IMHO that would be a pretty bare-bones solution though, as I presume that people would soon want to extend it further. For instance, I could see that someone would also want +buggy for a version with a serious bug (for example a serious memory leak) or +broken for a version that is broken under certain circumstances (for example the Windows build is broken). Furthermore, this solution lacks a way to add more context: for deprecated versions one might be interested in the deprecation announcement or timeline, and for insecure versions one might be interested in CVE, severity, ... and so on.
That said I think signaling via tags if a version is deprecated, insecure, ... is not adequate. Maybe even the proposal from my previous comment isn't. Maybe the discussion should rather go into the direction of a machine readable changelog which could be managed via vgo release.
I'm not sure about the utility of this feature. If I release version 1.2.3, surely it must be fixing some bugs over 1.2.2, and I would probably mark 1.2.2 as insecure. In other words, on almost every (point) release, I would mark all older versions as insecure. You might say this is only for bugs with a CVE, but I think the point still stands.
I don't see this providing much over just reporting that newer versions have been published.
@uluyol there are actually multiple points for this feature (which I sadly failed to point out so far):
I see. I agree that tags lack content, and that it would be useful to also have a reason (or changelog) listed for why a version is insecure. Then on build you could get messages like
build failed: insecure packages
I can see why this would be useful. I do think that we'd want to let people override this behavior though.
I agree that additional in-band signalling would be a useful way to let the maintainer know that action may need to be taken, but I disagree that every situation requires automatic, mechanical action.
Security fixes are important but automatically applying them can be as fraught as always ignoring them.
If the module in question is for a non-open-source essential service using foo v1.0.5 and there's a foo v1.1.4+security, that needs to be immediately investigated by a person. However, if it's fixing a vulnerability introduced in foo v1.1.0, it may not necessarily be worth the effort and risk to drop everything and upgrade right now.
I would prefer if vgo would continue to work the way it does today, with another tool, perhaps
The idea is that this tool is run on-demand by the programmer, rather than automatically. If this tool is easy to use,
The purpose of vgo is to add versions to the vocabulary of the toolchain, so that users and tools can talk to each other sensibly about versions. As I mentioned in the https://research.swtch.com/vgo-module article, I think it would make sense to have a v1.2.3+deprecated tag, using an annotated tag so that there's a commit message. The commit message can say anything it wants about why the release is deprecated, and we can show that to users. We could easily add a notation in the text for identifying security problems. What happens next is up to tools. Probably vgo list -m -u (tell me about pending module updates) would do well to show information about currently-used modules that have deprecation notices.
changed the title
x/vgo: allow package authors to mark older package versions as insecure
Jul 12, 2018
I've been thinking a bit about where to write down this information. The magic extra tag is clearly too limited in what it can record. I looked briefly into finding a way to write more information, such as using an annotated tag's commit message in Git, using svn propset to record a special per-revision property in Subversion, and so on. But something that must be reinvented for every different version control system is a bad idea.
Of course, we can't write the information in the original module version's go.mod, since we didn't know it was insecure when we tagged it, and the file tree is by convention (and enforcement via go.sum) immutable after tagging.
But maybe we can record it in a go.mod in a later release of the same module. Specifically, we could say that to look for updated post-release metadata about a particular module we grab the latest version's go.mod and look there. So for example, suppose v1.1.2 has a security problem, it was fixed in a rewrite for v1.2.0, and we're up to v1.2.4 when we discover the problem. Then we'd issue a v1.2.5 that is just v1.2.4 with an updated go.mod that adds something like:
The fields are "bug", the affected package (if you don't use this package you don't have the bug), the half-open version range when the bug existed, a URL with more information, and a short description. Maybe a security bug would conventionally begin with a "security: " prefix in the description.
Then any future "go get", even one not asked about that module, would look up the latest version, find v1.2.5, learn about the bug in v1.1.2, and print a warning. Also, we could make this information available to running programs, which could inspect their own binaries for the package and version and then self-diagnose on a server status page, automatically report to local monitoring systems, and so on. (We've done something like this inside Google since early 2013 and it works really well.)
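For the self-diagnosis idea, a minimal sketch is possible with the standard `runtime/debug.ReadBuildInfo`, which reports the module versions a binary was built with. The advisory table `knownBad` and the helper `check` are hypothetical stand-ins for whatever data the latest go.mod would carry; the module path and URL are made up.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// knownBad is hypothetical advisory data keyed by "module@version".
var knownBad = map[string]string{
	"example.com/foo@v1.1.2": "security: hypothetical issue, see https://example.com/advisory",
}

// check looks up one module version in the advisory data and returns the
// matching note, if any.
func check(path, version string) (string, bool) {
	note, ok := knownBad[path+"@"+version]
	return note, ok
}

func main() {
	// ReadBuildInfo only succeeds for binaries built with module support.
	info, ok := debug.ReadBuildInfo()
	if !ok {
		fmt.Println("no module build information embedded in this binary")
		return
	}
	for _, dep := range info.Deps {
		if note, bad := check(dep.Path, dep.Version); bad {
			fmt.Printf("warning: %s %s: %s\n", dep.Path, dep.Version, note)
		}
	}
}
```

A server could run the same loop in a status handler and report the warnings to its monitoring system instead of printing them.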
If we later decided to issue a v1.1.3 with that fix, we could issue a v1.2.6 that only updates go.mod:
If we wanted to warn people about the bug but didn't have time to fix it yet, or the bug has been there from the beginning, the half-open interval can drop either side:
The same general idea could apply to marking earlier versions deprecated or to reporting known conflicts with other dependencies.
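To make the half-open range concrete, here is a small sketch of how a tool might decide whether a given version falls inside a recorded bug range, including the cases where either side is dropped. The `Bug` type, its field names, and the example values are assumptions for illustration and do not reflect a settled go.mod syntax; version parsing is deliberately simplistic.

```go
package main

import "fmt"

// Bug is a hypothetical record with the fields described in this comment.
type Bug struct {
	Package     string // affected package
	Introduced  string // first affected version, inclusive; "" means "from the beginning"
	Fixed       string // first fixed version, exclusive; "" means "not fixed yet"
	URL         string // where to read more
	Description string // short summary, e.g. starting with "security: "
}

// mustParse turns "v1.2.3" into its numeric parts and panics on bad input.
func mustParse(v string) [3]int {
	var p [3]int
	if _, err := fmt.Sscanf(v, "v%d.%d.%d", &p[0], &p[1], &p[2]); err != nil {
		panic(fmt.Sprintf("malformed version %q: %v", v, err))
	}
	return p
}

// less reports whether version a sorts before version b.
func less(a, b string) bool {
	pa, pb := mustParse(a), mustParse(b)
	for i := range pa {
		if pa[i] != pb[i] {
			return pa[i] < pb[i]
		}
	}
	return false
}

// Affects reports whether version lies in the half-open range
// [Introduced, Fixed), treating an empty bound as unbounded.
func (b Bug) Affects(version string) bool {
	if b.Introduced != "" && less(version, b.Introduced) {
		return false
	}
	if b.Fixed != "" && !less(version, b.Fixed) {
		return false
	}
	return true
}

func main() {
	bug := Bug{
		Package:     "example.com/foo/bar",
		Introduced:  "v1.1.2",
		Fixed:       "v1.2.0",
		URL:         "https://example.com/advisory",
		Description: "security: hypothetical issue",
	}
	for _, v := range []string{"v1.1.1", "v1.1.2", "v1.1.9", "v1.2.0"} {
		fmt.Printf("%s affected: %v\n", v, bug.Affects(v))
	}
}
```

Leaving `Fixed` empty models "we know about the bug but have no fix yet", and leaving `Introduced` empty models "the bug has been there from the beginning".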
It's slightly awkward to be issuing go.mod-update-only patch releases, but doing so creates a history of the annotations and makes them available via module proxies without special arrangement.
All of this is still sketchy but the above seems like it's on a better path than just the +deprecated tags.
@rsc for some clarification, when should the bug directive be used? Only when there are security issues? Security issues plus deprecation? If deprecation, is there a common test people should use for marking something as such (e.g., every patch release)? Every bug?
Note, for every bug I poked at a couple repos I've had to deal with:
I tried to look at a sampling of typical and worse case scenarios.
@mattfarina If I have followed, I believe @rsc has referred (e.g., here and elsewhere) to this particular issue #24031 as potentially also being part of the solution for recording pair-wise incompatibility post publishing:
And in #24031 (comment), towards the end of that more recent comment (which was mostly using security or a general bug as an example), Russ also added:
That said, the one-liner here is currently:
"cmd/go: allow package authors to mark older package versions as insecure"
If the intent of this particular issue #24031 is broader than security, it might make sense to update the one-liner to help people know where to discuss which topic (vs. maybe #26829 is the better place to discuss recording incompatibilities, or ___).
Mechanism still TBD, but perhaps the mechanism (if I've followed the discussion) might be something like:
But sorry if this is off base or just noise... and I agree some clarity on what this particular issue is intended to cover would be helpful, because it is an important set of topics...
Since my proposal was closed to merge into this thread, I'll put a slightly cut down version here:
My proposal is to introduce two git tag forms, which look like:
I think this would be a huge boon to Go security overall. I expect my proposal won't be perfect and I welcome feedback.
See also: go mod version definition.
rsc's comment above suggests putting this information in the go.mod file, what advantage do you see in putting this in VCS tags?
The go.mod file seems to have the advantage that it doesn't need to be reimplemented per-VCS, and it'll work when the client is retrieving a package over HTTPS (aka a "mirror" in the diagram on https://blog.golang.org/modules2019).
The two major points against using repo metadata like tags that rsc mentions are (1) the tag is too limited and (2) it has to be re-implemented across VCSs. Here are some answers to that and some extra points:
1. I think it is significantly more expected for tags to contain metadata, rather than files in the 'latest' revision
I do like how the go.mod approach naturally has an audit trail and history. That said, I think this approach falls further out of line with what is expected of version control systems than the tag approach. I did a little research, and there appear to be few restrictions on what can be tagged across the VCSs Go supports.
In terms of meeting expectations of technology, I think it's quite odd to have a file that contains cross-version metadata on the repo that is itself a revision of the repo versus tags which are created for the express purpose of expressing such metadata.
2. The noted issues on the limitations of tags don't seem to be as bad as thought
Bazaar is the one VCS that has issues – with white space – because canonically whitespace is replaced with "-" for ease of manipulation. As I propose, I think this is simply a case of allowing "-" to be interchangeable with " ".
3. I can't find any evidence that tracking tag history is a problem in any major or supported VCS
When it comes to the history of tags, Mercurial commits these to the history via a file, Git has these via tag annotations, Bazaar collects this information by default, and SVN considers them the same as branches. I don't see any issues here with collecting version information, and I think the precedent here is that VCSs will support this.
4. A special committed file is significantly more difficult to implement in pre-existing systems that would make use of this information
Where I work, there's a third-party import tool that works – as I think many others would also – by acquiring tag metadata and cloning the repo at a set of given tags. Using tags to indicate vulnerability makes acquiring this data a case of reading the information already on hand.
Using a file committed to the 'latest version' requires the tool to first determine what the 'latest version' might be, download its contents, load the module file and parse the module format.
If the third-party import tool is language agnostic, it's not necessarily going to clone the latest version of the repo. In this case, downstream services looking at the repo and its tags won't be able to see any information on whether their specific version is vulnerable.
5. With tags, it's feasible to determine the vulnerability of forks or other derivative works
Version information is repo specific, while tags are universally pinned to a specific part of the repo's history, regardless of where that code ends up.
In cases where a repo is forked, merged or otherwise ends up in a state where previous version indicators no longer apply, a system based purely on module versioning would make it extremely difficult to determine if the security bug affects this fork. Using the module file system, a mapping between the versions of repo A and repo B, its fork, would need to be maintained in order to determine if the fork is affected by security bugs in the repo it descends from.
In the case of using tags, it can be determined if a version of a fork contains its parent's security bug simply by checking if the history tagged with the bug exists in the fork. Correspondingly, it's possible to detect if the issue was already addressed by checking if the fix for the bug also exists in the history of the fork.
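The check described here amounts to an ancestry query on the commit graph. A minimal sketch, assuming the history is available as an in-memory map from commit ID to parent IDs (the `History` type and the commit names are hypothetical; a real tool would ask the VCS instead):

```go
package main

import "fmt"

// History maps a commit ID to the IDs of its parents.
type History map[string][]string

// Contains reports whether target is reachable from head by following
// parent links, i.e. whether target is part of head's history.
func (h History) Contains(head, target string) bool {
	seen := map[string]bool{}
	stack := []string{head}
	for len(stack) > 0 {
		c := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if c == target {
			return true
		}
		if seen[c] {
			continue
		}
		seen[c] = true
		stack = append(stack, h[c]...)
	}
	return false
}

// Vulnerable reports whether head includes the commit tagged as buggy
// but not the commit tagged as its fix.
func (h History) Vulnerable(head, bug, fix string) bool {
	return h.Contains(head, bug) && !h.Contains(head, fix)
}

func main() {
	// "fork2" descends from the buggy commit but never merged the fix.
	repo := History{
		"fork2": {"fork1"},
		"fork1": {"bug"},
		"fix":   {"bug"},
		"bug":   {"root"},
	}
	fmt.Println("fork head vulnerable:", repo.Vulnerable("fork2", "bug", "fix"))
	fmt.Println("upstream head vulnerable:", repo.Vulnerable("fix", "bug", "fix"))
}
```

This only works when the fork shares commit identities with its parent; a fork that rewrote or squashed history would need some other mapping.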
It's important to note the system I propose doesn't solely rely on this kind of data, it's simply a side-effect of using version control systems' existing tag systems.
6. Creating new histories to mark versions as vulnerable seems like a nightmare when it comes to derivative repositories
It's common for software companies to discover security issues and attempt to remediate them before going public with the information. This information might also be shared under embargo with third parties.
In this case, adding or modifying the go.mod file would create many separate histories that might be incompatible, especially if internally a non-'latest' revision is being used. If v1.1.2 is being used internally, and v1.1.3 is created to mark a revision as vulnerable, it's going to be really quite difficult to resolve differences with the public upstream repo. It might even require architectural changes to modify frozen repos to mark them as vulnerable.
Ideally, it should be possible for internal maintainers of packages to mark revisions of the package as vulnerable without modifying the history in a way that causes potential incompatibility of the upstream or requires changes to code that would normally require an unfrozen repo.
7. Older packages or systems may not have support for go modules
It's common for very old systems to have security bugs found through better, more modern techniques. Introducing go.mod in these cases necessitates moving all its dependencies to the Go module format, which may be nontrivial. In some cases, semver versions might need to be introduced for the first time so that systems can now detect whatever the 'latest' revision is. Many systems, like x/tools/go/loader, don't support Go modules and as such, pipelines may be broken by making this change.
I think it's very important that we allow package maintainers to be able to tag vulnerabilities without potentially breaking the build.
8. The module proxy protocol already includes a metadata structure containing similar information
It's come up a few times that this system would need to be compatible with the module proxy protocol. The proxy protocol already includes an 'Info' struct containing repo metadata such as the semver version tag my proposal is derived from.
I'd argue that this is a much better and more convenient place to contain vulnerability metadata than in the 'latest version' of the repo. Otherwise, as mentioned before you expect the downstream systems to acquire the latest version of the repo regardless of which version they're interested in, download its associated file and parse the metadata out of go.mod.
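To show what that could look like, here is a sketch that decodes a proxy-style `.info` JSON document. The `Version` and `Time` fields match what the proxy protocol serves; the `Advisories` field is purely hypothetical and is not part of the protocol today, and `parseInfo` is an illustrative helper.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Info mirrors the per-version metadata a module proxy returns.
type Info struct {
	Version string
	Time    time.Time

	// Advisories is a hypothetical extension, shown only to illustrate
	// where vulnerability metadata could live. It does not exist in the
	// real protocol.
	Advisories []string `json:",omitempty"`
}

// parseInfo decodes one .info document.
func parseInfo(data string) (Info, error) {
	var info Info
	err := json.Unmarshal([]byte(data), &info)
	return info, err
}

func main() {
	doc := `{"Version":"v1.1.2","Time":"2018-02-20T00:00:00Z","Advisories":["security: hypothetical issue"]}`
	info, err := parseInfo(doc)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(info.Version, info.Time.Year(), info.Advisories)
}
```

A client that does not know about the extra field would simply ignore it, since `encoding/json` skips unknown keys by default.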
I think from a Go perspective it makes the most sense to put this in
I think @Zemnmez has valid concerns but I don't know how much the Go team and community should care. Go modules are a major shift and that is bound to break some existing custom solutions. Trying to ease that shift for public and commonly used alternatives is IMHO a must, but breaking non-public company-internal solutions is IMHO fair game: companies should be able to make the resources to adapt if needed, and since they choose to be non-public with their custom solution, it is their responsibility to catch up and stay in the game. As I'm working for a large company I know this pain all too well, but as with any open source work this first needs to be solved sufficiently in public so that it doesn't block company-internal adoption and I don't think that using
Last but not least this feature request was solely intended to make this kind of information available in a public, consistent and human and machine readable format and to let it be used by the Go tooling so that humans can be notified in case something is less than desirable during a Go build. How the whole security ecosystem will react to the availability of this information is IMHO not in scope for this discussion as first of all the Go community/ecosystem needs to be happy with it. That said I think it is a must that this information is easily accessible by other tools (like security audit tools) and IMHO
@michael-schaller I've been thinking about how I can respond to this comment for a while, and at its core I can't in good faith take you up on any of your points without you at least attempting to address any of my seven points beyond your saying:
I'm sure you can see how important it is to have this problem solved in a meaningful way, and how handwaving away real comments on the specifics of the solution because it wasn't the intention of your request to support the wider Go community doesn't help us get there.
If my comments really are so far-flung from the needs of the Go community, please address how individually. I am sure there's at least one point of my seven which is aligned with other members of the Go community, considering the highly positive reaction on my original proposal.
@Zemnmez I think you misunderstood me. IMHO the (abstract) ideas, needs and concerns behind your seven points are important. You didn't make that the primary focus of your seven points though and instead it feels like you presented an implementation proposal that fits your needs and added justifications why this implementation makes sense to you.
As an example I wholeheartedly agree with you that there is the need for an audit trail. I disagree that this information needs to be stored in VCS tags but I might be wrong on that and could be convinced otherwise. What I would like to see is that people clearly communicate their (abstract) ideas, needs and concerns so that abstract goals and non-goals can be formed. Based on that information brainstorming can happen on how these goals can be solved with respective pros/cons. Then a conscious decision can be made on what fits/works best for Go...
If anything then I think that our comments show that it is too early for brainstorming on specific problems and that we first of all need a set of goals/non-goals. Once we all agree on that we can start brainstorming with the goals/non-goals as common ground...
It goes without saying that any implementation proposal is going to fit the proposer's needs. I don't see how or why the justifications for my proposal wouldn't, as a result, directly be those that make sense to the proposer.
This is what I have difficulty with. I explicitly presented those points as a summary of why VCS tags are a better approach than the other approach presented and the position continues to be held that they're not a good approach without any response on those points. All I'm asking is that if you think my points aren't compelling, you present reasons why so I can understand your point of view and dispel any misunderstanding.
I might be wrong here, but I feel like between your thread and my thread there's a clear communication of intent in the subtext. We'd like an (1) auditable and (2) functional way of marking package revisions as vulnerable.
The arguments I'm making for VCS tags are in two groups: (1) problems with the alternative approach and (2) benefits of the VCS approach.
I think it goes without saying that the best approach will be chosen on the basis of benefits it brings with it beyond minimum outcome requirements, and I don't think that means that such auxiliary benefits have to be litigated as 'requirements' for such a solution. If that were not the case, it would be impossible to decide between valid approaches because by definition they'd all have exactly the same value as measured against the requirements. This is something you state implicitly yourself:
I think requiring consensus on every primary and secondary goal without a solution in mind is going to take us to bikeshed town and I don't like it there very much anymore. It's my opinion that based on some basic, easy to agree on criteria as previously proposed we should be making proposals and judging them on their relative merits instead of trying to theorise on what success looks like in a completely abstract space.
@Zemnmez Git annotated tags (not "tag annotations") are comparable to commit messages, and have nothing to do with tracking history. Git tags have no history, and are not intended to ever change.