Add new spec for `go` package URLs #338

maceonthompson · 2024-11-04T18:36:26Z

The current PURL specification for Go was created before Go 1.11 modules and thus has namespace inconsistencies and lacks semantic versioning.

Although in many cases a module path corresponds directly to the URL of the hosting repository, that is not always true. The URL formed from the module path may be an endpoint that serves a redirect to the true host. This indirection protects projects that for whatever reason must change their hosting provider: their module names will continue to work. Consequently, it is undesirable to encode any aspect of the underlying hosting system as part of the PURL.

In essence, all Go modules form a single namespace. Since it is used by the majority of Go programmers, we propose to represent this namespace by the empty string. Though not included in this commit, other namespaces could be possible and would represent package managers and/or build tools that are alternatives to the go command.

The go type proposed here fixes the current issues by removing the namespace, using valid Go module versions (including pseudoversions), and adds some extra functionality to encode optional information about specific builds (GOOS, GOARCH, etc).

If accepted, all tools maintained by the Go project (such as govulncheck and pkg.go.dev) that surface PURLs will use this new type to provide canonical PURLs for Go modules and packages.

matt-phylum

See also #196 #294 #308

This is a breaking change that affects all software utilizing PURL for Go. Personally, I don't think there's anything fundamentally wrong with pkg:golang except that the description is outdated, and I'm sure it can be fixed without making this level of breaking change. Maintaining the separation of namespace and name and putting the entire Go package ID into the PURL name makes PURLs difficult for human users to work with.

matt-phylum · 2024-11-04T18:48:55Z

PURL-TYPES.rst

+------
+``go`` for Go modules:
+
+- The ``namespace`` field is empty and implies the go mod proxy.


Is the field empty or does it imply the go mod proxy? It can't be both.

Should be done now, see the new commit (sorry for that).

matt-phylum · 2024-11-04T18:49:25Z

PURL-TYPES.rst

+- The ``name`` will be the full module path.
+- The ``subpath`` will represent the package path within a module.
+- The ``version`` will be a valid go version or pseudoversion, or empty.
+- Additional Build information for binaries can be included as ``qualifiers`` (i.e VCS info, go version info, GoArch/GoOS info etc)


The additional information should be explicitly defined here.

Exactly. Be specific in the spec, so we are all on the same page.

PTAL in the new commit (sorry for that).

matt-phylum · 2024-11-04T18:58:38Z

PURL-TYPES.rst

+``go`` for Go modules:
+
+- The ``namespace`` field is empty and implies the go mod proxy.
+- The ``name`` will be the full module path.


This should probably specify that it is case sensitive. pkg:golang incorrectly states that it is not case sensitive and must be lowercased.

Exactly. this is what the whole #308 is about.
Please don't repeat the mistakes from the past.

Should be done now, see the new commit (sorry for that).

maceonthompson · 2024-11-04T19:44:05Z

See also #196 #294 #308

Thanks for pointing at these! This is essentially a combination of #196 and #308 (with the addition of qualifiers for build info). They go into more detail than this proposal, but especially in the case of namespaces #63 (comment) is a good example as to why dropping name in favor of an entirely coded namespace would be more useful. I understand that having a bunch of %2F in the PURL is ugly for humans, but is (we feel) necessary to ensure that go PURLs are consistent (which is to say that go module -> PURL is injective, a go module cannot be represented by different PURLs).

Say you have a module with the path host.com/maybeuser/module.
With the current type definition, both pkg:golang/host.com/maybeuser/module and pkg:golang/host.com/maybeuser%2Fmodule, could represent that module. In order for PURLs to canonically and uniquely define go modules in the way that they are defined on pkg.go.dev or the go module proxy, they must be unique as well.
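To make the injectivity argument concrete, here is a minimal sketch of how a producer might build a PURL of the proposed type. The helper `goPURL` is hypothetical and not part of any PURL library. It assumes the module path is already valid, and it writes the version and subpath as given without percent-encoding them. Because the whole module path goes into the name, there is only one place for the slashes to go.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// goPURL builds a PURL of the proposed "go" type (hypothetical helper).
// The full module path becomes the name, so every "/" in it is
// percent-encoded as %2F. The package path inside the module, if any,
// becomes the subpath. Version and subpath are written as given.
func goPURL(modulePath, version, pkgSubpath string) string {
	var b strings.Builder
	b.WriteString("pkg:go/")
	b.WriteString(url.PathEscape(modulePath))
	if version != "" {
		b.WriteString("@" + version)
	}
	if pkgSubpath != "" {
		b.WriteString("#" + pkgSubpath)
	}
	return b.String()
}

func main() {
	fmt.Println(goPURL("host.com/maybeuser/module", "", ""))
	fmt.Println(goPURL("github.com/jmorion/sqlx", "v1.1.2", "api"))
}
```

The second call reproduces one of the examples from the proposed PURL-TYPES.rst text.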

matt-phylum · 2024-11-04T20:18:38Z

Say you have a module with the path host.com/maybeuser/module.
With the current type definition, both pkg:golang/host.com/maybeuser/module and pkg:golang/host.com/maybeuser%2Fmodule, could represent that module. In order for PURLs to canonically and uniquely define go modules in the way that they are defined on pkg.go.dev or the go module proxy, they must be unique as well.

I think the better solution to this problem is that pkg:golang/host.com/maybeuser%2Fmodule stays illegal. It'd be better if the documentation explicitly stated it were illegal, but based on the examples and test cases the correct form is pkg:golang/host.com/maybeuser/module, and based on the reference parsing and formatting algorithms it's clear that these PURLs are distinct.

However, "a go module cannot be represented by different PURLs" is not generally the case:

The PURL spec describes a canonical format for PURLs, but users and even commonly used PURL implementations often get this wrong and produce non-canonical PURLs which must still be considered equal. For example, pkg:golang/host%2Ecom/maybeuser/module is a non-canonical, valid, PURL which refers to the same package.
A PURL may have qualifiers which may or may not be critical to the PURL. A PURL with a ?goarch is a different PURL which refers to the same module, but a PURL with a ?repository_url (or however the module proxy is specified) is a different PURL which may refer to a different module (probably more likely in other ecosystems).
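A rough sketch of the comparison described above, assuming the namespace/name part of the PURL is split on literal slashes first and each segment is percent-decoded afterwards. The helper names are hypothetical. Under that assumption `host%2Ecom` and `host.com` compare equal, while `maybeuser%2Fmodule` stays a single segment and is therefore distinct from `maybeuser/module`.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// decodeSegments percent-decodes each "/"-separated segment of the
// namespace/name portion of a PURL. Splitting happens before decoding,
// so an encoded "%2F" stays inside its segment and is not a separator.
func decodeSegments(path string) ([]string, error) {
	parts := strings.Split(path, "/")
	for i, p := range parts {
		d, err := url.PathUnescape(p)
		if err != nil {
			return nil, err
		}
		parts[i] = d
	}
	return parts, nil
}

// samePath reports whether two namespace/name strings decode to the
// same list of segments.
func samePath(a, b string) bool {
	as, errA := decodeSegments(a)
	bs, errB := decodeSegments(b)
	if errA != nil || errB != nil || len(as) != len(bs) {
		return false
	}
	for i := range as {
		if as[i] != bs[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(samePath("host%2Ecom/maybeuser/module", "host.com/maybeuser/module")) // true
	fmt.Println(samePath("host.com/maybeuser%2Fmodule", "host.com/maybeuser/module")) // false
}
```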

jkowalleck · 2024-11-08T15:37:56Z

This is a breaking change that affects all software utilizing PURL for Go.

I'd disagree. In fact, it is non-breaking, as it adds a completely new purl type. Therefore, no breaking changes are introduced.

matt-phylum · 2024-11-08T15:45:48Z

It is breaking because no existing PURL software expects pkg:go, and new PURL software will not expect pkg:golang. This creates a compatibility problem where either the PURL is rejected as an unrecognized type or software on different sides of the breakage don't understand each other. If this is merged, all software that works with Go PURLs will need to be updated to accept both types of Go PURL and convert before they interoperate again.

jkowalleck · 2024-11-08T15:53:22Z

It is breaking because no existing PURL software expects pkg:go [...]

this is true to every newly proposed PURL Type :-)
And none of them is a breaking change - neither in spec nor in behaviour.

this PR is trying to add a new type go. the existing golang is not touched at all.

matt-phylum · 2024-11-08T16:01:54Z

The problem is that this is not a new type. The go type is intended to replace golang.

jkowalleck · 2024-11-08T16:07:30Z

The problem is that this is not a new type.

it is not? Could you point me to the existing go type?

The go type is intended to replace golang.

I wonder how you come to this conclusion.
this very PR adds a new type, it does neither obsolete nor deprecate the existing golang type.

matt-phylum · 2024-11-08T16:19:47Z

I wonder how you come to this conclusion.
this very PR adds a new type, it does neither obsolete nor deprecate the existing golang type.

From the PR description:

If accepted, all tools maintained by the Go project (such as govulncheck and pkg.go.dev) that surface PURLs will use this new type to provide canonical PURLs for Go modules and packages

golang is the type currently used for Go modules and packages. For example: https://github.com/anchore/syft/blob/3c070e0ad9d69c0f2191be52e2f2fb4904bcd558/syft/pkg/cataloger/golang/package_test.go#L24 . This PR is introducing a second, more preferred type for the same purpose.

jkowalleck · 2024-11-08T16:24:52Z

I wonder how you come to this conclusion.
this very PR adds a new type, it does neither obsolete nor deprecate the existing golang type.

From the PR description:

If accepted, all tools maintained by the Go project (such as govulncheck and pkg.go.dev) that surface PURLs will use this new type to provide canonical PURLs for Go modules and packages

which is a behavioural change in a downstream application. This is out of scope of this spec, and not in our hands at all - we have no authority there.

golang is the type currently used for Go modules and packages. For example: https://github.com/anchore/syft/blob/3c070e0ad9d69c0f2191be52e2f2fb4904bcd558/syft/pkg/cataloger/golang/package_test.go#L24 . This PR is introducing a second, more preferred type for the same purpose.

exactly this paragraph makes it clear: this is a non-breaking change.

Causing no breaking change is the whole point of introducing a new purl type, instead of modifying an existing one.

jkowalleck · 2024-11-08T16:35:50Z

I'm sure it can be fixed without making this level of breaking change.

i don't think so. #308 makes this clear: the existing spec has flaws that require breaking changes to fix them

The only way to fix golang is

a) introduce breaking changes in the existing purl-type << undesired !!!
b) introduce a new purl-type << feasible
c)
1. have the PURL spec modified to allow versioning of purl-types << bureaucratic efforts that might lead to nothing
2. if c)1. was successful: craft a purl-type golang version 2
3. else fall back to a) or b)

matt-phylum · 2024-11-08T16:44:32Z

Introducing a new type for an existing type is a breaking change to the PURL ecosystem. Implementations that use golang can continue to use golang and their golang PURLs will still be golang PURLs, but PURL has no negotiation mechanism where all the software that's going to read the PURLs agrees with the software that writes the PURLs on whether to use go or golang to describe Go dependencies.

If you start writing SBOMs that have go, they will be processed incorrectly by software that doesn't support go. If you continue writing SBOMs that have golang, they will be processed incorrectly by software that doesn't support golang. If you combine SBOMs using software that doesn't understand that go and golang are really the same type, the dependencies will be duplicated in the output. If you query go or golang packages against a vulnerability database, you have a 50/50 chance of finding the vulnerabilities unless the database understands both and converts golang to go.
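The duplication case can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the PURLs use the example module path from earlier in the thread with a made-up version, and the mapping from the proposed `go` form to the `golang` form is reduced to decoding the name. A merge that compares raw strings keeps both entries; one that understands both types keeps a single entry.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// moduleKey reduces a Go PURL of either type to "module/path@version"
// (hypothetical helper). Qualifiers and subpath are dropped first.
func moduleKey(purl string) (string, bool) {
	if i := strings.IndexByte(purl, '#'); i >= 0 {
		purl = purl[:i]
	}
	if i := strings.IndexByte(purl, '?'); i >= 0 {
		purl = purl[:i]
	}
	switch {
	case strings.HasPrefix(purl, "pkg:golang/"):
		// golang: namespace and name are separated by literal slashes.
		return strings.TrimPrefix(purl, "pkg:golang/"), true
	case strings.HasPrefix(purl, "pkg:go/"):
		// go: the whole module path is one percent-encoded name.
		decoded, err := url.PathUnescape(strings.TrimPrefix(purl, "pkg:go/"))
		if err != nil {
			return "", false
		}
		return decoded, true
	}
	return "", false
}

// dedupe keeps the first PURL seen for each module key. PURLs of other
// types are compared as raw strings.
func dedupe(purls []string) []string {
	seen := map[string]bool{}
	var out []string
	for _, p := range purls {
		key, ok := moduleKey(p)
		if !ok {
			key = p
		}
		if !seen[key] {
			seen[key] = true
			out = append(out, p)
		}
	}
	return out
}

func main() {
	merged := []string{
		"pkg:golang/host.com/maybeuser/module@v1.0.0",
		"pkg:go/host.com%2Fmaybeuser%2Fmodule@v1.0.0",
	}
	fmt.Println("raw string comparison keeps:", len(merged))
	fmt.Println("type-aware comparison keeps:", len(dedupe(merged)))
}
```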

Keeping golang is incompatible with the "a go module cannot be represented by different PURLs" goal of this PR.

You cannot just fix a PURL type by introducing a new type. Even if PURL libraries are updated to support transparently upgrading the old type into the new type on read, any software that is comparing pre-canonicalized PURL strings will need updates.

the existing spec has flaws that require breaking changes to fix them

What are the flaws that require breaking changes? #308 is about the path being incorrectly converted to lowercase, which is much more easily fixed by just not doing that.

jkowalleck · 2024-11-08T16:59:43Z

Introducing a new type for an existing type is a breaking change to the PURL ecosystem.

how?

If a tool that produced purls would change its behaviour by using the new purl-type, where they've used the other one before - this would be a breaking change in that very tool.
This is out of the scope of the purl spec -- we do not have authority there.

Implementations that use golang can continue to use golang and their golang PURLs will still be golang PURLs, but PURL has no negotiation mechanism where all the software that's going to read the PURLs agrees with the software that writes the PURLs on whether to use go or golang to describe Go dependencies.

So?
This is true to every purl type that is added over time.
An implementation written 2 years ago might not know the purl type that was defined yesterday.
This is by design and was never an issue. This is out of the scope of the purl spec -- we do not have authority there.

Keeping golang is incompatible with the "a go module cannot be represented by different PURLs" goal of this PR.

A PR tells a story, and the effective patch gets updated along with the discussions on a PR.
the initial PR description is usually not updated in accordance with the effective patch.

(PS: I review the content of the PR. and at the time of review, I saw no breaking change.
I was starting the "breaking" discussion in expectation that you'd agree that is no longer a breaking change, based on the current state of the PR.
I am happy we are discussing the topic anyway, i might be wrong, and I still need to learn.)

You cannot just fix a PURL type by introducing a new type. Even if PURL libraries are updated to support transparently upgrading the old type into the new type on read, any software that is comparing pre-canonicalized PURL strings will need updates.

how come?

the existing spec has flaws that require breaking changes to fix them

What are the flaws that require breaking changes? #308 is about the path being incorrectly converted to lowercase, which is much more easily fixed by just not doing that.

the current golang spec says: the path MUST be lowercased.
This is wrong in terms of actual go dependency management: the path MUST NOT be lowercased.
Changing MUST to MUST NOT in golang purl-type is a breaking change of the specification.

jkowalleck · 2024-11-08T17:35:06Z

PURL-TYPES.rst

+``go`` for Go modules:
+
+- The ``namespace`` field is empty and implies the go mod proxy.
+- The ``name`` will be the full module path.


Suggested change

- The ``name`` will be the full module path.

- The ``name`` is the full module path. It MUST be unmodified, and follow the `Go Module Reference <https://go.dev/ref/mod#go-mod-file-ident>`_.

this change would close #308

- - The ``name`` will be the full module path.
+ - The ``name`` is the full module path. In case of an URL: protocol MUST be lowercased; host-part MUST be lowercased; path-part MUST be unmodified, as it is case-sensitive.

this change would close #308

I don't think this is correct.

I don't think it's legal to include a protocol in the module path. Go makes some HTTPS requests to resolve a VCS URL to download the package from (usually this is delegated to the proxy).

The host part is also part of the case sensitive module path. It should not be lowercased. Uppercase characters are currently forbidden by Go for modules. I don't think it's worthwhile or really correct for the PURL spec to be specifying how to convert an invalid module path into a valid module path, I don't think it's worthwhile for the PURL spec to be specifying how to validate Go module paths, this doesn't cover all the restrictions, and this may cause problems if Go ever changes the restrictions for some reason.

re 1: I see. i was wrong there. Adjusted my suggestion for the protocol.
re 2: the host-part is, per URL-spec case-insensitive, and is normalized to lowercase.

As far as Go is concerned, it's usually a host-part but it has additional restrictions and it is case sensitive: https://go.dev/ref/mod#go-mod-file-ident

I see. I will modify my change-suggestion accordingly. does it fit better, now?

jkowalleck · 2024-11-08T17:37:12Z

PURL-TYPES.rst

+
+- The ``namespace`` field is empty and implies the go mod proxy.
+- The ``name`` will be the full module path.
+- The ``subpath`` will represent the package path within a module.


Suggested change

- The ``subpath`` will represent the package path within a module.

- The ``subpath`` is the unmodified package path within a module.

jkowalleck · 2024-11-08T17:38:10Z

PURL-TYPES.rst

+- The ``namespace`` field is empty and implies the go mod proxy.
+- The ``name`` will be the full module path.
+- The ``subpath`` will represent the package path within a module.
+- The ``version`` will be a valid go version or pseudoversion, or empty.


Suggested change

- The ``version`` will be a valid go version or pseudoversion, or empty.

- The ``version`` may be a valid go version or pseudoversion, omitted when empty.

Why may here?

because version is optional.

Should be done now, see the new commit (sorry for that).

matt-phylum · 2024-11-08T18:04:43Z

Adding a new type for a new type is much different than adding a new type for an existing type. An old tool not recognizing a truly new type is expected, but an old tool not recognizing Go PURLs anymore because a tool producing the data says that golang is now spelled go is a breaking change. You can argue that this isn't a breaking change in the PURL spec itself because it doesn't change golang, but it necessitates a breaking change in every current implementation of Go PURLs and complicates implementations of Go PURL consuming software as long as there are both go and golang PURLs going around.

Changing "MUST be lowercased" to "MUST NOT be lowercased" is a much less impactful change than this. From what I've seen, names with uppercase characters are uncommon, and an outdated implementation that is incorrectly lowercasing still works correctly for all names that do not contain uppercase characters. I would even say that on a larger scale it is not a breaking change because:

An outdated PURL producer that incorrectly lowercases an ID containing capitals produces the wrong PURL, but today those producers are producing exactly the same PURL and calling it correct despite referring to the wrong package.
An outdated PURL consumer that incorrectly lowercases an ID containing capitals reads the wrong ID, but today those consumers are already reading exactly the same ID and calling it correct despite referring to the wrong package.

In both cases, the PURL is still parsed successfully and the meaning of the PURL is unchanged with respect to the current "MUST be lowercased" spec. The only differences would be that the canonical form changes¹ and a new consumer receiving a PURL from an old producer might be more likely to expect that the ID refers to the correct package, but since there is no good way for an outdated consumer to recover the correct ID after an outdated producer lowercases it, any consumer that relies on getting the correct ID (eg to resolve the package files) is likely already broken and not lowercasing the name can only improve the behavior in that situation.

This causes the same alignment problems as introducing a go type, except that if the correct ID is lowercase, no problem occurs because lowercasing is already producing the correct PURL.
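A small sketch of the point about lowercasing, with a hypothetical helper name. It only checks whether the "MUST be lowercased" rule would alter a given module path. Paths that are already lowercase come out identical under either rule, so only paths containing capitals are affected by the change.

```go
package main

import (
	"fmt"
	"strings"
)

// lowercasingChangesPath reports whether applying the current
// "MUST be lowercased" rule would alter the module path.
func lowercasingChangesPath(modulePath string) bool {
	return strings.ToLower(modulePath) != modulePath
}

func main() {
	paths := []string{
		"golang.org/x/vuln",
		"github.com/AdguardTeam/AdGuardHome",
	}
	for _, p := range paths {
		fmt.Printf("%-40s changed by lowercasing: %v\n", p, lowercasingChangesPath(p))
	}
}
```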

¹ Due to underspecification in the text and tests, I wouldn't trust incoming PURLs to be in the canonical form as my implementation understands it. There are numerous minor differences in which characters are escaped when (and sometimes how), so if you're accepting PURLs from an external source, even if you don't expect user-entered, non-canonical PURLs in that source, you should be canonicalizing those PURLs yourself if your application depends on them all being canonical for the same definition of canonical.

matt-phylum · 2024-11-13T13:36:06Z

Go isn't the only ecosystem that has this problem of incorrect name normalization rules in this repo. I'm also aware of:

NPM is case sensitive but PURL lowercases it: Letter case for npm packages #136
NuGet is case insensitive but PURL does not lowercase it: NuGet package names should not be case sensitive #226
PyPI replaces "----.....__--.-." with "-" but PURL replaces it with "----.....----.-.": Add more specific details around normalised PyPI package naming per PEP 503 #165

zpavlinovic · 2024-11-14T19:47:01Z

Introducing a new type for an existing type is a breaking change to the PURL ecosystem.

If this is indeed true, then there is something really wrong with PURL: it does not allow for evolution. On the one hand, we cannot add modifications to the existing specification that could introduce breaking changes. On the other hand, we cannot introduce a new type because somehow that is a breaking change as well. So one is pretty much stuck with slight variations of the initial spec. Specs should be allowed to evolve just the way the software does.

There should really be a way to add versioning on top of PURL itself. What is being proposed here might in essence be just that for the go spec.

pombredanne · 2024-11-18T14:05:09Z

@maceonthompson Thanks for putting this together! This makes a lot of sense, and we have an issue with Go alright. Let me look at the comments in detail and come back with my 2 cents!

pombredanne · 2024-11-18T14:07:35Z

@matt-phylum re:

Introducing a new type for an existing type is a breaking change to the PURL ecosystem.

I am not sure that's the case, but a new type vs. updating the existing type demands some careful thinking :)

pombredanne · 2024-11-18T14:20:05Z

PURL-TYPES.rst

+
+      pkg:go/google.golang.org%2Fgenproto#googleapis/api/annotations
+      pkg:go/github.com%2Fjmorion%2Fsqlx@v1.1.2#api
+      pkg:go/golang.org%2Fx%2Fvuln?goversion=1.23.2&vcs=git&vcs_modified=true#cmd/govulncheck


There is a likely problem with the use of subpath: there is no way to determine where the module ends and the package starts in the general case, is there?
For instance, in the path google.golang.org/genproto/googleapis/api/annotations how can I determine safely that google.golang.org/genproto is a module and that googleapis/api/annotations is a package inside this module? I need either a go proxy lookup or a full filesystem to locate a go.mod/go.sum file, right?

There is a way, if the module's code is available to you, to determine from a package import where module path ends and the package path begins by making HTTP requests.

I think the use of the subpath here is good because it puts the burden of determining this on whatever generates the PURL, which is likely aware of Go and either has the module paths or is most likely to be able to find the module path from the full package path. Then if you want to use a tool that checks PURLs against a database of information about modules (eg vulnerabilities), the tool already has all the information it needs. Otherwise, either the tool would need to make external API calls to figure out the module path of the PURL or the database would need to have an entry for every package in the module.

Just to add to @matt-phylum's comment. If a tool is producing a PURL for a Go artifact, then it can use go version, Debug.BuildInfo, or packages.Load to get information about the package and its corresponding module. The encoding proposed here then makes it clear what the modules and packages are.

I don't believe this is true for the general use case of PURLs. E.g. we do static analysis of binaries and while we can get information about linked packages, there's no indication of which part of the paths correspond to modules

Right, if you are looking at a Go symbol from the symbol table, you can get its package. You can get the module correctly by prefix-matching it with module information from debug.Buildinfo of the binary, unless there are several modules that are prefixes of the package. My inclination is that it should not affect what is proposed here. (Arguably, there should be a way to get module information for a symbol in the binary, just the way one can do it for the source analysis.)
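The prefix matching described above could look roughly like this. It is a sketch, not what any existing tool does: `splitPackage` is a hypothetical helper, and the module list is assumed to come from the binary's build info but is hard-coded here. The function picks the longest matching module path and also returns how many modules matched, so a caller can tell when the choice was a guess.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPackage looks for modules whose path equals pkgPath or is a
// prefix of it ending at a "/" boundary. It returns the longest such
// module, the remaining package path, and the number of matching
// modules. matches == 0 means no module was found; matches > 1 means
// several modules are prefixes and the longest one is only a guess.
func splitPackage(pkgPath string, modules []string) (module, subpath string, matches int) {
	for _, m := range modules {
		if pkgPath != m && !strings.HasPrefix(pkgPath, m+"/") {
			continue
		}
		matches++
		if len(m) > len(module) {
			module = m
		}
	}
	if matches == 0 {
		return "", "", 0
	}
	subpath = strings.TrimPrefix(strings.TrimPrefix(pkgPath, module), "/")
	return module, subpath, matches
}

func main() {
	// Hypothetical module list, standing in for build info.
	modules := []string{"golang.org/x/vuln", "google.golang.org/genproto"}
	m, s, n := splitPackage("google.golang.org/genproto/googleapis/api/annotations", modules)
	fmt.Println(m, s, n)
}
```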

pombredanne · 2024-11-18T14:22:21Z

BTW, an elephant in the room is whether the distinction between a namespace and name makes sense not only here, but also in the whole spec, globally.

I found myself using a variable with a "namespace/name" substring more often than not.
Then, how to split this in optional namespace and name could become a type-specific distinction, but the general concept would be that of "namespace/name", which could look like:

pkg:golang/google.golang.org/genproto/googleapis/api/annotations@v1.2.1

With this the whole google.golang.org/genproto/googleapis/api/annotations would be the namespace/name and would not have a specific split in Go, all would be in the name?
(and the same could apply where relevant to other package types)

It could have a minimal impact on the spec.

pombredanne · 2024-11-18T14:24:07Z

PURL-TYPES.rst

+
+      pkg:go/google.golang.org%2Fgenproto#googleapis/api/annotations
+      pkg:go/github.com%2Fjmorion%2Fsqlx@v1.1.2#api
+      pkg:go/golang.org%2Fx%2Fvuln?goversion=1.23.2&vcs=git&vcs_modified=true#cmd/govulncheck


Is the plan to include all the buildinfo structure as qualifiers?
If so, this would only apply in a built binary?

good point.
If so, all the qualifiers MUST be documented in the type-spec.

currently it reads:

Additional Build information for binaries can be included as qualifiers (i.e VCS info, go version info, GoArch/GoOS info etc)

I am afraid this documentation is insufficient.

We will expand on this.

pombredanne · 2024-11-18T14:24:57Z

PURL-TYPES.rst

+      pkg:go/google.golang.org%2Fgenproto#googleapis/api/annotations
+      pkg:go/github.com%2Fjmorion%2Fsqlx@v1.1.2#api
+      pkg:go/golang.org%2Fx%2Fvuln?goversion=1.23.2&vcs=git&vcs_modified=true#cmd/govulncheck
+      pkg:go/golang.org%2Fx%2Fvuln@v1.1.3?goversion=1.23.2#cmd/govulncheck


Are the Go module versions always to be prefixed with a v?

A version identifies an immutable snapshot of a module, which may be either a release or a pre-release. Each version starts with the letter v, followed by a semantic version.
-- https://go.dev/ref/mod#versions

version could also be a pseudo-version -- a git-tag, a git-commit-hash, or something like this.

A pseudoversion is a special kind of version that also starts with a v: https://go.dev/doc/modules/version-numbers#pseudo-version-number

I think for Go modules, including when using the Go module system to refer to something that predates modules, the version always starts with a v. In which case, versions that don't start with v would only be used with older tools like Dep?

If a version exists, it should be a valid Go module version. It should start with a v.

Note that hashes should not be permitted, they are not a valid Go version (resolution of hash commits in go tooling is a convenience feature).
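As an illustration of the rule discussed here, the following is a simplified check that a version starts with `v` and is followed by a full semantic version. It is an approximation written for this thread, not the validation the go command performs, and real tooling would more likely rely on the golang.org/x/mod packages. A pseudo-version passes because it is syntactically a pre-release semantic version; a bare commit hash or a version without the `v` does not.

```go
package main

import (
	"fmt"
	"regexp"
)

// goVersionRE is a simplified pattern: "v", MAJOR.MINOR.PATCH, then an
// optional pre-release part and an optional build-metadata part.
var goVersionRE = regexp.MustCompile(
	`^v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$`)

// looksLikeGoVersion reports whether s has the shape of a Go module
// version, including pseudo-versions (hypothetical helper).
func looksLikeGoVersion(s string) bool {
	return goVersionRE.MatchString(s)
}

func main() {
	for _, v := range []string{
		"v1.1.3",
		"v0.0.0-20191109021931-daa7c04131f5",
		"1.107.52",
		"daa7c04131f5",
	} {
		fmt.Printf("%-40s %v\n", v, looksLikeGoVersion(v))
	}
}
```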

pombredanne · 2024-11-18T14:28:38Z

@matt-phylum you wrote

See also:

Go suggestions for PURL-TYPES.rst #196

Should we support a leading v in golang packages? #294

[PURL-TYPE: golang] fix type spec regarding path segments #308

This is a breaking change that affects all software utilizing PURL for Go. Personally, I don't think there's anything fundamentally wrong with pkg:golang except that the description is outdated, and I'm sure it can be fixed without making this level of breaking change.

Thanks for the links! I tend to think along the same lines, and we can likely salvage the golang type.

Maintaining the separation of namespace and name and putting the entire Go package ID into the PURL name makes PURLs difficult for human users to work with.

I need to ponder this. See my other comment wrt. the namespace/name above in #338 (comment)

zpavlinovic · 2024-11-18T20:36:01Z

However, "a go module cannot be represented by different PURLs" is not generally the case:

The PURL spec describes a canonical format for PURLs, but users and even commonly used PURL implementations often get this wrong and produce non-canonical PURLs which must still be considered equal. For example, pkg:golang/host%2Ecom/maybeuser/module is a non-canonical, valid, PURL which refers to the same package.

A PURL may have qualifiers which may or may not be critical to the PURL. A PURL with a ?goarch is a different PURL which refers to the same module, but a PURL with a ?repository_url (or however the module proxy is specified) is a different PURL which may refer to a different module (probably more likely in other ecosystems).

It is fine that PURL spec allows for more flexibility, but there should be only one way the Go module and package information is encoded. This simplifies the work for clients. It is easy to drop qualifiers from a PURL. It is annoying to generate multiple module+package encodings to see if the incoming PURL applies to your code.

In general, this proposal tries to make it simple and clear to generate and accurately check against PURLs. It might not be the most user-friendly solution, but tools that render PURLs can easily prettify the output. We believe this is worth the sacrifice.

rhalar · 2024-11-19T09:26:51Z

Could it also be clarified how standard library packages are to be represented?

Go has special handling for these, and the 'module' is never explicitly required when using them. But the module does exist for std and cmd
https://github.com/golang/go/blob/master/src/go.mod#L1
https://github.com/golang/go/blob/master/src/cmd/go.mod#L1

Go uses stdlib when reporting vulnerabilities though
https://vuln.go.dev/ID/GO-2024-3105.json

but the exact module name would make more sense we believe.

matt-phylum · 2025-03-19T16:23:36Z

You keep denying this and calling it a fundamental misconception, but you clearly do not understand what it is @puerco and I are concerned about. It doesn't matter if pkg:golang is still around. The problem is created by somebody starting to use pkg:go while any of the software that currently supports pkg:golang exists. If the plan wasn't to replace pkg:golang with pkg:go in at least some contexts then there would not be a PR to introduce a pkg:go that must never be used.

I know that fixing golang does not definitely break all downstream implementing applications because I am aware of multiple applications that will not break. In fact, I am not aware of any applications that will break in a way that is worse than how they are already broken for the PURLs that would be affected by fixing pkg:golang. I'm not 100% sure that it won't cause new problems in some cases, but I still don't understand how it could cause more problems than introducing pkg:go PURLs into the input of software that only understands pkg:golang PURLs, and I don't understand how it would be feasible, without changing the schemas of documents that contain PURLs, to introduce pkg:go to the PURL spec without introducing pkg:go PURLs into the input of software that only understands pkg:golang PURLs without either a years-long transition period where implementations are expected to support pkg:go but only output pkg:golang or leaving it to users to figure out when they need to normalize all Go PURLs in a file to pkg:go or pkg:golang.

PPS: you asked which poor design choices could lead to such a thing downstream? if you dont check on the purl scheme, but simply assume your input being a purl for golang, thats such a thing ... your argumentation is almost suggesting that this is exactly the poor design that you base your points on, or is it?

This does not make sense to me. I never said anything about not checking whether the PURL is a Go PURL. If the input has pkg:golang, then it is a Go PURL, and you can resolve it to a package file or look up vulnerabilities or other information about it by using that knowledge. If the input has pkg:go, then it is currently not a Go PURL, and implementations know nothing about it and can do nothing with it. If you are doing the correct thing and applying Go semantics to only pkg:golang PURLs then your implementation will not work when it starts receiving pkg:go PURLs.

zpavlinovic · 2025-03-19T16:53:44Z

You keep denying this and calling it a fundamental misconception, but you clearly do not understand what it is @puerco and I are concerned about. It doesn't matter if pkg:golang is still around. The problem is created by somebody starting to use pkg:go while any of the software that currently supports pkg:golang exists. If the plan wasn't to replace pkg:golang with pkg:go in at least some contexts then there would not be a PR to introduce a pkg:go that must never be used.

I think it would be very useful if you could create some scenarios/examples that are representative of the concerns you have. We could then discuss them. I am not trying to diminish your concerns, I simply don't understand them; they are quite abstract to me at this point. It feels like we are running in circles, so I believe working out concrete scenarios might help here.

matt-phylum · 2025-03-19T18:12:24Z

Here is an example of the way pkg:go can go wrong.

The user is using an SBOM generation tool to analyze their source code and produce an SBOM. They take that SBOM and feed it into a vulnerability scanner so they can be aware of vulnerabilities in their software. The vulnerability scanner, either directly or indirectly, needs to recognize the Go modules being used. In practice this will be something like reading a CycloneDX file, extracting component PURLs, and either converting those PURLs into ecosystem-native identifiers that can be looked up in an advisory database (eg pkg:golang/github.com/AdguardTeam/AdGuardHome@1.107.52 -> {"ecosystem":"Go", "name": "github.com/AdguardTeam/AdGuardHome", "version": "1.107.52"}) or the other way around, using the PURL to query a database where the advisory database entries have already been mapped to PURLs.
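
A minimal sketch of the first direction described above (PURL to ecosystem-native identifier), assuming a scanner that only knows the existing golang type. The function name `golang_purl_to_osv` is hypothetical and the parsing is deliberately simplified (qualifiers and subpath are dropped); it is not taken from any real scanner.

```python
from urllib.parse import unquote


def golang_purl_to_osv(purl: str) -> dict:
    """Convert a pkg:golang PURL into an OSV-style package identifier.

    Hypothetical helper for illustration only.
    """
    scheme, _, rest = purl.partition(":")
    if scheme != "pkg":
        raise ValueError(f"not a PURL: {purl!r}")
    # This sketch only needs type, name and version, so drop subpath and qualifiers.
    rest = rest.split("#", 1)[0].split("?", 1)[0]
    purl_type, _, remainder = rest.lstrip("/").partition("/")
    if purl_type != "golang":
        # A scanner that only understands pkg:golang ends up here for pkg:go input.
        raise ValueError(f"unsupported PURL type: {purl_type!r}")
    path, _, version = remainder.partition("@")
    name = "/".join(unquote(segment) for segment in path.split("/"))
    return {"ecosystem": "Go", "name": name, "version": unquote(version)}
```

With this shape, a pkg:go PURL fails loudly; a tool that skips unknown types instead of raising would silently produce no identifier at all.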

If the user updates the SBOM generation tool to a version that outputs pkg:go PURLs without updating their vulnerability scanner or its database first, either they will get an error that pkg:go isn't supported, and maybe a hint that they should upgrade the service, or they will just get no results for the new PURL pkg:go/github.com%2FAdguardTeam%2FAdGuardHome@1.107.52 and not know that they are vulnerable, a critical failure that can result in a surprise compromise. The second case is what will happen if the vulnerability matching is being offloaded to today's api.osv.dev.

$ curl -d '{"package":{"purl":"pkg:golang/github.com/AdguardTeam/AdGuardHome@1.107.52"}}' "https://api.osv.dev/v1/query"
{"vulns":[{"id":"GO-2024-2924","summary":"AdGuardHome privilege escalation vulnerability in github.com/AdguardTeam/AdGuardHome","details":"AdGuardHome privilege escalation vulnerability in github.com/AdguardTeam/AdGuardHome.\n\nNOTE: The source advisory for this report contains additional versions that could not be automatically mapped to standard Go module versions.\n\n(If this is causing false-positive reports from vulnerability scanners, please suggest an edit to the report.)\n\nThe additional affected modules and versions are: .","aliases":["CVE-2024-36586","GHSA-7jp9-vgmq-c8r5"],"modified":"2024-09-06T20:44:16Z","published":"2024-06-28T15:28:30Z","database_specific":{"url":"https://pkg.go.dev/vuln/GO-2024-2924","review_status":"UNREVIEWED"},"references":[{"type":"ADVISORY","url":"https://github.com/advisories/GHSA-7jp9-vgmq-c8r5"},{"type":"ADVISORY","url":"https://nvd.nist.gov/vuln/detail/CVE-2024-36586"},{"type":"WEB","url":"https://github.com/go-compile/security-advisories/blob/master/vulns/CVE-2024-36586.md"}],"affected":[{"package":{"name":"github.com/AdguardTeam/AdGuardHome","ecosystem":"Go","purl":"pkg:golang/github.com/AdguardTeam/AdGuardHome"},"ranges":[{"type":"SEMVER","events":[{"introduced":"0"}]}],"ecosystem_specific":{"custom_ranges":[{"type":"ECOSYSTEM","events":[{"introduced":"0.93.0"}]}]},"database_specific":{"source":"https://vuln.go.dev/ID/GO-2024-2924.json"}}],"schema_version":"1.6.0"}]}
$ curl -d '{"package":{"purl":"pkg:go/github.com%2FAdguardTeam%2FAdGuardHome@1.107.52"}}' "https://api.osv.dev/v1/query"
{}

This seems like a fairly likely case to me. SBOM generation can happen in CI close to developers while vulnerability scanning may happen in some software like Dependency-Track hosted by IT or a dedicated security team. Even if the software vendors are keeping up and have added support for pkg:go, it's much easier for a developer to update a GitHub action (it could be as easy as clicking "merge" on an unassuming automated dependency update PR) than for somebody to update the deployed version of the service being used to monitor for security vulnerabilities in released products and go through any revalidation processes associated with changing a service involved in safeguarding customer data. Companies may be running very old services on internal networks if they don't perceive a need to upgrade them.

zpavlinovic · 2025-03-19T18:44:09Z

Here is an example of the way pkg:go can go wrong.

Thanks for the example, this is really helpful.

The user is using an SBOM generation tool to analyze their source code and produce an SBOM. They take that SBOM and feed it into a vulnerability scanner so they can be aware of vulnerabilities in their software. The vulnerability scanner, either directly or indirectly, needs to recognize the Go modules being used. In practice this will be something like reading a CycloneDX file, extracting component PURLs, and either converting those PURLs into ecosystem-native identifiers that can be looked up in an advisory database (eg pkg:golang/github.com/AdguardTeam/AdGuardHome@1.107.52 -> {"ecosystem":"Go", "name": "github.com/AdguardTeam/AdGuardHome", "version": "1.107.52"}) or the other way around, using the PURL to query a database where the advisory database entries have already been mapped to PURLs.

I believe I understand the scenario, sorry if I don't. It seems to me that the concern you described here, again, can apply to any new PURL package. If the SBOM tool outputs a pkg:cool-new-lang PURL and the vulnerability scanner does not recognize it, the user can be in a proverbial pickle.

If the user updates the SBOM generation tool to a version that outputs pkg:go PURLs without updating their vulnerability scanner or its database first, either they will get an error that pkg:go isn't supported, and maybe a hint that they should upgrade the service, or they will just get no results for the new PURL pkg:go/github.com%2FAdguardTeam%2FAdGuardHome@1.107.52 and not know that they are vulnerable, a critical failure that can result in a surprise compromise. The second case is what will happen if the vulnerability matching is being offloaded to today's api.osv.dev.

They should get that pkg:cool-new-lang, or pkg:go for that matter, is not supported and they should then decide whether to use the vulnerability scanner or call it differently. If the scanner does not issue the unsupported or update-me warning, then this vulnerability scanner has some major issues. I mean, if the scanner just eats the unsupported PURL, proceeds to produce results pretending like nothing happened, and then that results in missed vulnerabilities, then this is definitely not the scanner one should be using.

$ curl -d '{"package":{"purl":"pkg:golang/github.com/AdguardTeam/AdGuardHome@1.107.52"}}' "https://api.osv.dev/v1/query"
{"vulns":[{"id":"GO-2024-2924","summary":"AdGuardHome privilege escalation vulnerability in github.com/AdguardTeam/AdGuardHome","details":"AdGuardHome privilege escalation vulnerability in github.com/AdguardTeam/AdGuardHome.\n\nNOTE: The source advisory for this report contains additional versions that could not be automatically mapped to standard Go module versions.\n\n(If this is causing false-positive reports from vulnerability scanners, please suggest an edit to the report.)\n\nThe additional affected modules and versions are: .","aliases":["CVE-2024-36586","GHSA-7jp9-vgmq-c8r5"],"modified":"2024-09-06T20:44:16Z","published":"2024-06-28T15:28:30Z","database_specific":{"url":"https://pkg.go.dev/vuln/GO-2024-2924","review_status":"UNREVIEWED"},"references":[{"type":"ADVISORY","url":"https://github.com/advisories/GHSA-7jp9-vgmq-c8r5"},{"type":"ADVISORY","url":"https://nvd.nist.gov/vuln/detail/CVE-2024-36586"},{"type":"WEB","url":"https://github.com/go-compile/security-advisories/blob/master/vulns/CVE-2024-36586.md"}],"affected":[{"package":{"name":"github.com/AdguardTeam/AdGuardHome","ecosystem":"Go","purl":"pkg:golang/github.com/AdguardTeam/AdGuardHome"},"ranges":[{"type":"SEMVER","events":[{"introduced":"0"}]}],"ecosystem_specific":{"custom_ranges":[{"type":"ECOSYSTEM","events":[{"introduced":"0.93.0"}]}]},"database_specific":{"source":"https://vuln.go.dev/ID/GO-2024-2924.json"}}],"schema_version":"1.6.0"}]}
$ curl -d '{"package":{"purl":"pkg:go/github.com%2FAdguardTeam%2FAdGuardHome@1.107.52"}}' "https://api.osv.dev/v1/query"
{}

This seems like a fairly likely case to me. SBOM generation can happen in CI close to developers while vulnerability scanning may happen in some software like Dependency-Track hosted by IT or a dedicated security team. Even if the software vendors are keeping up and have added support for pkg:go, it's much easier for a developer to update a GitHub action (it could be as easy as clicking "merge" on an unassuming automated dependency update PR) than for somebody to update the deployed version of the service being used to monitor for security vulnerabilities in released products and go through any revalidation processes associated with changing a service involved in safeguarding customer data. Companies may be running very old services on internal networks if they don't perceive a need to upgrade them.

I agree that this is a likely case. When a new package type is added, I believe it is a reasonable expectation that the users will check if their scanners support it or the scanners will yell that the PURLs are not supported. Perhaps I am expecting too much. Either way, it looks like this problem, if it exists, exists for both pkg:go and pkg:cool-new-lang.
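
To make the "scanner should yell" expectation concrete, here is a rough sketch of an ingestion check that refuses to continue when it sees PURL types it cannot match. The set of supported types, the exception class and the function names are all assumptions made up for this illustration.

```python
# Hypothetical: the PURL types this imaginary scanner knows how to match.
SUPPORTED_TYPES = frozenset({"golang", "npm", "pypi"})


class UnsupportedPurlTypeError(ValueError):
    """Raised when the input contains PURL types the scanner cannot match."""


def purl_type(purl: str) -> str:
    """Return the lowercased type component of a PURL."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a PURL: {purl!r}")
    return purl[len("pkg:"):].lstrip("/").split("/", 1)[0].lower()


def check_supported(purls):
    """Return the PURLs unchanged, or raise if any type is unknown."""
    unsupported = sorted({purl_type(p) for p in purls} - SUPPORTED_TYPES)
    if unsupported:
        raise UnsupportedPurlTypeError(
            "unsupported PURL type(s): " + ", ".join(unsupported)
        )
    return list(purls)
```

Whether a given scanner raises, warns, or silently skips is up to that scanner; this only shows the behavior I would expect.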

puerco · 2025-03-19T18:45:05Z

It would be very useful if you could create some scenarios/examples that are representative of the concerns you have

Happy to. I think the point missing in all of this is to understand purls are embedded in all sorts of documents and databases across the supply chain. It is not just one tool's concern. A software supply chain tool will ingest multitudes of data from different generators and databases.

Some practical examples:

In general, component matching becomes a mess as you need to build - for go specifically - a compatibility layer whenever you want to match components, this is done all the time when working with SBOMs, attestations, databases. The fact that the new type does not break the other does not mean that supply chain security tools don't need to ingest both.

SBOM Enrichment or Augmentation:

One SBOM generator reads component data. Another produces licensing data. You need to merge the output of both. But now you can't. The inputs to both are the same: "A Go module". If one speaks golang: and the other speaks go: you need to translate the whole set of purls. As soon as purls in the new type start showing up, all the tools that augment and enrich will break until they add a compatibility layer.

CycloneDX

CycloneDX has one field, and one field only, for a purl (which IMHO is the way it should be). When generating an SBOM you need to choose one schema to use. This means that your SBOM needs to pick one type and hope that everything downstream can handle both. If a tool downstream can't translate, you just lost supply chain data.

Asset Management Systems:

Asset management systems cataloging components from SBOMs will need to support both. Imagine when the new log4shell hits a go module and you need to find it:

Did the advisory use go: or golang:? You better understand both, or you are screwed.
Once read, in which one do you store the data?
And then check your asset databases for variants of the same module in go: and golang:. Again if your system understands just one and a supplier handed you an SBOM in the other, you're toast.

Vulnerability Advisories and VEX

Both advisories and VEX documents use the same input: "A go module". If advisories are published in go: but all the VEX tooling understands golang: then you cannot match the component data between the advisory and the VEX doc. This cannot be worked around in either cdx (see my point above) or OpenVEX (it is also designed to handle just one purl). The only way to bridge this is hacking a compatibility layer in the ingestion logic.

Deduplication of Component Data

Say you merge two SBOMs and want to de-duplicate component data. Today you can just recurse the SBOM data and match, but now you would need to (only for Go) add another compatibility normalization before deduping components.
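
As a sketch of what that extra normalization could look like: the code below keys Go components by (module path, version) regardless of whether they arrive as golang or go purls. It assumes the go form puts the whole percent-encoded module path in the name, as in the pkg:go example earlier in this thread; the function names are hypothetical, and qualifiers and subpaths are ignored for brevity.

```python
from urllib.parse import unquote


def go_module_key(purl: str):
    """Return (module_path, version) for pkg:golang / pkg:go PURLs, else None."""
    body = purl.split("#", 1)[0].split("?", 1)[0]  # ignore subpath and qualifiers
    if not body.startswith("pkg:"):
        return None
    purl_type, _, remainder = body[len("pkg:"):].lstrip("/").partition("/")
    if purl_type not in ("golang", "go"):
        return None
    path, _, version = remainder.partition("@")
    # unquote() turns %2F back into "/", so both spellings yield the same path.
    return unquote(path), unquote(version)


def dedupe(purls):
    """Keep the first PURL seen for each component, treating both Go types as one."""
    seen = set()
    result = []
    for purl in purls:
        key = go_module_key(purl) or purl
        if key not in seen:
            seen.add(key)
            result.append(purl)
    return result
```

Every tool that matches or merges components would need some equivalent of `go_module_key`, which is the compatibility layer I'm talking about.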

Anyway I could go on and on..

jkowalleck · 2025-03-19T19:08:21Z

So we are again where we were 3 months ago? (and you claimed you've read all comments?!)

Then read this again: #338 (comment)

Tldr: downstream implementations are not our domain. Their poor decisions are not our concern, and there are solutions for that.

puerco · 2025-03-19T19:27:18Z

@jkowalleck :

we are here in the domain of specification.
we don't care if a change in the specification breaks downstream implementations due to poor design choices, since we are not in their domain.

downstream implementations are not our domain. Their poor decisions are not our concern.

I think the opposite is true. Steering a specification is a great responsibility. This is why, in general, steering members are elected from senior, experienced community members who grasp the consequences of their every move. You are in charge of fostering the adoption of the spec and ensuring decisions are made to keep a healthy community and ecosystem. In other words, you are responsible for poor implementations just as much as for those you may consider good ones. This change will break the data exchange for both.

Again, note that this is not a single tool's concern. It's about wrecking an ecosystem already exchanging data.

Regardless of what you think of the user's design, the new type will introduce a barrier that will blind tools using one type from the data already produced using the other and in the process, it will make naming go modules unreliable, possibly for years. We've done our part by providing feedback as adopters to fix the current type and not break the whole purl/go ecosystem. Feel free to take it or not.

jkowalleck · 2025-03-19T20:54:43Z

(i am so fed up trying to understand your points).
What is your solution, then? Your last one was introducing breaking changes into an existing spec. Still up?

matt-phylum · 2025-03-19T22:12:43Z

I agree that this is a likely case. When a new package type is added, I believe it is a reasonable expectation that the users will check if their scanners support it or the scanners will yell that the PURLs are not supported. Perhaps I am expecting too much. Either way, it looks like this problem, if it exists, exists for both pkg:go and pkg:cool-new-lang.

Yes, except that the user wasn't getting advisories for pkg:cool-new-lang before it existed and they will continue not getting advisories for pkg:cool-new-lang after its introduction. Users that are today getting advisories for pkg:golang may stop getting advisories for pkg:golang if their SBOM generation process switches to pkg:go without starting to get advisories for pkg:go. You could have installed everything and even tested to ensure that you receive notifications about vulnerabilities--you could even still be testing by periodically submitting a vulnerable SBOM to a test project--and this change could catch you by surprise.

For the software I work on, this would just be annoying. I can safely address it because it's a cloud-hosted solution and I am aware that the change might be happening and I can ensure that pkg:go is handled appropriately if it happens. I'm worried about other cloud solutions that are not watching this repository and particularly self-hosted solutions where it's unlikely anyone is watching or even aware of this repository.

Tldr: downstream implementations are not our domain. Their poor decisions are not our concern, and there are solutions for that.

This comment is confusing to me considering the #purl channel is on the OWASP CycloneDX Slack instance.

Is it a poor decision that CycloneDX doesn't provide a way to indicate that a pkg:go package may also be known as a pkg:golang package? Does CycloneDX have a different way to provide that kind of backwards compatibility? Admittedly, I don't know very much about CycloneDX and what it can or can't do, but the component model has room for one PURL and has no alias fields that I can see. If there's no way for CycloneDX to represent that the package could be either PURL, introducing a new PURL for an existing package will cause interoperability problems.

Is it a poor decision that Dependency-Track, also an OWASP project, identifies components by a single PURL and has Go-specific behavior that activates for pkg:golang but not the pkg:go that may come to exist in the future? Does it reject components if it doesn't understand the PURL package type? I haven't tried giving it unknown PURLs to see what it looks like to a user. If it accepts unknown PURLs, the interoperability problems caused by introducing a new PURL for an existing package may be difficult for a user to notice.

What is a good decision where this problem doesn't occur and how is that communicated to people designing systems that use PURL so this problem does not affect users? This PR is the most active, but there have been other requests to introduce a new package type for representing packages that are already supported by an existing type and any such PR creates the same potential for costly surprises.

jkowalleck · 2025-03-19T23:52:40Z

I don't see much sense in all this back-and-forth, we are just repeating.

You're very much invited to join the fortnightly PURL community meeting to discuss your points: #377 - Otherwise, I don't see a reason to give them any attention.

idunbarh · 2025-03-20T04:50:09Z

The concerns raised by @matt-phylum resonate with me. Users not receiving vulnerability notifications is a significant potential issue.

We don’t care if a change in the specification breaks downstream implementations due to poor design choices, since we are not in their domain.

The specification changes will impact the community. While downstream implementations are not the responsibility of the specification, serious consideration should be given to those most at risk from vulnerabilities (e.g., users who aren’t patching regularly and are waiting for alerts).

@jkowalleck @zpavlinovic, do you have any thoughts on what a good mitigation strategy would be for transitioning from golang to go for producers and consumers of SBOMs?

jkowalleck · 2025-03-20T08:34:54Z

do you have any thoughts on what a good mitigation strategy would be for transitioning from golang to go for producers and consumers of SBOMs?

just repeating #338 (comment)

let's play through this whole evolution, for arbitrary SBOM generators in combination with DependencyTrack(DT), and all its related systems:

if it applies: OSS indexes, would add yet another package identifier to their list, for each existing go package. they have this for SWID, SWHID, PURL, so adding another purl is possible.
SBOM generators would either switch from golang to go - and call this a breaking change in their domain, or use feature flags to use one or the other in a non-breaking way.
As long as the SBOM-ingesting tools don't support the new go spec, users would not use the breaking version and stick with the old one, or they would simply not use the new feature flag - whatever applies.
As soon as SBOM-ingesting tools support go, users could switch to the new behavior.
So as long as DependencyTrack(DT) does not know go, users would generate the SBOMs like before and "ignore" the new feature, until DT supports it.
Users that depend on other tools might use a different behaviour - the one that suits their needs.
the PURL-libraries would add support for parsing and canonicalizing go purls, just like they did for any other ecosystem.
Eventually, the Java library is able to parse a go purl into the parts(namespace, name, ...), and craft a canonicalized purl from these parts, just like it can do for golang - so migrating from one to the other is no issue.
SBOM ingesting tools would simply add the capability to understand the new go and act on it in their needed way.
For DependencyTrack(DT) this would depend on the PURL-Java library. DT crafts and parses purls, and uses them to match with existing OSS indexes.
As long as the OSS indexes don't support the new go purl, DT would convert the purls from go to golang, when matching purls to these indexes.
Eventually, we have the whole chain of tools and services capable of ingesting the new go purl.
And at that point, every purl generator could default to use go. no more feature flags and hold-back needed.
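
A rough sketch of the generator-side feature flag from the steps above. The function name and the `use_go_type` flag are made up for this illustration; the go form follows the pkg:go example shown earlier in this thread (whole module path percent-encoded into the name), and the golang form is written the way the thread's examples write it. Version encoding is left out for brevity.

```python
from urllib.parse import quote


def go_module_purl(module_path: str, version: str, use_go_type: bool = False) -> str:
    """Build a PURL for a Go module; the flag defaults to the existing type."""
    if use_go_type:
        # Opt-in behavior: empty namespace, full module path encoded as the name.
        return f"pkg:go/{quote(module_path, safe='')}@{version}"
    # Default behavior: unchanged output for users who have not opted in.
    return f"pkg:golang/{module_path}@{version}"
```

Flipping the default of `use_go_type` later would be the breaking change announced in a new major version of the generator.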

You see, none of these steps required breaking changes, none of these steps are illusory, all of these steps are how feature development in a stack of independent software has worked for ages.
Are these steps not common sense, don't they come to you naturally? I think they do, and i am certain that every maintainer in that chain comes to the same idea, since they deal with change management all the time - they know their peers.

Most importantly, for users that don't patch anything, nothing will change for them. They will still use the old stack and everything will work like before.
But don't underestimate users. They are capable of reading docs, man pages, change logs, and they are able to ask for help, file tickets, etc.
I am sure users will find out when and how to transition from one tool to the other.

P.S.: I noticed in earlier comments that people seem fully aware of how a proper transition from a no-feature state to a fully rolled-out feature would function in their respective fields. Are you concerned that others might not understand this? With all due respect, you're not the only one with insight, so perhaps it's fair to trust that others are equally capable of understanding this.

idunbarh · 2025-03-20T18:25:01Z

I'm specifically stating that thought and communication with tool developers should exist to address @matt-phylum's concerns about the transition from the golang type to the go type and the impact that would have on vuln discovery.

E.g. push the messaging that while most tools produce multiple IDs for components, it would be even better to include multiple PURLs to identify components in SBOMs (yes, I know that is not always supported in SBOM formats, we're a 100% CDX shop).

This enables a "grace period" for tool developers to transition.

Another example is to also include an "official" mapping from golang to go types to support conversions for those whose tooling has yet to migrate.
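
To illustrate what such a mapping might look like in tooling, here is a sketch only: it is not an official mapping. It assumes the go form is the one shown earlier in this thread (empty namespace, whole module path percent-encoded as the name). Passing qualifiers and subpath through unchanged is my assumption, and the function name is hypothetical.

```python
from urllib.parse import quote, unquote


def golang_to_go(purl: str) -> str:
    """Rewrite a pkg:golang PURL as a pkg:go PURL (illustrative mapping only)."""
    prefix = "pkg:golang/"
    if not purl.startswith(prefix):
        raise ValueError(f"not a pkg:golang PURL: {purl!r}")
    rest = purl[len(prefix):]
    # Split off qualifiers/subpath so they can be appended unchanged (assumption).
    cut = min((i for i in (rest.find("?"), rest.find("#")) if i != -1), default=len(rest))
    core, tail = rest[:cut], rest[cut:]
    path, sep, version = core.partition("@")
    module_path = "/".join(unquote(segment) for segment in path.split("/"))
    # safe="" makes quote() encode "/" as %2F, collapsing namespace and name.
    return f"pkg:go/{quote(module_path, safe='')}{sep}{version}{tail}"
```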

With all due respect, you're not the only one with insight, so perhaps it's fair to trust that others are equally capable of understanding this.

Do not confuse my concern in helping improve the user experience and help address a path forward for your proposal with belittling you or end users. What I do see is a lack of addressing the concern brought up by @matt-phylum. Providing contingencies into your PR that address community concerns will help drive agreement.

jkowalleck · 2025-03-21T08:13:18Z

Regarding communication: this is open-source, you do not know all your downstream users, but all know you.
The best solution I can think of is creating a "go" help-section in the GitHub discussions, and link this in the type-specs section for go and golang. There, we could kick off some expected questions, and then wait for community participation.

I would not want to suggest/dictate how the transition or adoption is to be made. It should be a natural process driven by the community. They will figure it out.

For example, when a new version of our favorite BOM standard is released, I implement its features in libraries in a non-breaking fashion, and I implement them in BOM generators and other tools using feature flags defaulting to off/false.
After a year or so, when I see that BOM-ingesting tools/services have adopted/support the new BOM features, I change the tools' feature flags to default to on/true, release a new major version of the tools, and describe the breaking changes in the release log.
If I can be responsible with my feature adoption and releases, I think others can be too.

With all due respect, you're not the only one with insight, so perhaps it's fair to trust that others are equally capable of understanding this.

Do not confuse my concern in helping improve the user experience and help address a path forward for your proposal with belittling you or end users. What I do see is a lack of addressing the concern brought up by @matt-phylum. Providing contingencies into your PR that address community concerns will help drive agreement.

Sorry, this was not directed at you, @idunbarh; it was aimed at all those naysayers: those people who claimed to know how standards work and how this one will definitely break the community because people would not know how to deal with changes and adoption.

puerco · 2025-03-21T16:49:16Z

Are these steps not common sense, don't they come to you naturally?

OK, no. They don't. But now I see where you're coming from:

So as long as DependencyTrack(DT) does not know go, users would generate the SBOMs like before and "ignore" the new feature, until DT supports it.

The problem you are not seeing is that not everything works in this simplistic way, where one person generates an SBOM and feeds it to DT (or similar).

Things are much more complex than that. SBOMs will be generated and provided by suppliers, and you have no control over the software they use to generate them; then you often need to use a mix of tools to achieve the document with the data you want. Also, as noted before, purls are in lots of other places beyond SBOMs, such as databases, vulnerability scan results, and other supply chain technologies and formats such as attestations, advisories, VEX and so on, which will take years to move, if they ever do it.

So there is no magic feature flag that you can just flick to move everything over to the new schema.

jkowalleck · 2025-03-22T09:22:00Z

SBOMs will be generated and provided by suppliers.

Suppliers you might have a contract with, which includes what and how they deliver - including the BOM, right?

Also, as noted before, purls are in lots of other places [...] which will take years to move, if they ever do it.

True, and fully understood. Who said that things should change from one day to the next? The more complex, the more management and time it takes. Been there, done that: I've worked in domains where I've accompanied change processes that took around 5 years to be fully effective.
But things will never change for the better if there is no spec/option to do so. Let's remember why the community came up with the new go type-spec in the first place, and what users are gaining from it.

So there is no magic feature flag that you can just flick to move everything over to the new schema.

Oh, there is a "feature flag" for almost everything. Let's sketch one for you:
Let's say you have a contract with suppliers A and B, both ship some software/hardware/whatever, and part of that contract is to provide a BOM with golang purls for it. You let A and B know that you want to change from golang purls to go purls some time in the future, and eventually, when the parties are ready, you contract that they supply BOMs with go purls.
There is no need to ask for go purls when you are not ready, so this change is under your control - no breaking changes to expect here. The same goes for all your other tools and processes that issue/ingest purls - they are probably not going to magically change by themselves - no reason to panic.
It just needs communication and planning ahead. Embrace the change, give people options, they will find out how to transition, if they have a benefit from it.

This situation would not be different, if the breaking changes you suggested were implemented.
Then, you would need to change all your processes to support the new golang purl, and the old golang purl in parallel, until you can fully transition.
This proposal here at least gives you a notice: a different purl type go is kind of a feature flag itself :-)

Anyway, the new go purl-type is a non-breaking enhancement of the type-spec.
It introduces a superset of the existing golang type-spec - golang purls can be losslessly migrated to go purls.

matt-phylum · 2025-03-24T12:00:24Z

I doubt anyone is going to write into a contract that SBOMs use pkg:golang or pkg:go when talking about Go. The receiving company would need to be aware of the problem ahead of time and you'd need to go through legal and contract negotiations to get it fixed.

It'd be much easier in the case where you know you need one form or the other to apply a transform that rewrites the PURLs to all be the required form.
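
A sketch of such a transform, assuming a CycloneDX-style JSON document already loaded as a dict with (possibly nested) "components" entries that each carry a "purl" field. The function names are hypothetical, and the go form is the one from the pkg:go example earlier in this thread. Anything after the "@" (version and any qualifiers) is passed through unchanged.

```python
import copy
from urllib.parse import quote, unquote


def convert_go_purl(purl: str, target: str) -> str:
    """Rewrite a pkg:golang or pkg:go PURL to the target type ("golang" or "go")."""
    for source in ("golang", "go"):
        prefix = f"pkg:{source}/"
        if purl.startswith(prefix):
            break
    else:
        return purl  # not a Go PURL: leave untouched
    path, sep, rest = purl[len(prefix):].partition("@")
    module_path = unquote(path)
    # go encodes "/" inside the name; golang keeps "/" as the segment separator.
    encoded = quote(module_path, safe="" if target == "go" else "/")
    return f"pkg:{target}/{encoded}{sep}{rest}"


def _normalize_components(components, target):
    for component in components:
        if "purl" in component:
            component["purl"] = convert_go_purl(component["purl"], target)
        _normalize_components(component.get("components", []), target)


def normalize_bom(bom: dict, target: str) -> dict:
    """Return a copy of the BOM with every Go PURL rewritten to the target type."""
    result = copy.deepcopy(bom)
    _normalize_components(result.get("components", []), target)
    return result
```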

puerco · 2025-03-24T15:31:55Z

I doubt anyone is going to write into a contract that SBOMs use pkg:golang or pkg:go when talking about Go.

I agree. Also, ensuring a sound technical proposal should not rely on legal assumptions. "Supplier" in the document exchange can mean any third party producing SBOMs, not necessarily as part of a contractual obligation. This is especially true for open source where there are no obligations whatsoever.

zpavlinovic · 2025-03-24T18:21:59Z

It would be very useful if you could create some scenarios/examples that are representative of the concerns you have

Happy to. I think the point missing in all of this is to understand purls are embedded in all sorts of documents and databases across the supply chain. It is not just one tool's concern. A software supply chain tool will ingest multitudes of data from different generators and databases.

I understand, but I also do think that the point missing right now is that the system is fundamentally broken.

Some practical examples:

In general, component matching becomes a mess as you need to build - for go specifically - a compatibility layer whenever you want to match components, this is done all the time when working with SBOMs, attestations, databases. The fact that the new type does not break the other does not mean that supply chain security tools don't need to ingest both.

This is already a huge mess with the current PURL. It allows for different modules to have the same PURL and for the same module to have multiple different PURLs.

SBOM Enrichment or Augmentation:

One SBOM generator reads component data. Another produces licensing data. You need to merge the output of both. But now you can't. The inputs to both are the same: "A Go module". If one speaks golang: and the other speaks go: you need to translate the whole set of purls. As soon as purls in the new type start showing up, all the tools that augment and enrich will break until they add a compatibility layer.

Again, the "translation" already needs to exist due to the mess the current Go PURL is creating. Correct matching is already a mess (theoretically impossible).

CycloneDX

CycloneDX has one field, and one field only, for a purl (which IMHO is the way it should be). When generating an SBOM you need to choose one schema to use. This means that your SBOM needs to pick one type and hope that everything downstream can handle both. If a tool downstream can't translate, you just lost supply chain data.

I understand, but I still maintain the position that this will also happen when a PURL for a completely new, say, language is introduced.

Asset Management Systems:

Asset management systems cataloging components from SBOMs will need to support both. Imagine when the new log4shell hits a go module and you need to find it:

Did the advisory use go: or golang:? You better understand both, or you are screwed.

Once read, in which one do you store the data?

And then check your asset databases for variants of the same module in go: and golang:. Again if your system understands just one and a supplier handed you an SBOM in the other, you're toast.

Vulnerability Advisories and VEX

Both advisories and VEX documents use the same input: "A go module". If advisories are published in go: but all the VEX tooling understands golang: then you cannot match the component data between the advisory and the VEX doc. This cannot be worked around in either cdx (see my point above) or OpenVEX (it is also designed to handle just one purl). The only way to bridge this is hacking a compatibility layer in the ingestion logic.

Deduplication of Component Data

Say you merge two SBOMs and want to de-duplicate component data. Today you can just recurse the SBOM data and match, but now you would need to (only for Go) add another compatibility normalization before deduping components.

Anyway, I could go on and on...

See my previous comments on matching.

zpavlinovic · 2025-03-24T18:25:12Z

Yes, except that the user wasn't getting advisories for pkg:cool-new-lang before it existed and they will continue not getting advisories for pkg:cool-new-lang after its introduction.

I am not sure I understand the second part of the sentence. How would they continue not getting advisories?

matt-phylum · 2025-03-24T18:31:05Z

If their software doesn't already support pkg:cool-new-lang, introducing pkg:cool-new-lang in this repository won't make their software pkg:cool-new-lang aware.

zpavlinovic · 2025-03-24T18:41:05Z

The concerns raised by @matt-phylum resonate with me. Users not receiving vulnerability notifications is a significant potential issue.

We don’t care if a change in the specification breaks downstream implementations due to poor design choices, since we are not in their domain.

The specification changes will impact the community. While downstream implementations are not the responsibility of the specification, serious consideration should be given to those most at risk from vulnerabilities (e.g., users who aren’t patching regularly and are waiting for alerts).

@jkowalleck @zpavlinovic, do you have any thoughts on what a good mitigation strategy would be for transitioning from golang to go for producers and consumers of SBOMs?

Here is how I see the situation. There are really three options going forward.

1. Do nothing.
2. Change the existing specification.
3. Introduce a new type as in here.

I think option 1 is the worst one. I personally don't mind option 2 (if the definition of golang is what is currently proposed for go), but I also do believe that option is worse than 3. I believe that pretty much all concrete concerns people have here with option 3 (e.g., SBOM enrichment and augmentation example of #338 (comment)) will also manifest with 2. Things will start failing and then it will be even harder to detect how and why they are failing. Option 3 at least makes it clear what is failing: a new type is introduced that is not being recognized.

I would have to think more about the strategy, but here are a few thoughts.

Producers should announce that they will start switching to go. After the grace period is finished, the default output will become go and golang output could be obtained via a flag or configuration.
Consumers should be able to ingest both go and golang.
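
The consumer side of that strategy could be sketched roughly as below. This is only an illustration, not part of the proposal: `moduleKey` is a hypothetical helper, the `example.com/mod` path is made up, and the sketch assumes that a `go` purl carries the full module path as a percent-encoded name while a `golang` purl spells the same path as namespace plus name. How slashes are encoded is still open in the core spec, and the sketch ignores the lowercasing that the current `golang` definition applies, so a real translation may be lossy.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// moduleKey reduces a purl of type "go" or "golang" to "path@version" so that
// both spellings of the same module compare equal. ok is false for any other
// type or for input that cannot be decoded.
func moduleKey(purl string) (key string, ok bool) {
	rest, found := strings.CutPrefix(purl, "pkg:")
	if !found {
		return "", false
	}
	// Qualifiers and subpath are dropped: this sketch matches modules only.
	if i := strings.IndexAny(rest, "?#"); i >= 0 {
		rest = rest[:i]
	}
	typ, path, found := strings.Cut(rest, "/")
	if !found || (typ != "go" && typ != "golang") {
		return "", false
	}
	version := ""
	if i := strings.LastIndex(path, "@"); i >= 0 {
		path, version = path[:i], path[i+1:]
	}
	// Decoding turns "%2F" back into "/", so both encodings yield one path.
	path, err := url.PathUnescape(path)
	if err != nil || path == "" {
		return "", false
	}
	version, err = url.PathUnescape(version)
	if err != nil {
		return "", false
	}
	if version == "" {
		return path, true
	}
	return path + "@" + version, true
}

func main() {
	// Hypothetical inputs: the same module written in both types.
	inputs := []string{
		"pkg:golang/example.com/mod@v1.2.3",
		"pkg:go/example.com%2Fmod@v1.2.3",
		"pkg:npm/left-pad@1.3.0",
	}
	for _, p := range inputs {
		key, ok := moduleKey(p)
		fmt.Printf("%s -> key=%q ok=%v\n", p, key, ok)
	}
}
```

A consumer that stores components under such a key would match an advisory written in one type against an SBOM written in the other, at the cost of discarding the subpath and qualifiers.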

zpavlinovic · 2025-03-24T18:46:18Z

If their software doesn't already support pkg:cool-new-lang, introducing pkg:cool-new-lang in this repository won't make their software pkg:cool-new-lang aware.

Hm, it could be the case that it took time for a PURL specification to land for existing software. Out of curiosity, how long did it take for Rust to get supported? I could see this still being a problem, but I also see how this might be a bigger problem with a new type for an existing language.

zpavlinovic · 2025-05-13T18:34:39Z

Just a gentle ping on this. How can we move forward with this?

jkowalleck · 2025-05-14T08:02:57Z

@package-url/core-team and @package-url/purl-spec-helpers are currently working hard towards standardization of the PURL core spec.
To be honest, I would not expect any movement in a PURL-type related ticket soon.

pombredanne · 2025-05-15T01:29:50Z

@jkowalleck re: #338 (comment)

To be honest, I would not expect any movement in a PURL-type related ticket soon.

Actually this is not correct. Support for Go is quite important in part because of the peculiarities of this package type and its ecosystem. But we need to resolve the last few bits of clarification on the percent encoding of slash in the core spec before being able to tackle this proposal.

Some of the problems here OTH are:

Getting two types, one for go, one for golang will be a source of confusion
The current proposal assumes you can determine where a Go module ends and where a Go package path starts. This is something that is impossible to guess reliably at scale short of making a call to a Go module proxy AFAIK, so using the subpath for Go package is unlikely to work smoothly.

zpavlinovic · 2025-05-15T16:34:48Z

@jkowalleck re: #338 (comment)

To be honest, I would not expect any movement in a PURL-type related ticket soon.

Actually this is not correct. Support for Go is quite important in part because of the peculiarities of this package type and its ecosystem. But we need to resolve the last few bits of clarification on the percent encoding of slash in the core spec before being able to tackle this proposal.

Some of the problems here OTH are:

Getting two types, one for go, one for golang will be a source of confusion

Just to reiterate. If we could redefine the existing type to have the spec proposed here, that is also fine with me. Although, I suspect this will cause problems too.

The current proposal assumes you can determine where a Go module ends and where a Go package path starts. This is something that is impossible to guess reliably at scale short of making a call to a Go module proxy AFAIK, so using the subpath for Go package is unlikely to work smoothly.

We thought about this when coming up with the proposal. Creating purls from Go binaries and source code should not run into the problem you mentioned, since there is sufficient information (1 and 2) to infer module and package paths. Were you perhaps thinking of some different Go-related artifacts?
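
To make the module/package split concrete, here is a minimal sketch under stated assumptions. It assumes the set of module paths in the build is already known (for a running program, `runtime/debug.ReadBuildInfo` exposes the main module and its dependencies) and picks the longest module path that prefixes the package path. `splitPackage` and the `example.com` paths are hypothetical, and longest-prefix matching is a simplification of what the go command actually does.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPackage finds the module that contains pkgPath by longest-prefix match
// against a list of known module paths, and returns the remainder as subpath.
// ok is false when no listed module is a prefix of pkgPath.
func splitPackage(pkgPath string, modules []string) (module, subpath string, ok bool) {
	for _, m := range modules {
		// A module matches only on a whole path element boundary.
		if pkgPath != m && !strings.HasPrefix(pkgPath, m+"/") {
			continue
		}
		if len(m) > len(module) {
			module, ok = m, true
		}
	}
	if !ok {
		return "", "", false
	}
	subpath = strings.TrimPrefix(pkgPath[len(module):], "/")
	return module, subpath, true
}

func main() {
	// Hypothetical build list; in practice it would come from build metadata.
	modules := []string{
		"example.com/mod",
		"example.com/mod/v2",
		"example.com/mod/tools",
	}
	for _, pkg := range []string{
		"example.com/mod/tools/cmd/x",
		"example.com/mod/v2/internal/a",
		"example.com/mod",
		"example.com/module/x",
	} {
		mod, sub, ok := splitPackage(pkg, modules)
		fmt.Printf("%s -> module=%q subpath=%q ok=%v\n", pkg, mod, sub, ok)
	}
}
```

Without the module list, the same package path cannot be split reliably, which is the case the proxy lookup concern above is about.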

Add new spec for go package URLs

3a5d973

matt-phylum reviewed Nov 4, 2024

jkowalleck added Proposed new type type: golang Proposed new type as well as component discussions labels Nov 8, 2024

jkowalleck requested changes Nov 8, 2024

pombredanne reviewed Nov 18, 2024

johnmhoran mentioned this pull request Nov 18, 2024

Resolve Go/go/golang-related issues and PRs as a group #346

Open

jkowalleck requested a review from a team March 21, 2025 08:24

johnmhoran added this to the 1.0-draft milestone Apr 4, 2025

oliverchang mentioned this pull request Jun 2, 2025

PURLS not following spec google/osv.dev#3530

Closed

	- The ``name`` will be the full module path.
	- The ``name`` is the full module path. It MUST be unmodified, and follow the `Go Module Reference <https://go.dev/ref/mod#go-mod-file-ident>`_.

	- The ``subpath`` will represent the package path within a module.
	- The ``subpath`` is the unmodified package path within a module.

	- The ``version`` will be a valid go version or pseudoversion, or empty.
	- The ``version`` may be a valid go version or pseudoversion, omitted when empty.
