Use purl to identify artifacts in EiffelArtifactCreatedEvent #182

Closed
d-stahl-ericsson opened this issue Mar 8, 2018 · 3 comments
Labels: protocol (all protocol changes), protocol-incompat (protocol changes that aren't backwards compatible)

@d-stahl-ericsson
Contributor

Description

EiffelArtifactCreatedEvent currently uses GAV (Maven's groupId, artifactId and version) to identify all artifacts, which is a poor fit for artifacts that are not Maven based. The purl project (https://github.com/package-url/purl-spec) aims to standardize package identification across tools. We could reuse that rather than trying to roll our own.

Motivation

Standardized tool-agnostic identification of artifacts.

Exemplification

EiffelArtifactCreatedEvent currently identifies artifacts like so:

"data": {
  "gav": {
    "groupId": "com.mycompany.myproduct",
    "artifactId": "artifact-name",
    "version": "2.1.7"
  },
...

If the artifact is something other than a Maven artifact, this is awkward. If it's a Docker image, a Ruby gem or something else, it would be much easier to identify it in a way that is truer to its "native" form. Using purl, this could look like:

"data": {
  "purl": "pkg:docker/cassandra@sha256:244fd47e07d1004f0aed9c",
...

or

"data": {
  "purl": "pkg:gem/ruby-advisory-db-check@0.12.4",
...

or, for that matter, in the original GAV example:

"data": {
  "purl": "pkg:maven/com.mycompany.myproduct/artifact-name@2.1.7",
...
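As a rough illustration of how an existing gav object could be mapped to the purl string above, here is a minimal Python sketch. The function name `gav_to_purl` is hypothetical and not part of Eiffel or of any purl library; it assumes the `pkg:` scheme and the `maven` type used in the example, and it only percent-encodes each segment rather than implementing the full purl rules.

```python
from urllib.parse import quote


def gav_to_purl(group_id: str, artifact_id: str, version: str) -> str:
    """Build a purl string from Maven coordinates (hypothetical helper).

    Assumes the "pkg:" scheme and the "maven" type, with groupId as the
    purl namespace and artifactId as the purl name.
    """
    # Percent-encode each segment so that characters such as "/" or "@"
    # inside a value cannot be mistaken for purl separators.
    namespace = quote(group_id, safe="")
    name = quote(artifact_id, safe="")
    ver = quote(version, safe="")
    return f"pkg:maven/{namespace}/{name}@{ver}"


gav = {
    "groupId": "com.mycompany.myproduct",
    "artifactId": "artifact-name",
    "version": "2.1.7",
}
print(gav_to_purl(gav["groupId"], gav["artifactId"], gav["version"]))
# prints: pkg:maven/com.mycompany.myproduct/artifact-name@2.1.7
```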

Presumably we should also drop data.fileInformation. The concepts of classifier and extension don't make much sense if we cut the tight connection to Maven. Purl, on the other hand, allows identification of package internals in the identifier string, which is arguably a step up from the current syntax.

Benefits

Standardized, more flexible and precise identification of packages.

Possible Drawbacks

Losing data.fileInformation.

@e-backmark-ericsson
Member

Sounds promising to me. So, is the suggestion that we drop the gav object, or that we just add a purl object and let it be mutually exclusive with the gav object?
Also, I'm a little bit concerned about the name of the object. Would it be better to call it 'identity' or 'coordinates' (as in the leading text in that event's description)?

The ArtC event also has the 'implements' and 'dependsOn' objects, which are assumed to be lists of gavs. How should we handle those? If the format of those should remain implicit, as today, but change to purl identifiers, I think it makes sense not to use the 'purl' keyword for the identity of the artifact itself.

In your suggestion, the object is a simple string in purl format. In the gav case we've constructed it from its atoms, which in purl would be: type, namespace, name, version, qualifiers, subpath. I tend to like the string format, but is there a reason why you suggest not constructing it from its atoms?
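To make the string-versus-atoms trade-off concrete, the following Python sketch splits a purl string back into the six atoms listed above. It is a simplified illustration, not the reference parsing algorithm from the purl specification: `Purl` and `parse_purl` are hypothetical names, the `pkg:` scheme is assumed, and validation and normalization are mostly skipped.

```python
from typing import Dict, NamedTuple, Optional
from urllib.parse import unquote


class Purl(NamedTuple):
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str]
    qualifiers: Dict[str, str]
    subpath: Optional[str]


def parse_purl(purl: str) -> Purl:
    """Split a purl string into its atoms (simplified, hypothetical helper)."""
    # Optional trailing parts: "#subpath" first, then "?qualifiers".
    remainder, _, subpath = purl.partition("#")
    remainder, _, query = remainder.partition("?")

    # The scheme is split on the first ":" only, so a version such as
    # "sha256:..." keeps its own colon.
    scheme, sep, remainder = remainder.partition(":")
    if not sep or scheme != "pkg":
        raise ValueError(f"not a pkg: purl: {purl!r}")

    ptype, sep, remainder = remainder.strip("/").partition("/")
    if not sep or not remainder:
        raise ValueError(f"purl has no name: {purl!r}")

    # The version follows the last "@"; the name follows the last "/".
    version = ""
    if "@" in remainder:
        remainder, _, version = remainder.rpartition("@")
    namespace, _, name = remainder.rpartition("/")

    pairs = (pair.partition("=") for pair in query.split("&"))
    qualifiers = {key: unquote(value) for key, _, value in pairs if key}

    return Purl(
        type=ptype.lower(),
        namespace=unquote(namespace) or None,
        name=unquote(name),
        version=unquote(version) or None,
        qualifiers=qualifiers,
        subpath=unquote(subpath) or None,
    )


print(parse_purl("pkg:maven/com.mycompany.myproduct/artifact-name@2.1.7"))
print(parse_purl("pkg:docker/cassandra@sha256:244fd47e07d1004f0aed9c"))
```

A consumer that needs the individual fields can recover them this way, which suggests that carrying the compact string form in the event does not lose information compared with a structured object.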

Spontaneously I would like to see this syntax:
data.identity (identity/coordinates in purl string syntax)
data.implements[] (list of identities in purl string syntax)
data.dependsOn[] (list of identities in purl string syntax)
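A small Python sketch of what the data object could look like under this proposed syntax is shown below. The helper `build_artifact_data` is hypothetical, the `pkg:` prefix check assumes the scheme from the original proposal, and the purls listed under implements and dependsOn are made-up placeholders rather than real relationships.

```python
from typing import Dict, List, Optional, Union


def build_artifact_data(
    identity: str,
    implements: Optional[List[str]] = None,
    depends_on: Optional[List[str]] = None,
) -> Dict[str, Union[str, List[str]]]:
    """Assemble the proposed data object (hypothetical helper, not Eiffel API)."""
    implements = list(implements or [])
    depends_on = list(depends_on or [])

    # Very light sanity check only: every identity must use the pkg: scheme.
    for purl in [identity, *implements, *depends_on]:
        if not purl.startswith("pkg:"):
            raise ValueError(f"not a purl string: {purl!r}")

    return {
        "identity": identity,
        "implements": implements,
        "dependsOn": depends_on,
    }


data = build_artifact_data(
    identity="pkg:maven/com.mycompany.myproduct/artifact-name@2.1.7",
    # Placeholder values for illustration only.
    implements=["pkg:maven/com.mycompany.myproduct/artifact-api@2.1.0"],
    depends_on=["pkg:gem/ruby-advisory-db-check@0.12.4"],
)
print(data["identity"])
```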

@e-backmark-ericsson
Member

Also, when I read about purl, I don't see the 'pkg' keyword that you're using. As I understand it, a Maven example in purl should be: "maven:com.mycompany.myproduct/artifact-name@2.1.7"

@d-stahl-ericsson
Contributor Author

The pkg: scheme is not yet on the master branch. There's a rather long conversation in this pull request: package-url/purl-spec#31

From what I can tell it seems more than likely to be merged. By the way, there's an ongoing conversation about this in the Grafeas community, and they seem to be leaning towards adopting purl for artifact identification as well.

As for your comments on the syntax, yes, I completely agree. And to be consistent, we should change meta.source.serializer, too.
