Support format SBOM conversion #563

Closed
wagoodman opened this issue Oct 16, 2021 · 6 comments · Fixed by #964
Labels
enhancement New feature or request

Comments

@wagoodman
Contributor

Syft can output multiple SBOM formats; however, once you have a document in one format, you can't convert it to another. This would be most useful if you have a syftjson-formatted document and want to produce SPDX (which should be lossless). In these cases, something like this would be nice:

syft my-image:latest -o json > original.json
syft convert original.json --to spdx > original.spdx
syft convert original.json --to cyclonedx > original-cyclonedx.xml

It's not clear what to do in cases where there is potentially lossy behavior (e.g. convert from cyclonedx to syftjson):

  • Warn the user that this is a lossy conversion and continue
  • Stop the conversion (possibly bypass with --force or similar option)
  • Do nothing; this is an end-user concern (I feel that this is not a good option)
  • Possibly guarantee non-lossy behavior by including all additional (not in specification) fields into all document formats (there is a heavy pro, and several cons)
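The first three options above amount to a policy switch. A minimal sketch of that switch follows, written in Python for brevity rather than in Syft's Go. The policy names, the `may_convert` helper, and the contents of `POTENTIALLY_LOSSY` are all hypothetical; only the cyclonedx to syftjson pair is taken from the discussion. The fourth option (carrying extra fields in every format) is not covered here.

```python
import warnings

# Assumption for illustration: only the cyclonedx -> syftjson pair is named
# in the discussion as potentially lossy; a real table would need auditing.
POTENTIALLY_LOSSY = {("cyclonedx", "syftjson")}


class LossyConversionError(Exception):
    """Raised when a potentially lossy conversion is refused."""


def may_convert(source, target, policy="warn", force=False):
    """Decide whether a conversion should proceed under the given policy.

    policy is one of "warn", "stop", or "ignore" (hypothetical names).
    """
    if policy not in ("warn", "stop", "ignore"):
        raise ValueError(f"unknown policy: {policy!r}")
    if (source, target) not in POTENTIALLY_LOSSY or policy == "ignore":
        return True
    message = f"converting {source} to {target} may lose data"
    if policy == "stop" and not force:
        # Refuse unless the caller explicitly opts in, like a --force flag.
        raise LossyConversionError(message + "; pass force=True to continue")
    # Either policy == "warn", or "stop" was overridden with force.
    warnings.warn(message)
    return True
```

With `policy="stop"`, a forced conversion still emits the warning, so the user sees why the override was needed.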

One question that comes to mind: do we want to restrict these conversions to documents that syft created to begin with? Or should we be able to generically convert between formats for a document generated by another (non-syft) tool? (Does this bring on more complexity or not? If so, how much?)

@wagoodman wagoodman added the enhancement New feature or request label Oct 16, 2021
@hectorj2f
Contributor

I'd start with converting from Syft's format to any of those, as detailed in the example. Another option could be to support multiple output formats instead of a single output format as today. wdyt ?

@luhring
Contributor

luhring commented Nov 10, 2021

Another option could be to support multiple output formats instead of a single output format as today. wdyt ?

This definitely makes sense — and I'd say this is a separate feature, which we happen to have captured in #325

@sambhav
Contributor

sambhav commented Feb 14, 2022

Relevant to this is CycloneDX's SPDX taxonomy: CycloneDX/cyclonedx-property-taxonomy#7

@wagoodman
Contributor Author

from refinement:

  • This ties into the github dependency API work in progress, pending Export GitHub format #836
  • Should we warn on a per-field basis? If so, how would we do this?
  • It may be that the cyclonedx and spdx encoders/decoders have gotten good enough where we don't need to warn on anything.
  • We can do an encode-decode-encode pass to detect whether we should output a warning (potential loss of data). This isn't perfect, as a difference may be caused by data ordering alone.
  • Do we want to capture intent of the convert functionality? That is, this is a utility to convert SBOMs that are similar to syft output, not cover all features in all SBOM specs. Suggestion: we can add verbiage in the readme and help text around this being for "packages", thus does not consider VEX or similar components (so that information would be lost).
  • Maybe we cover some of the caveats in a blog post to show the most useful use cases this covers.
  • Consider mirroring what we do with cyclonedx properties with spdx annotations.
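The encode-decode-encode idea from the list above can be sketched as follows, in Python for brevity rather than Syft's Go. Everything here is a toy stand-in: `encode`, the three decoders, and the `model` dictionary are hypothetical and do not reflect Syft's encoders or data model. To address the ordering caveat, both encodings are normalized before comparison, under the assumption that every list can be treated as unordered.

```python
import json


def canonical(value):
    """Normalize a decoded JSON value so ordering alone never counts as a difference.

    Assumption: every list is treated as unordered, which may hide real
    changes in formats where order is meaningful.
    """
    if isinstance(value, dict):
        return {key: canonical(item) for key, item in value.items()}
    if isinstance(value, list):
        return sorted((canonical(item) for item in value),
                      key=lambda item: json.dumps(item, sort_keys=True))
    return value


def round_trip_differs(model, encode, decode):
    """Encode, decode, and encode again; report whether the two encodings differ."""
    first = encode(model)
    second = encode(decode(first))
    return canonical(json.loads(first)) != canonical(json.loads(second))


# Toy stand-ins for a real encoder/decoder pair (hypothetical, not Syft's API).
def encode(model):
    return json.dumps(model)


def faithful_decode(text):
    return json.loads(text)


def lossy_decode(text):
    model = json.loads(text)
    for package in model["packages"]:
        package.pop("licenses", None)  # simulate a decoder that ignores a field
    return model


def reordering_decode(text):
    model = json.loads(text)
    model["packages"].reverse()  # same data, different order
    return model


model = {"packages": [
    {"name": "musl", "version": "1.2.2-r7", "licenses": ["MIT"]},
    {"name": "busybox", "version": "1.34.1-r3", "licenses": ["GPL-2.0-only"]},
]}

print(round_trip_differs(model, encode, faithful_decode))
print(round_trip_differs(model, encode, lossy_decode))
print(round_trip_differs(model, encode, reordering_decode))
```

Note that this pass only catches data the decoder drops. A field the encoder never writes is absent from both encodings, so the comparison would not flag it.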

@jonasagx
Contributor

jonasagx commented Apr 26, 2022

We can do an encode-decode-encode pass to detect whether we should output a warning (potential loss of data). This isn't perfect, as a difference may be caused by data ordering alone.

For SPDX, I suspect the encode/decode process has data loss due to library differences: we encode via our own JSON definition and decode using SPDX's Go lib. Moving to SPDX's official lib might help us on two fronts: offloading code we have to maintain and fixing the data loss.

It may be that the cyclonedx and spdx encoders/decoders have gotten good enough where we don't need to warn on anything.

[0000]  WARN unable to convert relationship from CycloneDX 1.3 JSON, dropping: {From:Pkg(name="musl" version="1.2.2-r7" type="apk" id="20dc20cbb6dbea6") To:Location<RealPath="/lib/ld-musl-x86_64.so.1" Layer="sha256:8d3ac3489996423f53d6087c81180006263b79f206d3fdec9e66f0e27ceb8759"> Type:contains Data:<nil>}
[0000]  WARN unable to convert relationship from CycloneDX 1.3 JSON, dropping: {From:Pkg(name="musl" version="1.2.2-r7" type="apk" id="20dc20cbb6dbea6") To:Location<RealPath="/lib/ld-musl-x86_64.so.1" Layer="sha256:8d3ac3489996423f53d6087c81180006263b79f206d3fdec9e66f0e27ceb8759"> Type:contains Data:<nil>}
[0000]  WARN unable to convert relationship from CycloneDX 1.3 JSON, dropping: {From:Pkg(name="busybox" version="1.34.1-r3" type="apk" id="2e32896982ce9587") To:Location<RealPath="/bin/busybox" Layer="sha256:8d3ac3489996423f53d6087c81180006263b79f206d3fdec9e66f0e27ceb8759"> Type:contains Data:<nil>}

@jonasagx
Contributor

jonasagx commented May 2, 2022

Open Q&As:

  • Why are files relevant to SBOMs?
    a: awareness of what sources/files are in the bundle. Digest-checking validation.

  • Are encode/decode the only operations during conversion?
    a: yes

  • Is encode/decode/encode distorting the use-case? Maybe just test encode/decode?
    a: A better test seems to be decode/encode

  • What is next for this feature?
    a: iterate with community about what is good and what needs improvement.

  • What do we tell users about data loss? Warn?
    a: clearly say it is an experimental feature, and point out where to find more info.

  • Provide two tables with warnings about possible loss of data? (1) for encoding (2) for decoding since direction of conversion matters. Maybe one with both directions?
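One way to make the decode/encode check and the data-loss warnings concrete is to list which fields of the input document are missing from the converted output. The sketch below is in Python for brevity rather than Syft's Go; `lost_fields` and both sample documents are hypothetical. It assumes both documents are already parsed into plain dictionaries and lists with comparable field names, that list order is preserved, and it reports only missing fields, not changed values.

```python
def lost_fields(original, converted, prefix=""):
    """Return the paths present in `original` but absent from `converted`."""
    lost = []
    if isinstance(original, dict):
        if not isinstance(converted, dict):
            return [prefix or "<root>"]
        for key, value in original.items():
            path = f"{prefix}.{key}" if prefix else key
            if key not in converted:
                lost.append(path)
            else:
                lost.extend(lost_fields(value, converted[key], path))
    elif isinstance(original, list):
        if not isinstance(converted, list):
            return [prefix or "<root>"]
        # Assumption: elements are compared by position, so order must match.
        for index, value in enumerate(original):
            path = f"{prefix}[{index}]"
            if index >= len(converted):
                lost.append(path)
            else:
                lost.extend(lost_fields(value, converted[index], path))
    return lost


# Hypothetical documents: the converted one dropped files and relationships.
original = {
    "packages": [{"name": "musl", "version": "1.2.2-r7",
                  "files": ["/lib/ld-musl-x86_64.so.1"]}],
    "relationships": [{"type": "contains"}],
}
converted = {"packages": [{"name": "musl", "version": "1.2.2-r7"}]}

for path in lost_fields(original, converted):
    print(f"WARN field lost during conversion: {path}")
```

Collecting such paths per source and target format is one possible way to populate the tables suggested above.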

MVPing it

Basic supported fields:

  • Packages (most relevant, because of pURLs)
  • Files (2nd most relevant)
  • Relationships*

Other relevant points

  • Decoding is the main operation with conversion.
  • No focus on automated tests for 1st version.


5 participants