Files are not output in CycloneDX formats #1710
Comments
Related to #1524 in that files belonging to said packages could be potentially excluded from the generic cataloguing path.
Related to #1256 in potential SPDX specification constraints.
Hey @bureado, thanks for the report. We're discussing this as a team and we think it might be useful if we were able to talk to you "live" about this issue and some of the related topics. Any chance you can join one of our upcoming community meetings? The next one will be April 27 at noon Eastern time. Or ping us on Slack (https://get.anchore.com/join-anchore-community/) and we can have an async conversation. Much appreciated!
This issue has a few different changes described. If we limit this to outputting files in CycloneDX format, this would solve what I believe is the main issue: SPDX and CycloneDX formats are not equivalent in terms of files being output. We could, of course, add more options to restrict what is actually being cataloged, but I think these would be separate issues from enhancing CycloneDX output.
I think this will be made a little better with #1383. Specifically, we'll be capturing file digests for files that are directly related to packages by default. That means that:
@wagoodman What do you think about making it configurable whether we output file information for SPDX? Other vendors like GitHub do not include this in the SBOM.
We could start doing that since we have format-specific configurations now. However, with the upcoming work in #1383 I think this will get much better:
Where file selection would be controlled by configuration (regardless of the format):

    file:
      metadata:
        # can be: all-files, owned-files, no-files
        selection: all-files
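
To make the three proposed values concrete, here is a minimal Go sketch of how such a setting could narrow the files that reach an output encoder. Every name in it (Selection, File, selectFiles) is hypothetical and not part of Syft's API. It also assumes each file record already knows whether a package owns it.

```go
package main

import "fmt"

// Selection mirrors the values listed in the config comment above.
// The type is hypothetical and does not come from Syft.
type Selection string

const (
	AllFiles   Selection = "all-files"
	OwnedFiles Selection = "owned-files"
	NoFiles    Selection = "no-files"
)

// File is a stand-in record: a path plus whether any package claims it.
type File struct {
	Path           string
	OwnedByPackage bool
}

// selectFiles returns the files an encoder would emit for the given
// selection. Unknown values are treated like no-files (an assumption).
func selectFiles(files []File, sel Selection) []File {
	switch sel {
	case AllFiles:
		return files
	case OwnedFiles:
		var owned []File
		for _, f := range files {
			if f.OwnedByPackage {
				owned = append(owned, f)
			}
		}
		return owned
	default:
		return nil
	}
}

func main() {
	// Invented sample data, for illustration only.
	files := []File{
		{Path: "/sbin/apk", OwnedByPackage: true},
		{Path: "/tmp/scratch", OwnedByPackage: false},
	}
	for _, sel := range []Selection{AllFiles, OwnedFiles, NoFiles} {
		fmt.Printf("%s: %d file(s)\n", sel, len(selectFiles(files, sel)))
	}
}
```

Because the filter sits in front of the encoders, the same setting would apply to every output format, which matches the "regardless of the format" intent described above.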
Sounds good, thanks @wagoodman. I ran a Syft-generated SPDX document through https://tools.spdx.org/app/validate/ and it complained that "Found analyzed files for package apk-tools when analyzedFiles is set to false", so the current output is not quite aligned. I'll check again once your PR is merged.
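
For anyone who wants to look for this kind of mismatch locally, here is a rough Go sketch. It is a guess at the condition the validator flags, not a reimplementation of it: it reports packages whose filesAnalyzed is false but which still list hasFiles or are the source of a CONTAINS relationship to a file. The field names follow the SPDX 2.x JSON layout; the function name and the sample document are invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal subset of the SPDX 2.x JSON document needed for the check.
type spdxPackage struct {
	SPDXID        string   `json:"SPDXID"`
	Name          string   `json:"name"`
	FilesAnalyzed *bool    `json:"filesAnalyzed"` // nil when the field is omitted
	HasFiles      []string `json:"hasFiles"`
}

type spdxFile struct {
	SPDXID string `json:"SPDXID"`
}

type spdxRelationship struct {
	Element string `json:"spdxElementId"`
	Type    string `json:"relationshipType"`
	Related string `json:"relatedSpdxElement"`
}

type spdxDocument struct {
	Packages      []spdxPackage      `json:"packages"`
	Files         []spdxFile         `json:"files"`
	Relationships []spdxRelationship `json:"relationships"`
}

// inconsistentPackages returns names of packages that declare
// filesAnalyzed=false yet still have files attached to them.
func inconsistentPackages(raw []byte) ([]string, error) {
	var doc spdxDocument
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, err
	}
	fileIDs := make(map[string]bool, len(doc.Files))
	for _, f := range doc.Files {
		fileIDs[f.SPDXID] = true
	}
	containsFile := make(map[string]bool)
	for _, r := range doc.Relationships {
		if r.Type == "CONTAINS" && fileIDs[r.Related] {
			containsFile[r.Element] = true
		}
	}
	var names []string
	for _, p := range doc.Packages {
		// An omitted filesAnalyzed is treated as true, so it is skipped here.
		if p.FilesAnalyzed == nil || *p.FilesAnalyzed {
			continue
		}
		if len(p.HasFiles) > 0 || containsFile[p.SPDXID] {
			names = append(names, p.Name)
		}
	}
	return names, nil
}

// sample is invented data, not real Syft output.
const sample = `{
  "packages": [
    {"SPDXID": "SPDXRef-Package-apk-tools", "name": "apk-tools", "filesAnalyzed": false},
    {"SPDXID": "SPDXRef-Package-busybox", "name": "busybox", "filesAnalyzed": false}
  ],
  "files": [{"SPDXID": "SPDXRef-File-sbin-apk", "fileName": "/sbin/apk"}],
  "relationships": [
    {"spdxElementId": "SPDXRef-Package-apk-tools",
     "relationshipType": "CONTAINS",
     "relatedSpdxElement": "SPDXRef-File-sbin-apk"}
  ]
}`

func main() {
	names, err := inconsistentPackages([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```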
Independent of #1383, I think a format configuration like you're suggesting wouldn't be a bad idea. Something like:

    spdx:
      # common options...
      analyze-files: true
      # json specific options
      json:
        ....
      # xml specific options
      xml:
        ...

We'd need to adjust the app configuration to allow for this nesting. Right now it's:

    spdx-json:
      ...
    spdx-xml:
      ...
What happened:
When scanning an image such as debian:bullseye, syft will catalog the individual files found in the image and output that information apparently only when using the spdx-tag-value, spdx-json and syft-json formats. This also means that users of syft using those formats appear to be penalized in terms of output file size.

What you expected to happen:
I would have expected a combination of:
It's possible some output formats (e.g., syft-table) don't support files or files aren't a good fit for the use case, since files can be less helpful for downstream consumption scenarios such as grype performing a vulnerability assessment.

It's possible this is by design. I'm bringing it up as it surprised me when looking at "stage" scans produced by buildkitd (which currently uses syft), given it only supports SPDX JSON, and I noticed some very large SBOM files containing file references which appear largely duplicative and of limited security interest.

Steps to reproduce the issue:
One of:
Anything else we need to know?:
This appears to correspond with the use of s.AllCoordinates only in the handlers for SPDX and Syft formats.

A compounding scenario is that
syft/syft/formats/common/spdxhelpers/to_format_model.go
Line 435 in 8a574c9
file-metadata cataloger is enabled) which makes the output larger at no increased security value. In syft-json output, MD5 digests are present. It's unclear if they can't be used in SPDX because of a spec constraint or something else.

Environment: