Add support for user provided content "hints" file #31

Open
wagoodman opened this issue Jun 1, 2020 · 11 comments
Labels
I/O Describes bug or enhancement around application input or output

Comments

@wagoodman
Contributor

wagoodman commented Jun 1, 2020

Syft should be aware of user-specified content files, which can override packages in a catalog or add known packages to it.

@wagoodman wagoodman added this to the v0.1.0 milestone Jun 3, 2020
@alfredodeza alfredodeza self-assigned this Jun 23, 2020
@alfredodeza alfredodeza removed their assignment Jun 23, 2020
@wagoodman
Contributor Author

@wagoodman wagoodman removed this from the v0.1.0 milestone Aug 5, 2020
@wagoodman
Contributor Author

Consider using CycloneDX for the input format as well #67

@wagoodman wagoodman changed the title from "Add support for content file" to "Add support for user provided content "hints" file" Oct 8, 2020
@wagoodman
Contributor Author

wagoodman commented Aug 17, 2021

We should consider allowing this functionality to live downstream (outside) of syft. Syft catalogs what was actually found, and if a modification to the output is needed, a consumer can perform it. It isn't immediately clear that this is syft's responsibility.

Since this is a security tool that can be used to verify supply-chain concerns, it is reasonable to assert that the SBOM output generated by syft should be verified by syft. Allowing a "catch all" hints file to add, modify, or remove packages outside of syft's observations would start to break this assumption.

@wagoodman wagoodman added the I/O Describes bug or enhancement around application input or output label Aug 23, 2021
@wagoodman
Contributor Author

wagoodman commented Sep 14, 2021

Somewhat contradictory to the above comment, I think there is room for adding "exceptional" content to syft output via configuration, but it matters how we do this. For example, we could label individual packages/elements with "manually-added" or similar to track in the SBOM what was "magically" added. We want the SBOMs we generate to be as reproducible as possible, which means being transparent about the inputs used to generate the SBOM (including content hints).

It could be that all content hints get injected into a separate SBOM that is referenced from the main SBOM, which contains only what was discovered.
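One way to picture the labeling idea is the minimal sketch below. It is not syft behavior: the `origin` and `hintsDigest` keys, the `label_hint_packages` helper, and the JSON layout of the hints file are all hypothetical names chosen for illustration. The sketch tags every hint-provided package and records a digest of the hints file so the extra input is visible in the result.

```python
import hashlib
import json


def label_hint_packages(hint_packages, hints_file_bytes):
    """Tag hint-provided packages and build a provenance record for the hints file.

    All key names used here ("origin", "hintsDigest", "hintsFile") are
    hypothetical; they are not part of any existing SBOM schema.
    """
    digest = hashlib.sha256(hints_file_bytes).hexdigest()
    labeled = [
        # Copy each package so the caller's data is left untouched.
        {**pkg, "origin": "manually-added", "hintsDigest": f"sha256:{digest}"}
        for pkg in hint_packages
    ]
    provenance = {"hintsFile": {"sha256": digest, "packageCount": len(labeled)}}
    return labeled, provenance


# Hypothetical hints file content: a JSON list of name/version pairs.
hints_bytes = b'[{"name": "example-package", "version": "1.0.0"}]'
labeled, provenance = label_hint_packages(json.loads(hints_bytes), hints_bytes)
```

Keeping the labeled packages in their own list also fits the separate-SBOM idea: the list could be written to its own document and referenced by digest from the main one.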

As a side note, "content hints" sounds very optional/conditional/implicit, whereas the mechanism being described here should imply that what is being used is explicit and intentional. We should consider naming this feature something other than "content hints".

@wagoodman
Contributor Author

We could simplify the functionality somewhat to make the solution space more tractable: what if we only allowed the addition of packages, and maybe the removal of packages, but not the modification of packages?

Just disallowing package mutations makes this much simpler (you don't need to pair-wise match every hint-package with every discovered-package and figure out which fields should be considered).
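A rough sketch of what add/remove-only semantics could look like, assuming packages are identified by an exact (name, version) key. The `apply_hints` helper and the dictionary layout are hypothetical, and rejecting a colliding addition (rather than merging it) is an assumption made here to keep mutation out of scope.

```python
def apply_hints(discovered, additions=(), removals=()):
    """Apply add/remove-only hints to a list of discovered packages.

    Packages are plain dicts keyed by (name, version); this identity rule
    is an assumption for illustration only.
    """
    removed_keys = {(r["name"], r["version"]) for r in removals}
    # Removal is an exact key lookup, not a field-by-field comparison.
    result = [p for p in discovered if (p["name"], p["version"]) not in removed_keys]

    present = {(p["name"], p["version"]) for p in result}
    for pkg in additions:
        key = (pkg["name"], pkg["version"])
        if key in present:
            # An addition that collides would amount to a modification.
            raise ValueError(f"hint would modify existing package {key}")
        result.append(dict(pkg))
        present.add(key)
    return result


discovered = [
    {"name": "pkg-a", "version": "1.0.0"},
    {"name": "pkg-b", "version": "2.0.0"},
]
result = apply_hints(
    discovered,
    additions=[{"name": "pkg-c", "version": "3.0.0"}],
    removals=[{"name": "pkg-b", "version": "2.0.0"}],
)
```

Because nothing is merged, there is no need to decide which fields of a hint take precedence over fields of a discovered package.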

@zhill
Member

zhill commented Jan 6, 2022

That is in line with the current anchore engine behavior, which can only add new entries to the list, not modify an existing entry.

@wagoodman
Contributor Author

wagoodman commented Jan 24, 2022

From refinement:

  • We probably shouldn't call this "content hints"
  • Possible implementation path: implement template output which would allow the user to add packages via the template

Note: this is blocking removal of the existing python code for the analysis in anchore-engine.

@thepwagner

Is CycloneDX the expected input format? (And if so, is this just a dupe of #737?)

I'm considering containers like eclipse-temurin:17-jre-alpine, which fetch a trusted binary that existing catalogers don't understand.

My naive and ideal solution would be dropping a CPE+purl in a simple text format at a path like /opt/java/openjdk/breadcrumb-for-syft.txt; this is feasible with echo in the same layer as the wget.
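To make the breadcrumb idea concrete, here is a minimal sketch of a reader for such a file. The format is an assumption, since no format has been agreed on in this thread: one identifier per line, with CPEs recognized by the `cpe:` prefix and purls by the `pkg:` prefix. The `parse_breadcrumb` name and the identifiers in the example are placeholders.

```python
def parse_breadcrumb(text):
    """Collect CPE and purl identifiers from a breadcrumb file.

    Assumed (hypothetical) format: one identifier per line, blank lines
    and lines starting with "#" are ignored.
    """
    identifiers = {"cpes": [], "purls": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("cpe:"):
            identifiers["cpes"].append(line)
        elif line.startswith("pkg:"):
            identifiers["purls"].append(line)
        else:
            # Fail loudly so a typo in the breadcrumb is not silently dropped.
            raise ValueError(f"unrecognized breadcrumb line: {line!r}")
    return identifiers


# Placeholder identifiers, not real product entries.
example = """\
# written with echo in the same layer as the download
cpe:2.3:a:example_vendor:example_product:1.0.0:*:*:*:*:*:*:*
pkg:generic/example-product@1.0.0
"""
parsed = parse_breadcrumb(example)
```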

@spiffcs
Contributor

spiffcs commented Jul 12, 2022

@wagoodman just a ping for when you get back: I think this "hints" file or new SBOM cataloger is a feature request we're seeing a lot more of this year. I want to see if we can agree on what the initial feature looks like so I can make a PR that supports at least some user-related configuration for custom CPE generation, while also allowing users to fill in packages that we cannot detect at the moment (binary analysis or db parse for image scan).

@thepwagner

We're using https://github.com/shopify/hansel as a hack today. For deb/apk/rpm-based distributions it generates empty packages that serve as simple hints: name+version. If there's a way we can accept+encode custom CPEs in the packages or you have any other feedback, please open an issue!

@spiffcs
Contributor

spiffcs commented Aug 1, 2023

During a recent discussion, we wanted to capture that format-specific fields should be considered as part of this hints file.

Ex:

I know package x has supplier: foobar
I expect the SPDX output of the sbom given this hints file to have foobar as the supplier no matter other syft logic
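The expectation in this example could be sketched as follows. This is only an illustration of the precedence rule ("the user-declared value wins"), not of real SPDX serialization: the `apply_field_overrides` helper, the dictionary layout, and the per-package-name keying of overrides are all hypothetical.

```python
def apply_field_overrides(packages, overrides):
    """Return packages with user-declared fields replacing inferred values.

    `overrides` maps a package name to the fields the user asserts,
    e.g. {"x": {"supplier": "foobar"}}. Both structures are hypothetical.
    """
    result = []
    for pkg in packages:
        declared = overrides.get(pkg["name"], {})
        # Later keys win in a dict merge, so declared fields take precedence.
        result.append({**pkg, **declared})
    return result


packages = [
    {"name": "x", "version": "1.0", "supplier": "inferred-by-other-logic"},
    {"name": "y", "version": "2.0", "supplier": "unchanged"},
]
overrides = {"x": {"supplier": "foobar"}}
final = apply_field_overrides(packages, overrides)
```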
