Invoke known tools to gather build-time dependency information #1562

kzantow · 2023-02-09T22:11:49Z

What would you like to be added:
Add the ability to shell-out to known tools such as go and mvn in order to capture more accurate build-time dependency information.

Why is this needed:
To improve the build-time dependency support in Syft.

Additional context:
From a working document:

Creating higher quality SBOMs in Syft at build-time

Static analysis of dependencies at build time, as implemented today, is limited. Static analysis can be improved by simulating what build systems do, but that approach is subject to drift and needs additional maintenance to keep up with changes in build system behavior.

One approach to resolving this issue is to call out to build systems to get that information instead. This introduces additional (optional) dependencies.

Syft as a build-time SBOM generator tool

Syft can be seen as a “build-time” SBOM generator tool, and can start to make use of build tooling. The sections below give examples of build tools that Syft could call out to.

Golang

Instead of reading and trying to parse and resolve the go.mod file, the go mod graph command can be used to get a fully resolved dependency tree.

Use go mod graph to resolve dependency information for go source #2018
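
As a rough illustration of the Go case, the sketch below runs go mod graph and turns its output into an adjacency map. It assumes the documented output format (one edge per line, two space-separated fields) and that the go binary is on the PATH. The function names parseModGraph and goModGraph are hypothetical and are not part of Syft.

```go
package main

import (
	"bufio"
	"fmt"
	"os/exec"
	"strings"
)

// parseModGraph parses `go mod graph` output. Each non-empty line is expected
// to hold one edge: "<module>[@version] <dependency>@<version>".
func parseModGraph(output string) (map[string][]string, error) {
	graph := make(map[string][]string)
	scanner := bufio.NewScanner(strings.NewReader(output))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) == 0 {
			continue // skip blank lines
		}
		if len(fields) != 2 {
			return nil, fmt.Errorf("unexpected go mod graph line: %q", scanner.Text())
		}
		graph[fields[0]] = append(graph[fields[0]], fields[1])
	}
	if err := scanner.Err(); err != nil {
		return nil, err
	}
	return graph, nil
}

// goModGraph shells out to the go tool in dir (hypothetical helper).
func goModGraph(dir string) (map[string][]string, error) {
	cmd := exec.Command("go", "mod", "graph")
	cmd.Dir = dir
	out, err := cmd.Output()
	if err != nil {
		return nil, fmt.Errorf("running go mod graph: %w", err)
	}
	return parseModGraph(string(out))
}

func main() {
	// Made-up sample output, so the example runs without a Go module on disk.
	sample := "example.com/app example.com/liba@v1.2.0\n" +
		"example.com/liba@v1.2.0 example.com/libb@v0.3.1\n"
	graph, err := parseModGraph(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for parent, deps := range graph {
		fmt.Println(parent, "->", deps)
	}
}
```

Keeping the parser separate from the command execution means the parsing logic can be tested without the go toolchain present.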

Java

Maven has the mvn dependency:tree command which shows the fully-resolved dependency graph.

Use mvn dependency:tree to resolve dependency information for Maven source #2019
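
A minimal sketch of how the default text output of mvn dependency:tree might be turned into parent/child edges is shown below. It assumes the usual "[INFO] "-prefixed layout in which each nesting level is drawn with three characters ("+- ", "|  ", "\- "). That layout is a display format rather than a stable interface, so treat this as illustrative only. The names mavenEdge and parseMavenTree are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// mavenEdge links a parent coordinate to one of its direct dependencies.
type mavenEdge struct {
	Parent string
	Child  string
}

// parseMavenTree extracts edges from `mvn dependency:tree` text output.
// Lines that do not look like tree entries are ignored.
func parseMavenTree(output string) []mavenEdge {
	var edges []mavenEdge
	var stack []string // stack[d] is the latest coordinate seen at depth d
	for _, line := range strings.Split(output, "\n") {
		line = strings.TrimRight(line, "\r")
		if !strings.HasPrefix(line, "[INFO] ") {
			continue
		}
		line = strings.TrimPrefix(line, "[INFO] ")
		// The coordinate starts at the first character that is not tree drawing.
		idx := strings.IndexFunc(line, func(r rune) bool {
			return !strings.ContainsRune("+-\\| ", r)
		})
		if idx < 0 || idx%3 != 0 {
			continue
		}
		coord := strings.TrimSpace(line[idx:])
		// Coordinates look like group:artifact:type:version[:scope], no spaces.
		if strings.Count(coord, ":") < 3 || strings.ContainsAny(coord, " \t") {
			continue
		}
		depth := idx / 3
		if depth > len(stack) {
			continue // malformed nesting, skip rather than guess
		}
		stack = append(stack[:depth], coord)
		if depth > 0 {
			edges = append(edges, mavenEdge{Parent: stack[depth-1], Child: coord})
		}
	}
	return edges
}

func main() {
	// Made-up sample output for illustration.
	sample := `[INFO] com.example:app:jar:1.0.0
[INFO] +- org.example:lib-a:jar:2.0.0:compile
[INFO] |  \- org.example:lib-b:jar:3.0.0:compile
[INFO] \- junit:junit:jar:4.13:test`
	for _, e := range parseMavenTree(sample) {
		fmt.Println(e.Parent, "->", e.Child)
	}
}
```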

NPM

npm has the npm ls --all command, which prints the full tree of installed packages.

Use npm ls --all to resolve dependency information for NPM packages #2020
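
For npm, one possible approach is to add the --json flag to npm ls --all and walk the nested "dependencies" objects. The sketch below assumes that JSON shape (a root object with "name", "version", and a "dependencies" map whose values nest in the same way). The type npmNode and the function parseNpmLs are hypothetical names used for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// npmNode mirrors the subset of `npm ls --all --json` output used here.
type npmNode struct {
	Name         string             `json:"name"`
	Version      string             `json:"version"`
	Dependencies map[string]npmNode `json:"dependencies"`
}

// collectEdges appends "parent -> child" strings for every nested dependency.
func collectEdges(parent string, node npmNode, edges *[]string) {
	for name, dep := range node.Dependencies {
		child := name + "@" + dep.Version
		*edges = append(*edges, parent+" -> "+child)
		collectEdges(child, dep, edges)
	}
}

// parseNpmLs returns a sorted list of dependency edges from the JSON output.
func parseNpmLs(jsonOutput string) ([]string, error) {
	var root npmNode
	if err := json.Unmarshal([]byte(jsonOutput), &root); err != nil {
		return nil, fmt.Errorf("decoding npm ls output: %w", err)
	}
	var edges []string
	collectEdges(root.Name+"@"+root.Version, root, &edges)
	sort.Strings(edges) // map iteration order is random, so sort for stability
	return edges, nil
}

func main() {
	// Made-up sample output for illustration.
	sample := `{"name":"app","version":"1.0.0","dependencies":{
		"a":{"version":"2.0.0","dependencies":{"b":{"version":"3.0.0"}}}}}`
	edges, err := parseNpmLs(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, e := range edges {
		fmt.Println(e)
	}
}
```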

Python

Python has pipdeptree, a third-party tool that prints installed packages as a dependency tree.

Python pip dependency information #2023

Considerations

When using external tooling, the tool version and the parameters passed to it should be captured.
Warnings about the quality of the inputs used to generate the SBOM can be made visible, along with suggestions on how to obtain better SBOMs (e.g. dependency pinning).
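
One way to capture version and parameter information is to record it alongside each tool call. The sketch below is an assumption about what such a record could look like, not an existing Syft structure. The names toolInvocation, firstLine, and recordInvocation are hypothetical, and the caller supplies the tool's version arguments (for example "version" for go).

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// toolInvocation records which external tool was used and how (hypothetical).
type toolInvocation struct {
	Tool    string
	Args    []string
	Version string
}

// firstLine returns the first non-empty line of s, trimmed of whitespace.
func firstLine(s string) string {
	s = strings.TrimSpace(s)
	if i := strings.IndexByte(s, '\n'); i >= 0 {
		return strings.TrimSpace(s[:i])
	}
	return s
}

// recordInvocation asks the tool for its version and pairs the answer with
// the arguments that will be used for the dependency query.
func recordInvocation(tool string, versionArgs, args []string) (toolInvocation, error) {
	out, err := exec.Command(tool, versionArgs...).Output()
	if err != nil {
		return toolInvocation{}, fmt.Errorf("querying %s version: %w", tool, err)
	}
	return toolInvocation{Tool: tool, Args: args, Version: firstLine(string(out))}, nil
}

func main() {
	inv, err := recordInvocation("go", []string{"version"}, []string{"mod", "graph"})
	if err != nil {
		// The tool is optional, so a missing binary is reported, not fatal.
		fmt.Println("tool unavailable:", err)
		return
	}
	fmt.Printf("tool=%s args=%v version=%q\n", inv.Tool, inv.Args, inv.Version)
}
```

A record like this could then be attached to the SBOM output so that consumers can see which tool version produced the dependency data.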


kzantow · 2023-02-09T22:15:14Z

cc: @lumjjb

wagoodman · 2023-04-13T17:05:46Z

This is most likely needed in order to achieve #1674 and #572 in a meaningful way.

This functionality should be opt-in; that is, by default syft should remain a static analysis tool. Executing other commands on the system should still not be allowed by default (again, unless the user opts in).

Considerations:

should these "external querying capabilities" be encapsulated into their own separate catalogers? For example, go-mod-file-cataloger stays as it is today and we allow for a new go-tooling-cataloger. In this way, opting in would mean adding a cataloger (or enabling a flag which would automatically swap out one cataloger for another)... or should we go in the direction of keeping the existing catalogers and having them behave differently based on configuration? (one assumption I'm making by going down this path is that the impl for the go.mod cataloging today is mutually exclusive to using the build tooling)
we probably don't want to find duplicate packages by doing both a static analysis and a tooling query; there should be an obvious mechanism for enforcing mutual exclusivity between the existing (static) analysis and the tooling analysis.
even if a new cataloger is not used to encapsulate this behavior, it should be obvious to the user that a package was found via a tooling query rather than by looking at just the go.mod contents (probably via more than just the application configuration).
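
To make the first two considerations concrete, here is a rough sketch of the "swap one cataloger for another" option: for each ecosystem, either the static cataloger or the tooling cataloger is selected, never both, and tooling is only used when the user opts in. The cataloger type, its fields, and selectCatalogers are hypothetical and do not reflect Syft's actual cataloger API. go-tooling-cataloger is the proposed name from the comment above, and example-static-cataloger is a made-up placeholder.

```go
package main

import "fmt"

// cataloger is a simplified stand-in for a package cataloger (hypothetical).
type cataloger struct {
	Name        string
	Ecosystem   string
	UsesTooling bool // true if the cataloger shells out to an external tool
}

// selectCatalogers keeps exactly one kind of cataloger per ecosystem:
// tooling-based ones when allowTooling is set and one exists for that
// ecosystem, static ones otherwise.
func selectCatalogers(all []cataloger, allowTooling bool) []cataloger {
	hasTooling := make(map[string]bool)
	for _, c := range all {
		if c.UsesTooling {
			hasTooling[c.Ecosystem] = true
		}
	}
	var selected []cataloger
	for _, c := range all {
		useTooling := allowTooling && hasTooling[c.Ecosystem]
		if c.UsesTooling == useTooling {
			selected = append(selected, c)
		}
	}
	return selected
}

func main() {
	all := []cataloger{
		{Name: "go-mod-file-cataloger", Ecosystem: "go", UsesTooling: false},
		{Name: "go-tooling-cataloger", Ecosystem: "go", UsesTooling: true},
		{Name: "example-static-cataloger", Ecosystem: "example", UsesTooling: false},
	}
	for _, optIn := range []bool{false, true} {
		fmt.Printf("tooling opt-in=%v:", optIn)
		for _, c := range selectCatalogers(all, optIn) {
			fmt.Printf(" %s", c.Name)
		}
		fmt.Println()
	}
}
```

Because the selected cataloger's name ends up in the results, this option also goes some way toward the third consideration: the output itself would show whether a package came from a tooling query or from static analysis.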

setchy · 2023-06-30T16:21:03Z

would the same be true for npm, too?

noqcks · 2023-09-12T04:41:46Z

Instead of shelling out to cli tools, would you consider building parsers directly inside syft? It wouldn't require one to depend on the presence of local tooling, and I could envision that the tools might have different outputs depending on the installed version of the tooling.

Snyk, for example, has built a bunch of parsers for various ecosystems in js https://github.com/snyk/dotnet-deps-parser

I just implemented something similar in cdxgen for .NET [ref] and npm [ref] to determine direct/indirect deps in build files. Wondering if the same could work here but written in golang.

Add the ability to shell-out to known tools such as go and mvn in order to capture more accurate build-time dependency information.

Can you expand on what specifically would be more accurate? In my mind I can only imagine direct/indirect deps. But is there more?

I also noticed that syft doesn't generate a dependencies section for CycloneDX for different language-specific files (go.mod, package-lock.json). Would the outcome of this issue be that this section would be filled?

kzantow · 2023-09-12T11:29:30Z

@noqcks we do already have lots of parsers for different ecosystems. This change, at least initially, would be an opt-in behavior to shell out to the tools. This would allow things like Go - which has a flat list of dependencies in the go.mod - to get the dependency graph and properly output it in different formats.

noqcks · 2023-09-12T17:57:43Z

I suppose what I meant is only shelling out to tools where necessary (in the case that go mod graph is truly the only way to see the dependency tree for go projects), and writing all other dep graph parsers directly into syft where possible.

I'd like to work on getting a real dependency graph for javascript projects inside syft, and wanted to write this dep parser inside syft instead of relying on an external npm cli.

Wanted to clarify whether this would be an appropriate avenue to pursue before I started the work.

wagoodman · 2024-03-15T13:59:59Z

Since this is a potentially large item that would affect multiple ecosystems, I think a detailed plan is needed to move forward with this (how would this work within a single cataloger, what abstractions do we want to introduce (if any), would abstractions be generalizable to other ecosystem catalogers (if so, how), etc).

kzantow added the enhancement New feature or request label Feb 9, 2023

kzantow mentioned this issue Feb 9, 2023

Explore using cyclonedx gomod as a library #761

Closed

kzantow mentioned this issue Feb 9, 2023

Unable to identify license on Golang packages imported by URL #1056

Closed

kzantow self-assigned this Feb 14, 2023

tgerla mentioned this issue Feb 23, 2023

Parsing gradle dependencies from the output of ./gradlew dependencies task #1553

Closed

kzantow mentioned this issue Apr 13, 2023

Support fetching packaging information from build tooling #1736

Closed

This was referenced Aug 14, 2023

Python pip dependency information #2023

Open

Add support for dpkg dependency relationships #2040

Closed

willmurphyscode mentioned this issue Oct 26, 2023

Grype is catching a false positive on spring-boot-starter-web because it cannot detect inherited version from parent anchore/grype#1012

Open

wagoodman added the planning high level epic that should be broken into smaller tasks label Feb 7, 2024

willmurphyscode mentioned this issue Feb 28, 2024

Support use of Maven to resolve all dependencies. #2669

Open

kzantow mentioned this issue May 9, 2024

Capture licenses for all packages #2861

Open

44 tasks
