Guac, through the deps.dev collector, currently pulls in numerous dependency false positives.
The Problem
The deps.dev collector attempts to pull in the dependencies of each package it learns about. The result is that Guac ingests all of the packages in any dependency tree. In these trees, there can be many different versions of the same package, and as a result all of these packages are ingested as dependencies into Guac.
This is, generally, not how executables are built. Compilers employ dependency resolution algorithms to (in most cases) select a single version of each package from each dependency tree.
As a result, Guac instances can contain (as dependencies) numerous versions of packages that are not actually in the built artifact. Furthermore, different versions of a package can themselves have different dependencies, so entire packages, not just package versions, in Guac can be false positives.
Aside: Wider Perspective
The underlying problem is that dependency resolution is done in specific contexts: of a specific root package, of the dependency resolver, of a specific architecture, etc. The deps.dev data takes into account what probably is the most important of these contexts (the root package and dependency resolver), but there may be cases where it differs from the “ground truth” build (docs).
Ultimately, data from deps.dev will be tied to an actor tree, and as such, it is OK if some dependency false positives are ingested. However, the current behavior of Guac / deps.dev collector, which is to ignore the context of which package is top-level, leads to such a large number of false positives that it is worth handling now, before actor trees are introduced.
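Examples
Small and Representative:
1. Start with a Go module that depends on github.com/google/go-cmp@v0.5.9 and github.com/google/wire@v0.5.0, both the latest versions. Then create an SBOM for that module by running Syft on the directory, and ingest it into Guac.
2. The deps.dev collector will pull in the dependencies of github.com/google/wire@v0.5.0, which include github.com/google/go-cmp@v0.2.0.
3. The Guac instance now incorrectly maintains that both versions of github.com/google/go-cmp are dependencies of the Go module. In reality, the Go compiler will only use version v0.5.9.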
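The difference between the two views in the small example can be sketched in Go. This is a simplified illustration, not Guac's or the Go toolchain's code: the `requirement` type and the `ingestAll` and `selectVersions` helpers are hypothetical names, and the version comparison only handles plain `vMAJOR.MINOR.PATCH` strings. It assumes the build keeps, for each module, the highest version required anywhere in the tree, which is the result Go's minimal version selection gives when no `replace` or `exclude` directives are involved.

```go
package main

import (
	"fmt"
	"sort"
)

// requirement says that some module in the dependency tree requires
// module Path at Version. Hypothetical type, for illustration only.
type requirement struct {
	Path    string
	Version string
}

// exampleRequirements lists the requirements from the small example:
// the root module needs go-cmp v0.5.9 and wire v0.5.0, and wire v0.5.0
// in turn needs go-cmp v0.2.0.
func exampleRequirements() []requirement {
	return []requirement{
		{"github.com/google/go-cmp", "v0.5.9"},
		{"github.com/google/wire", "v0.5.0"},
		{"github.com/google/go-cmp", "v0.2.0"},
	}
}

// parseVersion reads a plain "vMAJOR.MINOR.PATCH" string.
// Pre-release and build suffixes are not handled in this sketch.
func parseVersion(v string) ([3]int, bool) {
	var p [3]int
	_, err := fmt.Sscanf(v, "v%d.%d.%d", &p[0], &p[1], &p[2])
	return p, err == nil
}

// newer reports whether version a is strictly higher than version b.
// If either string cannot be parsed it falls back to string comparison.
func newer(a, b string) bool {
	pa, okA := parseVersion(a)
	pb, okB := parseVersion(b)
	if !okA || !okB {
		return a > b
	}
	for i := range pa {
		if pa[i] != pb[i] {
			return pa[i] > pb[i]
		}
	}
	return false
}

// ingestAll keeps every distinct (module, version) pair in the tree,
// which is the behaviour that produces the false positives.
func ingestAll(reqs []requirement) map[string][]string {
	out := map[string][]string{}
	seen := map[requirement]bool{}
	for _, r := range reqs {
		if !seen[r] {
			seen[r] = true
			out[r.Path] = append(out[r.Path], r.Version)
		}
	}
	for _, versions := range out {
		sort.Strings(versions)
	}
	return out
}

// selectVersions keeps one version per module: the highest one required.
func selectVersions(reqs []requirement) map[string]string {
	selected := map[string]string{}
	for _, r := range reqs {
		if cur, ok := selected[r.Path]; !ok || newer(r.Version, cur) {
			selected[r.Path] = r.Version
		}
	}
	return selected
}

func main() {
	reqs := exampleRequirements()
	fmt.Println("ingest everything:", ingestAll(reqs))
	fmt.Println("resolved build:   ", selectVersions(reqs))
}
```

With these inputs, `ingestAll` reports two versions of github.com/google/go-cmp while `selectVersions` reports only v0.5.9, which matches the discrepancy described in the example.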
Larger Example:
Ideally, ingesting a single complete SBOM should not lead to any new dependencies being pulled into Guac through the deps.dev collector. However, this is not the current behavior of Guac. For this example, we can assume SBOMs generated from Go executables are complete (because the compiler inserts into the executable a list of the build dependencies).
1. Generate an SBOM for the github.com/golangci/golangci-lint binary and ingest the SBOM into Guac.
2. After running the deps.dev collector on the Guac instance, roughly 100 new versions of already existing packages and roughly 100 completely new packages appear.
3. This is in contrast to the roughly 170 dependencies originally reported by the SBOM and present in Guac before running the deps.dev collector.
This example shows that there can be as many dependency false positives as true positives.
Solutions
The motivation for the following changes is to ingest deps.dev dependencies only when there would be few or no dependency false positives. Note that the deps.dev collector should still run in polling mode by default to pull in source information.
Add a flag to disable pulling in dependencies from deps.dev (#1359)
Label Packages as suitable for collecting deps.dev dependencies (#1358)
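As a rough illustration of the first proposal (#1359), the sketch below gates dependency retrieval behind a boolean flag while leaving source information collection untouched. Everything named here is an assumption made for illustration: the flag name `retrieve-dependencies`, its default of `true`, the `collectorOptions` struct, and the `collect` function are hypothetical and are not taken from Guac's code.

```go
package main

import (
	"flag"
	"fmt"
)

// collectorOptions is a hypothetical options struct, not Guac's actual type.
type collectorOptions struct {
	retrieveDependencies bool
}

// parseOptions reads the hypothetical -retrieve-dependencies flag.
// The default of true (keep the current behaviour) is an assumption.
func parseOptions(args []string) (collectorOptions, error) {
	fs := flag.NewFlagSet("deps-dev-collector", flag.ContinueOnError)
	var opts collectorOptions
	fs.BoolVar(&opts.retrieveDependencies, "retrieve-dependencies", true,
		"also pull in package dependencies from deps.dev")
	if err := fs.Parse(args); err != nil {
		return collectorOptions{}, err
	}
	return opts, nil
}

// collect returns labels for the kinds of data that would be fetched for
// pkg. Source information is always collected; dependencies only when
// the flag allows it.
func collect(opts collectorOptions, pkg string) []string {
	out := []string{"source:" + pkg}
	if opts.retrieveDependencies {
		out = append(out, "dependencies:"+pkg)
	}
	return out
}

func main() {
	opts, err := parseOptions([]string{"-retrieve-dependencies=false"})
	if err != nil {
		fmt.Println("flag error:", err)
		return
	}
	fmt.Println(collect(opts, "github.com/google/wire@v0.5.0"))
}
```

Keeping source collection outside the flag reflects the note above that the collector should still run in polling mode to pull in source information.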
After discussion with maintainers, the idea is to treat the dependency information from deps.dev as an SBOM through the HasSBOM node. This depends on the changes in #1367.