20 changes: 10 additions & 10 deletions docs/detectors/README.md
@@ -1,6 +1,6 @@
# Detectors

- CocoaPods
- [CocoaPods](cocoapods.md)

| Detector | Status |
| -------------------- | ------ |
@@ -18,10 +18,10 @@
| -------------------------- | ---------- |
| CondaLockComponentDetector | DefaultOff |

- DockerFile
- [Dockerfile](dockerfile.md)

| Detector | Status |
| -------------------------- | ---------- |
| Detector | Status |
| --------------------------- | ---------- |
| DockerfileComponentDetector | DefaultOff |

- [DotNet](dotnet.md)
@@ -42,7 +42,7 @@
| ----------------------- | ------ |
| GradleComponentDetector | Stable |

- Ivy
- [Ivy](ivy.md)

| Detector | Status |
| ----------- | ------------ |
@@ -84,7 +84,7 @@
| PipComponentDetector | DefaultOff |
| SimplePipComponentDetector | DefaultOff |

- Pnpm
- [Pnpm](pnpm.md)

| Detector | Status |
| ---------------------------- | ------ |
@@ -96,7 +96,7 @@
| ----------------------- | ------------ |
| PoetryComponentDetector | Experimental |

- Ruby
- [Ruby](ruby.md)

| Detector | Status |
| --------------------- | ------ |
@@ -108,7 +108,7 @@
| ---------------- | ------ |
| RustSbomDetector | Stable |

- Spdx
- [Spdx](spdx.md)

| Detector | Status |
| ----------------------- | ---------- |
@@ -126,13 +126,13 @@
| ----------------------- | ------------ |
| UvLockComponentDetector | Experimental |

- Vcpkg
- [Vcpkg](vcpkg.md)

| Detector | Status |
| ---------------------- | ------ |
| VcpkgComponentDetector | Stable |

- Yarn
- [Yarn](yarn.md)

| Detector | Status |
| ------------------------ | ------ |
23 changes: 23 additions & 0 deletions docs/detectors/cocoapods.md
@@ -0,0 +1,23 @@
# CocoaPods Detection

## Requirements

CocoaPods detection relies on a `Podfile.lock` file being present. This file is generated by CocoaPods when dependencies are installed.

## Detection strategy

CocoaPods detection is performed by parsing every `Podfile.lock` found under the scan directory. The detector:

- Parses the YAML-formatted `Podfile.lock` file to extract pod dependencies
- Identifies root dependencies from the `DEPENDENCIES` section
- Constructs a dependency graph by traversing pod relationships
- Supports both standard CocoaPods packages and Git-based dependencies
- Normalizes Git repository URIs (e.g., converting `git@` references to `https://`)
- Maps pods to their spec repositories (TRUNK or custom repositories)
- Handles subspecs (e.g., `AFNetworking/Reachability`) by mapping them to their parent podspec
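
The URI normalization and subspec mapping described above can be illustrated with a minimal Python sketch. It is not the detector's code: the function names `normalize_git_uri` and `parent_pod` are hypothetical, and the sketch assumes that `git@` references use the SCP-style `git@host:path` form.

```python
import re


def normalize_git_uri(uri: str) -> str:
    """Rewrite an SCP-style `git@host:path` reference as an https:// URL.

    Hypothetical helper; assumes the `git@host:path` form.
    """
    match = re.fullmatch(r"git@([^:]+):(.+)", uri)
    if match:
        host, path = match.groups()
        return f"https://{host}/{path}"
    # Already a URL (or an unrecognized form): leave it unchanged.
    return uri


def parent_pod(pod_name: str) -> str:
    """Map a subspec such as `AFNetworking/Reachability` to its parent pod name."""
    return pod_name.split("/", 1)[0]
```

For example, `parent_pod("AFNetworking/Reachability")` yields `AFNetworking`, so a subspec and its parent resolve to the same podspec.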

## Known limitations

CocoaPods detection will not work if lock files are not used or have not yet been generated. Ensure that `pod install` or `pod update` has been run to generate the `Podfile.lock` file(s) before running the scan.

The detector constructs a full dependency graph based on the relationships present in the `Podfile.lock` file, including transitive dependencies. However, dependency relationships are limited to what CocoaPods records in the lock file at the time of pod installation.
32 changes: 32 additions & 0 deletions docs/detectors/dockerfile.md
@@ -0,0 +1,32 @@
# Dockerfile Detection

## Requirements

Dockerfile detection depends on the following to successfully run:

- One or more Dockerfile files matching the patterns: `dockerfile`, `dockerfile.*`, or `*.dockerfile`

The `DockerfileComponentDetector` is a **DefaultOff** detector and must be explicitly enabled via the `--DetectorArgs` parameter.

## Detection strategy

The Dockerfile detector extracts Docker image references from `FROM` and `COPY --from` instructions. It uses the [Valleysoft.DockerfileModel](https://github.com/mthalman/DockerfileModel) library to parse each Dockerfile.

### FROM Instruction Detection
The detector extracts base image references from `FROM` instructions and resolves multi-stage build references:
- Direct image references (e.g., `FROM ubuntu:22.04`)
- Multi-stage builds with stage names (e.g., `FROM node:18 AS builder`)
- Stage-to-stage references are tracked to avoid reporting internal build stages as external dependencies

### COPY --from Instruction Detection
The detector extracts image references from `COPY --from=<image>` instructions that reference external images rather than build stages.
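
As a rough illustration of the stage tracking described in the two sections above, the following Python sketch collects external image references while skipping references to earlier build stages. It uses simple regular expressions rather than the Valleysoft.DockerfileModel parser, so it is only an approximation; the function name `external_images` and the handling of numeric stage indexes are assumptions.

```python
import re

# Simplified patterns; a real Dockerfile parser handles many more cases.
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)(?:\s+AS\s+(\S+))?", re.IGNORECASE
)
COPY_FROM_RE = re.compile(r"^\s*COPY\s+.*?--from=(\S+)", re.IGNORECASE)


def external_images(dockerfile_text: str) -> list[str]:
    """Return image references that are not names of earlier build stages."""
    stages: set[str] = set()
    images: list[str] = []
    for line in dockerfile_text.splitlines():
        from_match = FROM_RE.match(line)
        if from_match:
            image, alias = from_match.groups()
            # `FROM builder AS test` refers to a stage, not an external image.
            if image.lower() not in stages:
                images.append(image)
            if alias:
                stages.add(alias.lower())
            continue
        copy_match = COPY_FROM_RE.match(line)
        if copy_match:
            source = copy_match.group(1)
            # Assumption: a purely numeric source is a stage index.
            if source.lower() not in stages and not source.isdigit():
                images.append(source)
    return images
```

In this sketch, a Dockerfile containing `FROM node:18 AS builder` followed by `COPY --from=builder /app /app` reports only `node:18`, because `builder` is recognized as an internal stage.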

### Variable Resolution
The detector attempts to resolve Dockerfile variables using the `ResolveVariables()` method from the parser library. Images with unresolved variables (containing `$`, `{`, or `}` characters) are skipped to avoid reporting incomplete or incorrect references.
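
The skip rule for unresolved variables amounts to a character check on the resolved reference. A minimal Python sketch, assuming hypothetical helper names `has_unresolved_variable` and `reportable_images`:

```python
def has_unresolved_variable(image_reference: str) -> bool:
    """True if the reference still contains `$`, `{`, or `}` after resolution."""
    return any(ch in image_reference for ch in "${}")


def reportable_images(image_references: list[str]) -> list[str]:
    """Keep only references with no leftover variable syntax (hypothetical helper)."""
    return [ref for ref in image_references if not has_unresolved_variable(ref)]
```

With this rule, `node:$NODE_VERSION` and `${BASE_IMAGE}:latest` are dropped, while `ubuntu:22.04` is kept.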

## Known limitations

- **DefaultOff Status**: This detector must be explicitly enabled using `--DetectorArgs DockerReference=EnableIfDefaultOff`
- **Variable Resolution**: Image references containing unresolved Dockerfile `ARG` or `ENV` variables are not reported, which may lead to under-reporting in Dockerfiles that heavily use build-time variables
- **No Version Pinning Validation**: The detector does not warn about unpinned image versions (e.g., `latest` tags), which are generally discouraged in production Dockerfiles
- **Digest References**: Docker supports content-addressable image references using SHA256 digests (e.g., `ubuntu@sha256:abc...`), but whether these references are parsed and reported depends on the underlying `DockerReferenceUtility.ParseFamiliarName()` implementation
35 changes: 35 additions & 0 deletions docs/detectors/ivy.md
@@ -0,0 +1,35 @@
# Ivy Detection

## Requirements

Ivy detection depends on the following to successfully run:

- Apache Ant CLI as part of your PATH (`ant` or `ant.bat` should be runnable from a given command line).
- Java Development Kit (JDK) installed and configured for Ant.
- One or more `ivy.xml` files.
- Optional `ivysettings.xml` files in the same directory as `ivy.xml` for repository configuration.

## Detection strategy

Ivy detection is performed by running Apache Ant to resolve dependencies for each `ivy.xml` file found. The detector:

1. Copies `ivy.xml` (and `ivysettings.xml` if present) to a temporary directory.
2. Creates a synthetic Ant build file with a custom task that invokes Ivy's dependency resolver.
3. Executes `ant resolve-dependencies` to resolve both direct and transitive dependencies.
4. Parses the JSON output produced by the custom Ant task to register components.

Components are identified using Maven's GAV (group, artifact, version) coordinate system, which corresponds to Ivy's (org, name, rev) coordinates. Dependencies with the same organization as the project are treated as first-party dependencies and ignored.
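
The coordinate mapping and first-party filter can be sketched in Python as follows. This is an illustration under assumptions, not the detector's implementation: `MavenCoordinate` and `third_party_components` are hypothetical names, and the organization comparison is assumed to be an exact string match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MavenCoordinate:
    """Maven GAV coordinate (hypothetical type for this sketch)."""
    group_id: str
    artifact_id: str
    version: str


def third_party_components(
    dependencies: list[tuple[str, str, str]], project_org: str
) -> list[MavenCoordinate]:
    """Map Ivy (org, name, rev) triples to GAV, dropping first-party entries."""
    return [
        MavenCoordinate(group_id=org, artifact_id=name, version=rev)
        for org, name, rev in dependencies
        # Assumption: first-party means the org equals the project's org exactly.
        if org != project_org
    ]
```

For a project whose organization is `com.example`, a dependency on `com.example:internal-lib` is ignored, while `org.apache.commons:commons-lang3` is kept.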

Components tagged as development dependencies are marked appropriately.

Full dependency graph generation is supported.

## Known limitations

Ivy detection will not run if `ant` is unavailable in the PATH.

The `ivy.xml` and `ivysettings.xml` files must be self-contained. Detection will fail if these files:
- Rely on properties defined in the project's `build.xml`
- Use file inclusion mechanisms (e.g., `<include>` tags)

Dependencies that cannot be resolved by Ivy will be logged as package parse failures and not included in the detection results.
82 changes: 62 additions & 20 deletions docs/detectors/pnpm.md
@@ -1,23 +1,65 @@
# Pnpm detection
# Pnpm Detection

## Requirements

Pnpm detection relies on the presence of lockfiles generated by the pnpm package manager. The detector searches for the following files:

- `pnpm-lock.yaml`
- `shrinkwrap.yaml` (legacy format)

The detector supports lockfile versions 5, 6, and 9. Version 9 is the highest supported version.

## Detection strategy

Pnpm detection is performed by parsing lockfiles found in the scan directory. The `PnpmComponentDetectorFactory` acts as a version-aware factory that:

1. **Detects the lockfile version** by parsing the `lockfileVersion` field in the YAML file
2. **Delegates to the appropriate version-specific detector**:
   - `Pnpm5Detector` for lockfile version 5.x (also handles legacy `shrinkwrapVersion` files)
   - `Pnpm6Detector` for lockfile version 6.x
   - `Pnpm9Detector` for lockfile version 9.x
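
The version dispatch above can be sketched in Python. The detector class names come from this document; everything else is an assumption for illustration: the sketch reads the version with a regular expression instead of a YAML parser, and `select_detector` is a hypothetical function.

```python
import re

# Detector names as listed above, keyed by lockfile major version.
DETECTOR_BY_MAJOR = {5: "Pnpm5Detector", 6: "Pnpm6Detector", 9: "Pnpm9Detector"}

# Matches e.g. `lockfileVersion: 5.4`, `lockfileVersion: '9.0'`, `shrinkwrapVersion: 3`.
VERSION_RE = re.compile(
    r"""^(lockfileVersion|shrinkwrapVersion):\s*['"]?(\d+)""", re.MULTILINE
)


def select_detector(lockfile_text: str):
    """Return the detector name for a lockfile, or None if it is unsupported."""
    match = VERSION_RE.search(lockfile_text)
    if match is None:
        return None
    key, major = match.group(1), int(match.group(2))
    if key == "shrinkwrapVersion":
        # Legacy files are handled by the version 5 detector.
        return "Pnpm5Detector"
    return DETECTOR_BY_MAJOR.get(major)
```

A lockfile that declares an unlisted major version, such as `lockfileVersion: '7.0'`, returns `None` in this sketch, which corresponds to the file being skipped.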

Each version-specific detector handles the format differences in pnpm lockfiles:

- **Version 5**: Basic package graph with dependencies listed in the `packages` section
- **Version 6**: Introduced workspace support with `importers` section and improved dependency tracking
- **Version 9**: Changed the structure to use `snapshots` instead of `packages` and removed dev dependency metadata from the lockfile

### Dependency Graph Construction

The detectors build a complete dependency graph by:

1. Registering all packages found in the lockfile as components
2. Creating parent-child relationships based on dependency declarations
3. Marking direct dependencies as explicit references
4. Identifying development dependencies based on:
   - Lockfile metadata (versions 5-6)
   - Dependency tree position (version 9, where lockfile no longer includes dev dependency flags)

### Workspace Support

Pnpm supports both single-package projects ("dedicated shrinkwrap") and multi-package workspaces ("shared shrinkwrap"). The detectors handle both scenarios:

- Single-package: Dependencies are read directly from the root level
- Workspaces: Dependencies are read from each importer in the `importers` section

## Known limitations

The Pnpm detector doesn't support the resolution of local dependencies
like:

- Link dependencies
  ```
  dependencies:
    '@learningclient/common': link:../common
  ```

- File dependencies
  ```
  dependencies:
    file:./projects/gmc-bootstrapper.tgz
  ```

These kinds of components are ignored by the Pnpm detector.

In the case of `link` dependencies that refer to a folder with a `package.json` file,
the component is then detected by the `NpmComponentDetector`. This happens
only if the folder is inside the path that is being used for scanning.
1. **Local dependencies are skipped**: Packages referenced with `file:` or `link:` prefixes are not included in detection as they represent local packages rather than external dependencies. In the case of `link` dependencies that refer to a folder with a `package.json` file, the component may be detected by the `NpmComponentDetector` if the folder is inside the scan path.

   Example of ignored dependencies:
   ```yaml
   dependencies:
     '@learningclient/common': link:../common
     file:./projects/gmc-bootstrapper.tgz
   ```

2. **HTTP/HTTPS dependencies**: In version 9, dependencies referenced via `http:` or `https:` protocols are also skipped as they are treated similarly to local dependencies

3. **Lockfile version support**: Only versions 5, 6, and 9 are supported. If an unsupported version is detected, the file will be skipped and a warning will be logged

4. **Version 9 dev dependency detection**: Lockfile version 9 removed the metadata that explicitly marks packages as dev dependencies. The detector relies on the dependency tree structure to determine dev dependency status, which may be less accurate in complex scenarios

5. **Pnpm dependency path complexity**: Pnpm uses specialized dependency paths that include peer dependency information and other metadata. While the detector handles standard cases, highly complex dependency scenarios with multiple peer dependencies may not be perfectly represented

6. **Automatic root dependency calculation required**: The detector sets `NeedsAutomaticRootDependencyCalculation = true`, indicating that the orchestrator must perform additional analysis to determine root-level dependencies
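
The skip rules in items 1 and 2 can be expressed as a small predicate. The Python sketch below is illustrative only: `is_skipped` is a hypothetical function, and because this document describes the `http:`/`https:` rule only for version 9, the sketch leaves such references unfiltered for other versions as an assumption.

```python
LOCAL_PREFIXES = ("file:", "link:")
REMOTE_URL_PREFIXES = ("http:", "https:")


def is_skipped(specifier: str, lockfile_major: int) -> bool:
    """Decide whether a dependency reference is excluded from detection."""
    if specifier.startswith(LOCAL_PREFIXES):
        return True
    # Assumption: the http:/https: rule applies to version 9 only.
    return lockfile_major == 9 and specifier.startswith(REMOTE_URL_PREFIXES)
```

For example, `is_skipped("link:../common", 6)` is true, while a registry reference such as `1.2.3` is never skipped.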
59 changes: 59 additions & 0 deletions docs/detectors/ruby.md
@@ -0,0 +1,59 @@
# Ruby

## Requirements

The Ruby detector scans for Ruby dependencies defined in Bundler lockfiles.

**File Patterns:** `Gemfile.lock`

**Supported Ecosystems:** RubyGems

## Detection Strategy

The detector parses `Gemfile.lock` files to identify Ruby gems and their dependencies. It processes the lockfile in multiple passes:

### Parsing Approach

1. **Section-based parsing**: The detector reads the lockfile by sections, which are identified by all-caps headings (`GEM`, `GIT`, `PATH`, `BUNDLED WITH`, etc.)

2. **Component registration**: For each section, the detector extracts:
   - **GEM section**: Standard RubyGems components with name, version, and remote source
   - **GIT section**: Git-based dependencies with remote URL and revision
   - **PATH section**: Local path dependencies
   - **BUNDLED WITH section**: The Bundler version used to generate the lockfile

3. **Dependency graph construction**: After collecting all components, the detector creates parent-child relationships by:
   - Identifying top-level dependencies (4-space indentation)
   - Mapping sub-dependencies (6-space indentation) to their parent components
   - Using automatic root dependency calculation to determine direct vs transitive dependencies
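
The indentation-based approach above can be sketched in Python for the `GEM` section alone. This is a simplified illustration, not the detector's implementation: `parse_gem_section` is a hypothetical function, and `GIT`, `PATH`, and `BUNDLED WITH` handling is omitted.

```python
import re

SECTION_RE = re.compile(r"^[A-Z][A-Z ]*$")          # all-caps heading, e.g. `GEM`
SPEC_RE = re.compile(r"^ {4}(\S+) \(([^)]+)\)$")    # 4 spaces: `name (version)`
DEP_RE = re.compile(r"^ {6}(\S+)")                  # 6 spaces: sub-dependency name


def parse_gem_section(lockfile_text: str):
    """Return (versions, edges) for gems listed in the GEM section.

    versions maps gem name -> version; edges maps gem name -> child names.
    """
    versions: dict[str, str] = {}
    edges: dict[str, list[str]] = {}
    section = None
    parent = None
    for line in lockfile_text.splitlines():
        if SECTION_RE.match(line):
            section, parent = line, None
            continue
        if section != "GEM":
            continue
        spec = SPEC_RE.match(line)
        if spec:
            parent = spec.group(1)
            versions[parent] = spec.group(2)
            edges.setdefault(parent, [])
            continue
        dep = DEP_RE.match(line)
        if dep and parent:
            edges[parent].append(dep.group(1))
    return versions, edges
```

Given a `GEM` section in which `rack-test (2.1.0)` is followed by an indented `rack (>= 1.3)` line, the sketch records `rack` as a child of `rack-test` and takes the version of `rack` from its own 4-space entry.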

### Component Types

- **RubyGemsComponent**: Standard gems from RubyGems.org or custom sources
  - Properties: name, version, source
- **GitComponent**: Git-based dependencies
  - Properties: remote URL, revision

## Known Limitations

### Version Resolution Constraints

- **Relative versions are excluded**: Components with relative version specifiers (starting with `~` or `=`) are skipped and logged as parse failures. Only absolute versions are registered.
- **Fuzzy version handling**: Different sections of the lockfile can reference the same component, but authoritative version information is only stored in specific sections (e.g., the GEM section), requiring cross-section resolution.

### Git Component Naming

- Git components use a Ruby-specific "name" annotation that doesn't map directly to standard GitComponent semantics (remote/version). The detector works around this by maintaining a name-to-component mapping during parsing.

### Root Dependency Detection

- The detector uses **automatic root dependency calculation** rather than parsing the `DEPENDENCIES` section of `Gemfile.lock` (which lists user-specified dependencies from the `Gemfile`).
- This approach may not perfectly distinguish between direct and transitive dependencies in all cases.

### Bundler Source Information

- The `bundler` version is always registered with `"unknown"` as its source, since the lockfile doesn't specify where Bundler originated.

### Excluded Dependencies

- When a parent component has a relative version and is excluded, all of its child dependencies are also excluded from the dependency graph to maintain consistency.
33 changes: 33 additions & 0 deletions docs/detectors/spdx.md
@@ -0,0 +1,33 @@
# SPDX Detection

## Requirements

SPDX detection depends on the following to successfully run:

- One or more `*.spdx.json` files in the scan directory

## Detection strategy

The SPDX detector (`Spdx22ComponentDetector`) discovers SPDX SBOM (Software Bill of Materials) files in JSON format and creates components representing the SPDX documents themselves.

The detector:
- Searches for files matching the pattern `*.spdx.json`
- Validates that the SPDX version is `SPDX-2.2` (currently the only supported version)
- Computes a SHA-1 hash of the SPDX file for identification
- Extracts metadata including:
  - Document namespace
  - Document name
  - SPDX version
  - Root element ID from `documentDescribes` (defaults to `SPDXRef-Document` if not specified)
- Creates an `SpdxComponent` to represent the SPDX document
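
The steps above can be sketched in Python using only the standard library. The JSON field names (`spdxVersion`, `documentNamespace`, `name`, `documentDescribes`) are those of the SPDX 2.2 JSON format; the function name `describe_spdx_document` and the shape of the returned dictionary are assumptions for illustration and do not mirror `SpdxComponent`.

```python
import hashlib
import json

SUPPORTED_VERSION = "SPDX-2.2"
DEFAULT_ROOT = "SPDXRef-Document"


def describe_spdx_document(raw: bytes):
    """Return metadata for a supported SPDX document, or None if it is skipped."""
    try:
        doc = json.loads(raw)
    except ValueError:
        # Invalid JSON (or undecodable bytes): skip the file.
        return None
    if not isinstance(doc, dict) or doc.get("spdxVersion") != SUPPORTED_VERSION:
        return None
    described = doc.get("documentDescribes") or []
    return {
        "checksum": hashlib.sha1(raw).hexdigest(),  # SHA-1 of the file contents
        "namespace": doc.get("documentNamespace"),
        "name": doc.get("name"),
        "spdx_version": doc["spdxVersion"],
        # Only the first described element is used as the root.
        "root_element_id": described[0] if described else DEFAULT_ROOT,
    }
```

A document with `"spdxVersion": "SPDX-2.3"` returns `None` in this sketch, and a document without `documentDescribes` falls back to `SPDXRef-Document` as the root element.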

The detector does not parse or register individual packages listed within the SPDX document; it only registers the SPDX document itself as a component.

## Known limitations

- Only SPDX version 2.2 is currently supported
- Only JSON format is supported (`.spdx.json` files)
- The detector is **DefaultOff** and must be explicitly enabled via detector arguments
- If an SPDX document contains multiple elements in `documentDescribes`, only the first element is selected as the root element
- The detector does not create a dependency graph from the packages listed within the SPDX document
- Invalid JSON files or files that cannot be parsed are skipped with a warning