Problem Statement
We currently rely on NVIDIA's external nSpect/Syft pipeline to generate SBOMs for our container images, producing CycloneDX JSON files after the fact. We have no in-repo tooling to generate, enrich, or validate SBOMs ourselves. This means we can't catch dependency license issues before merge, we can't produce SBOMs on demand, and the externally-generated SBOMs have significant gaps — 801 out of ~5,800 components across our 6 container image SBOMs had missing or hash-placeholder licenses that needed manual resolution.
We need first-party SBOM tooling integrated into our build system and CI pipeline.
Technical Context
The repo builds 3 container images (server, sandbox, cluster) for 2 architectures (amd64, arm64), totaling 6 image variants. The build system uses mise for task orchestration with TOML task files in tasks/. Python scripts are invoked via uv run python <script>. Docker builds are managed through tasks/docker.toml and helper scripts in tasks/scripts/.
Prototype scripts already exist (built during investigation) that:
- Convert CycloneDX JSON SBOMs to CSV (sbom_to_csv.py)
- Resolve missing licenses by querying crates.io, npm, PyPI, and known Go/Debian license maps (resolve_licenses.py); this achieved a 751/801 (94%) resolution rate
Syft (Apache-2.0, github.com/anchore/syft) is the scanner already used in the NVIDIA pipeline and is the natural choice for in-repo generation.
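To make the CSV conversion step concrete, here is a minimal sketch of what a CycloneDX-to-CSV converter could look like. It is not the prototype itself: the column set (FIELDS) and the function names are assumptions chosen for illustration, and only the CycloneDX fields components, licenses, license.id, license.name, and expression are relied on.

```python
import csv
import io

# Assumed column set; the prototype's real columns are not specified in this issue.
FIELDS = ["name", "version", "type", "purl", "licenses"]


def license_names(component: dict) -> str:
    """Join every license id, name, or expression found on one component."""
    found = []
    for entry in component.get("licenses", []):
        lic = entry.get("license", {})
        value = lic.get("id") or lic.get("name") or entry.get("expression")
        if value:
            found.append(value)
    return "; ".join(found)


def sbom_to_rows(sbom: dict) -> list[dict]:
    """Flatten the components array of a parsed CycloneDX document into rows."""
    rows = []
    for comp in sbom.get("components", []):
        rows.append({
            "name": comp.get("name", ""),
            "version": comp.get("version", ""),
            "type": comp.get("type", ""),
            "purl": comp.get("purl", ""),
            "licenses": license_names(comp),
        })
    return rows


def rows_to_csv(rows: list[dict]) -> str:
    """Render rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical sample document, not real scan output.
    sample = {
        "components": [
            {
                "name": "serde",
                "version": "1.0.0",
                "type": "library",
                "purl": "pkg:cargo/serde@1.0.0",
                "licenses": [{"license": {"id": "MIT"}}],
            }
        ]
    }
    print(rows_to_csv(sbom_to_rows(sample)))
```

Keeping the flattening (sbom_to_rows) separate from the rendering (rows_to_csv) makes the "known input JSON → expected CSV output" unit test straightforward.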
Decisions
- UX: Single command. mise run sbom does everything (generate → resolve → CSV). Users collect CSVs from deploy/sbom/output/. Subtasks exist for CI/advanced use, but the primary interface is one command.
- PR license check: Advisory only (warning annotations, not blocking).
- SBOM storage: Release artifacts only, not committed to the repo. Manual generation is available via mise run sbom.
- Unresolved private packages (50): Deferred. Accept gaps for now; the 50 private @openclaw/*, opencode-*, and pi-extension-* packages will be addressed in a follow-up.
- Agent skill: Add an sbom agent skill (.agents/skills/sbom/SKILL.md) so agents can generate, resolve, and export SBOMs on demand.
Affected Components
| Component | Key Files | Role |
|---|---|---|
| mise config | mise.toml | Tool definitions (lines 14-21), task includes (lines 47-48), Python paths for lint (lines 44-45) |
| Task system | tasks/*.toml | Task definitions; new tasks/sbom.toml needed |
| Docker build | tasks/docker.toml | Build tasks for server (line 31), sandbox (line 20), cluster (line 36) |
| Publish pipeline | tasks/publish.toml | Release publishing; SBOM generation hooks in post-publish |
| Multi-arch publish | tasks/scripts/docker-publish-multiarch.sh:115 | Currently has --sbom=false flag |
| Deploy dir | deploy/docker/, deploy/helm/ | Peer directory pattern; deploy/sbom/ fits here |
| License tooling | tasks/license.toml, scripts/update_license_headers.py | Existing SPDX header checks (not dependency licenses) |
| Agent skills | .agents/skills/ | New sbom skill for on-demand SBOM operations |
| Build docs | architecture/build-containers.md:408 | Documents --sbom=false ECR flag |
Technical Investigation
Architecture Overview
The build pipeline flows: mise run docker:build:* → tasks/scripts/docker-build-component.sh → docker buildx build. Multi-arch publishing uses tasks/scripts/docker-publish-multiarch.sh which explicitly disables BuildKit SBOM generation (--sbom=false at line 115) when pushing to ECR. There is no post-build SBOM step.
The mise task system uses one TOML file per domain (e.g., docker.toml, license.toml, python.toml) with colon-namespaced task names. Python scripts are invoked via uv run python <script>. Syft can be added as a mise tool via "ubi:anchore/syft" (same pattern as sccache at line 21).
Agent skills live in .agents/skills/<name>/SKILL.md and are loaded by the agent framework. Existing skills cover areas like cluster debugging, PR creation, security review, and TUI development. An sbom skill would guide agents through generation, resolution, and export workflows.
Code References
| Location | Description |
|---|---|
| mise.toml:14-21 | Tool definitions; where Syft would be added |
| mise.toml:44-45 | Python paths for lint/format; needs deploy/sbom/*.py added |
| mise.toml:47-48 | Task config includes tasks/*.toml; new sbom.toml auto-discovered |
| tasks/docker.toml:20-23 | docker:build:sandbox task |
| tasks/docker.toml:31-33 | docker:build:server task |
| tasks/docker.toml:36-39 | docker:build:cluster task |
| tasks/docker.toml:41-49 | Multi-arch build and publish tasks |
| tasks/scripts/docker-publish-multiarch.sh:115 | --sbom=false flag to disable BuildKit SBOM |
| tasks/license.toml:6-13 | Existing SPDX header check tasks (source headers, not dep licenses) |
| tasks/version.toml:8 | Example of uv run python tasks/scripts/release.py pattern |
| .agents/skills/ | Existing skill directory; pattern for new sbom skill |
| architecture/build-containers.md:408 | Documents --sbom=false in ECR publishing |
Current Behavior
- Container images are built locally or in CI with no SBOM output
- Multi-arch ECR publishes explicitly disable SBOM generation
- NVIDIA's external nSpect pipeline generates SBOMs post-hoc, but these have significant license gaps
- The existing tasks/license.toml only checks SPDX copyright headers in source files; it does not scan dependency licenses
- No CI check validates dependency licenses on PRs
- No agent skill exists for SBOM operations
What Would Need to Change
New files:
- deploy/sbom/sbom_to_csv.py: CycloneDX JSON to CSV converter
- deploy/sbom/resolve_licenses.py: license resolution via public registry APIs + known license maps
- tasks/sbom.toml: mise task definitions
- .agents/skills/sbom/SKILL.md: agent skill for on-demand SBOM workflows
Modified files:
- mise.toml: add Syft to [tools], add deploy/sbom/*.py to python_paths
- tasks/publish.toml or tasks/docker.toml: optionally chain SBOM generation after image builds
CI integration (GitHub Actions):
- PR workflow: Syft source scan of Cargo.lock + uv.lock, with advisory annotations for license issues
- Release workflow: mise run sbom + upload artifacts
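The license resolution step could be structured roughly as below. This is a sketch under stated assumptions, not the prototype: the hash-placeholder pattern (HASH_PLACEHOLDER), the function names, and the JSON response shapes read in extract_license are assumptions that should be checked against real SBOM data and the registries' current responses. The fetcher is injected so the logic can be tested with mocked API responses; Go and Debian license maps are left out.

```python
import json
import re
import urllib.request
from typing import Callable, Optional
from urllib.parse import quote, unquote

# Assumption: a "hash-placeholder" license looks like "sha256:<hex>".
HASH_PLACEHOLDER = re.compile(r"^(LicenseRef-)?sha256:[0-9a-f]+$", re.IGNORECASE)


def needs_resolution(component: dict) -> bool:
    """True when a component has no license, or only hash placeholders."""
    values = []
    for entry in component.get("licenses", []):
        lic = entry.get("license", {})
        values.append(lic.get("id") or lic.get("name") or entry.get("expression") or "")
    return not values or all(not v or HASH_PLACEHOLDER.match(v) for v in values)


def parse_purl(purl: str) -> Optional[tuple[str, str, str]]:
    """Split 'pkg:<type>/<name>@<version>' into (type, name, version)."""
    body = purl.split("?", 1)[0].split("#", 1)[0]
    if not body.startswith("pkg:"):
        return None
    ptype, _, rest = body[4:].partition("/")
    if "@" in rest:
        name, _, version = rest.rpartition("@")
    else:
        name, version = rest, ""
    return ptype, unquote(name), version


def metadata_url(ptype: str, name: str, version: str) -> Optional[str]:
    """Build the public registry metadata URL for a package, if supported."""
    if ptype == "pypi":
        return f"https://pypi.org/pypi/{quote(name)}/json"
    if ptype == "npm":
        # Scoped names keep '@' but encode the slash.
        return f"https://registry.npmjs.org/{quote(name, safe='@')}"
    if ptype == "cargo" and version:
        return f"https://crates.io/api/v1/crates/{quote(name)}/{quote(version)}"
    return None


def extract_license(ptype: str, data: dict) -> Optional[str]:
    """Pull the license string out of a registry response (assumed shapes)."""
    if ptype == "pypi":
        return data.get("info", {}).get("license") or None
    if ptype == "npm":
        lic = data.get("license")
        return lic if isinstance(lic, str) else None
    if ptype == "cargo":
        return data.get("version", {}).get("license") or None
    return None


def fetch_json(url: str) -> dict:
    """Default fetcher; a User-Agent is sent because some registries expect one."""
    req = urllib.request.Request(url, headers={"User-Agent": "sbom-license-resolver-sketch"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def resolve_component(component: dict, fetch: Callable[[str], dict] = fetch_json) -> bool:
    """Patch the component's licenses in place; return True if it was resolved."""
    if not needs_resolution(component):
        return False
    parsed = parse_purl(component.get("purl", ""))
    if not parsed:
        return False
    ptype, name, version = parsed
    url = metadata_url(ptype, name, version)
    if not url:
        return False
    license_text = extract_license(ptype, fetch(url))
    if not license_text:
        return False
    component["licenses"] = [{"license": {"name": license_text}}]
    return True
```

Passing a stub in place of fetch_json keeps unit tests offline, which also sidesteps registry rate limits in CI.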
Task Design
mise run sbom    # the one command: does everything, CSVs land in deploy/sbom/output/
Internally chains:
sbom (entry point, not hidden)
  depends = ["sbom:generate", "sbom:resolve", "sbom:csv"]
sbom:generate (hidden)
  - Runs syft against each container image (server, sandbox, cluster × amd64, arm64)
  - Outputs CycloneDX JSON to deploy/sbom/output/*.json
sbom:resolve (hidden)
  - Runs resolve_licenses.py against all JSON files in deploy/sbom/output/
  - Patches missing/hash licenses in-place
sbom:csv (hidden)
  - Runs sbom_to_csv.py against all JSON files in deploy/sbom/output/
  - Produces deploy/sbom/output/*.csv
sbom:check (hidden, used in PR CI only)
  - Syft source-mode scan of lockfiles
  - Advisory annotations, non-blocking
Note: mise can run the entries of a depends list in parallel, so sbom:resolve should itself depend on sbom:generate, and sbom:csv on sbom:resolve, to guarantee the generate → resolve → CSV order.
Output directory: deploy/sbom/output/ (gitignored).
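As an illustration of the sbom:generate step, the sketch below builds one Syft invocation per image and architecture. The image references (server:latest and so on) and the output file naming are hypothetical; the -o cyclonedx-json=<path> and --platform options should be verified against the Syft version pinned in mise.toml. The script only prints the commands; running them is left to the caller.

```python
import shlex
from itertools import product
from pathlib import PurePosixPath

IMAGES = ["server", "sandbox", "cluster"]
ARCHES = ["amd64", "arm64"]
OUTPUT_DIR = PurePosixPath("deploy/sbom/output")


def syft_command(image: str, arch: str, tag: str = "latest") -> list[str]:
    """Build the argv for scanning one image variant into a CycloneDX JSON file."""
    out = OUTPUT_DIR / f"{image}-{arch}.json"  # hypothetical naming scheme
    return [
        "syft",
        f"{image}:{tag}",  # hypothetical image reference
        "--platform", f"linux/{arch}",
        "-o", f"cyclonedx-json={out}",
    ]


def all_commands(tag: str = "latest") -> list[list[str]]:
    """One command per (image, architecture) pair: 3 images x 2 arches = 6."""
    return [syft_command(image, arch, tag) for image, arch in product(IMAGES, ARCHES)]


if __name__ == "__main__":
    for cmd in all_commands():
        print(shlex.join(cmd))
```

Each argv list can be passed to subprocess.run(cmd, check=True) once Syft is installed and the images exist locally or in a reachable registry.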
Automation Strategy
| Trigger | What Runs | Output |
|---|---|---|
| Developer | mise run sbom | CSVs in deploy/sbom/output/ |
| Agent | Via sbom skill → mise run sbom | Same CSVs, agent can analyze/report |
| PR CI | sbom:check subtask only | Advisory annotations on the PR |
| Release CI | mise run sbom + artifact upload | SBOMs attached as release artifacts |
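For the PR CI row, advisory annotations can be produced with GitHub Actions workflow commands: a line of the form ::warning title=...::message printed to stdout becomes a warning annotation, and the job still passes as long as the script exits 0. The sketch below assumes a flagged-license set (FLAGGED) that is purely illustrative; the real policy is not defined in this issue.

```python
# Illustrative deny list only; the actual license policy is an open decision.
FLAGGED = {"GPL-3.0-only", "AGPL-3.0-only"}


def escape_data(text: str) -> str:
    """Escape a workflow-command message."""
    return text.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def escape_property(text: str) -> str:
    """Escape a workflow-command property value such as title."""
    return escape_data(text).replace(":", "%3A").replace(",", "%2C")


def license_warnings(sbom: dict) -> list[str]:
    """Return one ::warning line per component carrying a flagged license."""
    lines = []
    for comp in sbom.get("components", []):
        for entry in comp.get("licenses", []):
            lic = entry.get("license", {})
            value = lic.get("id") or lic.get("name") or entry.get("expression") or ""
            if value in FLAGGED:
                title = escape_property("Dependency license")
                message = escape_data(
                    f"{comp.get('name')} {comp.get('version')} is licensed under {value}"
                )
                lines.append(f"::warning title={title}::{message}")
    return lines


if __name__ == "__main__":
    # Hypothetical sample; in CI this would be the parsed output of the lockfile scan.
    sample = {
        "components": [
            {"name": "foo", "version": "1.2", "licenses": [{"license": {"id": "GPL-3.0-only"}}]},
            {"name": "bar", "version": "0.1", "licenses": [{"license": {"id": "MIT"}}]},
        ]
    }
    for line in license_warnings(sample):
        print(line)
    # Advisory only: the script ends normally, so the job is not failed.
```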
Agent Skill Design
The sbom skill would cover:
- Generate: Run mise run sbom for end-to-end SBOM generation
- Analyze: Read SBOM JSON/CSV files and answer questions about dependencies, licenses, and compliance
- Trigger keywords: sbom, bill of materials, license scan, dependency license, generate sbom, resolve licenses
Patterns to Follow
- Task file: tasks/sbom.toml following the one-file-per-domain convention
- Task names: colon-namespaced, top-level sbom is the public entry point, subtasks are hidden
- Python invocation: uv run python deploy/sbom/<script>.py (Pattern A from existing tasks)
- Tool installation: "ubi:anchore/syft" = "<version>" in mise.toml [tools]
- Scripts use only Python stdlib (no extra pip dependencies needed)
- Skill file: .agents/skills/sbom/SKILL.md following existing skill patterns
- Output dir: deploy/sbom/output/ added to .gitignore
Proposed Approach
Add Syft as a mise-managed tool and create deploy/sbom/ with the two prototype scripts (refined for repo use). Create tasks/sbom.toml with a single top-level sbom task that chains generate → resolve → CSV, plus an sbom:check subtask for PR CI. Add an agent skill at .agents/skills/sbom/SKILL.md. For CI, add an advisory PR check and a release step that runs mise run sbom and uploads artifacts. Defer resolution of the 50 private/internal package licenses to a follow-up.
Scope Assessment
- Complexity: Medium — multiple files, CI integration, agent skill, but well-scoped
- Confidence: High — prototype scripts exist and are proven, patterns are clear
- Estimated files to change: 8-10 (2 new scripts, 1 new task file, 1 new skill file, 2 modified configs, 1-2 CI workflow files, .gitignore)
- Issue type: feat
Risks & Open Questions
- Rate limiting on crates.io / npm / PyPI APIs during CI — may need caching or a pre-built license database for reliability
- The --sbom=false flag in docker-publish-multiarch.sh should be revisited; BuildKit SBOM vs. standalone Syft scan is a design choice
- Unresolved private packages: Deferred to follow-up issue
- PR check blocking vs. advisory: Decided, advisory
- SBOM storage location: Decided, release artifacts only
- Single command vs. separate tasks: Decided, mise run sbom does everything
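One way to reduce exposure to registry rate limits is a small on-disk cache keyed by package URL, so each package is looked up at most once across runs. The sketch below is an assumption about how such a cache could look, not part of the prototype; the class name LicenseCache and the cache file location are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, Optional


class LicenseCache:
    """JSON-file-backed cache mapping a package key (e.g. a purl) to a license string."""

    def __init__(self, path: Path):
        self.path = path
        # Load previous results if the cache file already exists.
        self.entries: dict[str, Optional[str]] = (
            json.loads(path.read_text()) if path.exists() else {}
        )

    def get_or_fetch(self, key: str, fetch: Callable[[str], Optional[str]]) -> Optional[str]:
        """Return the cached value, calling fetch(key) only on a cache miss."""
        if key not in self.entries:
            # Misses that resolve to None are cached too, to avoid repeated lookups.
            self.entries[key] = fetch(key)
        return self.entries[key]

    def save(self) -> None:
        """Write the cache back to disk, creating the directory if needed."""
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(self.entries, indent=2, sort_keys=True))


if __name__ == "__main__":
    # Hypothetical location and a stand-in fetcher instead of a real registry call.
    cache = LicenseCache(Path("deploy/sbom/output/license-cache.json"))
    print(cache.get_or_fetch("pkg:cargo/serde@1.0.0", lambda key: "MIT OR Apache-2.0"))
```

Committing a pre-populated cache file would turn this into the "pre-built license database" option; whether to do that is left open here.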
Test Considerations
- Unit test the CSV conversion logic (known input JSON → expected CSV output)
- Unit test the license resolution logic with mocked API responses
- Integration test: run sbom:generate against a locally-built test image and verify output format
- CI test: verify sbom:check correctly produces advisory annotations for a known-bad license
- No existing test patterns for deploy tooling scripts — would establish the pattern
Created by spike investigation. Use build-from-issue to plan and implement.