SPDX 2.2 support and documentDescribes update to reference root element only #1856
Signed-off-by: tdruez <tdruez@nexb.com>
scanpipe/pipes/output.py
        files_analyzed=True,
    )

    packages_as_spdx = [project_as_root_package]
@tdruez Using project_as_root_package is incorrect imo: ScanCode.io allows uploading multiple archives to one project, so there can be multiple root packages. The variable should be projects_as_root_packages, since documentDescribes should be an array of SPDX packages (one for each archive). E.g. upload 5 archives to a ScanCode.io project, then there should be 5 SPDX root elements in documentDescribes of the resulting SPDX file.
In case ScanCode.io is given a single PURL for a code repository as its project input, such as pkg:github/package-url/purl-spec@244fd47e07d1004f0aed9c, then documentDescribes is still an array but should only contain a single package for the code repository that was scanned; see the comment of SPDX maintainer Rose spdx/spdx-spec#395 (comment).
If a single SPDX or CycloneDX SBOM was provided as ScanCode.io input for a project, then documentDescribes should point to the SPDX package for the provided SBOM imo.
> In case ScanCode.io is given a single PURL for a code repository as its project input, such as pkg:github/package-url/purl-spec@244fd47e07d1004f0aed9c, then documentDescribes is still an array but should only contain a single package for the code repository that was scanned
> If a single SPDX or CycloneDX SBOM was provided as ScanCode.io input for a project, then documentDescribes should point to the SPDX package for the provided SBOM imo.
@tsteenbe The code was adjusted to use the Project's input as the root package, addressing those 2 points.
The following forms of input are supported:
- Input manually copied to Project's inputs directory
- Input uploaded
- Input fetched (download_url, purl, docker, git, ...)
> Using project_as_root_package is incorrect imo: ScanCode.io allows uploading multiple archives to one project, so there can be multiple root packages. The variable should be projects_as_root_packages, since documentDescribes should be an array of SPDX packages (one for each archive). E.g. upload 5 archives to a ScanCode.io project, then there should be 5 SPDX root elements in documentDescribes of the resulting SPDX file.
Now, for the multiple inputs case, this will require additional design work and likely some changes in the SCIO architecture to properly track CodebaseResource and DiscoveredPackage objects back to their input origin.
This will be handled in a separate PR, since it first requires further discussion.
Also, note that projects with multiple inputs (e.g. when using the deploy_to_develop pipeline) are not expected to fetch SPDX documents.
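As a rough illustration of the behaviour described in this reply, the sketch below picks a root package from a project's inputs: a single input becomes the root, and a project with several inputs falls back to a package named after the project. The names `RootPackage`, `to_spdx_id`, `select_root_package`, and `document_describes` are hypothetical and are not the ScanCode.io implementation; the project name and archive names are made up.

```python
import re
from dataclasses import dataclass


@dataclass
class RootPackage:
    # Hypothetical stand-in for the SPDX package used as the document root.
    spdx_id: str
    name: str


def to_spdx_id(name):
    # SPDX identifiers are "SPDXRef-" followed by letters, digits, "." and "-".
    return "SPDXRef-" + re.sub(r"[^A-Za-z0-9.-]", "-", name)


def select_root_package(project_name, input_sources):
    # Assumption: a single input is the root; with several inputs, fall back
    # to a package named after the project (the multi-input design is open).
    if len(input_sources) == 1:
        name = input_sources[0]
    else:
        name = project_name
    return RootPackage(spdx_id=to_spdx_id(name), name=name)


def document_describes(root):
    # documentDescribes stays an array even with a single root element.
    return [root.spdx_id]


single = select_root_package(
    "demo", ["pkg:github/package-url/purl-spec@244fd47e07d1004f0aed9c"]
)
multiple = select_root_package("demo", ["from.zip", "to.zip"])
print(document_describes(single))
print(document_describes(multiple))
```

Keeping the return value a list leaves room for the multiple-roots case raised in the review, should the follow-up design go that way.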
Added support for downloading results as SPDX 2.2 (WebUI, REST API, CLI).
Signed-off-by: tdruez <tdruez@nexb.com> # Conflicts: # scanpipe/tests/pipes/test_output.py
Signed-off-by: tdruez <tdruez@nexb.com> # Conflicts: # scanpipe/pipes/spdx.py
Background
The documentDescribes field should describe the "root" software artifact(s) represented in the SPDX document (typically just one, such as the top-level project or container) but not all scanned packages or files, as previously implemented in ScanCode.io: aboutcode-org/scancode.io#564

This migration of documentDescribes content is required to ensure that SPDX output generated by ScanCode.io can be reliably consumed as input by other Software Composition Analysis (SCA) tools. A concrete example is the ORT integration work tracked in #1727, where downstream tools expect documentDescribes to reference only the root package rather than all discovered elements.

Changes made
- [...] Notes: for projects with multiple inputs, a root SPDX package (project_as_root_package) is used for the documentDescribes.
- [...] SPDX.Document and its as_dict() serialization logic to follow this model.
- [...] (test_output.py, test_spdx.py) to assert correct behavior and updated expected counts and data.
- [...] documentDescribes.

Impact
This aligns code with SPDX best practices, improves clarity for consumers, and ensures reliable test coverage.
It also enables interoperability with other SCA tools that depend on SPDX output following this convention.
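
For consumers who want to confirm the convention on their side, here is a minimal sketch of a check over an SPDX 2.x JSON document loaded as a Python dict. It relies only on the standard `documentDescribes`, `packages`, and `SPDXID` fields; the function name `check_document_describes` and the sample data are hypothetical, and the "lists every package" rule is a heuristic for spotting the earlier behaviour, not something the SPDX specification mandates. It also assumes the described elements are packages.

```python
def check_document_describes(document):
    # Return a list of problems found with documentDescribes.
    described = document.get("documentDescribes", [])
    package_ids = {package["SPDXID"] for package in document.get("packages", [])}
    problems = []
    if not described:
        problems.append("documentDescribes is empty or missing")
    for spdx_id in described:
        if spdx_id not in package_ids:
            problems.append(f"{spdx_id} is not a package in this document")
    # Heuristic: describing every package suggests the pre-change output.
    if len(package_ids) > 1 and set(described) == package_ids:
        problems.append("documentDescribes lists every package, not only the root")
    return problems


packages = [
    {"SPDXID": "SPDXRef-root", "name": "demo-project"},
    {"SPDXID": "SPDXRef-dep-1", "name": "dependency-one"},
    {"SPDXID": "SPDXRef-dep-2", "name": "dependency-two"},
]
root_only = {
    "spdxVersion": "SPDX-2.2",
    "SPDXID": "SPDXRef-DOCUMENT",
    "documentDescribes": ["SPDXRef-root"],
    "packages": packages,
}
# Same document, but describing every package as the earlier output did.
all_packages = dict(
    root_only, documentDescribes=[package["SPDXID"] for package in packages]
)

print(check_document_describes(root_only))
print(check_document_describes(all_packages))
```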
Related Issues/Discussions