Expose file metadata for image contents #477

Open
Tracked by #444
wagoodman opened this issue Aug 10, 2021 · 3 comments · Fixed by #1383
Labels
enhancement New feature or request format:spdx SPDX related enhancement or bug I/O Describes bug or enhancement around application input or output

Comments

@wagoodman
Contributor

wagoodman commented Aug 10, 2021

Today the package catalogers expose some file information taken from the cataloging source rather than from the file on disk (e.g. indirect file metadata from the RPM DB, not metadata read directly from the file's location in the image archive). It would be useful to expose direct (not indirect) file metadata as artifacts, at least in the context of the SPDX SBOM format.

This involves looking at the existing file cataloger and deciding whether it should be invoked conditionally based on the user's output format option, directly by the presenter object (not ideal), or in some other way.

@wagoodman wagoodman mentioned this issue Aug 10, 2021
2 tasks
@wagoodman wagoodman changed the title Expose file metadata for image contents [SPDX] Expose file metadata for image contents Aug 10, 2021
@luhring luhring added the enhancement New feature or request label Aug 10, 2021
@wagoodman wagoodman added the I/O Describes bug or enhancement around application input or output label Aug 23, 2021
@wagoodman wagoodman changed the title [SPDX] Expose file metadata for image contents Expose file metadata for image contents Oct 16, 2021
@wagoodman wagoodman added the format:spdx SPDX related enhancement or bug label Oct 16, 2021
@zhill
Member

zhill commented Apr 8, 2022

It would be helpful to have both the "expected" and "observed" metadata (uid, gid, mode, checksums) for files so that a user can determine whether the pkgdb entry matches the actual content. I'm not sure how much of that is necessary for SPDX in particular, but it would have value beyond that IMO.
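
To illustrate the expected-versus-observed comparison described above, here is a minimal sketch in Go. The FileMetadata type and the Diff function are hypothetical and exist only for this example; they are not syft types. The sketch assumes the package database supplies the "expected" record and the file in the image supplies the "observed" one.

```go
package main

import (
	"fmt"
	"sort"
)

// FileMetadata is a hypothetical, simplified record used only for this
// illustration; it is not syft's actual file metadata type.
type FileMetadata struct {
	UID     int
	GID     int
	Mode    uint32            // permission bits, e.g. 0o644
	Digests map[string]string // algorithm name -> hex digest
}

// Diff describes every field where the observed metadata (read from the
// file itself) differs from the expected metadata (claimed by the package DB).
// Digests are compared only for algorithms present on both sides.
func Diff(expected, observed FileMetadata) []string {
	var diffs []string
	if expected.UID != observed.UID {
		diffs = append(diffs, fmt.Sprintf("uid: expected %d, observed %d", expected.UID, observed.UID))
	}
	if expected.GID != observed.GID {
		diffs = append(diffs, fmt.Sprintf("gid: expected %d, observed %d", expected.GID, observed.GID))
	}
	if expected.Mode != observed.Mode {
		diffs = append(diffs, fmt.Sprintf("mode: expected %04o, observed %04o", expected.Mode, observed.Mode))
	}

	// Collect the algorithms both records share, sorted for stable output.
	algos := make([]string, 0, len(expected.Digests))
	for algo := range expected.Digests {
		if _, ok := observed.Digests[algo]; ok {
			algos = append(algos, algo)
		}
	}
	sort.Strings(algos)
	for _, algo := range algos {
		if expected.Digests[algo] != observed.Digests[algo] {
			diffs = append(diffs, fmt.Sprintf("%s: expected %s, observed %s",
				algo, expected.Digests[algo], observed.Digests[algo]))
		}
	}
	return diffs
}
```

An empty result from Diff would mean the package DB entry and the file agree on every field that both sides report; skipping digest algorithms that only one side provides is a design choice of this sketch, not something the thread specifies.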

@wagoodman
Contributor Author

This has effectively been implemented and turned on by default in #1383. Specifically, the files section of the SBOM now catalogs file metadata and digests, by default, for all files that are claimed to be owned by a package. The user can also change which files are reported by setting file.metadata.selection to all or none. The document makes no specific claim about whether the metadata recorded by a package matches what was actually observed; however, it now surfaces enough information for a reader to work that out.
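
The selection behavior described above can be sketched roughly as follows. This is an illustration only, not syft's implementation: the Selection type and SelectFiles function are hypothetical, and the identifier "owned-by-package" for the default mode is an assumption, since the comment names only the all and none values.

```go
package main

// Selection is a hypothetical stand-in for the file.metadata.selection option.
type Selection string

const (
	SelectAll            Selection = "all"
	SelectOwnedByPackage Selection = "owned-by-package" // assumed name for the default mode
	SelectNone           Selection = "none"
)

// SelectFiles returns the paths whose metadata would be reported under the
// given selection. owned marks the paths that some package claims to own.
func SelectFiles(sel Selection, allPaths []string, owned map[string]bool) []string {
	var selected []string
	switch sel {
	case SelectAll:
		// Report every file found in the source.
		selected = append(selected, allPaths...)
	case SelectOwnedByPackage:
		// Report only files claimed by at least one package.
		for _, p := range allPaths {
			if owned[p] {
				selected = append(selected, p)
			}
		}
	case SelectNone:
		// Report nothing.
	}
	return selected
}
```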

@github-project-automation github-project-automation bot moved this to Done in OSS Feb 7, 2024
@wagoodman wagoodman added changelog-ignore Don't include this issue in the release changelog and removed changelog-ignore Don't include this issue in the release changelog labels Feb 7, 2024
@wagoodman wagoodman reopened this Feb 7, 2024
@wagoodman
Contributor Author

wagoodman commented Feb 7, 2024

I got a little ahead of myself in claiming victory here. Though the above comment is true, what is missing is tying this back to what SPDX can express in terms of FilesAnalyzed:

func toPackageChecksums(p pkg.Package) ([]spdx.Checksum, bool) {
    filesAnalyzed := false
    var checksums []spdx.Checksum
    switch meta := p.Metadata.(type) {
    // we generate digest for some Java packages
    // spdx.github.io/spdx-spec/package-information/#710-package-checksum-field
    case pkg.JavaArchive:
        // if syft has generated the digest here then filesAnalyzed is true
        if len(meta.ArchiveDigests) > 0 {
            filesAnalyzed = true
            for _, digest := range meta.ArchiveDigests {
                algo := strings.ToUpper(digest.Algorithm)
                checksums = append(checksums, spdx.Checksum{
                    Algorithm: spdx.ChecksumAlgorithm(algo),
                    Value:     digest.Value,
                })
            }
        }
    case pkg.GolangBinaryBuildinfoEntry:
        // because the H1 digest is found in the Golang metadata we cannot claim that the files were analyzed
        algo, hexStr, err := helpers.HDigestToSHA(meta.H1Digest)
        if err != nil {
            log.Debugf("invalid h1digest: %s: %v", meta.H1Digest, err)
            break
        }
        algo = strings.ToUpper(algo)
        checksums = append(checksums, spdx.Checksum{
            Algorithm: spdx.ChecksumAlgorithm(algo),
            Value:     hexStr,
        })
    }
    return checksums, filesAnalyzed
}

To really run this to ground we would need to find the elements in the files section of the syft core SBOM model that correspond to the files claimed to be owned by the package, and report out any checksums we may have for them.
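
As a rough sketch of that correlation step, the Go code below looks up digests for each path a package claims to own and reports them alongside a filesAnalyzed flag. Everything here is hypothetical: the Digest, Checksum, and FileChecksums types and the OwnedFileChecksums function are stand-ins, not the syft or SPDX library types, and treating "at least one owned file has a digest" as sufficient for filesAnalyzed is an assumption of this example rather than a decision made in this issue.

```go
package main

import (
	"sort"
	"strings"
)

// Digest is a hypothetical stand-in for a digest stored in the files section.
type Digest struct {
	Algorithm string
	Value     string
}

// Checksum is a hypothetical stand-in for an SPDX checksum entry.
type Checksum struct {
	Algorithm string
	Value     string
}

// FileChecksums pairs an owned file path with the checksums found for it.
type FileChecksums struct {
	Path      string
	Checksums []Checksum
}

// OwnedFileChecksums looks up each path the package claims to own in
// digestsByPath (a simplified view of the files section) and returns the
// checksums found, sorted by path. The boolean is true when at least one
// owned file had a digest (an assumed policy for this sketch).
func OwnedFileChecksums(ownedPaths []string, digestsByPath map[string][]Digest) ([]FileChecksums, bool) {
	// Copy before sorting so the caller's slice is left untouched.
	paths := append([]string(nil), ownedPaths...)
	sort.Strings(paths)

	var results []FileChecksums
	for _, path := range paths {
		digests := digestsByPath[path]
		if len(digests) == 0 {
			continue // no direct observation of this file
		}
		entry := FileChecksums{Path: path}
		for _, d := range digests {
			entry.Checksums = append(entry.Checksums, Checksum{
				Algorithm: strings.ToUpper(d.Algorithm), // uppercased, as in toPackageChecksums
				Value:     d.Value,
			})
		}
		results = append(results, entry)
	}
	return results, len(results) > 0
}
```

A stricter policy, such as requiring digests for every owned file before setting filesAnalyzed, would only change the final return expression.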

@wagoodman wagoodman moved this from Done to Ready in OSS Sep 30, 2024
Projects
Status: Ready