feat(sbom): normalize SBOMs for deterministic builds #281
Merged
- Add getGitCommitTimestamp() to retrieve deterministic timestamp from git commit or SOURCE_DATE_EPOCH environment variable
- Add generateDeterministicUUID() to create UUIDv5 from content
- Add normalizeCycloneDX() to normalize CycloneDX SBOM timestamps and UUIDs
- Add normalizeSPDX() to normalize SPDX SBOM timestamps and UUIDs
- Update writeSBOM() to call normalizers after SBOM generation
- Add comprehensive unit tests for all normalization functions
- Fix sbomSPDXFileExtension and sbomSyftFileExtension constants

This enables reproducible builds where the same source produces the same artifact SHA256, supporting reliable caching and attestation verification.

Co-authored-by: Ona <no-reply@ona.com>
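For readers unfamiliar with UUIDv5, here is a minimal sketch of how a content-derived UUID can be computed with only the standard library. It is not the PR's code: the function name `uuidV5`, the choice of the DNS namespace, and the input string are assumptions made for illustration, and the PR's generateDeterministicUUID() may use a different namespace or a UUID library.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// namespaceDNS is the RFC 4122 DNS namespace UUID
// (6ba7b810-9dad-11d1-80b4-00c04fd430c8). Using it here is an assumption;
// any fixed namespace gives deterministic results.
var namespaceDNS = [16]byte{
	0x6b, 0xa7, 0xb8, 0x10, 0x9d, 0xad, 0x11, 0xd1,
	0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8,
}

// uuidV5 (hypothetical helper) hashes namespace+name with SHA-1 and
// formats the first 16 bytes as a version 5 UUID.
func uuidV5(ns [16]byte, name []byte) string {
	h := sha1.New()
	h.Write(ns[:])
	h.Write(name)
	sum := h.Sum(nil)

	var u [16]byte
	copy(u[:], sum[:16])
	u[6] = (u[6] & 0x0f) | 0x50 // set version 5
	u[8] = (u[8] & 0x3f) | 0x80 // set RFC 4122 variant

	s := hex.EncodeToString(u[:])
	return s[0:8] + "-" + s[8:12] + "-" + s[12:16] + "-" + s[16:20] + "-" + s[20:32]
}

func main() {
	// The same content always yields the same UUID.
	content := []byte("example SBOM content")
	fmt.Println(uuidV5(namespaceDNS, content))
	fmt.Println(uuidV5(namespaceDNS, content))
}
```

Because the UUID depends only on the namespace and the input bytes, two builds of the same source get the same identifier, which is the property the normalizers rely on.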
geropl
approved these changes
Nov 18, 2025
Member
geropl
left a comment
Changes and tests LGTM! ✔️
corneliusludmann
approved these changes
Nov 18, 2025
Contributor
corneliusludmann
left a comment
Added 2 comments with improvements (nothing major). Approve to unblock.
Add context parameter to getGitCommitTimestamp() to support cancellation:
- Use exec.CommandContext instead of exec.Command
- Allows git command to be terminated if build is cancelled
- Prevents orphaned git processes
- Respects build timeouts
- Follows standard Go pattern for cancellable operations

Also add test for context cancellation behavior.

Addresses review feedback from Cornelius on PR #281.

Co-authored-by: Ona <no-reply@ona.com>
Replace manual string manipulation with regex-based UUID matching:
- Use regex pattern to find and replace UUIDs
- Validate that a UUID exists before attempting replacement
- Handle edge cases: UUID at end, in middle, multiple UUIDs, no UUID
- Log warning if no UUID found in namespace (unexpected format)
- Add comprehensive test coverage for all edge cases

Benefits:
- More robust: validates UUID format before replacement
- Handles any UUID position in the namespace
- Fails gracefully with warning instead of silently
- Better test coverage for edge cases

Addresses review feedback from Cornelius on PR #281.

Co-authored-by: Ona <no-reply@ona.com>
Critical improvements to SPDX UUID replacement:

1. **Type validation**: Check that documentNamespace is a string
   - Prevents silent corruption when field has wrong type
   - Returns clear error message with actual type
2. **Empty validation**: Check that documentNamespace is not empty
   - Prevents invalid SBOM generation
   - Fails fast instead of silently continuing
3. **UUID validation**: Fail if no UUID found in namespace
   - Previously logged warning and continued (non-deterministic!)
   - Now returns error with helpful message
   - Alerts to potential Syft format changes
4. **Multiple UUID handling**: Log warning when multiple UUIDs found
   - Documents intentional behavior (replace all with same UUID)
   - Helps debugging unexpected formats
5. **Comprehensive edge case tests**:
   - documentNamespace is not a string
   - documentNamespace is empty
   - documentNamespace has no UUID
   - All cases now properly fail with clear errors

Benefits:
- Fails fast instead of silently producing non-deterministic builds
- Better error messages for debugging
- Catches unexpected SBOM format changes
- Prevents SBOM corruption

Addresses critical issues identified in PR review.

Co-authored-by: Ona <no-reply@ona.com>
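A rough sketch of the validation order listed above (type check, empty check, UUID presence, multiple-UUID warning) might look like this. The helper name `replaceNamespaceUUID`, the exact regex, the error wording, and the example namespace URL are assumptions, not the PR's code.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"regexp"
)

// uuidPattern matches the canonical 8-4-4-4-12 hexadecimal UUID form.
var uuidPattern = regexp.MustCompile(
	`[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}`)

// replaceNamespaceUUID (hypothetical helper) swaps every UUID found in the
// SPDX documentNamespace for newUUID and fails fast on unexpected input.
func replaceNamespaceUUID(doc map[string]any, newUUID string) error {
	raw, ok := doc["documentNamespace"]
	if !ok {
		return errors.New("documentNamespace is missing")
	}
	ns, ok := raw.(string)
	if !ok {
		return fmt.Errorf("documentNamespace is %T, expected string", raw)
	}
	if ns == "" {
		return errors.New("documentNamespace is empty")
	}
	matches := uuidPattern.FindAllString(ns, -1)
	if len(matches) == 0 {
		return fmt.Errorf("no UUID found in documentNamespace %q", ns)
	}
	if len(matches) > 1 {
		// Intentional: all occurrences are replaced with the same UUID.
		log.Printf("warning: %d UUIDs found in documentNamespace %q", len(matches), ns)
	}
	doc["documentNamespace"] = uuidPattern.ReplaceAllLiteralString(ns, newUUID)
	return nil
}

func main() {
	doc := map[string]any{
		"documentNamespace": "https://example.com/spdx/pkg-11111111-2222-3333-4444-555555555555",
	}
	if err := replaceNamespaceUUID(doc, "aaaaaaaa-bbbb-5ccc-8ddd-eeeeeeeeeeee"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(doc["documentNamespace"])
}
```

Returning an error when no UUID is present, rather than logging and continuing, keeps a format change in the generated SBOM from silently reintroducing a random identifier.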
leodido
added a commit
that referenced
this pull request
Nov 18, 2025
leodido
added a commit
that referenced
this pull request
Nov 18, 2025
Overview
Makes SBOM generation deterministic by normalizing timestamps and UUIDs after Syft generates them. This enables reproducible builds where the same source produces the same artifact SHA256.
Part of https://linear.app/ona-team/issue/CLC-2096/prevent-attestation-overwrites-in-concurrent-builds
Part of https://linear.app/ona-team/issue/CLC-2097/improve-builds-determinism
Changes
Implementation (pkg/leeway/sbom.go)
- getGitCommitTimestamp(): Retrieves a deterministic timestamp from the git commit
  - Honors the SOURCE_DATE_EPOCH environment variable for reproducible builds
  - Handles the case where SOURCE_DATE_EPOCH is invalid
- generateDeterministicUUID(): Generates a UUIDv5 from content
- normalizeCycloneDX(): Normalizes the CycloneDX SBOM
- normalizeSPDX(): Normalizes the SPDX SBOM
- Updated writeSBOM(): Calls the normalizers after SBOM generation

Tests (pkg/leeway/sbom_normalize_test.go)
Comprehensive test coverage including:
Why These Changes
Problem
SBOMs contained non-deterministic fields:
This prevented:
Solution
Post-process SBOMs after Syft generates them to replace non-deterministic values with deterministic ones based on git commit metadata.
Why Not Wait for Upstream?
Format-Specific Behavior
See anchore/syft#3931 for upstream discussion.
Testing
Unit Tests
All tests pass with coverage for normal operation and error conditions.
Manual Verification
Error Handling
Fails fast with clear error message if deterministic timestamp cannot be obtained:
Breaking Changes
None. This is a non-breaking enhancement that makes builds more deterministic.
Co-authored-by: Ona <no-reply@ona.com>