Skip to content

feat: Maven repo allow-list with metadata-driven enforcement#1696

Draft
csasalu wants to merge 5 commits intoconforma:mainfrom
csasalu:feature/maven_sbom_check
Draft

feat: Maven repo allow-list with metadata-driven enforcement#1696
csasalu wants to merge 5 commits intoconforma:mainfrom
csasalu:feature/maven_sbom_check

Conversation

@csasalu
Copy link
Copy Markdown
Contributor

@csasalu csasalu commented Mar 12, 2026

Implemented strict repository URL validation with time-based enforcement handled via policy metadata directives.

  • Restricted Maven builds to approved repositories only.
  • Added 'effective_from' metadata annotation to allow for a grace period before hard enforcement.All violations are flagged as warnings until the 'effective_from' date.
  • Support for SPDX and CycloneDX.

Ref: https://redhat.atlassian.net/browse/KONFLUX-10598

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines Coverage Δ
policy/lib/sbom/maven.rego 100.00% <100.00%> (ø)
policy/lib/sbom/maven_test.rego 100.00% <100.00%> (ø)
policy/release/maven_repos/maven_repos.rego 100.00% <100.00%> (ø)
policy/release/maven_repos/maven_repos_test.rego 100.00% <100.00%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@csasalu csasalu force-pushed the feature/maven_sbom_check branch 6 times, most recently from fc42082 to f8e5e05 Compare March 16, 2026 14:41
@csasalu csasalu marked this pull request as draft March 16, 2026 18:54
@csasalu csasalu force-pushed the feature/maven_sbom_check branch 4 times, most recently from 4d11bcd to 8aae2db Compare March 16, 2026 20:27
@csasalu csasalu marked this pull request as ready for review March 17, 2026 11:53
@csasalu csasalu force-pushed the feature/maven_sbom_check branch from 8aae2db to c939895 Compare March 17, 2026 20:11
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 17, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 502c9b0e-4e7d-4d97-ba9d-e6ee74fd8ae4

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds Maven SBOM extraction utilities, a release policy enforcing allowed Maven repository URLs (with defaulting to Maven Central), tests for extraction and policy behavior, Antora documentation and nav entries, and a new empty redhat_maven policy collection module.

Changes

Cohort / File(s) Summary
Documentation
antora/docs/modules/ROOT/pages/packages/release_maven_repos.adoc, antora/docs/modules/ROOT/pages/release_policy.adoc, antora/docs/modules/ROOT/partials/release_policy_nav.adoc
New Antora page for maven_repos package and two rules; release_policy updated to include maven_repos in redhat and new redhat_maven collection; nav entries added linking to the new rules.
Maven SBOM Extraction Library
policy/lib/sbom/maven.rego, policy/lib/sbom/maven_test.rego
New lib.sbom.maven module extracting Maven packages (purl, name, repository_url) from CycloneDX and SPDX, plus comprehensive unit tests covering extraction, filtering, defaulting, multiple refs, and combined sources.
Maven Repository Policy Enforcement
policy/release/maven_repos/maven_repos.rego, policy/release/maven_repos/maven_repos_test.rego
New release.maven_repos policy that validates allowed_maven_repositories rule-data and denies packages whose effective repository URL (defaults to Maven Central when blank) is not in the allowed list; includes tests for permitted/denied/default/missing-data cases and helper assertions.
Policy Collection Metadata
policy/release/collection/redhat_maven/redhat_maven.rego
Adds collection.redhat_maven module with metadata and rego.v1 import (no rules implemented).

Sequence Diagram(s)

sequenceDiagram
  participant SBOM as SBOM (CycloneDX / SPDX)
  participant Lib as lib.sbom.maven
  participant Policy as release.maven_repos
  participant RuleData as rule_data.allowed_maven_repositories
  participant Results as Policy Engine / Results

  SBOM->>Lib: provide SBOM packages
  Lib->>Policy: emit package objects {purl, name, repository_url}
  Policy->>Policy: for each package -> _get_effective_url(repository_url) (defaults to Maven Central if empty)
  Policy->>RuleData: query allowed_maven_repositories
  RuleData-->>Policy: allowed list or missing/invalid
  Policy->>Policy: _url_is_permitted(effective_url)
  alt permitted
    Policy->>Results: no deny for package
  else not permitted
    Policy->>Results: emit deny_unpermitted_urls result (includes term=purl)
  else missing/invalid rule data
    Policy->>Results: emit policy_data_missing result
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: implementing Maven repository allow-list validation with metadata-driven enforcement timing control.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description check ✅ Passed The PR description accurately describes the key changes: implementing strict Maven repository URL validation, adding metadata-driven enforcement with 'effective_from' grace periods, and supporting both SPDX and CycloneDX formats, all of which align with the substantial file changes documented.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (1)
policy/release/maven_repos/maven_repos.rego (1)

8-16: Package-level metadata appears misplaced or duplicated.

The package-level metadata (lines 1-17) includes rule-specific fields like short_name, failure_msg, and solution, but this metadata won't be used by either deny rule since they each have their own # METADATA blocks with scope: rule. This package-level metadata with short_name: deny_unpermitted_urls_1 appears to be dead configuration.

If this metadata is intended for the package description only, consider removing the rule-specific fields (short_name, failure_msg, solution, effective_on) from the package-level metadata.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@policy/release/maven_repos/maven_repos.rego` around lines 8 - 16, The
package-level metadata block in maven_repos.rego contains rule-specific fields
that are duplicated by the per-rule `# METADATA` blocks (remove the dead
config): delete the fields `short_name`, `failure_msg`, `solution`, and
`effective_on` from the package-level comment header (you can leave package
description/collections if needed) so only the individual deny rules' `#
METADATA` sections define those rule-specific entries; alternatively move any
intended package-wide description into a minimal package comment and keep rule
metadata inside the `# METADATA` blocks for the deny rules.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@policy/release/collection/redhat_maven/redhat_rpms.rego`:
- Around line 1-8: The file name redhat_rpms.rego is inconsistent with the
declared package and metadata (package collection.redhat_maven,
title/description referencing Red Hat Maven); rename the file to
redhat_maven.rego (or alternatively change the package/title to match RPM
semantics) so filename, package name (collection.redhat_maven) and metadata
(title/description) are consistent; locate the file currently named
redhat_rpms.rego and perform the rename/update accordingly.

In `@policy/release/maven_repos/maven_repos_test.rego`:
- Around line 67-76: The test_spdx_default_fail currently only evaluates deny
and won't fail because it doesn't assert the expected outcome and the mocked
SPDX package puts the purl inside externalRefs so maven.packages'
startswith(item.purl, "pkg:maven/") check never triggers; update the test to (a)
include a top-level "purl" field on the package (e.g., "purl":
"pkg:maven/org.base/no-url@1.0") or adjust the mock so item.purl is present, and
(b) add an assertion that deny is non-empty or equals the expected set (use
lib.assert_equal or assert count(deny) > 0) so test_spdx_default_fail fails when
deny is produced by the maven.packages rule.
- Around line 36-45: In the test_default_maven_central_pass test, the mock
binding uses the whole cdx_input object instead of its components array; update
the binding so data.lib.cyclonedx.packages is bound to cdx_input.components
(i.e., replace the current with data.lib.cyclonedx.packages as cdx_input with
with data.lib.cyclonedx.packages as cdx_input.components) so the rule under test
receives the components array; reference test_default_maven_central_pass and the
cdx_input variable when making the change.
- Around line 11-20: The test_cyclonedx_permitted mock binds cdx_input as an
object with a components field but then injects the whole object into
data.lib.cyclonedx.packages; change the binding so the array is passed: use
cdx_input.components when applying the with-modifier to
data.lib.cyclonedx.packages (leave the cdx_input variable and structure as-is,
just pass its .components array), ensuring the rule under test sees the expected
array rather than the wrapper object.
- Around line 22-34: The test defines spdx_input as an object with a "packages"
field but the mock binding expects the packages array itself; update the
test_spdx_permitted so the mock binds correctly by either (A) setting spdx_input
to the packages array directly (i.e., spdx_input := [{ ... }] ) or (B) change
the with binding to use spdx_input.packages (with data.lib.spdx.packages as
spdx_input.packages) so the rule_data mock uses the packages array; adjust the
symbol references test_spdx_permitted, spdx_input, deny, and
data.lib.spdx.packages accordingly.

---

Nitpick comments:
In `@policy/release/maven_repos/maven_repos.rego`:
- Around line 8-16: The package-level metadata block in maven_repos.rego
contains rule-specific fields that are duplicated by the per-rule `# METADATA`
blocks (remove the dead config): delete the fields `short_name`, `failure_msg`,
`solution`, and `effective_on` from the package-level comment header (you can
leave package description/collections if needed) so only the individual deny
rules' `# METADATA` sections define those rule-specific entries; alternatively
move any intended package-wide description into a minimal package comment and
keep rule metadata inside the `# METADATA` blocks for the deny rules.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c034ec79-4ec5-4f7e-a4b4-a0bfc9fead3a

📥 Commits

Reviewing files that changed from the base of the PR and between bc6f4bd and c939895.

📒 Files selected for processing (8)
  • antora/docs/modules/ROOT/pages/packages/release_maven_repos.adoc
  • antora/docs/modules/ROOT/pages/release_policy.adoc
  • antora/docs/modules/ROOT/partials/release_policy_nav.adoc
  • policy/lib/sbom/maven.rego
  • policy/lib/sbom/maven_test.rego
  • policy/release/collection/redhat_maven/redhat_rpms.rego
  • policy/release/maven_repos/maven_repos.rego
  • policy/release/maven_repos/maven_repos_test.rego

@csasalu csasalu force-pushed the feature/maven_sbom_check branch from c939895 to ba0a7b8 Compare March 17, 2026 20:26
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (3)
policy/release/maven_repos/maven_repos_test.rego (3)

67-76: ⚠️ Potential issue | 🟠 Major

This SPDX failure case doesn't hit the Maven extractor.

Lines 68-72 omit the top-level purl that policy/lib/sbom/maven.rego filters on, so the mocked package is ignored. Also compare the expected deny set so this test actually proves the default-to-Maven-Central path.

💡 Suggested fix
 test_spdx_default_fail if {
 	spdx_input := {"packages": [{
 		"name": "no-url",
+		"purl": "pkg:maven/org.base/no-url@1.0",
 		"externalRefs": [{"referenceType": "purl", "referenceLocator": "pkg:maven/org.base/no-url@1.0"}],
 		"downloadLocation": "NOASSERTION",
 	}]}
 
-	deny with data.lib.spdx.packages as spdx_input.packages
-		with data.rule_data as {"allowed_maven_repositories": ["https://internal.repo/"]}
+	expected := {{
+		"code": "release.maven_repos.deny_unpermitted_urls",
+		"msg": "Package \"pkg:maven/org.base/no-url@1.0\" (source: \"https://repo.maven.apache.org/maven2/\") is not in the permitted list",
+		"effective_on": "2026-05-10T00:00:00Z",
+		"term": "pkg:maven/org.base/no-url@1.0",
+	}}
+
+	lib.assert_equal(deny, expected) with data.lib.spdx.packages as spdx_input.packages
+		with data.rule_data as {"allowed_maven_repositories": ["https://internal.repo/"]}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@policy/release/maven_repos/maven_repos_test.rego` around lines 67 - 76, The
test_spdx_default_fail case currently constructs a package without the top-level
"purl" field so the matcher in policy/lib/sbom/maven.rego ignores it; update the
mocked spdx_input in test_spdx_default_fail to include the top-level "purl"
(matching the purl format used by data.lib.spdx.packages) and adjust the
assertion for the deny set to check that the deny result includes the Maven
Central default repository path (so the test verifies the
default-to-Maven-Central behavior when using data.lib.spdx.packages with the
rule_data allowed_maven_repositories override).

18-19: ⚠️ Potential issue | 🟠 Major

Bind the CycloneDX tests to cdx_input.components.

policy/lib/sbom/maven.rego iterates data.lib.cyclonedx.packages as the package array. Lines 19 and 44 pass the wrapper object instead, so both tests can miss the extraction path they are meant to cover.

💡 Suggested fix
 	lib.assert_empty(deny) with data.rule_data as mock_data
-		with data.lib.cyclonedx.packages as cdx_input
+		with data.lib.cyclonedx.packages as cdx_input.components
@@
 	lib.assert_empty(deny) with data.rule_data as mock_data
-		with data.lib.cyclonedx.packages as cdx_input
+		with data.lib.cyclonedx.packages as cdx_input.components

Also applies to: 43-44

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@policy/release/maven_repos/maven_repos_test.rego` around lines 18 - 19, The
tests bind the CycloneDX fixture to the wrapper object instead of the actual
components array, so update the test `with` bindings to pass the components
array into the policy path used by `policy/lib/sbom/maven.rego`. Specifically,
change occurrences that read `with data.lib.cyclonedx.packages as cdx_input` to
`with data.lib.cyclonedx.packages.components as cdx_input` for the
`lib.assert_empty(deny) with data.rule_data as mock_data with ... as cdx_input`
assertions (apply the same change to the second occurrence around the later
`lib.assert_empty(deny)`), leaving `lib.assert_empty`, `deny`, and
`data.rule_data` as-is.

32-33: ⚠️ Potential issue | 🟠 Major

Bind the SPDX happy-path test to spdx_input.packages.

policy/lib/sbom/maven.rego iterates data.lib.spdx.packages as the package array. Line 33 passes the wrapper object, so this test does not exercise the intended branch.

💡 Suggested fix
 	lib.assert_empty(deny) with data.rule_data as mock_data
-		with data.lib.spdx.packages as spdx_input
+		with data.lib.spdx.packages as spdx_input.packages
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@policy/release/maven_repos/maven_repos_test.rego` around lines 32 - 33, The
test currently binds data.lib.spdx.packages to spdx_input (the whole wrapper) so
the policy branch that iterates data.lib.spdx.packages isn't exercised; change
the binding to use the inner array by replacing the second "with" to "with
data.lib.spdx.packages as spdx_input.packages" so lib.assert_empty(deny) is run
with data.lib.spdx.packages mapped to the mock's packages field (refer to
lib.assert_empty(deny), data.lib.spdx.packages, and spdx_input.packages).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@policy/lib/sbom/maven.rego`:
- Around line 33-39: The helper _extract_cdx_repo(component) currently uses
"some ref" which can yield multiple urls; change it to select deterministically
(e.g., iterate by index: "some i" and set ref := component.externalRefs[i]" and
pick the smallest i that satisfies ref.type in
["distribution","artifact-repository"] or implement a precedence list and pick
the first match) so the function always returns a single URL for a component;
update the analogous occurrence at lines 52–55 the same way, and add a
regression test that constructs a component with multiple matching externalRefs
to assert only one package is produced with a stable repository_url.

In `@policy/release/maven_repos/maven_repos.rego`:
- Around line 62-73: The _rule_data_errors helper is checking data.rule_data
directly which misses sources that lib.rule_data("allowed_maven_repositories")
handles (config, custom data, defaults); update _rule_data_errors to call
lib.rule_data("allowed_maven_repositories") to retrieve permitted instead of
reading data.rule_data, then pass that result into _is_invalid_data (and keep
_is_invalid_data handling the "UNDEFINED_IN_DATA" sentinel), so the same
rule-data resolution used by _url_is_permitted is used when validating presence.
- Around line 44-56: The rule currently keys violations by
`_repo_url_errors[purl]` which causes unification conflicts when the same `purl`
has multiple different `source` values; change `_repo_url_errors` to emit a set
of error objects (e.g. { "purl": purl, "source": source, "msg": msg }) or use a
composite key like `_repo_url_errors[[purl, source]] := msg` instead of
`_repo_url_errors[purl]`, and update the `deny` construction (the `deny` rule
that uses `_repo_url_errors` and the `result`/`base` creation in `deny contains
result`) to iterate those objects/keys and include both `purl` and `source` (use
`_get_effective_url(pkg.repository_url)`, `_url_is_permitted`, and
`maven.packages` to locate the generation site) so each distinct (purl, source)
violation is reported without unification errors.

---

Duplicate comments:
In `@policy/release/maven_repos/maven_repos_test.rego`:
- Around line 67-76: The test_spdx_default_fail case currently constructs a
package without the top-level "purl" field so the matcher in
policy/lib/sbom/maven.rego ignores it; update the mocked spdx_input in
test_spdx_default_fail to include the top-level "purl" (matching the purl format
used by data.lib.spdx.packages) and adjust the assertion for the deny set to
check that the deny result includes the Maven Central default repository path
(so the test verifies the default-to-Maven-Central behavior when using
data.lib.spdx.packages with the rule_data allowed_maven_repositories override).
- Around line 18-19: The tests bind the CycloneDX fixture to the wrapper object
instead of the actual components array, so update the test `with` bindings to
pass the components array into the policy path used by
`policy/lib/sbom/maven.rego`. Specifically, change occurrences that read `with
data.lib.cyclonedx.packages as cdx_input` to `with
data.lib.cyclonedx.packages.components as cdx_input` for the
`lib.assert_empty(deny) with data.rule_data as mock_data with ... as cdx_input`
assertions (apply the same change to the second occurrence around the later
`lib.assert_empty(deny)`), leaving `lib.assert_empty`, `deny`, and
`data.rule_data` as-is.
- Around line 32-33: The test currently binds data.lib.spdx.packages to
spdx_input (the whole wrapper) so the policy branch that iterates
data.lib.spdx.packages isn't exercised; change the binding to use the inner
array by replacing the second "with" to "with data.lib.spdx.packages as
spdx_input.packages" so lib.assert_empty(deny) is run with
data.lib.spdx.packages mapped to the mock's packages field (refer to
lib.assert_empty(deny), data.lib.spdx.packages, and spdx_input.packages).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a15faa09-66d3-44be-8627-fd4e31d6d519

📥 Commits

Reviewing files that changed from the base of the PR and between c939895 and ba0a7b8.

📒 Files selected for processing (8)
  • antora/docs/modules/ROOT/pages/packages/release_maven_repos.adoc
  • antora/docs/modules/ROOT/pages/release_policy.adoc
  • antora/docs/modules/ROOT/partials/release_policy_nav.adoc
  • policy/lib/sbom/maven.rego
  • policy/lib/sbom/maven_test.rego
  • policy/release/collection/redhat_maven/redhat_maven.rego
  • policy/release/maven_repos/maven_repos.rego
  • policy/release/maven_repos/maven_repos_test.rego
🚧 Files skipped from review as they are similar to previous changes (2)
  • antora/docs/modules/ROOT/pages/release_policy.adoc
  • antora/docs/modules/ROOT/pages/packages/release_maven_repos.adoc

Comment on lines +44 to +56
deny contains result if {
some purl, msg in _repo_url_errors
base := lib.result_helper(rego.metadata.chain(), [msg])
result := object.union(base, {"term": purl})
}

_repo_url_errors[purl] := msg if {
some pkg in maven.packages
purl := pkg.purl
source := _get_effective_url(pkg.repository_url)
not _url_is_permitted(source)
msg := sprintf("Package %q (source: %q) is not in the permitted list", [purl, source])
}
Copy link
Copy Markdown

@coderabbitai coderabbitai bot Mar 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n policy/release/maven_repos/maven_repos.rego | sed -n '40,60p'

Repository: conforma/policy

Length of output: 809


🏁 Script executed:

cat -n policy/lib/sbom/maven.rego

Repository: conforma/policy

Length of output: 1681


🏁 Script executed:

# Check if there are any imports or references that clarify package aggregation
rg "packages" policy/lib/sbom/maven.rego -A 3 -B 3

Repository: conforma/policy

Length of output: 655


🏁 Script executed:

# Check for tests that might show the duplicate purl scenario
find . -name "*test*" -o -name "*example*" | grep -i maven | head -20

Repository: conforma/policy

Length of output: 143


🏁 Script executed:

# Search for any existing handling of duplicate purls or conflict resolution
rg "purl" policy/release/maven_repos/ -B 2 -A 2

Repository: conforma/policy

Length of output: 3233


🏁 Script executed:

# Look for any unification or conflict handling in related code
rg "conflict\|unification\|duplicate" policy/ --type rego -i | head -20

Repository: conforma/policy

Length of output: 87


🏁 Script executed:

# Check if there are any comments about this issue in the codebase
rg "repo.*url.*error\|duplicate.*purl" policy/ -i

Repository: conforma/policy

Length of output: 41


🏁 Script executed:

cat -n policy/release/maven_repos/maven_repos_test.rego

Repository: conforma/policy

Length of output: 4141


🏁 Script executed:

# Check what the test coverage actually is
cat -n policy/lib/sbom/maven_test.rego

Repository: conforma/policy

Length of output: 2616


🏁 Script executed:

# Look for any comments or documentation about handling duplicates
rg "duplicate\|conflict\|same purl" policy/release/maven_repos/ -i

Repository: conforma/policy

Length of output: 41


🏁 Script executed:

# Check if there are any other rules or handlers that deal with merging or deduplication
rg "_repo_url_errors" policy/release/maven_repos/ -B 3 -A 3

Repository: conforma/policy

Length of output: 947


🏁 Script executed:

# Verify if there's any recent issue or PR that mentions this scenario
git log --oneline policy/release/maven_repos/ | head -20

Repository: conforma/policy

Length of output: 127


Don't key repo violations by purl alone.

policy/lib/sbom/maven.rego merges packages from both CycloneDX and SPDX sources. If the same Maven artifact appears in both with different unpermitted repository URLs, lines 50-56 will attempt to assign multiple values to _repo_url_errors[purl], causing a unification conflict in Rego. This results in lost violations and incomplete error reporting. Restructure to emit a set of error objects containing both purl and source, or key by (purl, source) tuple instead.

💡 Suggested fix
 deny contains result if {
-	some purl, msg in _repo_url_errors
-	base := lib.result_helper(rego.metadata.chain(), [msg])
-	result := object.union(base, {"term": purl})
+	some err in _repo_url_errors
+	base := lib.result_helper(rego.metadata.chain(), [err.msg])
+	result := object.union(base, {"term": err.purl})
 }
 
-_repo_url_errors[purl] := msg if {
+_repo_url_errors contains err if {
 	some pkg in maven.packages
-	purl := pkg.purl
 	source := _get_effective_url(pkg.repository_url)
 	not _url_is_permitted(source)
-	msg := sprintf("Package %q (source: %q) is not in the permitted list", [purl, source])
+	err := {
+		"purl": pkg.purl,
+		"msg": sprintf("Package %q (source: %q) is not in the permitted list", [pkg.purl, source]),
+	}
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
deny contains result if {
some purl, msg in _repo_url_errors
base := lib.result_helper(rego.metadata.chain(), [msg])
result := object.union(base, {"term": purl})
}
_repo_url_errors[purl] := msg if {
some pkg in maven.packages
purl := pkg.purl
source := _get_effective_url(pkg.repository_url)
not _url_is_permitted(source)
msg := sprintf("Package %q (source: %q) is not in the permitted list", [purl, source])
}
deny contains result if {
some err in _repo_url_errors
base := lib.result_helper(rego.metadata.chain(), [err.msg])
result := object.union(base, {"term": err.purl})
}
_repo_url_errors contains err if {
some pkg in maven.packages
source := _get_effective_url(pkg.repository_url)
not _url_is_permitted(source)
err := {
"purl": pkg.purl,
"msg": sprintf("Package %q (source: %q) is not in the permitted list", [pkg.purl, source]),
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@policy/release/maven_repos/maven_repos.rego` around lines 44 - 56, The rule
currently keys violations by `_repo_url_errors[purl]` which causes unification
conflicts when the same `purl` has multiple different `source` values; change
`_repo_url_errors` to emit a set of error objects (e.g. { "purl": purl,
"source": source, "msg": msg }) or use a composite key like
`_repo_url_errors[[purl, source]] := msg` instead of `_repo_url_errors[purl]`,
and update the `deny` construction (the `deny` rule that uses `_repo_url_errors`
and the `result`/`base` creation in `deny contains result`) to iterate those
objects/keys and include both `purl` and `source` (use
`_get_effective_url(pkg.repository_url)`, `_url_is_permitted`, and
`maven.packages` to locate the generation site) so each distinct (purl, source)
violation is reported without unification errors.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit : 3cd2b68

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
git show 3cd2b68b7679f6deaa4b381f010567a8abd8ce66 -- policy/release/maven_repos/maven_repos.rego

Length of output: 2319


@csasalu, the commit doesn't fully address the reported issue. The change only updated the import from data.lib.sbom.maven to data.lib.sbom, but the _repo_url_errors[purl] := msg keying pattern is still unchanged:

_repo_url_errors[purl] := msg if {
    some pkg in sbom.packages   # ← import changed, but...
    purl := pkg.purl
    source := _get_effective_url(pkg.repository_url)
    not _url_is_permitted(source)
    msg := sprintf(...)         # ← still keyed by purl only
}

The unification conflict remains: if the same purl appears with two distinct unpermitted source URLs, Rego will still fail to unify two different msg values for _repo_url_errors[purl], silently dropping violations. The fix from the suggestion needs to be applied:

deny contains result if {
    some err in _repo_url_errors
    base := lib.result_helper(rego.metadata.chain(), [err.msg])
    result := object.union(base, {"term": err.purl})
}

_repo_url_errors contains err if {
    some pkg in sbom.packages
    source := _get_effective_url(pkg.repository_url)
    not _url_is_permitted(source)
    err := {
        "purl": pkg.purl,
        "msg": sprintf("Package %q (source: %q) is not in the permitted list", [pkg.purl, source]),
    }
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this suggestion is valid

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d471a1a

@csasalu csasalu marked this pull request as draft March 19, 2026 14:21
@csasalu csasalu force-pushed the feature/maven_sbom_check branch 3 times, most recently from c98ca2d to 3e83fd6 Compare March 20, 2026 10:44
@csasalu csasalu marked this pull request as ready for review March 20, 2026 10:44
@csasalu csasalu force-pushed the feature/maven_sbom_check branch from 3e83fd6 to fa5baca Compare March 20, 2026 10:56
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
policy/lib/sbom/maven_test.rego (1)

66-74: Consider adding a test for multiple matching externalRefs.

The test coverage is solid for the common cases. Given the non-deterministic behavior noted in _extract_cdx_repo/_extract_spdx_repo when multiple refs match, consider adding a regression test that asserts expected behavior when a component has multiple externalRefs with matching types.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@policy/lib/sbom/maven_test.rego` around lines 66 - 74, Add a regression test
to cover the case where a CycloneDX/SPDX component has multiple matching
externalRefs so we can lock in the current non-deterministic behavior of
_extract_cdx_repo/_extract_spdx_repo; update or add a test (e.g., alongside
test_combined_sources) that constructs mock_cdx and/or mock_spdx entries where a
single component contains multiple externalRefs of the same matching type and
then assert the output of maven.packages (via the same with-data pattern) equals
the expected deterministic choice (or explicitly documents which ref is
selected) to prevent future regressions in _extract_cdx_repo/_extract_spdx_repo
behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@policy/lib/sbom/maven_test.rego`:
- Around line 66-74: Add a regression test to cover the case where a
CycloneDX/SPDX component has multiple matching externalRefs so we can lock in
the current non-deterministic behavior of _extract_cdx_repo/_extract_spdx_repo;
update or add a test (e.g., alongside test_combined_sources) that constructs
mock_cdx and/or mock_spdx entries where a single component contains multiple
externalRefs of the same matching type and then assert the output of
maven.packages (via the same with-data pattern) equals the expected
deterministic choice (or explicitly documents which ref is selected) to prevent
future regressions in _extract_cdx_repo/_extract_spdx_repo behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d2eb81fe-528e-4f7e-bdb4-de5c852a9e87

📥 Commits

Reviewing files that changed from the base of the PR and between 3e83fd6 and fa5baca.

📒 Files selected for processing (8)
  • antora/docs/modules/ROOT/pages/packages/release_maven_repos.adoc
  • antora/docs/modules/ROOT/pages/release_policy.adoc
  • antora/docs/modules/ROOT/partials/release_policy_nav.adoc
  • policy/lib/sbom/maven.rego
  • policy/lib/sbom/maven_test.rego
  • policy/release/collection/redhat_maven/redhat_maven.rego
  • policy/release/maven_repos/maven_repos.rego
  • policy/release/maven_repos/maven_repos_test.rego
✅ Files skipped from review due to trivial changes (3)
  • antora/docs/modules/ROOT/partials/release_policy_nav.adoc
  • policy/release/collection/redhat_maven/redhat_maven.rego
  • antora/docs/modules/ROOT/pages/packages/release_maven_repos.adoc
🚧 Files skipped from review as they are similar to previous changes (2)
  • antora/docs/modules/ROOT/pages/release_policy.adoc
  • policy/release/maven_repos/maven_repos_test.rego

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
policy/release/maven_repos/maven_repos_test.rego (1)

43-44: ⚠️ Potential issue | 🟠 Major

Fix CycloneDX mock binding to avoid vacuous pass

cdx_input is a wrapper object, but the override targets data.lib.cyclonedx.packages (array). Passing the wrapper can make this test pass without evaluating any package.

Proposed fix
-	lib.assert_empty(deny) with data.rule_data as mock_data
-		with data.lib.cyclonedx.packages as cdx_input
+	lib.assert_empty(deny) with data.rule_data as mock_data
+		with data.lib.cyclonedx.packages as cdx_input.components

As per coding guidelines, "-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@policy/release/maven_repos/maven_repos_test.rego` around lines 43 - 44, The
test overrides the array at data.lib.cyclonedx.packages with the wrapper object
cdx_input, which is a mismatch and can make lib.assert_empty(deny) vacuously
pass; change the mock binding to override the wrapper itself so the policy sees
the correct structure — replace the second `with` in the failing test so it
binds `data.lib.cyclonedx` to `cdx_input` (or alternatively bind
`data.lib.cyclonedx.packages` to `cdx_input.packages`) so the `deny` evaluation
uses the actual packages array.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@policy/release/maven_repos/maven_repos_test.rego`:
- Around line 112-132: The test test_cyclonedx_multiple_refs_behavior currently
only checks that a package's repository_url is one of the expected URLs; change
it to assert the exact set and count so regressions to a single URL fail: after
computing pkg_list := data.lib.sbom.maven.packages with data.lib.cyclonedx as
mock_cdx, find the specific package (e.g., by name or purl) and assert that the
set of repository_url values equals
{"https://first.repo.com","https://second.repo.com"} and that the
length/cardinality of that set is 2; update any similar test (the one around
lines 134-154) the same way so both the URL set and exact cardinality are
enforced.

---

Duplicate comments:
In `@policy/release/maven_repos/maven_repos_test.rego`:
- Around line 43-44: The test overrides the array at data.lib.cyclonedx.packages
with the wrapper object cdx_input, which is a mismatch and can make
lib.assert_empty(deny) vacuously pass; change the mock binding to override the
wrapper itself so the policy sees the correct structure — replace the second
`with` in the failing test so it binds `data.lib.cyclonedx` to `cdx_input` (or
alternatively bind `data.lib.cyclonedx.packages` to `cdx_input.packages`) so the
`deny` evaluation uses the actual packages array.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e2970946-3644-4f74-b76c-109af8356348

📥 Commits

Reviewing files that changed from the base of the PR and between fa5baca and e99ada8.

📒 Files selected for processing (3)
  • policy/lib/sbom/maven.rego
  • policy/lib/sbom/maven_test.rego
  • policy/release/maven_repos/maven_repos_test.rego

@csasalu csasalu force-pushed the feature/maven_sbom_check branch 2 times, most recently from 1d0d0dd to 207242c Compare March 20, 2026 20:16
@st3penta
Copy link
Copy Markdown
Contributor

Hi @csasalu, do you have a Jira related to this PR? If so, can you link it?
I also noticed you are actively working on this, is it actually ready for review? If you plan to do more work on it you could move it to draft until it's ready

@csasalu
Copy link
Copy Markdown
Contributor Author

csasalu commented Mar 23, 2026

Hi @csasalu, do you have a Jira related to this PR? If so, can you link it? I also noticed you are actively working on this, is it actually ready for review? If you plan to do more work on it you could move it to draft until it's ready

Hi @st3penta, Finished fixing all the code review items today. This is ready for review.
https://redhat.atlassian.net/browse/KONFLUX-10598

Comment on lines +8 to +11
# custom:
# collections:
# - redhat
# - redhat_maven
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we put collections metadata on packages? I'm not sure.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Copy Markdown
Contributor Author

@csasalu csasalu Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The package level collections metadata is inherited by all rules of the package. Here is sample code to verify in the rego playground - https://play.openpolicyagent.org/

# METADATA
# scope: package
# custom:
#   collections:
#   - redhat
#   - redhat_maven
package play.inheritance


# METADATA
# scope: rule
# title: Version Consistency Check
# custom:
#   collections: 
#   - policy_data
deny[msg] if {
	
	# 1. Gather all collections from the metadata chain
	active_collections := {c |
		some item in rego.metadata.chain()
		some c in item.annotations.custom.collections
	}

	msg := sprintf("Collections found: %v", [active_collections])
}

Documentation: https://www.openpolicyagent.org/docs/policy-language#metadata-scope

# provided to the policy, and that it contains a list of authorized
# repository URLs.
# collections:
# - policy_data
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be wrong, but I think you need redhat_maven here also.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It inherits redhat_maven from the package level collections but if it helps in readability I can define it again at the rule level.

# custom:
# short_name: deny_unpermitted_urls
# failure_msg: '%s'
# effective_on: 2026-05-10T00:00:00Z
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing collections: [redhat_maven] IIUC.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above.

Copy link
Copy Markdown
Member

@simonbaird simonbaird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's pretty close. Nice work! There are a few style/convention fixes needed in the test coverage and a few other places, and I think the collection metadata needs some attention. Otherwise, looks good.

@csasalu csasalu force-pushed the feature/maven_sbom_check branch from a8fb217 to 5fa895c Compare April 1, 2026 11:27
Comment on lines +8 to +11
# custom:
# collections:
# - redhat
# - redhat_maven
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

# repository URLs.
# collections:
# - policy_data
# severity: failure
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you have any reference for this line (severity)? i don't think we support it

# comes from, and that URL must be present in the list of known and permitted
# Maven repositories. If no URL is specified, the package is assumed to come
# from Maven Central.
# scope: rule
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you have any reference for this line (scope)? i don't think we support it

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one instance where it is used :

# scope: package

# Maven repositories. If no URL is specified, the package is assumed to come
# from Maven Central.
# scope: rule
# custom:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider adding a solution attribute to this metadata

Comment on lines +54 to +55
base := lib.result_helper(rego.metadata.chain(), [msg])
result := object.union(base, {"term": purl})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use result_helper_with_term here.

Also: i think you will hit some conflicts when rebasing on main, because we changed the packaging of the helper functions, so you'll need to change calls to something like metadata.result_helper_with_term instead of lib.result_helper_with_term

Comment on lines +44 to +56
deny contains result if {
some purl, msg in _repo_url_errors
base := lib.result_helper(rego.metadata.chain(), [msg])
result := object.union(base, {"term": purl})
}

_repo_url_errors[purl] := msg if {
some pkg in maven.packages
purl := pkg.purl
source := _get_effective_url(pkg.repository_url)
not _url_is_permitted(source)
msg := sprintf("Package %q (source: %q) is not in the permitted list", [purl, source])
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this suggestion is valid

@csasalu csasalu marked this pull request as draft April 1, 2026 13:45
@csasalu csasalu force-pushed the feature/maven_sbom_check branch from 5fa895c to 6d11233 Compare April 14, 2026 09:32
csasalu pushed a commit to csasalu/policy that referenced this pull request Apr 14, 2026
Switches violation reporting to a set-based structure to handle
duplicate PURLs across multiple SBOM sources.

fixes coderabbit review:
conforma#1696 (comment)
cmarulas and others added 3 commits April 14, 2026 12:19
Implemented strict repository URL validation with time-based
enforcement handled via policy metadata directives.

- Restricted Maven builds to approved repositories only.
- Added 'effective_from' metadata annotation to allow for a
  grace period before hard enforcement.All violations are
  flagged as warnings until the 'effective_from' date.
- Support for SPDX and CycloneDX.
Update the Maven SBOM library to process all repository URLs found in
a component's external references instead of only the first match.
This aligns with SPDX and CycloneDX schemas which allow 0..N
external references per package.
Co-authored-by: Simon Baird <simon.baird@gmail.com>
csasalu pushed a commit to csasalu/policy that referenced this pull request Apr 14, 2026
Switches violation reporting to a set-based structure to handle
duplicate PURLs across multiple SBOM sources.

fixes coderabbit review:
conforma#1696 (comment)
@csasalu csasalu force-pushed the feature/maven_sbom_check branch from acd3b8a to 286e85d Compare April 14, 2026 11:36
csasalu pushed a commit to csasalu/policy that referenced this pull request Apr 14, 2026
Switches violation reporting to a set-based structure to handle
duplicate PURLs across multiple SBOM sources.

fixes coderabbit review:
conforma#1696 (comment)
@csasalu csasalu force-pushed the feature/maven_sbom_check branch from 286e85d to ad609d3 Compare April 14, 2026 11:47
Switches violation reporting to a set-based structure to handle
duplicate PURLs across multiple SBOM sources.

fixes coderabbit review:
conforma#1696 (comment)
@csasalu csasalu force-pushed the feature/maven_sbom_check branch from ad609d3 to d471a1a Compare April 14, 2026 12:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants