Extends the osv-processor CI tool to generate RHEL OSV artifacts from
Red Hat's GCS data.
- Add --platform flag (ubuntu default, also accepts rhel)
- Add extractRHELVersion() for parsing Red Hat ecosystem strings
- Collapse repository/variant suffixes (appstream, baseos, server,
workstation, etc.) to major version
- Deduplicate CVE+package pairs across ecosystems
- Output: osv-rhel-{VERSION}-{YYYY-MM-DD}.json.gz
Pull request overview
Extends the cmd/osv-processor tool to support generating RHEL OSV artifacts from Red Hat’s OSV data feed by adding a --platform rhel mode with RHEL ecosystem parsing, CVE extraction, and cross-ecosystem de-duplication.
Changes:
- Added `--platform` flag and config plumbing to choose between Ubuntu and RHEL processing paths.
- Implemented RHEL artifact generation (`runRHEL`), including `enterprise_linux` version extraction and CVE list extraction per advisory.
- Added unit tests covering RHEL ecosystem parsing, CVE extraction, artifact generation, de-duplication, and version filtering.
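For readers unfamiliar with the flag plumbing described above, a minimal sketch using Go's standard `flag` package might look like this. Only the flag name (`--platform`), its default (`ubuntu`), and the accepted value `rhel` come from the PR description; the helper `parsePlatform` and the error wording are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// parsePlatform is a hypothetical helper that parses and validates the
// --platform flag. It does not reproduce the PR's actual config plumbing.
func parsePlatform(args []string) (string, error) {
	fs := flag.NewFlagSet("osv-processor", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the example output
	platform := fs.String("platform", "ubuntu", "target platform: ubuntu or rhel")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	switch *platform {
	case "ubuntu", "rhel":
		return *platform, nil
	default:
		return "", fmt.Errorf("unsupported platform %q (want ubuntu or rhel)", *platform)
	}
}

func main() {
	platform, err := parsePlatform(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Println("processing platform:", platform)
}
```

Validating the value at parse time means an unknown platform fails fast instead of silently falling through to the Ubuntu path.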
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| cmd/osv-processor/main.go | Adds platform selection and new RHEL processing pipeline + artifact format. |
| cmd/osv-processor/main_test.go | Adds test coverage for new RHEL parsing/extraction and artifact generation behavior. |
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #43277 +/- ##
========================================
Coverage 66.85% 66.85%
========================================
Files 2587 2588 +1
Lines 207405 207707 +302
Branches 9170 9170
========================================
+ Hits 138653 138861 +208
- Misses 56143 56198 +55
- Partials 12609 12648 +39
@coderabbitai full review
✅ Actions performed
Full review triggered.
Walkthrough
This PR extends the OSV processor to support RHEL (Red Hat Enterprise Linux) feeds alongside the existing Ubuntu functionality. A new
Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/osv-processor/main.go`:
- Around line 660-698: The dedupe uses the raw key (seen[rhelVer][vulnKey{pkg:
packageName, cve: cveID}]) before calling transformVuln, so normalized package
names/CVEs or modified vuln data get dropped; move the de-duplication check to
after calling transformVuln and computing vulnToUse and the transformed pkg
(i.e., perform seen lookup/insert using vulnKey{pkg: pkg, cve: vulnToUse.CVE}),
and when a duplicate is detected compare the existing recorded ProcessedVuln in
artifacts[rhelVer].Vulnerabilities[pkg] to vulnToUse and handle conflicts
explicitly (log or merge) instead of silently skipping to ensure differing
Fixed/Versions are not lost; update references to seen, vulnKey, transformVuln,
vulnToUse, artifacts and RHELArtifactData.Vulnerabilities accordingly.
- Around line 734-757: Function extractCVEIDs currently returns CVEs only from
Upstream or, as a fallback, from Related/ID; change it to build a union across
all supported fields (osv.Upstream, osv.Related, and osv.ID) so no CVEs are
dropped. Iterate all three sources, add any string starting with "CVE-" to the
result while deduplicating (use a map/set keyed by the CVE string) before
returning the slice. Keep the function name extractCVEIDs and ensure the
returned order is stable (e.g., append in the order Upstream, Related, then ID
if not already present).
📒 Files selected for processing (2)
- cmd/osv-processor/main.go
- cmd/osv-processor/main_test.go
for _, cveID := range cveIDs {
	// Deduplicate: same CVE+package can appear in baseos, appstream, crb
	if seen[rhelVer] == nil {
		seen[rhelVer] = make(map[vulnKey]struct{})
	}
	key := vulnKey{pkg: packageName, cve: cveID}
	if _, exists := seen[rhelVer][key]; exists {
		continue
	}
	seen[rhelVer][key] = struct{}{}

	vuln := ProcessedVuln{
		CVE:        cveID,
		Published:  osvData.Published,
		Modified:   osvData.Modified,
		Introduced: introduced,
		Fixed:      fixed,
		Versions:   affected.Versions,
	}

	packages, modifiedVuln := transformVuln(packageName, cveID, &vuln)
	if packages == nil {
		continue
	}
	vulnToUse := &vuln
	if modifiedVuln != nil {
		vulnToUse = modifiedVuln
	}

	for _, pkg := range packages {
		if _, exists := artifacts[rhelVer]; !exists {
			artifacts[rhelVer] = &RHELArtifactData{
				SchemaVersion:   "1.0",
				RHELVersion:     rhelVer,
				Vulnerabilities: make(map[string][]ProcessedVuln),
			}
		}
		artifacts[rhelVer].Vulnerabilities[pkg] = append(artifacts[rhelVer].Vulnerabilities[pkg], *vulnToUse)
	}
The dedupe key doesn't match the record you actually emit.
seen is populated from the raw packageName/cveID pair before transformVuln runs, but the artifact is written with the transformed package/CVE. That means normalization happens after the duplicate check, and a later entry with different Fixed/Versions data is dropped purely because it was encountered second. Deduplicate on the emitted pkg + vulnToUse.CVE instead, and make conflicts explicit instead of silently picking one.
Suggested direction
-	key := vulnKey{pkg: packageName, cve: cveID}
-	if _, exists := seen[rhelVer][key]; exists {
-		continue
-	}
-	seen[rhelVer][key] = struct{}{}
-
 	vuln := ProcessedVuln{
 		CVE:        cveID,
 		Published:  osvData.Published,
 		Modified:   osvData.Modified,
 		Introduced: introduced,
 		Fixed:      fixed,
 		Versions:   affected.Versions,
 	}
@@
 	vulnToUse := &vuln
 	if modifiedVuln != nil {
 		vulnToUse = modifiedVuln
 	}
 	for _, pkg := range packages {
+		key := vulnKey{pkg: pkg, cve: vulnToUse.CVE}
+		if _, exists := seen[rhelVer][key]; exists {
+			continue
+		}
+		seen[rhelVer][key] = struct{}{}
+
 		if _, exists := artifacts[rhelVer]; !exists {
 			artifacts[rhelVer] = &RHELArtifactData{
 				SchemaVersion:   "1.0",
 				RHELVersion:     rhelVer,
 				Vulnerabilities: make(map[string][]ProcessedVuln),
📝 Committable suggestion
for _, cveID := range cveIDs {
	// Deduplicate: same CVE+package can appear in baseos, appstream, crb
	if seen[rhelVer] == nil {
		seen[rhelVer] = make(map[vulnKey]struct{})
	}
	vuln := ProcessedVuln{
		CVE:        cveID,
		Published:  osvData.Published,
		Modified:   osvData.Modified,
		Introduced: introduced,
		Fixed:      fixed,
		Versions:   affected.Versions,
	}
	packages, modifiedVuln := transformVuln(packageName, cveID, &vuln)
	if packages == nil {
		continue
	}
	vulnToUse := &vuln
	if modifiedVuln != nil {
		vulnToUse = modifiedVuln
	}
	for _, pkg := range packages {
		key := vulnKey{pkg: pkg, cve: vulnToUse.CVE}
		if _, exists := seen[rhelVer][key]; exists {
			continue
		}
		seen[rhelVer][key] = struct{}{}
		if _, exists := artifacts[rhelVer]; !exists {
			artifacts[rhelVer] = &RHELArtifactData{
				SchemaVersion:   "1.0",
				RHELVersion:     rhelVer,
				Vulnerabilities: make(map[string][]ProcessedVuln),
			}
		}
		artifacts[rhelVer].Vulnerabilities[pkg] = append(artifacts[rhelVer].Vulnerabilities[pkg], *vulnToUse)
	}
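The suggestion above moves the duplicate check but still skips later entries silently. One possible way to make conflicts explicit, as the review asks, is to remember the emitted record for each key and compare it against any later duplicate. The sketch below is a standalone illustration under assumptions: `processedVuln` is a cut-down, hypothetical mirror of `ProcessedVuln` with string fields, `deduper` is an invented helper, and the CVE IDs are placeholders.

```go
package main

import (
	"fmt"
	"reflect"
)

// vulnKey mirrors the key shape used in the review: emitted package + CVE.
type vulnKey struct{ pkg, cve string }

// processedVuln is a simplified, hypothetical stand-in for ProcessedVuln.
type processedVuln struct {
	CVE      string
	Fixed    string
	Versions []string
}

// deduper records the first emitted record per key and notes later
// duplicates whose data differs, instead of dropping them silently.
type deduper struct {
	seen      map[vulnKey]processedVuln
	conflicts []string
}

func newDeduper() *deduper {
	return &deduper{seen: make(map[vulnKey]processedVuln)}
}

// add reports whether v should be emitted for pkg. A duplicate with
// different contents is recorded in d.conflicts and not emitted.
func (d *deduper) add(pkg string, v processedVuln) bool {
	key := vulnKey{pkg: pkg, cve: v.CVE}
	prev, exists := d.seen[key]
	if !exists {
		d.seen[key] = v
		return true
	}
	if !reflect.DeepEqual(prev, v) {
		d.conflicts = append(d.conflicts,
			fmt.Sprintf("%s/%s: kept fixed=%q, dropped fixed=%q", pkg, v.CVE, prev.Fixed, v.Fixed))
	}
	return false
}

func main() {
	d := newDeduper()
	fmt.Println(d.add("openssl", processedVuln{CVE: "CVE-0000-0001", Fixed: "1.0-2"})) // first sighting
	fmt.Println(d.add("openssl", processedVuln{CVE: "CVE-0000-0001", Fixed: "1.0-2"})) // identical duplicate
	fmt.Println(d.add("openssl", processedVuln{CVE: "CVE-0000-0001", Fixed: "1.0-3"})) // conflicting duplicate
	fmt.Println(d.conflicts)
}
```

Whether a conflict should be logged, merged, or treated as an error is a policy choice this sketch leaves open; it only surfaces the disagreement. Note that `reflect.DeepEqual` treats a nil slice and an empty slice as different, which may or may not be desirable for `Versions`.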
// extractCVEIDs returns all CVE IDs from an OSV entry.
// RHEL advisories list CVEs in the "upstream" field (same as Ubuntu).
func extractCVEIDs(osv *OSVData) []string {
	var cves []string
	for _, upstream := range osv.Upstream {
		if strings.HasPrefix(upstream, "CVE-") {
			cves = append(cves, upstream)
		}
	}
	// Fallback: check Related field
	if len(cves) == 0 {
		for _, related := range osv.Related {
			if strings.HasPrefix(related, "CVE-") {
				cves = append(cves, related)
			}
		}
	}
	// Fallback: check ID itself
	if len(cves) == 0 {
		if strings.HasPrefix(osv.ID, "CVE-") {
			cves = append(cves, osv.ID)
		}
	}
	return cves
extractCVEIDs is dropping valid CVEs from mixed-field advisories.
The helper says it returns all CVE IDs, but Related and ID are only consulted when Upstream produced none. An advisory with Upstream=["CVE-1"] and Related=["CVE-2"] will emit only CVE-1, so the generated RHEL artifact is incomplete. Build a union across all supported fields and dedupe as you append.
Suggested fix
 func extractCVEIDs(osv *OSVData) []string {
 	var cves []string
+	seen := make(map[string]struct{})
+	add := func(id string) {
+		if !strings.HasPrefix(id, "CVE-") {
+			return
+		}
+		if _, ok := seen[id]; ok {
+			return
+		}
+		seen[id] = struct{}{}
+		cves = append(cves, id)
+	}
+
 	for _, upstream := range osv.Upstream {
-		if strings.HasPrefix(upstream, "CVE-") {
-			cves = append(cves, upstream)
-		}
+		add(upstream)
 	}
-	// Fallback: check Related field
-	if len(cves) == 0 {
-		for _, related := range osv.Related {
-			if strings.HasPrefix(related, "CVE-") {
-				cves = append(cves, related)
-			}
-		}
+
+	for _, related := range osv.Related {
+		add(related)
 	}
-	// Fallback: check ID itself
-	if len(cves) == 0 {
-		if strings.HasPrefix(osv.ID, "CVE-") {
-			cves = append(cves, osv.ID)
-		}
-	}
+
+	add(osv.ID)
 	return cves
 }
📝 Committable suggestion
// extractCVEIDs returns all CVE IDs from an OSV entry.
// RHEL advisories list CVEs in the "upstream" field (same as Ubuntu).
func extractCVEIDs(osv *OSVData) []string {
	var cves []string
	seen := make(map[string]struct{})
	add := func(id string) {
		if !strings.HasPrefix(id, "CVE-") {
			return
		}
		if _, ok := seen[id]; ok {
			return
		}
		seen[id] = struct{}{}
		cves = append(cves, id)
	}
	for _, upstream := range osv.Upstream {
		add(upstream)
	}
	for _, related := range osv.Related {
		add(related)
	}
	add(osv.ID)
	return cves
}
Resolves #43183
Summary
Extends the `osv-processor` CI tool to generate RHEL OSV artifacts from Red Hat's GCS-published vulnerability data. Adds a `--platform rhel` flag that processes `Red Hat:enterprise_linux` ecosystem entries, collapses repository/variant suffixes (appstream, baseos, server, workstation, etc.) to major version, and deduplicates CVE+package pairs across ecosystems.
How to test locally
Download the Red Hat OSV feed from GCS and run the processor:
Expected output (19,084 advisories, ~4 seconds):
Related
`--platform rhel` step to `generate-cve.yml`)
Summary by CodeRabbit
New Features
`--platform` flag to select between Ubuntu and RHEL processing modes
Tests