
Fixed nondeterministic CPE matching when multiple CPE candidates share the same product name #41649

Merged
getvictor merged 3 commits into main from victor/39899-deterministic-cpe-matching on Mar 17, 2026

Conversation

@getvictor
Member

@getvictor getvictor commented Mar 13, 2026

Related issue: Resolves #39899

This PR fixes the determinism issue by ordering the query results; however, it does not necessarily fix the correctness issue. A separate bug has been opened for that: #41644

That's why some of the changes in cpe_test.go may look incorrect. In reality, the previous results were purely coincidental (they depended on insert order).
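
To illustrate why ordering removes the dependence on insert order, here is a minimal sketch rather than the code from this PR. The `cpeRow` type, the helper names, and the vendor values `vendor_a` and `vendor_b` are hypothetical; the in-memory sort only imitates what an `ORDER BY vendor, product` clause would do in the database.

```go
package main

import (
	"fmt"
	"sort"
)

// cpeRow is a hypothetical stand-in for one row returned by a CPE query.
type cpeRow struct {
	Vendor  string
	Product string
}

// sortRows orders rows by (vendor, product), imitating
// ORDER BY vendor, product on the database side.
func sortRows(rows []cpeRow) {
	sort.SliceStable(rows, func(i, j int) bool {
		if rows[i].Vendor != rows[j].Vendor {
			return rows[i].Vendor < rows[j].Vendor
		}
		return rows[i].Product < rows[j].Product
	})
}

// firstRow returns the "first row wins" pick after sorting a copy of rows.
// The boolean is false when there are no rows to pick from.
func firstRow(rows []cpeRow) (cpeRow, bool) {
	if len(rows) == 0 {
		return cpeRow{}, false
	}
	sorted := append([]cpeRow(nil), rows...) // leave the caller's slice untouched
	sortRows(sorted)
	return sorted[0], true
}

func main() {
	// The same two rows in two different insert orders.
	a := []cpeRow{{"vendor_b", "line"}, {"vendor_a", "line"}}
	b := []cpeRow{{"vendor_a", "line"}, {"vendor_b", "line"}}

	ra, _ := firstRow(a)
	rb, _ := firstRow(b)
	fmt.Println(ra == rb) // the pick no longer depends on insert order
}
```

Note that the sorted pick is stable but not necessarily the right vendor, which matches the caveat above that correctness is tracked separately in #41644.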

Checklist for submitter

  • Changes file added for user-visible changes in changes/, orbit/changes/ or ee/fleetd-chrome/changes.

Testing

  • Added/updated automated tests
  • QA'd all new/changed functionality manually

Summary by CodeRabbit

  • Bug Fixes
    • Fixed nondeterministic CPE matching when multiple candidates share the same product name. CPE selection is now deterministic and prioritizes matches based on vendor alignment with the software being analyzed.

@getvictor
Member Author

@coderabbitai full review

@coderabbitai
Contributor

coderabbitai Bot commented Mar 13, 2026

✅ Actions performed

Full review triggered.

@coderabbitai
Contributor

coderabbitai Bot commented Mar 13, 2026

Walkthrough

This pull request addresses nondeterministic CPE matching for the Line messaging app. Changes include adding an ORDER BY clause (vendor, product) to multiple CPE database queries to ensure consistent row ordering. A new helper function checks if a software vendor appears within a CPE vendor field. The CPE selection logic is refactored to collect all matching candidates and apply a vendor-match heuristic to select the best one. Additional test cases and test data validate the vendor-matching behavior and deterministic ordering.
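
One plausible reading of the selection step described in this walkthrough is sketched below. It is an assumption-laden illustration, not the merged implementation: `cpeItem`, `pickBest`, and the `prefer` callback are hypothetical names, and the candidates are assumed to be already filtered to matches and already ordered by (vendor, product).

```go
package main

import "fmt"

// cpeItem is a hypothetical, simplified CPE candidate.
type cpeItem struct {
	Vendor  string
	Product string
}

// pickBest chooses one candidate from an already ordered list.
// It returns the first candidate accepted by prefer (for example a
// vendor-match heuristic) and falls back to the first candidate
// overall when none is preferred. The boolean is false when the
// list is empty.
func pickBest(candidates []cpeItem, prefer func(cpeItem) bool) (cpeItem, bool) {
	if len(candidates) == 0 {
		return cpeItem{}, false
	}
	for _, c := range candidates {
		if prefer(c) {
			return c, true
		}
	}
	return candidates[0], true
}

func main() {
	// Two candidates that share a product name, ordered by vendor.
	sorted := []cpeItem{{"vendor_a", "line"}, {"vendor_b", "line"}}

	// Hypothetical heuristic: the software under analysis comes from vendor_b.
	best, _ := pickBest(sorted, func(c cpeItem) bool { return c.Vendor == "vendor_b" })
	fmt.Println(best.Vendor)

	// Without any vendor signal, the ordered first candidate wins.
	fallback, _ := pickBest(sorted, func(cpeItem) bool { return false })
	fmt.Println(fallback.Vendor)
}
```

Passing the heuristic as a function keeps the tiebreak rule separate from the selection loop, so a stricter vendor comparison could be swapped in without touching the fallback behavior.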

Possibly related PRs

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ⚠️ Warning | The PR description includes a related issue reference and checked items from the testing section, but lacks required detail and completeness across several mandatory checklist items. | The description should provide more detail on input validation, SQL injection prevention, backward compatibility checks, database migration considerations, and confirm all relevant checklist items have been completed. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly and concisely summarizes the main change: fixing nondeterministic CPE matching when multiple candidates share the same product name. |
| Linked Issues check | ✅ Passed | The PR changes implement a deterministic CPE matching solution by adding ORDER BY clauses and vendor-match heuristics, directly addressing issue #39899's requirement for consistent CPE selection across runs. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to CPE matching logic and testing: query ordering, vendor-match helper, deprecation handling, and test additions are all aligned with fixing nondeterministic CPE selection. |


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
server/vulnerabilities/nvd/cpe.go (1)

182-188: Consider lowercasing item.Vendor for defensive consistency.

The function lowercases software.Vendor but not item.Vendor. While CPE vendors in the NVD database are typically lowercase, explicitly lowercasing both sides would make this more robust.

♻️ Suggested improvement
 func cpeVendorMatchesSoftware(item *IndexedCPEItem, software *fleet.Software) bool {
 	sVendor := strings.ToLower(software.Vendor)
-	return sVendor != "" && strings.Contains(sVendor, item.Vendor)
+	return sVendor != "" && strings.Contains(sVendor, strings.ToLower(item.Vendor))
 }
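
The effect of lowercasing only one side can be seen in a small standalone sketch. The functions below take plain strings instead of `IndexedCPEItem` and `fleet.Software`, and the names `oneSided` and `bothSided` as well as the vendor strings are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// oneSided mirrors the comparison under review: only the software
// vendor is lowercased before the substring check.
func oneSided(softwareVendor, cpeVendor string) bool {
	s := strings.ToLower(softwareVendor)
	return s != "" && strings.Contains(s, cpeVendor)
}

// bothSided lowercases both operands, as the suggestion proposes.
func bothSided(softwareVendor, cpeVendor string) bool {
	s := strings.ToLower(softwareVendor)
	return s != "" && strings.Contains(s, strings.ToLower(cpeVendor))
}

func main() {
	// A CPE vendor stored with an uppercase letter defeats the one-sided check.
	fmt.Println(oneSided("Example Corp", "Example"))  // false
	fmt.Println(bothSided("Example Corp", "Example")) // true
}
```

If CPE vendors are always lowercase in practice, the two functions behave identically; the second form only guards against data that breaks that assumption.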
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/vulnerabilities/nvd/cpe.go` around lines 182 - 188,
cpeVendorMatchesSoftware currently lowercases software.Vendor but not
item.Vendor, which can cause mismatches; update the function
(cpeVendorMatchesSoftware) to compare lowercased values for both sides by
computing e.g. sVendor := strings.ToLower(software.Vendor) and iVendor :=
strings.ToLower(item.Vendor) and then use strings.Contains(sVendor, iVendor) so
the comparison is case-insensitive and more robust when matching
IndexedCPEItem.Vendor against fleet.Software.Vendor.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/vulnerabilities/nvd/cpe_test.go`:
- Line 1820: Add a Homebrew-to-CPython translation rule so Homebrew python
packages map to vendor "python" instead of Microsoft's python; create a rule
matching software.name /^python(@.*)?$/ with source "homebrew_packages" and a
filter that sets product:["python"] and vendor:["python"] (matching the JSON
snippet in the review) and add it alongside the existing Homebrew translation
rules used by the CPE translation logic referenced in cpe_test.go.
- Line 1341: The test shows Python package "requests" being resolved to
"jenkins:requests" (cpe:2.3:a:jenkins:requests...) due to alphabetical CPE
lookup; add a CPE translation rule that maps the PyPI package name "requests"
(or the ecosystem identifier used in translation code) to the correct CPE vendor
for the Python requests library so the produced CPE becomes
cpe:2.3:a:python:requests:...; update the translation rules data structure (the
one exercised by server/vulnerabilities/nvd/cpe_test.go) to include an explicit
mapping from "requests" (PyPI) -> vendor "python" to ensure deterministic
correct matching.

In `@server/vulnerabilities/nvd/cpe.go`:
- Around line 658-667: The code calls resolveDeprecatedCPE(db, results,
software) when hasDeprecatedMatches is true but passes the entire results slice
(including non-matching or irrelevant deprecated items); instead, build a
filtered slice containing only items that both cpeItemMatchesSoftware(...)
returned true for and are deprecated (preserving original ordering), and pass
that filtered slice to resolveDeprecatedCPE (keep using db and software); update
the block around hasDeprecatedMatches and the references to results so
resolveDeprecatedCPE only examines the relevant deprecatedMatches.
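
The filtering step described in this comment could look roughly like the sketch below. It is only an illustration under assumptions: `cpeItem`, its `Deprecated` field, `filterDeprecatedMatches`, and the `matches` callback (standing in for `cpeItemMatchesSoftware`) are hypothetical and do not reflect the real signatures in cpe.go.

```go
package main

import "fmt"

// cpeItem is a hypothetical, simplified CPE candidate.
type cpeItem struct {
	Vendor     string
	Product    string
	Deprecated bool
}

// filterDeprecatedMatches keeps only the items that are deprecated and
// accepted by matches, preserving their original order.
func filterDeprecatedMatches(results []cpeItem, matches func(cpeItem) bool) []cpeItem {
	var out []cpeItem
	for _, r := range results {
		if r.Deprecated && matches(r) {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	results := []cpeItem{
		{"vendor_a", "line", true},
		{"vendor_b", "line", false},
		{"vendor_c", "other", true},
	}
	// Hypothetical match rule: the product name must be "line".
	matchesLine := func(c cpeItem) bool { return c.Product == "line" }

	for _, c := range filterDeprecatedMatches(results, matchesLine) {
		fmt.Println(c.Vendor, c.Product)
	}
}
```

Only the filtered slice would then be handed to the deprecation resolver, so unrelated deprecated rows cannot influence the result.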


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9f9faf04-e094-4a53-8bfe-7ac9d08fe4b0

📥 Commits

Reviewing files that changed from the base of the PR and between c95e696 and d9eb3df.

📒 Files selected for processing (4)
  • changes/39899-deterministic-cpe-matching
  • server/vulnerabilities/nvd/cpe.go
  • server/vulnerabilities/nvd/cpe_test.go
  • server/vulnerabilities/nvd/testing_utils.go

Contributor

Copilot AI left a comment


Pull request overview

This PR aims to make NVD CPE matching deterministic when multiple CPE candidates match the same software (notably when multiple CPEs share a product name), reducing run-to-run variability in vulnerability detection.

Changes:

  • Adds ORDER BY to CPE candidate queries and updates selection logic to avoid nondeterministic “first row wins” behavior.
  • Introduces a vendor-based tiebreak helper and adds/updates tests to cover the deterministic behavior.
  • Adds a changelog entry for the bugfix.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| server/vulnerabilities/nvd/cpe.go | Adds query ordering + best-match selection logic and a vendor tiebreak helper. |
| server/vulnerabilities/nvd/cpe_test.go | Adds deterministic tie tests and updates integration expectations reflecting the new selection order. |
| server/vulnerabilities/nvd/testing_utils.go | Extends the test CPE dictionary with additional line product entries for ambiguity scenarios. |
| changes/39899-deterministic-cpe-matching | User-visible change note for deterministic CPE matching. |


@getvictor getvictor marked this pull request as ready for review March 13, 2026 16:27
@getvictor getvictor requested a review from a team as a code owner March 13, 2026 16:27
@codecov

codecov Bot commented Mar 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 66.40%. Comparing base (4e35de2) to head (633a261).
⚠️ Report is 111 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #41649      +/-   ##
==========================================
+ Coverage   66.36%   66.40%   +0.04%     
==========================================
  Files        2492     2498       +6     
  Lines      199603   200295     +692     
  Branches     8826     8826              
==========================================
+ Hits       132469   133015     +546     
- Misses      55160    55257      +97     
- Partials    11974    12023      +49     
| Flag | Coverage Δ |
| --- | --- |
| backend | 68.21% <100.00%> (+0.04%) ⬆️ |


Contributor

@ksykulev ksykulev left a comment


Not a big deal, but the commit message misspells "determinism" as "deteminism".

Comment thread server/vulnerabilities/nvd/cpe.go Outdated
// pass cpeItemMatchesSoftware.
func cpeVendorMatchesSoftware(item *IndexedCPEItem, software *fleet.Software) bool {
	sVendor := strings.ToLower(software.Vendor)
	return sVendor != "" && strings.Contains(sVendor, item.Vendor)
Contributor


Is this reasonable? Are there instances where just doing a strings.Contains might get us into trouble?

Maybe using word boundaries?

if sVendor == "" {
	return false
}

pattern := `\b` + regexp.QuoteMeta(item.Vendor) + `\b`
matched, _ := regexp.MatchString(pattern, sVendor)
return matched
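
To make the concern concrete, the sketch below compares a plain substring check with the word-boundary variant proposed here. The function names and vendor strings are hypothetical, and both functions take plain strings instead of the real `IndexedCPEItem` and `fleet.Software` types.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// containsMatch is the substring check under discussion.
func containsMatch(softwareVendor, cpeVendor string) bool {
	s := strings.ToLower(softwareVendor)
	return s != "" && strings.Contains(s, strings.ToLower(cpeVendor))
}

// wordMatch requires the CPE vendor to appear as a whole word in the
// software vendor, using \b on both sides of the quoted vendor.
func wordMatch(softwareVendor, cpeVendor string) bool {
	s := strings.ToLower(softwareVendor)
	if s == "" || cpeVendor == "" {
		return false
	}
	pattern := `\b` + regexp.QuoteMeta(strings.ToLower(cpeVendor)) + `\b`
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false // treat an uncompilable pattern as "no match"
	}
	return re.MatchString(s)
}

func main() {
	// "online" contains "line", so the substring check gives a false positive.
	fmt.Println(containsMatch("Online Services Inc.", "line")) // true
	fmt.Println(wordMatch("Online Services Inc.", "line"))     // false

	// A whole-word occurrence matches either way.
	fmt.Println(containsMatch("LINE Corporation", "line")) // true
	fmt.Println(wordMatch("LINE Corporation", "line"))     // true
}
```

Two caveats apply to the word-boundary form: in Go's regexp package `\b` is an ASCII word boundary, and the underscore counts as a word character, so a vendor embedded in an underscore-joined name would not match.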

@getvictor getvictor merged commit 3b43629 into main Mar 17, 2026
48 checks passed
@getvictor getvictor deleted the victor/39899-deterministic-cpe-matching branch March 17, 2026 12:22
@coderabbitai coderabbitai Bot mentioned this pull request Mar 24, 2026
4 tasks

Development

Successfully merging this pull request may close these issues.

Match CPEs is nondeterministic

3 participants