perf: optimize memory usage during Debian CVE conversion #5291
Merged
jess-lowe merged 2 commits into google:master on Apr 30, 2026
The `vulns.LoadAllCVEs` function previously loaded the entire NVD CVE dataset into memory by reading hundreds of thousands of CVE JSON files concurrently. This caused massive memory spikes and high garbage collection overhead during the Debian CVE conversion process.

This commit introduces a target-based filtering approach:
- Added `LoadTargetCVEs` which accepts a map of required CVE IDs.
- Modified the JSON parsing loop to only emit vulnerabilities that are present in the target list.
- Refactored `cmd/converters/debian/main.go` to extract the needed CVEs from the Debian Security Tracker data and pass them into `LoadTargetCVEs`.
- Preserved the existing `LoadAllCVEs` interface for backward compatibility.

Co-authored-by: jess-lowe <86962800+jess-lowe@users.noreply.github.com>
Ly-Joey
previously approved these changes
Apr 30, 2026
michaelkedar
previously approved these changes
Apr 30, 2026
The `vulns.LoadAllCVEs` function previously loaded the entire NVD CVE dataset into memory by reading hundreds of thousands of CVE JSON files concurrently. This caused massive memory spikes and high garbage collection overhead during the Debian and Alpine CVE conversion process.

This commit introduces a target-based filtering approach:
- Added `LoadTargetCVEs` which accepts a map of required CVE IDs.
- Modified the JSON parsing loop to only emit vulnerabilities that are present in the target list.
- Refactored `cmd/converters/debian/main.go` and `cmd/converters/alpine/main.go` to extract the needed CVEs from the respective security tracker data and pass them into `LoadTargetCVEs`.
- Preserved the existing `LoadAllCVEs` interface for backward compatibility.

Co-authored-by: jess-lowe <86962800+jess-lowe@users.noreply.github.com>
tobyhawker
approved these changes
Apr 30, 2026
another-rex
approved these changes
Apr 30, 2026
I noticed that the Debian CVE converter has been OOMing occasionally, likely due to the following:
The `vulns.LoadAllCVEs` function previously loaded the entire NVD CVE dataset into memory by reading hundreds of thousands of CVE JSON files concurrently. This caused massive memory spikes and high garbage collection overhead during the Debian CVE conversion process.

This commit introduces a target-based filtering approach:
- Added `LoadTargetCVEs` which accepts a map of required CVE IDs.
- Modified the JSON parsing loop to only emit vulnerabilities that are present in the target list.
- Refactored `cmd/converters/debian/main.go` to extract the needed CVEs from the Debian Security Tracker data and pass them into `LoadTargetCVEs`.
- Preserved the existing `LoadAllCVEs` interface for backward compatibility. (Alpine still uses it, I believe - I should probably fix this too)
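As a rough illustration of the converter-side step, here is one way the set of needed CVE IDs might be collected from tracker data before calling the filtered loader. The input shape (package name, then entry ID, then details) is an assumption of this sketch, and `trackerData` and `collectTargetCVEs` are hypothetical names, not identifiers from this PR.

```go
package main

import (
	"encoding/json"
	"strings"
)

// trackerData is an assumed shape for the tracker JSON:
// package name -> entry ID -> per-entry details (left unparsed here).
type trackerData map[string]map[string]json.RawMessage

// collectTargetCVEs returns the set of distinct CVE IDs referenced by any package.
// Entries whose ID does not start with "CVE-" are ignored.
func collectTargetCVEs(raw []byte) (map[string]struct{}, error) {
	var data trackerData
	if err := json.Unmarshal(raw, &data); err != nil {
		return nil, err
	}
	targets := make(map[string]struct{})
	for _, entries := range data {
		for id := range entries {
			if strings.HasPrefix(id, "CVE-") {
				targets[id] = struct{}{}
			}
		}
	}
	return targets, nil
}
```

Using `map[string]struct{}` as the set type deduplicates IDs that appear under several packages and gives constant-time membership checks in the parsing loop; the map type the PR actually uses may differ.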