feat(nvd): use go to upload NVD conversion to gcs upon conversion#5099
jess-lowe wants to merge 24 commits into google:master
Conversation
We should make the gcs-tools repo generic to only uploading to GCS, but we shouldn't put CVE specific logic into here.
If uploading to GCS is going to take a while, I would even put the multithreading / concurrency logic in here.
E.g. provide a function that will spin up X number of workers, and a "gcs client" that just contains a channel.
Other code can pass the client to their code to upload.
Probably for a separate PR though.
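To make the suggestion above concrete, here is a minimal sketch of what such a worker pool might look like. Every name in it (`Object`, `UploadFunc`, `Client`, `NewClient`, `runDemo`) is hypothetical and not part of the actual gcs-tools package, and the upload step is an injected function rather than a real GCS call, so the sketch runs without credentials. The object names are made up for illustration.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Object is one pending upload: a destination object name and its contents.
// (Hypothetical type, for illustration only.)
type Object struct {
	Name string
	Data []byte
}

// UploadFunc performs a single upload. Real code would wrap a GCS write here;
// it is injected so the pool itself stays storage-agnostic.
type UploadFunc func(ctx context.Context, obj Object) error

// Client is the "gcs client that just contains a channel" from the comment.
type Client struct {
	jobs chan Object
	wg   sync.WaitGroup

	mu   sync.Mutex
	errs []error
}

// NewClient starts `workers` goroutines that drain the jobs channel.
func NewClient(ctx context.Context, workers int, upload UploadFunc) *Client {
	c := &Client{jobs: make(chan Object)}
	for i := 0; i < workers; i++ {
		c.wg.Add(1)
		go func() {
			defer c.wg.Done()
			for obj := range c.jobs {
				if err := upload(ctx, obj); err != nil {
					c.mu.Lock()
					c.errs = append(c.errs, fmt.Errorf("upload %s: %w", obj.Name, err))
					c.mu.Unlock()
				}
			}
		}()
	}
	return c
}

// Upload queues an object; it blocks while every worker is busy.
func (c *Client) Upload(obj Object) { c.jobs <- obj }

// Close stops accepting work, waits for in-flight uploads, and returns
// any errors the workers collected.
func (c *Client) Close() []error {
	close(c.jobs)
	c.wg.Wait()
	return c.errs
}

// runDemo pushes n fake records through the pool using an in-memory
// stand-in for GCS and reports how many distinct objects were stored.
func runDemo(n, workers int) (int, []error) {
	var mu sync.Mutex
	stored := map[string][]byte{}
	fake := func(_ context.Context, obj Object) error {
		mu.Lock()
		defer mu.Unlock()
		stored[obj.Name] = obj.Data
		return nil
	}

	c := NewClient(context.Background(), workers, fake)
	for i := 0; i < n; i++ {
		c.Upload(Object{Name: fmt.Sprintf("nvd/CVE-0000-%04d.json", i), Data: []byte("{}")})
	}
	errs := c.Close()
	return len(stored), errs
}

func main() {
	n, errs := runDemo(10, 4)
	fmt.Println("stored:", n, "errors:", len(errs))
}
```

Callers only ever see `Upload` and `Close`, so converter code would not need to know how many workers exist or how uploads are performed. The unbuffered channel gives natural backpressure; a buffered channel would be a reasonable alternative if producers should not block.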
> We should make the gcs-tools repo generic to only uploading to GCS, but we shouldn't put CVE specific logic into here.
Moved these into their own thing in conversion/writer.
> If uploading to GCS is going to take a while, I would even put the multithreading / concurrency logic in here. E.g. provide a function that will spin up X number of workers, and a "gcs client" that just contains a channel.
> Other code can pass the client to their code to upload.
> Probably for a separate PR though.
Uploading vulnerability records is too nuanced for that, so it has its own thing in writer.VulnWorker. With the NVD data, though, the upload will happen in the same thread that converts the record.
…sv.dev into refactor/nvd-use-gcs
The nvd-cve-osv cron job doesn't seem to be finishing successfully: it is currently taking forever to upload and do threshold checks. This should hopefully speed things up while waiting for #5099.
another-rex left a comment
Nice, mostly looks good. Have you tested it locally and seen how much faster it is compared to the script? (Probably not a big impact here, since locally we have a lot of threads compared to the cronjob.)
This PR introduces support for immediately uploading NVD conversion records to GCS instead of saving them locally and then syncing, leveraging helper functions discussed in #4984.
Additionally, it refactors the NVD converter to separate record generation from output handling and reorganizes the project's upload and GCS utilities.
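As a rough illustration of that separation (not the code in this PR), the sketch below has a pure conversion function return its records while a small `Writer` interface decides where they go. The `Vulnerability` and `Metrics` structs are simplified stand-ins for the real types, and `convert`, `Writer`, `DiskWriter`, `MemWriter`, `emit`, and the `.metrics.json` suffix are all hypothetical names chosen for this example. `MemWriter` stands in for a GCS-backed writer so the example runs without credentials.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path"
	"path/filepath"
)

// Simplified stand-ins for the real record types (assumption).
type Vulnerability struct {
	ID      string `json:"id"`
	Summary string `json:"summary,omitempty"`
}

type Metrics struct {
	Notes []string `json:"notes"`
}

// convert is pure: it builds and returns the records and performs no I/O.
func convert(cveID, description string) (*Vulnerability, *Metrics) {
	m := &Metrics{Notes: []string{}}
	if description == "" {
		m.Notes = append(m.Notes, "no description")
	}
	return &Vulnerability{ID: cveID, Summary: description}, m
}

// Writer is the output side; the converter never needs to know which one it has.
type Writer interface {
	Write(name string, data []byte) error
}

// DiskWriter writes records under a local directory.
type DiskWriter struct{ Dir string }

func (w DiskWriter) Write(name string, data []byte) error {
	p := filepath.Join(w.Dir, name)
	if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
		return err
	}
	return os.WriteFile(p, data, 0o644)
}

// MemWriter keeps objects in a map, keyed by prefix + name, as a stand-in
// for a bucket-backed writer.
type MemWriter struct {
	Prefix  string
	Objects map[string][]byte
}

func (w *MemWriter) Write(name string, data []byte) error {
	w.Objects[path.Join(w.Prefix, name)] = data
	return nil
}

// emit serialises both records and hands them to whichever Writer was chosen.
func emit(w Writer, v *Vulnerability, m *Metrics) error {
	vb, err := json.Marshal(v)
	if err != nil {
		return err
	}
	if err := w.Write(v.ID+".json", vb); err != nil {
		return err
	}
	mb, err := json.Marshal(m)
	if err != nil {
		return err
	}
	return w.Write(v.ID+".metrics.json", mb)
}

// runConversionDemo converts one made-up record and emits it to the in-memory writer.
func runConversionDemo() (map[string][]byte, error) {
	w := &MemWriter{Prefix: "nvd", Objects: map[string][]byte{}}
	v, m := convert("CVE-0000-0001", "example description")
	if err := emit(w, v, m); err != nil {
		return nil, err
	}
	return w.Objects, nil
}

func main() {
	objs, err := runConversionDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for name := range objs {
		fmt.Println(name)
	}
}
```

With this shape, switching between local output and direct upload is a matter of which `Writer` is constructed at startup, which is the kind of choice a flag such as `-upload-to-gcs` could drive.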
Key Changes
NVD Converter
- `nvd.CVEToOSV` now returns the `Vulnerability` and `Metrics` objects instead of writing them to disk directly. This separates the conversion logic from the I/O handling.
- New `-upload-to-gcs`, `-output-bucket`, and `-gcs-prefix` flags on the NVD converter tool support direct streaming to GCS.
Package Reorganization & Utilities
- `upload` package: Relocated `vulnfeeds/upload` to `vulnfeeds/conversion/writer` to better fit the new output handling structure.
- `gcs-tools` package: Added a general GCS utility package in `vulnfeeds/gcs-tools` providing functions like `UploadToGCS`, `UploadFile`, and `DownloadBucket`.
Other Converters
- `combine-to-osv` and other converters (Alpine, Debian, etc.) now use the new `writer` package instead of the old `upload` package.
Why this is needed