-
Notifications
You must be signed in to change notification settings - Fork 275
feat: reimplement the exporter in Go #4197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
/gemini summary |
|
I've received your requests for a pull request summary and a code review. I will provide these shortly. |
|
Hm, I'm not sure if 'shortly' was the right word... |
|
/gemini summary |
|
Apologies for the delay, |
another-rex
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
| "google.golang.org/api/iterator" | ||
| ) | ||
|
|
||
| func main() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment here explaining the graph of all the channels and workers and how they communicate with each other via the channels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added above where the channels are created.
go/cmd/exporter/downloader.go
Outdated
|
|
||
| // Wait to receive an object, or be cancelled. | ||
| select { | ||
| case obj, ok = <-objectCh: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for using a select here over directly looping objectCh like in ecosystemRouter for example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
select {
case obj, ok = <-objectCh:
// ...
case <- ctx.Done():
return
}Allows the loop to be cancelled if the context is cancelled (otherwise it'd block on the objectCh).
ecosystemRouter should probably also follow this pattern.
go/cmd/exporter/worker.go
Outdated
|
|
||
| // Wait to receive a vulnerability, or be cancelled. | ||
| select { | ||
| case v, ok = <-w.ch: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm feels like this should also just be a for loop over w.ch
Then all the final writeCSV and writeZIP functions can be put at the bottom of this function after the loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is similar to #4197 (comment)
go/cmd/exporter/worker.go
Outdated
| } | ||
| } | ||
|
|
||
| func writeCSV(ctx context.Context, path string, csvData [][]string, writeCh chan<- writeMsg) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clarify that this is the modified_id csv file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed to writeModifiedIDCSV
cuixq
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add some high-level documentation to each Go file/function?
done |
cuixq
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - also let's add a README.md with documentation as well as instructions how to run this locally.
go/cmd/exporter/worker.go
Outdated
| writeVanir(ctx, vanirVulns, outCh) | ||
| } | ||
| logger.Info("ecosystem worker finished processing", slog.String("ecosystem", w.ecosystem)) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Break here and put the finalising code at the bottom of the func (same with the other loops)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
break-ed
More database migration!
Reimplement the exporter in Go, reading from the GCS proto files instead of from the datastore Bugs.
I've made the whole thing more parallel and all in-memory, which should be a pretty decent performance improvement.
Testing is currently missing - we need a way to mock the GCS buckets.