@michaelkedar
Member

More database migration!
Reimplement the exporter in Go, reading from the GCS proto files instead of from the datastore Bugs.
I've made the whole thing more parallel and all in-memory, which should be a pretty decent performance improvement.

Testing is currently missing - we need a way to mock the GCS buckets.

@michaelkedar
Member Author

/gemini summary
/gemini review

@gemini-code-assist
Contributor

I've received your requests for a pull request summary and a code review. I will provide these shortly.

@michaelkedar
Member Author

Hm, I'm not sure if 'shortly' was the right word...

@michaelkedar
Member Author

/gemini summary
/gemini review

@gemini-code-assist
Contributor

Apologies for the delay, michaelkedar. Generating a comprehensive summary and a thorough code review for a pull request of this size can take some time. Your requests are still being processed, and I will provide the summary and review as soon as they are ready. Thank you for your patience!

@michaelkedar michaelkedar marked this pull request as ready for review October 20, 2025 00:54
Contributor

@another-rex another-rex left a comment

Nice!

"google.golang.org/api/iterator"
)

func main() {
Contributor

Please add a comment here explaining the graph of all the channels and workers and how they communicate with each other via the channels.

Member Author

Added above where the channels are created.
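
For readers without the diff open, the kind of channel-graph comment requested here might look like the sketch below. The topology is hypothetical and only loosely modelled on names that appear in this thread (a router feeding per-ecosystem workers that report to an output channel). `vuln`, `run`, and the channel names are assumptions, and context cancellation is left out to keep the graph easy to follow.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// Channel graph (hypothetical, for illustration):
//
//	producer --vulnCh--> router --workers[eco]--> worker(eco) --outCh--> collector
//
//   - vulnCh is closed by the producer once every item has been sent.
//   - The router creates one channel per ecosystem on first use and
//     closes all of them once vulnCh is drained.
//   - outCh is closed only after every worker has returned.

type vuln struct{ ecosystem, id string }

// run pushes vulns through the pipeline and returns one summary line
// per ecosystem, sorted for stable output.
func run(vulns []vuln) []string {
	vulnCh := make(chan vuln)
	outCh := make(chan string)

	// Producer.
	go func() {
		defer close(vulnCh)
		for _, v := range vulns {
			vulnCh <- v
		}
	}()

	// Router: fans out to per-ecosystem workers.
	go func() {
		workers := map[string]chan vuln{}
		var wg sync.WaitGroup
		for v := range vulnCh {
			ch, ok := workers[v.ecosystem]
			if !ok {
				ch = make(chan vuln)
				workers[v.ecosystem] = ch
				wg.Add(1)
				go func(eco string, in <-chan vuln) {
					defer wg.Done()
					var ids []string
					for v := range in {
						ids = append(ids, v.id)
					}
					// Finalising step: runs once the input is closed.
					outCh <- eco + ":" + strings.Join(ids, ",")
				}(v.ecosystem, ch)
			}
			ch <- v
		}
		for _, ch := range workers {
			close(ch)
		}
		wg.Wait()
		close(outCh)
	}()

	// Collector.
	var lines []string
	for line := range outCh {
		lines = append(lines, line)
	}
	sort.Strings(lines)
	return lines
}

func main() {
	fmt.Println(run([]vuln{{"PyPI", "A"}, {"Go", "B"}, {"PyPI", "C"}}))
}
```

Stating who closes each channel in the comment is the part that tends to matter most, since that is what decides when each stage shuts down.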


// Wait to receive an object, or be cancelled.
select {
case obj, ok = <-objectCh:
Contributor

What's the reason for using a select here instead of ranging directly over objectCh, as ecosystemRouter does?

Member Author

select {
case obj, ok = <-objectCh:
	// ...
case <-ctx.Done():
	return
}

The select lets the loop exit as soon as the context is cancelled; a plain range over objectCh would stay blocked until a value arrives or the channel is closed.
ecosystemRouter should probably follow this pattern too.


// Wait to receive a vulnerability, or be cancelled.
select {
case v, ok = <-w.ch:
Contributor

Hmm, feels like this should also just be a for loop over w.ch.
Then all the final writeCSV and writeZIP calls can go at the bottom of this function, after the loop.

Member Author

This is similar to #4197 (comment)

}
}

func writeCSV(ctx context.Context, path string, csvData [][]string, writeCh chan<- writeMsg) {
Contributor

Clarify that this is the modified_id CSV file.

Member Author

renamed to writeModifiedIDCSV

Contributor

@cuixq cuixq left a comment

can you add some high-level documentation to each Go file/function?

@michaelkedar
Member Author

> can you add some high-level documentation to each Go file/function?

done

cuixq previously approved these changes Oct 20, 2025
Contributor

@cuixq cuixq left a comment

LGTM - also let's add a README.md with documentation and instructions on how to run this locally.

writeVanir(ctx, vanirVulns, outCh)
}
logger.Info("ecosystem worker finished processing", slog.String("ecosystem", w.ecosystem))

Contributor

Break here and put the finalising code at the bottom of the func (same with the other loops)

Member Author

break-ed

@michaelkedar michaelkedar merged commit a4d1ad0 into google:master Oct 21, 2025
18 checks passed

4 participants