@michaelkedar
Member

Changed all(?) the ndb calls used throughout the API handlers to use the async versions, and rewrote a few functions so they no longer block on futures when they don't need to (particularly when converting the Bug to a proto and populating the related/alias/upstream fields of those protos).
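
A rough sketch of the "don't block on each future" idea, using only the standard library rather than ndb. `concurrent.futures` stands in for ndb futures here, and `fetch_aliases_async`, `build_blocking`, and `build_overlapped` are hypothetical names for illustration, not functions from this PR:

```python
from concurrent.futures import ThreadPoolExecutor
import time

_pool = ThreadPoolExecutor(max_workers=8)


def fetch_aliases_async(bug_id):
    """Hypothetical stand-in for an async datastore lookup; returns a Future."""
    def work():
        time.sleep(0.01)  # simulated network latency
        return [f"{bug_id}-alias"]
    return _pool.submit(work)


def build_blocking(bug_ids):
    """Waits for each lookup before starting the next one."""
    out = []
    for bug_id in bug_ids:
        aliases = fetch_aliases_async(bug_id).result()
        out.append({"id": bug_id, "aliases": aliases})
    return out


def build_overlapped(bug_ids):
    """Starts every lookup first, then collects results, so the waits overlap."""
    futures = [fetch_aliases_async(bug_id) for bug_id in bug_ids]
    return [{"id": bug_id, "aliases": future.result()}
            for bug_id, future in zip(bug_ids, futures)]
```

Both functions return the same data; the second one only changes when the waiting happens, which is where the gain for queries with many results would come from.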

Testing on my private instance, this should be a 3-5x speedup for queries with lots of vulnerabilities.

I've also made the batch queries properly set the modified time from the alias/upstream group (which might actually be a tiny performance loss) to ensure the modified time is always the same for the batch query and the get by ID.
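
To illustrate the consistency point, here is a minimal sketch that assumes the effective modified time is the latest timestamp among the bug itself and its alias/upstream groups. That combining rule and the name `effective_modified` are assumptions for illustration; the description above does not spell out how the value is derived:

```python
from datetime import datetime, timezone


def effective_modified(bug_modified,
                       alias_group_modified=None,
                       upstream_group_modified=None):
    """Return the latest timestamp among the bug and any groups it belongs to.

    Assumption: a group timestamp of None means the bug is not in that group.
    """
    candidates = [bug_modified]
    for group_time in (alias_group_modified, upstream_group_modified):
        if group_time is not None:
            candidates.append(group_time)
    return max(candidates)


bug_time = datetime(2025, 7, 1, tzinfo=timezone.utc)
alias_time = datetime(2025, 7, 5, tzinfo=timezone.utc)
print(effective_modified(bug_time, alias_group_modified=alias_time))
```

If both the batch query and the get by ID go through one helper like this, they cannot disagree on the modified time, at the cost of the extra group lookups.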

I've added @ndb.synctasklet to the RPC handlers so we can use yield instead of .result(), mainly to discourage use of .result() (calling .result() inside a tasklet can cause stack overflows).
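
A toy illustration of the `yield` style, written against the standard library so it runs without Datastore. `run_generator_sync` is a hypothetical decorator that only imitates the shape of a sync tasklet wrapper; it is not how ndb implements `@ndb.synctasklet`, and `lookup_async` and `get_vuln_by_id` are made-up names:

```python
from concurrent.futures import ThreadPoolExecutor
import functools

_pool = ThreadPoolExecutor(max_workers=4)


def run_generator_sync(func):
    """Toy driver: runs a generator that yields futures and returns its result.

    All blocking happens in this one loop, so the decorated function never
    calls .result() itself. Errors raised by a future propagate to the caller
    rather than being thrown into the generator (kept simple on purpose).
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        try:
            future = next(gen)  # run until the first yield
            while True:
                # Resume the generator with the finished value.
                future = gen.send(future.result())
        except StopIteration as stop:
            return stop.value  # the generator's return value
    return wrapper


def lookup_async(vuln_id):
    """Hypothetical async lookup returning a Future."""
    return _pool.submit(lambda: {"id": vuln_id})


@run_generator_sync
def get_vuln_by_id(vuln_id):
    vuln = yield lookup_async(vuln_id)  # yield instead of .result()
    return vuln
```

The handler body reads like straight-line code, and the only place that waits on a future is the driver.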

another-rex previously approved these changes Jul 7, 2025
Contributor

@another-rex another-rex left a comment

RIP types, all ndb.Futures now. LGTM, great fix!

return vulnerability_pb2.Vulnerability(id=self.id(), modified=modified)

def to_vulnerability(self,
include_source=False,
Contributor

Maybe something better suited to a follow-up PR, but I would suggest removing all these flags and forcing callers to use the async version if they want to include stuff.

return vulnerability_pb2.Vulnerability(id=self.id(), modified=modified)

@ndb.tasklet
def to_vulnerability_minimal_async(self,
Contributor

Should we also convert to_vulnerability_minimal to just call the async version and then block on it, so they both return the same data?
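
A minimal sketch of that wrapper pattern, with `concurrent.futures` standing in for ndb futures. The function names echo the ones in the diff, but these are free functions with hypothetical parameters, not the actual `Bug` methods:

```python
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=2)


def to_vulnerability_minimal_async(vuln_id, modified):
    """Hypothetical async version: returns a Future for the minimal record."""
    return _pool.submit(lambda: {"id": vuln_id, "modified": modified})


def to_vulnerability_minimal(vuln_id, modified):
    """Sync version: delegates to the async one and blocks on its result."""
    return to_vulnerability_minimal_async(vuln_id, modified).result()
```

With a single implementation behind both entry points, the sync and async paths cannot drift apart in what they return.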

Member Author

I might just leave it for now, since it's kind of the opposite of your other comment.
(I don't think to_vulnerability_minimal() is even used anymore anyway)

michaelkedar added a commit that referenced this pull request Jul 8, 2025
Bump the max concurrent requests from 5 to 10 for the API backend, to
help with container instance scaling.
This is not for the batch query backend, which still is just 1.

Hopefully this won't have a notable performance impact. I don't really
want to change this and #3638 at the same time, since I want to be able
to measure the individual impacts of both. We could probably push this
out with the release this week, and let #3638 stage until next week.
@michaelkedar michaelkedar merged commit 1f7c419 into google:master Jul 8, 2025
13 checks passed
michaelkedar added a commit to michaelkedar/osv.dev that referenced this pull request Jul 10, 2025
4 participants