
fix(NODE-4139): allow monitor to yield control #3243

Closed · wants to merge 1 commit
Conversation

@durran (Member) commented May 11, 2022

Description

Changes the monitor to process incoming hellos inside a setTimeout when the streaming protocol is enabled, so that it can yield control to other pending commands.

What is changing?

Adds a setTimeout in the callback after the hello command is run. On AWS Lambda, this has been shown to reduce processing time after a function has been idle for hours from over 20 seconds to ~400ms.
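
For context, a minimal sketch of the pattern being described (names invented for illustration, not the actual diff): the work for each streamed hello response is deferred with a zero-delay setTimeout so queued application commands get a chance to run in between.

// Minimal sketch only: defer per-hello processing to a macrotask so the
// event loop can service other pending callbacks first.
function onStreamedHello(hello, processHeartbeat) {
  setTimeout(() => {
    // runs on a later turn of the event loop, after other queued work
    processHeartbeat(hello);
  }, 0);
}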

Is there new documentation needed for these changes?

None

What is the motivation for this change?

NODE-4140

Double check the following

  • Ran npm run check:lint script
  • Self-review completed using the steps outlined here
  • PR title follows the correct format: <type>(NODE-xxxx)<!>: <description>
  • Changes are covered by tests
  • New TODOs have a related JIRA ticket

@durran durran changed the title fix(NODE-4140): allow monitor to yield control fix(NODE-4139): allow monitor to yield control May 11, 2022
new ServerHeartbeatStartedEvent(monitor.address)
);
start = now();
setTimeout(function () {
Contributor

Can we add a test for the control-yield behavior, or any test that captures the intended behavior here? The change itself seems straightforward enough; since nothing in our code is listening for this, I don't think this would have any sort of adverse impact on existing monitoring and monitoring-adjacent logic. But I want to make sure, if this is indeed solving the heartbeat issue, that we have some sort of documentation for posterity that states why this is important to have in a timeout. At a minimum I'd like a comment explaining why we want to yield control specifically here prior to returning, but preferably a regression (unit) test that demonstrates the important flow.

If a test is possible but may take a while, I don't mind it being in a follow-up PR if we're trying to expedite this PR specifically.

@dariakp added the Primary Review label (In Review with primary reviewer, not yet ready for team's eyes) May 11, 2022
@mbroadst (Member) commented May 11, 2022 (on the same diff lines):
I'm concerned this isn't the right place for the fix. I think the intuition here is that the receive buffer is full of incoming heartbeat responses when the lambda instance is "unfrozen", and so you end up with a thundering herd effect - trying to process too many of these messages at the same time results in event loop starvation.

I was initially confused that slightly deferring the emission of a monitoring event would fix this issue, since it's not used internally at all. Also, we emit a different event a few lines above on L274 so why wouldn't we also apply the fix there? (Note: I think there is actually a bug with the code as presented here, in that a started event can now appear after a succeeded event).

Here's what I think is going on:
Streamable hello streams responses to clients using the moreToCome flag. To ease the implementation at the call site, I implemented support for this by requeuing a provided callback each time moreToCome is true (I also did a really bad job of documenting this, so I would not be surprised if this behavior is unexpected to you!). If we indeed have a thundering herd of incoming hello messages, then I think it's plausible that the callback is being called in a hot loop, and starving the event loop. The fix here I think is to wrap the call to the callback in a setTimeout/process.nextTick (and maybe only do this IFF moreToCome is set), so that we are able to do additional work between streamed calls.

Contributor

Thanks for the explanation! I certainly wasn't aware of the moreToCome flag for the streaming hello protocol. Is the streaming hello the only server context where this is used?

Looking at this implementation more, I'm actually not sure how it's accomplishing the performance improvement, since the only thing being delayed is the emission of the SERVER_HEARTBEAT_STARTED events. If what you are describing is accurate and I understand it correctly (i.e., this callback is basically being called in a loop with no async breaks), I would expect all of those SERVER_HEARTBEAT_SUCCEEDED events to get fired in order, before any SERVER_HEARTBEAT_STARTED events, since the latter are now in a setTimeout, which would presumably be handled at the end of that loop. Moreover, I would expect the duration associated with each of the SUCCEEDED events to increase, since the start would no longer be reset. @durran, is this what we observe happening in the logs? The only extra processing associated with the event emission would be logging - so is that what we are trying to delay here, @durran? Am I reading this correctly?

I think, in that case, Matt's suggestion does make sense - we want to async-ify the calling of the handler for each event. I am still not fully following, however, how the "synthetic" event processing works:

  1. The comment above there says that the responseTo changes on each response - does it get set to the previous message's requestId?
  2. If we do put the callback in a timeout, won't it mean that we are still going to end up with a stack of callbacks that all get executed back to back on the next pass through the event loop? Essentially, we'd only be buying time between reading the "stampede" of hello messages and actually executing the handlers for the hello responses - is that what we want?
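
To make the ordering concern concrete, here is a toy illustration (not driver code, event names simplified): if only the started emission is deferred with setTimeout while the succeeded handling stays synchronous, a burst of streamed responses produces all of the succeeded events before any of the started events.

const { EventEmitter } = require('events');

const monitor = new EventEmitter();
monitor.on('started', () => console.log('started'));
monitor.on('succeeded', () => console.log('succeeded'));

// simulate a burst of three streamed hello responses handled in a tight loop
for (let i = 0; i < 3; i++) {
  setTimeout(() => monitor.emit('started'), 0); // deferred, as in this change
  monitor.emit('succeeded'); // still synchronous
}
// prints: succeeded x3, then started x3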

Contributor

Following that thought - if we already can delay the callback execution for moreToCome, could we detect if we have an unprocessed pile of moreToCome callbacks and only queue the latest? If we have queued a setTimeout, we could clear it if we have a more recent one incoming. Would that be too dangerous?
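
A rough sketch of that "only queue the latest" idea (names and structure invented, not tied to the actual connection code): keep a single pending timer and replace it whenever a newer moreToCome response arrives, so only the most recent callback actually runs.

// Hypothetical sketch only
let pendingHelloTimer = null;

function scheduleMoreToComeCallback(callback, result) {
  if (pendingHelloTimer !== null) {
    clearTimeout(pendingHelloTimer); // drop the older, not-yet-processed response
  }
  pendingHelloTimer = setTimeout(() => {
    pendingHelloTimer = null;
    callback(undefined, result);
  }, 0);
}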

Contributor

Alternatively, we could track time between invocations on the monitoring callback itself and return if it has already been recently invoked (similar to how we debounce the wake calls for non-streaming)
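
And a sketch of the time-based variant (threshold and helper names invented for illustration): skip handling a hello if one was handled very recently, similar in spirit to the non-streaming wake debounce.

// Hypothetical sketch only
let lastHandledAt = 0;

function debounceMonitorCallback(handleHello, minIntervalMS = 10) {
  return (error, hello) => {
    const now = Date.now();
    if (!error && now - lastHandledAt < minIntervalMS) return; // handled recently, skip
    lastHandledAt = now;
    handleHello(error, hello);
  };
}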

@mbroadst (Member) commented May 12, 2022

Is the streaming hello the only server context where this is used?

No, moreToCome is also supported on exhaust cursors, but drivers don't implement support for those today. You can learn more about this flag here.

Looking at this implementation more, I'm actually not sure how it's accomplishing the performance improvement...

That's right, you're describing the bug I mentioned in my comment. It should be fairly easy to prove visually that this is happening (if you're printing out messages on the events).

I am still not fully following, however, how the "synthetic" event processing works.

Each time the driver sends a request, it enqueues a callback in its response queue, waiting for a response whose responseTo correlates with the request's requestId. When the response comes in, we dequeue its operationDescription and call the callback stored therein. When the response indicates it has moreToCome, we immediately re-enqueue the callback, but with an expected responseTo (the "correlation id") of the response's requestId (this is an implementation detail of the server, most likely because the server implements streaming support by running a polling cursor server-side, so there are many request/responses, but they all happen on the server).
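
A simplified sketch of that correlation/requeue behaviour (structure invented for illustration, not the actual connection.ts code):

// Callbacks wait keyed by the id of the message they expect a response to;
// a moreToCome response re-registers the same callback under its own
// requestId, since the next streamed reply will carry that value in responseTo.
const operationDescriptions = new Map(); // correlation id -> operationDescription

function onMessage(message) {
  const description = operationDescriptions.get(message.responseTo);
  if (!description) return; // nothing is waiting on this response
  operationDescriptions.delete(message.responseTo);

  if (message.moreToCome) {
    operationDescriptions.set(message.requestId, description);
  }
  description.cb(undefined, message.documents[0]);
}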

If we do put the callback in a timeout, won't it mean that we are still going to end up with a stack of callbacks that all get executed back to back on the next pass through the event loop?

Yes, that's correct, but I also do think that's what you want. The stampede of messages isn't inherently something the driver can't handle; the problem is that we are starving the event loop and preventing the runtime from processing other messages, which is an "async sin". I suspect the reason the changes in this PR convey a performance boost is that:

  • the test application is printing out a string on serverHeartbeatEvent
  • every moreToCome request that comes in results in calling at least one console.log, so now you have event loop starvation while calling I/O

Following that thought - if we already can delay the callback execution for moreToCome, could we detect if we have an unprocessed pile of moreToCome callbacks and only queue the latest..

I would be nervous to try to do something like this. If this is a situation where the data is queued in kernel read buffers, then you can't make this decision in your driver code, since the driver needs to read the message from the TCP stream first. You don't know a) when you are about to receive a "herd", and b) when it stops.

we could track time between invocations on the monitoring callback itself and return if it has already been recently invoked

This is similarly dangerous. If you don't know when you'll receive the last one, you might debounce the callback and never "receive" the last hello.

I think the easiest thing to do here is to wrap the call to the callback with setTimeout/process.nextTick and let the node runtime do what it does best.

const result = operationDescription.fullResult ? message : message.documents[0];
if ('moreToCome' in message && message.moreToCome) {
    // defer the callback so the event loop can process other work between streamed responses
    setTimeout(() => callback(undefined, result), 0);
    return;
}

callback(undefined, result);

Contributor

@baileympearson not exactly

when the event loop enters a given phase, it will perform any operations specific to that phase, then execute callbacks in that phase's queue until the queue has been exhausted or the maximum number of callbacks has executed.

Once the poll queue is empty the event loop will check for timers whose time thresholds have been reached. If one or more timers are ready, the event loop will wrap back to the timers phase to execute those timers' callbacks.

https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/
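
For reference, a small self-contained example of the ordering that guide describes, run inside an I/O (poll phase) callback:

const fs = require('fs');

fs.readFile(__filename, () => {
  setTimeout(() => console.log('setTimeout'), 0);
  setImmediate(() => console.log('setImmediate'));
  process.nextTick(() => console.log('nextTick'));
});
// prints: nextTick, setImmediate, setTimeout -- nextTick drains before control
// returns to the event loop, the check phase (setImmediate) follows the poll
// phase, and the timer runs on the next pass through the timers phase.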

Member Author

I believe setTimeout() is the correct way to go in this case, as process.nextTick() could potentially have the same issue we are currently seeing, since it would not exit the poll phase until all the I/O is completed. What @mbroadst pointed out about the ordering of the heartbeats is correct, and I need to move this around to make sure that each started heartbeat gets the appropriate subsequent succeeded/failed heartbeat immediately after it. That's currently not the case in the logs:

[Screenshot of heartbeat event logs, 2022-05-12 16:34:36]

Member

@dariakp it might work, but I think at this point it might make sense to develop some microbenchmarks to ascertain the effect of different solutions in this relatively hot code path. A few thoughts:

  • I think you are making an assumption that all messages are processed at exactly the same time. If that's what happens, you might be right that deferring processing of each of them to the next tick might just be delaying the inevitable. I was assuming that batches of messages would be processed together, and so we would free up the event loop by allowing it to process a batch at a time. Maybe we can test this theory by putting a stream in front of a large amount of in-memory BSON bytes (simulating the full kernel read buffer) and observing the behavior? (See the sketch after this list.)
  • Streaming with moreToCome is originally intended to allow for streaming the results of queries; streaming hello just takes advantage of the protocol bits for a slightly different purpose. We do have an assumption that monitoring events are monotonically increasing, so your idea to debounce those events works, so long as we limit the debouncing just to monitoring events (I think today that would mean implementing this in the callback, not in connection.ts where you call the callback). I think we should also consider a solution that works for the general streaming case where you do want to process all batches.
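
A possible shape for that experiment, purely hypothetical and not driver code: push a pre-built backlog of message-sized buffers through a stream, do some synchronous work per message, and check whether an unrelated zero-delay timer gets to run before the backlog is drained.

const { Readable } = require('stream');

// stand-in for a kernel read buffer full of streamed hello-sized responses
const backlog = Array.from({ length: 2000 }, () => Buffer.alloc(1024, 1));

// an unrelated piece of work competing for the event loop (e.g. an application command)
const started = Date.now();
setTimeout(() => console.log(`other work ran after ${Date.now() - started}ms`), 0);

const stream = Readable.from(backlog);
stream.on('data', chunk => {
  // simulate synchronous per-message work (deserialization, event emission)
  let sum = 0;
  for (let i = 0; i < 100000; i++) sum += chunk[i % chunk.length];
  // to test the deferred variant, wrap this work in setTimeout(..., 0) instead
});
stream.on('end', () => console.log(`drained backlog in ${Date.now() - started}ms`));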

Thanks for confirming our suspicions @durran. Were you also able to confirm that moving the setTimeout to the callback in connection.ts resolves the issue?

Member Author

@mbroadst Just to update on what we ended up discussing in a meeting. Adding setTimeout to the callback did allow control to yield, letting actual commands process without being blocked by all the heartbeats. However, all the heartbeats still needed to be processed, which was still taking significant time when the lambda functions were at their default resource allocation (128 MB RAM / 2 cores). What we decided to do to help with that issue was to change the MessageStream to know whether it is for a monitoring connection and, when reading incoming messages from the buffer, to only ever actually process the last message in the buffer (4b9ad77). This has alleviated the time-to-process issue on the small functions. (Note: when I tested on lambda functions with more resources, the problem never happened in the first place. For example, giving the function 1 GB of RAM, I could never reproduce this issue.)
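
A rough sketch of the buffer handling described above (function name and shape invented, not the actual MessageStream change): when several complete wire messages have accumulated on a monitoring connection, skip straight to the most recent one.

// Each MongoDB wire message starts with a little-endian int32 total length,
// so complete messages can be walked without parsing their bodies.
function takeLastCompleteMessage(buffer) {
  let offset = 0;
  let lastMessage = null;
  while (buffer.length - offset >= 4) {
    const messageLength = buffer.readInt32LE(offset);
    if (messageLength < 4 || buffer.length - offset < messageLength) break; // partial tail
    lastMessage = buffer.subarray(offset, offset + messageLength);
    offset += messageLength;
  }
  // only the newest complete message gets processed; any partial tail is kept
  return { lastMessage, remaining: buffer.subarray(offset) };
}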

Member

@durran ok, thanks for the update!

@durran (Member Author) commented May 18, 2022

Per our discussion, this solution has been dropped in favour of a message stream optimisation in lambda environments.

@durran durran closed this May 18, 2022
@durran durran deleted the NODE-4140 branch May 18, 2022 21:31