Ensure heartbeats aren't flushed on successful activity completion#1019
Ensure heartbeats aren't flushed on successful activity completion#1019Sushisource merged 4 commits intomasterfrom
Conversation
85f9c23 to
798aec8
Compare
| let should_flush = !known_not_found | ||
| && !matches!( | ||
| &status, | ||
| aer::Status::Completed(_) | aer::Status::WillCompleteAsync(_) | ||
| ); |
There was a problem hiding this comment.
Hrmm, digging into known_not_found, I'm thinking we also should not send heartbeat if this was a canceled with a cancel reason of "timed out" or "reset", or if this was a canceled reason of "cancelled" and we're bubbling it out as such.
May be worth confirming if Go and Java do the same with regards to how they choose whether to flush heartbeat.
There was a problem hiding this comment.
Go's logic is the same as this. IMO it's not worth trying to be extra clever here. Timed out I agree is probably just wasted, but even the reset case might have some edge situation where you'd want to keep it (like resetting just attempts, not progress - don't know if that's possible but could be at some point).
There was a problem hiding this comment.
I can see canceled somehow wanting to send since the RPC would be a success (even if it is dropped a few ms later when we respond with canceled). For timed out and reset, those are known-not-found IIUC (meaning the activity task is gone) and the heartbeat RPC will always fail IIUC. I think we should consider updating SDKs to not call the heartbeat RPC in situations they know will fail on the server, but it is technically harmless IIUC. Won't block the PR on it, but it's inefficient.
There was a problem hiding this comment.
In the timeout case known_not_found should already end up being true here
cretz
left a comment
There was a problem hiding this comment.
I still think we're sending too many unnecessary heartbeats for users, but it's non-blocking
What was changed
In title
Why?
Pointless, since these details are just erased after the activity completes successfully, and it imposes extra cost on cloud users for no reason.
Checklist
Closes [Feature Request] Do not send heartbeats on activity success #1017
How was this tested:
Added unit test. Can't integration test because I can't prove a negative, but, I seemed to discover a server bug with the case where flushing should happen in the process.
Any docs updates needed?