Cleanup/simplify shutdown orchestration #97
Comments
The Bottom line is, |
I've been poking around a bit at this. I have a basic idea which I just wanted to put down in writing somewhere, so I figured where better than here 😄 So, here's a rundown of things I think we need to address, one way or another:
I think the safest thing to do (as things are implemented currently) is, upon retire, to immediately reject all queued messages (similar to how things would need to be rejected in the above point). This gives the

The graceful shutdown would then be a signal for "in flight" operations, i.e. did everything that was in flight succeed, and were all unexecuted queued messages successfully returned to the

I know this is a bit of a departure from what was originally envisioned, but personally, I feel like we need to do our absolute best to make sure Kanaloa never drops work.

One interesting thing to note, and one thing we might want to explore, is that there might actually be diverging behaviors on shutdown between the pulling and pushing models.

One other thing to note is that, based on our earlier chats about this, there might actually be a case for implementing multiple types of shutdown behavior. I think there were many good points brought up about both the current and potential implementations, and they don't need to be mutually exclusive.
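To make the "reject everything queued on retire" idea concrete, here is a rough, language-neutral sketch in Python rather than Kanaloa's actual Scala/Akka code. The function name `retire`, the `(message, reply_to)` pairs, and the `reject` callback are all made up for illustration. It only shows the shape of the outcome: shutdown counts as graceful when every queued message was handed back and every in-flight operation succeeded.

```python
from collections import deque


def retire(pending, in_flight_results, reject):
    """Reject every queued message, then report whether shutdown was graceful.

    pending            -- deque of (message, reply_to) pairs not yet given to a worker
    in_flight_results  -- iterable of booleans, one per in-flight operation
    reject             -- hypothetical callback(message, reply_to) -> bool,
                          True if the sender was told its message was rejected
    """
    all_returned = True
    while pending:
        message, reply_to = pending.popleft()
        # Keep rejecting the rest even if one notification fails.
        if not reject(message, reply_to):
            all_returned = False
    return all_returned and all(in_flight_results)


# Usage: two queued jobs are handed back, two in-flight operations succeeded.
rejected = []


def notify(message, reply_to):
    rejected.append((reply_to, message))
    return True


queue = deque([("job-1", "client-a"), ("job-2", "client-b")])
graceful = retire(queue, in_flight_results=[True, True], reject=notify)
```

With this shape, nothing is dropped silently: a queued message is either executed or explicitly returned to whoever sent it.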
The orchestration of shutdown between QueueProcessor, Queue and Workers is a bit duplicative in terms of who is watching who.
Currently the QueueProcessor receives the initial `Shutdown` message, and then it tells the Queue to shut down. The Queue then sends `NoWorkLeft` to the Workers, which begins their termination. However, if the Queue still has `Work` messages queued, it is unclear what happens to them (I think they just get dropped?). The Queue then enters the retiring state, where it waits for a `RetiringTimeout` message. Once it receives this, it sends more `NoWorkLeft` messages to the Workers and then shuts itself down. This termination is watched by both the QueueProcessor and the Workers, and both react to it further. The QueueProcessor will send `Retire` signals to the Workers.

I don't think there are any real adverse effects here, other than the potential for Work being dropped, which I need to verify, but this could be simplified a bit.
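For reference, the sequence described above can be traced with a small toy model. This is a Python sketch, not the real Akka actors: the `Queue` class, `shutdown_sequence`, and the worker names are hypothetical, and the step that discards leftover `Work` encodes my unverified guess about what happens, not confirmed behavior.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Queue:
    """Toy stand-in for the Queue actor; not Kanaloa's real implementation."""
    pending: deque = field(default_factory=deque)
    state: str = "running"  # running -> retiring -> stopped
    log: list = field(default_factory=list)

    def shutdown(self, workers):
        # First round of NoWorkLeft, then wait in the retiring state.
        for w in workers:
            self.log.append(f"Queue -> {w}: NoWorkLeft")
        self.state = "retiring"

    def retiring_timeout(self, workers):
        # Second round of NoWorkLeft, then the Queue stops itself.
        for w in workers:
            self.log.append(f"Queue -> {w}: NoWorkLeft")
        dropped = list(self.pending)  # ASSUMPTION: leftover Work is discarded
        self.pending.clear()
        self.state = "stopped"
        return dropped


def shutdown_sequence(queue, workers):
    queue.log.append("QueueProcessor <- Shutdown")
    queue.log.append("QueueProcessor -> Queue: shut down")
    queue.shutdown(workers)
    queue.log.append("Queue <- RetiringTimeout")
    dropped = queue.retiring_timeout(workers)
    # Both the QueueProcessor and the Workers watch the Queue terminate.
    queue.log.append("Queue terminated")
    for w in workers:
        queue.log.append(f"QueueProcessor -> {w}: Retire")
    return dropped


q = Queue(pending=deque(["work-1", "work-2"]))
dropped = shutdown_sequence(q, ["worker-1", "worker-2"])
```

Printing `q.log` shows each Worker being told to stop three times (two `NoWorkLeft`, one `Retire`) on top of watching the Queue terminate, which is the duplication I'd like to trim. `dropped` holds whatever was still queued.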