Shutdown revamp #68
Force-pushed from 07c2b8c to dd8ab53
Force-pushed from 46c3f8f to fa27ed6
// non-blocking manner
func (s *Server) shutdown() {
	// stop accepting new consumers
	s.managerCancel()
I think this should be a blocking call. What about something like s.manager.StopAcceptingConsumers() to push this behavior to the ConsumerManager?
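The suggested blocking call could look roughly like this; a minimal sketch, assuming the manager guards a teardown flag with a mutex (the type, field, and method names here are inferred from the comment, not the actual implementation):

```go
package main

import (
	"fmt"
	"sync"
)

// ConsumerManager is a hypothetical stand-in for the real type.
type ConsumerManager struct {
	mu        sync.Mutex
	teardown  bool
	consumers map[string]struct{}
}

func NewConsumerManager() *ConsumerManager {
	return &ConsumerManager{consumers: make(map[string]struct{})}
}

// AddConsumer refuses new consumers once teardown has begun.
func (m *ConsumerManager) AddConsumer(id string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.teardown {
		return fmt.Errorf("manager is shutting down")
	}
	m.consumers[id] = struct{}{}
	return nil
}

// StopAcceptingConsumers blocks until it has acquired the lock and set
// the teardown flag; once it returns, AddConsumer can no longer succeed.
func (m *ConsumerManager) StopAcceptingConsumers() {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.teardown = true
}

func main() {
	m := NewConsumerManager()
	_ = m.AddConsumer("c1")
	m.StopAcceptingConsumers()
	if err := m.AddConsumer("c2"); err != nil {
		fmt.Println("rejected after teardown:", err)
	}
}
```

The point of making the call blocking is that the caller gets a guarantee: by the time StopAcceptingConsumers returns, no new consumer registration can race past it.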
Done.
Force-pushed from e9e5663 to 2b414af
// shutdown closes current clients and also stops accepting new clients in a
// non-blocking manner
func (s *Server) shutdown() {
	s.manager.StopAcceptingConsumers()
This is still racy (not truly blocking). As implemented, the mutex protects the teardown variable but not the map, so we can still end up with a new consumer by the time the map iteration takes place.
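One way to close that race is to guard the map with the same mutex as the teardown flag, so the flag check, the map writes, and the map iteration are all serialized. A sketch with invented names, not the PR's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// manager is a hypothetical stand-in: one mutex guards BOTH the teardown
// flag and the consumer map, which is what makes the shutdown race-free.
type manager struct {
	mu        sync.Mutex
	teardown  bool
	consumers map[string]chan struct{}
}

// add registers a consumer, unless teardown has already begun.
func (m *manager) add(id string) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.teardown {
		return false // no new consumers once shutdown has started
	}
	m.consumers[id] = make(chan struct{})
	return true
}

// stopAndClose flips the flag and closes every consumer while still
// holding the lock, so no add() can interleave with the iteration.
func (m *manager) stopAndClose() {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.teardown = true
	for id, ch := range m.consumers {
		close(ch)
		delete(m.consumers, id)
	}
}

func main() {
	m := &manager{consumers: make(map[string]chan struct{})}
	m.add("a")
	m.stopAndClose()
	fmt.Println("add after shutdown:", m.add("b"))
}
```

If the flag check and the map iteration were under separate locks (or the map under none), a goroutine could insert a consumer between the flag being set and the iteration running, which is exactly the race described above.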
Fixed.
Force-pushed from bc5789c to ea337b1
All the steps look logical, and having a server.shutdown() that orchestrates & documents the shutdown sequence is definitely a win. I believe the code is easier to follow now. 👍
This patch aims to minimize the downtime for producers during server upgrades and generally make the shutdown process more robust. The flow is now the following:

1) shutdown signal is received
2) stop accepting new consumers and close clients with at least 1 consumer
3) stop accepting new producers and close the rest of the clients
4) shutdown and restart

This way, producer downtime is reduced to the time elapsed between (3) and (4), which should be less than a second. Nevertheless, clients should still have sane retry defaults configured, because some downtime will still occur.

This patch also changes the client IDs (Client.ID) to be unique per client. These are internal and should not affect the consumer logic, since we introduce Client.consID and use that for consumer IDs.
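The ordering described in the flow (consumers first, producers last) could be orchestrated roughly like this; a sketch with hypothetical method names, recording the steps only to make the sequence visible, not the patch's actual implementation:

```go
package main

import "fmt"

// clientManager is a hypothetical stand-in; every name is an assumption
// based on the PR description.
type clientManager struct{ steps []string }

func (m *clientManager) StopAcceptingConsumers() { m.steps = append(m.steps, "stop-consumers") }
func (m *clientManager) CloseConsumerClients()   { m.steps = append(m.steps, "close-consumer-clients") }
func (m *clientManager) StopAcceptingProducers() { m.steps = append(m.steps, "stop-producers") }
func (m *clientManager) CloseRemainingClients()  { m.steps = append(m.steps, "close-remaining") }

type Server struct{ manager *clientManager }

// shutdown orchestrates steps (2) and (3) of the flow; receiving the
// signal (1) and restarting (4) happen outside this method.
func (s *Server) shutdown() {
	s.manager.StopAcceptingConsumers() // consumers first, so producers stay up longer
	s.manager.CloseConsumerClients()
	s.manager.StopAcceptingProducers() // producer downtime begins here
	s.manager.CloseRemainingClients()
}

func main() {
	s := &Server{manager: &clientManager{}}
	s.shutdown()
	fmt.Println(s.manager.steps)
}
```

Keeping producers alive until the very end of the sequence is what bounds their downtime to the window between (3) and the restart.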
Force-pushed from 424ac6a to 5e402e1
This patch aims to minimize the downtime for producers and make the
shutdown process generally more robust. The flow is now the following:
This way, producer downtime is reduced to the time elapsed between (3)
and (5), which should be less than a second. Nevertheless, clients
should still have sane retry defaults configured anyway, because some
downtime will still occur.
Also added a hard timeout of 5 seconds around the shutdown process.