Also, the Operator Controller should handle killing itself, especially in a single-node cluster! It should not prevent itself from being scheduled in a cluster.
jahkeup
changed the title
dogswatch: prevent Controller from unscheduleable conditions
dogswatch: protect Controller from unscheduleable conditions
Nov 11, 2019
webern
transferred this issue from bottlerocket-os/bottlerocket
Feb 26, 2020
jahkeup
changed the title
dogswatch: protect Controller from unscheduleable conditions
Protect controller from becoming unscheduleable
Feb 27, 2020
One way to protect the controller could be to have it save its hosting node to be updated last. Once it has updated the other nodes, the controller would delete its own Pod so that it is rescheduled, and only once it has started elsewhere would it continue on to update that last node.
The controller's deployment should then include bottlerocket.aws/update-available in its antiAffinity weighted selector (preferring update-available==false) so that it lands on updated hosts first.
This method wouldn't account for a single-node cluster, or one where only one node is Ready and schedulable. The controller will have to check that it considers itself reschedulable before stopping its Pod.
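To make the ordering and the reschedulability check concrete, here is a rough sketch in plain Python. It is not taken from the dogswatch or brupop code: the `Node` type and the `plan_update_order` and `can_reschedule` helpers are hypothetical names, and the node state is modeled as simple booleans rather than read from the Kubernetes API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical, simplified view of a cluster node (not a real API type)."""
    name: str
    ready: bool = True
    schedulable: bool = True
    update_available: bool = False


def plan_update_order(nodes, controller_node):
    """Return names of nodes needing an update, with the controller's host last."""
    pending = [n.name for n in nodes if n.update_available]
    others = [name for name in pending if name != controller_node]
    host = [name for name in pending if name == controller_node]
    return others + host


def can_reschedule(nodes, controller_node):
    """True if some other Ready, schedulable node could host the controller."""
    return any(
        n.ready and n.schedulable
        for n in nodes
        if n.name != controller_node
    )
```

Under these assumptions, the controller would walk `plan_update_order(...)` and, on reaching its own host, delete its Pod only if `can_reschedule(...)` returns true. In a single-node cluster the check fails, so that case would need a different path.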
We want to add the ability to allow brupop to update many nodes simultaneously, which makes this more important. Adding this to the 1.0.0 release milestone.
Also, the Operator Controller should handle killing itself, especially in a single-node cluster! It should not prevent itself from being scheduled in a cluster.
Originally posted by @jahkeup in bottlerocket-os/bottlerocket#239 (comment)