kubectl drain doesn't delete pods created by PetSet #33727
Comments
I can reproduce it.
Drain is just not implemented for petset; you can cordon the node (`kubectl cordon`) and delete the pets on it. Simply draining the node is risky because you might, for example, end up deleting all your quorum members at once. Pets require extra care; if you're running something in a petset that doesn't require such care, perhaps you can get by with a replica set. To implement drain on petset with reduced risk:
The first point means kubectl needs to delete a pet and wait for it to completely finish its teardown. The easiest way to do this right now is to do it by hand; a sketch follows below.
Having a human in the loop obviously helps ordering because the human can make sure we don't keep deleting the master.
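A minimal sketch of that manual flow, assuming a hypothetical node `node-1` hosting pets `web-0` and `web-1` (none of these names come from this thread):

```bash
# Keep new pods (including recreated pets) off the node being emptied.
kubectl cordon node-1

# Delete one pet at a time, waiting for its replacement to come back
# Running/Ready before touching the next one.
kubectl delete pod web-0
kubectl get pod web-0 --watch   # Ctrl-C once the recreated pet is Running

kubectl delete pod web-1
kubectl get pod web-1 --watch
```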
@bprashanth If we need to delete n pods on a node, there will be at most n pods unavailable during some period of time. Is that OK?
It's preferable to have at most 1 pod unavailable at one time. That pod will have a procedure to leave the cluster, e.g. nodetool decom. Most docs will be written in a way that describes this process for a single node (e.g. https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_replace_live_node.html). You will also probably find docs on how to recover the cluster, should other nodes end up dying/coming up when this happens. The goal here is to not invoke the manual healing process, but stick as close as possible to the documented, remove-a-single-node, re-add-a-single-node version.
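For the Cassandra case linked above, a hedged sketch of that remove-one/re-add-one flow (the pod name `cassandra-2` is illustrative, not from this thread):

```bash
# Ask the pet to leave the ring cleanly before deleting it.
kubectl exec cassandra-2 -- nodetool decommission

# Delete the pod; the petset controller recreates it with the same identity.
kubectl delete pod cassandra-2

# Verify the replacement rejoined (look for UN, Up/Normal) before moving on.
kubectl exec cassandra-2 -- nodetool status
```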
@bprashanth I can understand that deleting pets is more complicated than deleting other pods. But currently, the getController() function returns an error, so kubectl exits with an error without deleting any pods. I think at least the command should delete all other pods, and maybe have a flag to force delete pets.
I don't want petset pods to be special. I think we may need to be using eviction here.
Yes, my proposal was for the case where you want to drive the drain from kubectl.
Should kubectl look at disruption budgets too?
Don't see why not. I think @pwittrock and @ymqytw are working on getting kubectl to respect disruption budgets in general. This might mean the petset controller needs to create a default budget.
@bprashanth Could the preStop hook or the deletion grace period be used for this?
preStop is taken out of the deletion grace period, so yeah, you can in theory use either of those, and preferably the more standard one (deletion grace). Some downsides of only having one "tear down" event were discussed here: #28706 (comment). However, this alone doesn't solve the problem that more than 1 pet shouldn't decommission simultaneously.
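To make the preStop option concrete, a minimal sketch of a pet's pod spec that runs its decommission step inside the deletion grace period (the name, image, command, and grace value are all assumptions for illustration):

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: cassandra-0              # hypothetical pet
spec:
  # preStop runs inside this grace period, so size it to cover decommission.
  terminationGracePeriodSeconds: 1800
  containers:
  - name: cassandra
    image: cassandra:2.1
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "nodetool decommission"]
EOF
```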
We may not need to "pause" the petset controller at all. Expanding on his earlier points, the progression would be as follows:
This may happen when:
The proposed way of doing this for 1.5 as per discussion is as follows:
So in master, PetSets are now StatefulSets. We need this in 1.5 as StatefulSets and in 1.4.x as PetSets. Getting it backported is a show stopper. No idea how; @foxish, ideas?
Backporting API changes will break people in a minor release. Generally a recipe for disaster.
@bprashanth that is amazing news ... Not. Does drain work via the API? |
Looks like we're still doing the first, but the second regarding PDBs and defaults needs further discussion. |
Or are we changing the examples/docs to encourage including an explicit PDB, and not special-casing StatefulSets?
Based on an offline discussion, we are not going to change client-side code to special-case petsets, and will instead advise people to set up a PDB if they want special behavior. For 1.5, the eviction behavior will be the same for all pods, including those that are part of a statefulset. See #35318 (comment). @ymqytw @smarterclayton
SGTM. Special-casing the client is a dangerous precedent and isn't our long-term plan.
How are upgrades of k8s going to work with stateful sets running, for instance, ZooKeeper?
Let me put some constraints around this. StatefulSets have to automatically evict on a drain. Otherwise, how are upgrades at scale going to be automated?
@chrislovecnm The plan is to evict them using the eviction subresource. If there is one petset pod per node, then there is no issue when doing node drains. If one has a specific requirement where only N petset pods can be down at any given time, the right way for now would be to create a PodDisruptionBudget to reflect that (see the sketch below). The eviction subresource respects the PDB. We plan on updating the documentation so that folks building production applications can create an explicit PDB for now.
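A sketch of such a budget, assuming a hypothetical 3-replica statefulset whose pods carry the label `app: zk`; with `minAvailable: 2`, the eviction subresource will refuse to take down more than one pod at a time:

```bash
kubectl apply -f - <<'EOF'
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  minAvailable: 2        # with 3 replicas, at most 1 may be disrupted
  selector:
    matchLabels:
      app: zk
EOF
```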
s/petset/statefulset/g |
SGTM - the feature formerly known as petset
@foxish this is specific to 1.5 though?
Yes, this is specific to beta. We don't expect that the PDB/eviction mechanism for carrying out node drains will change even in GA, but we may have defaults, or some other way of specifying them, which will likely be proposed/discussed after 1.5. |
Automatic merge from submit-queue

Fix kubectl drain for statefulset. Support deleting pets for `kubectl drain`. Use evict to delete pods. Fixes: #33727

```release-note
Adds support for StatefulSets in kubectl drain. Switches to use the eviction sub-resource instead of deletion in kubectl drain, if server supports.
```

@foxish @caesarxuchao
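For reference, a hedged sketch of driving the same eviction sub-resource directly at the API level, outside kubectl drain (the pod name `web-0` and namespace `default` are hypothetical):

```bash
# Expose the apiserver locally.
kubectl proxy --port=8001 &

# POST an Eviction; the apiserver returns 429 Too Many Requests if the
# eviction would violate a matching PodDisruptionBudget.
curl -s -X POST \
  -H "Content-Type: application/json" \
  http://localhost:8001/api/v1/namespaces/default/pods/web-0/eviction \
  -d '{
    "apiVersion": "policy/v1beta1",
    "kind": "Eviction",
    "metadata": {"name": "web-0", "namespace": "default"}
  }'
```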
Kubernetes version (use `kubectl version`): 1.4.0

What happened:
When using `kubectl drain <node>`, it shows the error `Unknown controller kind "PetSet"`.

What you expected to happen:
kubectl should remove all pods on that node created by PetSet.