
[Question]node lifecycle management #334

Closed
hchenxa opened this issue Apr 28, 2019 · 11 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@hchenxa
Contributor

hchenxa commented Apr 28, 2019

/kind bug

What steps did you take and what happened:
[A clear and concise description of what the bug is.]

What did you expect to happen:
I have a question about the cluster-api lifecycle management.

Currently, when I use cluster-api to deploy a cluster, the behavior seems to be as follows:

  1. The cluster and master node resources are created on the bootstrap node.
  2. The master node starts a controller that brings up the worker nodes.
  3. The bootstrap node is not aware of the newly created resources.

And if I want to clean up the cluster, I need to (rough command sketch below the list):

  1. Log in to the sub-cluster master node and delete all the worker nodes first.
  2. After step 1 finishes, jump back to the bootstrap node and delete the cluster and master node.
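
A rough sketch of what that cleanup looks like for me today (the kubeconfig paths and resource names below are only placeholders, not real values):

```sh
# Step 1: against the sub-cluster's API server, delete the worker Machine objects first.
kubectl --kubeconfig=sub-cluster.kubeconfig get machines
kubectl --kubeconfig=sub-cluster.kubeconfig delete machine <worker-machine-name>

# Step 2: back on the bootstrap node, delete the master machine and the cluster object.
kubectl --kubeconfig=bootstrap.kubeconfig delete machine <master-machine-name>
kubectl --kubeconfig=bootstrap.kubeconfig delete cluster <cluster-name>
```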

So the problems here are:

  1. The bootstrap node cannot control the newly created resources in the sub-cluster.
  2. I need to keep the bootstrap node around to control the sub-cluster master node.
  3. If I delete the master node from the bootstrap node, the worker nodes created by the sub-cluster will be left without a controller.
  4. If the sub-cluster deletes itself, I currently cannot boot up the master node from the bootstrap node.
    ...

@gyliu513 @jichenjc, any comments or suggestions here?

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Environment:

  • Cluster-api version:
  • Minikube/KIND version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Apr 28, 2019
@hchenxa hchenxa changed the title [Question]cluster lifecycle management [Question]node lifecycle management Apr 28, 2019
@gyliu513
Contributor

@hchenxa For deleting a worker node, you can run kubectl delete machine xxx on the provisioned cluster. But I am not clear on how to delete a master node; there is some discussion at #294 (comment). Can you help check into this?
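
For example (the machine name below is just a placeholder):

```sh
# On the provisioned (sub-)cluster, list the Machine objects and delete the worker you want removed.
kubectl get machines
kubectl delete machine <worker-machine-name>
```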

@jichenjc
Contributor

I think we should not allow a delete from the master cluster itself, i.e. deleting the master from itself. cluster-api has an env variable that can be defined, and it can prevent the controller from deleting its own machine:
https://github.com/kubernetes-sigs/cluster-api/blob/master/pkg/controller/machine/controller.go#L255
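
If I remember that code correctly, the guard only applies when the controller pod has the node-name env variable (NODE_NAME, I believe) populated; a quick way to check on a running cluster would be something like this (namespace and pod names are placeholders):

```sh
# Inspect the machine controller pod spec to see whether the node-name env
# variable is defined; the self-deletion guard linked above only works when
# the controller knows which node it is running on.
kubectl -n <controller-namespace> get pod <machine-controller-pod> \
  -o jsonpath='{.spec.containers[*].env}'
```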

And regarding:

I need to keep the bootstrap node to control the sub-cluster master node

I guess this might be a good question to ask in one of the four working groups in cluster-api, maybe https://docs.google.com/document/d/1E7N_JcSsDnnJk2yqMCCriBFwRZZc8nnx4kxSt99z7As/edit# ? Perhaps the extension group?

@hchenxa
Contributor Author

hchenxa commented May 5, 2019

It looks like the current behavior has changed: when the master node finishes provisioning, the CRs and CRDs are deleted from the bootstrap node.
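
For example, after provisioning finishes, a check like this on the bootstrap cluster comes back empty, while the same objects show up on the new cluster (the kubeconfig name is just a placeholder):

```sh
# On the bootstrap cluster, the cluster.k8s.io CRDs (and their CRs) are gone after the pivot.
kubectl get crds | grep cluster.k8s.io

# On the newly provisioned cluster, the same objects now live there.
kubectl --kubeconfig=target.kubeconfig get crds | grep cluster.k8s.io
kubectl --kubeconfig=target.kubeconfig get clusters,machines --all-namespaces
```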

cc @gyliu513 @jichenjc

@gyliu513
Contributor

gyliu513 commented May 5, 2019

@hchenxa So the bootstrap cluster will not be able to get the CRs any more after provisioning finishes, and the end user has to log on to the newly provisioned cluster to do some operations. Can you help check how to delete a master node in that situation?


@jichenjc
Contributor

jichenjc commented May 5, 2019

Yes, let's do some testing and see whether we can 'kubectl delete machine xxxx-master'. I think we can, but it needs a double check.
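
Something along these lines as the test, I think (the machine name is just a placeholder for whatever the master Machine object is called):

```sh
# On the provisioned cluster, try deleting the master Machine object and watch
# whether the controller actually tears it down or refuses because it would be
# deleting the node it is running on.
kubectl get machines
kubectl delete machine <cluster-name>-master
kubectl get machines --watch
```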

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 3, 2019
@sbueringer
Member

@jichenjc What's the status of this issue? :)

@jichenjc
Contributor

This might need to stay open for a while; let's see whether v1alpha2 can mitigate the issue.

@sbueringer
Member

But is it specific to the OpenStack provider? It seems to be an issue with the clusterctl workflow, which should then be tracked in the cluster-api repo because it affects all providers.

@jichenjc
Contributor

ok..
