
[Question]node lifecycle management #334

Closed
hchenxa opened this issue Apr 28, 2019 · 11 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@hchenxa
Contributor

hchenxa commented Apr 28, 2019

/kind bug

What steps did you take and what happened:
[A clear and concise description of what the bug is.]

What did you expect to happen:
I have a question about the cluster-api lifecycle management.

Currently, when I use cluster-api to deploy a cluster, the behavior seems to be as follows:

  1. The cluster and master node resources are created on the bootstrap node.
  2. The master node starts a controller that brings up the worker nodes.
  3. The bootstrap node is not aware of the newly created resources.

And if I want to clean up the cluster, I need to (rough command sketch below the list):

  1. Log in to the sub-cluster master node and delete all the worker nodes first.
  2. After step 1 finishes, jump back to the bootstrap node and delete the cluster and master node.
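
A rough sketch of what that cleanup looks like for me today (the kubeconfig paths and resource names below are only placeholders, not real values):

```sh
# Step 1: against the sub-cluster's API server, delete the worker Machine objects first.
kubectl --kubeconfig=sub-cluster.kubeconfig get machines
kubectl --kubeconfig=sub-cluster.kubeconfig delete machine <worker-machine-name>

# Step 2: back on the bootstrap node, delete the master machine and the cluster object.
kubectl --kubeconfig=bootstrap.kubeconfig delete machine <master-machine-name>
kubectl --kubeconfig=bootstrap.kubeconfig delete cluster <cluster-name>
```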

So the problems here are:

  1. The bootstrap node cannot control the newly created resources in the sub-cluster.
  2. I need to keep the bootstrap node around to control the sub-cluster master node.
  3. If I delete the master node from the bootstrap node, the worker nodes created by the sub-cluster will be left without a controller.
  4. If the sub-cluster deletes itself, I currently cannot boot up the master node from the bootstrap node.
    ...

@gyliu513 @jichenjc, any comments or suggestions here?

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Environment:

  • Cluster-api version:
  • Minikube/KIND version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Apr 28, 2019
@hchenxa hchenxa changed the title [Question]cluster lifecycle management [Question]node lifecycle management Apr 28, 2019
@gyliu513
Contributor

@hchenxa For deleting a worker node, you can run kubectl delete machine xxx on the provisioned cluster. But I am not clear on how to delete a master node; there is some discussion at #294 (comment). Can you help check into this?
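
For example (the machine name below is just a placeholder):

```sh
# On the provisioned (sub-)cluster, list the Machine objects and delete the worker you want removed.
kubectl get machines
kubectl delete machine <worker-machine-name>
```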

@jichenjc
Contributor

I think we should not allow a delete from the master cluster itself, i.e. deleting the master from itself. cluster-api has an env variable that can be defined, and it can prevent the controller from deleting its own machine:
https://github.com/kubernetes-sigs/cluster-api/blob/master/pkg/controller/machine/controller.go#L255
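
If I remember that code correctly, the guard only applies when the controller pod has the node-name env variable (NODE_NAME, I believe) populated; a quick way to check on a running cluster would be something like this (namespace and pod names are placeholders):

```sh
# Inspect the machine controller pod spec to see whether the node-name env
# variable is defined; the self-deletion guard linked above only works when
# the controller knows which node it is running on.
kubectl -n <controller-namespace> get pod <machine-controller-pod> \
  -o jsonpath='{.spec.containers[*].env}'
```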

And regarding:

I need to keep the bootstrap node to control the sub-cluster master node

I guess this might be a good question to ask in one of the four working groups in cluster-api, maybe https://docs.google.com/document/d/1E7N_JcSsDnnJk2yqMCCriBFwRZZc8nnx4kxSt99z7As/edit# ? Perhaps the extension group?

@hchenxa
Contributor Author

hchenxa commented May 5, 2019

It looks like the current behavior has changed: when the master node finishes provisioning, the CRs and CRDs are deleted from the bootstrap node.
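
For example, after provisioning finishes, a check like this on the bootstrap cluster comes back empty, while the same objects show up on the new cluster (the kubeconfig name is just a placeholder):

```sh
# On the bootstrap cluster, the cluster.k8s.io CRDs (and their CRs) are gone after the pivot.
kubectl get crds | grep cluster.k8s.io

# On the newly provisioned cluster, the same objects now live there.
kubectl --kubeconfig=target.kubeconfig get crds | grep cluster.k8s.io
kubectl --kubeconfig=target.kubeconfig get clusters,machines --all-namespaces
```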

cc @gyliu513 @jichenjc

@gyliu513
Contributor

gyliu513 commented May 5, 2019

@hchenxa So the bootstrap cluster will not be able to get the CRs any more after provisioning finishes, and the end user has to log on to the newly provisioned cluster to do some operations. Can you help check how to delete a master node in that situation?


@jichenjc
Contributor

jichenjc commented May 5, 2019

Yes, let's do some testing and see whether we can 'kubectl delete machine xxxx-master'. I think we can, but it needs a double check.
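
Something along these lines as the test, I think (the machine name is just a placeholder for whatever the master Machine object is called):

```sh
# On the provisioned cluster, try deleting the master Machine object and watch
# whether the controller actually tears it down or refuses because it would be
# deleting the node it is running on.
kubectl get machines
kubectl delete machine <cluster-name>-master
kubectl get machines --watch
```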

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 3, 2019
@sbueringer
Member

@jichenjc What's the status of this issue? :)

@jichenjc
Contributor

This might need to stay open for a while; let's see whether v1alpha2 can mitigate the issue.

@sbueringer
Member

But is it specific to the OpenStack provider? It seems to be an issue with the clusterctl workflow, which should then be tracked in the cluster-api repo because it affects all providers.

@jichenjc
Contributor

ok..
