
Update KEP kubeadm join --master #2331

Merged

Conversation

fabriziopandini
Member

Update to the kubeadm join --master KEP

/area kubeadm
/sig sig-cluster-lifecycle

@kubernetes/sig-cluster-lifecycle-pr-reviews

/CC @luxas @timothysc

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. area/kubeadm sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 29, 2018
@chuckha
Contributor

chuckha commented Jun 29, 2018

How would folks feel about kubeadm join --control-plane ?

Thumbs up/Thumbs down would work as a response

@neolit123
Member

> How would folks feel about kubeadm join --control-plane ?

i think --master is fine, unless --control-plane is considered more descriptive.

Member

@luxas luxas left a comment

Thanks a lot @fabriziopandini!
The initial reaction is that this looks good, but I have comments/questions on how to manage the certs and identities in this scenario.

- "@fabriziopandini"
owning-sig: sig-cluster-lifecycle
reviewers:
- "@cha"
Member

+luxas

Member

on github @chuckha

owning-sig: sig-cluster-lifecycle
reviewers:
- "@cha"
- "@jdtibier"
Member

using kubeadm in combination with some scripts and/or automation tools (e.g.
[this](https://kubernetes.io/docs/setup/independent/high-availability/)), this KEP was
designed with the objective to introduce an upstream simple and reliable solution for
achieving the same goal.
Member

... with important non-goals. Say clearly (also in the summary) that this doesn't solve every case or even the full end-to-end flow automatically

Member Author

done

user stories for creating a highly available Kubernetes cluster, but instead
focuses on:

- Defining a generic and extensible flow for bootstrapping an HA cluster, the
Member

a cluster consisting of multiple masters instead of HA?


Higher-level tools could create nodes in parallel (both masters and workers)
for reducing the overall cluster startup time.
`kubeadm join --master` should natively support this practice without requiring
Member

and how does this work?

Member Author

You can create the bootstrap master, secondary masters, and workers in parallel because:

  1. all the joining nodes (secondary masters and workers) use the same discovery mechanism, which "automatically" waits for the bootstrap master to complete its setup
  2. the joining nodes (secondary masters and workers) can join in any order
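As an illustration, the parallel flow described above could be driven like this (a sketch only: the load-balancer address, token, and hash are hypothetical placeholders, not values from this KEP):

```shell
# Sketch of parallel cluster creation; all addresses and tokens below are
# hypothetical placeholders.

# On the bootstrap master:
kubeadm init --config=config.yaml

# At the same time, on each secondary master -- token-based discovery
# waits until the bootstrap master has published the cluster info:
kubeadm join --master <LOAD_BALANCER>:6443 \
    --token <TOKEN> \
    --discovery-token-ca-cert-hash sha256:<HASH>

# And, in any order, on each worker:
kubeadm join <LOAD_BALANCER>:6443 \
    --token <TOKEN> \
    --discovery-token-ca-cert-hash sha256:<HASH>
```

The point of the sketch is item 1 above: because secondary masters and workers share the same blocking discovery step, the ordering of the join commands is irrelevant.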


#### Static workflow (advertise-address != `controlplaneAddress`)

In case of a static bootstrap workflow the final layout of the controlplane - the number, the
Member

what is technically the difference between these?

Member Author

This was discussed in some kubeadm office hours meetings, and I also wrote a blog post with all the details: https://blog.heptio.com/kubernetes-ha-under-x-ray-5d05f552c9f

(`kubeadm join --master` will fail immediately), but nothing in this proposal should
prevent addressing this in subsequent phases.

#### Strategies for distributing cluster certificates
Member

Please dive into how the CSR API could be utilized in an external CA-world.
It'd be cool to just use the CSR API, but we're stuck needing the front-proxy clientcert for the API server and the SA key that is not bound to the CA.

Member Author

I would prefer to keep this out of scope for this KEP, because it does not apply specifically to the multi-master scenario (e.g. it is also relevant for the existing kubeadm join workflow).

Nevertheless, I totally agree that we should document how the CSR API could be used in an external CA world, but I don't have real expertise in this. Eventually, let's raise the topic at the sig meeting.

Nothing in this proposal prevents implementation of `kubeadm upgrade` for HA cluster.

Further detail will be provided in a subsequent release of this KEP when all the detail
of the `v1Beta1` release of kubeadm api will be available (including a proper modeling
Member

nit: v1beta1


## Drawbacks

The kubeadm join --master workflow requires that some condition are satisfied at `kubeadm-init` time,
Member

kubeadm init

proposal (`kubeadm join --master` will fail immediately), but nothing in this proposal
should prevent addressing this in subsequent phases.

#### `kubeadm upgrade` for HA clusters
Member

can you dive into exactly how and in which order the upgrade should be performed.
i.e. upgrade etcd fully first (in some way out of scope for this proposal probably), then remove one master from the loadbalancer, upgrade it, (add it back to the LB, or not). Upgrades without downtime supported officially or not? etc. etc.

Member Author

The only prior art I'm aware of is https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade-ha/ , but I'm not sure there is already a consensus about how to do this, so I prefer to get a first release of the KEP approved and then iterate on the upgrade part.

@luxas
Member

luxas commented Jun 29, 2018

/assign
/assign @timothysc

@fabriziopandini
Member Author

fabriziopandini commented Jul 2, 2018

@luxas I provided answers to your questions and addressed all comments. Let me know if there is something that requires further deep dives.
PS. I'm also planning to briefly discuss this KEP during kubeadm office hours on Wednesday.

Member

@timothysc timothysc left a comment

/approve

I'd like @chuckha and @detiber to lgtm .

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: timothysc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 3, 2018
@fabriziopandini
Member Author

@chuckha @detiber kindly ping

Contributor

@chuckha chuckha left a comment

I'm trying to understand exactly how this is different from kubeadm init and I'm having a bit of trouble.

Under the section that outlines kubeadm join --master [New step], I don't see anything new.

The static workflow case is taken care of today with a carefully crafted kubeadm init --config=config.yaml. This is documented in official docs. It's almost exactly this document:

  • A user must copy certificates across nodes
  • A user must run some command on the new node

The dynamic case is more interesting, but I think this document handwaves over the manual steps the user will have to take regardless. For instance, the api-serving-cert will have to be regenerated on the bootstrapping master or copied over from the new master to the bootstrapping master.

What are we getting from a kubeadm join --master that kubeadm init --config=config.yaml is not getting us?
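For comparison, the "carefully crafted" static-workflow config referenced here boils down to something like the following sketch (field names follow the later v1beta1 kubeadm API; the endpoint, SAN, and version values are hypothetical placeholders, not values from this PR):

```yaml
# Hypothetical config.yaml for the static HA workflow; the load-balancer
# address below is a placeholder.
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: stable
controlPlaneEndpoint: "my-lb.example.com:6443"  # stable address fronting all masters
apiServer:
  certSANs:
  - "my-lb.example.com"  # so the serving cert stays valid behind the LB
```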


- Provide support both for dynamic and static bootstrap flow

At the time a user is running `kubeadm init`, he might not know what
Contributor

please use gender neutral pronouns throughout. This should be

At the time a user is running kubeadm init, they might ...

in advance the target layout of the controlplane instances (the number, the name and the IP
of master nodes).

- Support different etcd deployment scenarios, and more specifically run master nodes components
Contributor

we're calling these stacked control plane nodes in the upstream documentation if you want to use that language here for consistency.

Member Author

done

- This proposal doesn't include a solution for etcd cluster management (but nothing in this proposal should
prevent addressing this in the future).

> At the time of writing, the CoreOS recommended approach for etcd is to run
Contributor

This section adds confusion -- where etcd is run entirely depends on what the goals of the cluster are. Kubeadm provides instructions for all of these: single etcd (traditional kubeadm workflow), external etcd (bootstrapped with kubeadm), and stacked control planes.

I think you can delete this paragraph if you want.

neither in the initial proposal nor in the foreseeable future (but nothing in this proposal should
explicitly prevent reconsidering this in the future as well).

- This proposal doesn't provide an automated solution for transferring the CA key and other required
Contributor

I cannot wait until we solve this

Member Author

👍


- if the user is not planning to distribute the apiserver certificate among masters, kubeadm
will generate a new apiserver serving certificate with the required SANs
- if the user is planning to distribute the apiserver certificate among masters, he/she/the
Contributor

can you delete the "he/she" part? This reads fine with "the operator".

@fabriziopandini
Member Author

@chuckha thanks for your comments!
Feedback is addressed; with regard to your general question: yes, you are right, the proposed solution reuses a lot of what already exists in kubeadm init and kubeadm join, but this is intentional.

What we are getting on top is that we simplify what the user has to do to get a multi-master cluster up and running, and at the same time we reduce the probability of making errors along the way. This is basically achieved by:

  • delegating to kubeadm the task of ensuring that exactly the same kubeadm-config ConfigMap is used on all masters
  • checking that the cluster/the kubeadm-config is properly configured for HA (a kind of pre-flight check on the "carefully crafted config.yaml")
  • blocking users trying to create multi-master clusters with configurations we don't want to support as a sig (e.g. HA with a self-hosted control plane)

On top of that :

  • There are also things that kubeadm join --master doesn't do with respect to kubeadm init (it doesn't create a new token, and it doesn't re-write/re-create all the RBAC rules, config maps, and addons)
  • We can provide a better/cleaner solution for steps that should be done in a slightly different way on a secondary master with respect to the bootstrap master (e.g. updating the kubeadm-config map by adding info about the new master instead of creating a new configMap from scratch; this is related to the API redesign).

Does this sound good to you?

PS. the above points are documented in the alternatives paragraphs; let me know if this should be further clarified

@neolit123
Member

BTW, during one of the meetings last week we agreed that --master should not be used, and --control-plane is fine instead.
should that be reflected in the KEP?

@chuckha
Contributor

chuckha commented Jul 13, 2018

/lgtm

Thanks for the detailed response @fabriziopandini. I think the alternatives section is good. I think you have to intimately know the code to get kubeadm init --config config.yaml to function correctly, and kubeadm join --master will be a great addition.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 13, 2018
@k8s-ci-robot k8s-ci-robot merged commit ae474ac into kubernetes:master Jul 13, 2018
@pbarker
Contributor

pbarker commented Jul 13, 2018

@fabriziopandini I believe this KEP should be number 14 and the NEXT_KEP_NUMBER should be 15, looking at https://github.com/kubernetes/community/blob/master/keps/0000-kep-template.md#title

@fabriziopandini fabriziopandini deleted the kubeadm-join-master2 branch August 1, 2018 07:48