kubeadm: Write kubelet config file to disk and persist in-cluster #63887

Merged
merged 2 commits into kubernetes:master from luxas:kubeadm_kubelet_integration May 23, 2018

Conversation

@luxas
Member

luxas commented May 15, 2018

What this PR does / why we need it:
In order to make configuration flow from the cluster level to the node level, we need a way for kubeadm to tell the kubelet what config to use. As of v1.10 (I think) the kubelet can read its configuration from a file via --config, using the kubelet's Beta ComponentConfiguration API, so now we have a proper interface for talking to the kubelet.

This PR:

  • Writes the kubelet ComponentConfig to /var/lib/kubelet/config.yaml on init and join (see the sketch after this list)
  • Writes an environment file, /var/lib/kubelet/kubeadm-flags.env, to be sourced in the kubelet systemd dropin. This file contains runtime flags that should be passed to the kubelet.
  • Uploads a ConfigMap with the name kubelet-config-1.X
  • Patches the Node object so that it starts using the ConfigMap via Dynamic Kubelet Configuration, but only if that feature gate is set (currently alpha and off by default, not intended to be switched on in v1.11)
  • Updates the phase commands to reflect this new flow
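
For illustration, a minimal kubelet ComponentConfig of the kind written to /var/lib/kubelet/config.yaml could look like the following. This is only a sketch: the apiVersion/kind reflect the real Beta API group, but the field values are hypothetical and not taken from this PR.

# /var/lib/kubelet/config.yaml (illustrative sketch, not the exact output of this PR)
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
staticPodPath: /etc/kubernetes/manifests
authentication:
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
---

Once uploaded, the in-cluster copy can be inspected with kubectl -n kube-system get configmap kubelet-config-1.X -o yaml (the suffix tracks the cluster's minor version).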

The kubelet dropin file I used now looks like this:

# v1.11.x dropin as-is at HEAD
# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
---
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# Should default to 0 in v1.11: https://github.com/kubernetes/kubernetes/pull/63881, and hence not be here in the real v1.11 manifest
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
# Should be configurable via the config file: https://github.com/kubernetes/kubernetes/issues/63878, and hence be configured using the file in v1.11
Environment="KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS
---
# v1.11.x dropin end goal
# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
---
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
---
# Environment file dynamically created at runtime by "kubeadm init"
# /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS=--cni-bin-dir=/opt/cni/bin --cni-conf-dir=/etc/cni/net.d --network-plugin=cni

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes kubernetes/kubeadm#822
Fixes kubernetes/kubeadm#571

Special notes for your reviewer:

Release note:

"kubeadm init" now writes a structured and versioned kubelet ComponentConfiguration file to `/var/lib/kubelet/config.yaml` and an environment file with runtime flags (you can source this file in the systemd kubelet dropin) to `/var/lib/kubelet/kubeadm-flags.env`.

@kubernetes/sig-cluster-lifecycle-pr-reviews @mtaufen

@luxas luxas changed the title from "kubeadm: Write kubelet config file to disk and persist in-cluster" to "[WIP] kubeadm: Write kubelet config file to disk and persist in-cluster" May 15, 2018

@k8s-ci-robot k8s-ci-robot requested review from detiber and stealthybox May 15, 2018

@timothysc
Member

timothysc left a comment

So in general this looks like an overall cleanup improvement, but upgrading from an old unit file seems fraught with peril.

/cc @detiber

return fmt.Errorf("error creating base kubelet configuration: %v", err)
}

// NOTE: flag "--dynamic-config-dir" should be specified in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

@timothysc
Member

timothysc May 16, 2018

Because the kubelet will write to this location, it should be /var/lib/kubelet ...

@timothysc
Member

timothysc May 16, 2018

How are you going to upgrade into this? You will need a systemctl daemon-reload + restart.
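
For reference, the sequence being discussed here would be something like the following (a sketch of the manual steps, not something this PR automates):

# After the dropin or env file has changed on disk:
systemctl daemon-reload
systemctl restart kubelet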


// DownloadKubeletConfig downloads the kubelet configuration from a ConfigMap and writes it to disk.
// Used at "kubeadm join" time
func DownloadKubeletConfig(kubeletKubeConfig string) error {

@timothysc
Member

timothysc May 16, 2018

Shouldn't we have a download-config subcommand as well then, for debugging?

// KubeletBaseConfigurationFile specifies the file name on the node which stores initial remote configuration of kubelet
KubeletBaseConfigurationFile = "kubelet"
// KubeletConfigurationFile specifies the full path on the node to the kubelet config file written by kubeadm
KubeletConfigurationFile = "/var/lib/kubelet/config.yaml"

@detiber
Member

detiber May 16, 2018

If I am understanding the docs for Dynamic Kubelet Config, it looks like the dynamic config is downloaded from the config map to the directory specified by --dynamic-config-dir automatically and the config pulled down overrides the values that are set through --config. With that being the case, shouldn't we write the static config file (or initial config in the case of dynamic config) to /etc/kubernetes and leave /var/lib/kubelet for the content that is dynamically managed at runtime by the kubelet?
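
For context, the kubelet side of Dynamic Kubelet Configuration is driven by a feature gate and a checkpoint directory, roughly as below (the flag and gate names are real for this era of the kubelet; the directory path is only illustrative):

# Kubelet flags involved in Dynamic Kubelet Configuration
--feature-gates=DynamicKubeletConfig=true
--dynamic-config-dir=/var/lib/kubelet/dynamic-config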

@detiber
Member

detiber commented May 16, 2018

The kubelet dropin file I used now looks like this:

# v1.11.x dropin
# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
---
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
EnvironmentFile=/etc/kubernetes/kubelet-cri.env
# Should default to 0 in v1.11: https://github.com/kubernetes/kubernetes/pull/63881, and hence not be here in the real v1.11 manifest
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
# Should be configurable via the config file: https://github.com/kubernetes/kubernetes/issues/63878, and hence be configured using the file in v1.11
Environment="KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_CONTAINER_RUNTIME_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_RESOLVER_ARGS $KUBELET_EXTRA_ARGS
---
# /etc/kubernetes/kubelet-cri.env
# In this file, you can easily add support for other CRI runtimes and specify flags needed for that CRI
# By default these arguments are here as they are specific to the docker runtime
KUBELET_CONTAINER_RUNTIME_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin

The addition of an environment file is great here, but if we go forward with this it would probably be a good idea to stick with the standard env file paths for the OSes we ship packages for. I know for CentOS/Fedora this would be /etc/sysconfig/kubelet, and for Ubuntu /etc/default/kubelet.

An aside while I'm on the topic of systemd unit configuration managed by the packaging: we really shouldn't be placing package-managed unit files and drop-in files in /etc/systemd/system; instead they should be placed in /usr/lib/systemd/system.

I'm still concerned with how we handle this transition during upgrades, since there is a bit of a chicken/egg problem between the package-managed dropin and the configuration done by kubeadm upgrade. If we force the update with the package upgrade of kubeadm, then we risk breaking the running kubelet prior to install. However, we could transition ownership of the drop-in file to kubeadm, which would allow kubeadm itself to mutate the drop-in, perform a systemctl daemon-reload, and then restart the kubelet. This would also give us a way to back up the drop-in and revert it if needed.
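
A sketch of that kubeadm-owned mutation flow (the backup file name is hypothetical; systemd ignores non-.conf files in a dropin directory, so the backup is inert):

# Back up the current dropin before kubeadm rewrites it
cp /etc/systemd/system/kubelet.service.d/10-kubeadm.conf \
   /etc/systemd/system/kubelet.service.d/10-kubeadm.conf.bak
# ... kubeadm writes the new dropin here ...
systemctl daemon-reload
systemctl restart kubelet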

@randomvariable
Member

randomvariable commented May 16, 2018

Agree that we should follow the distribution standards for environment file placement, and use /usr/lib/systemd over /etc/systemd/system.

There's a potential issue at present: the docs ask people to modify 10-kubeadm.conf to set the cgroup driver, and yet we haven't marked these files as conffiles in Bazel, so package updates will blow the user's changes away and would probably break their system if the cgroup driver is non-default.

Moving management of that particular config to something kubeadm controls so that the drop-ins are truly static makes more sense.

@neolit123
Member

neolit123 commented May 16, 2018

@randomvariable

> There's a potential issue at present: the docs ask people to modify 10-kubeadm.conf to set the cgroup driver, and yet we haven't marked these files as conffiles in Bazel, so package updates will blow the user's changes away and would probably break their system if the cgroup driver is non-default.

The kubelet config already modified by the user (e.g. with the cgroup driver updated) is a problem. I think the documented and suggested workflow for the kubelet should have been to never modify the kubelet config directly, and to handle extra arguments via the environment only. If the same location is written to, a backup of the user config should be created every time, e.g. cp 10-kubeadm.conf 10-kubeadm.conf.user.

@detiber

> I'm still concerned with how we handle this transition during upgrades, since there is a bit of a chicken/egg problem between the package-managed dropin and the configuration done by kubeadm upgrade. If we force the update with the package upgrade of kubeadm, then we risk breaking the running kubelet prior to install. However, we could transition ownership of the drop-in file to kubeadm, which would allow kubeadm itself to mutate the drop-in, perform a systemctl daemon-reload, and then restart the kubelet. This would also give us a way to back up the drop-in and revert it if needed.

Do you have something in mind for how the mutator would work? If the user has modified their kubelet config a lot, doing diff -u <old> <new> > patchfile; patch < patchfile might not be sufficient. Perhaps a line-by-line parser / comparator of sorts is needed here?

@detiber
Member

detiber commented May 16, 2018

One option that we have is to have the upgrade scripts in packaging preserve the existing file, create the new version of the file under /usr/lib/systemd, and require the user to reconcile the differences. This is a more manual step, but would get us to the place that we want to be... We could also tell users that if they haven't modified the file, it is safe to just blow it away.

As for documentation on how to modify the unit config, we could tell users to just use systemctl edit kubelet which will open an editor and create a drop-in file under /etc/systemd/system/kubelet.service.d for the user.
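
For example, systemctl edit kubelet creates (and opens in an editor) an override file like the following, which package upgrades leave alone (the cgroup-driver value shown is just an example of a user customization):

# /etc/systemd/system/kubelet.service.d/override.conf (created by systemctl edit)
[Service]
Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=systemd"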

@neolit123
Member

neolit123 commented May 16, 2018

Going back to this:

> An aside while I'm on the topic of systemd unit configuration managed by the packaging: we really shouldn't be placing package-managed unit files and drop-in files in /etc/systemd/system; instead they should be placed in /usr/lib/systemd/system.

This is so true. /usr/lib is the de facto vendor-specific location for packaged software.

> One option that we have is to have the upgrade scripts in packaging preserve the existing file, create the new version of the file under /usr/lib/systemd, and require the user to reconcile the differences. This is a more manual step, but would get us to the place that we want to be... We could also tell users that if they haven't modified the file, it is safe to just blow it away.

I think the files should be placed in the same folder, with a suffix like .user or .old appended to the old one. Having the old / new files in different locations is even more confusing.

> As for documentation on how to modify the unit config, we could tell users to just use systemctl edit kubelet which will open an editor and create a drop-in file under /etc/systemd/system/kubelet.service.d for the user.

This seems like a good solution.

In terms of the docs and website, I'm seeing a bad trend where a lot of the reports are about the cgroup driver issue. Ideally the user should never edit system service configuration files as part of the UX... ever. Yet I don't see how that is possible unless things like --cgroup-driver=auto become the default option.

@detiber
Member

detiber commented May 16, 2018

> One option that we have is to have the upgrade scripts in packaging preserve the existing file, create the new version of the file under /usr/lib/systemd, and require the user to reconcile the differences. This is a more manual step, but would get us to the place that we want to be... We could also tell users that if they haven't modified the file, it is safe to just blow it away.

> I think the files should be placed in the same folder, with a suffix like .user or .old appended to the old one. Having the old / new files in different locations is even more confusing.

I do agree that it is confusing, but it would ensure that the existing drop-in overrides the package-provided one... We could always attempt a diff against the known previous version and only leave the diff as a drop-in under /etc/systemd/system/kubelet.service.d. We could also provide a preflight check to force the user to take some action prior to upgrade.

I think anything we do for the transition is going to be awkward, but as long as we provide sufficient documentation and tooling to assist the user through the transition then we can limit the impact.
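
A rough sketch of that diff-based reconciliation (the paths and file names here are hypothetical, and the resulting diff would still need to be rewritten into a valid drop-in, since systemd only reads .conf files):

# Capture only the user's local changes relative to the shipped dropin
diff -u /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf \
        /etc/systemd/system/kubelet.service.d/10-kubeadm.conf \
        > /tmp/10-kubeadm-user-changes.diff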

@luxas luxas force-pushed the luxas:kubeadm_kubelet_integration branch 2 times, most recently from 948c497 to 0ba1430 May 17, 2018

@fejta-bot

fejta-bot commented May 21, 2018

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

2 similar comments

kubeadm: Write kubelet config file to disk and persist in-cluster. Also write runtime environment file and fixup the kubelet phases command

@luxas luxas force-pushed the luxas:kubeadm_kubelet_integration branch from 916cd0a to 5382cf7 May 22, 2018

@k8s-ci-robot
Contributor

k8s-ci-robot commented May 22, 2018

New changes are detected. LGTM label has been removed.

@k8s-ci-robot k8s-ci-robot removed the lgtm label May 22, 2018

@k8s-ci-robot
Contributor

k8s-ci-robot commented May 22, 2018

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by: luxas, timothysc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@luxas luxas force-pushed the luxas:kubeadm_kubelet_integration branch from 5382cf7 to 60b0eeb May 22, 2018

@luxas luxas added the lgtm label May 22, 2018

@luxas
Member Author

luxas commented May 22, 2018

/retest

2 similar comments

@luxas
Member Author

luxas commented May 22, 2018

/retest

@k8s-github-robot
Contributor

k8s-github-robot commented May 23, 2018

Automatic merge from submit-queue (batch tested with PRs 63914, 63887, 64116, 64026, 62933). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 773ced1 into kubernetes:master May 23, 2018

18 checks passed

Submit Queue: Queued to run github e2e tests a second time.
cla/linuxfoundation: luxas authorized
pull-kubernetes-bazel-build: Job succeeded.
pull-kubernetes-bazel-test: Job succeeded.
pull-kubernetes-cross: Skipped
pull-kubernetes-e2e-gce: Job succeeded.
pull-kubernetes-e2e-gce-100-performance: Job succeeded.
pull-kubernetes-e2e-gce-device-plugin-gpu: Job succeeded.
pull-kubernetes-e2e-gke: Skipped
pull-kubernetes-e2e-kops-aws: Job succeeded.
pull-kubernetes-integration: Job succeeded.
pull-kubernetes-kubemark-e2e-gce: Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big: Job succeeded.
pull-kubernetes-local-e2e: Skipped
pull-kubernetes-local-e2e-containerized: Skipped
pull-kubernetes-node-e2e: Job succeeded.
pull-kubernetes-typecheck: Job succeeded.
pull-kubernetes-verify: Job succeeded.

krzyzacy pushed a commit to krzyzacy/kubernetes that referenced this pull request May 30, 2018

Kubernetes Submit Queue
Merge pull request kubernetes#64210 from luxas/kubeadm_kubelet_extraargs
Automatic merge from submit-queue (batch tested with PRs 64322, 64210, 64458, 64232, 64370). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

kubeadm: Move .NodeName and .CRISocket to a common sub-struct

**What this PR does / why we need it**:
Regroups some common fields for `kubeadm init` and `kubeadm join` only used for the initial node registration.
Lets the user specify ExtraArgs to the kubelet.
Now also runs the dynamic env file creation for `kubeadm join`.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes kubernetes/kubeadm#847
Follows-up kubernetes#63887
Related to kubernetes/kubeadm#822

**Special notes for your reviewer**: WIP, but please review so we can finalize the direction of the PR

**Release note**:

```release-note
[action required] `.NodeName` and `.CRISocket` in the `MasterConfiguration` and `NodeConfiguration` v1alpha1 API objects are now `.NodeRegistration.Name` and `.NodeRegistration.CRISocket` respectively in the v1alpha2 API. The `.NoTaintMaster` field has been removed in the v1alpha2 API.
```
@kubernetes/sig-cluster-lifecycle-pr-reviews @liztio

@randomvariable randomvariable referenced this pull request Sep 21, 2018

Closed

REQUEST: New membership for randomvariable #105

6 of 6 tasks complete