Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kubeadm: Write kubelet config file to disk and persist in-cluster #63887

Merged
merged 2 commits into from
May 23, 2018

Conversation

luxas
Copy link
Member

@luxas luxas commented May 15, 2018

What this PR does / why we need it:
In order to make configuration flow from the cluster level to node level, we need a way for kubeadm to tell the kubelet what config to use. As of v1.10 (I think) the kubelet can read --config using the kubelet Beta ComponentConfiguration API, so now we have an interface to talk to the kubelet properly.

This PR:

  • Writes the kubelet ComponentConfig to /var/lib/kubelet/config.yaml on init and join
  • Writes an environment file to source in the kubelet systemd dropin /var/lib/kubelet/kubeadm-flags.env. This file contain runtime flags that should be passed to the kubelet.
  • Uploads a ConfigMap with the name kubelet-config-1.X
  • Patches the node object so that it starts using the ConfigMap with updates using Dynamic Kubelet Configuration, only if the feature gate is set (currently alpha and off by default, not intended to be switched on in v1.11)
  • Updates the phase commands to reflect this new flow

The kubelet dropin file I used now looks like this:

# v1.11.x dropin as-is at HEAD
# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
---
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
EnvironmentFile-=/var/lib/kubelet/kubeadm-flags.env
# Should default to 0 in v1.11: https://github.com/kubernetes/kubernetes/pull/63881, and hence not be here in the real v1.11 manifest
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
# Should be configurable via the config file: https://github.com/kubernetes/kubernetes/issues/63878, and hence be configured using the file in v1.11
Environment="KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS
---
# v1.11.x dropin end goal
# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
---
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
EnvironmentFile-=/var/lib/kubelet/kubeadm-flags.env
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
---
# Environment file dynamically created at runtime by "kubeadm init"
# /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS=--cni-bin-dir=/opt/cni/bin --cni-conf-dir=/etc/cni/net.d --network-plugin=cni

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes kubernetes/kubeadm#822
Fixes kubernetes/kubeadm#571

Special notes for your reviewer:

Release note:

"kubeadm init" now writes a structured and versioned kubelet ComponentConfiguration file to `/var/lib/kubelet/config.yaml` and an environment file with runtime flags (you can source this file in the systemd kubelet dropin) to `/var/lib/kubelet/kubeadm-flags.env`.

@kubernetes/sig-cluster-lifecycle-pr-reviews @mtaufen

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels May 15, 2018
@luxas luxas changed the title kubeadm: Write kubelet config file to disk and persist in-cluster [WIP] kubeadm: Write kubelet config file to disk and persist in-cluster May 15, 2018
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 15, 2018
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubeadm labels May 15, 2018
Copy link
Member

@timothysc timothysc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in general this looks like an overall cleanup improvement.. But upgrading from an old unit file seems fraught with peril

/cc @detiber

return fmt.Errorf("error creating base kubelet configuration: %v", err)
}

// NOTE: flag "--dynamic-config-dir" should be specified in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

b/c the kubelet will write to this location it should be /var/lib/kubelet ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are you going to upgrade into this? You will need a systemctl daemon-reload + restart.


// DownloadKubeletConfig downloads the kubelet configuration from a ConfigMap and writes it to disk.
// Used at "kubeadm join" time
func DownloadKubeletConfig(kubeletKubeConfig string) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we have a download-config subcommand as well then for debugging?

// KubeletBaseConfigurationFile specifies the file name on the node which stores initial remote configuration of kubelet
KubeletBaseConfigurationFile = "kubelet"
// KubeletConfigurationFile specifies the file name on the node which stores initial remote configuration of kubelet
KubeletConfigurationFile = "/var/lib/kubelet/config.yaml"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am understanding the docs for Dynamic Kubelet Config, it looks like the dynamic config is downloaded from the config map to the directory specified by --dynamic-config-dir automatically and the config pulled down overrides the values that are set through --config. With that being the case, shouldn't we write the static config file (or initial config in the case of dynamic config) to /etc/kubernetes and leave /var/lib/kubelet for the content that is dynamically managed at runtime by the kubelet?

@detiber
Copy link
Member

detiber commented May 16, 2018

The kubelet dropin file I used now looks like this:

# v1.11.x dropin
# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
---
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
EnvironmentFile=/etc/kubernetes/kubelet-cri.env
# Should default to 0 in v1.11: https://github.com/kubernetes/kubernetes/pull/63881, and hence not be here in the real v1.11 manifest
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
# Should be configurable via the config file: https://github.com/kubernetes/kubernetes/issues/63878, and hence be configured using the file in v1.11
Environment="KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_CONTAINER_RUNTIME_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_RESOLVER_ARGS $KUBELET_EXTRA_ARGS
---
# /etc/kubernetes/kubelet-cri.env
# In this file, you can easily add support for other CRI runtimes and specify flags needed for that CRI
# By default these arguments are here as they are specific to the docker runtime
KUBELET_CONTAINER_RUNTIME_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin

The addition of an environment file is great here, but if we go forward with this it would probably be a good idea to stick with standard env file paths for the OSs we ship packages for. I know for CentOS/Fedora this would be /etc/sysconfig/kubelet and /etc/default/kubelet for Ubuntu

An aside while I'm on the topic of systemd unit configuration managed by the packaging, we really shouldn't be placing packaged managed unit files and drop-in files in /etc/systemd/system, instead they should be placed in /usr/lib/systemd/system

I'm still concerned with how we handle this transition during upgrades, since there is a bit of chicken/egg involved between the package managed dropin and the configuration done by kubeadm upgrade. If we force the update with the package upgrade of kubeadm, then we risk breaking the running kubelet prior to install. However, we could transition the ownership of the drop-in file to kubeadm, which would allow for kubeadm itself to mutate the drop-in, perform a systemctl daemon-reload and then restart the kubelet. This would also give us a way to backup the drop-in and revert it if needed as well.

@randomvariable
Copy link
Member

Agree that we should follow the distribution standards for placement for environment files and using /usr/lib/systemd over /etc/systemd/system

There's a potential issue at present in that the docs ask people to modify 10-kubeadm.conf to set the cgroup driver, and yet we haven't also marked these files as conffiles in Bazel, so package updates will blow the user changes away and would probably break their system if the cgroup driver is non-default?

Moving management of that particular config to something kubeadm controls so that the drop-ins are truly static makes more sense.

@neolit123
Copy link
Member

@randomvariable

There's a potential issue at present in that the docs ask people to modify 10-kubeadm.conf to set the cgroup driver, and yet we haven't also marked these files as conffiles in Bazel, so package updates will blow the user changes away and would probably break their system if the cgroup driver is non-default?

the already modified kubelet config by the user (e.g. cgroup driver updated) is a problem. i think the documentation and the suggested workflow for the kubelet would have been to never modify the kublet config but handle extra arguments from the environment only. if the same location is written to, a backup of the user config should be created - e.g. $cp 10-kubeadm.conf 10-kubeadm.conf.user every time.

@detiber

I'm still concerned with how we handle this transition during upgrades, since there is a bit of chicken/egg involved between the package managed dropin and the configuration done by kubeadm upgrade. If we force the update with the package upgrade of kubeadm, then we risk breaking the running kubelet prior to install. However, we could transition the ownership of the drop-in file to kubeadm, which would allow for kubeadm itself to mutate the drop-in, perform a systemctl daemon-reload and then restart the kubelet. This would also give us a way to backup the drop-in and revert it if needed as well.

do you have something in mind on how the mutator would work? if the user has modified his kubelet config a lot, doing a diff -u <old> <new> > patchfile; patch < patchfile might not be sufficient. perhaps a line by line parser / comparator of sorts is needed here?

@detiber
Copy link
Member

detiber commented May 16, 2018

One option that we have is to have the upgrade scripts in packaging preserve the existing file and create the new version of the file under /usr/lib/systemd and require the user to reconcile the differences. This is a more manual step, but would get us to the place that we want to be... We could also tell users that if they haven't modified the file that it is safe to just blow it away.

As for documentation on how to modify the unit config, we could tell users to just use systemctl edit kubelet which will open an editor and create a drop-in file under /etc/systemd/system/kubelet.service.d for the user.

@neolit123
Copy link
Member

neolit123 commented May 16, 2018

going back to this:

An aside while I'm on the topic of systemd unit configuration managed by the packaging, we really shouldn't be placing packaged managed unit files and drop-in files in /etc/systemd/system, instead they should be placed in /usr/lib/systemd/system

this is so true. usr/lib is the de-facto vendor specific location for packaged software.

One option that we have is to have the upgrade scripts in packaging preserve the existing file and create the new version of the file under /usr/lib/systemd and require the user to reconcile the differences. This is a more manual step, but would get us to the place that we want to be... We could also tell users that if they haven't modified the file that it is safe to just blow it away.

i think the files should be placed in the same folder and for the old one a suffix should be appended, like .user or .old . having the old / new in different locations is even more confusing.

As for documentation on how to modify the unit config, we could tell users to just use systemctl edit kubelet which will open an editor and create a drop-in file under /etc/systemd/system/kubelet.service.d for the user.

this seems like a good solution.

so in terms of the docs and website i'm seeing a bad trend where a lot of the reports are about the cgroup driver issue. ideally the user should never edit system service configuration files as part of the UX...ever.
yet i don't see this being possible unless things like --cgroup-driver=auto are the default option.

@detiber
Copy link
Member

detiber commented May 16, 2018

One option that we have is to have the upgrade scripts in packaging preserve the existing file and create the new version of the file under /usr/lib/systemd and require the user to reconcile the differences. This is a more manual step, but would get us to the place that we want to be... We could also tell users that if they haven't modified the file that it is safe to just blow it away.

i think the files should be placed in the same folder and for the old one a suffix should be appended, like .user or .old . having the old / new in different locations is even more confusing.

I do agree that it is confusing, but it would ensure that the existing drop-in would override the package provided one... We could always attempt to do a diff against the known previous version and only leave the diff as a drop-in under /etc/systemd/system/kubelet.service.d. We could also provide a preflight check to force the user to take some action prior to upgrade as well.

I think anything we do for the transition is going to be awkward, but as long as we provide sufficient documentation and tooling to assist the user through the transition then we can limit the impact.

@luxas luxas force-pushed the kubeadm_kubelet_integration branch 2 times, most recently from 948c497 to 0ba1430 Compare May 17, 2018 16:59
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels May 17, 2018
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

3 similar comments
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

…so write runtime environment file and fixup the kubelet phases command
@luxas luxas force-pushed the kubeadm_kubelet_integration branch from 916cd0a to 5382cf7 Compare May 22, 2018 06:17
@k8s-ci-robot
Copy link
Contributor

New changes are detected. LGTM label has been removed.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 22, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by: luxas, timothysc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@luxas luxas force-pushed the kubeadm_kubelet_integration branch from 5382cf7 to 60b0eeb Compare May 22, 2018 06:31
@luxas luxas added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 22, 2018
@luxas
Copy link
Member Author

luxas commented May 22, 2018

/retest

2 similar comments
@luxas
Copy link
Member Author

luxas commented May 22, 2018

/retest

@luxas
Copy link
Member Author

luxas commented May 22, 2018

/retest

@luxas
Copy link
Member Author

luxas commented May 22, 2018

/retest

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 63914, 63887, 64116, 64026, 62933). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 773ced1 into kubernetes:master May 23, 2018
krzyzacy pushed a commit to krzyzacy/kubernetes that referenced this pull request May 30, 2018
Automatic merge from submit-queue (batch tested with PRs 64322, 64210, 64458, 64232, 64370). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubeadm: Move .NodeName and .CRISocket to a common sub-struct

**What this PR does / why we need it**:
Regroups some common fields for `kubeadm init` and `kubeadm join` only used for the initial node registration.
Lets the user specify ExtraArgs to the kubelet.
Now also runs the dynamic env file creation for `kubeadm join`.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes kubernetes/kubeadm#847
Follows-up kubernetes#63887
Related to kubernetes/kubeadm#822

**Special notes for your reviewer**: WIP, but please review so we can finalize the direction of the PR

**Release note**:

```release-note
[action required] `.NodeName` and `.CRISocket` in the `MasterConfiguration` and `NodeConfiguration` v1alpha1 API objects are now `.NodeRegistration.Name` and `.NodeRegistration.CRISocket` respectively in the v1alpha2 API. The `.NoTaintMaster` field has been removed in the v1alpha2 API.
```
@kubernetes/sig-cluster-lifecycle-pr-reviews @liztio
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubeadm cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make kubeadm start controlling kubelet Config via a versioned file Graduate kubelet config to beta
8 participants