
fix kubelet localStorageCapacityIsolation option #15336

Conversation

@prezha (Collaborator) commented on Nov 9, 2022

fixes: #14728
fixes: #15099 (btrfs + k8s 1.25.2 + docker)
fixes: #15050 (btrfs + k8s 1.25.0 + docker)

it might also help with #14819 (btrfs + docker, but the k8s version was not shared: the issue was reported on 20 Aug, so in theory it could be k8s v1.25.0-beta.0, but minikube v1.26.1 ships with DefaultKubernetesVersion = "v1.24.3")

i don't think i've ever had issues with btrfs + docker + k8s/minikube myself, and the official OverlayFS and Docker Performance docs also note that:

In certain circumstances, overlay2 may perform better than btrfs as well. However, ...

but i know some users reported problems earlier, so in the case of such a combination this pr will automatically:

  • set LocalStorageCapacityIsolation=false via the --feature-gates flag for k8s before v1.25.0-beta.0, and
  • set localStorageCapacityIsolation=false via the kubelet config option for k8s v1.25.0-beta.0 and newer

should they wish, users can also override this by passing, eg, --extra-config="kubelet.localStorageCapacityIsolation=true" (which is the upstream default) for k8s >= v1.25.0-beta.0 (the last example below demonstrates that case)
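
to illustrate the version-gated behaviour described above, here is a minimal Go sketch (assuming the blang/semver package; the helper name and the returned strings are illustrative only, not the actual minikube code):

package main

import (
	"fmt"

	"github.com/blang/semver/v4"
)

// First Kubernetes version where the LocalStorageCapacityIsolation feature
// gate is locked to true and can no longer be changed via --feature-gates.
var lockedSince = semver.MustParse("1.25.0-beta.0")

// kubeletLocalStorageIsolationArg is a hypothetical helper: it returns the
// kubelet setting used to turn local storage capacity isolation off on btrfs,
// choosing the mechanism based on the Kubernetes version.
func kubeletLocalStorageIsolationArg(kubernetesVersion string) (string, error) {
	v, err := semver.ParseTolerant(kubernetesVersion) // tolerates a leading "v"
	if err != nil {
		return "", fmt.Errorf("parse kubernetes version %q: %w", kubernetesVersion, err)
	}
	if v.GTE(lockedSince) {
		// v1.25.0-beta.0 and newer: the gate is locked, so set the
		// KubeletConfiguration field instead.
		return "localStorageCapacityIsolation: false", nil
	}
	// Older releases: the feature gate is still settable on the command line.
	return "--feature-gates=LocalStorageCapacityIsolation=false", nil
}

func main() {
	for _, kv := range []string{"v1.24.7", "v1.25.3"} {
		arg, err := kubeletLocalStorageIsolationArg(kv)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %s\n", kv, arg)
	}
}

the cut-off matters because, from v1.25.0-beta.0, kubelet refuses to start when the locked LocalStorageCapacityIsolation gate is passed with a non-default value, which is exactly the failure shown in the "before" log below.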

before

$ docker exec -ti docker /bin/bash
root@docker:/# journalctl -f

...
Nov 09 01:31:37 docker systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Nov 09 01:31:37 docker systemd[1]: kubelet.service: Failed with result 'exit-code'.
Nov 09 01:31:38 docker systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 3802.
Nov 09 01:31:38 docker systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Nov 09 01:31:38 docker systemd[1]: Started kubelet: The Kubernetes Node Agent.
Nov 09 01:31:38 docker kubelet[76607]: Flag --container-runtime has been deprecated, will be removed in 1.27 as the only valid value is 'remote'
Nov 09 01:31:38 docker kubelet[76607]: Flag --feature-gates has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Nov 09 01:31:38 docker kubelet[76607]: Flag --runtime-request-timeout has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Nov 09 01:31:38 docker kubelet[76607]: E1109 01:31:38.711920   76607 run.go:74] "command failed" err="failed to set feature gates from initial flags-based config: cannot set feature gate LocalStorageCapacityIsolation to false, feature is locked to true"
Nov 09 01:31:38 docker systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Nov 09 01:31:38 docker systemd[1]: kubelet.service: Failed with result 'exit-code'.
...

after

$ time minikube start --driver=docker --kubernetes-version=1.25.3 -p btrfs-docker-k8s1.25.3

πŸ˜„  [btrfs-docker-k8s1.25.3] minikube v1.28.0 on Opensuse-Tumbleweed
✨  Using the docker driver based on user configuration
❗  docker is currently using the btrfs storage driver, consider switching to overlay2 for better performance
πŸ“Œ  Using Docker driver with root privileges
πŸ‘  Starting control plane node btrfs-docker-k8s1.25.3 in cluster btrfs-docker-k8s1.25.3
🚜  Pulling base image ...
πŸ”₯  Creating docker container (CPUs=2, Memory=15900MB) ...
🐳  Preparing Kubernetes v1.25.3 on Docker 20.10.20 ...
    β–ͺ kubelet.localStorageCapacityIsolation=false
    β–ͺ Generating certificates and keys ...
    β–ͺ Booting up control plane ...
    β–ͺ Configuring RBAC rules ...
πŸ”Ž  Verifying Kubernetes components...
    β–ͺ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: storage-provisioner, default-storageclass
πŸ„  Done! kubectl is now configured to use "btrfs-docker-k8s1.25.3" cluster and "default" namespace by default

real    0m31.322s
user    0m2.170s
sys     0m1.091s

$ docker exec -ti btrfs-docker-k8s1.25.3 cat /var/lib/kubelet/config.yaml | grep -A1 "kind: KubeletConfiguration"

kind: KubeletConfiguration
localStorageCapacityIsolation: false

$ docker exec -ti btrfs-docker-k8s1.25.3 cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf | grep -i "localStorageCapacityIsolation"

(null)

$ time minikube start --driver=docker --kubernetes-version=1.24.7 -p btrfs-docker-k8s1.24.7

πŸ˜„  [btrfs-docker-k8s1.24.7] minikube v1.28.0 on Opensuse-Tumbleweed
✨  Using the docker driver based on user configuration
❗  docker is currently using the btrfs storage driver, consider switching to overlay2 for better performance
πŸ“Œ  Using Docker driver with root privileges
πŸ‘  Starting control plane node btrfs-docker-k8s1.24.7 in cluster btrfs-docker-k8s1.24.7
🚜  Pulling base image ...
πŸ”₯  Creating docker container (CPUs=2, Memory=15900MB) ...
🐳  Preparing Kubernetes v1.24.7 on Docker 20.10.20 ...
    β–ͺ Generating certificates and keys ...
    β–ͺ Booting up control plane ...
    β–ͺ Configuring RBAC rules ...
πŸ”Ž  Verifying Kubernetes components...
    β–ͺ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: storage-provisioner, default-storageclass
πŸ„  Done! kubectl is now configured to use "btrfs-docker-k8s1.24.7" cluster and "default" namespace by default

real    0m28.820s
user    0m2.220s
sys     0m1.144s

$ docker exec -ti btrfs-docker-k8s1.24.7 cat /var/lib/kubelet/config.yaml | grep -A1 "kind: KubeletConfiguration"

kind: KubeletConfiguration
logging:

$ docker exec -ti btrfs-docker-k8s1.24.7 cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf | grep -i "localStorageCapacityIsolation"

ExecStart=/var/lib/minikube/binaries/v1.24.7/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime=remote --container-runtime-endpoint=/var/run/cri-dockerd.sock --feature-gates=LocalStorageCapacityIsolation=false --hostname-override=btrfs-docker-k8s1.24.7 --image-service-endpoint=/var/run/cri-dockerd.sock --kubeconfig=/etc/kubernetes/kubelet.conf --node-ip=192.168.58.2 --runtime-request-timeout=15m

$ time minikube start --driver=docker --kubernetes-version=1.25.3 -p btrfs-docker-k8s1.25.3-ec --extra-config="kubelet.localStorageCapacityIsolation=true"

πŸ˜„  [btrfs-docker-k8s1.25.3-ec] minikube v1.28.0 on Opensuse-Tumbleweed
✨  Using the docker driver based on user configuration
❗  docker is currently using the btrfs storage driver, consider switching to overlay2 for better performance
πŸ“Œ  Using Docker driver with root privileges
πŸ‘  Starting control plane node btrfs-docker-k8s1.25.3-ec in cluster btrfs-docker-k8s1.25.3-ec
🚜  Pulling base image ...
πŸ”₯  Creating docker container (CPUs=2, Memory=15900MB) ...
🐳  Preparing Kubernetes v1.25.3 on Docker 20.10.20 ...
    β–ͺ kubelet.localStorageCapacityIsolation=true
    β–ͺ Generating certificates and keys ...
    β–ͺ Booting up control plane ...
    β–ͺ Configuring RBAC rules ...
πŸ”Ž  Verifying Kubernetes components...
    β–ͺ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: default-storageclass, storage-provisioner
πŸ„  Done! kubectl is now configured to use "btrfs-docker-k8s1.25.3-ec" cluster and "default" namespace by default

real    0m29.488s
user    0m2.293s
sys     0m1.169s

$ docker exec -ti btrfs-docker-k8s1.25.3-ec cat /var/lib/kubelet/config.yaml | grep -A1 "kind: KubeletConfiguration"

kind: KubeletConfiguration
localStorageCapacityIsolation: true

$ docker exec -ti btrfs-docker-k8s1.25.3-ec cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf | grep -i "localStorageCapacityIsolation"

(null)

$ minikube profile list

|---------------------------|-----------|---------|--------------|------|---------|---------|-------|--------|
|          Profile          | VM Driver | Runtime |      IP      | Port | Version | Status  | Nodes | Active |
|---------------------------|-----------|---------|--------------|------|---------|---------|-------|--------|
| btrfs-docker-k8s1.24.7    | docker    | docker  | 192.168.58.2 | 8443 | v1.24.7 | Running |     1 |        |
| btrfs-docker-k8s1.25.3    | docker    | docker  | 192.168.49.2 | 8443 | v1.25.3 | Running |     1 |        |
| btrfs-docker-k8s1.25.3-ec | docker    | docker  | 192.168.67.2 | 8443 | v1.25.3 | Running |     1 |        |
|---------------------------|-----------|---------|--------------|------|---------|---------|-------|--------|

@k8s-ci-robot added the cncf-cla: yes label (Indicates the PR's author has signed the CNCF CLA.) on Nov 9, 2022
@k8s-ci-robot added the approved (Indicates a PR has been approved by an approver from all required OWNERS files.) and size/M (Denotes a PR that changes 30-99 lines, ignoring generated files.) labels on Nov 9, 2022
@medyagh (Member) commented on Nov 11, 2022

/ok-to-test

@k8s-ci-robot added the ok-to-test label (Indicates a non-member PR verified by an org member that is safe to test.) on Nov 11, 2022
@minikube-pr-bot

kvm2 driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 15336) |
+----------------+----------+---------------------+
| minikube start | 54.3s    | 54.2s               |
| enable ingress | 28.1s    | 27.0s               |
+----------------+----------+---------------------+

Times for minikube start: 54.4s 54.5s 54.6s 54.1s 53.8s
Times for minikube (PR 15336) start: 55.9s 53.3s 53.3s 54.2s 54.3s

Times for minikube ingress: 29.1s 25.6s 29.6s 29.7s 26.6s
Times for minikube (PR 15336) ingress: 27.1s 28.1s 30.6s 25.2s 24.1s

docker driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 15336) |
+----------------+----------+---------------------+
| minikube start | 25.5s    | 26.9s               |
| enable ingress | 23.0s    | 21.8s               |
+----------------+----------+---------------------+

Times for minikube start: 24.5s 25.9s 25.9s 25.7s 25.6s
Times for minikube (PR 15336) start: 26.8s 26.0s 27.8s 25.3s 28.6s

Times for minikube ingress: 20.9s 25.4s 25.4s 20.9s 22.4s
Times for minikube (PR 15336) ingress: 21.5s 21.9s 21.9s 21.5s 21.9s

docker driver with containerd runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 15336) |
+----------------+----------+---------------------+
| minikube start | 22.2s    | 24.9s               |
| enable ingress | 27.3s    | 27.2s               |
+----------------+----------+---------------------+

Times for minikube start: 21.1s 21.8s 22.6s 21.2s 24.5s
Times for minikube (PR 15336) start: 36.0s 22.5s 21.8s 22.0s 22.3s

Times for minikube ingress: 27.4s 26.9s 27.5s 26.9s 27.4s
Times for minikube (PR 15336) ingress: 27.4s 26.9s 27.4s 26.9s 27.4s

@minikube-pr-bot

These are the flake rates of all failed tests.

| Environment | Failed Tests | Flake Rate (%) |
|---|---|---|
| Hyper-V_Windows | TestNetworkPlugins/group/bridge/Start (gopogh) | 17.22 |
| Hyper-V_Windows | TestNetworkPlugins/group/kubenet/Start (gopogh) | 22.52 |
| Docker_macOS | TestMultiNode/serial/RestartKeepsNodes (gopogh) | 41.67 |
| Docker_Windows | TestNetworkPlugins/group/enable-default-cni/DNS (gopogh) | 42.28 |
| Docker_Windows | TestStartStop/group/newest-cni/serial/Pause (gopogh) | 53.02 |
| Docker_Windows | TestNetworkPlugins/group/bridge/DNS (gopogh) | 64.19 |
| Docker_Linux_containerd | TestNetworkPlugins/group/enable-default-cni/DNS (gopogh) | 78.12 |
| Docker_Linux_containerd | TestNetworkPlugins/group/bridge/DNS (gopogh) | 80.00 |
| Docker_Linux | TestNetworkPlugins/group/calico/Start (gopogh) | 83.66 |
| Docker_Linux | TestNetworkPlugins/group/false/DNS (gopogh) | 84.97 |
| Hyper-V_Windows | TestNoKubernetes/serial/StartWithStopK8s (gopogh) | 98.96 |
| Docker_Linux_containerd | TestNetworkPlugins/group/calico/Start (gopogh) | 99.17 |
| Docker_Linux_containerd | TestKubernetesUpgrade (gopogh) | 100.00 |
| Docker_Linux_containerd | TestPreload (gopogh) | 100.00 |
| Docker_Linux | TestNetworkPlugins/group/kubenet/HairPin (gopogh) | 100.00 |
| Docker_macOS | TestIngressAddonLegacy/serial/ValidateIngressAddonActivation (gopogh) | 100.00 |
| Docker_macOS | TestIngressAddonLegacy/serial/ValidateIngressAddons (gopogh) | 100.00 |
| Docker_macOS | TestIngressAddonLegacy/serial/ValidateIngressDNSAddonActivation (gopogh) | 100.00 |
| Docker_macOS | TestIngressAddonLegacy/StartLegacyK8sCluster (gopogh) | 100.00 |
| Docker_macOS | TestKubernetesUpgrade (gopogh) | 100.00 |
| Docker_macOS | TestMissingContainerUpgrade (gopogh) | 100.00 |
| Docker_macOS | TestNetworkPlugins/group/kubenet/HairPin (gopogh) | 100.00 |
| Docker_macOS | TestRunningBinaryUpgrade (gopogh) | 100.00 |
| Docker_macOS | TestStartStop/group/old-k8s-version/serial/AddonExistsAfterStop (gopogh) | 100.00 |
| Docker_macOS | TestStartStop/group/old-k8s-version/serial/DeployApp (gopogh) | 100.00 |
| Docker_macOS | TestStartStop/group/old-k8s-version/serial/EnableAddonWhileActive (gopogh) | 100.00 |
| Docker_macOS | TestStartStop/group/old-k8s-version/serial/FirstStart (gopogh) | 100.00 |
| Docker_macOS | TestStartStop/group/old-k8s-version/serial/SecondStart (gopogh) | 100.00 |
| Docker_macOS | TestStartStop/group/old-k8s-version/serial/UserAppExistsAfterStop (gopogh) | 100.00 |
| Docker_macOS | TestStoppedBinaryUpgrade/Upgrade (gopogh) | 100.00 |
| More tests... | Continued... | |

Too many tests failed - See test logs for more details.

To see the flake rates of all tests by environment, click here.

@medyagh (Member) left a comment

@prezha looks good, thanks for the fix :) and good to see your contribution again

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: medyagh, prezha

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@medyagh merged commit ccb569f into kubernetes:master on Nov 14, 2022
@prezha (Collaborator, Author) commented on Nov 27, 2022

> @prezha looks good, thanks for the fix :) and good to see your contribution again

my pleasure, @medyagh ! :)
