
Ephemeral storage not available when using (docker/podman) rootless providers #3359

Open
josecastillolema opened this issue Sep 11, 2023 · 10 comments · May be fixed by #3360
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@josecastillolema

What happened:

When using kind with rootless docker/podman, pods with ephemeral-storage requests never get scheduled, e.g.:

apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  containers:
  - command:
    - sleep
    - infinity
    image: centos
    name: compute
    resources:
      requests:
        ephemeral-storage: 50M

What you expected to happen:
The pod gets properly scheduled, as it does with rootful docker/podman.

How to reproduce it (as minimally and precisely as possible):

  1. Create a kind cluster with rootless docker/podman
  2. Try to create the pod (see the command sketch below)
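
A minimal reproduction sketch, assuming the rootless podman provider and that the pod manifest above is saved as pod.yaml (the file name and the "Insufficient ephemeral-storage" wording are illustrative, not taken from the original report):

# create the cluster with the rootless provider
KIND_EXPERIMENTAL_PROVIDER=podman kind create cluster
# apply the pod requesting ephemeral storage
kubectl apply -f pod.yaml
# the pod stays Pending; kubectl describe pod test will likely show a
# scheduler event such as "Insufficient ephemeral-storage"
kubectl get pod test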

Anything else we need to know?:
With rootful docker/podman:

kubectl describe node ..
Capacity:
  cpu:                8
  ephemeral-storage:  71658616Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32629028Ki
  pods:               110
Allocatable:
  cpu:                8
  ephemeral-storage:  71658616Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32629028Ki
  pods:               110

With rootless docker/podman (no ephemeral storage):

Capacity:
  cpu:            8
  hugepages-1Gi:  0
  hugepages-2Mi:  0
  memory:         32629028Ki
  pods:           110
Allocatable:
  cpu:            8
  hugepages-1Gi:  0
  hugepages-2Mi:  0
  memory:         32629028Ki
  pods:           110
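
A quick way to compare the two setups (a sketch; kind-control-plane is the default node name and is assumed here): dump the node's reported capacity and look for the ephemeral-storage entry, which is present under rootful and absent under rootless.

kubectl get node kind-control-plane -o jsonpath='{.status.capacity}'
# rootful:  the map includes "ephemeral-storage":"71658616Ki"
# rootless: the ephemeral-storage key is missing entirely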

Environment:

  • kind version: (use kind version): v0.20.0 go1.20.7 linux/amd64
  • Runtime info: (use docker info or podman info):
    • Docker: 24.0.1
    • Podman: 4.6.1
  • OS (e.g. from /etc/os-release): Fedora Linux
josecastillolema added the kind/bug label Sep 11, 2023
@giuseppe
Member

It seems this is disabled on purpose with rootless. I've tried the following patch, and it solves the problem for me:

diff --git a/pkg/cluster/internal/kubeadm/config.go b/pkg/cluster/internal/kubeadm/config.go
index 6aa17581..0a17f64e 100644
--- a/pkg/cluster/internal/kubeadm/config.go
+++ b/pkg/cluster/internal/kubeadm/config.go
@@ -79,10 +79,6 @@ type ConfigData struct {
 	// RootlessProvider is true if kind is running with rootless mode
 	RootlessProvider bool
 
-	// DisableLocalStorageCapacityIsolation is typically set true based on RootlessProvider
-	// based on the Kubernetes version, if true kubelet localStorageCapacityIsolation is set false
-	DisableLocalStorageCapacityIsolation bool
-
 	// DerivedConfigData contains fields computed from the other fields for use
 	// in the config templates and should only be populated by calling Derive()
 	DerivedConfigData
@@ -422,7 +418,6 @@ evictionHard:
 {{ range $index, $gate := .SortedFeatureGates }}
   "{{ (StructuralData $gate.Name) }}": {{ $gate.Value }}
 {{end}}{{end}}
-{{if .DisableLocalStorageCapacityIsolation}}localStorageCapacityIsolation: false{{end}}
 {{if ne .KubeProxyMode "None"}}
 ---
 apiVersion: kubeproxy.config.k8s.io/v1alpha1
@@ -468,16 +463,6 @@ func Config(data ConfigData) (config string, err error) {
 			return "", errors.Errorf("version %q is not compatible with rootless provider (hint: kind v0.11.x may work with this version)", ver)
 		}
 		data.FeatureGates["KubeletInUserNamespace"] = true
-
-		// For avoiding err="failed to get rootfs info: failed to get device for dir \"/var/lib/kubelet\": could not find device with major: 0, minor: 41 in cached partitions map"
-		// https://github.com/kubernetes-sigs/kind/issues/2524
-		if ver.LessThan(version.MustParseSemantic("v1.25.0-alpha.3.440+0064010cddfa00")) {
-			// this feature gate was removed in v1.25 and replaced by an opt-out to disable
-			data.FeatureGates["LocalStorageCapacityIsolation"] = false
-		} else {
-			// added in v1.25 https://github.com/kubernetes/kubernetes/pull/111513
-			data.DisableLocalStorageCapacityIsolation = true
-		}
 	}
 
 	// assume the latest API version, then fallback if the k8s version is too low

not sure how many other things I break this way though 😄
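
For anyone who wants to check the effect of the patch, a hedged way to see whether localStorageCapacityIsolation ended up disabled in the running kubelet is the kubelet configz endpoint (kind-control-plane is the assumed default node name; the grep pattern is illustrative and may need adjusting to the exact JSON formatting):

kubectl get --raw "/api/v1/nodes/kind-control-plane/proxy/configz" | grep -o '"localStorageCapacityIsolation":[a-z]*'
# a rootless cluster built before this patch should report false;
# with the patch applied the kubelet default (true) takes effect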

@BenTheElder
Member

Rootless is generally not expected to have full parity.

Maybe podman or kubelet has since landed a patch that makes enabling this viable again?

At the time this simply didn't work on rootless and led to a kubelet crash on startup.

#2525 / #2524.
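
If the kubelet crash from those issues resurfaces, the "failed to get rootfs info" error mentioned in the removed workaround comment should show up in the kubelet journal inside the node container. A sketch for re-testing, assuming the rootless podman provider and the default kind-control-plane node name:

podman exec kind-control-plane journalctl -u kubelet --no-pager | grep -i "failed to get rootfs info"
# no matches (and a Ready node) suggests the kubelet now tolerates
# localStorageCapacityIsolation in a user namespace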

giuseppe linked a pull request Sep 12, 2023 that will close this issue
@giuseppe
Member

I've played a bit with it and could not spot any failure, so perhaps it is worth dropping this special handling and having one less difference from rootful mode.

I've opened a PR so we can discuss it better there: #3360

giuseppe added a commit to giuseppe/kind that referenced this issue Sep 12, 2023
it was used to workaround a kubelet crash issue with rootless
providers.

The Kubelet seems to work fine now with localStorageCapacityIsolation
enabled in a user namespace so drop the special handling.  After this
change, ephemeral storage can be used in a rootless cluster.

Closes: kubernetes-sigs#3359

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@josecastillolema
Author

Are those differences between rootful and rootless mode listed somewhere?
Thanks!

@BenTheElder
Member

They should be at https://kind.sigs.k8s.io/docs/user/rootless/, but I don't personally use rootless, and CI doesn't cover the entire surface of Kubernetes. Historically, things like this feature just haven't worked in rootless Kubernetes.

KIND is the only rootless coverage for Kubernetes CI AFAIK, but I don't think it's terribly extensive in that regard, and I'm not sure to what extent e.g. SIG Node even officially supports this versus merely permitting patches related to rootless. I don't think we have rootless node_e2e, for example.

@josecastillolema
Author

Thanks for your quick response @BenTheElder!
Correct me if I am wrong, but the restrictions listed at https://kind.sigs.k8s.io/docs/user/rootless/#restrictions correspond to docker, not to kind.
While I totally get that rootless is generally not expected to have full parity, it would be nice to list those differences (e.g. the ephemeral storage one) somewhere.

@BenTheElder
Member

BenTheElder commented Sep 12, 2023

My point is we don't know all of them, and that restrictions from docker/podman/kubernetes remain true irrespective of kind.

Off the top of my head, this would be the only currently unlisted known issue for us, and it was actually widely true for rootless Kubernetes, not just kind; it was also the case in other projects, so we didn't think to add anything.

EDIT: I agree they should be listed, and that would be the page to add them to.

@AkihiroSuda
Member

AkihiroSuda commented Sep 17, 2023

KIND is the only rootless coverage for Kubernetes CI AFAIK

minikube does too: https://github.com/kubernetes/minikube/blob/319886a38d56668e5141fa19afd7ad3ace1962d7/.github/workflows/pr.yml#L288

I don't think we have rootless node_e2e for example.

I was waiting for the upstream CI to switch to cgroup v2, now I should find my time to work on this...

@BenTheElder
Member

minikube does too: https://github.com/kubernetes/minikube/blob/319886a38d56668e5141fa19afd7ad3ace1962d7/.github/workflows/pr.yml#L288

So it's true that minikube is running Kubernetes in rootless mode in their CI (I reached out to them about this field previously ...), but minikube only supports Kubernetes releases and is not part of Kubernetes's CI. In general they're doing their own testing, independent of SIG Testing/Release/..., on tagged, built k8s releases, and they are not part of the release signal for SIG Node, Release, etc.

I was waiting for the upstream CI to switch to cgroup v2, now I should find my time to work on this...

👍

I do think we need more coverage for this. My point was just that the broader k8s project isn't tightly tracking this sort of thing at the moment, so for us to document things that may not work in rootless k8s, kind would first have to go and identify those things ourselves; at the moment there are no docs covering this for core Kubernetes or minikube.

@BenTheElder
Member

(which is also why we don't know what changed to make this feature start working xref #3360)
