
Local Ephemeral Storage limit not working #78865

Open
dashpole opened this issue Jun 10, 2019 · 10 comments

@dashpole
Contributor

commented Jun 10, 2019

@arunbpt7 opened an issue in kubernetes/enhancements. I am moving it here.
/kind bug
/priority important-longterm
/sig node

As discussed in #361, we are looking for a way to restrict ephemeral storage usage per pod. Ephemeral storage is shared across all pods, so container writable layers and logs keep filling up /var/lib/docker and repeatedly drive utilization of that filesystem high. We would like to cap each pod at a defined size (say 20G), so that a pod can only use 20G of ephemeral storage and has to use persistent volumes for any larger storage requirement; the remaining space on /var/lib/docker then stays available to the other pods, each likewise capped at 20G.

I have defined an ephemeral-storage request and limit in the deployment's resources (spec.hard.requests.ephemeral-storage, spec.hard.limits.ephemeral-storage) and verified that evictionHard is enabled for imagefs and nodefs on the node. But after deploying the pod, the defined ephemeral-storage limit is not enforced: when creating large files inside the container, it can still write far more than the ephemeral-storage request and limit.

evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%

containers:
- name: busybox
  image:
  resources:
    requests:
      ephemeral-storage: "500Mi"
    limits:
      ephemeral-storage: "500Mi"
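
For these limits to be enforced, the kubelet also needs the LocalStorageCapacityIsolation feature gate enabled (it defaults to on in recent releases). A minimal sketch of a check, assuming a systemd-managed kubelet and the common config path /var/lib/kubelet/config.yaml (both assumptions; adjust for your install):

# Confirm the feature gate is not explicitly disabled on the kubelet.
ps -ef | grep '[k]ubelet' | tr ' ' '\n' | grep -- '--feature-gates'
grep -i localstoragecapacityisolation /var/lib/kubelet/config.yaml 2>/dev/null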
@dashpole
Contributor Author

commented Jun 10, 2019

Can you share monitoring that shows the pod exceeding its limit for an extended period of time (a few minutes)?
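
A minimal way to capture that, assuming the pod and namespace from the reproduction below (busybox-7cc68d968c-mb47z in testns; both illustrative), is to sample the container's disk usage every 30 seconds for a few minutes:

# Sample usage of the directory being written to; runs for ~5 minutes.
for i in $(seq 1 10); do
  kubectl -n testns exec busybox-7cc68d968c-mb47z -- du -sh /var/tmp
  sleep 30
done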

@arunbpt7

commented Jun 10, 2019

containers:
- name: busybox
  image:
  resources:
    requests:
      ephemeral-storage: "500Mi"
    limits:
      ephemeral-storage: "500Mi"

State:          Running
  Started:      Mon, 10 Jun 2019 12:48:57 -0400
Ready:          True
Restart Count:  0
Limits:
  ephemeral-storage:  500Mi
Requests:
  ephemeral-storage:  500Mi
Environment:

kubectl get po busybox-7cc68d968c-mb47z -n testns
NAME                       READY   STATUS    RESTARTS   AGE
busybox-7cc68d968c-mb47z   1/1     Running   0          82m

kubectl exec -it busybox-7cc68d968c-mb47z -n testns -- bash
bash-4.2$ fallocate -l 2G /var/tmp/test2
bash-4.2$ du -sh /var/tmp/*
1.0G /var/tmp/test
2.0G /var/tmp/test2

bash-4.2$ exit

kubectl get po busybox-7cc68d968c-mb47z -n testns
NAME                       READY   STATUS    RESTARTS   AGE
busybox-7cc68d968c-mb47z   1/1     Running   0          83m
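
A follow-up worth capturing from the node itself, assuming a systemd-managed kubelet logging to journald (an assumption; use your node's kubelet log location otherwise), is whether the eviction manager logged anything about the overage:

# Look for eviction-manager / ephemeral-storage messages around the time the 2G file was created.
journalctl -u kubelet --since "30 min ago" | grep -iE 'eviction|ephemeral-storage'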

@pickledrick

commented Jun 13, 2019

I was able to reproduce this issue on a 1.13.5 cluster.

For the node under test, an EBS volume was attached to the instance and mounted as an XFS filesystem at /var/lib/docker.

The deployment with resources set had a pod scheduled to the node in question.

"Execing" into the pod and running.

fallocate -l 2G /var/tmp/test1

created a file larger than the configured ephemeral-storage limit of 500Mi. The pod was not evicted, even after waiting up to 10 minutes.

Starting again with a fresh volume and deployment.

Initially creating a 4G file within the new pod, with an underlying 5G volume mounted at /var/lib/docker,

fallocate -l 4G /var/tmp/test1

caused the imageGCManager to kick in due to the node's DiskPressure condition, rather than honouring the ephemeral-storage limit and evicting that one pod first:

Jun 13 02:47:35 ip-10-0-2-15 kubelet[27133]: W0613 02:47:35.771392 27133 eviction_manager.go:333] eviction manager: attempting to reclaim ephemeral-storage
Jun 13 02:47:35 ip-10-0-2-15 kubelet[27133]: I0613 02:47:35.771424 27133 container_gc.go:85] attempting to delete unused containers
Jun 13 02:47:35 ip-10-0-2-15 kubelet[27133]: I0613 02:47:35.782369 27133 image_gc_manager.go:317] attempting to delete unused images
Jun 13 02:47:35 ip-10-0-2-15 kubelet[27133]: I0613 02:47:35.794272 27133 eviction_manager.go:344] eviction manager: must evict pod(s) to reclaim ephemeral-storage
Jun 13 02:47:35 ip-10-0-2-15 kubelet[27133]: I0613 02:47:35.794493 27133 eviction_manager.go:362] eviction manager: pods ranked for eviction: debug-887cd4775-2brw9_test(d9f7c5ee-8d84-11e9-b987-02f54a20dc4c), canal-js4tm_kube-system(53c05e18-8d66-11e9-b987-02f54a20dc4c), debug-887cd4775-fwvhq_test(a7ac5932-8d84-11e9-b987-02f54a20dc4c), debug-887cd4775-r9zcc_test(0695b54b-8d85-11e9-b987-02f54a20dc4c), debug-887cd4775-l6kxx_test(d18c07b5-8d84-11e9-b987-02f54a20dc4c), debug-887cd4775-rvv

The eviction ranking, however, was correct: debug-887cd4775-2brw9 was the pod used to create the large file.
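
One way to tell the two mechanisms apart while reproducing this is to watch the node's DiskPressure condition: if it stays False while the pod is over its limit, any eviction would have to come from the per-pod ephemeral-storage limit rather than node-pressure reclaim. A sketch, with the hostname from the logs standing in for the node name (an assumption):

# Check whether the node is under DiskPressure (node name is illustrative).
kubectl get node ip-10-0-2-15 -o jsonpath='{.status.conditions[?(@.type=="DiskPressure")].status}{"\n"}'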

@msau42
Member

commented Jun 13, 2019

@jingxu97
Contributor

commented Jun 13, 2019

@pickledrick @arunbpt7 Could you please share your pod yaml file? You can also email me jinxu at google.com if you prefer. Thanks!

@arunbpt7

commented Jun 13, 2019

@jingxu97

apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      securityContext:
        runAsUser: 99
        fsGroup: 99
      containers:
      - name: busybox
        image:
        resources:
          requests:
            ephemeral-storage: "500Mi"
          limits:
            ephemeral-storage: "500Mi"

@jingxu97
Contributor

commented Jun 13, 2019

@arunbpt7 did you miss some part of the yaml file?

@pickledrick

commented Jun 13, 2019

@jingxu97

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: debug
  name: debug
spec:
  selector:
    matchLabels:
      app: debug
  template:
    metadata:
      labels:
        app: debug
    spec:
      containers:
      - image: quay.io/pickledrick/debug
        imagePullPolicy: Always
        name: debug
        resources:
          limits:
            ephemeral-storage: 500Mi
          requests:
            ephemeral-storage: 500Mi

@dashpole
Contributor Author

commented Jun 13, 2019

@arunbpt7 can you query the summary API (localhost:10255/stats/summary) from the node that the pod is running on, to make sure it is measuring disk usage correctly?
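
A sketch of that query, under a few assumptions: the kubelet read-only port 10255 is enabled (many clusters disable it, in which case the authenticated port 10250 or the API-server node proxy is needed), jq is installed, and the ephemeral-storage/usedBytes field names match the summary API as I recall them:

# On the node, pull the pod's ephemeral-storage stats from the kubelet summary API.
curl -s http://localhost:10255/stats/summary \
  | jq '.pods[] | select(.podRef.name | startswith("busybox")) | {pod: .podRef.name, ephemeral: .["ephemeral-storage"]}'

# Alternative from a workstation, via the API server's node proxy (node name elided):
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary"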

@dashpole
Contributor Author

commented Jun 14, 2019

Our tests for this are not super consistent (https://k8s-testgrid.appspot.com/sig-node-kubelet#node-kubelet-serial&include-filter-by-regex=LocalStorageCapacityIsolationEviction), but they are mostly green. I'll try bumping the timeout on the serial tests to see if we can get a clearer signal.
