Support more volume types in stats/summary endpoint including configm… #70172
…ap, secret, projected
Running du on atomic volume types is interesting because we keep "snapshots" of older versions around, so the reported size will always grow. What metric do we intend to report: total usage including all the snapshots, or just the size of the current version? @kubernetes/sig-storage-pr-reviews |
We should probably run some stress tests to make sure that adding these new sources of du won't exacerbate #62917. I believe the configmap max size is 1MB? 1MB really shouldn't be a problem, but a pod could have many configmaps. Do we clean up old snapshots periodically or only when a pod is deleted? |
@msau42 I am not sure which snapshots feature you mean here. Is that about https://github.com/kubernetes-incubator/external-storage/tree/master/snapshot?
|
I think one possible reason is that the "find" command is executed with no nice level set: kubernetes/pkg/volume/util/fs/fs.go Line 81 in 689df20
We can see that the similar "du" invocation in k8s has a nice level of 19 set: kubernetes/pkg/volume/util/fs/fs.go Line 61 in 689df20
And also in cadvisor, the nice level was set to 19: https://github.com/google/cadvisor/blob/c5510abcd7bf38e1db6b9bfe3bf6728139d8c3e1/fs/fs.go#L619 https://github.com/google/cadvisor/blob/c5510abcd7bf38e1db6b9bfe3bf6728139d8c3e1/fs/fs.go#L572 I will do a test to verify if my guess was correct.
|
Ah sorry, I don't mean the volume snapshot feature. What I meant was that when a configmap gets updated, for the volume mount, we make a new timestamped directory with all the contents and then just flip the symlink to point to the new directory. But I don't think we clean up the older directories until the pod is deleted. So over time, the actual configmap volume directory will keep on growing and never shrink. |
Hello, I tried many ways to observe the snapshot but couldn't. Did I miss something? Or do I need to turn on any feature gates?
|
@msau42
|
Hi @WanLinghao thanks for the thorough testing! Could you also try the following?
|
@msau42 Hello, I have tested them like this:
|
@gnufied PTAL thanks |
@gnufied friendly ping |
@gnufied PTAL thanks |
I am a bit wary of emitting ephemeral volume stats as metrics. If I understand correctly, this will cause an explosion of time series data in a large enough cluster, because most things use ephemeral storage. Also - some of those ephemeral volumes are auto-generated and have the same lifecycle as the pod. |
Not to mention - how will a user use these metrics? Say I am running out of disk space in configmap "foo": what is the user supposed to do? You can't resize a configmap. You can probably just reduce the amount of data you put in. But since this is limited to 1MB and you get a validation error when creating or updating a configmap - we should be fine. Also IIRC - most of these ephemeral volume types are read-only now, so they can't grow on the node. |
@gnufied So should we remove the metrics calculation in secret volume? kubernetes/pkg/volume/secret/secret.go Line 102 in 50e02fd
|
The strange thing is that since ephemeral volumes are on the root disk, the used capacity will be minuscule compared to the boot disk. Even though the volume is mounted read-only, I think it will grow. If you update the configmap, then we will create a new directory, update the symlink and remount. I also think we don't delete old directories, so the size will constantly grow if you update the objects frequently. |
@WanLinghao hmm, I think one thing I got wrong was - these metrics do not automatically show up in the Prometheus endpoint. I will verify tomorrow, but I am thinking we should be fine. @msau42 @WanLinghao also yeah, I think the symlink may cause the volume to grow on disk (as long as the volume is on the same node). I will take a look again, thanks. |
@gnufied what's your view now? |
@gnufied PTAL thanks |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: gnufied, WanLinghao The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/kind feature |
Support more volume types in stats/summary endpoint including configmap, secret, projected