
Support more volume types in stats/summary endpoint including configmap, secret, projected #70172

Merged

Conversation

WanLinghao
Contributor

Support more volume types in stats/summary endpoint including configmap, secret, projected

/kind bug

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 24, 2018
@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 24, 2018
@msau42
Member

msau42 commented Oct 25, 2018

Running du on atomic volume types is interesting because we keep "snapshots" of older versions around, so the reported usage will always grow.

What should the reported metric be? Total usage including all the snapshots, or just the size of the current version?

@kubernetes/sig-storage-pr-reviews
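
For concreteness, here is a minimal sketch (an illustration, not kubelet code) of the two candidate metrics, assuming the layout the kubelet's atomic writer produces — timestamped payload directories plus a "..data" symlink selecting the current one; the volume path is a placeholder:

```go
// Compare "total usage including all snapshots" against "size of the
// current version only" for an atomically-updated volume. Sketch only;
// the path and the "..data" layout assumption are not from this PR.
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// treeBytes sums the sizes of regular files beneath root.
func treeBytes(root string) (int64, error) {
	var total int64
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if info, err := d.Info(); err == nil && info.Mode().IsRegular() {
			total += info.Size()
		}
		return nil
	})
	return total, err
}

func main() {
	volRoot := "/var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~configmap/cfg" // placeholder

	// Candidate 1: total usage, including any retained older versions.
	total, _ := treeBytes(volRoot)

	// Candidate 2: only the version the ..data symlink currently points to.
	current, _ := os.Readlink(filepath.Join(volRoot, "..data"))
	live, _ := treeBytes(filepath.Join(volRoot, current))

	fmt.Printf("total=%dB current=%dB\n", total, live)
}
```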

@msau42
Member

msau42 commented Oct 25, 2018

We should probably run some stress tests to make sure that adding these new sources of du won't exacerbate #62917. I believe the configmap max size is 1MB? 1MB really shouldn't be a problem, but a pod could have many configmaps. Do we clean up old snapshots periodically, or only when a pod is deleted?

@WanLinghao
Contributor Author

WanLinghao commented Oct 25, 2018

@msau42 I am not sure which snapshot feature you mean here. Is it https://github.com/kubernetes-incubator/external-storage/tree/master/snapshot?

> Running du on atomic volume types is interesting because we keep "snapshots" of older versions around, so the reported usage will always grow.
>
> What should the reported metric be? Total usage including all the snapshots, or just the size of the current version?
>
> @kubernetes/sig-storage-pr-reviews

@WanLinghao
Contributor Author

WanLinghao commented Oct 25, 2018

I think one possible reason is that the "find" command is executed with no nice level set:

```go
findCmd := exec.Command("find", path, "-xdev", "-printf", ".")
```

By contrast, a similar "du" invocation in k8s runs at nice level 19:

```go
out, err := exec.Command("nice", "-n", "19", "du", "-s", "-B", "1", path).CombinedOutput()
```

cadvisor also sets the nice level to 19:
https://github.com/google/cadvisor/blob/c5510abcd7bf38e1db6b9bfe3bf6728139d8c3e1/fs/fs.go#L619
https://github.com/google/cadvisor/blob/c5510abcd7bf38e1db6b9bfe3bf6728139d8c3e1/fs/fs.go#L572

I will run a test to verify whether my guess is correct.
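
To make the guess concrete, here is a minimal sketch (not the actual kubelet change, just an illustration) of the same find walk run under nice level 19, mirroring the quoted du invocation so the scan yields CPU to other workloads:

```go
// Run the find-based inode count wrapped in "nice -n 19". The path in
// main is a placeholder.
package main

import (
	"fmt"
	"os/exec"
)

// inodeCount prints one "." per inode under path, like the quoted
// findCmd, but at nice level 19.
func inodeCount(path string) (int, error) {
	out, err := exec.Command("nice", "-n", "19",
		"find", path, "-xdev", "-printf", ".").CombinedOutput()
	if err != nil {
		return 0, fmt.Errorf("find failed: %v (output: %q)", err, out)
	}
	return len(out), nil
}

func main() {
	fmt.Println(inodeCount("/var/lib/kubelet")) // placeholder path
}
```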

> We should probably run some stress tests to make sure that adding these new sources of du won't exacerbate #62917. I believe the configmap max size is 1MB? 1MB really shouldn't be a problem, but a pod could have many configmaps. Do we clean up old snapshots periodically, or only when a pod is deleted?

@msau42
Member

msau42 commented Oct 25, 2018

Ah sorry, I don't mean the volume snapshot feature. What I meant is that when a configmap gets updated, for the volume mount we make a new timestamped directory with all the contents and then just flip the symlink to point to the new directory. But I don't think we clean up the older directories until the pod is deleted. So over time, the actual configmap volume directory will keep growing and never shrink.
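
One quick way to observe this on a node is to inspect a configmap volume directory directly. A sketch (the pod UID and volume name in the path are placeholders):

```go
// List the timestamped payload directories under a configmap volume and
// mark the one the "..data" symlink currently selects.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	volRoot := "/var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~configmap/<volume-name>"

	current, err := os.Readlink(filepath.Join(volRoot, "..data"))
	if err != nil {
		fmt.Println("readlink ..data:", err)
		return
	}

	entries, err := os.ReadDir(volRoot)
	if err != nil {
		fmt.Println("readdir:", err)
		return
	}
	for _, e := range entries {
		if e.IsDir() { // timestamped payload directories
			marker := ""
			if e.Name() == current {
				marker = "  <- current (target of ..data)"
			}
			fmt.Println(e.Name() + marker)
		}
	}
}
```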

@WanLinghao
Contributor Author

WanLinghao commented Oct 29, 2018

Hello, I tried many ways to observe the snapshot behavior but could not. Did I miss something, or do I need any feature gates enabled?

> Ah sorry, I don't mean the volume snapshot feature. What I meant is that when a configmap gets updated, for the volume mount we make a new timestamped directory with all the contents and then just flip the symlink to point to the new directory. But I don't think we clean up the older directories until the pod is deleted. So over time, the actual configmap volume directory will keep growing and never shrink.

@WanLinghao
Contributor Author

WanLinghao commented Oct 30, 2018

@msau42 I tested this as described below:

  1. Deploy a test cluster with two nodes.
  2. Run 100 pods on each node, with each pod mounting 10 configmap volumes.
  3. Give each configmap about 1MB of unique data.
  4. Run a shell script that simulates heavy I/O on one node while leaving the other node untouched.
  5. Request the /stats/summary endpoint on each node many times.
  6. Observe the difference in behavior between the two nodes.

The conclusion: everything went well on both nodes, with no panics and no errors. It seems the configmap volume metrics calculation does not trigger the panic you mentioned above. (A sketch of the polling step follows below.)
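
A minimal sketch of step 5 (assumptions: the kubelet read-only port 10255 is enabled and reachable; adjust the host and port for a secured setup):

```go
// Poll the kubelet summary endpoint repeatedly and report failures.
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

func main() {
	url := "http://127.0.0.1:10255/stats/summary" // placeholder node address

	for i := 0; i < 100; i++ {
		resp, err := http.Get(url)
		if err != nil {
			fmt.Printf("poll %d failed: %v\n", i, err)
		} else {
			body, _ := io.ReadAll(resp.Body)
			resp.Body.Close()
			fmt.Printf("poll %d: status=%s bytes=%d\n", i, resp.Status, len(body))
		}
		time.Sleep(time.Second)
	}
}
```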

> We should probably run some stress tests to make sure that adding these new sources of du won't exacerbate #62917. I believe the configmap max size is 1MB? 1MB really shouldn't be a problem, but a pod could have many configmaps. Do we clean up old snapshots periodically, or only when a pod is deleted?

@msau42
Member

msau42 commented Oct 30, 2018

Hi @WanLinghao thanks for the thorough testing! Could you also try the following?

  • Fill the configmap with a very large number of small files
  • After the pod has started, update the configmap frequently

@WanLinghao
Contributor Author

@msau42 Hello, I have tested them as follows:

  • For small configmaps, I tested a cluster with 100 pods running on each node, each pod bound to 100 configmap volumes, and each configmap volume injected with 50 bytes of data. No error or panic happened.
  • For frequent configmap updates, I wrote a shell script that updates the configmaps one by one in an endless loop (see the client-go sketch below for an equivalent). I again saw no error or panic when accessing the stats/summary endpoint.

In conclusion, my tests indicate that the configmap volume metrics calculation does not trigger errors or panics in most circumstances.
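
A rough client-go equivalent of that shell update loop (a sketch; the kubeconfig path, namespace, and configmap name are assumptions, and this uses the current client-go API rather than the 2018-era one):

```go
// Update a configmap in an endless loop to exercise the atomic-writer
// path on the node.
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config") // placeholder kubeconfig
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	cms := client.CoreV1().ConfigMaps("default") // placeholder namespace
	for i := 0; ; i++ {
		cm, err := cms.Get(context.TODO(), "test-cm", metav1.GetOptions{}) // placeholder name
		if err != nil {
			panic(err)
		}
		cm.Data = map[string]string{"key": fmt.Sprintf("value-%d", i)} // new payload each round
		if _, err := cms.Update(context.TODO(), cm, metav1.UpdateOptions{}); err != nil {
			fmt.Println("update failed:", err)
		}
		time.Sleep(500 * time.Millisecond)
	}
}
```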

@msau42
Member

msau42 commented Nov 1, 2018

Thanks!
/lgtm

/assign @gnufied
cc @dashpole

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 1, 2018
@WanLinghao
Contributor Author

@gnufied PTAL thanks

@WanLinghao
Contributor Author

@gnufied friendly ping

@WanLinghao
Contributor Author

@gnufied PTAL thanks

@gnufied
Member

gnufied commented Nov 20, 2018

I am a bit wary of emitting ephemeral volume stats as metrics. If I understand correctly, this will cause an explosion of time-series data in a large enough cluster, because most things use ephemeral storage. Also, some of those ephemeral volumes are auto-generated and have the same lifecycle as the pod.

@gnufied
Member

gnufied commented Nov 20, 2018

Not to mention: how will a user use these metrics? Say I am running out of disk space in configmap "foo"; what is the user supposed to do? You can't resize a configmap. You can probably just reduce the amount of data you put in. But since a configmap is limited to 1MB and you get a validation error when creating or updating one, we should be fine.

Also, IIRC most of these ephemeral volume types are mounted read-only now, so they can't grow on the node.

@WanLinghao
Contributor Author

@gnufied So should we remove the metrics calculation for the secret volume?

```go
volume.NewCachedMetrics(volume.NewMetricsDu(getPath(pod.UID, spec.Name(), plugin.host))),
```
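
For context, a minimal sketch of what that line wires up, as I read the volume package (the path below is a placeholder; in the plugin it comes from getPath): NewMetricsDu measures the volume directory with du/find-style scans, and NewCachedMetrics memoizes the result between calls.

```go
// Stand-alone illustration of the secret plugin's metrics wiring.
package main

import (
	"fmt"

	"k8s.io/kubernetes/pkg/volume"
)

func main() {
	volPath := "/var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~secret/mysecret" // placeholder

	// MetricsDu scans volPath; CachedMetrics caches the result between
	// GetMetrics calls so repeated stat requests stay cheap.
	m := volume.NewCachedMetrics(volume.NewMetricsDu(volPath))
	metrics, err := m.GetMetrics()
	fmt.Println(metrics, err)
}
```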

@msau42
Member

msau42 commented Nov 20, 2018

The strange thing is that since ephemeral volumes are on the root disk, their used capacity will be minuscule compared to the boot disk.

Even though the volume is mounted read-only, I think it will grow. If you update the configmap, we create a new directory, update the symlink, and remount. I also think we don't delete the old directories, so the size will constantly grow if you update the objects frequently.

@gnufied
Member

gnufied commented Nov 20, 2018

@WanLinghao hmm, I think one thing I got wrong: these metrics do not automatically show up in the Prometheus endpoint. I will verify tomorrow, but I am thinking we should be fine.

@msau42 @WanLinghao also, yeah, I think the symlink flipping may cause the volume to grow on disk (as long as the volume is on the same node). I will take a look again, thanks.

@WanLinghao
Contributor Author

@gnufied what's your view now?

@WanLinghao
Contributor Author

@gnufied PTAL thanks

@gnufied
Member

gnufied commented Dec 6, 2018

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gnufied, WanLinghao

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 6, 2018
@WanLinghao
Contributor Author

/kind feature

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Dec 10, 2018
@k8s-ci-robot k8s-ci-robot merged commit 13985d2 into kubernetes:master Dec 10, 2018