
specify garbage collection #13087

Merged

Conversation

dalanlan
Contributor

Fix #13065

@k8s-bot

k8s-bot commented Aug 24, 2015

Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist")

If this message is too spammy, please complain to ixdy.

@yujuhong yujuhong added sig/node Categorizes an issue or PR as relevant to SIG Node. kind/documentation Categorizes issue or PR as related to documentation. labels Aug 24, 2015
@dalanlan
Contributor Author

Copied from #13065

My main concerns about GC come down to:

  • What exactly would kubelet do with the volumes (hostPath, for instance) of a container when it is about to remove an exited container? I'll bet it won't please users either way, whether kubelet cleans the volumes or not. Greater customization should be allowed here.
  • We should warn users that an external garbage collection tool is not recommended, since it would break kubelet's behavior.
  • Since flags like --maximum-dead-containers-per-container are provided to users, what if we set it to 0? Would kubelet just accept this happily even though it could hurt?
  • Are there still race conditions (like GC breaking an in-progress image pull)?

It would be nice for anyone reviewing this PR to consider these.

@pwittrock
Member

I'll take a look. Thanks Emma.


Kubernetes manages the lifecycle of all images through `imageManager`.
The policy for garbage collecting images takes two factors into consideration:
`HighThresholdPercent` and `LowThresholdPercent`. Disk usage above the high threshold
will trigger garbage collection, and never down to low threshold.
Member

nit:

, and never down to low threshold.

Is this strictly true? Garbage collection attempts to free space by deleting images until `usage - (int64(im.policy.LowThresholdPercent) * capacity / 100)` bytes are freed. Maybe something like "Disk usage above the high threshold will trigger garbage collection, which will attempt to delete unused images until the LowThresholdPercent is met"?

nit:
Maybe also mention that least recently used images are deleted first.
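
A minimal Go sketch of the threshold arithmetic described in this nit, based on the quoted expression; the names (`policy`, `bytesToFree`, `freeImages`) are illustrative, not the actual `imageManager` API:

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

type policy struct {
	HighThresholdPercent int // disk usage above this triggers image GC
	LowThresholdPercent  int // GC tries to bring usage down to this
}

type image struct {
	id        string
	sizeBytes int64
	lastUsed  time.Time
}

// bytesToFree returns 0 when usage is below the high threshold; otherwise it
// returns how many bytes must be deleted to bring usage down to the low
// threshold, mirroring the quoted expression.
func bytesToFree(p policy, usage, capacity int64) int64 {
	if usage*100 < int64(p.HighThresholdPercent)*capacity {
		return 0 // below the high threshold: nothing to do
	}
	return usage - int64(p.LowThresholdPercent)*capacity/100
}

// freeImages deletes least recently used images first until the target is met.
func freeImages(images []image, target int64) (freed int64) {
	sort.Slice(images, func(i, j int) bool {
		return images[i].lastUsed.Before(images[j].lastUsed)
	})
	for _, img := range images {
		if freed >= target {
			break
		}
		freed += img.sizeBytes // a real implementation would remove the image here
	}
	return freed
}

func main() {
	p := policy{HighThresholdPercent: 90, LowThresholdPercent: 80}
	const gib = 1 << 30
	target := bytesToFree(p, 95*gib, 100*gib) // 15 GiB over the low threshold
	imgs := []image{
		{"old", 10 * gib, time.Now().Add(-48 * time.Hour)},
		{"new", 10 * gib, time.Now()},
	}
	fmt.Printf("need %d GiB, freed %d GiB\n", target/gib, freeImages(imgs, target)/gib)
}
```

This also reflects the second nit: sorting by last-use time makes least recently used images the first to go.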

@pwittrock
Member

Hey Emma,

with respect to the concerns you listed, are you concerned about what the current behavior is and how to document it, or are you asking how we can improve the existing behavior?


<!-- END MUNGE: UNVERSIONED_WARNING -->

# Garbage Collection
Member

Please move this doc to docs/admin and add a link to the README.md in that directory, as a sub-bullet under "The kubelet binary"

@bgrant0607
Member

cc @kubernetes/goog-node

@bgrant0607
Member

cc @vishh re. disk management

container takes up some disk space.
Values less than zero for the last two are regarded as no limit.

> Note that we don't recommend an external garbage collection tool, since it could break the behavior
Member

Was the leading > intentional?

Contributor Author

It was. It's been removed now, though.

@dalanlan dalanlan force-pushed the specify-garbage-collection branch 2 times, most recently from 5679d0b to 1536f78 Compare August 26, 2015 05:06
@dalanlan
Contributor Author

with respect to the concerns you listed, are you concerned about what the current behavior is and how to document it, or are you asking how we can improve the existing behavior?

Mostly the latter.

What exactly would kubelet do with the volumes (hostPath, for instance) of a container when it is about to remove an exited container? I'll bet it won't please users either way, whether kubelet cleans the volumes or not. Greater customization should be allowed here.

Indiscriminately cleaning up all of the volumes seems a little unfriendly to me. What do you say?

Since flags like --maximum-dead-containers-per-container are provided to users, what if we set it to 0? Would kubelet just accept this happily even though it could hurt?

kubelet sort of relies on dead containers to implement its strategy, so zero should theoretically be an unwanted parameter value, yet it is permitted now.

Are there still race conditions (like GC breaking an in-progress image pull)?

This is simply a question, inspired by #3393.

/cc @pwittrock The comments have been addressed. Please double-check :)

@dalanlan
Contributor Author

Fix #8416

@pwittrock
Member

@dalanlan Thanks for writing this documentation up. I am sure folks will find it very helpful.

What exactly would kubelet do with the volumes (hostPath, for instance) of a container when it is about to remove an exited container? I'll bet it won't please users either way, whether kubelet cleans the volumes or not. Greater customization should be allowed here.

I am still learning about how GC interacts with volumes, but @vishh may be able to give a better answer here. If this is something that you would like to discuss more, it probably makes sense to open a separate issue to do so.

We should warn users that an external garbage collection tool is not recommended, since it would break kubelet's behavior.

I agree we should warn users about this. I saw that you did so in your doc. Thanks!

Since flags like --maximum-dead-containers-per-container are provided to users, what if we set it to 0? Would kubelet just accept this happily even though it could hurt? kubelet sort of relies on dead containers to implement its strategy, so zero should theoretically be an unwanted parameter value, yet it is permitted now.

Good point about relying on dead containers; this would be a bug. From looking at the code, it does look like the GC will accept a flag of 0 for dead-per-container and perform the garbage collection. We should probably create an issue to address bugs caused by garbage collecting containers we rely upon for correctly obeying restart policy. I am not certain that setting the minimum flag value to 1 will solve this issue entirely, because the flag for max total dead containers may cause the same issue. I would consider flag values that cause undesirable but correct behavior (such as poor performance or deleting containers the user may want to examine later) to be a separate and more complicated issue.

Are there still race conditions (like GC breaking an in-progress image pull)?

I am not aware of any known race conditions caused by container or image garbage collection, and didn't see any open issues after doing a simple search. @yujuhong can you confirm?

Garbage collection is managed automatically by kubelet, and mainly covers unreferenced
images and dead containers. kubelet runs container garbage collection every minute
and image garbage collection every 5 minutes.
Note that we don't recommend an external garbage collection tool, since it could break
Contributor

A side note: Docker can still leak files and forget about them. External garbage collection tools that clean up behind Docker might be useful. WDYT?

Contributor Author

The leaking issue troubles us a lot as well, I'll admit, yet it's (maybe) a long-term problem. External GC could be a short-term option, as long as it won't break kubelet itself.
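
As an aside on the excerpt quoted at the top of this thread, here is a minimal sketch of the two cadences it describes (container GC every minute, image GC every five minutes); the loop bodies are placeholders, not kubelet's actual implementation:

```go
package main

import (
	"log"
	"time"
)

func main() {
	containerGC := time.NewTicker(1 * time.Minute) // container GC cadence
	imageGC := time.NewTicker(5 * time.Minute)     // image GC cadence
	defer containerGC.Stop()
	defer imageGC.Stop()

	for {
		select {
		case <-containerGC.C:
			log.Println("running container garbage collection") // placeholder
		case <-imageGC.C:
			log.Println("running image garbage collection") // placeholder
		}
	}
}
```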

@vishh
Contributor

vishh commented Aug 26, 2015

  1. We generally discourage using hostPath for storing data. Use a volume. What if your container were to run on a machine with no local disk? Maybe we need to document the recommendation clearly.
    The more knobs there are, the harder the system is to understand and maintain.
  2. +1 for warning. As I mentioned in a comment, Docker might still leak, and we might need an external garbage collector, or make kubelet smart enough to handle that.
  3. Not storing previous dead containers will affect debugging - mainly access to logs. On a node with very little disk space, it is possible for users to not retain any old containers, except for the running ones. WDYT? Our APIs should fail gracefully if previous instances of a container are not found.
  4. I would hope for Docker to handle these races. An image pull and a deletion that touch the same layers should be synchronized by Docker.

@dalanlan dalanlan force-pushed the specify-garbage-collection branch 2 times, most recently from 87e800d to b43e931 Compare August 27, 2015 06:45
@dalanlan
Contributor Author

The comments have been fully addressed.
/cc @pwittrock @vishh


1. `minimum-container-ttl-duration`, minimum age for a finished container before it is
garbage collected. Default is 1 minute.
2. `maximum-dead-containers-per-container`, maximum number of old instances to retain
Contributor

Should we add a line stating that this should be set to a minimum value of 2 as of now?

Contributor Author

It's kinda tricky. What do you say about the maximum-dead-containers flag then? That one could hurt as well :-)

Contributor

Yes, that's true. I was thinking that providing hints for a stable system would be very helpful to users. WDYT?

In docs/admin/garbage-collection.md:

+Note that we will skip the containers that are not managed by kubelet.
+
+### User Configuration
+
+Users are free to set their own values for image garbage collection.
+
+1. image-gc-high-threshold, the percent of disk usage which triggers image garbage collection.
+Default is 90%.
+2. image-gc-low-threshold, the percent of disk usage that image garbage collection attempts
+to free down to. Default is 80%.
+
+We also allow users to customize the garbage collection policy via the following three flags.
+
+1. minimum-container-ttl-duration, minimum age for a finished container before it is
+garbage collected. Default is 1 minute.
+2. maximum-dead-containers-per-container, maximum number of old instances to retain


Member

It is probably worth mentioning that maximum-dead-containers should be large enough to allow at least 2 dead containers per expected container. Maybe also reference #13287 as an explanation for why these values are recommended.
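
To make these flag interactions concrete, here is a minimal Go sketch of the retention logic under a simplified model: finished containers younger than the TTL are never collected, both caps evict oldest-first, and negative values mean no limit (as the doc states). `deadContainer` and `evictable` are illustrative names, not kubelet's actual code:

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

type deadContainer struct {
	name     string    // which container "slot" the dead instance belonged to
	finished time.Time // when it exited
}

// evictable returns the dead containers GC may remove, given:
//   minTTL          - minimum-container-ttl-duration
//   maxPerContainer - maximum-dead-containers-per-container (< 0 means no limit)
//   maxTotal        - maximum-dead-containers (< 0 means no limit)
func evictable(dead []deadContainer, minTTL time.Duration, maxPerContainer, maxTotal int, now time.Time) []deadContainer {
	// Sort oldest-first so that excess instances are evicted oldest-first.
	sort.Slice(dead, func(i, j int) bool { return dead[i].finished.Before(dead[j].finished) })

	var evict []deadContainer
	kept := map[string]int{}
	total := 0
	// Walk newest-first so the most recent instances are the ones retained.
	for i := len(dead) - 1; i >= 0; i-- {
		c := dead[i]
		if now.Sub(c.finished) < minTTL {
			continue // younger than the TTL: never collected
		}
		overPer := maxPerContainer >= 0 && kept[c.name] >= maxPerContainer
		overTotal := maxTotal >= 0 && total >= maxTotal
		if overPer || overTotal {
			evict = append(evict, c) // over either cap: collect it
			continue
		}
		kept[c.name]++
		total++
	}
	return evict
}

func main() {
	now := time.Now()
	dead := []deadContainer{
		{"web", now.Add(-10 * time.Minute)},
		{"web", now.Add(-5 * time.Minute)},
		{"web", now.Add(-2 * time.Minute)},
		{"db", now.Add(-30 * time.Second)}, // younger than the 1m TTL: kept
	}
	// Per-container cap of 1, no global cap: the two oldest "web" instances go.
	fmt.Println(len(evictable(dead, time.Minute, 1, -1, now))) // prints 2
}
```

Under this model, setting maximum-dead-containers-per-container to 0 would make every finished container past its TTL evictable, which is exactly the restart-policy hazard discussed above; hence the suggestion to keep it at 2 or more and to size maximum-dead-containers accordingly.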

@vishh
Contributor

vishh commented Aug 29, 2015

Just one more comment @dalanlan! Otherwise LGTM.
@pwittrock I will let you make a final approval!

@dalanlan dalanlan force-pushed the specify-garbage-collection branch 2 times, most recently from 90ca4ec to b1f6321 Compare September 1, 2015 02:14
@dalanlan
Contributor Author

dalanlan commented Sep 1, 2015

I've updated the doc:

  1. Added the policy logic for the case in which maximum-dead-containers-per-container and maximum-dead-containers conflict.
  2. Sorted out the previously mentioned flag problems.

PTAL /cc @vishh @pwittrock

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 1, 2015
@k8s-github-robot

Labelling this PR as size/L

@pwittrock pwittrock added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 1, 2015
@pwittrock
Member

@dalanlan Thanks for all your work on this

@k8s-github-robot

@k8s-bot ok to test

pr builder appears to be missing, activating due to 'lgtm' label.

@k8s-bot

k8s-bot commented Sep 2, 2015

GCE e2e build/test failed for commit f5bdea8.

@dalanlan
Contributor Author

dalanlan commented Sep 2, 2015

Shame. I can't really access the Jenkins test details, and a docs-only change shouldn't break anything.
Could you trigger this again, please? @pwittrock

@pwittrock
Member

@k8s-bot test this again please

@k8s-bot

k8s-bot commented Sep 2, 2015

GCE e2e build/test passed for commit f5bdea8.

@k8s-github-robot

The author of this PR is not in the whitelist for merge, can one of the admins add the 'ok-to-merge' label?

@k8s-github-robot

Automatic merge from SubmitQueue

k8s-github-robot pushed a commit that referenced this pull request Sep 3, 2015
@k8s-github-robot k8s-github-robot merged commit 80f2d89 into kubernetes:master Sep 3, 2015
@dalanlan dalanlan deleted the specify-garbage-collection branch October 14, 2015 07:50