
Builder pods not removed after deploy #487

Open
felixbuenemann opened this issue Feb 20, 2017 · 12 comments

Comments

@felixbuenemann
Contributor

Currently (as of deis-builder v2.7.1) the slugbuild and dockerbuild pods are not deleted after a successful or failed build.

This means that the pod (e.g. slugbuild-example-e24fafeb-b31237bb) will continue to exist in state "Completed" or state "Error", and the docker container associated with the pod can never be garbage collected by Kubernetes, causing the node to quickly run out of disk space.

Example:

On a k8s node with an uptime of 43 days and 95 GB of disk storage for docker, there were 249 completed (or erred) slugbuild and dockerbuild pods whose docker images accounted for 80 GB of disk storage, while the deployed apps and deis services only required 15 GB.

Expected Behavior:

The expected behavior would be for the builder to automatically delete the build pod after it has completed or erred, so that the K8s garbage collection can remove the docker containers and free the disk space allocated to them.

@felixbuenemann
Contributor Author

felixbuenemann commented Feb 20, 2017

This behavior can easily be inspected with:

kubectl get --namespace deis --show-all pods | grep build-

The number of completed pods will increase by one for each build.
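
For reference, the growth can be tracked over time by counting the matching pods (the same command piped through wc):

kubectl get --namespace deis --show-all pods | grep build- | wc -l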

@bacongobbler
Member

related: #57

It seems like recent versions of k8s stopped cleaning up pods in the "success" state. Some research needs to be done on how to turn this functionality back on.

@felixbuenemann
Contributor Author

felixbuenemann commented Feb 21, 2017

I'm running K8s 1.4.x if that matters.

Regarding the suggestion in #57 to use Jobs: neither Jobs nor Pods are removed automatically.

From the K8s Job docs:

When a Job completes, no more Pods are created, but the Pods are not deleted either. Since they are terminated, they don’t show up with kubectl get pods, but they will show up with kubectl get pods -a. Keeping them around allows you to still view the logs of completed pods to check for errors, warnings, or other diagnostic output. The job object also remains after it is completed so that you can view its status. It is up to the user to delete old jobs after noting their status. Delete the job with kubectl (e.g. kubectl delete jobs/pi or kubectl delete -f ./job.yaml). When you delete the job using kubectl, all the pods it created are deleted too.

@felixbuenemann
Contributor Author

felixbuenemann commented Feb 21, 2017

Interestingly the docs on Pod Lifecycle say:

In general, Pods do not disappear until someone destroys them. This might be a human or a controller. The only exception to this rule is that Pods with a phase of Succeeded or Failed for more than some duration (determined by the master) will expire and be automatically destroyed.

This seems to be in contrast to what I'm actually seeing…

@felixbuenemann
Contributor Author

I have opened kubernetes/kubernetes#41787 for clarification of the above statement from the docs.

@felixbuenemann
Contributor Author

felixbuenemann commented Feb 27, 2017

I just got feedback on the kubernetes issue: it looks like by default, completed or failed pods are only garbage collected once there are more than 12,500 terminated pods. Obviously that is not very helpful in this case, so automatic cleanup by the builder should be implemented.
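
Until the builder does this itself, a manual stop-gap is possible with kubectl. This is only a sketch, assuming all leftover build pods carry the slugbuild-/dockerbuild- prefix and show up in the Completed or Error state as described above (with GNU xargs, -r avoids an error when nothing matches):

kubectl get pods --namespace deis --show-all | grep build- | grep -E 'Completed|Error' | awk '{print $1}' | xargs -r kubectl delete pods --namespace deis

Filtering on Completed/Error avoids deleting pods of builds that are still running.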

@felixbuenemann
Contributor Author

felixbuenemann commented Mar 6, 2017

Quoting here from the kube-controller-manager help on the --terminated-pod-gc-threshold <n> option:

Number of terminated pods that can exist before the terminated pod garbage collector starts deleting terminated pods. If <= 0, the terminated pod garbage collector is disabled. (default 12500)
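
Lowering that threshold would make the cluster reclaim terminated pods much sooner. How the flag is wired in depends on how the cluster is provisioned; as a rough sketch (the value 100 is an arbitrary example, and the setting is cluster-wide, affecting all terminated pods, not just build pods), the controller manager would be started with:

kube-controller-manager --terminated-pod-gc-threshold=100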

@kwent
Contributor

kwent commented Mar 20, 2017

Any progress on this? It sounds like a waste of resources and space for everyone.

@pfeodrippe

Same here, it may be linked to an issue I opened last week.

$ kubectl get --namespace deis --show-all pods | grep build-
slugbuild-teslabit-web-production-d2fcd4c0-7e507178   0/1       Completed   0          1d

@pfeodrippe

pfeodrippe commented Apr 12, 2017

I'm using this tiny git pre-push hook for deletion https://gist.github.com/pfeodrippe/116c8b570ee2ffcdce8aa15bbae5a22b.

It deletes the last slugbuild pod created for the app when you git push.
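
For anyone who prefers not to open the gist, a minimal sketch of such a hook (not the gist itself; the app name and the Completed/Error filter are assumptions) could look like this, saved as .git/hooks/pre-push and made executable:

#!/bin/sh
# Hypothetical pre-push hook: remove finished slugbuild pods for one app
# before pushing a new build. APP must be set to the deis application name.
APP=example
for pod in $(kubectl get pods --namespace deis --show-all | grep "slugbuild-$APP-" | grep -E 'Completed|Error' | awk '{print $1}'); do
  kubectl delete pod --namespace deis "$pod"
done
exit 0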

@davidlmorton

+1 This bit me after a couple of weeks of deploying applications to my deis cluster.

@Cryptophobia

This issue was moved to teamhephy/builder#17
