A simple tool to clean up old Job resources in Kubernetes
The TTL feature gate, which can automatically clean up Job resources in Kubernetes based on a configured TTL, is currently in alpha and is poorly supported by managed Kubernetes offerings. For example, EKS does not support it at all. As a result, Job resources can quickly pile up and waste cluster resources.
This tool aims to deliver the same functionality via a script that looks for an annotation called `ttl` on Job resources.
Note that setting `restartPolicy: OnFailure` is another possible solution for cleanup, but it deletes the underlying pod (including its logs) immediately after Job completion, as documented here. Therefore it is not considered a viable approach for many use cases.
apiVersion: batch/v1
kind: Job
metadata:
  generateName: example-job-ttl-
  annotations:
    ttl: "2 hours"
spec:
  template:
    spec:
      containers:
      - name: example
        image: centos
        command: ["sleep", "90"]
      restartPolicy: Never
  backoffLimit: 0
The `ttl` annotation can be specified with any value supported by GNU relative dates.
Note that this example Job is deployed with `kubectl create` rather than `kubectl apply` because it uses `generateName`.
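
To make the `ttl` annotation semantics concrete, here is a minimal sketch of how an expiry check could be built on GNU `date`. It is not the tool's actual script: the function names `ttl_to_seconds` and `is_expired` are hypothetical, and the sketch assumes GNU coreutils `date` and a finish timestamp in the ISO 8601 form that Kubernetes reports (for example in `status.completionTime`).

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU date (coreutils); function names are hypothetical.

# Convert a GNU relative date string (e.g. "2 hours") into seconds by asking
# date how far "now + TTL" lies from now. Because the clock is read twice,
# the result can be off by one second.
ttl_to_seconds() {
  local ttl="$1" now later
  now=$(date -u +%s)
  later=$(date -u -d "now + ${ttl}" +%s) || return 1
  echo $(( later - now ))
}

# Succeed (exit 0) if a Job that finished at $1 (ISO 8601 timestamp)
# has outlived the TTL given in $2; exit 1 if not, 2 on a parse error.
is_expired() {
  local finished="$1" ttl="$2" finished_epoch ttl_seconds
  finished_epoch=$(date -u -d "${finished}" +%s) || return 2
  ttl_seconds=$(ttl_to_seconds "${ttl}") || return 2
  (( finished_epoch + ttl_seconds <= $(date -u +%s) ))
}
```

A caller could then write something like `is_expired "$finished_at" "2 hours" && echo "delete this Job"`, where `$finished_at` is read from the Job's status.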
Building and deploying require `docker` and `kubectl`.
Deploying this tool is as simple as running:
./build.sh [IMAGE_URL]
where `[IMAGE_URL]` is the full URL of the container image you want to build/push/deploy. For example, if your container registry is hosted on `gcr.io/acme-123`, you may run:
./build.sh gcr.io/acme-123/k8s-job-reaper
This tool also supports the following configurations.
| Field | Location | Description | Default |
|---|---|---|---|
| `DEFAULT_TTL` | Environment variable in `cronjob.yaml` | An optional global default TTL for completed Jobs | `""` |
| `DEFAULT_TTL_FAILED` | Environment variable in `cronjob.yaml` | An optional global default TTL for uncompleted/failed Jobs (`DEFAULT_TTL` must also be set for this to take effect) | `""` |
| `NS_BLACKLIST` | Environment variable in `cronjob.yaml` | A list of Kubernetes Namespaces (space-delimited) to ignore when looking for Jobs | `"kube-system"` |
| `schedule` | Field in `cronjob.yaml` | The cron schedule at which to look for Jobs to delete | `"0 */1 * * *"` (once an hour) |
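
As an illustration of how these settings could interact, the sketch below shows one plausible reading of the table. It is not taken from the tool's script: the helper names `is_ignored_namespace` and `effective_ttl` are hypothetical, as is the assumption that a per-Job `ttl` annotation takes precedence over the global defaults.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; defaults follow the table above.
DEFAULT_TTL="${DEFAULT_TTL:-}"
DEFAULT_TTL_FAILED="${DEFAULT_TTL_FAILED:-}"
NS_BLACKLIST="${NS_BLACKLIST:-kube-system}"

# Succeed if namespace $1 appears in the space-delimited NS_BLACKLIST.
is_ignored_namespace() {
  local ns="$1" entry
  for entry in ${NS_BLACKLIST}; do   # intentionally unquoted: split on spaces
    [[ "${entry}" == "${ns}" ]] && return 0
  done
  return 1
}

# Print the TTL to apply to a Job, or nothing if the Job should be left alone.
#   $1: value of the Job's ttl annotation (may be empty)
#   $2: "failed" for uncompleted/failed Jobs, anything else for completed ones
effective_ttl() {
  local annotation="$1" state="$2"
  if [[ -n "${annotation}" ]]; then
    echo "${annotation}"              # assumed: annotation beats the defaults
  elif [[ -z "${DEFAULT_TTL}" ]]; then
    return 0                          # no global default: Job is never reaped
  elif [[ "${state}" == "failed" && -n "${DEFAULT_TTL_FAILED}" ]]; then
    echo "${DEFAULT_TTL_FAILED}"
  else
    echo "${DEFAULT_TTL}"
  fi
}
```

Under this reading, `DEFAULT_TTL_FAILED` alone has no effect, which matches the note in the table that `DEFAULT_TTL` must also be set.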