Skip to content

A simple tool to clean up old Job resources in Kubernetes

License

Notifications You must be signed in to change notification settings

aramse/k8s-job-reaper

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

21 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

k8s-job-reaper

A simple tool to clean up old Job resources in Kubernetes

Motivation

As it currently stands in alpha, the TTL feature gate, which offers the ability to automatically clean up Job resources in Kubernetes based on a configured TTL, is weakly supported in managed Kubernetes offerings. For example, it's not supported at all in EKS. As a result, Job resources can quickly pile up and waste cluster resources.

This tool aims to deliver the same functionality via a script that looks for an annotation on Job resources called ttl.

Note that setting restartPolicy: OnFailure is another possible solution for cleanup, but it deletes the underlying pod (including its logs) immediately after Job completion, as documented here. Therefore it is not considered a viable approach for many use cases.

Example

apiVersion: batch/v1
kind: Job
metadata:
  generateName: example-job-ttl-
  annotations:
    ttl: "2 hours"
spec:
  template:
    spec:
      containers:
      - name: example
        image: centos
        command: ["sleep", "90"]
      restartPolicy: Never
  backoffLimit: 0

The ttl annotation can be specified with any value supported by GNU relative dates.

Note that this example Job is deployed with kubectl create rather than kubectl apply due to its usage of generateName.

Deployment

Prerequisites

  • docker
  • kubectl

Deploying this tool is as simple as running:

./build.sh [IMAGE_URL]

where [IMAGE_URL] is the full URL of the container image you want to build/push/deploy. For example, if your container registry is hosted on gcr.io/acme-123, you may run:

./build.sh gcr.io/acme-123/k8s-job-reaper

Configuration

This tool also supports the following configurations.

Field Location Description Default
DEFAULT_TTL Environment variable in cronjob.yaml An optional global default TTL for completed Jobs ""
DEFAULT_TTL_FAILED Environment variable in cronjob.yaml An optional global default TTL for uncompleted/failed Jobs (DEFAULT_TTL must also be set for this to take effect) ""
NS_BLACKLIST Environment variable in cronjob.yaml A list of Kubernetes Namespaces (space-delimited) to ignore when looking for Jobs "kube-system"
schedule Field in cronjob.yaml The cron schedule at which to look for Jobs to delete "0 */1 * * *" (once an hour)

About

A simple tool to clean up old Job resources in Kubernetes

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published