Using CronJobs to automatically clean up completed Runs #479

Closed
ghost opened this issue Feb 26, 2020 · 16 comments · Fixed by #626
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments


ghost commented Feb 26, 2020

Expected Behavior

Create a new tool that uses CronJob objects to schedule the cleanup of completed TaskRuns and PipelineRuns.

This could be used in our own dogfooding and would also give the community help managing their own completed runs.

In a prior PR we explored the idea of using a TTL on runs and leveraging the Kubernetes TTL Controller to help clean them up. During that review process a user suggested CronJobs as an alternative to baking this TTL support directly into the Tekton Pipelines controller.
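To make the shape of such a tool concrete, here is a minimal sketch of what the CronJob could look like (the name, schedule, retention count, and image are illustrative assumptions, not a committed design):

apiVersion: batch/v1beta1   # the CronJob API version current at the time; batch/v1 on newer clusters
kind: CronJob
metadata:
  name: tekton-run-cleanup   # hypothetical name
spec:
  schedule: "0 * * * *"      # hourly
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: tekton-run-cleanup   # needs RBAC to list/delete runs
          restartPolicy: Never
          containers:
            - name: kubectl
              image: docker.io/alpine/k8s:1.20.7
              command:
                - /bin/bash
                - -c
                # keep the 20 newest PipelineRuns, delete the rest (oldest first);
                # assumes GNU head/xargs, as shipped in this image
                - >
                  kubectl get pipelinerun --sort-by=.metadata.creationTimestamp -o name
                  | head -n -20 | xargs -r kubectl delete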

Actual Behavior

We don't currently have any way to automatically clean up completed TaskRuns and PipelineRuns but we definitely hear feedback that some kind of tooling or guidance would be very useful.

@assertion

We have the same requirement to delete completed resources automatically. (Status may need to be considered; for example, a failed PipelineRun should be kept longer than a succeeded one.)

(Currently we delete these resources from our own API server that sits above Tekton: we monitor PipelineRun-related events and, once a run finishes, delete succeeded PipelineRuns after a short time and failed PipelineRuns after a longer time.)
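For reference, such a status-aware policy could be approximated without a separate API server, using kubectl and jq (a sketch only: it assumes jq is available, and the one-day and seven-day cutoffs are illustrative):

# Sketch: delete succeeded PipelineRuns finished more than 1 day ago,
# and failed ones finished more than 7 days ago. Cutoffs are illustrative.
NOW=$(date +%s)
kubectl get pipelinerun -o json \
  | jq -r --argjson now "$NOW" '
      .items[] | . as $pr
      | ($pr.status.conditions // [])[]
      | select(.type == "Succeeded" and .status != "Unknown")
      | {name: $pr.metadata.name,
         age: ($now - (.lastTransitionTime | fromdateiso8601)),
         ok: (.status == "True")}
      | select(.age > (if .ok then 86400 else 604800 end))
      | .name' \
  | xargs -r kubectl delete pipelinerun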

@jlpettersson
Member

I'll have a look into this.

/assign


k commented May 28, 2020

@jlpettersson any update on this? Right now I have a pretty janky kubectl command to delete old pipeline runs lol

@jlpettersson
Member

> @jlpettersson any update on this? Right now I have a pretty janky kubectl command to delete old pipeline runs lol

It should not be much more work. Give me a few days.

@mattmoor
Member

> In a prior PR we explored the idea of using a TTL on runs and leveraging the Kubernetes TTL Controller to help clean them up. During that review process a user suggested CronJobs as an alternative to baking this TTL support directly into the Tekton Pipelines controller.

🤔 Seems like following the idioms established for K8s Jobs would be somewhat prudent. If we were to establish a "ttl" duck type (e.g. for how this is embedded into specs) combined with the use of a Succeeded condition, then you could write a shareable meta controller that handles this for all types.

cc @n3wscott (this would benefit from the ideas in our last Kubecon talk)
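To sketch what that duck type might look like (purely hypothetical field names, not an agreed API): a shared meta controller would only need to understand a common fragment of spec and status, whatever the concrete Run type:

apiVersion: tekton.dev/v1beta1
kind: PipelineRun                   # or TaskRun, or any other type embedding the same fragment
metadata:
  name: example-run
spec:
  ttlSecondsAfterFinished: 3600     # hypothetical "ttl" duck-type field, mirroring K8s Jobs
status:
  conditions:
    - type: Succeeded               # the condition the meta controller would watch
      status: "True"
      lastTransitionTime: "2020-06-25T12:00:00Z"

The controller would then delete any object once lastTransitionTime plus ttlSecondsAfterFinished has passed, without knowing anything else about the type.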


ghost commented Jun 25, 2020

@afrittoli has implemented this kind of pruning behaviour in our dogfooding cluster now 🎉 🎉 . That work could help inform this issue. Here's the PR where his changes were added: tektoncd/plumbing#442

@tekton-robot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Send feedback to tektoncd/plumbing.

@tekton-robot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot

@tekton-robot: Closing this issue.

In response to this:

> Rotten issues close after 30d of inactivity.
> Reopen the issue with /reopen.
> Mark the issue as fresh with /remove-lifecycle rotten.
>
> /close
>
> Send feedback to tektoncd/plumbing.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tekton-robot tekton-robot added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Aug 15, 2020

ghost commented Aug 17, 2020

/reopen
/remove-lifecycle stale
/remove-lifecycle rotten
/freeze

@tekton-robot tekton-robot reopened this Aug 17, 2020
@tekton-robot

@sbwsg: Reopened this issue.

In response to this:

> /reopen
> /remove-lifecycle stale
> /remove-lifecycle rotten
> /freeze

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tekton-robot tekton-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Aug 17, 2020

ghost commented Aug 17, 2020

I'm keeping this issue open as it's a feature area that is still seeing semi-regular community requests.

/lifecycle frozen

@tekton-robot tekton-robot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Aug 17, 2020
@psschwei
Contributor

@sbwsg you're thinking that this would just be a sample cronjob (and related resources) that users could apply to their own clusters, correct (similar to what @afrittoli did on the dogfooding cluster)? Perhaps under the examples/ directory.

Assuming that's the case...
/assign


ghost commented Aug 31, 2020

Great! Documenting this as part of Pipelines would also be really useful.

@jlpettersson
Member

This CronJob was shared by Tekton twitter account: https://gist.github.com/ctron/4764c0c4c4ea0b22353f2a23941928ad


raelga commented Oct 7, 2021

> This CronJob was shared by Tekton twitter account: https://gist.github.com/ctron/4764c0c4c4ea0b22353f2a23941928ad

An evolution of that CronJob, which keeps up to NUM_TO_KEEP runs of each Pipeline:

...
- name: kubectl
  image: docker.io/alpine/k8s:1.20.7
  env:
    - name: NUM_TO_KEEP
      value: "3"
  command:
    - /bin/bash
    - -c
    - |
      # For each distinct Pipeline (sort -u rather than bare uniq, since
      # uniq only collapses adjacent duplicates), list its PipelineRuns
      # oldest first and delete all but the newest NUM_TO_KEEP.
      while read -r PIPELINE; do
        while read -r PIPELINE_TO_REMOVE; do
          test -n "${PIPELINE_TO_REMOVE}" || continue;
          kubectl delete "${PIPELINE_TO_REMOVE}" \
              && echo "$(date -Is) PipelineRun ${PIPELINE_TO_REMOVE} deleted." \
              || echo "$(date -Is) Unable to delete PipelineRun ${PIPELINE_TO_REMOVE}.";
        done < <(kubectl get pipelinerun -l tekton.dev/pipeline=${PIPELINE} --sort-by=.metadata.creationTimestamp -o name | head -n -${NUM_TO_KEEP});
      done < <(kubectl get pipelinerun -o go-template='{{range .items}}{{index .metadata.labels "tekton.dev/pipeline"}}{{"\n"}}{{end}}' | sort -u);

Full example with rbac at
https://gist.github.com/raelga/e75e6de4fd04be60f267128e985bde6d
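For reference, the RBAC such a CronJob needs is roughly the following sketch (the names here are illustrative; see the linked gist for the complete, tested version):

# Sketch of the minimal RBAC for the cleanup CronJob; names are illustrative.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pipelinerun-cleaner
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pipelinerun-cleaner
rules:
  - apiGroups: ["tekton.dev"]
    resources: ["pipelineruns"]
    verbs: ["get", "list", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pipelinerun-cleaner
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pipelinerun-cleaner
subjects:
  - kind: ServiceAccount
    name: pipelinerun-cleaner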
