@@ -55,6 +55,12 @@ include::modules/op-default-pruner-configuration.adoc[leveloffset=+2]

include::modules/op-annotations-for-automatic-pruning-taskruns-pipelineruns.adoc[leveloffset=+2]

include::modules/op-event-pruner-configuration.adoc[leveloffset=+1]

include::modules/op-event-pruner-reference.adoc[leveloffset=+2]

include::modules/op-event-pruner-observability.adoc[leveloffset=+2]

include::modules/op-additional-options-webhooks.adoc[leveloffset=+1]


81 changes: 81 additions & 0 deletions modules/op-event-pruner-configuration.adoc
@@ -0,0 +1,81 @@
// This module is included in the following assemblies:
// * install_config/customizing-configurations-in-the-tektonconfig-cr.adoc

:_mod-docs-content-type: PROCEDURE
[id="event-pruner-configuration_{context}"]
= Enabling the event-based pruner

:FeatureName: The event-based pruner
include::snippets/technology-preview.adoc[]

You can use the event-based `tektonpruner` controller to automatically delete completed resources, such as `PipelineRuns` and `TaskRuns`, based on configurable policies. Unlike the default job-based pruner, the event-based pruner listens for resource events and prunes resources in near real time.

[IMPORTANT]
====
You must disable the default pruner in the `TektonConfig` custom resource (CR) before you enable the event-based pruner. If both pruner types are enabled, the deployment readiness status changes to `False` and the output displays the following error message:

[source,terminal]
----
Components not in ready state: Invalid Pruner Configuration!! Both pruners, tektonpruner(event based) and pruner(job based) cannot be enabled simultaneously. Please disable one of them.
----
====

.Procedure

. In your `TektonConfig` CR, disable the default pruner by setting the `spec.pruner.disabled` field to `true`, and enable the event-based pruner by setting the `spec.tektonpruner.disabled` field to `false`. For example:
+
[source,yaml]
----
apiVersion: operator.tekton.dev/v1alpha1
kind: TektonConfig
metadata:
  name: config
spec:
  # ...
  pruner:
    disabled: true
  # ...
  tektonpruner:
    disabled: false
    options: {}
  # ...
----
+
After you apply the updated CR, the Operator deploys the `tekton-pruner-controller` pod in the `openshift-pipelines` namespace.

. Ensure that the following config maps are present in the `openshift-pipelines` namespace:
+
[cols="1,3",options="header"]
|===
|Config map
|Purpose

|`tekton-pruner-default-spec`
|Define default pruning behavior

|`pruner-info`
|Store internal runtime data used by the controller

|`config-logging-tekton-pruner`
|Configure logging settings for the pruner

|`config-observability-tekton-pruner`
|Enable observability features such as metrics and tracing
|===


.Verification

. To verify that the `tekton-pruner-controller` pod is running, run the following command:
+
[source,terminal]
----
$ oc get pods -n openshift-pipelines
----

. Verify that the output includes a `tekton-pruner-controller` pod in the `Running` state. Example output:
+
[source,terminal]
----
tekton-pruner-controller-<id>   Running
----
81 changes: 81 additions & 0 deletions modules/op-event-pruner-observability.adoc
@@ -0,0 +1,81 @@
// This module is included in the following assemblies:
// * install_config/customizing-configurations-in-the-tektonconfig-cr.adoc

:_mod-docs-content-type: REFERENCE
[id="event-pruner-observability_{context}"]
= Observability metrics of the event-based pruner

:FeatureName: The event-based pruner
include::snippets/technology-preview.adoc[]

The event-based pruner exposes detailed metrics in OpenTelemetry format through the `tekton-pruner-controller` `Service` on port `9090`. You can use these metrics for monitoring, troubleshooting, and capacity planning.

The event-based pruner exposes the following categories of metrics:

* Resource processing
* Performance timing
* State tracking
* Error monitoring

Most pruner metrics use labels to provide additional context. You can use these labels in PromQL{nbsp}queries or dashboards to filter and group the metrics.

[cols="1,3",options="header"]
|===
| Label | Description

| `namespace`
| The Kubernetes namespace of the `PipelineRun` or `TaskRun`.

| `resource_type`
| The Tekton resource type.

| `status`
| The outcome of processing a resource.

| `operation`
| The pruning method that deleted a resource.

| `reason`
| Specific cause for skipping or error outcomes.
|===

Resource processing metrics::
The following resource processing metrics are exposed by the event-based pruner:
+
[cols="3,1,3,2",options="header"]
|===
| Name | Type | Description | Labels
| `tekton_pruner_controller_resources_processed_total` | Counter | Total resources processed | namespace, resource_type, status
| `tekton_pruner_controller_resources_deleted_total` | Counter | Total resources deleted | namespace, resource_type, operation
|===


Performance timing metrics::
The following performance timing metrics are exposed by the event-based pruner:
+
[cols="3,1,3,2,2",options="header"]
|===
| Name | Type | Description | Labels | Bucket
| `tekton_pruner_controller_reconciliation_duration_seconds` | Histogram | Time spent in reconciliation | namespace, resource_type | 0.1 to 30 seconds
| `tekton_pruner_controller_ttl_processing_duration_seconds` | Histogram | Time spent processing TTL | namespace, resource_type | 0.1 to 30 seconds
| `tekton_pruner_controller_history_processing_duration_seconds` | Histogram | Time spent processing history limits | namespace, resource_type | 0.1 to 30 seconds
|===

State tracking metrics::
The following state tracking metrics are exposed by the event-based pruner:
+
[cols="3,1,3",options="header"]
|===
| Name | Type | Description
| `kn_workqueue_adds_total` | Counter | Total resources queued
| `kn_workqueue_depth` | Gauge | Number of current items in queue
|===

Error monitoring metrics::
The following error monitoring metrics are exposed by the event-based pruner:
+
[cols="3,1,3,2",options="header"]
|===
| Name | Type | Description | Labels
| `tekton_pruner_controller_resources_errors_total` | Counter | Total processing errors | namespace, resource_type, reason
|===
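
The metric names and labels listed above can also be used in alerting rules. The following is a minimal sketch of a `PrometheusRule` object that fires when the pruner reports processing errors. It assumes that the Prometheus Operator `monitoring.coreos.com/v1` API is available in the cluster and that the pruner metrics are already being scraped. The object name, group name, alert name, severity, and time windows are hypothetical values chosen for illustration, not documented defaults.

[source,yaml]
----
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: tekton-pruner-errors              # hypothetical object name
  namespace: openshift-pipelines
spec:
  groups:
  - name: tekton-pruner.rules             # hypothetical group name
    rules:
    - alert: TektonPrunerProcessingErrors # hypothetical alert name
      # Per-second error rate over 5 minutes, grouped by the documented labels
      expr: |
        sum by (namespace, resource_type, reason) (
          rate(tekton_pruner_controller_resources_errors_total[5m])
        ) > 0
      for: 10m                            # illustrative window, adjust as needed
      labels:
        severity: warning
      annotations:
        summary: The event-based pruner is reporting processing errors.
----

Grouping by `reason` keeps a separate alert series for each error cause, which can make it easier to tell configuration problems from transient failures.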
90 changes: 90 additions & 0 deletions modules/op-event-pruner-reference.adoc
@@ -0,0 +1,90 @@
// This module is included in the following assemblies:
// * install_config/customizing-configurations-in-the-tektonconfig-cr.adoc

:_mod-docs-content-type: REFERENCE
[id="event-pruner-reference_{context}"]
= Configuration of the event-based pruner

:FeatureName: The event-based pruner
include::snippets/technology-preview.adoc[]

You can configure the pruning behavior of the event-based pruner by modifying your `TektonConfig` custom resource (CR).

The following is an example of the `TektonConfig` CR with the default configuration that uses global pruning rules:

[source,yaml]
----
apiVersion: operator.tekton.dev/v1alpha1
kind: TektonConfig
metadata:
  name: config
spec:
  # ...
  tektonpruner:
    disabled: false
    global-config:
      enforcedConfigLevel: global
      failedHistoryLimit: null
      historyLimit: 10
      namespaces: null
      successfulHistoryLimit: null
      ttlSecondsAfterFinished: null
    options: {}
  # ...
----
* `failedHistoryLimit`: The number of failed runs to retain.
* `historyLimit`: The number of runs to retain. The pruner uses this setting if status-specific limits are not defined.
* `namespaces`: Per-namespace pruning policies, which apply when you set `enforcedConfigLevel` to `namespace`.
* `successfulHistoryLimit`: The number of successful runs to retain.
* `ttlSecondsAfterFinished`: The time in seconds after completion, after which the pruner deletes resources.

You can define pruning rules for individual namespaces by setting `enforcedConfigLevel` to `namespace` and configuring policies under the `namespaces` section. In the following example, a 60-second time to live (TTL) is applied to resources in the `dev-project` namespace, while the `global-config` section sets a TTL of 300 seconds:

[source,yaml]
----
apiVersion: operator.tekton.dev/v1alpha1
kind: TektonConfig
metadata:
  name: config
spec:
  # ...
  tektonpruner:
    disabled: false
    global-config:
      enforcedConfigLevel: namespace
      ttlSecondsAfterFinished: 300
      namespaces:
        dev-project:
          ttlSecondsAfterFinished: 60
  # ...
----

You can use the following parameters in the `tektonpruner` section of your `TektonConfig` CR:

[cols="1,3",options="header"]
|===
|Parameter |Description

|`ttlSecondsAfterFinished`
|Delete resources a fixed number of seconds after they complete.

|`successfulHistoryLimit`
|Retain the specified number of the most recent successful runs. Delete older successful runs.

|`failedHistoryLimit`
|Retain the specified number of the most recent failed runs. Delete older failed runs.

|`historyLimit`
|Apply a generic history limit when `failedHistoryLimit` and `successfulHistoryLimit` are not defined.

|`enforcedConfigLevel`
|Specify the level at which the pruner applies the configuration. Accepted values: `global` or `namespace`.

|`namespaces`
|Define per-namespace pruning policies.
|===

[NOTE]
====
Use TTL-based pruning to delete resources after their configured expiration time has passed. Use history-based pruning to delete resources that exceed the configured `historyLimit`.
====
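
As an illustration of history-based pruning, the following sketch sets status-specific limits in the `global-config` section. The values `5` and `10` are arbitrary examples, not recommended settings, and the layout assumes the same `TektonConfig` structure shown in the earlier examples.

[source,yaml]
----
apiVersion: operator.tekton.dev/v1alpha1
kind: TektonConfig
metadata:
  name: config
spec:
  # ...
  tektonpruner:
    disabled: false
    global-config:
      enforcedConfigLevel: global
      successfulHistoryLimit: 5   # example value: retain the 5 most recent successful runs
      failedHistoryLimit: 10      # example value: retain the 10 most recent failed runs
  # ...
----

In this sketch, `historyLimit` is omitted because both status-specific limits are defined.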