openshift · smunje1 · Jul 26, 2024
diff --git a/modules/log6x-audit-log-filtering.adoc b/modules/log6x-audit-log-filtering.adoc
@@ -0,0 +1,118 @@
+// Module included in the following assemblies:
+//
+// * observability/logging/logging-6.0/log6x-clf.adoc
+
+:_mod-docs-content-type: CONCEPT
+[id="log6x-audit-filtering_{context}"]
+= Overview of API audit filter
+OpenShift API servers generate audit events for each API call, detailing the request, response, and the identity of the requester, leading to large volumes of data. The API Audit filter uses rules to exclude non-essential events and reduce event size, which makes the audit trail more manageable. Rules are checked in order, and checking stops at the first match. The value of the `level` field determines how much data is included in an event:
+
+* `None`: The event is dropped.
+* `Metadata`: Audit metadata is included; request and response bodies are removed.
+* `Request`: Audit metadata and the request body are included; the response body is removed.
+* `RequestResponse`: All data is included: metadata, request body and response body. The response body can be very large. For example, `oc get pods -A` generates a response body containing the YAML description of every pod in the cluster.
+
+In logging 5.8 and later, the `ClusterLogForwarder` custom resource (CR) uses the same format as the standard link:https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/#audit-policy[Kubernetes audit policy], while providing the following additional functions:
+
+Wildcards:: Names of users, groups, namespaces, and resources can have a leading or trailing `\*` asterisk character. For example, namespace `openshift-\*` matches `openshift-apiserver` or `openshift-authentication`. Resource `\*/status` matches `Pod/status` or `Deployment/status`.
+
+Default Rules:: Events that do not match any rule in the policy are filtered as follows:
+* Read-only system events such as `get`, `list`, `watch` are dropped.
+* Service account write events that occur within the same namespace as the service account are dropped.
+* All other events are forwarded, subject to any configured rate limits.
+
+To disable these defaults, either end your rules list with a rule that has only a `level` field or add an empty rule.
+
+Omit Response Codes:: A list of integer HTTP status codes to omit. You can drop events based on the HTTP status code in the response by using the `OmitResponseCodes` field, which lists the status codes for which no events are created. The default value is `[404, 409, 422, 429]`. If the value is an empty list, `[]`, then no status codes are omitted.
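+
+The following fragment is a minimal sketch of how these additional functions might be combined in a single filter. It is not a recommended configuration. The lowercase `omitResponseCodes` key and its placement directly under `kubeAPIAudit` are assumptions based on the field name given above; check the key name against the API reference for your version.
+
+.Example rules that use wildcards, response code omission, and a final catch-all rule (illustrative)
+[source,yaml]
+----
+kubeAPIAudit:
+  # Assumed key name and placement; this module refers to the field as OmitResponseCodes.
+  omitResponseCodes: [404, 409]
+  rules:
+    # A trailing wildcard matches namespaces such as openshift-apiserver.
+    - level: Metadata
+      namespaces: ["openshift-*"]
+    # A leading wildcard matches the status subresource of any resource in the group.
+    - level: Metadata
+      resources:
+      - group: "" # core API group
+        resources: ["*/status"]
+    # A final rule that has only a level field matches every remaining event,
+    # so the default rules are never reached.
+    - level: Request
+----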
+
+The `ClusterLogForwarder` CR audit policy acts in addition to the {product-title} audit policy. The `ClusterLogForwarder` CR audit filter changes what the log collector forwards, and provides the ability to filter by verb, user, group, namespace, or resource. You can create multiple filters to send different summaries of the same audit stream to different places. For example, you can send a detailed stream to the local cluster log store, and a less detailed stream to a remote site.
+
+[NOTE]
+====
+The example provided is intended to illustrate the range of rules possible in an audit policy and is not a recommended configuration.
+====
+
+
+.Example audit policy
+[source,yaml]
+----
+apiVersion: observability.openshift.io/v1
+kind: ClusterLogForwarder
+metadata:
+  name: <log_forwarder_name>
+  namespace: <log_forwarder_namespace>
+spec:
+  pipelines:
+    - name: my-pipeline
+      inputRefs: [audit] #<1>
+      filterRefs: [my-policy] #<2>
+      outputRefs: [default]
+  filters:
+    - name: my-policy
+      type: kubeAPIAudit
+      kubeAPIAudit:
+        # Don't generate audit events for all requests in RequestReceived stage.
+        omitStages:
+          - "RequestReceived"
+
+        rules:
+          # Log pod changes at RequestResponse level
+          - level: RequestResponse
+            resources:
+            - group: ""
+              resources: ["pods"]
+
+          # Log "pods/log", "pods/status" at Metadata level
+          - level: Metadata
+            resources:
+            - group: ""
+              resources: ["pods/log", "pods/status"]
+
+          # Don't log requests to a configmap called "controller-leader"
+          - level: None
+            resources:
+            - group: ""
+              resources: ["configmaps"]
+              resourceNames: ["controller-leader"]
+
+          # Don't log watch requests by the "system:kube-proxy" on endpoints or services
+          - level: None
+            users: ["system:kube-proxy"]
+            verbs: ["watch"]
+            resources:
+            - group: "" # core API group
+              resources: ["endpoints", "services"]
+
+          # Don't log authenticated requests to certain non-resource URL paths.
+          - level: None
+            userGroups: ["system:authenticated"]
+            nonResourceURLs:
+            - "/api*" # Wildcard matching.
+            - "/version"
+
+          # Log the request body of configmap changes in kube-system.
+          - level: Request
+            resources:
+            - group: "" # core API group
+              resources: ["configmaps"]
+            # This rule only applies to resources in the "kube-system" namespace.
+            # The empty string "" can be used to select non-namespaced resources.
+            namespaces: ["kube-system"]
+
+          # Log configmap and secret changes in all other namespaces at the Metadata level.
+          - level: Metadata
+            resources:
+            - group: "" # core API group
+              resources: ["secrets", "configmaps"]
+
+          # Log all other resources in core and extensions at the Request level.
+          - level: Request
+            resources:
+            - group: "" # core API group
+            - group: "extensions" # Version of group should NOT be included.
+
+          # A catch-all rule to log all other requests at the Metadata level.
+          - level: Metadata
+----
+<1> The log types that are collected. The value for this field can be `audit` for audit logs, `application` for application logs, `infrastructure` for infrastructure logs, or a named input that has been defined for your application.
+<2> The name of your audit policy.
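+
+As a rough sketch of sending different summaries of the same audit stream to different places, the following fragment defines two audit filters and attaches each one to its own pipeline. The filter names, pipeline names, and the `<local_output_name>` and `<remote_output_name>` placeholders are hypothetical. The matching `outputs` definitions are omitted.
+
+.Example of two audit filters applied to the same input (illustrative)
+[source,yaml]
+----
+apiVersion: observability.openshift.io/v1
+kind: ClusterLogForwarder
+metadata:
+  name: <log_forwarder_name>
+  namespace: <log_forwarder_namespace>
+spec:
+# ...
+  filters:
+    # Detailed stream: keep metadata, request bodies, and response bodies.
+    - name: detailed-policy
+      type: kubeAPIAudit
+      kubeAPIAudit:
+        rules:
+          - level: RequestResponse
+    # Summary stream: keep metadata only.
+    - name: summary-policy
+      type: kubeAPIAudit
+      kubeAPIAudit:
+        rules:
+          - level: Metadata
+  pipelines:
+    - name: detailed-audit
+      inputRefs: [audit]
+      filterRefs: [detailed-policy]
+      outputRefs: ["<local_output_name>"]
+    - name: summary-audit
+      inputRefs: [audit]
+      filterRefs: [summary-policy]
+      outputRefs: ["<remote_output_name>"]
+----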
diff --git a/modules/log6x-content-filter-drop-records.adoc b/modules/log6x-content-filter-drop-records.adoc
@@ -0,0 +1,108 @@
+// Module included in the following assemblies:
+//
+// * observability/logging/logging-6.0/log6x-clf.adoc
+
+:_mod-docs-content-type: PROCEDURE
+[id="log6x-content-filter-drop-records_{context}"]
+= Configuring content filters to drop unwanted log records
+
+When the `drop` filter is configured, the log collector evaluates log streams according to the filters before forwarding. The collector drops unwanted log records that match the specified configuration.
+
+.Prerequisites
+
+* You have installed the {clo}.
+* You have administrator permissions.
+* You have created a `ClusterLogForwarder` custom resource (CR).
+
+.Procedure
+
+. Add a configuration for a filter to the `filters` spec in the `ClusterLogForwarder` CR.
++
+The following example shows how to configure the `ClusterLogForwarder` CR to drop log records based on regular expressions:
++
+.Example `ClusterLogForwarder` CR
+[source,yaml]
+----
+apiVersion: observability.openshift.io/v1
+kind: ClusterLogForwarder
+metadata:
+# ...
+spec:
+  filters:
+  - name: <filter_name>
+    type: drop # <1>
+    drop: # <2>
+    - test: # <3>
+      - field: .kubernetes.labels."foo-bar/baz" # <4>
+        matches: .+ # <5>
+      - field: .kubernetes.pod_name
+        notMatches: "my-pod" # <6>
+  pipelines:
+  - name: <pipeline_name> # <7>
+    filterRefs: ["<filter_name>"]
+# ...
+----
+<1> Specifies the type of filter. The `drop` filter drops log records that match the filter configuration.
+<2> Specifies configuration options for applying the `drop` filter.
+<3> Specifies the configuration for tests that are used to evaluate whether a log record is dropped.
+** If all the conditions specified for a test are true, the test passes and the log record is dropped.
+** When multiple tests are specified for the `drop` filter configuration, if any of the tests pass, the record is dropped.
+** If there is an error evaluating a condition, for example, the field is missing from the log record being evaluated, that condition evaluates to false.
+<4> Specifies a dot-delimited field path, which is a path to a field in the log record. The path can contain alpha-numeric characters and underscores (`a-zA-Z0-9_`), for example, `.kubernetes.namespace_name`. If segments contain characters outside of this range, the segment must be in quotes, for example, `.kubernetes.labels."foo.bar-bar/baz"`. You can include multiple field paths in a single `test` configuration, but they must all evaluate to true for the test to pass and the `drop` filter to be applied.
+<5> Specifies a regular expression. If log records match this regular expression, they are dropped. You can set either the `matches` or `notMatches` condition for a single `field` path, but not both.
+<6> Specifies a regular expression. If log records do not match this regular expression, they are dropped. You can set either the `matches` or `notMatches` condition for a single `field` path, but not both.
+<7> Specifies the pipeline that the `drop` filter is applied to.
+
+. Apply the `ClusterLogForwarder` CR by running the following command:
++
+[source,terminal]
+----
+$ oc apply -f <filename>.yaml
+----
+
+.Additional examples
+
+The following additional example shows how you can configure the `drop` filter to keep only higher-priority log records:
+
+[source,yaml]
+----
+apiVersion: observability.openshift.io/v1
+kind: ClusterLogForwarder
+metadata:
+# ...
+spec:
+  filters:
+  - name: important
+    type: drop
+    drop:
+    - test:
+      - field: .message
+        notMatches: "(?i)critical|error"
+      - field: .level
+        matches: "info|warning"
+# ...
+----
+
+In addition to including multiple field paths in a single `test` configuration, you can also include additional tests that are treated as _OR_ checks. In the following example, records are dropped if either `test` configuration evaluates to true. However, for the second `test` configuration, both field specs must be true for it to evaluate to true:
+
+[source,yaml]
+----
+apiVersion: observability.openshift.io/v1
+kind: ClusterLogForwarder
+metadata:
+# ...
+spec:
+  filters:
+  - name: important
+    type: drop
+    drop:
+    - test:
+      - field: .kubernetes.namespace_name
+        matches: "^open"
+    - test:
+      - field: .log_type
+        matches: "application"
+      - field: .kubernetes.pod_name
+        notMatches: "my-pod"
+# ...
+----
diff --git a/modules/log6x-content-filter-prune-records.adoc b/modules/log6x-content-filter-prune-records.adoc
@@ -0,0 +1,58 @@
+// Module included in the following assemblies:
+//
+// * observability/logging/logging-6.0/log6x-clf.adoc
+
+:_mod-docs-content-type: PROCEDURE
+[id="log6x-content-filter-prune-records_{context}"]
+= Configuring content filters to prune log records
+
+When the `prune` filter is configured, the log collector evaluates log streams according to the filters before forwarding. The collector prunes log records by removing low-value fields such as pod annotations.
+
+.Prerequisites
+
+* You have installed the {clo}.
+* You have administrator permissions.
+* You have created a `ClusterLogForwarder` custom resource (CR).
+
+.Procedure
+
+. Add a configuration for a `prune` filter to the `filters` spec in the `ClusterLogForwarder` CR.
++
+The following example shows how to configure the `ClusterLogForwarder` CR to prune log records based on field paths:
++
+[IMPORTANT]
+====
+If you specify both the `in` and `notIn` arrays, records are pruned based on the `notIn` array first, which takes precedence over the `in` array. After records have been pruned by using the `notIn` array, they are then pruned by using the `in` array.
+====
++
+.Example `ClusterLogForwarder` CR
+[source,yaml]
+----
+apiVersion: observability.openshift.io/v1
+kind: ClusterLogForwarder
+metadata:
+# ...
+spec:
+  filters:
+  - name: <filter_name>
+    type: prune # <1>
+    prune: # <2>
+      in: [.kubernetes.annotations, .kubernetes.namespace_id] # <3>
+      notIn: [.kubernetes, .log_type, .message, ."@timestamp"] # <4>
+  pipelines:
+  - name: <pipeline_name> # <5>
+    filterRefs: ["<filter_name>"]
+# ...
+----
+<1> Specify the type of filter. The `prune` filter prunes log records by configured fields.
+<2> Specify configuration options for applying the `prune` filter. The `in` and `notIn` fields are specified as arrays of dot-delimited field paths, which are paths to fields in log records. These paths can contain alpha-numeric characters and underscores (`a-zA-Z0-9_`), for example, `.kubernetes.namespace_name`. If segments contain characters outside of this range, the segment must be in quotes, for example, `.kubernetes.labels."foo.bar-bar/baz"`.
+<3> Optional: Any fields that are specified in this array are removed from the log record.
+<4> Optional: Any fields that are not specified in this array are removed from the log record.
+<5> Specify the pipeline that the `prune` filter is applied to.
+
+. Apply the `ClusterLogForwarder` CR by running the following command:
++
+[source,terminal]
+----
+$ oc apply -f <filename>.yaml
+----
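+
+Because the `in` and `notIn` arrays are both optional, a filter can set only one of them. The following fragment is a minimal sketch that removes two fields and leaves the rest of each record unchanged. The label key is the same illustrative one used in the field path description above.
+
+.Example `prune` filter that uses only the `in` array (illustrative)
+[source,yaml]
+----
+apiVersion: observability.openshift.io/v1
+kind: ClusterLogForwarder
+metadata:
+# ...
+spec:
+  filters:
+  - name: <filter_name>
+    type: prune
+    prune:
+      # Only the listed fields are removed; the label key is quoted because it
+      # contains characters outside of a-zA-Z0-9_.
+      in: [.kubernetes.annotations, .kubernetes.labels."foo.bar-bar/baz"]
+  pipelines:
+  - name: <pipeline_name>
+    filterRefs: ["<filter_name>"]
+# ...
+----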
diff --git a/modules/log6x-delivery-tuning.adoc b/modules/log6x-delivery-tuning.adoc
@@ -0,0 +1,108 @@
+// Module included in the following assemblies:
+//
+// * observability/logging/logging-6.0/log6x-clf.adoc
+
+:_mod-docs-content-type: REFERENCE
+[id="log6x-delivery-tuning_{context}"]
+= Tuning log payloads and delivery
+
+In {logging} 5.9 and newer versions, the `tuning` spec in the `ClusterLogForwarder` custom resource (CR) provides a means of configuring your deployment to prioritize either throughput or durability of logs.
+
+For example, if you need to reduce the possibility of log loss when the collector restarts, or you require collected log messages to survive a collector restart to support regulatory mandates, you can tune your deployment to prioritize log durability. If you use outputs that have hard limitations on the size of batches they can receive, you may want to tune your deployment to prioritize log throughput.
+
+[IMPORTANT]
+====
+To use this feature, your {logging} deployment must be configured to use the Vector collector. The `tuning` spec in the `ClusterLogForwarder` CR is not supported when using the Fluentd collector.
+====
+
+The following example shows the `ClusterLogForwarder` CR options that you can modify to tune log forwarder outputs:
+
+.Example `ClusterLogForwarder` CR tuning options
+[source,yaml]
+----
+apiVersion: "observability.openshift.io/v1"
+kind: ClusterLogForwarder
+metadata:
+# ...
+spec:
+  outputs:
+  - name: <output_name>
+    type: <output_type>
+    <output_type>:
+      tuning:
+        delivery: AtLeastOnce # <1>
+        maxWrite: <integer> # <2>
+        compression: none # <3>
+        minRetryDuration: 1s # <4>
+        maxRetryDuration: 1s # <5>
+# ...
+----
+<1> Specify the delivery mode for log forwarding.
+** `AtLeastOnce` delivery means that if the log forwarder crashes or is restarted, any logs that were read before the crash but not sent to their destination are re-sent. It is possible that some logs are duplicated after a crash.
+** `AtMostOnce` delivery means that the log forwarder makes no effort to recover logs lost during a crash. This mode gives better throughput, but may result in greater log loss.
+<2> Specifies a limit for the maximum payload of a single send operation to the output.
+<3> Specifies a `compression` configuration, which causes data to be compressed before it is sent over the network. Note that not all output types support compression, and if the specified compression type is not supported by the output, this results in an error. The possible values for this configuration are `none` for no compression, `gzip`, `snappy`, `zlib`, or `zstd`. `lz4` compression is also available if you are using a Kafka output. See the table "Supported compression types for tuning outputs" for more information.
+<4> Specifies a minimum duration to wait between attempts before retrying delivery after a failure. This value is a string, and can be specified as milliseconds (`ms`), seconds (`s`), or minutes (`m`).
+<5> Specifies a maximum duration to wait between attempts before retrying delivery after a failure. This value is a string, and can be specified as milliseconds (`ms`), seconds (`s`), or minutes (`m`).
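+
+For contrast, the following fragment is a hedged sketch of a throughput-oriented configuration. It reuses the field names from the example above and assumes that the `tuning` block is nested under the output type block. The retry durations are arbitrary illustrative values, and `gzip` is valid only for output types that support it.
+
+.Example tuning options that prioritize throughput (illustrative)
+[source,yaml]
+----
+apiVersion: "observability.openshift.io/v1"
+kind: ClusterLogForwarder
+metadata:
+# ...
+spec:
+  outputs:
+  - name: <output_name>
+    type: <output_type>
+    <output_type>:
+      tuning:
+        delivery: AtMostOnce # no recovery of logs lost during a crash, better throughput
+        compression: gzip # must be supported by the selected output type
+        minRetryDuration: 500ms # illustrative value
+        maxRetryDuration: 1m # illustrative value
+# ...
+----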
+
+.Supported compression types for tuning outputs
+[options="header"]
+|===
+|Compression algorithm |Splunk |Amazon CloudWatch |Elasticsearch 8 |LokiStack |Apache Kafka |HTTP |Syslog |Google Cloud |Microsoft Azure Monitoring
+
+|`gzip`
+|X
+|X
+|X
+|X
+|
+|X
+|
+|
+|
+
+|`snappy`
+|
+|X
+|
+|X
+|X
+|X
+|
+|
+|
+
+|`zlib`
+|
+|X
+|X
+|
+|
+|X
+|
+|
+|
+
+|`zstd`
+|
+|X
+|
+|
+|X
+|X
+|
+|
+|
+
+|`lz4`
+|
+|
+|
+|
+|X
+|
+|
+|
+|
+
+|===