diff --git a/_topic_maps/_topic_map.yml b/_topic_maps/_topic_map.yml index 32d6f206ba23..207d270559e2 100644 --- a/_topic_maps/_topic_map.yml +++ b/_topic_maps/_topic_map.yml @@ -2709,6 +2709,17 @@ Topics: File: logging-5-8-release-notes - Name: Logging 5.7 File: logging-5-7-release-notes + - Name: Logging 6.0 + Dir: logging-6.0 + Topics: + - Name: About Logging 6.0 + File: log6x-about + - Name: Getting started + File: log6x-start + - Name: Configuring log forwarding + File: log6x-clf + - Name: Configuring LokiStack storage + File: log6x-loki - Name: Support File: cluster-logging-support - Name: Troubleshooting logging diff --git a/modules/log6x-oc-explain.adoc b/modules/log6x-oc-explain.adoc new file mode 100644 index 000000000000..c5686d0cca19 --- /dev/null +++ b/modules/log6x-oc-explain.adoc @@ -0,0 +1,17 @@ + +// Module included in the following assemblies: +// +:_mod-docs-content-type: CONCEPT +[id="log6x-oc-explain_{context}"] + += Using the `oc explain` command + +The `oc explain` command is an essential tool in the OpenShift Command-Line Interface (CLI) that provides detailed descriptions of the fields within Kubernetes resources, including Custom Resources (CRs) introduced by Operators. This command is invaluable for administrators and developers who are configuring or troubleshooting resources in an OpenShift cluster. + +. *Resource Descriptions*: It offers in-depth explanations of all fields associated with a specific Kubernetes or Custom Resource. This includes standard resources like pods and services, as well as more complex entities like statefulsets. + +. *Hierarchical Structure*: The command displays the structure of resource fields in a hierarchical format, clarifying the relationships between different configuration options. This hierarchical view is crucial for proper configuration. + +. *Type Information*: `oc explain` also indicates the type of each field (such as string, integer, or boolean), which is critical for ensuring that resource definitions are provided with correct data types. + +. *Default Values*: When applicable, the command shows the default values for fields, providing insights into what values will be used if none are explicitly specified. diff --git a/modules/logging-6x-v-5x.adoc b/modules/logging-6x-v-5x.adoc new file mode 100644 index 000000000000..fb9733c1fc13 --- /dev/null +++ b/modules/logging-6x-v-5x.adoc @@ -0,0 +1,27 @@ +// Module included in the following assemblies: +// + + +:_mod-docs-content-type: CONCEPT +[id="logging-6x-v-5x_{context}"] += OpenShift Logging 6.0: Key Differences from Previous Versions + +OpenShift Logging 6.0 is the culmination of efforts to modernize the underlying components of the logging solution. Over the past few releases, Loki and Vector were gradually introduced with a focus on improving performance and enhancing features while maintaining the stability required for business continuity. + + +* **Horizontal Scalability:** Loki's architecture allows for seamless horizontal scaling, ensuring that the logging system can handle the growing demands of large-scale OpenShift deployments. + +* **Improved Query Performance:** Loki's indexing and query mechanisms are optimized for speed, enabling faster log retrieval and analysis. + +* **Improved Integration:** Loki and Vector offer tighter integration with OpenShift, providing a more seamless and native logging experience. + +* **Increased Flexibility** More configuration options to fine tune your logging deployment. 
+ +* **Enhanced Log Processing:** Vector's processing capabilities have been expanded, enabling you to perform more complex filtering. + +* **OpenShift Web Console:** The integrated Logs page offers a user-friendly interface for searching, filtering, and viewing logs directly within the OpenShift console. + + +== Functional changes + +https://issues.redhat.com/browse/OBSDOCS-1168 diff --git a/modules/logging-content-filter-drop-records.adoc b/modules/logging-content-filter-drop-records.adoc index 3dbd51690292..d5319e959d43 100644 --- a/modules/logging-content-filter-drop-records.adoc +++ b/modules/logging-content-filter-drop-records.adoc @@ -1,6 +1,7 @@ // Module included in the following assemblies: // // * observability/logging/performance_reliability/logging-content-filtering.adoc +// * observability/logging/logging-6.0/log6x-clf.adoc :_mod-docs-content-type: PROCEDURE [id="logging-content-filter-drop-records_{context}"] @@ -10,7 +11,6 @@ When the `drop` filter is configured, the log collector evaluates log streams ac .Prerequisites -* You have installed the {clo}. * You have administrator permissions. * You have created a `ClusterLogForwarder` custom resource (CR). diff --git a/modules/logging-content-filter-prune-records.adoc b/modules/logging-content-filter-prune-records.adoc index 89e08fcd48a8..bc30572da073 100644 --- a/modules/logging-content-filter-prune-records.adoc +++ b/modules/logging-content-filter-prune-records.adoc @@ -1,7 +1,7 @@ // Module included in the following assemblies: // // * observability/logging/performance_reliability/logging-content-filtering.adoc - +// * observability/logging/logging-6.0/log6x-clf.adoc :_mod-docs-content-type: PROCEDURE [id="logging-content-filter-prune-records_{context}"] = Configuring content filters to prune log records @@ -10,7 +10,6 @@ When the `prune` filter is configured, the log collector evaluates log streams a .Prerequisites -* You have installed the {clo}. * You have administrator permissions. * You have created a `ClusterLogForwarder` custom resource (CR). diff --git a/modules/logging-elastic-to-loki-store.adoc b/modules/logging-elastic-to-loki-store.adoc new file mode 100644 index 000000000000..3d053cd79c23 --- /dev/null +++ b/modules/logging-elastic-to-loki-store.adoc @@ -0,0 +1,371 @@ +// Module included in the following assemblies: +// + + +:_mod-docs-content-type: PROCEDURE +[id="logging-elastic-to-loki-store_{context}"] += Changing the default log store from Elasticsearch to Loki + +This guide describes how to switch the OpenShift Logging storage service from Elasticsearch to LokiStack. It focuses on log forwarding, not data migration. After following these steps, old logs will remain in Elasticsearch (accessible via Kibana), while new logs will go to LokiStack (visible in the OpenShift Console). + +.Prerequisites + +* Red Hat OpenShift Logging Operator (v5.5.5+) +* OpenShift Elasticsearch Operator (v5.5.5+) +* Red Hat Loki Operator (v5.5.5+) +* Sufficient resources on target nodes to run both Elasticsearch and LokiStack (see LokiStack Deployment Sizing Table in the documentation). + +== Installing LokiStack + +. **Install the Loki Operator:** Follow the official guide to install the Loki Operator using the OpenShift web console. +.. **Create a Secret for Loki Object Storage:** Create a secret for Loki object storage (e.g., AWS S3). Refer to the documentation for other object storage types. 

[source,bash]
----
$ cat << EOF |oc create -f -
apiVersion: v1
kind: Secret
metadata:
  name: logging-loki-s3
  namespace: openshift-logging
data:
  access_key_id: $(echo "PUT_S3_ACCESS_KEY_ID_HERE" | base64 -w0)
  access_key_secret: $(echo "PUT_S3_ACCESS_KEY_SECRET_HERE" | base64 -w0)
  bucketnames: $(echo "s3-bucket-name" | base64 -w0)
  endpoint: $(echo "https://s3.eu-central-1.amazonaws.com" | base64 -w0)
  region: $(echo "eu-central-1" | base64 -w0)
EOF
----

... **Deploy LokiStack Custom Resource (CR):**

[NOTE]
====
Change the `spec.size` if needed.
====

[source,bash]
----
$ cat << EOF |oc create -f -
apiVersion: loki.grafana.com/v1
kind: LokiStack
metadata:
  name: logging-loki
  namespace: openshift-logging
spec:
  size: 1x.small
  storage:
    schemas:
    - version: v12
      effectiveDate: '2022-06-01'
    secret:
      name: logging-loki-s3
      type: s3
  storageClassName: gp2
  tenants:
    mode: openshift-logging
EOF
----
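
Optionally, verify that the Loki Operator has accepted the `LokiStack` resource and is creating its pods before you switch log forwarding. The commands below assume the example values used above, a `LokiStack` named `logging-loki` in the `openshift-logging` namespace; adjust them if you used different names.

[source,bash]
----
# Show the LokiStack resource and its reported status
$ oc -n openshift-logging get lokistack logging-loki

# List the pods created for this LokiStack
$ oc -n openshift-logging get pods | grep logging-loki
----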

== Disconnecting Elasticsearch and Kibana

To keep Elasticsearch and Kibana running while transitioning:

. **Set `ClusterLogging` to Unmanaged:**

[source,bash]
----
oc -n openshift-logging patch clusterlogging/instance -p '{"spec":{"managementState": "Unmanaged"}}' --type=merge
----

.. **Remove Owner References:** Remove `ClusterLogging` owner references from Elasticsearch and Kibana resources:

[source,bash]
----
$ oc -n openshift-logging patch elasticsearch/elasticsearch -p '{"metadata":{"ownerReferences": []}}' --type=merge
----

[source,bash]
----
$ oc -n openshift-logging patch kibana/kibana -p '{"metadata":{"ownerReferences": []}}' --type=merge
----

... **Back Up Elasticsearch and Kibana Resources:** Use `yq` to back up these resources to prevent accidental deletion: link:https://github.com/mikefarah/yq[yq utility]

For Elasticsearch:

[source,bash]
----
$ oc -n openshift-logging get elasticsearch elasticsearch -o yaml \
  | yq 'del(.metadata.resourceVersion) | del(.metadata.uid)' \
  | yq 'del(.metadata.generation) | del(.metadata.creationTimestamp)' \
  | yq 'del(.metadata.selfLink) | del(.status)' > /tmp/cr-elasticsearch.yaml
----

For Kibana:

[source,bash]
----
$ oc -n openshift-logging get kibana kibana -o yaml \
  | yq 'del(.metadata.resourceVersion) | del(.metadata.uid)' \
  | yq 'del(.metadata.generation) | del(.metadata.creationTimestamp)' \
  | yq 'del(.metadata.selfLink) | del(.status)' > /tmp/cr-kibana.yaml
----
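
If you need to restore either resource later, for example to roll back before you delete the old stack, you can re-create it from these backups. This mirrors the `oc apply` step used for Kibana later in this procedure and assumes the file paths shown above:

[source,bash]
----
$ oc -n openshift-logging apply -f /tmp/cr-elasticsearch.yaml
----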

== Switching to LokiStack

. Switch Log Storage to LokiStack
The following manifest applies several changes to the `ClusterLogging` resource:
* Re-instates the `Managed` management state.
* Switches the `logStore` spec from `elasticsearch` to `lokistack`, which restarts the collector pods so that they start forwarding logs to `lokistack`.
* Removes the `visualization` spec, prompting the cluster-logging-operator to install the `logging-view-plugin` for observing `lokistack` logs in the OpenShift Console.
* If your collection type is not `fluentd`, replace `fluentd` with `vector` in the manifest before applying it.

[source,yaml]
----
$ cat << EOF |oc replace -f -
apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  name: instance
  namespace: openshift-logging
spec:
  managementState: Managed
  logStore:
    type: lokistack
    lokistack:
      name: logging-loki
  collection:
    logs:
      type: fluentd
      fluentd: {}
EOF
----

. Re-instantiate Kibana Resource

In the previous step, removing the `visualization` field prompted the operator to remove the `Kibana` resource. Re-instantiate the `Kibana` resource using the backup created earlier.

[source,bash]
----
$ oc -n openshift-logging apply -f /tmp/cr-kibana.yaml
----

. Enable the Console View Plugin

Enable the console view plugin to view logs in the OpenShift Console (Observe > Logs).

[source,bash]
----
$ oc patch consoles.operator.openshift.io cluster --type=merge --patch '{ "spec": { "plugins": ["logging-view-plugin"] } }'
----
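
To confirm that the plugin is now enabled, you can read back the same field that the patch modifies; the list should include `logging-view-plugin`:

[source,bash]
----
$ oc get consoles.operator.openshift.io cluster -o jsonpath='{.spec.plugins}'
----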

== Delete the Elasticsearch Stack

Once the retention period for logs stored in Elasticsearch expires and no more logs are visible in Kibana, remove the old stack to release resources.

=== Step 1: Delete Elasticsearch and Kibana Resources

[source,bash]
----
$ oc -n openshift-logging delete kibana/kibana elasticsearch/elasticsearch
----

=== Step 2: Delete the PVCs Used by Elasticsearch Instances

[source,bash]
----
$ oc delete -n openshift-logging pvc -l logging-cluster=elasticsearch
----
diff --git a/observability/logging/logging-6.0/.logging-enhancement-proposals.adoc b/observability/logging/logging-6.0/.logging-enhancement-proposals.adoc
new file mode 100644
index 000000000000..bb42d32475ea
--- /dev/null
+++ b/observability/logging/logging-6.0/.logging-enhancement-proposals.adoc
@@ -0,0 +1,1452 @@
// Module included in the following assemblies:
//

:_mod-docs-content-type: PROCEDURE
[id="logging-enhancement-proposals_{context}"]
= Enhancement proposals

== Summary

Since its initial release in OpenShift 3.x, Red Hat OpenShift logging has evolved from an on-cluster, highly opinionated offering to a more flexible log forwarding solution that supports multiple internal (e.g., LokiStack, Elasticsearch) and externally managed log storage. As original components like Elasticsearch and Fluentd are deprecated for various reasons, this enhancement introduces the next version of APIs to formally drop support for those features and provide an API reflecting the future direction of log storage and forwarding.

== Motivation

=== User Stories

The next version of the APIs aims to continue supporting the primary objectives of the project, which include:

* Collecting logs from various sources and services running on a cluster
* Normalizing logs to a common format to include workload metadata (i.e., labels, namespace, name)
* Forwarding logs to storage of an administrator's choosing (e.g., LokiStack)
* Providing a Red Hat managed log storage solution
* Providing an interface to allow users to review logs from a Red Hat managed storage solution

Deployment scenarios supporting these objectives include:

* An administrator wanting to deploy a complete operator-managed logging solution that includes collection, storage, and visualization to evaluate log records while on the cluster
* An administrator wanting to deploy an operator-managed log collector only to forward logs to an existing storage solution
* An administrator wanting to deploy an operator-managed instance of LokiStack and visualization

The administrator role is any user who has permissions to deploy the operator and the cluster-wide resources required to deploy the logging components.

=== Goals

* Drop support for the *ClusterLogging* custom resource
* Drop support for the *ElasticSearch* and *Kibana* custom resources, and the *elasticsearch-operator*
* Drop support for the Fluentd collector implementation and the Red Hat managed Elastic stack (e.g., Elasticsearch, Kibana)
* Drop support in the *cluster-logging-operator* for *logging-view-plugin* management
* Support a log forwarder API with minimal or no dependency upon reserved words (e.g., default)
* Support an API to spec a Red Hat managed LokiStack with the logging tenancy model
* Continue to allow deployment of a log forwarder to the output sinks of the administrator's choosing
* Automated migration path from `ClusterLogForwarder.logging.openshift.io/v1` to `ClusterLogForwarder.observability.openshift.io/v1`

=== Non-Goals

* "One click" deployment of a full logging stack as provided by *ClusterLogging* v1
* Complete backwards compatibility with *ClusterLogForwarder.logging.openshift.io/v1*

== Proposal

=== Workflow Description

The workflow described below allows deployment of a full logging stack to collect and forward logs to a Red Hat managed log store, covering the first user story, which is a superset of the others.

*Cluster administrator*:
* Manages and deploys day 2 operators
* Manages and deploys an on-cluster LokiStack
* Manages and deploys a cluster-wide log forwarder

*Cluster-observability-operator*:
* Manages and deploys observability operands (e.g., LokiStack, ClusterLogForwarder, Tracing) and console plugins (e.g., console-logging-plugin)

*Loki-operator*:
* Manages a Loki stack

*Cluster-logging-operator*:
* Manages log collection and forwarding

Workflow steps:
. Deploys the Red Hat *cluster-observability-operator*
. Deploys the Red Hat *loki-operator*
. Deploys an instance of *LokiStack* in the `openshift-logging` namespace
. Deploys the Red Hat *cluster-logging-operator*
. Creates a *ClusterLogForwarder* custom resource for the *LokiStack*

*Cluster-observability-operator* deploys the console-logging-plugin for reading logs in the OpenShift console.

*Loki-operator* deploys the *LokiStack* for storing logs on-cluster.

*Cluster-logging-operator* deploys the log collector to forward logs to log storage in the `openshift-logging` namespace.

=== API Extensions

This API defines the following opinionated input sources as a continuation of prior cluster logging versions:

* *application*: Logs of container workloads running in all namespaces except **default**, **openshift***, and **kube***
* *infrastructure*: journald logs from OpenShift nodes and container workloads running only in the namespaces **default**, **openshift***, and **kube***
* *audit*: The logs from OpenShift nodes written to the node filesystem by: Kubernetes API server, OpenShift API server, Auditd, and OpenShift Virtual Network (OVN).

These reserved words represent input sources that can be referenced by a pipeline without an explicit input specification.

More explicit specification of *audit* and *infrastructure* logs is allowed by creating a named input of that type and specifying at least one of the allowed sources.

This is a namespaced resource that follows the rules and design described in the multi-ClusterLogForwarder proposal with the following exceptions:

* Drops the *legacy* mode described in the proposal.
+* Moves collector specification to the *ClusterLogForwarder* + +=== ClusterLogForwarer CustomResourceDefinition + +The next version of a ClusterLogForwarder is defined as follows. Note that this resource is part of a new API group to align log collection with the objectives of Red Hat observability. + +[source,yaml] +---- +apiVersion: "observability.openshift.io/v1" +kind: ClusterLogForwarder +metadata: + name: +spec: + managementState: #enum: Managed, Unmanaged + serviceAccount: + name: + collector: + resources: #corev1.ResourceRequirements + limits: #cpu, memory + requests: + nodeSelector: #map[string]string + tolerations: #corev1.Toleration + inputs: + - name: + type: #enum: application,infrastructure,audit + application: + selector: #labelselector + includes: + - namespace: + container: + excludes: + - namespace: + container: + tuning: + ratelimitPerContainer: #rate limit applied to each container selected by this input + recordsPerSecond: #int (no multiplier, a each container only runs on one node at a time.) + infrastructure: + sources: [] #enum: node,container + audit: + sources: [] #enum: auditd,kubeAPI,openshiftAPI,ovn + receiver: + type: #enum: syslog,http + port: + http: + format: #enum: kubeAPIAudit , format of incoming data + tls: + ca: + key: #the key in the resource + configmap: + name: # the name of resource + secret: + name: # the name of resource + certificate: + key: #the key in the resource + configmap: + name: # the name of resource + secret: + name: # the name of resource + key: + key: #the key in the resource + secret: + name: # the name of resource + keyPassphrase: + key: #the key in the resource + secret: + name: # the name of resource + filters: + - name: + type: #enum: kubeAPIaudit, detectMultilineException, parse, openshiftLabels, drop, prune + kubeAPIAudit: + parse: + pipelines: + - inputRefs: [] + outputRefs: [] + filterRefs: [] + outputs: + - name: + type: #enum: azureMonitor,cloudwatch,elasticsearch,googleCloudLogging,http,kafka,loki,lokiStack,splunk,syslog + tls: + ca: + key: #the key in the resource + configmap: + name: # the name of resource + secret: + name: # the name of resource + certificate: + key: #the key in the resource + configmap: + name: # the name of resource + secret: + name: # the name of resource + key: + key: #the key in the resource + secret: + name: # the name of resource + keyPassphrase: + key: #the key in the resource + insecureSkipVerify: #bool + securityProfile: #openshiftv1.TLSSecurityProfile + rateLimit: + recordsPerSecond: #int - document per-forwarder/per-node multiplier + azureMonitor: + customerId: + logType: + azureResourceId: + host: + authorization: + sharedKey: + key: + secret: + name: # the name of resource + tuning: + delivery: # enum: AtMostOnce, AtLeastOnce + maxWrite: # quantity (e.g. 500k) + minRetryDuration: + maxRetryDuration: + cloud + +watch: + region: + groupBy: # enum. should support templating? + groupPrefix: # should support templating? + authorization: # output specific auth keys + tuning: + delivery: # enum: AtMostOnce, AtLeastOnce + maxWrite: # quantity (e.g. 500k) + compression: # enum of supported algos specific to the output + minRetryDuration: + maxRetryDuration: + elasticsearch: + url: + version: + index: # templating? do we need structured key/name or is this good enough + authorization: # output specific auth keys + tuning: + delivery: # enum: AtMostOnce, AtLeastOnce + maxWrite: # quantity (e.g. 
500k) + compression: # enum of supported algos specific to the output + minRetryDuration: + maxRetryDuration: + googleCloudLogging: + ID: + type: #enum: billingAccount,folder,project,organization + value: + logID: # templating? + authorization: # output specific auth keys + tuning: + delivery: # enum: AtMostOnce, AtLeastOnce + maxWrite: # quantity (e.g. 500k) + compression: # enum of supported algos specific to the output + minRetryDuration: + maxRetryDuration: + http: + url: + headers: + timeout: + method: + authorization: # output specific auth keys + tuning: + delivery: # enum: AtMostOnce, AtLeastOnce + maxWrite: # quantity (e.g. 500k) + compression: # enum of supported algos specific to the output + minRetryDuration: + maxRetryDuration: + kafka: + url: + topic: #templating? + brokers: + authorization: # output specific auth keys + tuning: + delivery: # enum: AtMostOnce, AtLeastOnce + maxWrite: # quantity (e.g. 500k) + compression: # enum of supported algos specific to the output + loki: + url: + tenant: # templating? + labelKeys: + authorization: # output specific auth keys + tuning: + delivery: # enum: AtMostOnce, AtLeastOnce + maxWrite: # quantity (e.g. 500k) + compression: # enum of supported algos specific to the output + minRetryDuration: + maxRetryDuration: + lokiStack: # RH managed loki stack with RH tenant model + target: + name: + namespace: + labelKeys: + authorization: + token: + key: + secret: + name: # the name of resource + serviceAccount: + name: + username: + key: + secret: + name: # the name of resource + password: + key: + secret: + name: # the name of resource + tuning: + delivery: # enum: AtMostOnce, AtLeastOnce + maxWrite: # quantity (e.g. 500k) + compression: # enum of supported algos specific to the output + minRetryDuration: + maxRetryDuration: + splunk: + url: + index: #templating? + authorization: + secret: #the secret to search for keys + name: + # output specific auth keys + tuning: + delivery: # enum: AtMostOnce, AtLeastOnce + maxWrite: # quantity (e.g. 500k) + compression: # enum of supported algos specific to the output + minRetryDuration: + maxRetryDuration: + syslog: #only supports RFC5424? + url: + severity: + facility: + trimPrefix: + tagKey: #templating? + payloadKey: #templating? + addLogSource: + appName: #templating? + procID: #templating? + msgID: #templating? 
status:
  conditions: # []metav1.conditions
  inputs: # []metav1.conditions
  outputs: # []metav1.conditions
  filters: # []metav1.conditions
  pipelines: # []metav1.conditions
----

.Example
[source,yaml]
----
apiVersion: "observability.openshift.io/v1"
kind: ClusterLogForwarder
metadata:
  name: log-collector
  namespace: acme-logging
spec:
  outputs:
  - name: rh-loki
    type: lokiStack
    service:
      namespace: openshift-logging
      name: rh-managed-loki
    authorization:
      resource:
        name: audit-collector-sa-token
      token:
        key: token
  inputs:
  - name: infra-container
    type: infrastructure
    infrastructure:
      sources: [container]
  serviceAccount:
    name: audit-collector-sa
  pipelines:
  - inputRefs:
    - infra-container
    - audit
    outputRefs:
    - rh-loki
----

This example:

* Deploys a log collector to the `acme-logging` namespace
* Expects the administrator to have created a service account named `audit-collector-sa` in that namespace
* Expects the administrator to have created a secret named `audit-collector-sa-token` in that namespace with a key named token that is a bearer token
* Expects the administrator to have bound the roles `collect-audit-logs`, `collect-infrastructure-logs` to the service account
* Expects the administrator created a **LokiStack** CR named `rh-managed-loki` in the `openshift-logging` namespace
* Collects all audit log sources and only infrastructure container logs and writes them to the Red Hat managed lokiStack

### Topology Considerations
#### Hypershift / Hosted Control Planes
#### Standalone Clusters
#### Single-node Deployments or MicroShift

### Implementation Details/Notes/Constraints

==== Log Storage

Deployment of log storage is a separate task of the administrator. They deploy a custom resource to be managed by the **loki-operator**. They will additionally specify forwarding logs to this storage by defining an output in the **ClusterLogForwarder**. Deployment of Red Hat managed log storage is optional and not a requirement for log forwarding.

==== Log Visualization

The **cluster-observability-operator** will take ownership of the management of the **console-logging-plugin** which replaces the **log-view-plugin**. This requires feature changes to the operator and the OpenShift console before being fully realized. Earlier versions of the **cluster-logging-operator** will be updated with logic (TBD) to recognize the **cluster-observability-operator** is able to deploy the plugin and will remove its own deployment in deference to the **cluster-observability-operator**. Deployment of log visualization is optional and not a requirement for log forwarding.

==== Log Collection and Forwarding

The *observability.openshift.io/v1* version of the **ClusterLogForwarder** depends upon a **ServiceAccount** to which roles must be bound that allow elevated permissions (e.g., mounting node filesystem, collecting logs).

The Red Hat managed log store is represented by a `lokiStack` output type defined without a URL with the following assumptions:

* Named the same as a **LokiStack** CR deployed in the `openshift-logging` namespace
* Follows the logging tenant model

The **cluster-logging-operator** will:

* Internally migrate the **ClusterLogForwarder** to craft the URL to the **LokiStack**

==== Data Model

The **ClusterLogForwarder** API allows for users to spec the format of data that is forwarded to an output.
Various models are provided to allow users to embrace industry trends (e.g., OTEL) while also offering the capability to continue with the current model. This will allow consumers to continue to use existing tooling while offering options for transitioning to other models when they are ready. + +===== ViaQ + +The ViaQ model is the original data model that has been provided since the inception of OpenShift logging. The model has not been generally publicly documented until relatively recently. It can be verbose and was subject to subtle change causing issues for users because of the lack of documentation. This enhancement document intends to rectify that. + +====== V1 + +Refer to the following reference documentation for model details: + +* Container Logs: https://github.com/openshift/cluster-logging-operator/blob/release-5.9/docs/reference/datamodels/viaq/v1.adoc#viaq-data-model-for-containers +* Journald Node Logs: https://github.com/openshift/cluster-logging-operator/blob/release-5.9/docs/reference/datamodels/viaq/v1.adoc#viaq-data-model-for-journald +* Kubernetes & OpenShift API Events: https://github.com/openshift/cluster-logging-operator/blob/release-5.9/docs/reference/datamodels/viaq/v1.adoc#viaq-data-model-for-kubernetes-api-events + +====== + + V2 + +The progression of the ViaQ data model strives to be succinct by removing fields that have been reported by customers as extraneous. + +.Container log +[source,yaml] +---- +model_version: v2.0 +timestamp: +hostname: +severity: +kubernetes: + container_image: + container_name: + pod_name: + namespace_name: + namespace_labels: #map[string]string: underscore, dedotted, deslashed + labels: #map[string]string: underscore, dedotted, deslashed + stream: #enum: stdout,stderr +message: #string: optional. only preset when structured is not +structured: #map[string]: optional. only present when message is not +openshift: + cluster_id: + log_type: #enum: application, infrastructure, audit + log_source: #journal, ovn, etc + sequence: #int: atomically increasing number during the life of the collector process to be used with the timestamp + labels: #map[string]string: additional labels added to the record defined on a pipeline +---- + +.Event Log +[source,yaml] +---- +model_version: v2.0 +timestamp: +hostname: +event: + uid: + object_ref_api_group: + object_ref_api_version: + object_ref_name: + object_ref_resource: + request_received_timestamp: + response_status_code: + stage: + stage_timestamp: + user_groups: [] + user_name: + user_uid: + user_agent: + verb: +openshift: + cluster_id: + log_type: #audit + log_source: #enum: kube,openshift,ovn,auditd + labels: #map[string]string: additional labels added to the record defined on a pipeline +---- + +.Journald Log +[source,yaml] +---- +model_version: v2.0 +timestamp: +message: +hostname: +systemd: + t: #map + u: #map +openshift: + cluster_id: + log_type: #infrastructure + log_source: #journald + labels: #map[string]string: additional labels added to the record defined on a pipeline +---- + +=== Risks and Mitigations + +==== User Experience + +The product is no longer offering a "one-click" experience for deploying a full logging stack from collection to storage. Given we started moving away from this experience when Loki was introduced, this should be low risk. Many customers already have their own log storage solution so they are only making use of log forwarding. 
Additionally, it is intended for the **cluster-observability-operator** to recognize the existence of the internally managed log storage and automatically deploy the view plugin. This should reduce the burden of administrators. + +==== Security + +The risk of forwarding logs to unauthorized destinations remains as from previous releases. This enhancement embraces the design from [multi cluster log forwarding](https://github.com/openshift/enhancements/blob/master/enhancements/cluster-logging/multi-cluster-log-forwarder.md) by requiring administrators to provide a service account with the proper permissions. The permission scheme relies upon RBAC offered by the platform and places the control in the hands of administrators. + +### Drawbacks + +The largest drawback to implementing new APIs is the product continues to identify the availability of technologies which are deprecated and will soon not be supported. This will continue to confuse consumers of logging and will require documentation and explanations of our technology decisions. Furthermore, some customers will continue to delay the move to the newer technologies provided by Red Hat. + +## Open Questions [optional] + +## Test Plan + +* Execute all existing tests for log collection, forwarding and storage with the exception of tests specifically intended to test deprecated features (e.g., Elasticsearch). Functionally, other tests are still applicable +* Execute a test to verify the flow defined for collecting, storing, and visualizing logs from an on-cluster, Red Hat operator managed LokiStack +* Execute a test to verify legacy deployments of logging are no longer managed by the **cluster-logging-operator** after upgrade. + +## Graduation Criteria + +### Dev Preview -> Tech Preview + +### Tech Preview -> GA + +This release: + +* Intends to support the use-cases described within this proposal +* Intends to distribute *ClusterLogForwarder.observability.openshift.io/v1* of the APIs described within this proposal +* Drop support of *ClusterLogging.logging.openshift.io/v1* API +* Deprecate support of *ClusterLogForwarder.logging.openshift.io/v1* API +* Stop any feature development to support the *ClusterLogForwarder.logging.openshift.io/v1* API +* May support multiple data models (e.g., OpenTelemetry, VIAQ v2) + +### Removing a deprecated feature + +Upon GA release of this enhancement: + +- The internally managed Elastic (e.g., Elasticsearch, Kibana) offering will no longer be available. +- The Fluentd collector implementation will no longer be available +- The *ClusterLogForwarder.logging.openshift.io/v1* is deprecated and intends to be removed after two z-stream releases after GA of this enhancement. +- The *ClusterLogging.logging.openshift.io/v1* will no longer be available + +## Upgrade / Downgrade Strategy + +The **cluster-logging-operator** will internally convert the *ClusterLogForwarder.logging.openshift.io/v1* resources to *ClusterLogForwarder.observability.openshift.io/v1* and identify the original resource as deprecated. The operator will return an error for any resource that is unable to be converted, for example, a forwarder that is utilizing the FluentdForward output type. Once migrated, the operator will continue to reconcile it. Log forwarders depending upon fluentd collectors will be re-deployed with vector collectors. Fluentd deployments forwarding to fluentforward endpoints will be unsupported. + +**Note:** No new features will be added to *ClusterLogForwarder.logging.openshift.io/v1*. 
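
To illustrate the kind of translation involved, the following sketch shows the same simple forwarder expressed against both API groups. The field names follow the schema earlier in this proposal; the names, URL, and service account here are placeholders, and the exact resource produced by the automated migration is determined by the operator.

```yaml
# Existing resource in the deprecated API group
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
  - name: remote-loki
    type: loki
    url: https://loki.example.com:3100
  pipelines:
  - inputRefs: [application]
    outputRefs: [remote-loki]
---
# Equivalent resource in the new API group
apiVersion: observability.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  serviceAccount:
    name: log-collector   # placeholder; the new API requires a bound service account
  outputs:
  - name: remote-loki
    type: loki
    loki:
      url: https://loki.example.com:3100
  pipelines:
  - inputRefs: [application]
    outputRefs: [remote-loki]
```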
+ +**LokiStack** is unaffected by this proposal and not managed by the **cluster-logging-operator** + +## Version Skew Strategy + +## Operational Aspects of API Extensions + +## Support Procedures + +## Alternatives + +Given most of the changes will result in an operator that manages only log collection and forwarding, we could release a new operator for that purpose only that provides only *ClusterLogForwarder.observability.openshift.io/v1* APIs + +## Infrastructure Needed [optional] + + +--- +title: cluster-logging-log-forwarding +authors: + - "@jcantrill" + - "@jaosorior" + - "@alanconway" +reviewers: + - "@bparees" + - "@ewolinetz" + - "@jeremyeder" +approvers: + - "@bparees" + - "@ewolinetz" +creation-date: 2019-09-17 +last-updated: 2020-07-20 +status: implementable +see-also:[] +replaces:[] +superseded-by: + - "/enhancements/cluster-logging-v2-apis.md" +--- + +# cluster-logging-log-forwarding + +## Release Signoff Checklist + +- [X] Enhancement is `implementable` +- [X] Design details are appropriately documented from clear requirements +- [X] Test plan is defined +- [X] Graduation criteria for dev preview, tech preview, GA +- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/) + +## Summary + +The purpose of log forwarding is to provide a declarative way by which adopters of cluster logging can ship +container and node logs to destinations that are not necessarily managed by the OKD cluster logging infrastructure. +Destinations are either on or off cluster endpoints such as the cluster logging provided Elasticsearch, an organization's Kafka message bus, a syslog server, etc. + +This document describes the initial release goals for log forwarding. "Future plans" at the end outlines stories that were pushed out during discussions. + +## Motivation + +Organizations desire to reuse their existing enterprise log solutions +to store container logs. Providing a declarative mechanism by which +administrators define a log destination simplifies their operational +burden. They are able to take advantage of log collection +infrastructure with minimal configuration changes. The cluster logging +stack is able to deploy a collector to each OKD node that is +configured with the necessary permissions to collect the logs, add +container metadata (e.g. labels) and ship them to the specified +endpoint. + +### Goals +The specific goals of this proposal are: + +* Selectively forward `application`, `infrastructure` and `audit` inputs. +* Forward any combination of inputs to any combination of outputs. +* Send logs to endpoints not managed by the cluster logging infrastructure such as: + * An Elasticsearch cluster (version 5 or 6) + * An endpoint that accepts the fluent forward protocol (fluentd, fluentbit, others...). + * An endpoint that accepts the syslog protocol via UDP, TCP or TLS. + * A kafka broker + * Others based on demand... +* Support TLS connections to outputs if so configured +* Provide a common, simplified, generic configuration for all output types. + - connection URL, TLS, reconnect are always configured the same way. + - limited access to essential output-specific features, e.g. setting a syslog facility. +* Configure inputs for the managed store (Elasticsearch) in the same way as external stores. +* Deploy a singleton cluster-scoped forwarder to manage forwarding for the cluster. +* Re-connect automatically if an output connection fails. 
+ +We will be successful when: + +* an administrator is able to deploy their own log aggregation service + - specifies this service as an output in the `ClusterLogForwarder` spec. + - specifies the inputs (categories) to forward + - the service receives the expected logs + +### Non-Goals + +* No secure storage for audit logs, only secure (TLS) delivery to a target system. + - The user must ensure that the target system is secure and compliant with regulations. + - The OpenShift Elasticsearch store is not guaranteed to comply with any such regulations. +* No direct access to the configuration schemes of target systems or the local collector + - limited access to essential output-specific features, e.g. setting a syslog facility. +* Not intended to provide a complex routing solution as one might achieve by using a custom collector configuration or a messaging solution (e.g. kafka) - but is intended to allow forwarding to an such a system that is deployed externally. + +## Proposal + +Provide a declarative `pipeline` that associates a set of named `inputs` with a set of named `outputs`. + +The following reserved input names are defined for the initial release: +* `application` - Container logs generated by user applications running on the platform, excluding `infrastructure` containers. +* `infrastructure` - Logs generated by infrastructure components running on the platform and OKD nodes (e.g. journal logs). "infrastructure" applications are defined as any pods which run in namespaces: `openshift*`, `kube*`, `default`. +* `audit` - Logs generated by the nodes' auditd (/var/log/audit/audit.log), audit logs from the kubeapi-server and the openshift-apiserver. This will not be forwarded by default. + +The following reserved output names are defined: +* `default` - the current Elasticsearch based store. + +Users can define their own named outputs pointing to their target endpoints. An endpoint can be deployed on or off cluster. Endpoints off-cluster may require adminstrators to perform additional actions in order for logs to be forwarded (e.g. secret creation, opening port, enable global proxy configuration) + +The following output types are planned for initial support: + +* Elasticsearch (v6.x) with/without TLS. +* Fluent `forward` with/without TLS. +* Syslog UDP, TCP, TLS. +* Kafka + +### User Stories + +#### As an OKD admin, I want to deploy only a collector, and have it ship logs off-cluster to my own aggregation solution + +This is a typical example of organizations that desires to re-use their existing enterprise log solution. We will succeeded if we are able to write logs to their logging service. + +#### As an OKD admin, I want to aggregate application logs on-cluster and infra logs off-cluster + +This is an example of an OKD cluster hosting solution where several organizations are each provided with a dedicated cluster. The organization requires access to application container logs but the host requires access to the infra structure logs. + +#### As an OKD admin, I need to forward my audit logs to a secure SIEM that meets government regulations + +This is often required for industries such as the US public sector, healthcare or financials. The logs will be forwarded to a government approved SIEM through secure means (mutual TLS). + +### Implementation Details + +* A pipeline associates multiple input names with multiple output names. +* Users can define new outputs, output configuration includes: + - `type` (e.g. `Syslog`,`Fluent`) + - `url` used to connect to the endpoint. 
+ - `secret` referring to a secret object used for secure connections. + - For a standard logging installation, secrets are in the `openshift-logging` namespace + - You can omit the `namespace` field in the secretRef, the `openshift-logging` namespace is assumed. If you do include the `namespace` field it must be `openshift-logging` + - Secrets not created by the cluster-logging-operator shall be created and managed by the administrator of the endpoint + - NOTE: We use SecretReference rather than a simple name string for future flexibility. +* If no `ClusterLogForwarder` object exists, the default Elasticsearch instance is enabled (status quo) +* If a `ClusterLogForwarder` exists the default Elasticsearch instance is disabled unless there is a `pipeline` with the `Default` output. + +#### Security + +* Server-authenticated TLS is enabled if `url` is a secure URL (e.g. 'https:') +* Client-authenticated TLS is enabled if `url` is secure *and* `secretRef` has keys `tls.crt`, `tls.key`, `ca-bundle.crt` + - it is an error if `secretRef` is present but `url` is *not* secure, or the required 'tls.' keys are missing or invalid. +* An intentionally *insecure* output (no TLS) must have `insecure: true` + - This is to avoid accidental insecure mis-configuration of an output that was intended to be secure. +* The user is responsible for creating and maintaining the secret objects +* The cluster logging operator is responsible for watching secrets and applying changes + - e.g. if the user replaces a certificate or changes a password in a secret, the operator must re-connect affected outputs with the new credentials. + +#### Reliability +* Output connections will automatically re-connect on disconnect. +* Changes to secrets will trigger automatic re-connect with new credentials. + +#### Scale +* The cluster logging operator generates configuration for the collector + - with a singleton ClusterLogForwarder this is unlikely to be a scaling problem. +* The actual connection and forwarding to remote endpoints is done by the collector + - we rely on the collector to scale and perform. + +#### Metrics +Note: the actual forwarding is done by the collector, so we can only provide metrics that are available from the collector. + +Desirable metrics include: + +* Counter: + - Volume (bytes) per input, per pipeline, per output and total. +* Histogram/Summary: + - Throughput (bytes/sec) per input, per pipeline, per output and total. + - Read size: per input + - Write size: per output + - Latency (sec) + - per pipeline: from read to written on all outputs. + - per output: from read to written on this output. + +#### Cluster Logging Operator +The `cluster-logging-operator` will use the `ClusterLogForwarder` configuration to: + +* generate output configuration for the collector that respects all the pipelines. +* mount secrets in the collector daemonset as needed for each endpoint. + - the controller ensures that collector configuration refers to the correct mounted secrets. + - the exact location of secrets in the file-system is a controller implementation detail. + +#### Collector +* The collectors will be modified to be remove endpoint config specific logic from the start script ; configuration is assumed to be correct and used as provided by the `cluster-logging-operator` +* Extract all configuration into the collector configuration. 
+* Extract the `run.sh` script from the collector image and mount into the deployed pod + +### Risks and Mitigations +- The API and GA feature set are a close match to the Tech Preview API, which reduces the risk. +- We have starting-point implementations for fluent and syslog outputs. +- We have done experimental work on kafka outputs. + +### Examples CRs for some use cases + +#### As a cluster administrator, I want to forward to a remote service and also store logs locally + +I want a remote copy of logs, but also I want to continue using the default elasticsearch log store: +- I don't lose logs while the remote service is down. +- My local users can continue to view and query the logs locally. + +```yaml +apiVersion: "logging.openshift.io/v1" +kind: "ClusterLogForwarder" +spec: + outputs: + - name: SecureRemote + type: syslog + url: tls://secureforward.offcluster.com:9200 + secret: + name: my_secrets # Must contain keys tls.key, tls.cert and ca.cert + + pipelines: + - inputs: [ infrastructure, application, audit ] + outputs: [ SecureRemote, Default ] +``` + +#### As a cluster administrator, I want to use a local syslog instance only, with no elasticsearch + +```yaml +apiVersion: "logging.openshift.io/v1" +kind: "ClusterLogForwarder" +spec: + outputs: + - name: MyLogs + type: syslog + syslog: + Facility: Local0 + url: localstore.example.com:9200 + pipelines: + - inputs: [infrastructure, application, audit] + outputs: [MyLogs] +``` + +#### As a cluster administrator, I want to clearly separate where the logging stack forwards infrastructure and/or audit related logs + +```yaml +apiVersion: "logging.openshift.io/v1" +kind: "ClusterLogForwarder" +spec: + outputs: + - name: MyInfra ... + - name: MyApp ... + - name: MyAudit ... + pipelines: + - inputs: [infrastructure] + outputs: [MyInfra] + - inputs: [application] + outputs: [MyApp] + - inputs: [audit] + outputs: [MyAudit] +``` +### As a Red Hat SRE who operates OSD clusters, I want a mechanism to protect my configuration (e.g. audit log forwarding, infra logs) from non SRE administrators of OSD but at the same time give them the opportunity to configure their own log forwarding for applications + +This use case will be resolved by an admissions webhook, outside of the forwarder. Such a webhook will +* refuse requests to create/modify pipelines with `infrastructure` or `audit` inputs except for a special role/user representing the SRC +* allow requests to create/modify pipelines with only `application` inputs as usual. + +### Test Plan + +#### Regression testing +Translate all existing TP tests to new API, translation should be simple, tests should pass. + +#### Unit testing +* Log forwarding will add unit tests to provide adequate coverage for all changes +* BDD unit testing will be added to unit testing to make tests goals more expliit, readable, and obvious +* Use `go test -cover` and related tools to measure coverage https://blog.golang.org/cover + +#### Functional testing +Go tests that run (in sub-processes or goroutines): +- a collector instance +- a dummy log receiver +- a simulated container generating logs + +Verify configurations: +* Pipelines with multiple, overlapping outputs. +* TLS server and client authentication. +* Tests for all output types +* Verify reconnect +* Error scenarios. + +Note: by driving functional tests from `go test` we can get coverage stats integrated with the unit tests. + +#### Integration and E2E tests +* Tests to verify no change in behavior with + - No ClusterLogForwarder object deployed. 
+ - A ClusterLogForwarder object with this configuration: + pipelines: {inputs: [infrastructure, application, audit], outputs: [Default]}``` +* Tests to verify log forwarding is writing logs to an Elasticsearch instance not managed by cluster logging +* Tests to verify log forwarding is writing logs to a fluentd instance that is not managed by cluster logging +* Tests to verify log forwarding is writing logs to a syslog instance that is not managed by cluster logging + +#### Scale and stress testing +* Run selected E2E tests under stress conditions: + - many nodes + - many containers + - high-volume log streams + - many outputs +* Find breaking points. +* Fix bugs that show up under stress. +* Optimize performance bottlenecks. + +### Graduation Criteria + +#### Tech Preview -> GA + +Essential: +- Refactor existing TP implementation to implement new API. +- Implement new GA output types. +- Sufficient test coverage (upgrade, tech. preview migration, downgrade, scale) +- Available by default without tech preview annotation +- End user documentation. + +### Upgrade / Downgrade Strategy + +#### Upgrade + +After upgrading the cluster-logging-operator the Tech Preview API will become inactive (abandon in place) + +For upgrade from a Tech preview we are not obliged to do more than that, but if time permits the operator will: +* detect an existing tech-preview instance +* generate an equivalent GA configuration (also considering the TP enable/disable annotations to include/exclude a Default pipeline) +* deploy the equivalent GA API instance +* mark the old instance as inactive with an informative status. + +#### Downgrade +Downgrades should be discouraged unless we know for certain the Elasticsearch version managed by cluster logging is the same version. There is risk that Elasticsearch may have migrated data that is unreadable by an older version. + +### Version Skew Strategy + +Version skew is not relevant to the GA proposal because the operands will not change, only the way the operator configures them. Logging is deployed as an OLM managed operator and component versions are set in a versioned operator deployment. + +In future upgrades where operator+operand versions may be temporarily mismatched, we will need to handle the version skew issues. + + +## Implementation History + +| release|Description| +|---|---| +|4.3| **Tech Preview** - Initial release supporting `Elasticsearch` and Fluentd `forward` + +## Drawbacks +Drawbacks to providing this enhancement are: +* Increased exposure to issues being raised by customers for things outside the control of the cluster logging team + * What happens when the customer managed endpoint is down? How well does the collector handle the back pressure? When do logs get dropped because they can not be shipped? +* Setting customer expectations of the capabilities of log forwarding and guarantees (e.g. rates, delivery, reliability, message loss) + +## Alternatives + +Provide a recipe for customer's to deploy their own log collector to move the responsibility to the customer. + +## Infrastructure Needed +* Future target endpoints may require special infrastructure or licensing to properly test. +* Scale and stress tests require intensive use of a large cluster for an extended period of time. + +## Future plans +As well as serving the current GA requirements, the log forwarding API has been designed with the following future requirements in mind. 
+ +### Stand-alone log forwarding + +Deploy log forwarding without deploying the entirety of the cluster logging infrastructure (e.g. Kibana, Elasticsearch) Forwarding will be a stand-alone system independent of any log store. This decoupling will let us test forwarding separately, and let customers to switch off our managed store entirely while still using a managed and supported forwarder. + +### As a team lead (tenant), I’d like to configure secure log forwarding to the tool of my team's choice, separate from global config + +Introduce a namespace-scoped LogForwarder. The API is a restricted version of the ClusterLogForwarder API: +- can't use infrastructure or audit inputs +- can only forward logs from own namespace. + +Although there could be many `LogForwarder` objects, there is still only one collector. The operator would join all the configurations and compile them to a single collector configuration. It would also enforce the limitations of namespace-scoped forwarders. + +### I want to configure log forwarding to include/exclude logs on k8s labels + +Allow user-defined named inputs in addition to the built in application, infrastructure, audit. +User inputs can select logs based on: +* K8s label selector maps and/or expressions. +* Namespaces + +User defined inputs could also be extended to allow per-record filtering and transformations (e.g. using regular expressions), but we haven't though much about that yet. + +### I want many namespace-scoped forwarders to share the same remote logging connection +Having every namespace define it's own log forwarding outputs may create a large number of connections from the underlying collector. In many cases you would like to define a single Output destination (e.g. for "ImportantApplications"), but allow each namespace to define for itself which applications are "Important" by creating pipelines to a shared ImportantApplications output. + +The solution is to define a "shared output" API. This has the same configuration as an `output` entry in the ClusterLogForwarder API, but can be deployed as a separate object. Any forwarder configuration can refer to the output as "/", the cluster logging controller will collect all pipelines referring to that name, and generate collector configuration +to do all the requested forwarding over a single connection. + +Security consideration: we need to restrict use of an Output either by role or namespace, needs investigation. + +--- +title: forwarder-input-selectors +authors: +- "@jcantril" +reviewers: +- "@alanconway, Red Hat Logging Architect" +- "@xperimental" +- "@syedriko" +- "@cahartma" +approvers: +- "@alanconway" +api-approvers: +- "@alanconway" +creation-date: 2023-10-30 +last-updated: 2024-03-07 +tracking-link: +- https://issues.redhat.com/browse/LOG-2155 +see-also: +- +replaces: +- +--- + + +# Log Forwarding Input Slection using Kubernetes Metadata +## Summary + + +Cluster Logging defines a set of well known log sources in order to facilitate configuration of log collection and normalization. Given customers are no longer bound to the data storage provided by cluster logging, this enhancement expands those definitions to allow specifying which logs are collected by using Kubernetes metadata. + + +Logs originate from six distinct sources and are logically grouped using the following definitions: + + +* **Application** are container logs from all namespaces across the cluster excluding infrastructure namespaces. 
---
title: forwarder-input-selectors
authors:
- "@jcantril"
reviewers:
- "@alanconway, Red Hat Logging Architect"
- "@xperimental"
- "@syedriko"
- "@cahartma"
approvers:
- "@alanconway"
api-approvers:
- "@alanconway"
creation-date: 2023-10-30
last-updated: 2024-03-07
tracking-link:
- https://issues.redhat.com/browse/LOG-2155
see-also:
-
replaces:
-
---


# Log Forwarding Input Selection using Kubernetes Metadata
## Summary


Cluster Logging defines a set of well-known log sources in order to facilitate configuration of log collection and normalization. Given that customers are no longer bound to the data storage provided by cluster logging, this enhancement expands those definitions to allow specifying which logs are collected by using Kubernetes metadata.


Logs originate from six distinct sources and are logically grouped using the following definitions:


* **Application** logs are container logs from all namespaces across the cluster, excluding infrastructure namespaces.

* **Infrastructure** logs are:
  * container logs from namespaces: default, kube*, openshift*

* **Audit** logs are written to files on master nodes and include:
  * kubernetes API server
  * OpenShift API server
  * auditd
  * OVN


**NOTE**: **application**, **infrastructure**, and **audit** are reserved words to the **cluster-logging-operator** and continue to represent the previous definitions.


Administrators use these definitions to specify pipelines that normalize and route messages from the sources to outputs.


This enhancement allows administrators to define "named" inputs by expanding the previous definitions as follows:


* Named application:
  * Any name that is not reserved
  * Collect from any namespace, including the ones for **infrastructure** container logs
* Named infrastructure:
  * Any name that is not reserved
  * Explicit source choices of: node, container
* Named audit:
  * Any name that is not reserved
  * Explicit source choices of: kubeAPI, openshiftAPI, auditd, ovn


## Motivation


### User Stories


* As an administrator of cluster logging, I want to forward logs only from a limited set of namespaces because I do not need the others
* As an administrator of cluster logging, I want to exclude logs from a limited set of namespaces because I do not need them
* As an administrator of cluster logging, I want to forward logs only from pods with a specific set of labels
* As an administrator of cluster logging, I want to exclude certain container logs from a pod because they are noisy and uninteresting to me
* As an administrator of cluster logging, I do not want to collect node logs because they are not of interest to me


### Goals


* Allow specifying which container logs are or are not collected using workload metadata (e.g. namespace, labels, container name)
* Allow specifying which sources of infrastructure (i.e. node, container) or audit (i.e. kubernetes API, openshift API, auditd, ovn) logs are collected
* Reduce the CPU and memory load on the collector by configuring it to only process logs that are interesting to administrators
* Reduce the network usage when forwarding logs
* Reduce the resources required to store logs (e.g. size, cpu, memory)
* Reduce the cost to store logs


### Non-Goals


* Introduction of the next version of logging APIs.
* Allowing administrators full access to the native collector configuration.


## Proposal


### Workflow Description


Administrators create an instance of **ClusterLogForwarder** which defines which logs to collect, how they are normalized, and where they are forwarded. They can choose to explicitly collect logs from specific namespaces, or from pods which have specific labels, by defining a "named" input. No other changes to the existing workflow are required.


### API Extensions


#### ClusterLogForwarder


The following are the additions to the InputSpec:

* Application Input
```yaml
  spec:
  - name: my-app
    application:
      namespaces: [] #deprecated: exact string or glob
      includes:
      - container: #exact string or glob
        namespace: #exact string or glob
      excludes:
      - container: #exact string or glob
        namespace: #exact string or glob
      selector: #metav1.LabelSelector
        matchLabels: []
        matchExpressions:
        - key:
          operator:
          values: []
```

**NOTE:** The *application.namespaces* field is deprecated. An illustrative example of a complete named application input is shown below.
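
For illustration only, here is a minimal sketch of a named application input that combines namespace and container globs with a label selector, wired into a pipeline. It assumes the legacy singleton forwarder (`instance` in `openshift-logging`); the input name, globs, labels, and the `default` output are hypothetical placeholder values, not part of this proposal.

```yaml
apiVersion: "logging.openshift.io/v1"
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  inputs:
  - name: my-app                    # any non-reserved name
    application:
      includes:
      - namespace: team-a-*         # glob: collect only from matching namespaces
      excludes:
      - container: sidecar-*        # glob: drop logs from matching containers
      selector:                     # metav1.LabelSelector
        matchLabels:
          tier: frontend
  pipelines:
  - inputRefs:
    - my-app
    outputRefs:
    - default
```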

```golang
  type Application struct {
    Namespaces []string
    Includes   *NamespaceContainerGlob
    Excludes   *NamespaceContainerGlob
    Selector   *metav1.LabelSelector
  }


  type NamespaceContainerGlob struct {
    Namespace string
    Container string
  }
```

* Infrastructure Input
```yaml
  spec:
  - name: my-infra
    infrastructure:
      sources: ["node","container"]

```
```golang
  type Infrastructure struct {
    Sources []string
  }

  const (
    InfrastructureSourceNode      string = "node"
    InfrastructureSourceContainer string = "container"
  )
```

* Audit Input
```yaml
  spec:
  - name: my-audit
    audit:
      sources: ["kubeAPI","openshiftAPI","auditd","ovn"]
```
```golang
  type Audit struct {
    Sources []string
  }

  const (
    AuditSourceKube      string = "kubeAPI"
    AuditSourceOpenShift string = "openShiftAPI"
    AuditSourceAuditd    string = "auditd"
    AuditSourceOVN       string = "ovn"
  )
```

##### Verification and Validations
The operator will validate resources upon reconciliation of a **ClusterLogForwarder**. Failure to meet any of the following conditions will stop the operator from deploying a collector; the resource will either be rejected before admission or have an error status added to it:


* The **ClusterLogForwarder** CR defines a valid spec
* Input spec fields that are "globs" (i.e. namespace, container) match the regular expression '`^[a-zA-Z0-9\*]*$`'
* The input field 'selector' is a valid metav1.LabelSelector
* Input enum fields accept only the values listed
* A named "infrastructure" input specifies at least one value for sources
* A named "audit" input specifies at least one value for sources


##### Examples
The following is an example of a **ClusterLogForwarder** that redefines "infrastructure" logs to include node logs and container logs from namespaces beyond "openshift*", while dropping all istio container logs from any namespace:

```yaml
  apiVersion: "logging.openshift.io/v1"
  kind: ClusterLogForwarder
  metadata:
    name: infra-logs
    namespace: mycluster-infras
  spec:
    serviceAccountName: audit-collector-sa
    inputs:
    - name: my-infra-container-logs
      application:
        namespaces:
        - openshift*
        includes:
        - namespace: mycompany-infra*
        excludes:
        - container: istio*
    - name: my-node-logs
      infrastructure:
        sources: ["node"]
    pipelines:
    - inputRefs:
      - my-infra-container-logs
      - my-node-logs
      outputRefs:
      - default
```

### Implementation Details/Notes/Constraints


* The collector configuration will be restructured to dedicate a source to each **ClusterLogForwarder** input


### Risks and Mitigations


* Are we able to provide enough test coverage to ensure we cover all the ways the configuration may change with this expanded offering?


---
title: multi-cluster-log-forwarder
authors:
- "@jcantril"
reviewers:
- "@alanconway, Red Hat Logging Architect"
approvers:
- "@alanconway"
api-approvers:
- "@alanconway"
creation-date: 2023-02-23
last-updated: 2023-07-12
tracking-link:
- https://issues.redhat.com/browse/LOG-1344
see-also:
-
replaces:
-
---

# Multi ClusterLogForwarder
## Summary

Log forwarding is functionally a "cluster singleton": the operator explicitly reconciles only a **ClusterLogForwarder** named *instance* in the namespace *openshift-logging*. This enhancement removes that restriction to allow administrators to define multiple instances of **ClusterLogForwarder** while retaining the legacy behavior.
+ + +## Motivation + +### User Stories + + +* As an administrator of a Red Hat managed cluster, I want to RBAC my log forwarder configuration from customer admins so they can take ownership of their log forwarder needs without being able to modify mine. +* As an administrator of Hosted Control Planes, I want to deploy individual log forwarders to isolate audit log collection of each managed control plane. +* As an administrator adopting vector, I want to deploy it separately from my existing fluentd deployment so they can operate side-by-side and I can migrate my workloads. + +### Goals + +* Cluster administrators control which users are allowed to define log collection and which logs they are allowed to collect. +* Users with allowable permissions are able to specify additional log collection configurations +* Log forwarder deployments are isolated so they do not interfere with other log forwarder deployments +* Support ClusterLogForwarders simultaneously in legacy and multiple instance modes + +### Non-Goals + +* Introduction of the next version of logging APIs. +* Adding RBAC to the output destinations to restrict where logs can be forwarded + +## Proposal + +### Workflow Description +This proposal identifies two separate workflows in order to support the legacy deployment and allow additional deployments to meet the enhancement goals. The legacy deployment will be familiar to users of ClusterLogForwarder +prior to the implementation of this enhancement. They should see no differences in the manner by which they use log forwarding. The new workflow will require additional permissions to create new ClusterLogForwarders in order +to limit the number of deployments for resource concerns. Cluster administrators will need to explicitly allow additional deployments. + +The workflows make the following assumptions: + +* The **cluster-logging-operator** is deployed to the *openshift-logging* namespace +* The **cluster-logging-operator** is able to watch any namespace + +#### Multiple-Instance Mode: Allowing multiple ClusterForwarder and ClusterLogging resources + +This workflow supports any ClusterLogForwarder except one named "instance" in the *openshift-logging namespace*. The resource openshift-logging/instance is significant to supporting the legacy workflow. + +**NOTE:** Vector is the only supported collector implementation in this mode. + +**cluster administrator** is a user: + +* responsible for maintaining the cluster +* able to bind cluster roles to serviceaccounts +* that deploys the **cluster-logging-operator** + +**namespace administrator** is a user: + +* able to create a serviceaccount +* able to create a serviceaccount token +* manages a **ClusterLogForwarder** custom resource + +The general workflow: + +* The namespace administrator creates a service account to be used by a log collector. The service account must additionally include a token if there is intent to write to log storage that depends upon a token for authentication. +* The cluster administrator binds cluster roles to the service account for the log types they are allowed to collect (e.g. audit, infrastructure). 
Several roles are added to the operator manifest and look something like:

```yaml
  apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRole
  metadata:
    name: collect-audit-logs
  rules:
  - apiGroups:
    - "logging.openshift.io"
    resources:
    - logs
    resourceNames:
    - audit
    verbs:
    - collect

```
This role allows collection of audit logs and requires the cluster administrator to bind the service account to the role, for example:

```text
  oc create clusterrolebinding kube-audit-log-collection --clusterrole=collect-audit-logs --serviceaccount=openshift-kube-apiserver:audit-collector-sa
```

* The namespace administrator creates a **ClusterLogForwarder** CR that references the service account and the inputs that the service account is allowed to collect

```yaml
  apiVersion: "logging.openshift.io/v1"
  kind: ClusterLogForwarder
  metadata:
    name: audit-collector
    namespace: openshift-kube-apiserver
  spec:
    serviceAccountName: audit-collector-sa
    pipelines:
    - inputRefs:
      - audit
      outputRefs:
      - loki
    outputs:
    - name: loki
      type: loki
      url: https://mycenteralizedserver.some.place
```

##### Use of ClusterLogging resource
This resource is optional in multiple-instance mode, which is a departure from legacy mode, where a ClusterLogging resource is always required with a ClusterLogForwarder. A namespace administrator must define a **ClusterLogging** CR named the same as the **ClusterLogForwarder** CR, and in the same namespace, when they need to spec collector resources or placement.

```yaml
  apiVersion: "logging.openshift.io/v1"
  kind: "ClusterLogging"
  metadata:
    name: audit-collector
    namespace: openshift-kube-apiserver
  spec:
    collection:
      type: "vector"
```

The relevant spec-level fields for this CR in multiple-instance mode are:

* managementState
* collection

All other spec fields are ignored: logStore, visualization, curation, forwarder, collection.logs


##### Verification and Validations
The operator will validate resources upon reconciliation of a **ClusterLogForwarder** and **ClusterLogging** CR. Failure to meet any of the following conditions will stop the operator from deploying a collector, and it will add an error status to the resource.

* The **ClusterLogForwarder** CR defines a valid spec
* The service account defined in the **ClusterLogForwarder** CR is bound to cluster roles that allow the inputs specified in the **ClusterLogForwarder** CR
* When a **ClusterLogging** CR is deployed that has a matching name and namespace to a **ClusterLogForwarder** CR, it must only define a valid collection spec.

The previous example identifies a valid **ClusterLogForwarder** CR that specifies audit logs to be forwarded to a Loki stack. The following is an example of a CR rejected by the operator because it specifies collection of application logs but does not have the required role binding:

```yaml
  apiVersion: "logging.openshift.io/v1"
  kind: ClusterLogForwarder
  metadata:
    name: audit-collector
    namespace: openshift-kube-apiserver
  spec:
    pipelines:
    - inputRefs:
      - audit
      - application
      outputRefs:
      - loki
    outputs:
    - name: loki
      type: loki
      url: https://mycenteralizedserver.some.place
```
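
For completeness, a hedged sketch of the additional binding that the cluster administrator would need to create before such a CR could be accepted, assuming an analogous `collect-application-logs` ClusterRole is shipped alongside `collect-audit-logs` (the role and binding names here are illustrative, not taken from this proposal):

```yaml
  apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    name: kube-apiserver-app-log-collection   # illustrative name
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: collect-application-logs            # assumed analogue of collect-audit-logs
  subjects:
  - kind: ServiceAccount
    name: audit-collector-sa
    namespace: openshift-kube-apiserver
```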
#### Legacy Mode: Allow only a single ClusterLogForwarder and ClusterLogging resource in openshift-logging

This workflow is the existing, legacy workflow. It relies upon opinionated resource names in an explicit namespace. There are two variations to this workflow: the administrator provides a **ClusterLogging** CR with or without a **ClusterLogForwarder**. This workflow continues to function as it did for releases of logging prior to the implementation of this proposal:

* A **ClusterLogging** CR which specs collection and a log store results in a deployment that collects application and infrastructure logs and forwards them to the logging-operator-managed log store (e.g. loki, elasticsearch)
* A **ClusterLogging** CR which specs at least collection, together with a **ClusterLogForwarder** CR which defines forwarding, results in a deployment that at a minimum is a collector that forwards logs to the defined outputs

### API Extensions
None

### Implementation Details/Notes/Constraints

#### Log File Metric Exporter as a Separate Deployment

The cluster logging project provides a component to gather metrics about the volume of application logs being generated on each node in the cluster. Prior to this enhancement this component was deployed as part of the collector pod. This proposal will:

* move this component into a separate deployment from the collector
* introduce an API to support configuring the component
* explicitly only reconcile the object in the namespace *openshift-logging* named *instance*

```yaml
  apiVersion: "logging.openshift.io/v1alpha1"
  kind: LogFileMetricExporter
  metadata:
    name: instance
    namespace: openshift-logging
  spec:
    tolerations:
    resources:
      limits:
      requests:
```
* restrict the number of deployments to 1, as no more than one is required per cluster
* require existing cluster logging deployments to create an instance of **LogFileMetricExporter** in order to continue to generate these metrics

**Note:** This is a breaking change from previous releases, but it will allow administrators to manage scheduling and resources and to explicitly choose to gather these metrics.

#### Metrics Dashboards

* Deploy a singleton instance of the collector dashboard if the **ClusterLogForwarder** count >= 1
* Refactor the dashboard to be agnostic of collector implementation and support multiple collector deployments


#### Metrics Alerts
* Deploy a singleton instance of the alerts if the **ClusterLogForwarder** count >= 1
* Refactor alerts to be agnostic of collector implementation and support multiple collector deployments


### Risks and Mitigations

* Are we properly supporting the app-sre?
* Are we properly supporting Hosted Control Planes?

### Drawbacks

## Design Details

### Open Questions [optional]

1. Is there any reason we need to support fluentd deployments for this feature, given that we consider fluentd deprecated?
+ +### Test Plan +* Verify existing (legacy) deployments upgrade without regression +* Verify administrators can create legacy mode deployments as documented in logging 5.7 without regression + +# diff --git a/observability/logging/logging-6.0/_attributes b/observability/logging/logging-6.0/_attributes new file mode 120000 index 000000000000..bf7c2529fdb4 --- /dev/null +++ b/observability/logging/logging-6.0/_attributes @@ -0,0 +1 @@ +../../../_attributes/ \ No newline at end of file diff --git a/observability/logging/logging-6.0/images b/observability/logging/logging-6.0/images new file mode 120000 index 000000000000..4399cbb3c0f3 --- /dev/null +++ b/observability/logging/logging-6.0/images @@ -0,0 +1 @@ +../../../images/ \ No newline at end of file diff --git a/observability/logging/logging-6.0/log6x-about.adoc b/observability/logging/logging-6.0/log6x-about.adoc new file mode 100644 index 000000000000..c1087357597c --- /dev/null +++ b/observability/logging/logging-6.0/log6x-about.adoc @@ -0,0 +1,21 @@ +:_mod-docs-content-type: ASSEMBLY +include::_attributes/common-attributes.adoc[] +[id="log6x-about"] += Logging 6.0 +:context: logging-6x + +toc::[] + +// Summary of significant changes from 5x. +include::modules/logging-6x-v-5x.adoc[leveloffset=+1] + +// Migration guide(s). 1 - Migrating to the current stack re configurations. 2 - Migrating existing data. +include::modules/logging-elastic-to-loki-store.adoc[leveloffset=+1] + +// User education - teach to fish for field fornats. +include::modules/log6x-oc-explain.adoc[leveloffset=+1] + +== Operators (architecture). +// Should be an include, create: modules/log6x-operators-crs.adoc - Associated JIRA? + +* Operator:: CR ? diff --git a/observability/logging/logging-6.0/log6x-clf.adoc b/observability/logging/logging-6.0/log6x-clf.adoc new file mode 100644 index 000000000000..78b578ec6c8e --- /dev/null +++ b/observability/logging/logging-6.0/log6x-clf.adoc @@ -0,0 +1,91 @@ +:_mod-docs-content-type: ASSEMBLY +include::_attributes/common-attributes.adoc[] +[id="log6x-CLF_{context}"] += Configuring log forwarding +:context: logging-6x + +toc::[] + +The `ClusterLogForwarder` CR serves as a single point of configuration for log forwarding, making it easier to manage and maintain log collection and forwarding rules. + +* Defines inputs (sources) for log collection +* Specifies outputs (destinations) for log forwarding +* Configures filters for log processing +* Defines pipelines to route logs from inputs to outputs +* Indicates management state (managed or unmanaged) + +// Needs engineering eval re applicability to 6.0 +include::modules/log-forwarding-implementations.adoc[leveloffset=+1] + +// This needs to be called out extensively. +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +metadata: + name: instance +spec: + inputs: + - application + - infrastructure + - audit + outputs: + - lokiStack + - elasticsearch + - splunk + filters: + - kubeAPIAudit + - detectMultilineException + - parse + - openshiftLabels + - drop + - prune + pipelines: + - name: application-logs + input: application + output: lokiStack + filters: + - parse + - openshiftLabels + managementState: + managed: true +---- + +[id="log6x-CLF-tuning_{context}"] +== Output tuning +//need to validate how much of this applies to 6.0 & edit as needed. 
+include::modules/logging-delivery-tuning.adoc[leveloffset=+1] + + +[id="log6x-CLF-filters_{context}"] +== Filtering logs + +The following filters are available: +// Should be an include for each, create: modules/log6.adoc - Associated JIRA? +detectMultilineException:: Detects and handles multiline exceptions. +parse:: Parses log messages using Vector's parsing language. +openshiftLabels:: Adds OpenShift-specific labels to log messages. + +// All of these need to be validated by engineering for 6.0. +include::modules/logging-audit-log-filtering.adoc[leveloffset=+1] + +include::modules/logging-content-filter-drop-records.adoc[leveloffset=+1] + +include::modules/logging-content-filter-prune-records.adoc[leveloffset=+1] + +include::modules/logging-input-spec-filter-audit-infrastructure.adoc[leveloffset=+1] + +include::modules/logging-input-spec-filter-labels-expressions.adoc[leveloffset=+1] + +include::modules/logging-input-spec-filter-namespace-container.adoc[leveloffset=+1] + +include::modules/logging-multiline-except.adoc[leveloffset=+1] + + +[id="log6x-CLF-samples_{context}"] +== Use case samples +// Should be an include, create: modules/log6x-use-multi-instance.adoc - Associated JIRA? +* Multiple-Instance Mode + +// Should be an include, create: modules/log6x-use-metric-export.adoc - Associated JIRA? +* Log File Metric Exporter as a Separate Deployment diff --git a/observability/logging/logging-6.0/log6x-loki.adoc b/observability/logging/logging-6.0/log6x-loki.adoc new file mode 100644 index 000000000000..231f0fc2ca72 --- /dev/null +++ b/observability/logging/logging-6.0/log6x-loki.adoc @@ -0,0 +1,103 @@ +:_mod-docs-content-type: ASSEMBLY +include::_attributes/common-attributes.adoc[] +[id="log6x-loki"] += Storing logs with LokiStack +:context: logging-6x + +toc::[] + +== Installation +Partial includes of modules/logging-loki-cli-install.adoc & logging-loki-gui-install.adoc once https://github.com/openshift/openshift-docs/pull/77407/ is merged. + + + + +// All of these need to be validated by engineering for 6.0. +include::modules/logging-enabling-loki-alerts.adoc[leveloffset=+1] + +include::modules/logging-loki-memberlist-ip.adoc[leveloffset=+1] + +//include::modules/logging-loki-log-access.adoc[leveloffset=+1] + +include::modules/logging-identity-federation.adoc[leveloffset=+1] + +include::modules/logging-loki-log-access.adoc[leveloffset=+1] + +include::modules/logging-loki-pod-placement.adoc[leveloffset=+1] + +include::modules/logging-loki-reliability-hardening.adoc[leveloffset=+1] + +include::modules/logging-loki-restart-hardening.adoc[leveloffset=+1] + +include::modules/logging-loki-retention.adoc[leveloffset=+1] + +include::modules/logging-loki-zone-aware-rep.adoc[leveloffset=+1] + +include::modules/logging-loki-zone-fail-recovery.adoc[leveloffset=+1] + +include::modules/loki-rate-limit-errors.adoc[leveloffset=+1] + +include::modules/loki-rbac-rules-permissions.adoc[leveloffset=+1] + +== Loki Architecture: Understanding the Components +Loki is a distributed system with several core components, each serving a specific purpose in the logging pipeline: + +* **Distributor:** Receives incoming log streams, validates them, and distributes them to the appropriate ingesters. +* **Ingester:** Builds chunks of log data and creates an index to enable efficient querying. +* **Querier:** Executes queries against the log data, retrieving relevant log entries from the ingesters and storage backend. 
+* **Query Frontend:** Manages and coordinates incoming queries, distributes the workload across queriers, and returns the query results. +* **Storage Backend:** Persists the log data and index, typically using an object storage like Amazon S3 or Google Cloud Storage. +* **Compactor:** Optimizes the storage of log data by compacting older log chunks into a more efficient format. +* **Ruler:** Allows you to define rules for alerting and recording based on log data patterns. + +[discrete] +=== Distributor: + +* `spec.distributor.replicas`: Controls the number of distributor replicas. +* `spec.distributor.resources`: Sets resource limits (CPU, memory) for the distributor pods. +* `spec.distributor.ring`: Configures the hash ring for distributing log streams among ingesters. + +[discrete] +=== Ingester: + +* `spec.ingester.replicas`: Controls the number of ingester replicas. +* `spec.ingester.resources`: Sets resource limits (CPU, memory) for the ingester pods. +* `spec.ingester.lifecycler`: Configures the ingester lifecycle, including chunk lifecycle and table manager settings. +* `spec.ingester.persistence`: Configures the persistent storage for ingester data (e.g., volume claims, storage class). + +[discrete] +=== Querier: + +* `spec.querier.replicas`: Controls the number of querier replicas. +* `spec.querier.resources`: Sets resource limits (CPU, memory) for the querier pods. +* `spec.querier.queryRange`: Configures the default query range (time duration) for queriers. +* `spec.querier.storeGateway`: Configures the interaction between queriers and the storage backend. + +[discrete] +=== Query Frontend: + +* `spec.queryFrontend.replicas`: Controls the number of query frontend replicas. +* `spec.queryFrontend.resources`: Sets resource limits (CPU, memory) for the query frontend pods. +* `spec.queryFrontend.grpcServer`: Configures the gRPC server for the query frontend. + +[discrete] +=== Storage Backend: + +* `spec.storageConfig`: Specifies the type of storage backend (e.g., S3, GCS, local filesystem) and its configuration parameters (e.g., bucket name, credentials). + +[discrete] +=== Compactor: + +* `spec.compactor.replicas`: Controls the number of compactor replicas. +* `spec.compactor.resources`: Sets resource limits (CPU, memory) for the compactor pods. +* `spec.compactor.workingDirectory`: Specifies the working directory for the compactor. +* `spec.compactor.sharedStore`: Configures the shared storage for compactor data. + +[discrete] +=== Ruler: + +* `spec.ruler.replicas`: Controls the number of ruler replicas. +* `spec.ruler.resources`: Sets resource limits (CPU, memory) for the ruler pods. +* `spec.ruler.alertmanagerURL`: Specifies the URL of the Alertmanager instance to send alerts to. +* `spec.ruler.ruleNamespaceSelector`: Selects the namespaces from which to load ruler rules. +* `spec.ruler.storage`: Configures the storage for ruler data. diff --git a/observability/logging/logging-6.0/log6x-start.adoc b/observability/logging/logging-6.0/log6x-start.adoc new file mode 100644 index 000000000000..e89727a30ef8 --- /dev/null +++ b/observability/logging/logging-6.0/log6x-start.adoc @@ -0,0 +1,25 @@ +:_mod-docs-content-type: ASSEMBLY +include::_attributes/common-attributes.adoc[] +[id="log6x-start"] += Getting started with Logging 6.0 +:context: logging-6x + +toc::[] + +.Deployment workflow +. *Cluster Administrator Responsibilities:* +** Deploys the Red Hat `cluster-observability-operator`. +** Deploys the `loki-operator`. +** Deploys the Red Hat `cluster-logging-operator`. 
+** Deploys an instance of `LokiStack` in the `openshift-logging` namespace. +** Creates a `ClusterLogForwarder` custom resource for the `LokiStack`. + +. *Cluster-Observability-Operator Actions:* +** Deploys the `console-logging-plugin` for reading logs in the OpenShift console. + +. *Loki-Operator Actions:* +** Deploys the `LokiStack` for storing logs on-cluster. + +. *Cluster-Logging-Operator Actions:* +** Deploys the log collector to forward logs to the log storage in the `openshift-logging` namespace. +** Manages different types of logs such as application logs, infrastructure logs, and audit logs. diff --git a/observability/logging/logging-6.0/modules b/observability/logging/logging-6.0/modules new file mode 120000 index 000000000000..7e8b50bee77a --- /dev/null +++ b/observability/logging/logging-6.0/modules @@ -0,0 +1 @@ +../../../modules/ \ No newline at end of file diff --git a/observability/logging/logging-6.0/snippets b/observability/logging/logging-6.0/snippets new file mode 120000 index 000000000000..ce62fd7c41e2 --- /dev/null +++ b/observability/logging/logging-6.0/snippets @@ -0,0 +1 @@ +../../../snippets/ \ No newline at end of file