Skip to content

Commit

Permalink
add a sample for the k8scluster receiver
Browse files Browse the repository at this point in the history
  • Loading branch information
dashpole committed Nov 17, 2023
1 parent 57b0741 commit 7c81f05
Show file tree
Hide file tree
Showing 3 changed files with 253 additions and 0 deletions.
112 changes: 112 additions & 0 deletions recipes/kubernetes-state-receiver/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,112 @@
# K8sCluster integration

This recipe demonstrates how to configure the OpenTelemetry Collector
(as deployed by the Operator) to send kubernetes state metrics from the
k8s_cluster receiver to [Google Cloud Manged Service for Prometheus](https://cloud.google.com/stackdriver/docs/managed-prometheus).

This recipe is based on applying a Collector config that enables the Google Cloud exporter.
It provides an `OpenTelemetryCollector` object that, when created, instructs the Operator to
create a new instance of the Collector with that config. If overwriting an existing `OpenTelemetryCollector`
object (i.e., you already have a running Collector through the Operator such as the one from the
[main README](../../README.md#starting-the-collector)), the Operator will update that existing
Collector with the new config.


## Prerequisites

* Cloud Monitoring API enabled in your GCP project
* The `roles/monitoring.metricWriter` [IAM permission](https://cloud.google.com/monitoring/access-control#monitoring.metricWriter)
for your cluster's service account (or Workload Identity setup as shown below).
* A running GKE cluster
* The OpenTelemetry Operator installed in your cluster

## Running

### Workload Identity Setup

If you have Workload Identity enabled (on by default in GKE Autopilot), you'll need to set
up a service account with permission to write traces to Cloud Trace. You can do this with
the following commands:

```
export GCLOUD_PROJECT=<your GCP project ID>
gcloud iam service-accounts create otel-collector --project=${GCLOUD_PROJECT}
```

Then give that service account permission to write traces:

```
gcloud projects add-iam-policy-binding $GCLOUD_PROJECT \
--member "serviceAccount:otel-collector@${GCLOUD_PROJECT}.iam.gserviceaccount.com" \
--role "roles/monitoring.metricWriter"
```

Then bind the GCP service account to the Kubernetes ServiceAccount that is used by the Collector
you deployed in the prerequisites (note: set `$COLLECTOR_NAMESPACE` to the namespace you installed
the Collector in):

```
export COLLECTOR_NAMESPACE=default
gcloud iam service-accounts add-iam-policy-binding "otel-collector@${GCLOUD_PROJECT}.iam.gserviceaccount.com" \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:${GCLOUD_PROJECT}.svc.id.goog[${COLLECTOR_NAMESPACE}/otel-collector]"
```

**(Optional):** If you don't already have a ServiceAccount for the Collector (such as the one provided
when deploying a prior OpenTelemetryCollector object), create it with `kubectl create serviceaccount otel-collector`.

Finally, annotate the Collector's ServiceAccount to use Workload Identity:

```
kubectl annotate serviceaccount otel-collector \
--namespace $COLLECTOR_NAMESPACE \
iam.gke.io/gcp-service-account=otel-collector@${GCLOUD_PROJECT}.iam.gserviceaccount.com
```

### Deploying the Recipe

Apply the `OpenTelemetryCollector` object from this recipe:

```
kubectl apply -f rbac.yaml
kubectl apply -f collector-config.yaml
```

(This will overwrite any existing collector config, or create a new one if none exists.)

Once the Collector restarts, you should see traces from your application

## View your Metrics

Navigate to https://console.cloud.google.com/monitoring/metrics-explorer, and
search for `k8s_` to show all metrics from the k8s cluster receiver. Make sure
you are looking at the right GCP project. If you don't see any metrics right
away, you might need to wait and refresh the page.

## Troubleshooting

### rpc error: code = PermissionDenied

An error such as the following:

```
2022/10/21 13:41:11 failed to export to Google Cloud Monitoring: rpc error: code = PermissionDenied desc = The caller does not have permission
```

This indicates that your Collector is unable to export spans, likely due to misconfigured IAM. Things to check:

#### GKE (cluster-side) config issues

With some configurations it's possible that the Operator could overwrite an existing ServiceAccount when deploying
a new Collector. Ensure that the Collector's service account has the `iam.gke.io/gcp-service-account` annotation after
running the `kubectl apply...` command in [Deploying the Recipe](#deploying-the-recipe). If this is missing, re-run the
`kubectl annotate` command to add it to the ServiceAccount and restart the Collector Pod by deleting it (`kubectl delete pod/otel-collector-xxx..`).

#### GCP (project-side) config issues

Double check that IAM is properly configured for Cloud Trace access. This includes:

* Verify the `otel-collector` service account exists in your GCP project
* That service account must have `roles/monitoring.metricWriter` permissions
* The `serviceAccount:${GCLOUD_PROJECT}.svc.id.goog[${COLLECTOR_NAMESPACE}/otel-collector]` member must also be bound
to the `roles/iam.workloadIdentityUser` role (this identifies the Kubernetes ServiceAccount as able to use Workload Identity)
51 changes: 51 additions & 0 deletions recipes/kubernetes-state-receiver/collector-config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: opentelemetry.io/v1alpha1

Check warning on line 15 in recipes/kubernetes-state-receiver/collector-config.yaml

View workflow job for this annotation

GitHub Actions / yamllint

15:1 [document-start] missing document start "---"
kind: OpenTelemetryCollector
metadata:
name: otel
spec:
image: otel/opentelemetry-collector-contrib:latest
config: |
receivers:
k8s_cluster:
collection_interval: 60s
processors:
resourcedetection:
detectors: [env, gcp]
timeout: 2s
override: false
exporters:
googlemanagedprometheus:
metric:
resource_filters:
- prefix: k8s.
# If the googlecloud exporter is used, metrics are sent to k8s_cluster,
# k8s_node, k8s_pod and k8s_container monitored resources.
googlecloud:
metric:
resource_filters:
- prefix: k8s.
logging:
loglevel: debug
service:
pipelines:
metrics:
receivers: [k8s_cluster]
processors: [resourcedetection]
exporters: [logging, googlemanagedprometheus]
90 changes: 90 additions & 0 deletions recipes/kubernetes-state-receiver/rbac.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: rbac.authorization.k8s.io/v1

Check warning on line 15 in recipes/kubernetes-state-receiver/rbac.yaml

View workflow job for this annotation

GitHub Actions / yamllint

15:1 [document-start] missing document start "---"
kind: ClusterRole
metadata:
name: otel-collector
rules:
- apiGroups:
- ""
resources:
- events
- namespaces
- namespaces/status
- nodes
- nodes/spec
- nodes/stats
- nodes/proxy
- pods
- pods/status
- replicationcontrollers
- replicationcontrollers/status
- resourcequotas
- services
verbs:
- get
- list
- watch
- apiGroups:
- apps
resources:
- daemonsets
- deployments
- replicasets
- statefulsets
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- daemonsets
- deployments
- replicasets
verbs:
- get
- list
- watch
- apiGroups:
- batch
resources:
- jobs
- cronjobs
verbs:
- get
- list
- watch
- apiGroups:
- autoscaling
resources:
- horizontalpodautoscalers
verbs:
- get
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: otel-collector
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: otel-collector
subjects:
- kind: ServiceAccount
name: otel-collector
namespace: default

0 comments on commit 7c81f05

Please sign in to comment.