📖 Metrics Docs Maintenance #2024
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,7 +6,7 @@ The following procedure is provided as an example for testing purposes. Do not d | |
|
||
In OLM v1, you can use the provided metrics with tools such as the [Prometheus Operator][prometheus-operator]. By default, Operator Controller and catalogd export metrics to the `/metrics` endpoint of each service. | ||
|
||
You must grant the necessary permissions to access the metrics by using [role-based access control (RBAC) policies][rbac-k8s-docs]. | ||
You must grant the necessary permissions to access the metrics by using [role-based access control (RBAC) policies][rbac-k8s-docs]. You will also need to create a `NetworkPolicy` to allow egress traffic from your scraper pod, because by default the OLM namespace allows only `catalogd` and `operator-controller` to send and receive traffic. | ||
Because the metrics are exposed over HTTPS by default, you need valid certificates to use the metrics with services such as Prometheus. | ||
The following sections cover enabling metrics and validating access, and provide a reference `ServiceMonitor` | ||
to illustrate how you might integrate the metrics with the [Prometheus Operator][prometheus-operator] or other third-party solutions. | ||
|
@@ -23,6 +23,25 @@ kubectl create clusterrolebinding operator-controller-metrics-binding \ | |
--serviceaccount=olmv1-system:operator-controller-controller-manager | ||
``` | ||
|
||
2. Next, create a `NetworkPolicy` to allow the scraper pods to send their scrape requests: | ||
|
||
```shell | ||
kubectl apply -f - << EOF | ||
apiVersion: networking.k8s.io/v1 | ||
kind: NetworkPolicy | ||
metadata: | ||
name: scraper-policy | ||
namespace: olmv1-system | ||
spec: | ||
podSelector: | ||
matchLabels: | ||
metrics: scraper | ||
policyTypes: | ||
- Egress | ||
egress: | ||
- {} # Allows all egress traffic for metrics requests | ||
EOF | ||
``` | ||
### Validating Access Manually | ||
|
||
1. Generate a token for the service account and extract the required certificates: | ||
|
@@ -41,6 +60,8 @@ kind: Pod | |
metadata: | ||
name: curl-metrics | ||
namespace: olmv1-system | ||
labels: | ||
metrics: scraper | ||
Review comment: Why would we need the label?
Reply: The `NetworkPolicy` added in this PR matches on this label 👍
||
spec: | ||
serviceAccountName: operator-controller-controller-manager | ||
containers: | ||
|
@@ -69,28 +90,27 @@ spec: | |
secretName: olmv1-cert | ||
securityContext: | ||
runAsNonRoot: true | ||
runAsUser: 1000 | ||
seccompProfile: | ||
type: RuntimeDefault | ||
restartPolicy: Never | ||
EOF | ||
``` | ||
|
||
3. Access the pod: | ||
3. Run the following command using the `TOKEN` value obtained above to check the metrics: | ||
|
||
```shell | ||
kubectl exec -it curl-metrics -n olmv1-system -- sh | ||
``` | ||
|
||
4. Run the following command using the `TOKEN` value obtained above to check the metrics: | ||
|
||
```shell | ||
curl -v -k -H "Authorization: Bearer <TOKEN>" \ | ||
kubectl exec -it curl-metrics -n olmv1-system -- \ | ||
curl -v -k -H "Authorization: Bearer ${TOKEN}" \ | ||
https://operator-controller-service.olmv1-system.svc.cluster.local:8443/metrics | ||
``` | ||
|
||
5. Run the following command to validate the certificates and token: | ||
4. Run the following command to validate the certificates and token: | ||
|
||
```shell | ||
kubectl exec -it curl-metrics -n olmv1-system -- \ | ||
curl -v --cacert /tmp/cert/ca.crt --cert /tmp/cert/tls.crt --key /tmp/cert/tls.key \ | ||
-H "Authorization: Bearer <TOKEN>" \ | ||
-H "Authorization: Bearer ${TOKEN}" \ | ||
https://operator-controller-service.olmv1-system.svc.cluster.local:8443/metrics | ||
``` | ||
|
||
|
@@ -131,6 +151,8 @@ kind: Pod | |
metadata: | ||
name: curl-metrics-catalogd | ||
namespace: olmv1-system | ||
labels: | ||
metrics: scraper | ||
Review comment: We do not need this label either.
||
spec: | ||
serviceAccountName: catalogd-controller-manager | ||
containers: | ||
|
@@ -159,27 +181,26 @@ spec: | |
secretName: $OLM_SECRET | ||
securityContext: | ||
runAsNonRoot: true | ||
runAsUser: 1000 | ||
seccompProfile: | ||
type: RuntimeDefault | ||
restartPolicy: Never | ||
EOF | ||
``` | ||
|
||
4. Access the pod: | ||
|
||
```shell | ||
kubectl exec -it curl-metrics-catalogd -n olmv1-system -- sh | ||
``` | ||
|
||
5. Run the following command using the `TOKEN` value obtained above to check the metrics: | ||
4. Run the following command using the `TOKEN` value obtained above to check the metrics: | ||
|
||
```shell | ||
curl -v -k -H "Authorization: Bearer <TOKEN>" \ | ||
kubectl exec -it curl-metrics-catalogd -n olmv1-system -- \ | ||
curl -v -k -H "Authorization: Bearer ${TOKEN}" \ | ||
https://catalogd-service.olmv1-system.svc.cluster.local:7443/metrics | ||
``` | ||
|
||
6. Run the following command to validate the certificates and token: | ||
5. Run the following command to validate the certificates and token: | ||
```shell | ||
kubectl exec -it curl-metrics-catalogd -n olmv1-system -- \ | ||
curl -v --cacert /tmp/cert/ca.crt --cert /tmp/cert/tls.crt --key /tmp/cert/tls.key \ | ||
-H "Authorization: Bearer <TOKEN>" \ | ||
-H "Authorization: Bearer ${TOKEN}" \ | ||
https://catalogd-service.olmv1-system.svc.cluster.local:7443/metrics | ||
``` | ||
|
||
|
@@ -253,7 +274,7 @@ metadata: | |
spec: | ||
endpoints: | ||
- path: /metrics | ||
port: https | ||
port: metrics | ||
||
scheme: https | ||
bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token | ||
tlsConfig: | ||
|
@@ -272,7 +293,7 @@ spec: | |
key: tls.key | ||
selector: | ||
matchLabels: | ||
control-plane: catalogd-controller-manager | ||
app.kubernetes.io/name: catalogd | ||
EOF | ||
``` | ||
|
||
|
Review comment:
I do not think that is required.
The NetworkPolicies that we already have should allow us to scrape the metrics.
Also, note that we call the metrics endpoint in https://github.com/operator-framework/operator-controller/blob/main/test/e2e/metrics_test.go and we do not create any new NetworkPolicy there.
If we broke this, we would no longer be able to get the metrics downstream, which is why we have a test to ensure it.
Reply:
I just tried the guide again without the NetworkPolicy and it does not work. After I apply the NetworkPolicy, it works again.
The reason that the e2e test works is that it puts the curl pod into a random namespace, outside of olmv1-system. If you were to create the pod inside olmv1-system, the tests would fail.
Reply:
Oh, and to your point on downstream metrics, the reason that also works fine is because the metrics scraper pod does not live in the same namespace as catalogd or operator-controller.
Reply:
I see. 👍
Thank you for the clarification
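For readers following this thread: the policy in the diff allows all egress from pods labeled `metrics: scraper`. A narrower variant might restrict egress to the two metrics ports used in the guide plus DNS. The sketch below is not part of this PR. The function name `render_scraper_policy` is hypothetical, and the port numbers are assumptions: 8443 and 7443 are taken from the curl URLs in the guide and are assumed to match the pod ports behind the services, and 53 is the conventional DNS port, which some clusters do not use on the DNS pods themselves.

```shell
# Hypothetical helper: prints (does not apply) a narrower variant of the
# scraper NetworkPolicy discussed in this thread.
# Argument 1 (optional): namespace, defaulting to olmv1-system.
render_scraper_policy() {
  namespace="${1:-olmv1-system}"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: scraper-policy
  namespace: ${namespace}
spec:
  podSelector:
    matchLabels:
      metrics: scraper
  policyTypes:
    - Egress
  egress:
    # Assumed metrics ports: operator-controller (8443) and catalogd (7443)
    - ports:
        - protocol: TCP
          port: 8443
        - protocol: TCP
          port: 7443
    # Assumed DNS port, so that *.svc.cluster.local names still resolve
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
EOF
}
```

To try it, the output could be reviewed first and then piped to the cluster, for example `render_scraper_policy olmv1-system | kubectl apply -f -`. Whether this is preferable to the allow-all rule depends on how much the scraper pod is trusted. The allow-all rule in the PR is simpler and avoids the DNS port caveat.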