Update Metrics endpoints for ODH operator (opendatahub-io#349)
* Fix ODH and Argo monitoring

Signed-off-by: Anish Asthana <anishasthana1@gmail.com>

* Increase replica count to 2 for HA

Signed-off-by: Anish Asthana <anishasthana1@gmail.com>

* Update Prometheus name and corresponding test

Signed-off-by: Anish Asthana <anishasthana1@gmail.com>

* Restructure Service Monitors

This separates the ODH operator and ODH application monitoring into two
separate Service Monitors.

Signed-off-by: Anish Asthana <anishasthana1@gmail.com>
anishasthana committed Mar 25, 2021
1 parent 039f3cf commit c52b7dd
Showing 9 changed files with 85 additions and 9 deletions.
17 changes: 16 additions & 1 deletion prometheus/operator/base/kustomization.yaml
@@ -4,10 +4,25 @@ resources:
- kafka-podmonitors.yaml
- prometheus.yaml
- route.yaml
- servicemonitor.yaml
- service-monitors
- prometheus-monitoring-role.yaml
- prometheus-monitoring-role-binding.yaml

namespace: opendatahub
commonLabels:
opendatahub.io/component: "true"
component.opendatahub.io/name: prometheus
generatorOptions:
disableNameSuffixHash: true

vars:
- name: namespace
objref:
kind: Prometheus
name: odh-monitoring
apiVersion: monitoring.coreos.com/v1
fieldref:
fieldpath: metadata.namespace

configurations:
- params.yaml
4 changes: 4 additions & 0 deletions prometheus/operator/base/params.yaml
@@ -0,0 +1,4 @@
varReference:
- path: subjects/namespace
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
13 changes: 13 additions & 0 deletions prometheus/operator/base/prometheus-monitoring-role-binding.yaml
@@ -0,0 +1,13 @@
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: odh-prometheus-monitoring-rb
subjects:
- kind: ServiceAccount
name: prometheus-k8s
namespace: $(namespace)
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: odh-prometheus-monitoring
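
The `vars` entry in kustomization.yaml and the `varReference` in params.yaml together let `$(namespace)` in the binding above be filled in from the `metadata.namespace` of the `odh-monitoring` Prometheus object. As a rough sketch, assuming this base is built unmodified with `namespace: opendatahub` (an overlay could change the result), the rendered subject would probably look like this. The labels that `commonLabels` adds are left out for brevity.

```yaml
# Sketch of the rendered binding; assumes the base is built as-is,
# so the Prometheus object's namespace resolves to "opendatahub".
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: odh-prometheus-monitoring-rb
subjects:
  - kind: ServiceAccount
    name: prometheus-k8s
    namespace: opendatahub   # substituted for $(namespace)
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: odh-prometheus-monitoring
```

Without the `varReference` entry, kustomize would not look for variables at `subjects/namespace` of a ClusterRoleBinding, and the literal string `$(namespace)` would be left in the output.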
23 changes: 23 additions & 0 deletions prometheus/operator/base/prometheus-monitoring-role.yaml
@@ -0,0 +1,23 @@
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: odh-prometheus-monitoring
namespace: opendatahub
rules:
- verbs:
- get
- list
- watch
apiGroups:
- ''
resources:
- services
- endpoints
- pods
- verbs:
- get
apiGroups:
- ''
resources:
- configmaps
6 changes: 3 additions & 3 deletions prometheus/operator/base/prometheus.yaml
@@ -1,12 +1,12 @@
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
name: prometheus
name: odh-monitoring
labels:
prometheus: k8s
app: odh-monitoring
namespace: prometheus
spec:
replicas: 1
replicas: 2
serviceAccountName: prometheus-k8s
securityContext: {}
serviceMonitorSelector:
@@ -3,10 +3,10 @@ kind: ServiceMonitor
metadata:
labels:
team: opendatahub
name: odhservicemonitor
name: odh-application-servicemonitor
spec:
endpoints:
- port: web # odh-operator, Argo
- port: metrics # Argo
- bearerTokenSecret:
key: PROMETHEUS_API_TOKEN
name: jupyterhub
5 changes: 5 additions & 0 deletions prometheus/operator/base/service-monitors/kustomization.yaml
@@ -0,0 +1,5 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- application-service-monitor.yaml
- operator-service-monitor.yaml
@@ -0,0 +1,16 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
labels:
team: opendatahub
name: odh-operator-servicemonitor
spec:
endpoints:
- port: http-metrics # Open Data Hub Operator
- port: cr-metrics # Open Data Hub Operator
selector:
matchLabels:
name: opendatahub-operator
namespaceSelector:
matchNames:
- openshift-operators
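
A ServiceMonitor endpoint's `port` refers to a named port on a Service, and the selector picks Services by label. As an illustration only, a Service that this monitor would pick up might look roughly like the following. The Service name, the pod selector, and both port numbers are assumptions and do not come from this commit.

```yaml
# Hypothetical Service matched by odh-operator-servicemonitor.
# Name, pod selector, and port numbers are illustrative assumptions.
apiVersion: v1
kind: Service
metadata:
  name: opendatahub-operator-metrics   # hypothetical name
  namespace: openshift-operators       # must be listed in namespaceSelector.matchNames
  labels:
    name: opendatahub-operator         # must match selector.matchLabels
spec:
  selector:
    name: opendatahub-operator         # assumed pod label
  ports:
    - name: http-metrics               # referenced by the first endpoint
      port: 8383                       # assumed port number
    - name: cr-metrics                 # referenced by the second endpoint
      port: 8686                       # assumed port number
```

If the Service's port names differ from `http-metrics` and `cr-metrics`, Prometheus would find the Service but have no endpoints to scrape.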
6 changes: 3 additions & 3 deletions tests/basictests/prometheus.sh
@@ -20,9 +20,9 @@ function test_prometheus() {
os::cmd::try_until_text "oc get pods -l k8s-app=prometheus-operator --field-selector='status.phase=Running' -o jsonpath='{$.items[*].metadata.name}'" "prometheus-operator" $odhdefaulttimeout $odhdefaultinterval
runningbuspods=($(oc get pods -l k8s-app=prometheus-operator --field-selector="status.phase=Running" -o jsonpath="{$.items[*].metadata.name}"))
os::cmd::expect_success_and_text "echo ${#runningbuspods[@]}" "1"
os::cmd::try_until_text "oc get pods -l app=prometheus --field-selector='status.phase=Running' -o jsonpath='{$.items[*].metadata.name}'" "prometheus-prometheus" $odhdefaulttimeout $odhdefaultinterval
runningbuspods=($(oc get pods -l app=prometheus --field-selector="status.phase=Running" -o jsonpath="{$.items[*].metadata.name}"))
os::cmd::expect_success_and_text "echo ${#runningbuspods[@]}" "1"
os::cmd::try_until_text "oc get pods -l prometheus=odh-monitoring --field-selector='status.phase=Running' -o jsonpath='{$.items[*].metadata.name}'" "prometheus-odh-monitoring" $odhdefaulttimeout $odhdefaultinterval
runningbuspods=($(oc get pods -l prometheus=odh-monitoring --field-selector="status.phase=Running" -o jsonpath="{$.items[*].metadata.name}"))
os::cmd::expect_success_and_text "echo ${#runningbuspods[@]}" "2"
test_promportal
}
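
The updated test expects two running pods because `replicas` is now 2, and the Prometheus operator names the pods of a Prometheus called `odh-monitoring` as `prometheus-odh-monitoring-0`, `prometheus-odh-monitoring-1`, and so on. The counting step can be tried without a cluster using a small sketch like the one below. `count_names` is a hypothetical helper, not part of the test suite, and the sample string only imitates what the jsonpath query might print.

```shell
#!/bin/sh
# count_names: print how many whitespace-separated names are in $1.
# Hypothetical helper for illustration; the real test uses a bash array.
count_names() {
  # Intentionally unquoted so the shell splits the names into arguments.
  set -- $1
  echo "$#"
}

# Assumed sample of the jsonpath output for a two-replica Prometheus.
sample="prometheus-odh-monitoring-0 prometheus-odh-monitoring-1"
count_names "$sample"   # prints 2
```

An empty result from `oc get pods` gives a count of 0, so the check fails cleanly while the pods are still starting.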
