Commit

Add OLM CSV metrics
awgreene committed Nov 19, 2019
1 parent e8fe77a commit 10746a8
Showing 5 changed files with 19 additions and 3 deletions.
8 changes: 7 additions & 1 deletion docs/data-collection.md
@@ -105,7 +105,13 @@ For the OpenShift 4 Developer Preview we will be sending back these exact attrib
// subscription_sync_total is the number of times an OLM operator
// Subscription has been synced, labelled by name and installed csv
'{__name__="subscription_sync_total"}',
//
// csv_succeeded is unique to the namespace, name, version, and phase labels.
// The metric is always present and is either 0 or 1: 0 means the csv is not in
// the succeeded state, and 1 means the csv is in the succeeded state.
'{__name__="csv_succeeded"}',
// csv_abnormal represents the reason why a csv is not in the succeeded state and includes the
// namespace, name, version, phase, and reason labels. When a csv is updated, the previous
// time series associated with the csv is deleted.
'{__name__="csv_abnormal"}',
// OCS metrics to be collected:
// ceph_cluster_total_bytes gives the size of ceph cluster in bytes.
'{__name__="ceph_cluster_total_bytes"}',
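The comments above describe how the two new series relate: `csv_succeeded` is a 0/1 value per csv, and `csv_abnormal` carries a `reason` label for a csv that is not in the succeeded state. A minimal Python sketch of how a consumer might combine them is shown here. The sample layout (a dict with `metric` labels and a `value`), the operator name, the namespaces, the phase strings, the reason string, and the value of the `csv_abnormal` sample are all hypothetical and chosen only for illustration; none of them come from this commit.

```python
def summarize_csvs(samples):
    """Return {(namespace, name, version): status} for each csv_succeeded sample.

    A csv whose csv_succeeded value is 1 is reported as "succeeded".
    Otherwise the status includes the reason label of a matching
    csv_abnormal series, when one is present.
    """
    succeeded = {}
    reasons = {}
    for sample in samples:
        labels = sample["metric"]
        key = (labels["namespace"], labels["name"], labels["version"])
        if labels["__name__"] == "csv_succeeded":
            succeeded[key] = sample["value"] == 1
        elif labels["__name__"] == "csv_abnormal":
            reasons[key] = labels.get("reason", "unknown")

    summary = {}
    for key, ok in succeeded.items():
        if ok:
            summary[key] = "succeeded"
        else:
            summary[key] = "not succeeded: " + reasons.get(key, "no reason reported")
    return summary


# Hypothetical samples; label values are illustrative only.
samples = [
    {"metric": {"__name__": "csv_succeeded", "namespace": "ns-a",
                "name": "example-operator", "version": "1.0.0",
                "phase": "Succeeded"}, "value": 1},
    {"metric": {"__name__": "csv_succeeded", "namespace": "ns-b",
                "name": "example-operator", "version": "1.0.0",
                "phase": "Failed"}, "value": 0},
    {"metric": {"__name__": "csv_abnormal", "namespace": "ns-b",
                "name": "example-operator", "version": "1.0.0",
                "phase": "Failed", "reason": "ExampleReason"}, "value": 1},
]

summary = summarize_csvs(samples)
for key, status in sorted(summary.items()):
    print(key, status)
```

Keying on namespace, name, and version (and not phase) is a design choice of this sketch, so that the two series for the same csv can be joined even if their phase labels differ.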
2 changes: 1 addition & 1 deletion docs/sample-metrics.md
@@ -13,7 +13,7 @@ return the full set of metrics that the Telemeter client captures:

[embedmd]:# (telemeter_query txt)
```txt
{__name__="up"} or {__name__="cluster_version"} or {__name__="cluster_version_available_updates"} or {__name__="cluster_operator_up"} or {__name__="cluster_operator_conditions"} or {__name__="cluster_version_payload"} or {__name__="cluster_installer"} or {__name__="cluster_infrastructure_provider"} or {__name__="cluster_feature_set"} or {__name__="node_uname_info"} or {__name__="instance:etcd_object_counts:sum"} or {__name__="ALERTS",alertstate="firing"} or {__name__="code:apiserver_request_count:rate:sum"} or {__name__="cluster:capacity_cpu_cores:sum"} or {__name__="cluster:capacity_memory_bytes:sum"} or {__name__="cluster:cpu_usage_cores:sum"} or {__name__="cluster:memory_usage_bytes:sum"} or {__name__="openshift:cpu_usage_cores:sum"} or {__name__="openshift:memory_usage_bytes:sum"} or {__name__="workload:cpu_usage_cores:sum"} or {__name__="workload:memory_usage_bytes:sum"} or {__name__="cluster:virt_platform_nodes:sum"} or {__name__="cluster:node_instance_type_count:sum"} or {__name__="cnv:vmi_status_running:count"} or {__name__="node_role_os_version_machine:cpu_capacity_cores:sum"} or {__name__="node_role_os_version_machine:cpu_capacity_sockets:sum"} or {__name__="subscription_sync_total"} or {__name__="ceph_cluster_total_bytes"} or {__name__="ceph_cluster_total_used_raw_bytes"} or {__name__="ceph_health_status"} or {__name__="job:ceph_osd_metadata:count"} or {__name__="job:kube_pv:count"} or {__name__="job:ceph_pools_iops:total"} or {__name__="job:ceph_pools_iops_bytes:total"} or {__name__="job:ceph_versions_running:count"} or {__name__="job:noobaa_total_unhealthy_buckets:sum"} or {__name__="job:noobaa_bucket_count:sum"} or {__name__="job:noobaa_total_object_count:sum"} or {__name__="noobaa_accounts_num"} or {__name__="noobaa_total_usage"} or {__name__="console_url"} or {__name__="cluster:network_attachment_definition_instances:max"} or {__name__="cluster:network_attachment_definition_enabled_instance_up:max"}
{__name__="up"} or {__name__="cluster_version"} or {__name__="cluster_version_available_updates"} or {__name__="cluster_operator_up"} or {__name__="cluster_operator_conditions"} or {__name__="cluster_version_payload"} or {__name__="cluster_installer"} or {__name__="cluster_infrastructure_provider"} or {__name__="cluster_feature_set"} or {__name__="node_uname_info"} or {__name__="instance:etcd_object_counts:sum"} or {__name__="ALERTS",alertstate="firing"} or {__name__="code:apiserver_request_count:rate:sum"} or {__name__="cluster:capacity_cpu_cores:sum"} or {__name__="cluster:capacity_memory_bytes:sum"} or {__name__="cluster:cpu_usage_cores:sum"} or {__name__="cluster:memory_usage_bytes:sum"} or {__name__="openshift:cpu_usage_cores:sum"} or {__name__="openshift:memory_usage_bytes:sum"} or {__name__="workload:cpu_usage_cores:sum"} or {__name__="workload:memory_usage_bytes:sum"} or {__name__="cluster:virt_platform_nodes:sum"} or {__name__="cluster:node_instance_type_count:sum"} or {__name__="cnv:vmi_status_running:count"} or {__name__="node_role_os_version_machine:cpu_capacity_cores:sum"} or {__name__="node_role_os_version_machine:cpu_capacity_sockets:sum"} or {__name__="subscription_sync_total"} or {__name__="csv_succeeded"} or {__name__="csv_abnormal"} or {__name__="ceph_cluster_total_bytes"} or {__name__="ceph_cluster_total_used_raw_bytes"} or {__name__="ceph_health_status"} or {__name__="job:ceph_osd_metadata:count"} or {__name__="job:kube_pv:count"} or {__name__="job:ceph_pools_iops:total"} or {__name__="job:ceph_pools_iops_bytes:total"} or {__name__="job:ceph_versions_running:count"} or {__name__="job:noobaa_total_unhealthy_buckets:sum"} or {__name__="job:noobaa_bucket_count:sum"} or {__name__="job:noobaa_total_object_count:sum"} or {__name__="noobaa_accounts_num"} or {__name__="noobaa_total_usage"} or {__name__="console_url"} or {__name__="cluster:network_attachment_definition_instances:max"} or 
{__name__="cluster:network_attachment_definition_enabled_instance_up:max"}
```
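
The query above has the shape of the individual match rules joined with the PromQL `or` operator. As a rough illustration of that shape only, the following Python sketch joins a short list of rules the same way. The function name `build_query` is hypothetical, and the rule list is a three-item excerpt, not the full set.

```python
def build_query(match_rules):
    """Join individual match rules into one expression using ' or '."""
    return " or ".join(match_rules)


# A short excerpt of the rules touched by this commit.
rules = [
    '{__name__="subscription_sync_total"}',
    '{__name__="csv_succeeded"}',
    '{__name__="csv_abnormal"}',
]

query = build_query(rules)
print(query)
```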

For reference, here is an example response produced by a running OpenShift cluster:
8 changes: 7 additions & 1 deletion jsonnet/telemeter/metrics.jsonnet
@@ -97,7 +97,13 @@
// subscription_sync_total is the number of times an OLM operator
// Subscription has been synced, labelled by name and installed csv
'{__name__="subscription_sync_total"}',
//
// csv_succeeded is unique to the namespace, name, version, and phase labels.
// The metric is always present and is either 0 or 1: 0 means the csv is not in
// the succeeded state, and 1 means the csv is in the succeeded state.
'{__name__="csv_succeeded"}',
// csv_abnormal represents the reason why a csv is not in the succeeded state and includes the
// namespace, name, version, phase, and reason labels. When a csv is updated, the previous
// time series associated with the csv is deleted.
'{__name__="csv_abnormal"}',
// OCS metrics to be collected:
// ceph_cluster_total_bytes gives the size of ceph cluster in bytes.
'{__name__="ceph_cluster_total_bytes"}',
2 changes: 2 additions & 0 deletions manifests/benchmark/statefulSetTelemeterServer.yaml
@@ -52,6 +52,8 @@ spec:
- --whitelist={__name__="node_role_os_version_machine:cpu_capacity_cores:sum"}
- --whitelist={__name__="node_role_os_version_machine:cpu_capacity_sockets:sum"}
- --whitelist={__name__="subscription_sync_total"}
- --whitelist={__name__="csv_succeeded"}
- --whitelist={__name__="csv_abnormal"}
- --whitelist={__name__="ceph_cluster_total_bytes"}
- --whitelist={__name__="ceph_cluster_total_used_raw_bytes"}
- --whitelist={__name__="ceph_health_status"}
2 changes: 2 additions & 0 deletions manifests/client/deployment.yaml
@@ -54,6 +54,8 @@ spec:
- --match={__name__="node_role_os_version_machine:cpu_capacity_cores:sum"}
- --match={__name__="node_role_os_version_machine:cpu_capacity_sockets:sum"}
- --match={__name__="subscription_sync_total"}
- --match={__name__="csv_succeeded"}
- --match={__name__="csv_abnormal"}
- --match={__name__="ceph_cluster_total_bytes"}
- --match={__name__="ceph_cluster_total_used_raw_bytes"}
- --match={__name__="ceph_health_status"}
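
In this commit the client deployment and the benchmark server StatefulSet receive the same two rules, once as `--match=` and once as `--whitelist=`. A small Python sketch of rendering both argument lists from a single rule list, and checking that they agree, might look like the following. The helper names `render_args` and `rules_of` are hypothetical, and nothing here is claimed to be how the manifests are actually generated.

```python
RULES = [
    '{__name__="subscription_sync_total"}',
    '{__name__="csv_succeeded"}',
    '{__name__="csv_abnormal"}',
]


def render_args(flag, rules):
    """Render one '--<flag>=<rule>' argument per rule."""
    return [f"--{flag}={rule}" for rule in rules]


def rules_of(args):
    """Recover the rule from each argument by splitting on the first '='."""
    return [arg.split("=", 1)[1] for arg in args]


client_args = render_args("match", RULES)
server_args = render_args("whitelist", RULES)

# Both manifests should carry the same rules in the same order.
in_sync = rules_of(client_args) == rules_of(server_args)
print(in_sync)
```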