Correct the way nodes are computed for alert ClusterIPTablesStale
Change kube_pod_info_node_count to sum(kube_pod_info{namespace="openshift-sdn", pod=~"ovs.*"}). This computes the alert
more accurately by returning the number of nodes that have an ovs pod running.

Also change time() to timestamp().
JacobTanenbaum committed Sep 24, 2019
1 parent bf92b4e commit adb7bf7
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion bindata/network/openshift-sdn/alert-rules.yaml
@@ -67,7 +67,10 @@ spec:
annotations:
message: The average time between iptables resyncs is too high. NOTE - There is some scrape delay and other offsets, 90s isn't exact but it is still too high.
expr: |
- time() - (sum(kubeproxy_sync_proxy_rules_last_timestamp_seconds) / :kube_pod_info_node_count:) > 90
+ quantile(0.95,
+ timestamp(kubeproxy_sync_proxy_rules_last_timestamp_seconds)
+ - on(pod) kubeproxy_sync_proxy_rules_last_timestamp_seconds
+ * on(pod) group_right kube_pod_info{namespace="openshift-sdn", pod=~"sdn-[^-]*"}) > 90
for: 20m
labels:
severity: warning
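
As a rough illustration of what the new expression appears to compute, the sketch below models it in Python: for each pod, take the sample timestamp minus the last iptables sync timestamp, then compare the 0.95 quantile of those ages against 90 seconds. The function names, the dictionary inputs, and the pod names are hypothetical, and the interpolation is an assumption about how a quantile over a small set of values is taken (linear interpolation between neighbouring ranks); it is not the rule's actual evaluation code.

```python
import math


def interpolated_quantile(q, values):
    """Return the q-quantile of values, interpolating linearly between ranks."""
    if not values:
        return math.nan
    ordered = sorted(values)
    rank = q * (len(ordered) - 1)
    lower = int(math.floor(rank))
    upper = min(len(ordered) - 1, lower + 1)
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def iptables_stale(sample_time_by_pod, last_sync_by_pod, threshold=90.0, q=0.95):
    """Hypothetical model of the alert condition.

    sample_time_by_pod: pod name -> time the metric sample was taken (seconds)
    last_sync_by_pod:   pod name -> value of the last-sync timestamp metric
    Only pods present in both mappings are considered, mirroring on(pod).
    """
    ages = [
        sample_time_by_pod[pod] - last_sync
        for pod, last_sync in last_sync_by_pod.items()
        if pod in sample_time_by_pod
    ]
    # A comparison with NaN is False, so no matching pods means no alert.
    return interpolated_quantile(q, ages) > threshold
```

Because each age is computed per pod, a mismatch between the number of reporting pods and a separately counted number of nodes can no longer skew the result, which is the problem the commit message describes with the old averaged expression.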
