fix: support health assessment for prometheus operator < v0.56 (#5620) #11901

Open · wants to merge 1 commit into base: master
health.lua:

```diff
@@ -1,23 +1,30 @@
-local hs={ status = "Progressing", message = "Waiting for initialization" }
+local hs = { status = "Progressing", message = "Waiting for initialization" }
+found_status = false
 
 if obj.status ~= nil then
   if obj.status.conditions ~= nil then
     for i, condition in ipairs(obj.status.conditions) do
 
-      if condition.type == "Available" and condition.status ~= "True" then
-        if condition.reason == "SomePodsNotReady" then
-          hs.status = "Progressing"
-        else
-          hs.status = "Degraded"
-        end
-        hs.message = condition.message or condition.reason
-      end
-      if condition.type == "Available" and condition.status == "True" then
-        hs.status = "Healthy"
-        hs.message = "All instances are available"
+      if condition.type == "Available" then
+        found_status = true
+        if condition.status ~= "True" then
+          if condition.reason == "SomePodsNotReady" then
+            hs.status = "Progressing"
+          else
+            hs.status = "Degraded"
+          end
+          hs.message = condition.message or condition.reason
+        else
+          hs.status = "Healthy"
+          hs.message = "All instances are available"
+        end
       end
     end
   end
 end
 
+if not found_status then
+  hs = { status = "Unknown", message = "Status is not provided" }
+end
+
 return hs
```

Review comments (Collaborator):

- On `found_status = true`: If no condition is `Available`, then it should be marked as Unknown.
- On `if condition.status ~= "True" then`: `found_status` should be set to true right after the `if obj.status ~= nil then` statement.
- On `hs = { status = "Unknown", message = "Status is not provided" }`: In Argo CD, resources that don't provide a status (like ConfigMaps and Secrets) are always marked as Healthy. The reason is that this impacts the overall Application status, and in this case Apps will always be marked as Unknown. I suggest changing this to Healthy instead.
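Taken together, the review comments suggest a slightly different shape for the script. The following Lua is only a sketch of one reading of that feedback, not the committed code: `found_status` is set immediately after the `if obj.status ~= nil then` check (second comment), and the fallback for resources that report no status becomes Healthy instead of Unknown (third comment).

```lua
-- Sketch of the reviewers' combined suggestions; not the code in this PR.
-- `obj` is the global Argo CD injects with the live resource manifest.
local hs = { status = "Progressing", message = "Waiting for initialization" }
local found_status = false

if obj.status ~= nil then
  -- Second review comment: flag that a status was provided as soon as we
  -- know obj.status exists, not only when an Available condition is seen.
  found_status = true
  if obj.status.conditions ~= nil then
    for _, condition in ipairs(obj.status.conditions) do
      if condition.type == "Available" then
        if condition.status ~= "True" then
          if condition.reason == "SomePodsNotReady" then
            hs.status = "Progressing"
          else
            hs.status = "Degraded"
          end
          hs.message = condition.message or condition.reason
        else
          hs.status = "Healthy"
          hs.message = "All instances are available"
        end
      end
    end
  end
end

if not found_status then
  -- Third review comment: fall back to Healthy, as Argo CD does for
  -- resources such as ConfigMaps and Secrets that expose no status,
  -- so the parent Application is not permanently stuck in Unknown.
  hs = { status = "Healthy", message = "Status is not provided" }
end

return hs
```

Note that this reading leaves a Prometheus whose status has no `Available` condition marked Progressing rather than Unknown, which is where the first review comment pulls in the opposite direction; the reviewers' suggestions are not fully reconciled in the thread.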
health_test.yaml:

```diff
@@ -11,3 +11,7 @@ tests:
     status: Degraded
     message: "shard 0: pod prometheus-prometheus-stack-kube-prom-prometheus-0: 0/5 nodes are available: 2 node(s) didn't match Pod's node affinity/selector, 3 node(s) were unschedulable.\nshard 0: pod prometheus-prometheus-stack-kube-prom-prometheus-1: 0/5 nodes are available: 2 node(s) didn't match Pod's node affinity/selector, 3 node(s) were unschedulable."
     inputPath: testdata/degraded.yaml
+- healthStatus:
+    status: Unknown
+    message: "Status is not provided"
+  inputPath: testdata/unknown.yaml
```
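The new Unknown branch can also be exercised locally without Argo CD by stubbing the `obj` global that Argo CD injects before evaluating the script. This harness is hypothetical (not part of the PR) and assumes the health check is saved next to it as `health.lua`:

```lua
-- Hypothetical local harness; Argo CD normally provides the `obj` global
-- from the live Kubernetes object before running health.lua.
obj = { metadata = { name = "prometheus-stack-kube-prom-prometheus" } }  -- no status field

local hs = dofile("health.lua")  -- health.lua ends with `return hs`
print(hs.status, hs.message)     -- expected: Unknown   Status is not provided
```

This mirrors the new test case above, which feeds `testdata/unknown.yaml` through the same script.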
testdata/unknown.yaml (new file):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  annotations:
    argocd.argoproj.io/tracking-id: >-
      prometheus-stack:monitoring.coreos.com/Prometheus:prometheus/prometheus-stack-kube-prom-prometheus
  creationTimestamp: '2021-12-09T15:51:10Z'
  generation: 46
  labels:
    app: kube-prometheus-stack-prometheus
    app.kubernetes.io/instance: prometheus-stack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: kube-prometheus-stack
    app.kubernetes.io/version: 34.10.0
    chart: kube-prometheus-stack-34.10.0
    heritage: Helm
    release: prometheus-stack
  name: prometheus-stack-kube-prom-prometheus
  namespace: prometheus
  resourceVersion: '200307978'
  uid: 6f2e1016-926d-44e7-945b-dec4c975595b
spec:
  additionalScrapeConfigs:
    key: prometheus-additional.yaml
    name: additional-scrape-configs
  alerting:
    alertmanagers:
      - apiVersion: v2
        name: prometheus-stack-kube-prom-alertmanager
        namespace: prometheus
        pathPrefix: /
        port: http-web
  containers:
    - name: prometheus
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
        readOnlyRootFilesystem: true
        runAsNonRoot: true
    - name: config-reloader
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
        readOnlyRootFilesystem: true
        runAsNonRoot: true
  enableAdminAPI: false
  evaluationInterval: 30s
  externalUrl: 'http://prometheus-stack-kube-prom-prometheus.prometheus:9090'
  image: 'quay.io/prometheus/prometheus:v2.37.0'
  initContainers:
    - name: init-config-reloader
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
        readOnlyRootFilesystem: true
        runAsNonRoot: true
  listenLocal: false
  logFormat: logfmt
  logLevel: info
  paused: false
  podMonitorNamespaceSelector: {}
  podMonitorSelector: {}
  portName: http-web
  probeNamespaceSelector: {}
  probeSelector: {}
  replicas: 2
  resources:
    requests:
      memory: 700Mi
  retention: 6h
  routePrefix: /
  ruleNamespaceSelector: {}
  ruleSelector: {}
  scrapeInterval: 10s
  securityContext:
    fsGroup: 2000
    runAsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-stack-kube-prom-prometheus
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  shards: 1
  storage:
    volumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        storageClassName: default
  topologySpreadConstraints:
    - labelSelector:
        matchLabels:
          app.kubernetes.io/name: prometheus
      maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: ScheduleAnyway
    - labelSelector:
        matchLabels:
          app.kubernetes.io/name: prometheus
      maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
  version: v2.37.0
```
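Note that this fixture deliberately carries no `status` block at all: that is what drives the new `found_status` fallback, and the test case above expects it to evaluate to Unknown with the message "Status is not provided".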