
[kube-prometheus-chart] Alertmanager not getting installed #3293

Closed
munsayac13 opened this issue Apr 26, 2023 · 5 comments

Labels
enhancement New feature or request

Comments


munsayac13 commented Apr 26, 2023

Is your feature request related to a problem?

I'm using kube-prometheus-stack chart version 45.21.0 on Google Cloud Platform. I would like a feature, or at least some kind of detailed events, that explains why Alertmanager or Prometheus does not get installed. Currently, I deleted and recreated the CRDs and then reinstalled kube-prometheus-stack. Alertmanager did not get deployed, but all other Prometheus components were. I also disabled and re-enabled Alertmanager in values.yaml and reinstalled the stack; Alertmanager still did not come up, while the Prometheus components deployed fine.

When I run kubectl get alertmanager -n monitoring, I see this:

NAME                                       VERSION   REPLICAS   READY   RECONCILED   AVAILABLE   AGE
prometheus-kube-prometheus-alertmanager    v0.25.0   1                                           20m

When I run kubectl describe alertmanager -n monitoring prometheus-kube-prometheus-alertmanager, I see nothing but the spec.
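For reference, these are the kinds of follow-up checks I would expect to surface more detail (the StatefulSet and resource names below assume the default prometheus-kube-prometheus-alertmanager naming from my release):

# Check whether the operator ever created the Alertmanager StatefulSet
kubectl get statefulset -n monitoring alertmanager-prometheus-kube-prometheus-alertmanager

# Look at recent events in the namespace for hints
kubectl get events -n monitoring --sort-by=.lastTimestamp

# Inspect the status block of the Alertmanager resource (empty in my case)
kubectl get alertmanager -n monitoring prometheus-kube-prometheus-alertmanager -o jsonpath='{.status}'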

Can someone assist? Thanks.

Describe the solution you'd like.

More detail as to why Alertmanager does not get installed.

Describe alternatives you've considered.

  1. First I deleted and recreated the CRDs, then reinstalled kube-prometheus-stack (see the CRD sanity check after this list). Nothing worked.

kubectl delete crd alertmanagerconfigs.monitoring.coreos.com
kubectl delete crd alertmanagers.monitoring.coreos.com

kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.63.0/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagerconfigs.yaml
kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.63.0/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagers.yaml

  2. Disabled and re-enabled Alertmanager in values.yaml. But Alertmanager still does not get installed.
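A quick sanity check I could run to confirm the CRDs are actually present and at the expected version (nothing chart-specific here):

kubectl get crd | grep monitoring.coreos.com
kubectl get crd alertmanagers.monitoring.coreos.com -o jsonpath='{.spec.versions[*].name}'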

Additional context.

This is my Alertmanager section of values.yaml:

alertmanager:
  alertmanagerSpec:
    additionalPeers: []
    affinity: {}
    alertmanagerConfigMatcherStrategy: {}
    alertmanagerConfigNamespaceSelector: {}
    alertmanagerConfigSelector: {}
    alertmanagerConfiguration: {}
    clusterAdvertiseAddress: false
    configMaps: []
    containers: []
    externalUrl: null
    forceEnableClusterMode: false
    image:
      registry: quay.io
      repository: prometheus/alertmanager
      sha: ""
      tag: v0.25.0
    initContainers: []
    listenLocal: false
    logFormat: logfmt
    logLevel: info
    minReadySeconds: 0
    nodeSelector: {}
    paused: false
    podAntiAffinity: ""
    podAntiAffinityTopologyKey: kubernetes.io/hostname
    podMetadata: {}
    portName: http-web
    priorityClassName: ""
    replicas: 1
    resources:
      limits:
        cpu: 500m
        memory: 128Mi
      requests:
        cpu: 100m
        memory: 16Mi
    retention: 120h
    routePrefix: /
    secrets: []
    securityContext:
      fsGroup: 2000
      runAsGroup: 2000
      runAsNonRoot: true
      runAsUser: 1000
    storage: {}
    tolerations: []
    topologySpreadConstraints: []
    useExistingSecret: false
    volumeMounts: []
    volumes: []
    web: {}
  annotations: {}
  apiVersion: v2
  config:
    global:
      resolve_timeout: 5m
      smtp_auth_password: xxxxxxxxx
      smtp_auth_username: xxxxxxxx
      smtp_from: xxxxxxxxx
      smtp_require_tls: true
      smtp_smarthost: xxxxxxxx:25
    receivers:
      - email_configs:
          - to: xxxxx@xxxxxxx
        name: google-gitlab
      - name: discord_webhook
        slack_configs:
          - channel: 'xxxxxxx'
            username: 'xxxxxxx'
            api_url: https://discord.com/api/webhooks/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    route:
      group_by:
        - job
      group_interval: 5m
      group_wait: 30s
      receiver: discord_webhook
      repeat_interval: 60m
      routes:
        - continue: true
          match:
            severity: page
          receiver: discord_webhook
        - match:
            severity: page
          receiver: devops
    templates:
      - /etc/alertmanager/config/*.tmpl
  enabled: true
  extraSecret:
    annotations: {}
    data: {}
  ingress:
    annotations: {}
    enabled: false
    hosts: []
    labels: {}
    paths: []
    tls: []
  ingressPerReplica:
    annotations: {}
    enabled: false
    hostDomain: ""
    hostPrefix: ""
    labels: {}
    paths: []
    tlsSecretName: ""
    tlsSecretPerReplica:
      enabled: false
      prefix: alertmanager
  podDisruptionBudget:
    enabled: false
    maxUnavailable: ""
    minAvailable: 1
  secret:
    annotations: {}
  service:
    additionalPorts: []
    annotations: {}
    clusterIP: ""
    externalIPs: []
    externalTrafficPolicy: Cluster
    labels: {}
    loadBalancerIP: ""
    loadBalancerSourceRanges: []
    nodePort: 30903
    port: 9093
    targetPort: 9093
    type: ClusterIP
  serviceAccount:
    annotations: {}
    create: true
    name: ""
  serviceMonitor:
    additionalLabels: {}
    bearerTokenFile: null
    enableHttp2: true
    interval: ""
    labelLimit: 0
    labelNameLengthLimit: 0
    labelValueLengthLimit: 0
    metricRelabelings: []
    proxyUrl: ""
    relabelings: []
    sampleLimit: 0
    scheme: ""
    selfMonitor: true
    targetLimit: 0
    tlsConfig: {}
  servicePerReplica:
    annotations: {}
    enabled: false
    externalTrafficPolicy: Cluster
    loadBalancerSourceRanges: []
    nodePort: 30904
    port: 9093
    targetPort: 9093
    type: ClusterIP
  stringConfig: ""
  templateFiles: {}
  tplConfig: false
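In case it helps with debugging, the Alertmanager secret that the chart generates from this config can be rendered locally before installing (this assumes the chart repo is added as prometheus-community and the release is named prometheus; the template path may differ slightly between chart versions):

helm template prometheus prometheus-community/kube-prometheus-stack \
  --version 45.21.0 -f values.yaml -s templates/alertmanager/secret.yaml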

munsayac13 added the enhancement (New feature or request) label Apr 26, 2023

jmc000 commented Apr 27, 2023

Same issue here using GCP and v45.21.0 too.

@chichi13

Same issue with the same version on GCP/Minikube and the same debugging process. Also tried with older versions (v45.10.0 for example).

Can you change the label to bug instead of enhancement, please?


jmc000 commented Apr 27, 2023

OK, I found what the problem was in my case. The Alertmanager CRD didn't create the StatefulSet because of an incorrect Alertmanager configuration file (provided via a secret).

After correcting my Alertmanager config secret, the pods started to appear.

It would be better if a config file error did not stop the CRD from creating the Alertmanager pods; instead, the error should be surfaced in the pod logs.
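For anyone hitting the same thing, this is roughly how the config the operator consumes can be pulled back out of the cluster and validated (the secret name assumes the default alertmanager-prometheus-kube-prometheus-alertmanager naming, and amtool ships with Alertmanager):

# Dump the Alertmanager config held in the chart-managed secret
kubectl -n monitoring get secret alertmanager-prometheus-kube-prometheus-alertmanager \
  -o jsonpath='{.data.alertmanager\.yaml}' | base64 -d > /tmp/alertmanager.yaml

# Validate it offline
amtool check-config /tmp/alertmanager.yaml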

zeritti (Member) commented Apr 27, 2023

In cases in which custom resources like Prometheus and Alertmanager do not behave as expected, the first step should be to check the Prometheus operator and its logs - the operator does report errors on those resources if they occur. For instance, if the operator cannot validate Alertmanager's configuration, it won't launch the resource.
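For example (the operator Deployment name depends on the release name; with a release called prometheus it is typically prometheus-kube-prometheus-operator):

kubectl -n monitoring logs deployment/prometheus-kube-prometheus-operator | grep -i alertmanager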

munsayac13 (Author) commented Apr 27, 2023

> OK, I found what the problem was in my case. The Alertmanager CRD didn't create the StatefulSet because of an incorrect Alertmanager configuration file (provided via a secret).
>
> After correcting my Alertmanager config secret, the pods started to appear.
>
> It would be better if a config file error did not stop the CRD from creating the Alertmanager pods; instead, the error should be surfaced in the pod logs.

Thanks to all who responded. I agree with this notion. I applied the same solution you did: I got rid of the secret, ran helm upgrade again, and then Alertmanager started to appear.
