
[kube-prometheus-chart] Alertmanager not getting installed #3293

Closed
munsayac13 opened this issue Apr 26, 2023 · 5 comments

Labels
enhancement New feature or request

Comments


munsayac13 commented Apr 26, 2023

Is your feature request related to a problem?

I'm using kube-prometheus-stack chart version 45.21.0 on Google Cloud Platform. I would like a feature, or at least some kind of detailed events, that explains why Alertmanager or Prometheus does not get installed. Currently, I deleted and recreated the CRDs and then reinstalled kube-prometheus-stack. Alertmanager did not get deployed, but all other Prometheus components were. I also disabled and re-enabled Alertmanager in values.yaml and reinstalled the stack; Alertmanager still did not come up, while the Prometheus components deployed fine.

When I run kubectl get alertmanager -n monitoring, I see this:

NAME                                       VERSION   REPLICAS   READY   RECONCILED   AVAILABLE   AGE
prometheus-kube-prometheus-alertmanager    v0.25.0   1                                           20m

When I run kubectl describe alertmanager -n monitoring prometheus-kube-prometheus-alertmanager, I see nothing but the spec.
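For reference, these are the kinds of follow-up checks I would expect to surface more detail (the StatefulSet and resource names below assume the default prometheus-kube-prometheus-alertmanager naming from my release):

# Check whether the operator ever created the Alertmanager StatefulSet
kubectl get statefulset -n monitoring alertmanager-prometheus-kube-prometheus-alertmanager

# Look at recent events in the namespace for hints
kubectl get events -n monitoring --sort-by=.lastTimestamp

# Inspect the status block of the Alertmanager resource (empty in my case)
kubectl get alertmanager -n monitoring prometheus-kube-prometheus-alertmanager -o jsonpath='{.status}'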

Can someone assist? Thanks.

Describe the solution you'd like.

More detail as to why Alertmanager does not get installed.

Describe alternatives you've considered.

  1. First I deleted and recreated the CRDs, then reinstalled kube-prometheus-stack (see the CRD sanity check after this list). Nothing worked.

kubectl delete crd alertmanagerconfigs.monitoring.coreos.com
kubectl delete crd alertmanagers.monitoring.coreos.com

kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.63.0/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagerconfigs.yaml
kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.63.0/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagers.yaml

  2. Disabled and re-enabled Alertmanager in values.yaml. But Alertmanager still does not get installed.
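A quick sanity check I could run to confirm the CRDs are actually present and at the expected version (nothing chart-specific here):

kubectl get crd | grep monitoring.coreos.com
kubectl get crd alertmanagers.monitoring.coreos.com -o jsonpath='{.spec.versions[*].name}'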

Additional context.

This is my Alertmanager section of values.yaml:

alertmanager:
  alertmanagerSpec:
    additionalPeers: []
    affinity: {}
    alertmanagerConfigMatcherStrategy: {}
    alertmanagerConfigNamespaceSelector: {}
    alertmanagerConfigSelector: {}
    alertmanagerConfiguration: {}
    clusterAdvertiseAddress: false
    configMaps: []
    containers: []
    externalUrl: null
    forceEnableClusterMode: false
    image:
      registry: quay.io
      repository: prometheus/alertmanager
      sha: ""
      tag: v0.25.0
    initContainers: []
    listenLocal: false
    logFormat: logfmt
    logLevel: info
    minReadySeconds: 0
    nodeSelector: {}
    paused: false
    podAntiAffinity: ""
    podAntiAffinityTopologyKey: kubernetes.io/hostname
    podMetadata: {}
    portName: http-web
    priorityClassName: ""
    replicas: 1
    resources:
      limits:
        cpu: 500m
        memory: 128Mi
      requests:
        cpu: 100m
        memory: 16Mi
    retention: 120h
    routePrefix: /
    secrets: []
    securityContext:
      fsGroup: 2000
      runAsGroup: 2000
      runAsNonRoot: true
      runAsUser: 1000
    storage: {}
    tolerations: []
    topologySpreadConstraints: []
    useExistingSecret: false
    volumeMounts: []
    volumes: []
    web: {}
  annotations: {}
  apiVersion: v2
  config:
    global:
      resolve_timeout: 5m
      smtp_auth_password: xxxxxxxxx
      smtp_auth_username: xxxxxxxx
      smtp_from: xxxxxxxxx
      smtp_require_tls: true
      smtp_smarthost: xxxxxxxx:25
    receivers:
      - email_configs:
          - to: xxxxx@xxxxxxx
        name: google-gitlab
      - name: discord_webhook
        slack_configs:
          - channel: 'xxxxxxx'
            username: 'xxxxxxx'
            api_url: https://discord.com/api/webhooks/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    route:
      group_by:
        - job
      group_interval: 5m
      group_wait: 30s
      receiver: discord_webhook
      repeat_interval: 60m
      routes:
        - continue: true
          match:
            severity: page
          receiver: discord_webhook
        - match:
            severity: page
          receiver: devops
    templates:
      - /etc/alertmanager/config/*.tmpl
  enabled: true
  extraSecret:
    annotations: {}
    data: {}
  ingress:
    annotations: {}
    enabled: false
    hosts: []
    labels: {}
    paths: []
    tls: []
  ingressPerReplica:
    annotations: {}
    enabled: false
    hostDomain: ""
    hostPrefix: ""
    labels: {}
    paths: []
    tlsSecretName: ""
    tlsSecretPerReplica:
      enabled: false
      prefix: alertmanager
  podDisruptionBudget:
    enabled: false
    maxUnavailable: ""
    minAvailable: 1
  secret:
    annotations: {}
  service:
    additionalPorts: []
    annotations: {}
    clusterIP: ""
    externalIPs: []
    externalTrafficPolicy: Cluster
    labels: {}
    loadBalancerIP: ""
    loadBalancerSourceRanges: []
    nodePort: 30903
    port: 9093
    targetPort: 9093
    type: ClusterIP
  serviceAccount:
    annotations: {}
    create: true
    name: ""
  serviceMonitor:
    additionalLabels: {}
    bearerTokenFile: null
    enableHttp2: true
    interval: ""
    labelLimit: 0
    labelNameLengthLimit: 0
    labelValueLengthLimit: 0
    metricRelabelings: []
    proxyUrl: ""
    relabelings: []
    sampleLimit: 0
    scheme: ""
    selfMonitor: true
    targetLimit: 0
    tlsConfig: {}
  servicePerReplica:
    annotations: {}
    enabled: false
    externalTrafficPolicy: Cluster
    loadBalancerSourceRanges: []
    nodePort: 30904
    port: 9093
    targetPort: 9093
    type: ClusterIP
  stringConfig: ""
  templateFiles: {}
  tplConfig: false
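In case it helps with debugging, the Alertmanager secret that the chart generates from this config can be rendered locally before installing (this assumes the chart repo is added as prometheus-community and the release is named prometheus; the template path may differ slightly between chart versions):

helm template prometheus prometheus-community/kube-prometheus-stack \
  --version 45.21.0 -f values.yaml -s templates/alertmanager/secret.yaml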

munsayac13 added the enhancement (New feature or request) label Apr 26, 2023

jmc000 commented Apr 27, 2023

Same issue here using GCP and v45.21.0 too.

@chichi13

Same issue with the same version on GCP/Minikube and the same debugging process. Also tried with older versions (v45.10.0 for example).

Can you change the label to bug instead of enhancement, please?


jmc000 commented Apr 27, 2023

OK, I found what the problem was in my case. The Alertmanager CRD didn't create the StatefulSet because of an incorrect Alertmanager configuration file (provided via a secret).

After correcting my Alertmanager config secret, the pods started to appear.

It would be better if a config file error did not stop the CRD from creating the Alertmanager pods; instead, the error should be surfaced in the pod logs.
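For anyone hitting the same thing, this is roughly how the config the operator consumes can be pulled back out of the cluster and validated (the secret name assumes the default alertmanager-prometheus-kube-prometheus-alertmanager naming, and amtool ships with Alertmanager):

# Dump the Alertmanager config held in the chart-managed secret
kubectl -n monitoring get secret alertmanager-prometheus-kube-prometheus-alertmanager \
  -o jsonpath='{.data.alertmanager\.yaml}' | base64 -d > /tmp/alertmanager.yaml

# Validate it offline
amtool check-config /tmp/alertmanager.yaml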

zeritti (Member) commented Apr 27, 2023

In cases in which custom resources like Prometheus and Alertmanager do not behave as expected, the first step should be to check the Prometheus operator and its logs - the operator does report errors on those resources if they occur. For instance, if the operator cannot validate Alertmanager's configuration, it won't launch the resource.
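For example (the operator Deployment name depends on the release name; with a release called prometheus it is typically prometheus-kube-prometheus-operator):

kubectl -n monitoring logs deployment/prometheus-kube-prometheus-operator | grep -i alertmanager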

munsayac13 (Author) commented Apr 27, 2023

> OK, I found what the problem was in my case. The Alertmanager CRD didn't create the StatefulSet because of an incorrect Alertmanager configuration file (provided via a secret).
>
> After correcting my Alertmanager config secret, the pods started to appear.
>
> It would be better if a config file error did not stop the CRD from creating the Alertmanager pods; instead, the error should be surfaced in the pod logs.

Thanks to all who responded. I agree with this notion. I applied the same solution you did: I got rid of the secret, ran helm upgrade again, and then Alertmanager started to appear.
