This repository has been archived by the owner on Feb 22, 2022. It is now read-only.

Alert Manager Will Not Start #6507

Closed
rewt opened this issue Jul 6, 2018 · 5 comments
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@rewt

rewt commented Jul 6, 2018

Version of Helm and Kubernetes:
Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}

Which chart:
stable/prometheus

What happened:
level=info ts=2018-07-06T15:28:07.859903912Z caller=main.go:174 msg="Starting Alertmanager" version="(version=0.15.0, branch=HEAD, revision=462c969d85cf1a473587754d55e4a3c4a2abc63c)"
level=info ts=2018-07-06T15:28:07.859973115Z caller=main.go:175 build_context="(go=go1.10.3, user=root@bec9939eb862, date=20180622-11:58:41)"
level=info ts=2018-07-06T15:28:07.865063193Z caller=cluster.go:155 component=cluster msg="setting advertise address explicitly" addr=<nil> port=9094
level=error ts=2018-07-06T15:28:07.866180432Z caller=main.go:201 msg="Unable to initialize gossip mesh" err="create memberlist: Failed to get final advertise address: Failed to parse advertise address \"<nil>\""

What you expected to happen:
Alertmanager to start correctly

How to reproduce it (as minimally and precisely as possible):
Git clone the repo and run helm install.
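
Concretely, something like this (a sketch, assuming the stable/prometheus chart from this repository and a Helm 2 client that auto-generates a release name):

git clone https://github.com/helm/charts.git
helm install charts/stable/prometheus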

This is on Azure Kubernetes Service with the default ClusterIP config. I've found other issues where this message occurs when Alertmanager cannot determine its IP address, but I don't understand why other containers get an IP just fine while Alertmanager cannot.

@Arun8460

I'm facing the same issue:
level=info ts=2018-07-12T10:28:29.429527993Z caller=main.go:174 msg="Starting Alertmanager" version="(version=0.15.0, branch=HEAD, revision=462c969d85cf1a473587754d55e4a3c4a2abc63c)"
level=info ts=2018-07-12T10:28:29.429617252Z caller=main.go:175 build_context="(go=go1.10.3, user=root@bec9939eb862, date=20180622-11:58:41)"
level=info ts=2018-07-12T10:28:29.45677939Z caller=cluster.go:155 component=cluster msg="setting advertise address explicitly" addr= port=8001
level=error ts=2018-07-12T10:28:29.463090869Z caller=main.go:201 msg="Unable to initialize gossip mesh" err="create memberlist: Failed to get final advertise address: Failed to parse advertise address """

@Mikulas

Mikulas commented Jul 17, 2018

This is the same as these two issues:
prometheus/alertmanager#1271
prometheus/alertmanager#1434

I've patched the deployment as follows:

diff --git a/charts/vendor/prometheus/templates/alertmanager-deployment.yaml b/charts/vendor/prometheus/templates/alertmanager-deployment.yaml
index fd51d2d..e109352 100755
--- a/charts/vendor/prometheus/templates/alertmanager-deployment.yaml
+++ b/charts/vendor/prometheus/templates/alertmanager-deployment.yaml
@@ -42,9 +42,15 @@ spec:
             - name: {{ $key }}
               value: {{ $value }}
             {{- end }}
+            - name: POD_IP
+              valueFrom:
+                fieldRef:
+                  apiVersion: v1
+                  fieldPath: status.podIP
           args:
             - --config.file=/etc/config/alertmanager.yml
             - --storage.path={{ .Values.alertmanager.persistentVolume.mountPath }}
+            - --cluster.advertise-address=$(POD_IP):6783
           {{- range $key, $value := .Values.alertmanager.extraArgs }}
             - --{{ $key }}={{ $value }}
           {{- end }}

and alertmanager starts.

level=info ts=2018-07-17T07:52:02.776561077Z caller=main.go:320 msg="Loading configuration file" file=/etc/config/alertmanager.yml
level=info ts=2018-07-17T07:52:02.776820037Z caller=cluster.go:570 component=cluster msg="Waiting for gossip to settle..." interval=10m0s
level=info ts=2018-07-17T07:52:02.788541771Z caller=main.go:396 msg=Listening address=:9093
level=info ts=2018-07-17T07:53:02.769489698Z caller=cluster.go:579 component=cluster msg="gossip not settled but continuing anyway" polls=0 elapsed=59.992592358s
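
For anyone applying the same fix outside the Helm templating, the patched container spec renders to roughly the following (a minimal sketch; the storage path is an assumed chart default, and the 6783 port simply mirrors the patch above):

# Sketch of the rendered alertmanager container fields after the patch.
env:
  - name: POD_IP
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: status.podIP            # downward API: the pod's own IP
args:
  - --config.file=/etc/config/alertmanager.yml
  - --storage.path=/data                   # assumed default persistentVolume.mountPath
  - --cluster.advertise-address=$(POD_IP):6783   # $(POD_IP) is expanded by Kubernetes from the env var above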

@TheKangaroo
Contributor

Hey guys, we hit the exact same issue with Alertmanager. Fortunately, your patch, @Mikulas, solves the problem. Can you provide a PR with this patch? :)

@stale

stale bot commented Aug 22, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

The stale bot added the lifecycle/stale label on Aug 22, 2018.
@stale

stale bot commented Sep 5, 2018

This issue is being automatically closed due to inactivity.
