Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/alertmanager: start after the network-online target #82808

Merged
merged 1 commit into from Mar 28, 2020

Conversation

@nlewo
Copy link
Member

nlewo commented Mar 17, 2020

Motivation for this change

Since an upgrade to 20.03, the Prometheus alertmanager is no longer starting after a reboot on my machine:

If the host network stack is slow to start, the alertmanager fails to
start with this error message:

caller=main.go:256 msg="unable to initialize gossip mesh" err="create memberlist: Failed to get final advertise address: No private IP address found, and explicit IP not provided"

This bug can be reproduced by shutting down the network stack and
restarting the alertmanager.

Note I don't know why I didn't hit this issue with previous
alertmanager releases.

Things done

I deployed the patch on this machine and the issue is now fixed.

We don't have any tests for the alertmanager. And i'm not able to reproduce the issue in a private Prometheus NixOS test because the VM network stack is up too fast.

  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.
If the host network stack is slow to start, the alertmanager fails to
start with this error message:

    caller=main.go:256 msg="unable to initialize gossip mesh" err="create memberlist: Failed to get final advertise address: No private IP address found, and explicit IP not provided"

This bug can be reproduced by shutting down the network stack and
restarting the alertmanager.

Note I don't know why I didn't hit this issue with previous
alertmanager releases.
@nlewo nlewo requested a review from andir Mar 17, 2020
@nlewo nlewo added this to the 20.03 milestone Mar 17, 2020
@Ma27 Ma27 requested a review from fpletz Mar 19, 2020
@nlewo nlewo merged commit afb035a into NixOS:master Mar 28, 2020
12 checks passed
12 checks passed
Evaluation Performance Report Evaluator Performance Report
Details
grahamcofborg-eval ^.^!
Details
grahamcofborg-eval-check-meta config.nix: checkMeta = true
Details
grahamcofborg-eval-darwin nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./pkgs/top-level/release.nix -A darwin-tested
Details
grahamcofborg-eval-nixos nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./nixos/release-combined.nix -A tested
Details
grahamcofborg-eval-nixos-manual nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./nixos/release.nix -A manual
Details
grahamcofborg-eval-nixos-options nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./nixos/release.nix -A options
Details
grahamcofborg-eval-nixpkgs-manual nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./pkgs/top-level/release.nix -A manual
Details
grahamcofborg-eval-nixpkgs-tarball nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./pkgs/top-level/release.nix -A tarball
Details
grahamcofborg-eval-nixpkgs-unstable-jobset nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./pkgs/top-level/release.nix -A unstable
Details
grahamcofborg-eval-package-list nix-env -qa --json --file .
Details
grahamcofborg-eval-package-list-no-aliases nix-env -qa --json --file . --arg config { allowAliases = false; }
Details
@nlewo
Copy link
Member Author

nlewo commented Mar 28, 2020

Backported in d87fea7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

1 participant
You can’t perform that action at this time.