alertmanagers static_configs no longer working on HEAD #3637

Closed
hoffie opened this Issue Dec 29, 2017 · 5 comments

hoffie commented Dec 29, 2017

What did you do?
Start Prometheus (587dec9) with the example config (with the example alertmanager target enabled).

What did you expect to see?
Status page (http://localhost:9090/status) should show one configured Alertmanager.

What did you see instead? Under which circumstances?
Status page does not show any Alertmanagers.

Environment

  • System information: Linux 4.14.8-1-ARCH x86_64
  • Prometheus version:
prometheus, version 2.0.0 (branch: HEAD, revision: 587dec9eb970531cddc7f1803d258e72129b5aa0)
  build user:       christian@wuechoo
  build date:       20171229-22:55:19
  go version:       go1.9.2
  • Prometheus configuration file:
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
       - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']

file_sd_configs seems to yield the same result, which may indicate a general issue with Alertmanager target discovery in master; I have not tried other discovery methods though. Maybe I am misunderstanding something or am supposed to use a different config now, but for now this looks like a bug to me, as it unexpectedly changes previous behaviour.
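For reference, a minimal sketch of a file_sd_configs variant of the same alerting block looks roughly like this (the file name is illustrative, not my exact setup):

alerting:
  alertmanagers:
  - file_sd_configs:
    - files:
      - 'alertmanager_targets.yml'

with the referenced file containing the usual file_sd target-group list:

- targets:
  - 'alertmanager:9093'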

I tried bisecting this issue but could not identify a single commit due to build issues (retrieval/manager.go:95:11: undefined: TargetManager).

The last working and building commit is a8cce41.
The first non-working and building commit is 587dec9.

$ git bisect log
# only skipped commits left to test
# possible first bad commit: [587dec9eb970531cddc7f1803d258e72129b5aa0] rebased and resolved conflicts with the new Discovery GUI page
# possible first bad commit: [60ef2016d508545f0e8b575a954899cae37a96e2] add a cancel func to the scrape pool as it is needed in the scrape loop select block
# possible first bad commit: [80182a5d82d0f29fe8ec483f2ec1ae6d10f9f1de] use poolKey as the pool map key to avoid multi dimensional maps
# possible first bad commit: [1ec76d1950862c1cd5263065616212a97670449d] rearange the contexts variables and logic split the groupsMerge function to set and get other small nits
# possible first bad commit: [6ff1d5c51e3cb0e0df58aa90a473317e25c6d4ac] add the scrape manager config reloader handle errors with invalid scrape config
# possible first bad commit: [b0d4f6ee08632e09d0b0cc8ec0b2a0cbf61b5861] resolved merge confilc in main.go
# possible first bad commit: [f2df712166ef4e16e69d8aace76202619483c0f8] updated README
# possible first bad commit: [aca8f85699211c2453f0121a4015975c605fc2dd] fixed the tests
# possible first bad commit: [fe6c544532360a09c3cbdc6d235d6798b366263d] some renaming and comments fixes. remove some select state that is most likely obsoleete and hoepfully doesn't braje anything :) merge targets will sort by Discoverer name so we can have consistent tests for the maps.
# possible first bad commit: [f5c2c5ff8fc8e21f9bfaa23d0f81c8245ba8632e] brake the start provider func so that can run unit tests against it.
# possible first bad commit: [c5cb0d2910692cfd77b36d7d270677f58f55d3f9] simplify naming and API.
# possible first bad commit: [9c61f0e8a0c1d4f83d06e003af610a7d671287b2] scrape pool doesn't rely on context as Stop() needs to be blocking to prevent Scrape loops trying to write to a closed TSDB storage.
# possible first bad commit: [e405e2f1ea2156f76a7e70f5471b88a590a8e24a] refactored discovery

According to the commits, this may be related to pull request #3362 (@krasi-georgiev).

@hoffie hoffie changed the title Alertmanager static_configs no longer working on HEAD alertmanagers static_configs no longer working on HEAD Dec 29, 2017

krasi-georgiev commented Dec 30, 2017

I found the issue and will make a PR with the fix.

krasi-georgiev commented Dec 30, 2017

The PR seems to work with your config, so you can test it now, or subscribe to the PR to see whether the maintainers require any changes first.

It would also be great if you could test other service discovery providers, in case you have something ready.
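For example, a DNS-based variant of the alerting block would exercise a different discovery path (the SRV record name below is just a placeholder):

alerting:
  alertmanagers:
  - dns_sd_configs:
    - names:
      - '_alertmanager._tcp.example.org'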

hoffie commented Dec 31, 2017

@krasi-georgiev: I can confirm that your branch fixes this issue for static_configs and file_sd_configs and allows usage of alertmanagers again. Thanks for your effort!
Leaving this issue open until this is merged.

Thanks!

xuelvming commented Jan 16, 2018

I guess I have the same issue here: I am not able to get the Alertmanager recognized by Prometheus. Hope we can make it work soon.

lock bot commented Mar 23, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 23, 2019
