Rules fail to load with yaml unmarshal error in 2.00 alpha.3 release #2913

Closed
xamogast opened this Issue Jul 7, 2017 · 5 comments

xamogast commented Jul 7, 2017

What did you do?
I tried to load my alert ruleset using the 'rule_files:' directive (see config below). I also tried the [] notation and indenting the path.

What did you expect to see?
Prometheus starting up and the rules loading successfully. (When I start it without including the rule file, it starts.)

What did you see instead? Under which circumstances?
Prometheus fails to start; see the error message in the Logs section below. With the 1.7.1 release I don't get any errors: the same config loads any ruleset.

Environment

  • System information:

Linux 4.4.0-1020-aws x86_64

  • Prometheus version:

prometheus, version 2.0.0-alpha.3 (branch: master, revision: 70f96b0)
build user: root@5630fb1ab539
build date: 20170622-10:04:46
go version: go1.8.3

  • Alertmanager version:

alertmanager, version 0.7.1 (branch: master, revision: ab4138299b94c78dc554ea96e2ab28d04b048059)
build user: root@97e9539a4c3f
build date: 20170609-15:31:09
go version: go1.8.3

  • Prometheus configuration file:
global:
  scrape_interval:     15s
  evaluation_interval: 15s

rule_files:
- "alert.rules"

scrape_configs:
# Node exporter
  - job_name: 'node'
    scrape_interval: "15s"
    file_sd_configs:
      - files:
        - '/home/hajzso/prometheus/targets/node/*.json'
        - '/home/hajzso/prometheus/discovery/node/*.json'
# Blackbox exporter
  - job_name: 'blackbox_http'
    metrics_path: /probe
    params:
      module: [http_2xx]  # Look for a HTTP 200 response.
    static_configs:
      - targets:
        - https://example.com
        
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9115  # Blackbox exporter
  • Rules file:
ALERT website_down
  IF probe_success == 0
  FOR 3m
  LABELS { severity = "hipchat" }
  ANNOTATIONS {
    summary = "Instance {{ $labels.instance }} down",
    description = "{{ $labels.instance }} has been down for more than 3 minutes.",
}

ALERT downage
  IF up == 0
  FOR 1h
  • Alertmanager configuration file:
route:
 group_by: [cluster]
 # If an alert isn't caught by a route, send it to hipchat.
 receiver: team-hipchat
 routes:
  # Send severity=hipchat alerts to hipchat.
  - match:
      severity: hipchat
    receiver: team-hipchat

receivers:
- name: team-hipchat
  hipchat_configs:
  - room_id: 12345
    auth_token: 12345
    notify: true
  • Logs:
● prometheus.service - Prometheus server
   Loaded: loaded (/etc/systemd/system/prometheus.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Fri 2017-07-07 07:09:36 UTC; 2s ago
  Process: 31904 ExecStart=/home/hajzso/prometheus/prometheus/prometheus --config.file=/home/hajzso/prometheus/prometheus/prometheus.yml (code=exited, status=1/FAILURE)
 Main PID: 31904 (code=exited, status=1/FAILURE)

Jul 07 07:09:36 ip-10-0-0-157 prometheus[31904]: time="2017-07-07T07:09:36Z" level=info msg="Starting tsdb" source="main.go:210"
Jul 07 07:09:36 ip-10-0-0-157 prometheus[31904]: time="2017-07-07T07:09:36Z" level=info msg="tsdb started" source="main.go:216"
Jul 07 07:09:36 ip-10-0-0-157 prometheus[31904]: time="2017-07-07T07:09:36Z" level=info msg="Loading configuration file /home/hajzso/prometheus/prometheus/prometheus.yml" source="main.go:344"
Jul 07 07:09:36 ip-10-0-0-157 prometheus[31904]: time="2017-07-07T07:09:36Z" level=error msg="yaml: unmarshal errors:
Jul 07 07:09:36 ip-10-0-0-157 prometheus[31904]:   line 1: cannot unmarshal !!str `ALERT w...` into rulefmt.RuleGroups" source="manager.go:484"
Jul 07 07:09:36 ip-10-0-0-157 prometheus[31904]: time="2017-07-07T07:09:36Z" level=error msg="Failed to apply configuration: error loading rules, previous rule set restored" source="main.go:362"
Jul 07 07:09:36 ip-10-0-0-157 prometheus[31904]: time="2017-07-07T07:09:36Z" level=error msg="Error loading config: one or more errors occurred while applying the new configuration (-config.file=/home/hajzso/prometheus/prometheus/prometheus.yml)" source="main.go:265"
Jul 07 07:09:36 ip-10-0-0-157 systemd[1]: prometheus.service: Main process exited, code=exited, status=1/FAILURE
Jul 07 07:09:36 ip-10-0-0-157 systemd[1]: prometheus.service: Unit entered failed state.
Jul 07 07:09:36 ip-10-0-0-157 systemd[1]: prometheus.service: Failed with result 'exit-code'.

@xamogast xamogast changed the title Rules fail to load with yaml unmarshal error Rules fail to load with yaml unmarshal error in alpha release Jul 7, 2017

@xamogast xamogast changed the title Rules fail to load with yaml unmarshal error in alpha release Rules fail to load with yaml unmarshal error in 2.00 alpha.3 release Jul 7, 2017

tinytub commented Jul 7, 2017

Try running the command promtool update rules your.rules;
it will generate a new rules file called your.rules.yml
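
For illustration, the two rules from the original report might look roughly like this in the YAML rule format that 2.0 expects. This is a hand-written sketch, not actual promtool output: the group name alert.rules is an assumption chosen for the example, and the exact quoting promtool emits may differ.

```yaml
# Hand-converted sketch of the reporter's 1.x rules in the 2.0 YAML format.
# The group name is hypothetical; any name unique within the file works.
groups:
- name: alert.rules
  rules:
  - alert: website_down
    expr: probe_success == 0
    for: 3m
    labels:
      severity: hipchat
    annotations:
      # Values that begin with {{ must be quoted, otherwise YAML reads them as a flow mapping.
      summary: "Instance {{ $labels.instance }} down"
      description: "{{ $labels.instance }} has been down for more than 3 minutes."
  - alert: downage
    expr: up == 0
    for: 1h
```

The rule_files entry in prometheus.yml would then need to point at the converted file (your.rules.yml in the example above) rather than the old one.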

Member

gouthamve commented Jul 12, 2017

Yes, for more info read: https://prometheus.io/blog/2017/06/21/prometheus-20-alpha3-new-rule-format/

Closing as there is nothing here to do. Please re-open if you think otherwise.

@gouthamve gouthamve closed this Jul 12, 2017

wangybelle commented Jan 25, 2018

serverFiles:
  rules.yml: |
    groups:
    - name: example
      rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."

  prometheus.yml:
    rule_files:
    - "/etc/config/rules.yml"

Hello, I installed Prometheus version 2.0 and got the following error with the above configuration. Please help me. Thanks a lot!

level=info ts=2018-01-25T07:25:34.887714618Z caller=kubernetes.go:100 component="target manager" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2018-01-25T07:25:34.888985459Z caller=kubernetes.go:100 component="target manager" discovery=k8s msg="Using pod service account via in-cluster config"
level=error ts=2018-01-25T07:25:34.890128487Z caller=manager.go:485 component="rule manager" msg="loading groups failed" err="yaml: unmarshal errors:\n line 1: cannot unmarshal !!str groups:... into rulefmt.RuleGroups"
level=error ts=2018-01-25T07:25:34.890175318Z caller=main.go:413 msg="Failed to apply configuration" err="error loading rules, previous rule set restored"
level=info ts=2018-01-25T07:25:34.890570093Z caller=kubernetes.go:100 component=notifier discovery=k8s msg="Using pod service account via in-cluster config"

vdice commented Feb 28, 2018

@wangybelle I assume this is from using the prometheus helm chart? I ran into this as well.

Try:

rules.yml:
  groups:
  - name: example
    rules:
...

(Basically, remove the | or |-, etc. after rules.yml:)
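
To spell out why this likely helps: with | after the key, the whole rules body is one YAML string (a block scalar). That matches the "cannot unmarshal !!str" part of the error above. Without it, the value is a nested mapping. The sketch below applies the suggestion to the values from the earlier comment. It assumes the chart serializes each serverFiles entry to YAML itself, which this thread does not confirm and which may vary by chart version.

```yaml
# Sketch of Helm values with the block scalar indicator removed.
# Assumption: the chart converts each serverFiles entry to YAML when
# building the ConfigMap, so the value should be a mapping, not a string.
serverFiles:
  rules.yml:
    groups:
    - name: example
      rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."

  prometheus.yml:
    rule_files:
    - "/etc/config/rules.yml"
```

If the chart in use instead expects file contents as plain strings, the | form would be the right one, so it is worth checking the rendered ConfigMap either way.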

lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 22, 2019
