Issue with PD integration (or how to debug) #2082

@ut0mt8

What did you do?

I am trying to send alerts from Prometheus to PagerDuty (PD). Some of them fail and are not ingested by PD.
Summary: a malformed JSON payload is probably being generated, but how can I debug it?

What did you expect to see?

An alert in PD.

What did you see instead? Under which circumstances?

The alert is not sent, and this is what the log shows:

level=error ts=2019-10-24T13:29:13.862Z caller=notify.go:372 component=dispatcher msg="Error on notify" err="cancelling notify retry for \"pagerduty\" due to unrecoverable error: unexpected status code 400: Event object format is unrecognized: JSON parse error" context_err=null
level=error ts=2019-10-24T13:29:13.862Z caller=dispatch.go:266 component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="cancelling notify retry for \"pagerduty\" due to unrecoverable error: unexpected status code 400: Event object format is unrecognized: JSON parse error"

Environment

docker / linux basics

  • Alertmanager version:

0.19

  • Prometheus version:

2.9.2

  • Alertmanager configuration file:
global:
    resolve_timeout: 5m
inhibit_rules:
-   equal:
    - cloud
    - region
    - env
    - role
    - alertgroups
    source_match:
        severity: critical
    target_match:
        severity: warning
receivers:
-   name: default
    slack_configs:
    -   api_url: https://hooks.slack.com/services/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
        channel: '#alerts'
        send_resolved: true
-   name: slack
    slack_configs:
    -   actions:
        -   style: danger
            text: '{{ template "csq.slack.silence.text" . }}'
            type: button
            url: '{{ template "csq.slack.silence.link" . }}'
        -   style: primary
            text: '{{ template "csq.slack.dashboard.text" . }}'
            type: button
            url: '{{ template "csq.slack.dashboard.link" . }}'
        -   style: primary
            text: '{{ template "csq.slack.confluence.text" . }}'
            type: button
            url: '{{ template "csq.slack.confluence.link" . }}'
        -   style: primary
            text: '{{ template "csq.slack.grafana.text" . }}'
            type: button
            url: '{{ template "csq.slack.grafana.link" . }}'
        api_url: https://hooks.slack.com/services/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
        channel: '{{ template "csq.slack.channel" . }}'
        color: '{{ template "csq.slack.color" . }}'
        fields:
        -   short: true
            title: cloud
            value: '{{ .CommonLabels.cloud }}'
        -   short: true
            title: region
            value: '{{ .CommonLabels.region }}'
        -   short: true
            title: env
            value: '{{ .CommonLabels.env }}'
        -   short: true
            title: role
            value: '{{ .CommonLabels.role }}'
        icon_emoji: '{{ template "csq.slack.emoji" . }}'
        image_url: '{{ template "csq.slack.image" . }}'
        send_resolved: true
        short_fields: true
        text: '{{ template "csq.slack.text" . }}'
        title: '{{ template "csq.slack.title" . }}'
-   name: pagerduty
    pagerduty_configs:
    -   description: '{{ template "csq.pagerduty.description" . }}'
        details:
            alertname: '{{ .CommonLabels.alertname }}'
            cloud: '{{ .CommonLabels.cloud }}'
            env: '{{ .CommonLabels.env }}'
            region: '{{ .CommonLabels.region }}'
            role: '{{ .CommonLabels.role }}'
        send_resolved: true
        service_key: '{{ template "csq.pagerduty.service_key" . }}'
        severity: '{{ .CommonLabels.severity }}'
route:
    group_by:
    - cloud
    - region
    - env
    - role
    - alertname
    group_interval: 5m
    group_wait: 1m
    receiver: default
    repeat_interval: 1h
    routes:
    -   continue: false
        match:
            test: 'yes'
        receiver: slack
    -   continue: false
        group_by:
        - '...'
        match:
            env: production
        receiver: pagerduty
    -   continue: false
        receiver: slack
templates:
- /etc/alertmanager/templates.tmpl
  • Prometheus configuration file:
NA
  • Logs:
level=error ts=2019-10-24T13:29:13.862Z caller=notify.go:372 component=dispatcher msg="Error on notify" err="cancelling notify retry for \"pagerduty\" due to unrecoverable error: unexpected status code 400: Event object format is unrecognized: JSON parse error" context_err=null
level=error ts=2019-10-24T13:29:13.862Z caller=dispatch.go:266 component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="cancelling notify retry for \"pagerduty\" due to unrecoverable error: unexpected status code 400: Event object format is unrecognized: JSON parse error"
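
One way to narrow this down might be to capture the request body before it reaches PD and check whether it parses as JSON. The following is only a minimal sketch in Python, not something Alertmanager provides: the `check_json` helper, the `CaptureHandler` class and port 8099 are hypothetical names chosen for illustration, and how to get Alertmanager's PagerDuty requests routed to such a listener (for example through a proxy) is an assumption not verified here.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def check_json(body: bytes):
    """Return (ok, message) telling whether the body parses as JSON."""
    try:
        json.loads(body.decode("utf-8"))
    except ValueError as exc:
        # UnicodeDecodeError and json.JSONDecodeError both derive from ValueError.
        return False, str(exc)
    return True, "valid JSON"


class CaptureHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in endpoint that logs whatever is POSTed to it."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        ok, message = check_json(body)
        # Print the raw payload so a broken template expansion is visible.
        print(body.decode("utf-8", errors="replace"))
        print("parse result:", message)
        self.send_response(200 if ok else 400)
        self.end_headers()


if __name__ == "__main__":
    # Port 8099 is arbitrary; nothing in the configuration above refers to it.
    HTTPServer(("127.0.0.1", 8099), CaptureHandler).serve_forever()
```

If the captured body fails to parse, the message from `check_json` includes the position of the first error, which should help relate it to one of the templated fields (`description`, `details`, `severity` or `service_key`).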
