Add VictorOps Notifier #417

Merged
merged 1 commit into from Jul 27, 2016


@rhazdon rhazdon commented Jul 3, 2016

Hi,

I extended Alertmanager to send alerts to VictorOps.
We are already using this feature at my company, so please excuse that I didn't open a discussion beforehand.

What do you think about it?

Member

@beorn7 beorn7 commented Jul 6, 2016

@brian-brazil @fabxc I guess you are more likely than me to have an opinion on this.

@@ -1 +1 @@
-0.2.1
+0.2.2
Member

@beorn7 beorn7 Jul 6, 2016

Version change will happen during release, not upon individual commits.

Author

@rhazdon rhazdon Jul 6, 2016

Of course. :)
I will fix that asap.

EDIT: Removed.

	NotifierConfig: NotifierConfig{
		VSendResolved: true,
	},
	MessageType: `{{ if eq .Status "firing" }}CRITICAL{{ else }}RECOVERY{{ end }}`,
Contributor

@brian-brazil brian-brazil Jul 6, 2016

Does VictorOps treat messages differently depending on this? We may wish to hardcode resolved notifications to send a recovery message.

Author

@rhazdon rhazdon Jul 7, 2016

Yes. Depending on the value, VictorOps will create or close an incident.
It expects one of the following keywords in this field: INFO, WARNING, ACKNOWLEDGEMENT, CRITICAL, RECOVERY. Available here: API Docs.

Is there maybe a better way to resolve the exact status/type of an alert?

@mirthy mirthy Jul 7, 2016

I wonder whether this MessageType switching should instead occur in impl.go, much like the PagerDuty and OpsGenie notifiers do it.

(PagerDuty): https://github.com/rhazdon/alertmanager/blob/001e0716bda50a9721b7c8b9cc32751969f6b8e6/notify/impl.go#L405

(OpsGenie): https://github.com/rhazdon/alertmanager/blob/001e0716bda50a9721b7c8b9cc32751969f6b8e6/notify/impl.go#L673

But I can see an argument against it if we want to be able to use INFO, WARNING, etc.

Contributor

@brian-brazil brian-brazil Jul 7, 2016

That's the interesting question. Allowing the user to do it this way will lead to users breaking recovery notifications. So what we might want to do is leave this configurable but only apply it to firing alerts, and have some code to substitute in emergency if an invalid value is used.
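
The rule proposed here can be sketched in Go roughly as follows. This is a minimal illustration, not the code from the pull request: `messageType` and `validFiringTypes` are hypothetical names, the set of values accepted for firing alerts is an assumption based on the keywords listed earlier in this thread, and the fallback to CRITICAL follows the default mentioned in the commit message rather than the word "emergency" used above.

```go
package main

import "fmt"

// validFiringTypes is an assumed whitelist: the keywords quoted in this
// thread, minus RECOVERY, which is reserved for resolved notifications.
var validFiringTypes = map[string]bool{
	"INFO":            true,
	"WARNING":         true,
	"ACKNOWLEDGEMENT": true,
	"CRITICAL":        true,
}

// messageType is a hypothetical helper. It ignores the configured value
// for resolved alerts and falls back to CRITICAL for invalid values.
func messageType(status, configured string) string {
	if status == "resolved" {
		// Not configurable, so users cannot break recovery notifications.
		return "RECOVERY"
	}
	if validFiringTypes[configured] {
		return configured
	}
	// Invalid or empty value: substitute the default.
	return "CRITICAL"
}

func main() {
	fmt.Println(messageType("firing", "WARNING"))   // WARNING
	fmt.Println(messageType("firing", "bogus"))     // CRITICAL
	fmt.Println(messageType("resolved", "WARNING")) // RECOVERY
}
```

Hardcoding the resolved case keeps the user-facing option limited to the severity of firing alerts, which is the part that plausibly varies between setups.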


@mirthy mirthy commented Jul 12, 2016

Any more progress with this? What do we need to do here?

Author

@rhazdon rhazdon commented Jul 13, 2016

@mirthy Yes, in 2-3 days. I became a daddy two days ago. :)

Author

@rhazdon rhazdon commented Jul 19, 2016

I rebased my fork and solved the merge conflicts.


@mirthy mirthy commented Jul 21, 2016

Hi guys, looks like @rhazdon made some more changes, any chance someone could look at this again? Sorry to be a pain. Is there anything I can do to help?

Author

@rhazdon rhazdon commented Jul 26, 2016

Hi Guys,
is there anything I should change or improve? :)

Contributor

@fabxc fabxc commented Jul 26, 2016

I was on vacation. Am looking now.

}

fmt.Printf("unexpected VictorOps response from %s (POSTed %s), %s: %s",
apiURL, msg, resp.Status, body)
Contributor

@fabxc fabxc Jul 26, 2016

No line break here.

Also this should be using log.Debugf rather than fmt.Printf.

Author

@rhazdon rhazdon Jul 26, 2016

Hi @fabxc, thank you very much. You are right, I fixed that. :)

Contributor

@fabxc fabxc commented Jul 26, 2016

@rhazdon please also check the comment on the entity ID above.

Author

@rhazdon rhazdon commented Jul 26, 2016

I did. I'm currently testing it in production.

Contributor

@fabxc fabxc commented Jul 26, 2016

Thanks. Can you squash that into one commit? Then we should be good to merge.

Contributor

@fabxc fabxc commented Jul 26, 2016

Nope, compile error ahead.

Add default VictorOpsAPIURL

Add VictorOps default config

Add VictorOpsConfig struct in notifiers

Add new template tags for victorops

Add notifications logic for victorops

Compiled template tags with make assets

Remove common labels from entity_id template

Set messageType default value to CRITICAL

Recovery messageType is not configurable anymore. Firing state only allows specific keys

Make assets

Using log.Debugf

EntityID should not be configurable

Remove entity_id from template

Use GroupKey(ctx) as entity_id

Improve debug logging

Fix type of entity_id
Contributor

@fabxc fabxc commented Jul 27, 2016

👍

@fabxc fabxc merged commit 42696d9 into prometheus:master Jul 27, 2016
1 of 2 checks passed

@fuzzyami fuzzyami commented Oct 19, 2016

@rhazdon
Thanks for your work on the VO integration. It's something we'd want to use too.

I'm trying it now - and my alerts have payloads ("StateMessage" in VO terms) that could be somewhat improved. The current format contains (among other things) a space-delimited list of values:

"TooManyConfs1 AMS 3.3.3.3 GER-03 1.1.1.1"

This list originates in the key-value pairs that are part of the alert-data in the AlertManager. It would be awesome if that list contained both the keys and the values from the alert:

alertname:TooManyConfs1 datacenter:AMS external_ip:3.3.3.3 region:GER-03 internal_ip:1.1.1.1

Those key/value pairs provide important context to the alert. Without the key, the readability of the alert is greatly reduced. Perhaps this can be easily fixed?

Contributor

@brian-brazil brian-brazil commented Oct 19, 2016

That's the default for all receivers where the message is considered short: you'll usually know what the values are from experience, and there are things like SMS size limits to worry about. You can override it with notification templates.


@fuzzyami fuzzyami commented Oct 19, 2016

Thanks, Brian.

For future readers of this thread, I'm pasting below our current VictorOps receiver config. This is a little crude, but it provides all the info we need (alert name is first, then summary and finally full 'payload'). In the future we might customize the summary text and drop the labels.

receivers:
- name: victorOps-receiver
  victorops_configs:
    - api_key: <our_secret_api_key>
      routing_key: <our_routing_key>
      message: 'Alert: {{ .CommonLabels.alertname }}. Summary:{{ .CommonAnnotations.summary }}. RawData: {{ .CommonLabels }}'
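
The key:value format asked for earlier in this thread can be illustrated with Go's standard `text/template` package, which Alertmanager's notification templates build on. The sketch below is standalone and makes assumptions: `renderLabels` and `messageTmpl` are hypothetical names, and a plain `map[string]string` stands in for the alert's label set. Whether the same `range` construct can be dropped into the `message` field above unchanged has not been verified here.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// messageTmpl prints each label as key:value. text/template visits map
// keys in sorted order, so the output is deterministic.
const messageTmpl = `{{ range $k, $v := . }}{{ $k }}:{{ $v }} {{ end }}`

// renderLabels is a hypothetical helper that renders a label set
// (a plain map standing in for the alert's common labels).
func renderLabels(labels map[string]string) (string, error) {
	t, err := template.New("message").Parse(messageTmpl)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	if err := t.Execute(&b, labels); err != nil {
		return "", err
	}
	// Drop the trailing space left by the last loop iteration.
	return strings.TrimSpace(b.String()), nil
}

func main() {
	out, err := renderLabels(map[string]string{
		"alertname": "TooManyConfs1",
		"region":    "GER-03",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // alertname:TooManyConfs1 region:GER-03
}
```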
