alerting: provide more examples for alert rules #581

Open
kayrus opened this Issue Oct 6, 2016 · 20 comments


kayrus commented Oct 6, 2016

For example, the most common alerts (critical and warning severity):

  • disk usage
  • memory usage
  • CPU usage
  • kubelet limit for pods per node
  • etc.

Member

beorn7 commented Oct 6, 2016

There is also http://tips.robustperception.io/, where concrete examples are supposed to live.

Author

kayrus commented Oct 6, 2016

@beorn7 Try googling "prometheus disk alert example" and you'll find nothing that leads to this page.

Also, when I tried to search for the "alert" keyword (http://tips.robustperception.io/search:site/q/alert), I got "An error occurred when processing your request."

Member

beorn7 commented Oct 6, 2016

I'm not saying the solution is already on that site, or that the site is already in perfect operational condition. It's just the thing that @brian-brazil recently set up for Prometheus tips. When addressing the issue you have filed, we should think about how many examples we want to add to the core docs (which are supposed to be very concise) and which might be better suited to the tips site.

Member

brian-brazil commented Oct 6, 2016

To be clear, I want the reference docs to stay as reference docs. I'm all for different types of docs being on prometheus.io if we can find a sane way to approach it.

Author

kayrus commented Oct 6, 2016

@beorn7 Don't think about which ones, just shoot. Users need examples, and complete ones; try to avoid examples taken out of context.

varac commented Mar 1, 2017

Coming from Nagios, this is what is lacking, in my view. I'm totally sold on most parts of Prometheus, but I'm missing not only alert examples but also sane defaults (could be opt-in) for alerts. This would help lower the bar for people who want to migrate from other solutions.

arj22 commented Jun 2, 2017

+1

Member

brancz commented Jun 3, 2017

@juliusv has started an initiative to do this and started with metrics from the node_exporter.

imickeyj commented Sep 21, 2017

Can anyone give some alert examples or a specific link? I am kind of stuck at setting up alerts.

huangyanxiong01 commented Sep 23, 2017

+1

4220182 commented Sep 27, 2017

+1

Member

juliusv commented Sep 27, 2017

@imickeyj For now, you can find plenty of alerting rule examples in GitLab's alerting configs: https://gitlab.com/gitlab-com/runbooks/tree/master/alerts

Contributor

auhlig commented Sep 27, 2017

At SAP we have been using Prometheus for quite a while now. Some inspiration for alerting rules can be found here.

tangyong commented Feb 7, 2018

@juliusv @auhlig Thanks for sharing!

I still hope the team will add more alerting rules to the sources and documentation.
Thanks!

Member

brancz commented Feb 7, 2018

We agree with all of this and acknowledge that there is a need for it. The problem today is that there is no standardized labeling of things, and there likely never will be, which is probably a good thing. @tomwilkie and I have been talking extensively about some solutions to this, and we hope to soon have a proposal out for actually shareable alerting rules and dashboard definitions.

Whether this should be on prometheus.io is, in my opinion, questionable. I think the respective "bundle" should live in the repository of the application it describes, similar to how the etcd project has alerting rules and dashboard definitions in its op-guide. As I mentioned, how reusable these actually are today, given the many assumptions about labeling, is questionable, but we are trying to solve that problem.

gauravgoyal0086 commented Mar 15, 2018

@brian-brazil Are you planning to integrate Alertmanager with HipChat?
It works well with Slack, but we do not get a nicely readable format for alerts in HipChat.

Member

brian-brazil commented Mar 15, 2018

@gauravgoyal0086 Please do not ask support questions on unrelated issues.

gauravgoyal0086 commented Mar 15, 2018

I am sorry. I thought I could ask here, as this thread is related to alerts.
Should I open a new issue for Alertmanager + HipChat?

Member

brian-brazil commented Mar 15, 2018

piotrkochan commented Oct 18, 2018
