Proposal: A way to determine rule group order for important rule groups #4727
Comments
I'd suggest duplicating the rules, with each scoped to the relevant service within its group. This has the advantage of spreading the CPU load around.
You can already do this by putting all the constituent rules in one group.
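As a rough sketch of what that suggestion could look like, the group below keeps a service-scoped copy of a shared rule next to the alert that depends on it. Rules inside a single group are evaluated sequentially in the order listed, so the alert sees the freshly recorded value. The group name, the `http_requests_total` metric, its `service` and `code` labels, the record name, and the threshold are all assumptions made for illustration, not names from this thread.

```yaml
groups:
  - name: checkout  # hypothetical per-service group
    rules:
      # Copy of the shared availability rule, scoped to this one service.
      - record: service:http_availability:ratio_rate5m
        expr: |
          sum by (service) (rate(http_requests_total{service="checkout", code!~"5.."}[5m]))
          /
          sum by (service) (rate(http_requests_total{service="checkout"}[5m]))

      # Listed after the recording rule, so it is evaluated after it
      # within the same group evaluation.
      - alert: CheckoutAvailabilityLow
        expr: service:http_availability:ratio_rate5m{service="checkout"} < 0.99
        for: 10m
        labels:
          severity: page
```

The trade-off is the one discussed in this thread: ordering is guaranteed, but the shared expression is repeated in every service's group.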
But if we duplicate the rules everywhere, then when the time comes to change them (we find a better model, the data changes, etc.) we'd have to change them in many rule groups all at once, which can be daunting as the number of rule groups grows. I guess that's my biggest goal: avoiding that scenario. If that doesn't fit the Prometheus model, that's OK, we can manage. It just felt like something that was jumping out at us as we try to expand internal adoption.
That's a configuration management problem, and I'm afraid those are out of scope for Prometheus.
I... don't entirely feel satisfied by that answer, but that's fair enough. Thanks for your time! My preference would be to leave this issue open for a few days to hear other peeps' input (or workarounds), if any exist, but I'll leave it to y'all to tell me if you'd prefer this issue be closed.
I was expecting @brian-brazil to give that answer, though I can see how that wasn't the one you hoped for :) I also don't think introducing the extra level of complexity in Prometheus would be justified quite yet. It's the first time I'm hearing about this need, so I'm not sure how common it is (and then if it's not super common, but solvable with configuration management)...
SpencerMalone commented Oct 11, 2018
Proposal
Use case. Why is this important?
We have some core rule groups that help define common functionality to be shared between services. This gives us a shared place to have and work with our core common rules. For example:
All of our web apps use Traefik, so we can create one set of rules that defines a web application's general availability and takes in a service specification, and each service can then define its own alerting level. It'd be nice to be able to tweak these core rulesets for everyone at once, but as things stand we have to either maintain a single very large ruleset for everyone, accept that data will sometimes be out of order when the rule groups are evaluated in the wrong order, or duplicate rules all over the place.
Here is a YAML example of what we do right now, accepting that the order will sometimes be wrong. Ours is a little more complicated, with two or three layers of dependent rules in the general ruleset before we get to a service-specific definition, but you get the idea:
And then a specific service might say...
in its own unique rule group.
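As a rough illustration of the layout described above (this is not the YAML from the original post), a shared group and a service-owned group might look like the sketch below. The group names, the `http_requests_total` metric, its `service` and `code` labels, the record name, and the 99% threshold are all hypothetical.

```yaml
groups:
  # Shared "core" group: one availability series per service.
  - name: core_web_availability
    rules:
      - record: service:http_availability:ratio_rate5m
        expr: |
          sum by (service) (rate(http_requests_total{code!~"5.."}[5m]))
          /
          sum by (service) (rate(http_requests_total[5m]))

  # Service-owned group: picks its own alerting level on top of the
  # shared series.
  - name: checkout_alerts
    rules:
      - alert: CheckoutAvailabilityLow
        expr: service:http_availability:ratio_rate5m{service="checkout"} < 0.99
        for: 10m
        labels:
          severity: page
```

Because each group is evaluated on its own schedule, independently of the others, the alert in the second group may read the value recorded by the previous evaluation of the first group rather than the current one. That is the ordering problem this proposal is about.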
My thought is that maybe we could have a numerical ordering system for rule groups, with the default being that a group runs first, alongside everything else with an order of 1? Ex:
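One way such a field could look, purely as an illustration of the idea: the `order` key below is hypothetical and does not exist in Prometheus, so a real Prometheus server would not accept this file as written. The sketch assumes that groups with a lower `order` finish before groups with a higher one start, and that omitting the key means `order: 1`. The group names, metric, labels, and threshold are also assumptions.

```yaml
groups:
  - name: core_web_availability
    order: 1  # hypothetical field; the assumed default, runs first
    rules:
      - record: service:http_availability:ratio_rate5m
        expr: |
          sum by (service) (rate(http_requests_total{code!~"5.."}[5m]))
          /
          sum by (service) (rate(http_requests_total[5m]))

  - name: checkout_alerts
    order: 2  # hypothetical field; would run after every order-1 group
    rules:
      - alert: CheckoutAvailabilityLow
        expr: service:http_availability:ratio_rate5m{service="checkout"} < 0.99
        for: 10m
```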