Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Topological sorting of rules #17

Closed
juliusv opened this Issue Jan 4, 2013 · 15 comments

Comments

Projects
None yet
4 participants
@juliusv
Copy link
Member

juliusv commented Jan 4, 2013

Rules need to be evaluated in topological sort order in order to respect data dependencies between them.

@ghost ghost assigned juliusv Jan 4, 2013

bernerdschaefer added a commit that referenced this issue Apr 9, 2014

Merge pull request #17 from prometheus/rename-test-helper-files
Rename test helper files to helpers_test.go
@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented May 23, 2014

I suspect that this problem has corner cases that are unresolvable in the general case, such as when rules with the same name are applied to many variables, and thus that correctness can't be guaranteed. A mostly-working linter/sanity checker should be an achievable goal though.

@juliusv

This comment has been minimized.

Copy link
Member Author

juliusv commented May 23, 2014

Yeah, you can create two rules with the same name, but you usually wouldn't encourage that, right? Maybe we should even forbid that? Or are there good use cases for this?

I was btw. not thinking about a linter, but about Prometheus automatically sorting rules topologically according to their dependencies on other rules.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented May 24, 2014

Yeah, you can create two rules with the same name, but you usually wouldn't encourage that, right? Maybe we should even forbid that?

I think it's an option to forbid it depending on what exact naming and rule conventions we decide on.
Even if could make that work out in theory including all the knock-on effects, in practice I expect many users won't follow the conventions and or end up with clashes on metrics names. I'm willing to bet we'll have clashes just within the "official" client libraries for something like GC, before you even get to applications themselves.

There's also cases like rules that depend on themselves.

@stapelberg

This comment has been minimized.

Copy link

stapelberg commented May 22, 2015

I think as a first step it’d be a good idea to evaluate rules in the order in which they were defined in the config file.

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented May 22, 2015

Currently we are reading the files and their rules in order. On each evaluation iteration, we evaluate concurrently, though, because naive sequential evaluation can take longer than the evaluation interval. In that case iterations are missed.

Side note: if the a single query in the concurrent evaluation takes longer, a whole iteration will be missed for all rules. We should not be waiting for all rules in an iteration to finish and only skip iterations for the slow queries.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented May 22, 2015

I think as a first step it’d be a good idea to evaluate rules in the order in which they were defined in the config file.

I've proposed that previously, some users are already at a point where that don't work for the reason @fabxc explains.

The last time we talked about this the plan was to add syntax to let you define a groups of rules, which would be run in order. Different groups could run at the same time then. This could also be used to enable things like allowing rules to access remote storage (which you wouldn't want enabled by default).

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented May 22, 2015

Mh, it seems like we can solve this internally. Not sure whether taking the issue to the user is the best approach here.

Building a dependency graph will also be more optimal than one level of logical grouping by the user.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented May 22, 2015

Building a dependency graph will also be more optimal than one level of logical grouping by the user.

This is only one aspect to the problem. What we also need to do is spread out the load of the rule evaluations and allow for cases when there's loops or things we can't determine about a dependency graph.

I'm think ordered groups of rules with some micro-optimisation where it's safe is the route to take.

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented May 22, 2015

I would think that cyclic dependencies error when reading the rules. Or are
there reasonable cases where it should be allowed?
With that, evaluating one chain of rules without blocking independent ones
is not an issue.

On Fri, May 22, 2015 at 12:48 PM Brian Brazil notifications@github.com
wrote:

Building a dependency graph will also be more optimal than one level of
logical grouping by the user.

This is only one aspect to the problem. What we also need to do is spread
out the load of the rule evaluations, allow for cases when there's loops or
things we can't determine about a dependency graph.

I'm think ordered groups of rules with some micro-optimisation where it's
safe is the route to take.


Reply to this email directly or view it on GitHub
#17 (comment)
.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented May 22, 2015

I would think that cyclic dependencies error when reading the rules. Or are
there reasonable cases where it should be allowed?

There are advanced use cases where it comes up. It's also possible that there's rules that aren't actually cyclic, but where it's not possible for us to determine that due to e.g. use of regexes. You also wouldn't want to couple rule evaluation between jobs, as that could lead to a problem in one job taking out evaluation in another.

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented May 23, 2015

Can you elaborate on the "use of regexes" part. I know that there were plans to have relabel() in the QL at some point - are you referring to that?

@juliusv

This comment has been minimized.

Copy link
Member Author

juliusv commented May 23, 2015

@fabxc For example, if someone uses a regex matcher on the metric name in the rule expression, you can't analyze anymore whether the resulting metric name(s) would need some rule to be executed first. Same for other labels (you could have parts of one metric being selected in one rule, and other parts in another rule).

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented May 23, 2015

Ah, those regexes – of course, thanks!

@juliusv

This comment has been minimized.

Copy link
Member Author

juliusv commented Sep 17, 2015

Superseded by #1095.

@juliusv juliusv closed this Sep 17, 2015

simonpasquier referenced this issue in simonpasquier/prometheus Oct 12, 2017

Merge pull request #17 from brian-brazil/clarify
Make it clear what the full config is.

bobmshannon pushed a commit to bobmshannon/prometheus that referenced this issue Nov 19, 2018

Bob Shannon GitHub Enterprise
Merge pull request prometheus#17 from SRX/bs/update_sls_service_unhea…
…lthy_alert

Don't include message in SLS service alert

simonpasquier referenced this issue in simonpasquier/prometheus Jan 31, 2019

Merge pull request #17 from simonpasquier/add-owners
Add OWNERS file and Dockerfile
@lock

This comment has been minimized.

Copy link

lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 24, 2019

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.