Optimize queries using regex matchers for set lookups #2651
Comments
I think that should be part of the storage itself. Depending on how you look at it, PromQL queries are translated to database queries + additional processing. So it makes sense for the DB to optimise its own queries. See my TODO here, which aims at exactly your suggestion: introspecting regexp matchers and inferring more optimised lookups from there. While I think having it as a first-class concept in PromQL is valuable, it may be a bit late by now and just complicate things.
I saw your TODO, but wasn't quite sure what you meant with interface upgrading in this context. I can't provide any performance data at this point, as I don't have access to a production Prometheus machine right now (especially not with a huge 2.0 database). Would be cool to have a dataset available for such purposes. I was actually not thinking of adding this to PromQL itself, but a separate query optimizer package. At the moment this would have to implement the If you're fine with adding it to tsdb for now, this looks like an easy start. I foresee we'll have more needs for query optimization later, e.g. optimizing queries like
That sounds like a better idea actually.
Yes, stuff like this will be interesting. In this case here one would also add the
brian-brazil referenced this issue Jun 7, 2017: Closed some_metric{{a="1",b="2"}} & some_metric{a=-"1,2,3"} #2815
Another idea is detecting regexes that are actually just (potentially escaped) simple strings.
This doesn't work with remote read, as it's not required to return metrics that match the matchers. My thinking is that, as that's not the usual case, we could have the user mark such endpoints in the configuration.
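A minimal sketch of the simple-string detection idea, using Go's standard `regexp/syntax` package. The helper name `literalValue` is hypothetical and not part of Prometheus. It assumes the matcher is anchored, so a pattern that parses to a single case-sensitive literal behaves like an equality matcher on the unescaped string.

```go
package main

import "regexp/syntax"

// literalValue is a hypothetical helper: it reports whether pattern is
// just a (possibly escaped) plain string and, if so, returns that string
// with the escapes removed.
func literalValue(pattern string) (string, bool) {
	re, err := syntax.Parse(pattern, syntax.Perl)
	if err != nil {
		return "", false
	}
	// Only a single literal node qualifies. Case-insensitive literals
	// match more than one string, so they are rejected as well.
	if re.Op != syntax.OpLiteral || re.Flags&syntax.FoldCase != 0 {
		return "", false
	}
	return string(re.Rune), true
}
```

Under this sketch, a pattern such as `foo\.bar` could be handled like the equality matcher `foo.bar`, while `foo.*` would still go through the regular regex path.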
That IS the original idea of this issue.
I don't get that. This refers to a query planning step. Remote read is just retrieval of the data for said query.
That's true only if the other end is storage. There is no requirement that a remote read request for
brian-brazil added priority/P2, component/local storage, kind/enhancement, help wanted, low hanging fruit and removed help wanted labels Jul 14, 2017
gouthamve added the hacktoberfest label Sep 28, 2017
dilyevsky commented Oct 30, 2017
I'm working on a PR for prefix regex optimization. Should I open a separate issue? Also I noticed that user-provided regexes are wrapped like so =~: Select labels that regex-match the provided string (or substring).
Yes, please open a new PR with your implementation. If you have any questions, feel free to ask them here or via prometheus-developers@. All regular expressions in the Prometheus projects are anchored to prevent simple mistakes. Feel free to send a PR against prometheus/prometheus with changes to our
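As a rough illustration of how anchoring and a prefix optimization could interact, here is a sketch in Go. The function names `anchor` and `matchWithPrefix` are hypothetical, and the sorted slice of label values is an assumption about how a store might expose them; only the standard `regexp`, `sort`, and `strings` packages are used. Because the anchored pattern must match from the first character, any match has to start with the literal prefix of the user-supplied pattern, so a binary search can skip values that cannot match.

```go
package main

import (
	"regexp"
	"sort"
	"strings"
)

// anchor wraps a user-supplied pattern so that it must match the whole
// label value rather than a substring.
func anchor(pattern string) string {
	return "^(?:" + pattern + ")$"
}

// matchWithPrefix returns the values in sortedValues that fully match
// pattern. sortedValues must be sorted in ascending byte order.
func matchWithPrefix(sortedValues []string, pattern string) ([]string, error) {
	full, err := regexp.Compile(anchor(pattern))
	if err != nil {
		return nil, err
	}
	// The literal prefix is taken from the unanchored pattern; every
	// match of that pattern must begin with it. It may be empty, in
	// which case the loop below degrades to a full scan.
	raw, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	prefix, _ := raw.LiteralPrefix()

	// Values sharing the prefix form one contiguous run in sorted order.
	start := sort.SearchStrings(sortedValues, prefix)
	var matched []string
	for _, v := range sortedValues[start:] {
		if !strings.HasPrefix(v, prefix) {
			break
		}
		if full.MatchString(v) {
			matched = append(matched, v)
		}
	}
	return matched, nil
}
```

With the values `bar`, `foo-1`, `foo-2`, `fop`, `zed`, the pattern `foo-.*` only inspects the two `foo-` entries, and the pattern `foo` matches nothing because the anchored form requires the whole value to equal `foo`.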
rajkumarrvarma commented Nov 13, 2017
Hi Experts,
@rajkumarrvarma Please do not ask support questions on unrelated issues.
SivaKrishna05 commented Apr 17, 2018
@brian-brazil @fabxc Sorry, I wasn't able to frame the question correctly. I want to scrape the deployment/service/ingress meta labels, but I am unable to see them in each scrape. How can I scrape deployment labels? Is there any way of getting those without having kube-state-metrics?
@SivaKrishna05 It is impolite to ask support questions on unrelated issues. Please use the users mailing list.
Since the PR referencing this issue in tsdb is merged, I assume the work on the tsdb side is done, so I can remove the local_storage label?
The tsdb issue is unrelated (and in fact can be reverted, it's dead code).
Just to be clear, you are talking about prometheus/tsdb#111, right? Is there anything that needs to be done on the tsdb side?
Yes. Personally I think this should be implemented entirely in the TSDB.
grobie commented Apr 22, 2017
Use case
A common use case for regex matchers is to use them to query all series matching a set of label values, e.g. up{instance=~"foo|bar|baz"}. Grafana's template variables feature is a big user of that pattern.
Problem
Both the old and new storage implement fast-path series lookups for simple equality matchers. While the described set regex matchers could benefit from the same principles, they are treated like any other regex lookup, requiring the lookup of all label values for a given label. In large setups, this will result in a serious query latency penalty.
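To make the cost difference concrete, here is a toy model in Go. The `postings` type and both method names are hypothetical and deliberately simplified; they are not the data structures used by either Prometheus storage. The sketch only assumes an index from label name and value to series IDs.

```go
package main

import (
	"regexp"
	"sort"
)

// postings is a toy index: label name -> label value -> series IDs.
type postings map[string]map[string][]int

// lookupEqual is the fast path: a direct lookup of one label value.
func (p postings) lookupEqual(name, value string) []int {
	return p[name][value]
}

// lookupRegex is the slow path: every value of the label has to be
// tested against the anchored pattern, so the cost grows with the
// number of distinct values even if only a few of them match.
func (p postings) lookupRegex(name, pattern string) ([]int, error) {
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	if err != nil {
		return nil, err
	}
	var out []int
	for value, ids := range p[name] {
		if re.MatchString(value) {
			out = append(out, ids...)
		}
	}
	sort.Ints(out)
	return out, nil
}
```

In this model, `lookupRegex("instance", "foo|bar")` walks all instance values, whereas two calls to `lookupEqual` would touch only the two entries that are actually needed.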
Proposal
Optimize this special regex usage; it should be possible to achieve lookup times similar to those of equality matchers. A few options come to my mind:
up{instance=~"foo"} becomes simply up{instance="foo"}, or a more complicated example rate(requests{instance=~"foo|bar"}[1m]) becomes rate(requests{instance="foo"}[1m]) OR rate(requests{instance="bar"}[1m]).
@fabxc
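One way the set-matcher rewrite could be sketched in Go is shown below. The names `findSetMatches` and `lookupSet` are hypothetical, and the `map[string][]int` from label value to series IDs is an assumed stand-in for a real index. The detection is intentionally conservative: any regex metacharacter other than `|` disables the optimization, as does an empty alternative.

```go
package main

import (
	"sort"
	"strings"
)

// findSetMatches reports whether pattern is a plain alternation of
// literal values such as "foo|bar|baz" and returns those values.
func findSetMatches(pattern string) ([]string, bool) {
	// Any metacharacter besides '|' means this is not a simple set.
	if strings.ContainsAny(pattern, `\.+*?()[]{}^$`) {
		return nil, false
	}
	values := strings.Split(pattern, "|")
	for _, v := range values {
		if v == "" {
			// An empty alternative would also match the empty
			// value, so leave it to the regular regex path.
			return nil, false
		}
	}
	return values, true
}

// lookupSet resolves a set pattern with one equality lookup per value.
// The boolean is false when the pattern is not a simple set and the
// caller has to fall back to a regular regex lookup.
func lookupSet(byValue map[string][]int, pattern string) ([]int, bool) {
	values, ok := findSetMatches(pattern)
	if !ok {
		return nil, false
	}
	seen := make(map[string]bool, len(values))
	var out []int
	for _, v := range values {
		if seen[v] {
			continue // skip repeated alternatives such as "foo|foo"
		}
		seen[v] = true
		out = append(out, byValue[v]...)
	}
	sort.Ints(out)
	return out, true
}
```

For a pattern like `foo|bar`, this performs two map lookups regardless of how many other values the label has.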