Add ability to specify some results as precomputed in PromQL expression's AST #10101

GiedriusS · 2022-01-03T09:49:16Z

Hello,

On the Thanos side, we've been working on one of our oldest tickets: query pushdown. Currently, Thanos always sends the raw metrics data that matches the given matchers (via Select()) to a central location before executing the query. This can be quite inefficient. A lot of resources could be saved by executing part of the original query on the leaf nodes instead of sending raw data. For example, we've recently implemented a feature for pushing down max, max_over_time, and other functions: thanos-io/thanos#4917. It works well for these functions because you can apply them multiple times and still get the same result. However, the current PromQL engine is not flexible enough for the other case, where results are already precomputed and we don't want the PromQL engine to compute them again. In theory, we could calculate rate(foo[5m]) on the leaf nodes, but then the "global" PromQL engine would apply rate again. So I think some improvements are needed on the PromQL engine's side.
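
To make the difference concrete, here is a minimal Python sketch (not Thanos or Prometheus code). The sample values and the `simple_rate` helper are assumptions made up for illustration; `simple_rate` is a deliberately simplified per-second rate without the counter-reset handling or extrapolation that the real PromQL rate() performs.

```python
# Simplified per-second rate: (last - first) / elapsed time.
# Assumption: no counter resets and no extrapolation, unlike PromQL's rate().
def simple_rate(samples):
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)

# max is safe to push down: the max of the per-leaf maxima is the global max.
leaf_a = [3.0, 9.0, 4.0]
leaf_b = [7.0, 2.0]
pushed_down_max = max(max(leaf_a), max(leaf_b))
global_max = max(leaf_a + leaf_b)

# rate is different: if a leaf already returned rates and the central engine
# applies rate again, it computes the rate of change of the rate.
counter = [(0, 0.0), (60, 120.0), (120, 240.0), (180, 360.0)]  # (timestamp, value)
true_rate = simple_rate(counter)
leaf_rates = [
    (60, simple_rate(counter[0:2])),
    (120, simple_rate(counter[1:3])),
    (180, simple_rate(counter[2:4])),
]
rate_applied_twice = simple_rate(leaf_rates)

print(pushed_down_max, global_max)    # both are the same value
print(true_rate, rate_applied_twice)  # these differ
```

With these made-up numbers the counter grows steadily, so the precomputed rates are constant and a second application of the rate yields zero rather than the true rate.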

I am not the first one to come up with this. Here is some of the previous work that I am aware of:

promxy which has its own fork of Prometheus with what is called the NodeReplacer API (https://github.com/jacksontj/prometheus/commits/release-2.30_fork_promxy) (jacksontj@3db1780)
https://github.com/thanos-io/thanos/pulls?q=is%3Apr+is%3Aopen+pushdown
https://docs.google.com/document/d/1ajMPwVJYnedvQ1uJNW2GDBY6Q91rKAw3kgA5fykIRrg/edit?usp=sharing
Add engine option for NodeReplacer #3631

Some options to solve this problem that I can think of:

Having Select() return some kind of SelectHints which would indicate whether the result has already been computed. Then, the PromQL engine could remove the function call one layer above. Probably the easiest to implement but does this make sense?
Separate function for doing what NodeReplacer is doing. This would give the maximum flexibility to downstream users but probably the hardest to implement. Maybe it could even be called that way?
Adding the same NodeReplacer to Walk/Inspect.
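
The general idea behind a node-replacing hook can be sketched as follows. This is a toy model in Python, not the promxy NodeReplacer API or the Prometheus parser package: the `Selector`, `Call`, and `Precomputed` node types, `replace_nodes`, and `push_down_rate` are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Hypothetical, heavily simplified AST node types.
@dataclass(frozen=True)
class Selector:
    name: str
    range: str = ""

@dataclass(frozen=True)
class Call:
    func: str
    args: Tuple["Node", ...]

@dataclass(frozen=True)
class Precomputed:
    expr: str  # text of the expression that was evaluated on the leaves

Node = Union[Selector, Call, Precomputed]

def to_text(node):
    """Render a node back to query-like text."""
    if isinstance(node, Selector):
        return node.name + (f"[{node.range}]" if node.range else "")
    if isinstance(node, Call):
        return f"{node.func}({', '.join(to_text(a) for a in node.args)})"
    return f"precomputed<{node.expr}>"

def replace_nodes(node, replacer):
    """Walk the tree top-down. If replacer returns a node, substitute it
    and do not descend further; otherwise rebuild the children."""
    replacement = replacer(node)
    if replacement is not None:
        return replacement
    if isinstance(node, Call):
        return Call(node.func, tuple(replace_nodes(a, replacer) for a in node.args))
    return node

def push_down_rate(node):
    """Example replacer: mark rate(...) calls as already computed."""
    if isinstance(node, Call) and node.func == "rate":
        return Precomputed(to_text(node))
    return None

query = Call("sum", (Call("rate", (Selector("foo", "5m"),)),))
rewritten = replace_nodes(query, push_down_rate)
print(to_text(query))
print(to_text(rewritten))
```

In this sketch the engine would evaluate only the outer sum and treat the replaced subtree as data that is already there, which is the behavior the options above are trying to enable.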

All of this extra code would only be used by downstream users so all in all it should be relatively easy to add this kind of "internal API" to the PromQL engine.

brian-brazil · 2022-01-03T10:19:41Z

> Probably the easiest to implement but does this make sense?

It doesn't, as leaf nodes don't have access to enough data to correctly evaluate a function like rate() as they don't know what data other leaves may also have for the same time series. It's basically only correct to do this sort of thing for min/max.
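
One way to read this objection, sketched in Python under loud assumptions: the sample values are made up, each leaf is assumed to hold only part of the same series, and `simple_rate` is a simplified per-second rate without the counter-reset handling or extrapolation of the real PromQL rate().

```python
# Simplified per-second rate: (last - first) / elapsed time.
# Assumption: no counter resets and no extrapolation, unlike PromQL's rate().
def simple_rate(samples):
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)

# Full history of one counter series as (timestamp, value) pairs.
full_series = [(0, 0.0), (60, 10.0), (120, 20.0), (180, 200.0), (240, 260.0)]

# Hypothetical split: each leaf only saw part of the same series.
leaf_a_samples = full_series[0:3]  # t = 0..120
leaf_b_samples = full_series[3:5]  # t = 180..240

merged_rate = simple_rate(full_series)    # what a central engine computes from raw data
leaf_a_rate = simple_rate(leaf_a_samples)
leaf_b_rate = simple_rate(leaf_b_samples)
averaged = (leaf_a_rate + leaf_b_rate) / 2

print(merged_rate, leaf_a_rate, leaf_b_rate, averaged)
```

Neither leaf sees the increase between t=120 and t=180, so no per-leaf rate, and no simple average of them, matches the rate computed over the merged samples.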

GiedriusS · 2022-02-23T11:36:31Z

But functions such as rate only modify each matching time series in place, right? Could you please elaborate? One blocker that I can think of: the result returned by query_range does not show whether there were any gaps in the data, and without that information it is impossible to deduplicate correctly.

roidelapluie added the component/promql, component/remote storage, and not-as-easy-as-it-looks labels on Feb 11, 2022

GiedriusS mentioned this issue on Jun 2, 2022, in "Implement query pushdown for a subset of aggregations" (thanos-io/thanos#4917, merged)
