Limit number of data points returned in a query #4414
Comments
The plan is to limit the number of points that a PromQL node can return. Limiting the ultimate output could be too late.
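To make "limit what a node can return" concrete, here is a minimal sketch of the idea - not Prometheus's actual code, and all names are hypothetical: a per-query sample budget that every expression node charges its output against, so evaluation aborts as soon as any intermediate result grows too large, rather than only checking the final output.

```go
package main

import (
	"errors"
	"fmt"
)

var errTooManySamples = errors.New("query processing would load too many samples into memory")

// sampleBudget tracks how many samples a single query may materialize.
type sampleBudget struct {
	current int
	max     int
}

// add charges n samples against the budget; an evaluator would call this
// whenever a node (selector, function, aggregation) produces output.
func (b *sampleBudget) add(n int) error {
	b.current += n
	if b.current > b.max {
		return errTooManySamples
	}
	return nil
}

func main() {
	b := &sampleBudget{max: 1000}
	// A vector selector emits 600 samples, then an aggregation over it
	// emits 600 more: the second node pushes the query over budget and
	// the query aborts before the final result is ever assembled.
	for i, n := range []int{600, 600} {
		if err := b.add(n); err != nil {
			fmt.Printf("node %d: %v\n", i, err)
			return
		}
	}
	fmt.Println("query finished within budget")
}
```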
That sounds reasonable. Is this discussed anywhere?
It came up on the developers list a while back.
When you said "the plan" you meant "my plan", right? :-) Is this something you have started working on or plan on working on soon? If not, I'll take a stab.
I've not started working on it yet; there's also #4384, which is taking a more complicated approach.
Roger, thanks for the pointer. Do you have any thoughts on reasonable limits? IIRC we consider 100k to be a reasonable cardinality limit for a metric, and therefore each aggregation should be able to consider 100k samples for a given step. I guess a limit would also be appropriate on the output of aggregations (specifically, less than 100k - maybe 1k?) and on the number of samples in a vector selector (maybe a relatively small limit - 6k would be 1 day at 15s). Even after we do all this, queries that return lots of timeseries over a long range could still be a problem if they are very simple - I'm thinking of something as basic as a bare selector. Hence why I think we need to limit the total as well.
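(For reference, the 6k figure is just the scrape math: at one sample per 15s, a day is 86,400 s ÷ 15 s = 5,760 ≈ 6k samples per series.)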
My main thought was that per-construct limits like those are only a rough proxy for actual memory usage. That's why I'm proposing limiting the output samples of a node, as it's more accurate than that.
Basically limits:
(maybe 1. and 3. could be unified?)
We need range vector function output plus rangeEval as well; those only cover raw data dumps. I'm not sure we can unify those functions.
Picking this up. Just having a read over the discussion (here, the other PR, the mailing list) and trying to get up to speed with the code in promql. @brian-brazil, are there any docs for the promql internals similar to what you wrote for discovery, or Julius' internal architecture doc?
There are not; Tom's post has the code points you need to touch.
cstyan referenced this issue on Aug 16, 2018 — Merged: WIP: keep track of samples per query, set a max # of samples #4513
Running some tests to confirm which types would have to be limited by input samples to expressions/functions, but it may just be NumberLiteral or other types that are limited by the max # of steps limit. If we're going to set a default value for the max # of samples and/or make that value configurable, we should provide an explanation of how we arrived at the number and how people should go about calculating what they might want to set it to. Unfortunately my operational experience with Prometheus is still limited. @brian-brazil would you mind explaining this comment: are you saying 20M samples? I'm missing something in how you're calculating these numbers.
It's 16 bytes per sample, multiplied by 3, giving ~1GB with 20M points.
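(Spelled out: 16 bytes × 20,000,000 samples = 320 MB, and with up to three such node outputs alive at once during evaluation, 3 × 320 MB ≈ 1 GB.)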
Okay, so 16 bytes per sample, multiplied by a max of 3 nodes, and again by 20M points is ~1GB. I guess I'm curious whether you're saying 20M samples should be the max, or 1GB per query? 20M is in some cases not very many samples. If, for example, some query could return 100k series, that's only 200 samples per series - at a 15s interval, just 50 minutes of data. Does that sound reasonable to you? What would a typical query return in terms of # of series? I think the numbers we use to decide the sane default for the max samples per query should be in the docs for people who may want to configure that value themselves.
When using range vector functions, we don't load in all the data at once - just one window at a time for one series.
Sorry, I still don't entirely understand. Are you saying that we only ever load the samples for a single timestamp into memory at a time? So in the situation with 20M samples for about 1GB, we'd have 20M series as well.
That was the old design. In the new design, things like rate() work time series by time series, and within that iterate over the data. So a 5m rate only needs 5m worth of samples from one time series in memory at a time, plus space for the output. The space taken by the series themselves and their labels is not being taken into consideration here; we're just focusing on samples, as that should be good enough.
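A rough illustration of that evaluation order, using made-up types rather than Prometheus's actual ones: the working set at any moment is one series' samples inside the current window, not the whole range.

```go
package main

import "fmt"

type sample struct {
	t int64   // timestamp in seconds
	v float64 // counter value
}

// rateOverWindow computes a per-second rate from the samples inside one
// window; counter resets are ignored for brevity.
func rateOverWindow(win []sample) float64 {
	if len(win) < 2 {
		return 0
	}
	first, last := win[0], win[len(win)-1]
	return (last.v - first.v) / float64(last.t-first.t)
}

func main() {
	// One series' raw samples; a real query would repeat this per series.
	series := []sample{{0, 0}, {15, 10}, {30, 25}, {45, 30}, {60, 50}, {300, 200}, {315, 210}}
	window := int64(300) // a "5m" range
	step := int64(60)    // evaluate once per step

	for ts := window; ts <= 360; ts += step {
		// Only the samples in (ts-window, ts] are needed at this step -
		// this slice is the entire working set for the series.
		var win []sample
		for _, s := range series {
			if s.t > ts-window && s.t <= ts {
				win = append(win, s)
			}
		}
		fmt.Printf("t=%ds rate=%.3f/s (window holds %d samples)\n", ts, rateOverWindow(win), len(win))
	}
}
```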
So in the case of a range vector, were you referring to a single timestamp step when you said 'one window'?
Yes.
Okay, that makes sense, thanks. So then I guess I'm just back to the question of what seems reasonable for a default max number of samples per query.
I'm saying that we should have a default setting that allows queries up to roughly that size.
Okay, that makes sense. I'd like to have a test for this in my PR. I'll have a look around, but I'm not really seeing any existing examples of tests for queries like this.
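For what it's worth, such a test could be sketched against the hypothetical sampleBudget type from the earlier comment (not the real engine, whose test helpers may differ): a query under the budget succeeds, one over it errors.

```go
package main

import "testing"

// TestSampleBudget exercises the hypothetical sampleBudget sketch from
// earlier in this thread: charging samples under the budget succeeds,
// while exceeding it returns the too-many-samples error.
func TestSampleBudget(t *testing.T) {
	b := &sampleBudget{max: 100}
	if err := b.add(60); err != nil {
		t.Fatalf("under budget, unexpected error: %v", err)
	}
	if err := b.add(60); err == nil {
		t.Fatal("over budget, expected an error")
	}
}
```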
tomwilkie referenced this issue on Aug 23, 2018 — Merged: Limit the number of samples remote read can return. #4532
Closed in #4532
tomwilkie commented on Jul 25, 2018
We currently limit the number of steps to 11k, but if you query a very high cardinality metric we seem to be able to OOM even a very large Prometheus. I propose we limit the number of points we return (steps * timeseries).
Plumbing this through might be a little tricky though; I thought @gouthamve had some ideas?
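(To see the scale of the problem with the step cap alone, using the figures from this thread: 11,000 steps × a 100k-series metric is 1.1 billion points, which at 16 bytes per sample is roughly 17.6 GB before any evaluation overhead.)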