Prometheus: allow step to be smaller than $__interval #19885
Not sure about this. Could this not be a fairly major breaking change and also result in a lot more data being queried than before? Min step only sets the minimum. With this change, if minStep is set, it overrides the automatic data-point limiting, so long time ranges will always hit the Prometheus max data limit and return 11K data points per series? |
I'm a bit too tired right now to look through all the implementation details. In particular, I have a really hard time reading the tests. (Where are Min step and Min time interval actually set in the tests?) In general, I like the idea that the user can independently determine what the min May I ask why there are the two separate settings? Depending on the reason, I might be able to say whether they should be independent. The mere fact that there are two different settings suggests they should be independent, so I do understand the confusion of the OP in #14209. However, the discussion there looks to me like the independence is only required because users want to set a dynamic duration in the PromQL expression that is larger than the |
Step still defaults to the dynamically calculated interval based on range and width. This PR simply allows you to dial up the resolution.
It's still adjusted automatically to respect the 11,000 points limit. |
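As a rough illustration of the behavior described in this comment, here is a minimal TypeScript sketch. The function name `chooseStep`, its parameters, and the constant name are hypothetical and not taken from the Grafana code base; the sketch only assumes that an explicitly set min step replaces the dynamic interval and that the result is then raised if it would exceed 11,000 points.

```typescript
// Hypothetical sketch, not the actual Grafana implementation.
// Assumption: Prometheus rejects range queries with more than 11,000 points per series.
const MAX_POINTS_PER_SERIES = 11000;

function chooseStep(rangeSec: number, intervalSec: number, minStepSec?: number): number {
  // Default to the dynamically calculated interval; an explicit min step
  // may be smaller than the interval (the decoupling this PR proposes).
  const requested = minStepSec !== undefined ? minStepSec : intervalSec;

  // Smallest step that keeps the query at or below the point limit.
  const safeStep = Math.ceil(rangeSec / MAX_POINTS_PER_SERIES);

  return Math.max(requested, safeStep);
}

console.log(chooseStep(3600, 15));               // no min step: falls back to the interval
console.log(chooseStep(3600, 15, 5));            // min step below the interval is honored
console.log(chooseStep(30 * 24 * 3600, 3600, 1)); // long range: raised to respect the limit
```

Under these assumptions, a very small min step on a 30-day range is raised automatically rather than producing an oversized query.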
The min interval is a datasource-independent roll-up parameter that's dynamically calculated based on the timepicker range and the screen size. It's shown below every metrics datasource, and it's up to the datasource to make use of it. It is also available in the query variable as I'm guessing the min step was introduced to set a per-query step, but the main issue now is that this per-query step is hard-bound by the "min time interval" at the bottom of the queries, making the per-query step not really per-query. |
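To make the "calculated from time range and screen size" idea concrete, here is a simplified TypeScript sketch. The name `computeInterval` and the one-data-point-per-pixel assumption are illustrative only; Grafana's real calculation also rounds to "nice" interval values, which this sketch omits.

```typescript
// Hypothetical sketch: derive a roll-up interval from the time range and panel width.
// Assumption: roughly one data point per horizontal pixel; rounding is omitted.
function computeInterval(rangeSec: number, widthPx: number, minIntervalSec: number = 0): number {
  const raw = rangeSec / widthPx;
  // The "min time interval" acts as a lower bound on the dynamic value.
  return Math.max(raw, minIntervalSec);
}

console.log(computeInterval(3600, 1200));     // 1h range on a 1200px panel
console.log(computeInterval(3600, 1200, 15)); // same, with a 15s lower bound
console.log(computeInterval(86400, 960));     // 24h range on a 960px panel
```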
Any opinions from @roidelapluie or @peergynt? |
I think the current duality of Let's take a step back and look at what people really try to accomplish with those settings and variables, in particular what @peergynt was aiming for in #14209. It starts with dashboarding the classical
Currently, there is no obvious solution with the means of Grafana and/or Prometheus as is. What comes to mind as a solution is to extend the range interval by one scrape interval. However, since you cannot do math on the range interval (neither within PromQL nor within Grafana), there is no straightforward way to implement this for dynamic ranges. (You can easily do it for a graph with a fixed range, of course.) So what do people do? Here, I believe, we have arrived at exactly the situation described in #14209. @peergynt wanted the Prometheus Thus, I think discussing different minimal values for A totally different angle to approach the problem is to fix it within Prometheus. Thinking about Prometheus being primarily a data source for dashboarding leads to claims that the Finally, to play devil's advocate, one might as well say that all of this has limited relevance because |
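One way to read the coverage problem discussed in this comment: if the range in the PromQL expression equals the step, the increase between the last sample of one window and the first sample of the next is never seen by any window. The TypeScript sketch below simulates this with made-up numbers (15s scrape interval, 60s step, samples offset by 5s). All names are hypothetical, windows are treated as closed intervals, and Prometheus' rate extrapolation is ignored.

```typescript
// Hypothetical helper: counts how many consecutive-sample deltas fall entirely
// inside at least one evaluation window [t - rangeSec, t].
function coveredDeltas(sampleTimes: number[], evalTimes: number[], rangeSec: number): number {
  let covered = 0;
  for (let i = 0; i + 1 < sampleTimes.length; i++) {
    const a = sampleTimes[i];
    const b = sampleTimes[i + 1];
    if (evalTimes.some((t) => a >= t - rangeSec && b <= t)) {
      covered++;
    }
  }
  return covered;
}

const scrapeIntervalSec = 15; // assumed scrape interval
const stepSec = 60;           // assumed query step
// 20 samples at 5s, 20s, 35s, ... (offset so they do not align with evaluation times)
const samples = Array.from({ length: 20 }, (_, k) => 5 + k * scrapeIntervalSec);
const evals = [60, 120, 180, 240, 300];

const withRangeEqualStep = coveredDeltas(samples, evals, stepSec);
const withExtendedRange = coveredDeltas(samples, evals, stepSec + scrapeIntervalSec);
console.log(withRangeEqualStep, withExtendedRange, samples.length - 1);
```

In this toy setup, extending the range by one scrape interval lets every delta between consecutive samples land in some window, which is the effect the "extend the range interval by one scrape interval" idea is after.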
My opinion is simple: there is something missing somewhere. I have tried ${2*__interval} in some dashboards, which did not work :) |
How about setting |
Regarding $__interval: I am planning to make its calculation much more transparent, so that you can see and change the formula used to calculate it.
Not everything in that PR / suggestion is ironed out; it needs more UX work and details figured out. The main purpose is to be able to set limits on the number of data points queries return in a time-range-agnostic way. Regarding wanting a step lower than $__interval: what if we added a resolution option like 2/1? That way you would get a step that is half of $__interval. But I'm not sure that solves it, as currently the resolution also impacts $__interval. |
The better explanations above make a lot of sense. But IMHO they enforce the notion that min step and min interval are (or should be) the same. (You even call it now Min interval/step.) The problem my little essay above was about is that we want a way to template a duration in PromQL that is one scrape interval longer than Assuming that I'm right and my suggestion solves the problem at hand, is there then a reason left why we have that Min step field at all? What are scenarios where it would be different from Min interval? |
The range coverage "issue" seems to be Prometheus-specific, and hence could be fixed in the datasource itself (instead of waiting for maxpoints logic), e.g., by adding a scrape interval to the dynamically calculated If we did this magically, would this be confusing? I.e., are there enough users out there who already understand the coverage issue and have manually adapted the step?
The change from "Step" to "Min step" was done here: #8073 which seems to ensure that per-query step parameters make sense for short ranges. But it seems like that issue could have been solved by adding a scrape interval to the dynamically calculated With all the above, I think we can take these steps:
Thoughts? |
Related: #11451 |
Do we have an existing field to configure the scrape interval? That would be my first concern here: Grafana cannot even know the scrape interval. And even if there were a field (or there is, and I just haven't discovered it yet), there is possibly not only one scrape interval. Every target in Prometheus can have a different scrape interval, and while it's recommended to stick to a common interval as far as possible, there are situations in practice where scrape intervals differ within an organization.
See above, I don't think Grafana can know the scrape interval. Sometimes not even in principle. But let's assume we have a scrape interval that is "usually" right, and we can configure it in the panel. Then the following question makes sense:
The answer is that we would definitely need a separate checkbox. That's mostly because users who do use a recording rule over a shorter interval and then want to do the dynamic range with The latter case (using recording rules with a short range and My suggestion is:
The new option would allow to modify
Default value for both would be 0. The user would set (1) to the scrape interval in case they are not using recording rules and want the perfect coverage as described above. In that case, they would also set (2) to something like 3x or 4x the scrape interval. A user that works with recording rules would leave (1) at 0 and would set (2) to the rule evaluation interval (which might be different from the scrape interval). (2) would not override Min interval, i.e. |
This would satisfy #11451. |
Of course, in the data source! Then the Min step parameter makes even less sense, as it now overlaps with both Scrape interval and Min interval in difficult-to-understand ways. |
This pull request has been automatically marked as stale because it has not had activity in the last 2 weeks. It will be closed in 30 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
Not going forward with this. Great discussion though @beorn7 ⭐️ |
Are there any plans to do something similar to what I suggested above? I'm pretty happy with the insights gained from this whole discussion, but how to put them into action? |
What this PR does / why we need it:
In #14209 there is a request for uncoupling step and interval. This PR treats them separately.
Current behavior:
After this change, you can have a higher resolution, independent of the dynamically calculated interval:
Open questions:
Todos: