Summation received in with sum_over_time is different from actual values #1876
Comments
Can you share the raw data that you're working off? I see nothing implausible here based on what you've provided so far.
Thanks for the reply, Brian.
I need basically
brian-brazil added the kind/question label Aug 9, 2016
Your questions made me curious enough to check the values without the aggregation function. It looks like values are getting added at every intersection of time ranges. Below are the raw values from the requested query for two different times (10:00 and 11:00).
Below is an image of the graph panel using the max_over_time function to show the spike at the intersection.
You're not allowing slack in your ranges for oddness, so this sort of thing will happen. What exactly are you monitoring? A timestamp that exact is unnatural.
I am using the Prometheus query language and alerting on data values in the DB. Because some filters are associated with the data, I am using the sum_over_time function to get the sum over an hour, which somehow also counts a second that belongs to the next range.
These are the semantics of the language; this is not the way you're meant to ingest this sort of data. It sounds like you should be using direct instrumentation and a counter instead.
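For readers wondering why a counter sidesteps the problem, here is a rough sketch in plain Python. The `Counter` class is a toy stand-in rather than a real client library, the scrape times are hypothetical, and the event counts 9 and 8 are borrowed from the report purely for illustration. `simple_increase` is a bare difference between two scrapes; Prometheus's own `increase()` additionally handles counter resets and extrapolation.

```python
class Counter:
    """Toy stand-in for a monotonically increasing counter (not a real client library)."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters can only go up")
        self.value += amount


events = Counter()
scraped = {}  # hypothetical scrape time (hour of day) -> cumulative counter value

scraped[9] = events.value   # nothing counted yet
events.inc(9)               # assumed: nine events between 09:00 and 10:00
scraped[10] = events.value
events.inc(8)               # assumed: eight events between 10:00 and 11:00
scraped[11] = events.value


def simple_increase(scraped, start, end):
    """Difference between two cumulative readings (simplified, no reset handling)."""
    return scraped[end] - scraped[start]


per_hour = [simple_increase(scraped, 9, 10), simple_increase(scraped, 10, 11)]
print(per_hour)  # [9.0, 8.0]
```

Because each hourly figure is a difference of cumulative readings, the reading at 10:00 is the end of one hour and the start of the next without being added twice.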
Thanks for the suggestion. You can close the issue if you feel that's not the idea behind range vectors.
We've discussed it before, and inclusive ranges are the right behaviour. In any case we'd not be able to consider changing this until 2.0.
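To illustrate how an inclusive range can produce the reported 9 and 17, here is a small simulation in plain Python. It is only a sketch: the raw timestamps are not shown in this thread, so the two samples are assumed to sit exactly on the evaluation times from the queries, and `sum_over_window` is a hypothetical helper, not Prometheus code. The `inclusive_start=False` variant is there for contrast only.

```python
from datetime import datetime, timedelta

# Assumed samples: timestamps placed exactly on the evaluation times
# (the thread does not show the real ones); values 9 and 8 are from the report.
samples = [
    (datetime(2016, 8, 7, 10, 0, 0, 781000), 9.0),
    (datetime(2016, 8, 7, 11, 0, 0, 781000), 8.0),
]


def sum_over_window(samples, eval_time, window, inclusive_start=True):
    """Sum values with timestamps in [eval_time - window, eval_time].

    With inclusive_start=False the lower bound is excluded instead.
    """
    start = eval_time - window
    total = 0.0
    for ts, value in samples:
        after_start = ts >= start if inclusive_start else ts > start
        if after_start and ts <= eval_time:
            total += value
    return total


hour = timedelta(hours=1)
eval_times = [ts for ts, _ in samples]

inclusive = [sum_over_window(samples, t, hour) for t in eval_times]
half_open = [sum_over_window(samples, t, hour, inclusive_start=False) for t in eval_times]

print(inclusive)  # [9.0, 17.0]
print(half_open)  # [9.0, 8.0]
```

Under these assumptions the sample stamped 10:00:00.781 falls on the closing edge of the first window and the opening edge of the second, so an inclusive range counts it in both.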
brian-brazil closed this Aug 9, 2016
lock bot commented Mar 24, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

rahulghanate commented Aug 8, 2016
What did you do?
When I query the DB for a metric without the sum_over_time function, it shows the value correctly, whereas the same query with sum_over_time gives a different result.
But if I query a different period of time for the same metric, it shows the correct value even with sum_over_time.
I am pushing values to Prometheus with the help of a Pushgateway server, and all values are getting pushed correctly.
What did you expect to see?
The value returned by the sum_over_time function should match the values stored at the individual times.
What did you see instead? Under which circumstances?
The value returned by sum_over_time is wrong for certain events (20 percent of all values in a day).
The query results below cover one period, but the problem is seen at many times of the day.
Environment
Test environment
Linux 3.2.0-80-generic x86_64
Query without sum_over_time
http://prometheus.mytest.com/api/v1/query_range?query=sum by (label_1, label_2,label_3) (metric1{label_1 ="2344461436",label_2 = "288364153", job="PerfData"})&start=2016-08-07T10:00:00.781Z&end=2016-08-07T11:05:00.781Z&step=1h
Query with sum_over_time
http://prometheus.mytest.com/api/v1/query_range?query=sum by (label_1, label_2,label_3) (sum_over_time(metric1{label_1 ="2344461436",label_2 = "288364153", job="PerfData"}[1h]))&start=2016-08-07T10:00:00.781Z&end=2016-08-07T11:05:00.781Z&step=1h
Note that the second entry in the result above comes back as 17 (it should be 8), while the first value in the vector (9) is the same in both query results.
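For anyone trying to reproduce this, the two query URLs above can be assembled with proper URL encoding using only the Python standard library. This is a sketch: the host, label values and time range are copied from the report, while `query_range_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Host and endpoint as given in the report.
BASE = "http://prometheus.mytest.com/api/v1/query_range"


def query_range_url(expr, start, end, step):
    """Build a query_range URL with the expression and times percent-encoded."""
    params = {"query": expr, "start": start, "end": end, "step": step}
    return BASE + "?" + urlencode(params)


selector = 'metric1{label_1="2344461436",label_2="288364153",job="PerfData"}'
raw = f"sum by (label_1, label_2, label_3) ({selector})"
summed = f"sum by (label_1, label_2, label_3) (sum_over_time({selector}[1h]))"

start = "2016-08-07T10:00:00.781Z"
end = "2016-08-07T11:05:00.781Z"

for expr in (raw, summed):
    print(query_range_url(expr, start, end, "1h"))
```

Encoding the expression keeps the braces, quotes and spaces in the selector from being mangled by the HTTP client or a proxy.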
Another strange behaviour appears when I reduce the step size to the minimum.
I have multiple values spread over 5 minutes at the start of the hour, yet sum_over_time returns a single value for the 10th hour and a different value for the 11th hour.
Query without sum_over_time by smaller steps
http://prometheus.mytest.com/api/v1/query_range?query=sum by (label_1, label_2,label_3) (metric1{label_1 ="2344461436",label_2 = "288364153", job="PerfData"})&start=2016-08-07T10:00:00.781Z&end=2016-08-07T11:05:00.781Z&step=1m