delta, rate, irate over-report by up to 100% #3086
Comments
This behaviour is correct, see https://www.youtube.com/watch?v=67Ulrq6DxwA You should also not use delta, you want rate, irate or increase here.
brian-brazil closed this on Aug 17, 2017
If you look at the test diff I sent, the difference from the test directly above it is that I'm asking for a delta on a range that doesn't land exactly on a sample time. Almost all the other tests run exactly on a sample point, which doesn't exercise the extrapolation code. In real-world usage, I've never seen PromQL return results synchronized with samples. Also, as the bug title states, rate and irate similarly report incorrect results.

Here's what this looks like from a user's point of view. We have a gauge holding the number of open file descriptors. It isn't easy to see in this snapshot, but the value is almost always 123, sometimes down to 122, sometimes up to 124.

Now take a delta: the delta always shows increments and decrements of 2. A change of that size never occurs in the gauge graph. irate also shows increments scaled by a factor of 2. These graphs are taken from Prometheus 1.6.1's graph interface.

The video you linked is 40 minutes long. If you'd like to reference a particular part, I'm happy to take a look. However, the results we're seeing produce misleading graphs, and clearly impossible graphs are how I discovered the issue.

In the short term, we've started applying correction factors to the graphs, e.g. dividing by 2 when using 2m aggregation, dividing by 1.25 when using 5m aggregation, etc.
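The doubling described in this comment can be reproduced with a small model. What follows is a minimal sketch, not the Prometheus source: `extrapolated_delta` is a hypothetical helper, samples on either window boundary are assumed to count as inside the window, and the 1.1 threshold and half-gap fallback are assumptions chosen to mimic extrapolation to the window boundaries.

```python
def extrapolated_delta(samples, eval_time, window):
    """Delta over [eval_time - window, eval_time], scaled to the full window.

    samples: list of (timestamp_seconds, value) tuples sorted by timestamp.
    Returns None when fewer than two samples fall inside the window.
    """
    start = eval_time - window
    in_range = [(t, v) for t, v in samples if start <= t <= eval_time]
    if len(in_range) < 2:
        return None

    (t_first, v_first), (t_last, v_last) = in_range[0], in_range[-1]
    raw_delta = v_last - v_first
    sampled = t_last - t_first                 # time actually covered by samples
    avg_gap = sampled / (len(in_range) - 1)    # average spacing between samples
    threshold = avg_gap * 1.1                  # assumed extrapolation threshold

    to_start = t_first - start                 # uncovered time at the window start
    to_end = eval_time - t_last                # uncovered time at the window end

    # Extend the covered interval toward each boundary (assumed behaviour):
    # all the way if the gap is small, otherwise by half an average gap.
    target = sampled
    target += to_start if to_start < threshold else avg_gap / 2
    target += to_end if to_end < threshold else avg_gap / 2

    return raw_delta * target / sampled


# Gauge scraped every 60s; the value steps from 123 to 124 at t=120.
gauge = [(0, 123), (60, 123), (120, 124), (180, 124)]

aligned = extrapolated_delta(gauge, eval_time=120, window=120)    # lands on a sample
unaligned = extrapolated_delta(gauge, eval_time=150, window=120)  # between samples
print(aligned, unaligned)
```

Under these assumptions the aligned query sees three samples spanning the whole window and reports the true change of 1, while the unaligned query sees only two samples spanning 60s and scales their difference up to the 120s window, reporting 2.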
The entire talk is relevant, as it is far more likely that you are misunderstanding the behaviour rather than there being a bug.
lock bot commented Mar 23, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.


vgough commented Aug 17, 2017 • edited
What did you do?
delta(x[2m]) when collecting at 1m intervals shows 2x the expected values. The over-reporting appears to depend on the ratio of the aggregation window to the collection interval. If you aggregate over a 5m window, the over-reporting drops to 25%. With a 10m aggregation window, it drops to 11%.
Can also be seen using this test:
Seems like a problem with how sample bounds impact the delta computation: https://github.com/prometheus/prometheus/blob/master/promql/functions.go#L117
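The percentages reported above are consistent with a simple ratio. As a rough sketch, assume that a query window of length W that is not aligned with scrape times contains samples spanning only W minus one scrape interval, and that the observed difference is then scaled up to the full W. The function name `over_report_factor` is hypothetical and this is a model of the reported numbers, not the code linked above.

```python
def over_report_factor(window_s, scrape_interval_s):
    """Ratio of the full window to the span covered by samples inside it.

    Assumes an unaligned window, so the samples inside it span
    window_s - scrape_interval_s seconds.
    """
    sampled = window_s - scrape_interval_s
    if sampled <= 0:
        raise ValueError("window must be longer than the scrape interval")
    return window_s / sampled


# Windows of 2m, 5m and 10m at a 1m scrape interval.
for window in (120, 300, 600):
    factor = over_report_factor(window, 60)
    print(f"{window // 60}m window: factor {factor:.2f}, "
          f"over-report {100 * (factor - 1):.0f}%")
```

With a 1m scrape interval this gives factors of 2, 1.25 and about 1.11 for 2m, 5m and 10m windows, matching the 100%, 25% and 11% figures in the report.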
Environment
Linux, multiple versions, distributions, kernels.
production versions of prometheus, and github master.