delta function doing full vector scan #5218
Comments
IIRC we need the full scan for the extrapolation algorithm, i.e. to find out if a time series starts or stops within the range.
The length of the slice, plus the start/stop timestamps should be enough for what we need to do. What are you using delta for?
We have metrics representing "budget"; they are gauges. Our users are often interested in how much budget they spent over, e.g., the last 15 days (a single value). The query using the delta function looks very natural; however, it is rather slow in our setup. We are pretty sure we can go with a query that simply subtracts two values using offset. The question here is rather whether delta should really do a full scan, especially when the documentation says "calculates the difference between the first and last value". Cheers,
What delta does is a bit more complicated; it sounds like you want offset or increase.
We're drifting away from the issue here. As you said, "the length of the slice, plus the start/stop timestamps should be enough for what we need to do". Yet, unless I'm terribly misreading the code, we are reading all the points from disk, loading them into memory, and iterating over them (where isCounter is always false), just to use the first point, the last point, and the length of the slice. If this is simply how things work and, e.g., we cannot calculate averageDurationBetweenSamples without having the entire slice, then we can close the issue. Cheers,
bla-bu commented Feb 14, 2019
Bug Report
What did you do?
Running PromQL queries with millions of data points and using delta function.
What did you expect to see?
According to the documentation, “delta(v range-vector) calculates the difference between the first and last value of each time series element in a range vector v, returning an instant vector with the given deltas and equivalent labels.[…] delta should only be used with gauges.” However, looking at the code (functions.go), it seems that delta is doing a full scan anyway:
instead of simply taking the last value from the slice.
I would expect something like this (taken from the issue https://github.com/prometheus/prometheus/issues/3746)
Environment
System information:
Linux 4.15.0-1026-gcp x86_64
Prometheus version:
prometheus, version 2.6.0 (branch: HEAD, revision: dbd1d58)
build user: root@bf5760470f13
build date: 20181217-15:14:46
go version: go1.11.3