interpolate() should incorporate samples immediately before or after the query range, when available #6437

Open
mactyr opened this issue Jun 7, 2024 · 4 comments
Labels: bug, metricsql


mactyr commented Jun 7, 2024

Describe the bug

The current implementation of interpolate() gives inconsistent results depending on the query time range. For instance, suppose I have a time/value series like this (leaving blank lines to emphasize gaps):

1:00 0
1:01 1

1:03 3
1:04 4

1:06 6
1:07 7

If I run interpolate() on a query from 1:00-1:07 with a step of 1m, I'll get a value of 2 at 1:02 and 5 at 1:05, as expected. But if I run a range query from 1:02-1:05, I will see no value for 1:02 or 1:05, even though the information is available outside the range to interpolate what the values at the ends of the range should be.
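
To make the inconsistency concrete, here is a minimal Python sketch of the behavior described above. It is a toy model, not VictoriaMetrics code: the function name `interpolate_in_window` and the representation of timestamps as minutes after 1:00 are assumptions made for illustration only.

```python
# Toy model (hypothetical, not VictoriaMetrics code).
# Keys are minutes after 1:00; 1:02 and 1:05 are missing.
SAMPLES = {0: 0.0, 1: 1.0, 3: 3.0, 4: 4.0, 6: 6.0, 7: 7.0}


def interpolate_in_window(samples, start, end, step=1):
    """Linearly fill gaps using only samples that fall inside [start, end]."""
    grid = list(range(start, end + 1, step))
    values = [samples.get(t) for t in grid]
    known = [i for i, v in enumerate(values) if v is not None]
    # Fill only between two known points; leading/trailing gaps stay empty.
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            frac = (i - left) / (right - left)
            values[i] = values[left] + frac * (values[right] - values[left])
    return dict(zip(grid, values))


wide = interpolate_in_window(SAMPLES, 0, 7)    # 1:00-1:07
narrow = interpolate_in_window(SAMPLES, 2, 5)  # 1:02-1:05
print(wide[2], wide[5])      # gaps are filled in the wide window
print(narrow[2], narrow[5])  # edge gaps stay empty in the narrow window
```

In the wide window both gaps sit between known points and get filled; in the narrow window the same timestamps are at the edges, so there is nothing on one side to interpolate from.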

I expected interpolate() to use samples before or after the query window to complete the interpolation, similar to how rate() does, but it appears that it doesn't.

To Reproduce

Zoom into an interpolated query_range so that there are gaps in the source data at the start and/or end of the range, and observe that the interpolated series has values at the edges of the zoomed-in region when zoomed out, but not when zoomed in. E.g., in this screenshot, the interpolated series (blue) has a value at 2024-01-20 12:00 and 2024-01-23 00:30 (where the cursor is):

Screenshot from 2024-06-07 10-31-18

But if I zoom in so those times are at the edges of the time range, the interpolated series has no values at those times (step is 30m in both screenshots):

Screenshot from 2024-06-07 10-32-59

In this example, the gaps at the edges are only one step long, but they can be longer if the gap in the underlying samples is longer.

Version

$ ./victoria-metrics-prod --version
victoria-metrics-20230407-010036-tags-v1.90.0-0-gb5d18c0d2

Logs

No response

Screenshots

No response

Used command-line flags

-retentionPeriod=120 -dedup.minScrapeInterval=1s \
-search.disableCache -search.maxPointsPerTimeseries=100000 -search.maxPointsSubqueryPerTimeseries=600000 -search.latencyOffset=0s -search.maxUniqueTimeseries=1000000 -search.maxSeries=1000000

Additional information

Note that this is not a request to revert #3816. If there is no sample available before or after the query range, then it is appropriate for the interpolated series to have no values at that end of the query window. This request only concerns when there is a sample before and/or after the query window that can be used to interpolate the values at the beginning or end of the time range.
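
The requested behavior can be sketched as follows. This is only an illustration under stated assumptions: the function `interpolate_with_neighbors` is hypothetical, the series is held in memory as a dict of minute offsets, and a real implementation would presumably bound how far outside the window it looks.

```python
from bisect import bisect_left

# Toy series (hypothetical): minutes after 1:00 -> value.
SAMPLES = {0: 0.0, 1: 1.0, 3: 3.0, 4: 4.0, 6: 6.0, 7: 7.0}


def interpolate_with_neighbors(samples, start, end, step=1):
    """Fill gaps in [start, end] using the nearest samples on each side,
    even when those samples lie outside the query window."""
    times = sorted(samples)
    result = {}
    for t in range(start, end + 1, step):
        if t in samples:
            result[t] = samples[t]
            continue
        i = bisect_left(times, t)
        if i == 0 or i == len(times):
            # No sample exists on one side: leave the point empty.
            result[t] = None
            continue
        t0, t1 = times[i - 1], times[i]
        frac = (t - t0) / (t1 - t0)
        result[t] = samples[t0] + frac * (samples[t1] - samples[t0])
    return result


filled = interpolate_with_neighbors(SAMPLES, 2, 5)
no_neighbors = interpolate_with_neighbors({3: 3.0, 4: 4.0}, 2, 5)
print(filled)        # edges filled from samples at 1:01 and 1:06
print(no_neighbors)  # edges stay empty when no outside sample exists
```

The second call shows the case this issue does not ask to change: with no sample before or after the window, the edges remain empty.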

mactyr added the bug label Jun 7, 2024

mactyr commented Jun 12, 2024

@hagen1778, do you have a sense whether this is actionable? It's currently causing embarrassingly incomplete/incorrect charts in our customer-facing web application, so I'd like to understand whether a fix might be forthcoming or if I need to hunt for workarounds. 😅

I've noticed that keep_last_value() and keep_next_value() also don't appear to utilize samples outside of the query window (if I'm reading the code correctly). I don't use those functions in my application so that's not an immediate problem for me, but I would have expected them to look outside the query window in the same way I expected interpolate() to.

hagen1778 (Collaborator) commented

Hello @mactyr! In the current implementation, all transform functions have no knowledge about datapoints outside the interval. I don't know yet if this is actionable - I need feedback from @valyala who originally added this function.

hagen1778 (Collaborator) commented

Hey @mactyr! It is unlikely we're going to change transform functions to capture data before and/or after the requested interval in the same fashion as rollup functions do. This could be a severe change.
But I think it could be possible to use data samples on the requested interval in order to extrapolate missing data points. It should work perfectly with a monotonically increasing counter like the one in your example, but will be only partially correct for other cases. wdyt?


mactyr commented Jun 13, 2024

Thanks for offering that change, @hagen1778, but extrapolating from the samples within the interval definitely isn't safe with our real data, where the rate of change varies quite a bit over time.
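
A small sketch of why extrapolation is risky when the rate of change varies. The numbers are made up for illustration (they are not our real data), and both helper functions are hypothetical.

```python
def extrapolate_linear(p0, p1, t):
    """Extend the line through p0 and p1 out to time t (t beyond p1)."""
    (t0, v0), (t1, v1) = p0, p1
    slope = (v1 - v0) / (t1 - t0)
    return v1 + slope * (t - t1)


def interpolate_linear(p0, p1, t):
    """Value at time t on the line between p0 and p1 (t between them)."""
    (t0, v0), (t1, v1) = p0, p1
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Made-up series: steady until 1:04, then a jump before the 1:06 sample.
inside_a, inside_b = (3, 3.0), (4, 4.0)  # samples inside the window
outside = (6, 20.0)                      # sample just after the window

guess = extrapolate_linear(inside_a, inside_b, 5)
actual = interpolate_linear(inside_b, outside, 5)
print(guess, actual)  # the extrapolated guess misses the jump
```

Extrapolation only sees the slope inside the window, so it cannot account for a change in rate that the next real sample would reveal.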

My first impulse is to ask if there could be a version of interpolate that is a rollup (like interpolate_rollup()?) but I see how it's a little unclear what the lookbehind window would mean for that function, which is probably why @valyala chose to implement interpolate() as a transform in the first place.

So my next question is whether you all could take another look at #3079 and see whether that might be actionable? What I'm really trying to do here is just get the increase() without assuming that the counter had a value of 0 if there's no sample found prior to the query window. Because I couldn't get the result I needed with the existing family of "increase" functions, I settled on the workaround of doing ascent_over_time(interpolate(metric{})) instead, which usually works well enough but has the problem described in this issue. If I had an increase-like function that worked as desired, I think I could get away from relying on interpolate() so much.
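
To show how the missing edge values affect that workaround, here is a toy sketch. It assumes ascent_over_time() behaves roughly like a sum of positive deltas between consecutive points; the `ascent` helper below is a hypothetical stand-in, not the MetricsQL implementation.

```python
def ascent(values):
    """Sum of positive deltas between consecutive present values.
    Assumed approximation of ascent_over_time(); None means no value."""
    present = [v for v in values if v is not None]
    return sum(max(b - a, 0.0) for a, b in zip(present, present[1:]))


# Window 1:02-1:05 from the example series in the issue description.
edges_missing = [None, 3.0, 4.0, None]  # current interpolate() result
edges_filled = [2.0, 3.0, 4.0, 5.0]     # result if outside samples were used

print(ascent(edges_missing))  # only the 1:03 -> 1:04 rise is counted
print(ascent(edges_filled))   # the full rise across the window is counted
```

With the edges missing, the rise between the window boundary and the first or last real sample is silently dropped, which is what makes the resulting charts look incomplete.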
