
Support multiple targets that act on different time windows #1739

Merged
5 commits merged into grafana:master from more_time on Aug 5, 2020

Conversation

shanson7
Collaborator

To support functions like timeShift and movingWindow, we need to support functions that can adjust the from and to on a per-context level. The archive chosen for unrelated requests should not be influenced by time ranges that were adjusted by other functions.
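
For illustration only, a rough sketch of the kind of per-context hook this means, modeled on the function Context() hooks discussed below; the FuncTimeShift type and its offset field are hypothetical:

func (s *FuncTimeShift) Context(context Context) Context {
        // shift the whole query window back by the configured offset (in seconds),
        // without affecting the contexts of other targets in the same request
        context.from -= uint32(s.offset)
        context.to -= uint32(s.offset)
        return context
}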

@shanson7 shanson7 requested a review from Dieterbe March 31, 2020 15:54
@shanson7
Collaborator Author

shanson7 commented Apr 9, 2020

Note: There are a lot of possibilities for datapoint reduction in the "soft" limit case. We can (should?) start with the largest PN Groups (largest defined as most datapoints? longest time range?) and then move on to the largest singles. Or maybe the other way around? I think the most important part is being consistent, to avoid issues with refreshing causing the graph to change (I think this is already consistent, not sure).
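
For what it's worth, roughly the kind of deterministic ordering I mean (the group type and its methods here are made up, just to illustrate):

// sort candidate groups biggest-first, with a stable tiebreak, so that
// refreshing the same query always reduces the same series in the same order
sort.Slice(groups, func(i, j int) bool {
        if groups[i].Datapoints() != groups[j].Datapoints() {
                return groups[i].Datapoints() > groups[j].Datapoints()
        }
        return groups[i].Key() < groups[j].Key()
})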

@shanson7
Collaborator Author

shanson7 commented Apr 9, 2020

Another thing to consider here is performance for queries that might act on a large number of series.

@Dieterbe
Contributor

Dieterbe commented Apr 9, 2020

To support functions like timeShift and movingWindow, we need to support functions that can adjust the from and to on a per-context level.

In NewPlan, each target has its own context.
e.g. target=foo&target=sum(movingAverage(bar*,"1min")) will result in distinct contexts that can be influenced by the respective functions: "sum" and "movingAverage" can modify the context, which will result in the bar* Req having a modified from/to, whereas the context (and from/to) for foo are unaffected.
I do already notice an interesting case here, though. Our current draft movingAverage code does this:

func (s *FuncMovingAverage) Context(context Context) Context {
        // extend the requested range backwards by the window
        context.from -= uint32(s.window)
        return context
}

This is insufficient.

window can be specified as a time but also as a number of points.
In Graphite, a number of points is respected literally. So:

| taking movingAverage(..., 10) of ... | ... results in |
| --- | --- |
| 1s raw data | 10-secondly points |
| 1min rollup data | 10-minutely points |
| summarize(bar*,"1min","sum") where bar is 1s data | 10-minutely points |

Our current context type is incapable of tracking a "from offset" expressed as a number of points, nor is PlanRequests or anything else currently equipped to handle this, and the third case makes this even harder to pursue. Perhaps your PR solves some of this, I need to dig more into it, but this looks like a big challenge unless we cut some corners.
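
To spell it out (illustrative pseudo-Go, not real code):

// a time-based window can be applied while planning, before any data is fetched:
//     context.from -= uint32(windowSeconds)
//
// a point-count window cannot, because the offset depends on the interval of
// whatever data ends up being fetched (raw vs rollup, or post-summarize):
//     context.from -= windowPoints * interval   // interval unknown at plan time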

And then there's a function like timeStack() which seems like it requires multiple from/to pairs.

@shanson7
Collaborator Author

shanson7 commented Apr 14, 2020

Our current draft movingAverage code... is insufficient. window can be specified as a time but also as a number of points. In Graphite, a number of points is respected literally

Ooh, yeah, that could prove quite difficult to support. Perhaps we could "partially" support functions based on the parameters given? Maybe some simple method to abort and proxy during the planning stage (e.g. when Context() is called). That way, we can use the native (and fast) code when possible, and otherwise proxy to Graphite.
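
Something along these lines, purely as a sketch; the error-returning Context() signature, the sentinel error, and the windowPoints/windowSeconds fields are all hypothetical:

var errProxyToGraphite = errors.New("unsupported parameters, proxy to graphite")

func (s *FuncMovingAverage) Context(context Context) (Context, error) {
        if s.windowPoints > 0 {
                // window given as a number of points: we can't know how far to extend
                // 'from' until the archive/interval is chosen, so bail out and proxy
                return context, errProxyToGraphite
        }
        // window given as a duration: extend the range backwards as before
        context.from -= uint32(s.windowSeconds)
        return context, nil
}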

@Dieterbe
Contributor

Perhaps we could "partially" support functions based on the parameters given? Maybe some simple method to abort and proxy during the planning stage (e.g. when Context() is called). That way, we can use the native (and fast) code when possible, and otherwise proxy to Graphite.

Yeah that seems like a very good idea. Especially because the most common case would be "simple" queries specifying a time window, not a number of points, which we can handle.

@shanson7
Collaborator Author

Hmm, I'm still a little stuck on reconciling:

  1. Graphite compatibility
  2. Supporting "soft" req behavior
  3. Intuitive behavior

Graphite will always choose the first retention that satisfies the entire time range. It doesn't have any special behavior for when this results in a significant amount of data. This "per-series" behavior is what this PR initially intends to emulate, but doing so in the "soft" req handling results in strange returned results.

For example, if we have a query that fetches 100 series with 10k datapoints each (raw) or 1k datapoints (next rollup) and our soft limit is 500k, this PR (at time of this comment) will return the 100 series with 44 of them at raw interval and 56 at the rollup interval.
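
To spell out the arithmetic behind that split:

44 series * 10,000 points (raw)    = 440,000
56 series *  1,000 points (rollup) =  56,000
                             total = 496,000 <= 500,000 (soft limit)

(45 series at raw would already push it to 505,000.)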

I think intuitively, I would expect them to all be "softened" in step (per target?).

@shanson7
Collaborator Author

I'm leaning toward separating reqs by target and, only in the "soft" limit case, repeatedly "coarsening" the data with the most datapoints to fetch until the soft limit is satisfied or there is nothing else we can do to reduce DPs.

In this case, if the same find expression is used in multiple targets, we may still need to fetch them separately if we need to get different archives.
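
Roughly like this (a sketch only; the helper names are made up):

// coarsen the group with the most datapoints until we're under the soft limit,
// or until nothing can be reduced further
for totalPoints(groups) > softLimit {
        g := heaviestCoarsenableGroup(groups) // nil when nothing can be reduced further
        if g == nil {
                break // accept exceeding the soft limit
        }
        g.UseNextRollup() // move that group to the next coarser archive
}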

Collaborator Author

@shanson7 shanson7 left a comment

Ultimately, I decided to leave the "soft" handling the same as it is today and only do the planning differently. I think this takes us one step closer to Graphite behavior and supports timeShift without complicating this PR too much.

@shanson7 shanson7 changed the title from "WIP - support multiple targets that act on different time windows" to "Support multiple targets that act on different time windows" on Apr 17, 2020
@shanson7
Collaborator Author

Any updates?

@Dieterbe
Contributor

Dieterbe commented Aug 3, 2020

For example, if we have a query that fetches 100 series with 10k datapoints each (raw) or 1k datapoints (next rollup) and our soft limit is 500k, this PR (at time of this comment) will return the 100 series with 44 of them at raw interval and 56 at the rollup interval.
I think intuitively, I would expect them to all be "softened" in step (per target?).

I agree that this per-single-granularity will look strange. I don't understand why we need to introduce this granularity.
If the only reason to do this is because they may have different from/to values, then why don't we just first categorize them in groups based on their from/to, and then reduce each group as a whole?
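
i.e. something like this (a sketch; assuming reqs with From/To fields, and reusing the existing reduction logic per group):

func planByWindow(reqs []Req, softLimit int) {
        // bucket requests by their (possibly function-adjusted) time range,
        // then run the existing soft-limit reduction on each bucket as a whole
        type window struct{ from, to uint32 }
        byWindow := make(map[window][]Req)
        for _, req := range reqs {
                w := window{req.From, req.To}
                byWindow[w] = append(byWindow[w], req)
        }
        for _, group := range byWindow {
                reduceAsGroup(group, softLimit) // hypothetical: the current soft-limit logic, applied per group
        }
}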

@shanson7
Collaborator Author

shanson7 commented Aug 3, 2020

I agree that this per-single-granularity will look strange. I don't understand why we need to introduce this granularity.

As I said above:

Ultimately, I decided to leave the "soft" handling the same as it is today and only do the planning differently.

@Dieterbe
Contributor

Dieterbe commented Aug 3, 2020

The stats computation in "4) send out some metrics and we're done!" needs to be updated.
Also, you might be interested in this commit: 1a1d233

@shanson7
Collaborator Author

shanson7 commented Aug 3, 2020

the stats computation in "4) send out some metrics and we're done!" needs to be updated.

Yes, likely. I'll confess I'm not entirely sure what it's doing right now. It seems like it should be indicating the number of the chosen archive as a meter, but I'm not sure why it would multiply by the number of requests. That would distort the input, right? Like, how would it differentiate between 100 reqs with archive 1 and 50 reqs with archive 2?

@Dieterbe
Contributor

Dieterbe commented Aug 3, 2020

Oops, that's a bug. We should be reporting each archive len(reqs) times (at least when they all have the same archive id), like this: #1868. Perhaps we should merge this to master, and then you can rebase on top of master?
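
i.e. roughly (illustrative only; the meter variable here is a stand-in, not the actual stats code):

// record the chosen archive once per request, so 100 reqs on archive 1 and
// 50 reqs on archive 2 show up as 100 and 50 samples respectively
for _, req := range reqs {
        chosenArchiveMeter.Value(int(req.Archive))
}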

@Dieterbe
Contributor

Dieterbe commented Aug 3, 2020

I stand corrected. We only check progress (and possibly bail out) every time we're done processing all requests in rp.single.mdpno that correspond to the same archiveID. So that makes sense.

The PR as a whole makes sense too.
When we do a PR such as for movingAverage, let's make sure to test a case like target=foo&movingAverage(foo.... to make sure we properly process that and don't mangle up the requests, etc.

@Dieterbe
Contributor

Dieterbe commented Aug 5, 2020

Left to do in this PR:

  1. Sean to review and apply (hopefully) 1a1d233
  2. Now that "fix reporting of archives" (#1868) has been merged, rebase on master and fix the stats. (If you don't have time, we can take care of the code changes.)

@shanson7
Collaborator Author

shanson7 commented Aug 5, 2020

  • Sean to review and apply (hopefully) 1a1d233
  • Now that "fix reporting of archives" (#1868) has been merged, rebase on master and fix the stats. (If you don't have time, we can take care of the code changes.)

@Dieterbe Dieterbe merged commit 3b20cae into grafana:master Aug 5, 2020
@shanson7 shanson7 deleted the more_time branch October 22, 2021 16:14