Add derivative functions #2569

Merged
merged 9 commits into from
May 14, 2015

jwilder (Contributor) commented May 13, 2015

This PR adds support for the derivative and non_negative_derivative aggregate functions. These are useful for calculating the rate of change of counter-style measurements. They support derivatives over fields as well as over nested aggregate functions on a field.

In the case of non_negative_derivative, negative values are dropped.

One limitation with this implementation is that if a query uses a derivative function, there can be no other fields or functions in the select clause. The parser will return an error in this case.

The functions require a field, or another aggregate function over a field, and an optional duration interval. If a duration interval is not specified and there is a group by time clause, the group by time is used. If there is no group by time clause, the interval defaults to 1 second.
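
A minimal sketch of the interval selection described above. `derivativeInterval` and its parameters are hypothetical names used for illustration, not the code in this PR, and a zero duration is assumed to mean "not specified":

```go
package main

import (
	"fmt"
	"time"
)

// derivativeInterval is a hypothetical helper, not the PR's implementation.
// explicit is the duration passed to derivative(); groupBy is the
// GROUP BY time() duration. Zero means "not specified" (an assumption).
func derivativeInterval(explicit, groupBy time.Duration) time.Duration {
	if explicit > 0 {
		return explicit // an explicit interval always wins
	}
	if groupBy > 0 {
		return groupBy // fall back to the group by time
	}
	return time.Second // raw query with no interval: default to 1 second
}

func main() {
	fmt.Println(derivativeInterval(0, 0))
	fmt.Println(derivativeInterval(0, 24*time.Hour))
	fmt.Println(derivativeInterval(time.Hour, 24*time.Hour))
}
```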

Fixes #1822 #1477

Some examples given the following data:

> select * from cpu
name: cpu
---------
time            value
2015-05-09T23:00:00Z    1
2015-05-10T23:00:00Z    2
2015-05-11T23:00:00Z    0
2015-05-12T23:00:00Z    4

Derivative normalized per day

> select derivative(value, 1d) from cpu
name: cpu
---------
time            value
2015-05-10T23:00:00Z    1
2015-05-11T23:00:00Z    -2
2015-05-12T23:00:00Z    4

Derivative normalized per hour

> select derivative(value, 1h) from cpu
name: cpu
---------
time            value
2015-05-10T23:00:00Z    0.041666666666666664
2015-05-11T23:00:00Z    -0.08333333333333333
2015-05-12T23:00:00Z    0.16666666666666666
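
For reference, the arithmetic behind the two raw-query examples can be sketched as follows. This is an illustration under assumptions, not the code in this PR: the `point` type and the `derivative` and `samplePoints` functions are hypothetical names, and skipping points with a non-positive elapsed time is my own choice. Each output value is the difference between consecutive points divided by the elapsed time measured in units of the interval.

```go
package main

import (
	"fmt"
	"time"
)

// point is a hypothetical (time, value) pair; it is not the type used in the PR.
type point struct {
	Time  time.Time
	Value float64
}

// samplePoints returns the four cpu rows from the example data.
func samplePoints() []point {
	day := func(d int) time.Time {
		return time.Date(2015, time.May, d, 23, 0, 0, 0, time.UTC)
	}
	return []point{{day(9), 1}, {day(10), 2}, {day(11), 0}, {day(12), 4}}
}

// derivative emits one point per consecutive pair of inputs: the change in
// value divided by the elapsed time expressed in units of interval.
func derivative(points []point, interval time.Duration) []point {
	var out []point
	for i := 1; i < len(points); i++ {
		prev, cur := points[i-1], points[i]
		elapsed := cur.Time.Sub(prev.Time)
		if elapsed <= 0 {
			continue // skip duplicate or out-of-order timestamps (an assumption)
		}
		rate := (cur.Value - prev.Value) / (float64(elapsed) / float64(interval))
		out = append(out, point{Time: cur.Time, Value: rate})
	}
	return out
}

func main() {
	// Same data, normalized per hour as in the second example.
	for _, p := range derivative(samplePoints(), time.Hour) {
		fmt.Println(p.Time.Format(time.RFC3339), p.Value)
	}
}
```

With a 1d interval the points are exactly one interval apart, so the result is the plain difference; with 1h each difference is divided by 24.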

Derivative over the mean value normalized per day

> select derivative(mean(value), 1d) from cpu where time > now() - 5d group by time(1d) fill(0)
name: cpu
---------
time            derivative
2015-05-09T00:00:00Z    1
2015-05-10T00:00:00Z    1
2015-05-11T00:00:00Z    -2
2015-05-12T00:00:00Z    4
2015-05-13T00:00:00Z    -4

Non-negative derivative over the mean value normalized per day

> select non_negative_derivative(mean(value), 1d) from cpu where time > now() - 5d group by time(1d) fill(0)
name: cpu
---------
time            non_negative_derivative
2015-05-09T00:00:00Z    1
2015-05-10T00:00:00Z    1
2015-05-12T00:00:00Z    4
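
The difference between the last two results can be sketched as a filter over the derivative values. `dropNegative` is a hypothetical name for illustration, not a function from this PR; it only shows the "negative values are dropped" behaviour stated in the description.

```go
package main

import "fmt"

// dropNegative is a hypothetical helper: it keeps derivative values that are
// zero or positive and drops the negative ones, preserving order.
func dropNegative(values []float64) []float64 {
	out := make([]float64, 0, len(values))
	for _, v := range values {
		if v < 0 {
			continue // a negative rate, e.g. after a counter reset, is omitted
		}
		out = append(out, v)
	}
	return out
}

func main() {
	// The derivative(mean(value), 1d) values from the example above.
	fmt.Println(dropNegative([]float64{1, 1, -2, 4, -4}))
}
```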

May be supported in the future but workaround is to run separate
queries.
Derivative must be of the form derivative(field, duration) or
derivative(agg(field), duration).
Calculates the derivative of consecutive points and normalizes the
value to a given interval.  It supports simple derivatives over
fields as well as nested derivatives over another aggregate function.

Fixes #1822
@toddboom (Contributor)

Looks pretty solid to me. Nothing immediately jumps out as crazy, so 👍

### Bugfixes
- [#2545](https://github.com/influxdb/influxdb/pull/2545): Use "value" as the field name for graphite input. Thanks @cannium.
- [#2558](https://github.com/influxdb/influxdb/pull/2558): Fix client response check - thanks @vladlopes!

## PRs

Minor, but I'm not sure this section adds anything to the CHANGELOG. The PR will already be referenced by each page for the issue links. My vote would be to remove it.

jwilder (Contributor, Author) commented May 14, 2015

All comments addressed.

@@ -642,6 +642,35 @@ type SelectStatement struct {
FillValue interface{}
}

// HasDerivative returns true if one of the fields in the statement is a

Comment appears to be incorrect.

That is "fields"?

// derivative aggregate
func (s *SelectStatement) HasDerivative() bool {
for _, f := range s.FunctionCalls() {
if strings.HasSuffix(f.Name, "derivative") {

A suggestion, not a blocker right now, but you could instead add IsDerivative() to functions.go. I added IsNumeric() recently. That way all the code for checking stuff about functions is in one file. Then this code becomes:

if f.IsDerivative() {
...
}

It would also allow the check to be more precise: an exact match on "derivative" or "non_negative_derivative".
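
A rough sketch of what that suggestion might look like. The `call` type here is a stand-in with only a `Name` field, not the real type from the parser package, and whether function names are normalized to lowercase before this check is an assumption:

```go
package main

import (
	"fmt"
	"strings"
)

// call is a hypothetical stand-in for the parser's function call type.
type call struct {
	Name string
}

// IsDerivative reports whether the call is one of the two derivative
// functions, using an exact name match rather than a suffix match.
func (c *call) IsDerivative() bool {
	switch c.Name {
	case "derivative", "non_negative_derivative":
		return true
	}
	return false
}

func main() {
	for _, name := range []string{"derivative", "non_negative_derivative", "my_derivative"} {
		c := &call{Name: name}
		// strings.HasSuffix also accepts any other name ending in "derivative".
		fmt.Println(name, c.IsDerivative(), strings.HasSuffix(name, "derivative"))
	}
}
```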

If it's not specified, it defaults to 1s for raw queries and to the
group by duration on group by queries.
return nil, fmt.Errorf("expected two arguments for percentile()")
return nil, fmt.Errorf("expected two arguments for %s()", c.Name)
}
} else if strings.HasSuffix(c.Name, "derivative") {

Oh yeah, since this is used so much, definitely worth adding IsDerivative() to functions.go. Next time.

otoolep (Contributor) commented May 14, 2015

+1, thanks for the changes.

Development

Successfully merging this pull request may close these issues.

Wire up DERIVATIVE aggregate
5 participants