Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom continuous query options per query rather than per node #5194

Merged
merged 1 commit into from
Dec 29, 2015

Conversation

jsternberg
Copy link
Contributor

@jsternberg jsternberg changed the title WIP issue #5136 [WIP] issue #5136 Dec 21, 2015
@jsternberg jsternberg changed the title [WIP] issue #5136 [WIP] Per continuous query options Dec 21, 2015
@@ -1403,6 +1403,13 @@ func (p *Parser) parseCreateContinuousQueryStatement() (*CreateContinuousQuerySt
}
stmt.Database = ident

if p.parseTokenMaybe(RESAMPLE) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why parseTokenMaybe here? Instead of

tok, _, _ := p.scanIgnoreWhitespace()
if tok == RESAMPLE {
    stmt.ResampleEvery, stmt.ResampleFor, err = p.parseResample()
} else {
    p.unscan()
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed it was a very common pattern in the parser to do your suggestion. While that method works, I was hoping to introduce a new convenience function that does the same thing and reduce some of the bloat and variable shadowing the current method introduces.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like the idea. I think parseOptionalToken is a better name.

s.lastRuns[cq.Name] = time.Time{}
// Remove the last run time for the CQ
if _, ok := s.lastRuns[cq.Name]; ok {
delete(s.lastRuns, cq.Name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial run of a continuous query acts differently than one that has previous been run. One that has previously been run will try to calculate the optimal time it should have run the next query (so as to calculate the oldest interval to regenerate). The initial run ignores this.

This line of code set the time to January 1, 1970 and caused the code attempting to calculate the intervals to try running every query from 1970. Instead of trying to differentiate a default value vs a real value, I just decided removing it from the map was a better solution.

@jsternberg jsternberg changed the title [WIP] Per continuous query options Custom continuous query options per query rather than per node Dec 23, 2015
@jsternberg
Copy link
Contributor Author

I think this is ready for review and (potential) merging. I have updated the changelog and the documentation in INFLUXQL.md.

@gunnaraasen @dgnorton @pauldix

return 0, 0, newParseError(tokstr(tok, lit), []string{"EVERY", "FOR"}, pos)
}
return resampleInterval, resampleMaxDuration, nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefixing variables with resample in this function seems redundant since it's in the parseResample() function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the resample prefix.

return nil
// Calculate and set the time range for the query. Go from most recent to least.
startTime := now.Add(-resampleEvery).Truncate(interval)
for ; startTime.UnixNano() >= oldestTime.UnixNano(); startTime = startTime.Add(-interval) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about doing oldestTime.Before(startTime) for the for condition?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me. I'll update the PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. It needs to be !startTime.Before(oldestTime), but it works.

@benbjohnson
Copy link
Contributor

Overall, lgtm. 👍

@jsternberg jsternberg force-pushed the js-5136-per-cq-options branch 2 times, most recently from 4a30a90 to ab73061 Compare December 28, 2015 21:02
@@ -229,6 +229,7 @@ ALTER RETENTION POLICY policy1 ON somedb DURATION 1h REPLICATION 4

```
create_continuous_query_stmt = "CREATE CONTINUOUS QUERY" query_name on_clause
[ "RESAMPLE" [ "EVERY" duration_lit ] [ "FOR" duration_lit ] ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to fix this. It allows CREATE CONTINUOUS QUERY foo ON bar RESAMPLE BEGIN...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I attempted to make this more clear, but it's very difficult to do a "at least one, but in this order" in BNF since that's not really a thing in context free grammars. It's possible in a context-free grammar, but it looks weird since you basically have to say S = EF | E | F.

This makes the following syntax possible:

    CREATE CONTINUOUS QUERY mycq ON mydb
        RESAMPLE EVERY 1m FOR 1h
        BEGIN
          SELECT mean(value) INTO cpu_mean FROM cpu GROUP BY time(5m)
        END

The RESAMPLE option customizes how often an interval will be sampled and
the duration. The interval is customized with EVERY. Any intervals
within the resampling duration on a multiple of the resample interval
will be updated with the new results from the query.

The duration is customized with FOR. This determines how long an
interval will participate in resampling.

Both options are optional. If RESAMPLE is in the syntax, at least one of
the two needs to be given. The default for both is the interval of the
continuous query.

The service also improves tracking of the last run time and the logic of
when a query for an interval should be run. When determining the oldest
interval to run for a query, the continuous query service determines
what would have been the optimal time to perform the next query based on
the last run time. It then uses this time to determine the oldest
interval that should be run using the resample duration and will
resample all intervals between this time and the current time as opposed
to potentially forgetting about the last run in an interval if the
continuous query service gets delayed for some reason.

This removes the previous config options for customizing continuous
queries since they are no longer relevant and adds a new option of
customizing the run interval. The run interval determines how often the
continuous query service polls for when it should execute a query. This
option defaults to 1s, but can be set to 1m if the least common factor
of all continuous queries' intervals is a higher value (like 1m).
@dgnorton
Copy link
Contributor

+1

jsternberg added a commit that referenced this pull request Dec 29, 2015
Custom continuous query options per query rather than per node
@jsternberg jsternberg merged commit 0931e30 into master Dec 29, 2015
@jsternberg jsternberg deleted the js-5136-per-cq-options branch December 29, 2015 19:00
@jwilder jwilder added this to the 0.10.0 milestone Feb 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants