Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: multi-measurement query optimization #22301

Merged
merged 15 commits into from
Sep 13, 2021

Conversation

williamhbaker
Copy link
Contributor

@williamhbaker williamhbaker commented Aug 25, 2021

Closes #22156

This replaces the existing single measurement query optimization with an expanded version that applies to certain queries containing multiple measurements.

Background

The optimization applies for for queries where:

  • All measurements only use equality for comparison, for example r["_measurement"] == "measName".

  • One or more measurement expressions constituting a "group" of measurements are evaluated using only OR.
    For example, a query like this applies the optimization:

    from(bucket: "bucket") 
      |> range(start: -10m) 
      |> filter(fn: (r) => r["_measurement"] == "Meas-1" or r["_measurement"] == "Meas-2") 
      |> filter(fn: (r) => r["_field"] == "Field-1" or r["_field"] == "Field-2")

    ...but this does not:

    from(bucket: "bucket") 
      |> range(start: -10m) 
      |> filter(fn: (r) => r["_measurement"] == "Measurement-1" and r["_measurement"] == "Measurement-2") 
      |> filter(fn: (r) => r["_field"] == "Field-1" or r["_field"] == "Field-2")
  • There are measurements at no other place in the query than in the single group of OR'd measurements.

  • The group of OR'd measurements is applied to the query by a "top-level" AND. This ensures that all results from the query must belong to series containing one of those measurements. Put another way, there are no operators other than AND separating the group of OR'd measurements from the other conditions of the query.
    As an influxql.Expr string, such a query could look like this:

    (tag1 != 'foo' OR tag2 = 'bar') AND (_measurement = 'm0' OR _measurement = 'm1' OR _measurement = 'm2') AND (_field = 'val1' OR _field = 'val2')

    ...but not this:

    (tag1 != 'foo' OR tag2 = 'bar') OR (_measurement = 'm0' OR _measurement = 'm1' OR _measurement = 'm2') AND (_field = 'val1' OR _field = 'val2')

    Note that a query consisting only of a group of OR'd measurements will apply the optimization as well:

    (_measurement = 'm0' OR _measurement = 'm1' OR _measurement = 'm2')

The code for determining if the optimization applies and what measurement names are contained in the query uses traversal of the influxql.Expr AST, as did the previous single measurement optimization. It locates subtrees consisting entirely of binary expressions with OR operators (or parens) where the leaf nodes are EQ comparisons between _measurement and the string value of the measurement. These subtrees must be accessible from the head of the tree by exclusively traveling through binary expressions with AND operators (or parens). If there is more than one such subtree, the optimization cannot be applied.

Performance Improvement

For high-cardinality data spanning across several shards, the performance increase from this optimization is substantial. I tested locally using a dataset spanning 1 year (~52 shards) with 60,000 unique measurements containing 6,000,000 lines with each line having 20 fields (comparable to the data reported in #22156).

The following flux query was used, with the time range encompassing the entire dataset:

from(bucket: "benchmark_db") 
  |> range(start: 2019-01-01) 
  |> filter(fn: (r) => r["_measurement"] == "Measurement-1" or r["_measurement"] == "Measurement-2" or r["_measurement"] == "Measurement-3" or r["_measurement"] == "Measurement-4" or r["_measurement"] == "Measurement-5") 
  |> filter(fn: (r) => r["_field"] == "Field-1" or r["_field"] == "Field-2" or r["_field"] == "Field-3" or r["_field"] == "Field-4" or r["_field"] == "Field-5")

Pre-optimization, the average time for a single query was around ~20 seconds:

{
   "count":7,
   "max":24907.185157,
   "maxRate":0.040149057137392204,
   "mean":21416.873211857142,
   "meanRate":0.04669215669850277,
   "min":15555.256859,
   "minRate":0.06428694871865247,
   "sum":149.91811248300002
}

With the optimization in place, the query time was reduced to an average of ~55 milliseconds:

{
   "count":533,
   "max":252.204301,
   "maxRate":3.9650394384035508,
   "mean":56.5385272007505,
   "meanRate":17.6870542886502,
   "min":35.950905,
   "minRate":27.815711454273544,
   "sum":30.135034998000023
}

The optimized query time is comparable to an equivalent InfluxQL query, which averages around 35 milliseconds:

{
   "count":862,
   "max":107.6885,
   "maxRate":9.286042613649554,
   "mean":34.82083440023203,
   "meanRate":28.718438751523326,
   "min":23.437095,
   "minRate":42.66740395940709,
   "sum":30.01555925300001
}

@williamhbaker williamhbaker marked this pull request as ready for review August 25, 2021 18:27
@lesam
Copy link
Contributor

lesam commented Aug 25, 2021

I have quickly looked through this change. While I think it is good, I think it might be worth spending a bit more time understanding why the pre-optimization case was so bad - there might be better places to put this logic than right at the top level.

Discussing with @wbaker85 offline about it.

@williamhbaker williamhbaker force-pushed the wb-multi-measurement-optimization branch from e5842f4 to 542ee7c Compare August 30, 2021 18:51
tsdb/shard.go Outdated Show resolved Hide resolved
tsdb/shard.go Show resolved Hide resolved
tsdb/shard.go Outdated
// subtree containing the OR'd measurements accessible from root of the tree
// either directly (tree contains nothing but OR'd measurements) or by
// traversing AND binary expression nodes.
func (v measurementEvaluator) Visit(node influxql.Node) influxql.Visitor {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make the logic better to have two visitors? You could have one visitor responsible for finding a potential head ( the LHS or RHS of an AND node that was not itself an AND node) and a second visitor responsible for finding if a given head is valid (composed of only measurement EQ conditions) or fatal (some measurement conditions and some non-measurements).

Separating the complicated algorithm into two simple algorithms would probably make the whole thing more clear when we read this code in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I split it up and I think it did make it easier to understand. Check it out and see what you think!

Copy link
Contributor

@lesam lesam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments.

@williamhbaker williamhbaker force-pushed the wb-multi-measurement-optimization branch 2 times, most recently from 19f4ce5 to ca4dee2 Compare August 30, 2021 22:06
tsdb/shard.go Show resolved Hide resolved
tsdb/shard.go Outdated
return v
}

if name, ok := measurementNameFromBinary(n, v.measurementKey); ok {
Copy link
Contributor

@lesam lesam Aug 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a test for something this misses? e.g.

(r._name = "foo" || r._name = "bar") && (r._value = r._name)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some updates that we discussed which should alleviate this concern. Since an expression like (_name = 'm0' OR _name = 'm1' OR _name = 'm2') AND (tag1 != 'foo' OR _name = 'm1') should be able to apply the optimization (we know that the values must come from m0, m1, or m2, even though m1 occurs on the left and right side of the AND), I updated the code to only look for a single group of exclusive OR'd measurements, and not invalidate the optimization if a measurement occurs elsewhere.

This simplifies looking for measurements anywhere else in the query, since it basically doesn't matter. I renamed this funciton to measurementNameFromEqBinary and added clarification that it only returns measurement names for EQ operations in the from of _measurement = name. For an EQ that has anything else, the tree will be invalidated.

@williamhbaker williamhbaker force-pushed the wb-multi-measurement-optimization branch from ca4dee2 to bbb3e67 Compare August 31, 2021 20:19
tsdb/shard.go Outdated
// A BinaryExpr must have an operation of OR or EQ in a valid tree
if n.Op == influxql.OR {
// The children of ORs must be either BinaryExprs themselves, or Parens
if binaryOrParen(n.LHS) && binaryOrParen(n.RHS) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need this anymore - you can just return v now, if not binary or paren then the visitor will return false (hence invalid) anyway.

tsdb/shard.go Show resolved Hide resolved
Copy link
Contributor

@lesam lesam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two minor comments and I think this is good to go.

@lesam
Copy link
Contributor

lesam commented Sep 1, 2021

This looks good to me now, but a flux test failed...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate flux vs. influxql performance with multiple measurement filters
3 participants