Support var_pop, var_samp, stddev_pop and stddev_samp etc in sql #7801
The test failure with SQL-compatible null handling looks like it may be legitimate:
https://travis-ci.org/apache/incubator-druid/jobs/539273414#L5703 You should be able to run tests locally in this mode with |
Hi @clintropolis, thanks for the suggestion. I updated the code and rebased on master.
overall lgtm 👍
docs/content/querying/sql.md
@@ -132,6 +132,13 @@ Only the COUNT aggregation can accept DISTINCT.
|`APPROX_QUANTILE_DS(expr, probability, [k])`|Computes approximate quantiles on numeric or [Quantiles sketch](../development/extensions-core/datasketches-quantiles.html) exprs. The "probability" should be between 0 and 1 (exclusive). The `k` parameter is described in the Quantiles sketch documentation. The [DataSketches extension](../development/extensions-core/datasketches-extension.html) must be loaded to use this function.|
|`APPROX_QUANTILE_FIXED_BUCKETS(expr, probability, numBuckets, lowerLimit, upperLimit, [outlierHandlingMode])`|Computes approximate quantiles on numeric or [fixed buckets histogram](../development/extensions-core/approximate-histograms.html#fixed-buckets-histogram) exprs. The "probability" should be between 0 and 1 (exclusive). The `numBuckets`, `lowerLimit`, `upperLimit`, and `outlierHandlingMode` parameters are described in the fixed buckets histogram documentation. The [approximate histogram extension](../development/extensions-core/approximate-histograms.html) must be loaded to use this function.|
|`BLOOM_FILTER(expr, numEntries)`|Computes a bloom filter from values produced by `expr`, with `numEntries` maximum number of distinct values before false positve rate increases. See [bloom filter extension](../development/extensions-core/bloom-filter.html) documentation for additional details.|
|`VAR_POP(expr)`|Computes variance population of `expr`.|
For these functions, could you please indicate in the doc that the druid-stats extension needs to be loaded, and link to it: [stats extension](../development/extensions-core/stats.html)?
Thanks for the review, @clintropolis. Updated the doc.
thanks! lgtm 👍
extensions-core currently has a `stats` module that can compute population and sample variance and population and sample standard deviation. We could expose these in SQL through Calcite's var_pop, var_samp, stddev_pop, stddev_samp, etc. operators. The main modifications are:
BaseVarianceSqlAggregator
VarianceSqlAggregatorTest
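
For readers unfamiliar with the population/sample distinction behind these operators, here is a minimal sketch of the arithmetic. It is a plain two-pass computation written for illustration only; it is not the code in the `stats` module or in `BaseVarianceSqlAggregator`. The class name `VarianceSketch` and its method names are hypothetical, and throwing on too few values is an assumption of this sketch rather than the behavior of the SQL functions.

```java
// Illustrative only: a plain two-pass computation, NOT the druid-stats implementation.
// VarianceSketch and its method names are hypothetical.
final class VarianceSketch {
    private VarianceSketch() {}

    // Sum of squared deviations from the mean; shared by both variance flavors.
    static double sumSquaredDeviations(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        double mean = sum / xs.length;
        double ss = 0.0;
        for (double x : xs) {
            double d = x - mean;
            ss += d * d;
        }
        return ss;
    }

    // Population variance: divide by n.
    static double varPop(double[] xs) {
        if (xs.length < 1) {
            // Assumption of this sketch: reject empty input instead of returning a null-like value.
            throw new IllegalArgumentException("need at least one value");
        }
        return sumSquaredDeviations(xs) / xs.length;
    }

    // Sample variance: divide by n - 1 (Bessel's correction).
    static double varSamp(double[] xs) {
        if (xs.length < 2) {
            // Assumption of this sketch: reject input with fewer than two values.
            throw new IllegalArgumentException("need at least two values");
        }
        return sumSquaredDeviations(xs) / (xs.length - 1);
    }

    // Standard deviations are the square roots of the matching variances.
    static double stddevPop(double[] xs) {
        return Math.sqrt(varPop(xs));
    }

    static double stddevSamp(double[] xs) {
        return Math.sqrt(varSamp(xs));
    }
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the sum of squared deviations is 32, so `varPop` gives 32/8 = 4 and `varSamp` gives 32/7. The only difference between the two flavors is the divisor.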