Support var_pop, var_samp, stddev_pop and stddev_samp etc in sql #7801

Merged: 5 commits, Jun 10, 2019

xueyumusic
Contributor

The stats module in extensions-core can already compute population and sample variance and population and sample standard deviation. This PR exposes them in SQL through Calcite's VAR_POP, VAR_SAMP, STDDEV_POP, STDDEV_SAMP and related operators.
The main modifications are:

  1. add BaseVarianceSqlAggregator
  2. add tests VarianceSqlAggregatorTest
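
For readers unfamiliar with the difference between the population and sample variants, here is a minimal sketch of what the four functions compute on a plain array of doubles. It is only an illustration: the class name `VarianceSketch` and its methods are hypothetical, and this is not the code of `BaseVarianceSqlAggregator` or of the stats module.

```java
// Hypothetical illustration, not Druid code.
public final class VarianceSketch {
    // Population variance: sum of squared deviations divided by n.
    static double varPop(double[] xs) {
        return sumSquaredDeviations(xs) / xs.length;
    }

    // Sample variance: divided by n - 1 (Bessel's correction).
    // With fewer than two values this plain-double sketch yields NaN.
    static double varSamp(double[] xs) {
        return sumSquaredDeviations(xs) / (xs.length - 1);
    }

    // Standard deviations are the square roots of the variances.
    static double stddevPop(double[] xs) {
        return Math.sqrt(varPop(xs));
    }

    static double stddevSamp(double[] xs) {
        return Math.sqrt(varSamp(xs));
    }

    // Two-pass computation: first the mean, then the squared deviations.
    private static double sumSquaredDeviations(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        double mean = sum / xs.length;
        double ss = 0.0;
        for (double x : xs) {
            ss += (x - mean) * (x - mean);
        }
        return ss;
    }

    public static void main(String[] args) {
        double[] xs = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println("var_pop     = " + varPop(xs));
        System.out.println("var_samp    = " + varSamp(xs));
        System.out.println("stddev_pop  = " + stddevPop(xs));
        System.out.println("stddev_samp = " + stddevSamp(xs));
    }
}
```

For the sample data above the mean is 5 and the squared deviations sum to 32, so the population variance is 32/8 and the sample variance is 32/7.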

@clintropolis
Member

The test failure with SQL-compatible null handling looks like it may be legitimate:

Failed tests: 
org.apache.druid.query.aggregation.variance.sql.VarianceSqlAggregatorTest.testStdDevPop(org.apache.druid.query.aggregation.variance.sql.VarianceSqlAggregatorTest)
  Run 1: VarianceSqlAggregatorTest.testStdDevPop:339
  Run 2: VarianceSqlAggregatorTest.testStdDevPop:339
  Run 3: VarianceSqlAggregatorTest.testStdDevPop:339
  Run 4: VarianceSqlAggregatorTest.testStdDevPop:339
...

https://travis-ci.org/apache/incubator-druid/jobs/539273414#L5703

You should be able to run tests locally in this mode with
-Ddruid.generic.useDefaultValueForNull=false
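
As a rough illustration of why a standard deviation test can pass in one null-handling mode and fail in the other, the sketch below compares treating a null as 0 with skipping it. This is an assumption about the kind of difference involved, not a diagnosis of the failure at line 339; the class `NullHandlingSketch` and its methods are hypothetical and are not part of Druid.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical illustration, not Druid code.
public final class NullHandlingSketch {
    // Population standard deviation over a plain array of doubles.
    static double stddevPop(double[] xs) {
        double mean = Arrays.stream(xs).sum() / xs.length;
        double ss = 0.0;
        for (double x : xs) {
            ss += (x - mean) * (x - mean);
        }
        return Math.sqrt(ss / xs.length);
    }

    // Assumed default-value behaviour: a null numeric value is read as 0.
    static double[] nullsAsZero(Double[] values) {
        return Arrays.stream(values)
                .mapToDouble(v -> v == null ? 0.0 : v)
                .toArray();
    }

    // Assumed SQL-compatible behaviour: null values are left out of the aggregate.
    static double[] skipNulls(Double[] values) {
        return Arrays.stream(values)
                .filter(Objects::nonNull)
                .mapToDouble(Double::doubleValue)
                .toArray();
    }

    public static void main(String[] args) {
        Double[] values = {1.0, null, 3.0};
        System.out.println("nulls as zero: " + stddevPop(nullsAsZero(values)));
        System.out.println("nulls skipped: " + stddevPop(skipNulls(values)));
    }
}
```

The two calls aggregate different inputs (three values versus two), so an expected result computed under one assumption will not match the other.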

@xueyumusic
Contributor Author

Hi @clintropolis, thanks for the suggestion. I updated the code and rebased on master.

Member

@clintropolis clintropolis left a comment

overall lgtm 👍

@@ -132,6 +132,13 @@ Only the COUNT aggregation can accept DISTINCT.
|`APPROX_QUANTILE_DS(expr, probability, [k])`|Computes approximate quantiles on numeric or [Quantiles sketch](../development/extensions-core/datasketches-quantiles.html) exprs. The "probability" should be between 0 and 1 (exclusive). The `k` parameter is described in the Quantiles sketch documentation. The [DataSketches extension](../development/extensions-core/datasketches-extension.html) must be loaded to use this function.|
|`APPROX_QUANTILE_FIXED_BUCKETS(expr, probability, numBuckets, lowerLimit, upperLimit, [outlierHandlingMode])`|Computes approximate quantiles on numeric or [fixed buckets histogram](../development/extensions-core/approximate-histograms.html#fixed-buckets-histogram) exprs. The "probability" should be between 0 and 1 (exclusive). The `numBuckets`, `lowerLimit`, `upperLimit`, and `outlierHandlingMode` parameters are described in the fixed buckets histogram documentation. The [approximate histogram extension](../development/extensions-core/approximate-histograms.html) must be loaded to use this function.|
|`BLOOM_FILTER(expr, numEntries)`|Computes a bloom filter from values produced by `expr`, with `numEntries` maximum number of distinct values before false positve rate increases. See [bloom filter extension](../development/extensions-core/bloom-filter.html) documentation for additional details.|
|`VAR_POP(expr)`|Computes variance population of `expr`.|
Member

For these functions, could you please indicate in the doc that the druid-stats extension needs to be loaded, and link to it: [stats extension](../development/extensions-core/stats.html)?

Contributor Author

Thanks for the review, @clintropolis. I updated the doc.

Member

@clintropolis clintropolis left a comment

thanks! lgtm 👍

@fjy fjy merged commit ce591d1 into apache:master Jun 10, 2019
@gianm gianm added this to the 0.16.0 milestone Jul 1, 2019