Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL: Fix bug regarding histograms usage in scripting #36866

Merged
merged 4 commits into from Dec 20, 2018

Conversation

costin
Copy link
Member

@costin costin commented Dec 19, 2018

Allow scripts to correctly reference grouping functions
Fix bug in translation of date/time functions mixed with histograms.
Enhance Verifier to prevent histograms being nested inside other
functions inside GROUP BY (as it implies double grouping)
Extend Histogram docs

Allow scripts to correctly reference grouping functions
Fix bug in translation of date/time functions mixed with histograms.
Enhance Verifier to prevent histograms being nested inside other
 functions inside GROUP BY (as it implies double grouping)
Extend Histogram docs
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left few comments about tests and docs. Otherwise LGTM


["source","sql",subs="attributes,callouts,macros"]
----
include-tagged::{sql-specs}/docs.csv-spec[histogramDateExpression]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The query here is exactly the same as the one referenced in the previous section: docs.csv-spec[expressionOnHistogramNotAllowed].

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.

include-tagged::{sql-specs}/docs.csv-spec[histogramNumericExpression]
----

Do note that histograms (and grouping functions in general) allow custom expressions but cannot have any functions applied to them in the `GROUP BY`.In other words, the following statement is *NOT* allowed:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

White space after applied to them in the GROUP BY.

include-tagged::{sql-specs}/docs.csv-spec[expressionOnHistogramNotAllowed]
----

as it requires two groupings (one for histogram and then another for apply the function on top of the histogram groups).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of "then another for apply the function", I think it's better "then another for applying the function" or "then another to apply the function".


histogramDateWithYearOnTop
schema::h:i|c:l
SELECT HISTOGRAM(MONTH(birth_date), 2) AS h, COUNT(*) as c FROM test_emp GROUP BY h ORDER BY h DESC;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

histogramDateWithYearOnTop but the query is with MONTH()

Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Left 2 comments




histogramNumericWithExpression
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need that, since we have this: https://github.com/elastic/elasticsearch/pull/36866/files#diff-271679983098ae41c89a482374a0e984R731
If we want to test the ORDER BY too then let's added to the test name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a typo - the incorrect query should have had the MONTH out of the histogram - I've updated that now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no I mean this and the next one are similar, emp_no instead of salary, it's only the order by.

filter.condition().forEachDown(e -> {
if (Functions.isGrouping(e) || e instanceof GroupingFunctionAttribute) {
localFailures
.add(fail(e, "Cannot filter on grouping function [%s], use the grouped field instead", Expressions.name(e)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's too verbose, but I suggest to rephrase to apply filtering on the argument of the grouping function instead, as grouped field maybe be confusing.

@costin
Copy link
Member Author

costin commented Dec 20, 2018

run docs check

@costin
Copy link
Member Author

costin commented Dec 20, 2018

run the gradle build tests 1

@costin costin merged commit ac032a0 into elastic:master Dec 20, 2018
@costin costin deleted the improve-histograms branch December 20, 2018 21:12
@costin costin added the v6.6.0 label Dec 20, 2018
costin added a commit that referenced this pull request Dec 20, 2018
Allow scripts to correctly reference grouping functions
Fix bug in translation of date/time functions mixed with histograms.
Enhance Verifier to prevent histograms being nested inside other
 functions inside GROUP BY (as it implies double grouping)
Extend Histogram docs

(cherry picked from commit ac032a0)
costin added a commit that referenced this pull request Dec 20, 2018
Allow scripts to correctly reference grouping functions
Fix bug in translation of date/time functions mixed with histograms.
Enhance Verifier to prevent histograms being nested inside other
 functions inside GROUP BY (as it implies double grouping)
Extend Histogram docs

(cherry picked from commit ac032a0)
(cherry picked from commit 3a0fd4c)
jasontedor added a commit to ywelsch/elasticsearch that referenced this pull request Dec 21, 2018
* elastic/master: (539 commits)
  SQL: documentation improvements and updates (elastic#36918)
  [DOCS] Merges list of discovery and cluster formation settings (elastic#36909)
  Only compress responses if request was compressed (elastic#36867)
  Remove duplicate paragraph (elastic#36942)
  Fix URI to cluster stats endpoint on specific nodes (elastic#36784)
  Fix typo in unitTest task (elastic#36930)
  RecoveryMonitor#lastSeenAccessTime should be volatile (elastic#36781)
  [CCR] Add `ccr.auto_follow_coordinator.wait_for_timeout` setting (elastic#36714)
  Scripting: Remove deprecated params.ctx (elastic#36848)
  Refactor the REST actions to clarify what endpoints are deprecated. (elastic#36869)
  Add JDK 12 to CI rotation (elastic#36915)
  Improve error message for 6.x style realm settings (elastic#36876)
  Send clear session as routable remote request (elastic#36805)
  [DOCS] Remove redundant ILM attributes (elastic#36808)
  SQL: Fix bug regarding histograms usage in scripting (elastic#36866)
  Update index mappings when ccr restore complete (elastic#36879)
  Docs: Bump version to alpha2 after release
  Enable IPv6 URIs in reindex from remote (elastic#36874)
  Watcher: Remove unused local variable in doExecute (elastic#36655)
  [DOCS] Synchs titles of X-Pack APIs
  ...
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Dec 21, 2018
* master: (31 commits)
  Move ingest-geoip default databases out of config (elastic#36949)
  [ILM][DOCS] add extra scenario to policy update docs (elastic#36871)
  [Painless] Add String Casting Tests (elastic#36945)
  SQL: documentation improvements and updates (elastic#36918)
  [DOCS] Merges list of discovery and cluster formation settings (elastic#36909)
  Only compress responses if request was compressed (elastic#36867)
  Remove duplicate paragraph (elastic#36942)
  Fix URI to cluster stats endpoint on specific nodes (elastic#36784)
  Fix typo in unitTest task (elastic#36930)
  RecoveryMonitor#lastSeenAccessTime should be volatile (elastic#36781)
  [CCR] Add `ccr.auto_follow_coordinator.wait_for_timeout` setting (elastic#36714)
  Scripting: Remove deprecated params.ctx (elastic#36848)
  Refactor the REST actions to clarify what endpoints are deprecated. (elastic#36869)
  Add JDK 12 to CI rotation (elastic#36915)
  Improve error message for 6.x style realm settings (elastic#36876)
  Send clear session as routable remote request (elastic#36805)
  [DOCS] Remove redundant ILM attributes (elastic#36808)
  SQL: Fix bug regarding histograms usage in scripting (elastic#36866)
  Update index mappings when ccr restore complete (elastic#36879)
  Docs: Bump version to alpha2 after release
  ...
@jimczi jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants