Add IN_SUBQUERY support #6022

Merged 1 commit on Nov 1, 2020

Jackie-Jiang
Contributor

Description

Add the IN_SUBQUERY transform function, which allows an IDSET aggregation function to be used as a subquery. The subquery is executed as a separate query on the broker side.

For example, the following two queries can be combined into one query:
SELECT ID_SET(col) FROM table WHERE date = 20200901
SELECT DISTINCT_COUNT(col), date FROM table WHERE IN_ID_SET(col, '<serializedIdSet>') = 1 GROUP BY date
->
SELECT DISTINCT_COUNT(col), date FROM table WHERE IN_SUBQUERY(col, 'SELECT ID_SET(col) FROM table WHERE date = 20200901') = 1 GROUP BY date
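
The broker-side handling described above can be illustrated with a minimal sketch. This is not the code in this PR: it is a Python illustration that assumes the broker runs the inner query first and then substitutes its serialized result into an `IN_ID_SET` call. The names `rewrite_in_subquery`, `run_subquery` and `fake_run_subquery` are hypothetical, and the regex only handles subqueries that contain no single quotes.

```python
import re

# Matches IN_SUBQUERY(<column>, '<subquery>'), assuming the subquery contains
# no single quotes. A real implementation would work on the parsed query
# rather than on the query string.
IN_SUBQUERY_PATTERN = re.compile(
    r"IN_SUBQUERY\(\s*(?P<column>\w+)\s*,\s*'(?P<subquery>[^']*)'\s*\)",
    re.IGNORECASE,
)


def rewrite_in_subquery(query, run_subquery):
    """Replace each IN_SUBQUERY call with IN_ID_SET over the subquery result.

    run_subquery is a hypothetical callback that takes the subquery string
    and returns the serialized ID set as a string.
    """
    def replace(match):
        serialized = run_subquery(match.group("subquery"))
        return f"IN_ID_SET({match.group('column')}, '{serialized}')"

    return IN_SUBQUERY_PATTERN.sub(replace, query)


executed = []


def fake_run_subquery(subquery):
    # Stand-in for executing the inner query. It records the subquery and
    # returns a placeholder, not a real serialized ID set.
    executed.append(subquery)
    return "<serializedIdSet>"


outer = (
    "SELECT DISTINCT_COUNT(col), date FROM table "
    "WHERE IN_SUBQUERY(col, 'SELECT ID_SET(col) FROM table WHERE date = 20200901') = 1 "
    "GROUP BY date"
)
rewritten = rewrite_in_subquery(outer, fake_run_subquery)
print(rewritten)
```

With the placeholder callback, the rewritten string has the same shape as the second of the two original queries, which is the equivalence the description relies on.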

@siddharthteotia
Contributor

Why do we need a transform function to support subqueries? Can we use the standard SQL syntax of providing the subquery inside parentheses to clearly distinguish between outer and inner queries?

@Jackie-Jiang
Contributor Author

> Why do we need a transform function to support subqueries? Can we use the standard SQL syntax of providing the subquery inside parentheses to clearly distinguish between outer and inner queries?

@siddharthteotia Good question. Currently Pinot does not support nested queries or joins (in either the query parser or the engine), so we use a transform function as a workaround to achieve the same semantics. In the future we might want to provide native nested query and join support, or a translation layer that rewrites such queries into the format in this PR. Supporting nested queries and joins is far beyond the scope of this PR, so for now users must use the transform function for the id_set subquery.
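
To make the translation-layer idea concrete, here is a rough sketch of how a standard `col IN (SELECT ...)` predicate could be rewritten into the format in this PR. Nothing like this exists in the PR itself. The function `translate_nested_in` is hypothetical, it handles only the first such predicate, it ignores parentheses inside string literals, and escaping single quotes by doubling them is an assumption based on standard SQL string literals.

```python
import re

# Finds "<column> IN (" immediately followed by a SELECT. The lookahead keeps
# the SELECT keyword out of the match so the subquery starts at match.end().
IN_SELECT_START = re.compile(
    r"(?P<column>\w+)\s+IN\s*\(\s*(?=SELECT\b)", re.IGNORECASE
)


def translate_nested_in(query):
    """Rewrite the first `col IN (SELECT ...)` into IN_SUBQUERY(col, '...') = 1."""
    match = IN_SELECT_START.search(query)
    if match is None:
        return query

    # Walk forward from the start of the subquery, tracking parenthesis depth,
    # to find the parenthesis that closes the subquery.
    depth = 1
    start = match.end()
    for i in range(start, len(query)):
        ch = query[i]
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                subquery = query[start:i].strip()
                # Assumption: quotes inside a SQL string literal are doubled.
                escaped = subquery.replace("'", "''")
                call = f"IN_SUBQUERY({match.group('column')}, '{escaped}') = 1"
                return query[:match.start()] + call + query[i + 1:]
    raise ValueError("unbalanced parentheses in subquery")


nested = (
    "SELECT DISTINCT_COUNT(col), date FROM table "
    "WHERE col IN (SELECT ID_SET(col) FROM table WHERE date = 20200901) "
    "GROUP BY date"
)
translated = translate_nested_in(nested)
print(translated)
```

A string-level rewrite like this is only meant to show the shape of the mapping. A real translation layer would presumably operate on the parsed query.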

@Jackie-Jiang force-pushed the in_subquery branch 2 times, most recently from 2a02b47 to 6777f23, on October 13, 2020 22:12
@codecov-io

codecov-io commented Oct 13, 2020

Codecov Report

Merging #6022 into master will increase coverage by 6.62%.
The diff coverage is 61.96%.


@@            Coverage Diff             @@
##           master    #6022      +/-   ##
==========================================
+ Coverage   66.44%   73.07%   +6.62%     
==========================================
  Files        1075     1236     +161     
  Lines       54773    58510    +3737     
  Branches     8168     8671     +503     
==========================================
+ Hits        36396    42757    +6361     
+ Misses      15700    12927    -2773     
- Partials     2677     2826     +149     
Flag Coverage Δ
#integration 46.37% <52.49%> (?)
#unittests 63.99% <33.38%> (?)

Impacted Files Coverage Δ
...ot/broker/broker/AllowAllAccessControlFactory.java 100.00% <ø> (ø)
.../helix/BrokerUserDefinedMessageHandlerFactory.java 52.83% <0.00%> (-13.84%) ⬇️
...ava/org/apache/pinot/client/AbstractResultSet.java 53.33% <0.00%> (-3.81%) ⬇️
.../main/java/org/apache/pinot/client/Connection.java 44.44% <0.00%> (-4.40%) ⬇️
.../org/apache/pinot/client/ResultTableResultSet.java 24.00% <0.00%> (-10.29%) ⬇️
...not/common/assignment/InstancePartitionsUtils.java 78.57% <ø> (+5.40%) ⬆️
.../apache/pinot/common/exception/QueryException.java 90.27% <ø> (+5.55%) ⬆️
...pinot/common/function/AggregationFunctionType.java 98.27% <ø> (-1.73%) ⬇️
.../pinot/common/function/DateTimePatternHandler.java 83.33% <ø> (ø)
...ot/common/function/FunctionDefinitionRegistry.java 88.88% <ø> (+44.44%) ⬆️
... and 999 more

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d08fd5c...84bb1c7.

@Jackie-Jiang merged commit d586801 into apache:master on Nov 1, 2020
@Jackie-Jiang deleted the in_subquery branch on November 1, 2020 19:01