Multi-stage intersect ignores all modifier #13126

Closed
gortiz opened this issue May 10, 2024 · 2 comments
Labels: beginner-task (Small task for new contributors to ramp up), multi-stage (Related to the multi-stage query engine)

Comments

@gortiz
Contributor

gortiz commented May 10, 2024

In SQL, intersect can take the all modifier, which preserves duplicates. For example, if table A contains a column a with values [1, 1, 2, 3, 4] and table B contains a column b with values [1, 1, 2], then:

select * 
from (select a from A)
intersect (select b from B)

returns [1, 2], while

select * 
from (select a from A)
intersect ALL (select b from B)

returns [1, 1, 2].
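
To make the expected semantics concrete, here is a minimal Python sketch of the two operators applied to the values above. The function names intersect_distinct and intersect_all are made up for illustration and are not Pinot or Calcite APIs; the sketch also keeps the left input's order, whereas SQL leaves the result order unspecified.

```python
from collections import Counter


def intersect_distinct(left, right):
    """INTERSECT: every value present in both inputs appears exactly once."""
    right_values = set(right)
    seen = set()
    result = []
    for value in left:
        if value in right_values and value not in seen:
            seen.add(value)
            result.append(value)
    return result


def intersect_all(left, right):
    """INTERSECT ALL: every value appears min(left count, right count) times."""
    remaining = Counter(right)  # how many copies of each value are still unmatched
    result = []
    for value in left:
        if remaining[value] > 0:
            remaining[value] -= 1
            result.append(value)
    return result


a = [1, 1, 2, 3, 4]
b = [1, 1, 2]
print(intersect_distinct(a, b))  # [1, 2]
print(intersect_all(a, b))       # [1, 1, 2]
```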

Currently, Pinot accepts the ALL modifier and shows it in the explain plan, but the result is the same either way: Pinot always applies the semantics of intersect without the all modifier.

You can verify that by running ColocatedJoinEngineQuickStart and executing:

select userUUID
from (select userUUID from userGroups)
intersect all
(select userUUID from userGroups)

with and without all.

intersect all should return the same number of rows as count(*) (which is 2494), while intersect should return the same number of rows as count(distinct(userUUID)) (which is 2470).

But the returned number of rows is 2470 both with and without the all modifier.
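
The reasoning behind those expected counts can be checked on a small example. The sketch below intersects a column with itself, using Python's Counter as a stand-in for a multiset; the user_uuids list is hypothetical data, not the quickstart table.

```python
from collections import Counter

# Hypothetical stand-in for the userUUID column; not the quickstart data.
user_uuids = ["u1", "u2", "u2", "u3", "u3", "u3"]

left = Counter(user_uuids)
right = Counter(user_uuids)

# Counter's & operator keeps min(left[v], right[v]) copies of each value,
# which matches INTERSECT ALL. Intersecting a column with itself therefore
# keeps every row, so the row count equals count(*).
intersect_all_rows = sum((left & right).values())

# INTERSECT collapses duplicates, so the row count equals count(distinct ...).
intersect_rows = len(set(left) & set(right))

print(intersect_all_rows, len(user_uuids))   # 6 6
print(intersect_rows, len(set(user_uuids)))  # 3 3
```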

As a short-term solution, we can fail the query if the all modifier is supplied, which would be better than returning incorrect results.
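
As a rough illustration of that fail-fast idea (not Pinot's actual planner code), a guard could look like the sketch below. UnsupportedSetOpError and check_intersect are hypothetical names introduced only for this example.

```python
class UnsupportedSetOpError(Exception):
    """Hypothetical error type for this sketch; not a Pinot class."""


def check_intersect(all_modifier: bool) -> None:
    # Reject INTERSECT ALL up front instead of silently running it as INTERSECT.
    if all_modifier:
        raise UnsupportedSetOpError(
            "INTERSECT ALL is not supported; use INTERSECT or remove the ALL modifier"
        )


check_intersect(False)  # plain INTERSECT passes the check

try:
    check_intersect(True)
except UnsupportedSetOpError as error:
    print(f"query rejected: {error}")
```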

@gortiz added the beginner-task and multi-stage labels on May 10, 2024
@yashmayya
Collaborator

Hi @gortiz, could you please assign this issue and #13127 to me? I'd like to take a look.

@gortiz
Contributor Author

gortiz commented May 16, 2024

Done in #13151 by @yashmayya

@gortiz closed this as completed on May 16, 2024