Optimize SELECT UNNEST in lateral joins #6035

Merged on Feb 1, 2023 (21 commits)

taniabogatsch
Contributor

This PR adds the unnest_rewriter to the optimizer. The two queries below are semantically equivalent, but their query plans currently differ. The second query performs a duplicate-eliminated join (DelimJoin), which is the default approach for lateral joins.

-- UNNEST in the projection
SELECT i, UNNEST(i) AS j FROM (VALUES ([1, 2, 3]), ([4, 5])) t(i);
-- different query plan, same result
SELECT * FROM (VALUES ([1, 2, 3]), ([4, 5])) t(i), (SELECT UNNEST(i)) t2(j);

However, UNNESTs in the FROM clause do not require a DelimJoin. The rewriter removes the DelimJoin and makes the UNNEST the direct child of the DelimJoin's former parent. Together with #5982, this PR addresses the performance issues reported in #5827.
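To make the shape of this rewrite concrete, here is a minimal sketch on a toy plan tree. It is not DuckDB's implementation: the `Op` class, the `rewrite_unnest` and `render` helpers, and the operator names are hypothetical stand-ins. It also assumes that the UNNEST takes the join's left input as its new child, which the description above implies but does not state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    """Toy logical operator (hypothetical): a name plus child operators."""
    name: str
    children: List["Op"] = field(default_factory=list)


def rewrite_unnest(node: Op) -> Op:
    """Replace DELIM_JOIN(source, UNNEST(DELIM_GET)) with UNNEST(source)."""
    # Rewrite bottom-up so that nested candidates are handled first.
    node.children = [rewrite_unnest(child) for child in node.children]
    if node.name != "DELIM_JOIN" or len(node.children) != 2:
        return node
    source, rhs = node.children
    is_candidate = (
        rhs.name == "UNNEST"
        and len(rhs.children) == 1
        and rhs.children[0].name == "DELIM_GET"
    )
    if not is_candidate:
        return node  # anything else is left untouched
    # Assumption: the UNNEST reads the source rows directly and takes the
    # join's place, so it becomes the child of the join's former parent.
    return Op("UNNEST", [source])


def render(node: Op) -> str:
    """Print a plan as NAME(child, child, ...)."""
    if not node.children:
        return node.name
    return f"{node.name}({', '.join(render(c) for c in node.children)})"


plan = Op("PROJECTION", [
    Op("DELIM_JOIN", [
        Op("VALUES"),
        Op("UNNEST", [Op("DELIM_GET")]),
    ]),
])
before = render(plan)          # rendered first, since the rewrite mutates the tree
after = render(rewrite_unnest(plan))
print(before)  # PROJECTION(DELIM_JOIN(VALUES, UNNEST(DELIM_GET)))
print(after)   # PROJECTION(UNNEST(VALUES))
```

The real optimizer also has to rewire column bindings in the operators above the removed join; the toy model ignores that part entirely.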

Here are two query plans for comparison.

EXPLAIN SELECT * FROM (VALUES ([1, 2, 3])) t(i), (SELECT UNNEST(i)) t2(j);

Current master branch:
[Screenshot: query plan on the current master branch]

This PR:
[Screenshot: query plan with this PR]

Collaborator

Mytherin left a comment

Thanks for the PR! Looks good! Great performance gains.

I have been playing around with this for a bit and found that this query does not work yet:

SELECT UNNEST(j) FROM (VALUES ([[1, 2, 3]]), ([[4, 5]])) t(i), (SELECT UNNEST(i)) t2(j);

Could you have a look at that?

Looks good to me otherwise.

@taniabogatsch
Contributor Author

Thanks for the find! I updated the implementation to skip unknown parent operators instead of throwing an internal exception, so this query works now! :)

Mytherin merged commit d8add1e into duckdb:master on Feb 1, 2023
Collaborator

Mytherin commented Feb 1, 2023

Thanks for the changes!
