Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support preferring exact partitioning #12495

Closed
arhimondr opened this issue May 20, 2022 · 1 comment · Fixed by #12967
Closed

Support preferring exact partitioning #12495

arhimondr opened this issue May 20, 2022 · 1 comment · Fixed by #12967
Assignees

Comments

@arhimondr
Copy link
Contributor

arhimondr commented May 20, 2022

By default Trino partitions data by all available columns for partitioning. For example for JOIN Trino partitions data based on all the columns appearing in the equality condition. For GROUP BY Trino partitions data on all available grouping keys.

However due to the relatively high cost of a partitioning (shuffle) operation Trino may try to avoid unnecessary re-partitions when possible.

For example if data is already partitioned on column_a to perform a certain operation (such as a JOIN or a GROUP BY) Trino may decide not to partition data again to perform a subsequent operation that has column_a as one of the partitioning keys among the other keys. For example consider a query:

SELECT t1.column_a, t2.column_b, count(*)
FROM t1, t2
WHERE t1.column_a = t2.column_a
GROUP BY t1.column_a, t2.column_b

To join t1 with t2 trino has to partition both tables on column_a. To run a subsequent aggregation Trino may decide to either partition again on both, column_a and column_b or to preserve an existing partitioning on column_a and run the subsequent aggregation in place.

By default Trino decided to avoid unnecessary partitioning and run a subsequent operation in place if possible. However it introduces a risk of a memory skew, as for certain values of column_a there could be significantly more values of column_b than for other values of column_a.

In the future Trino should be able to detect such conditions automatically and apply runtime based optimizations to fix this problem automatically.

However today, in the absence of such an optimization it would be great to allow users to instruct Trino to prefer adding an extra exchange to avoid potential skews.

An alternative would be to suggest users adding an identity projection to trick the optimizer:

SELECT t1.column_a, t2.column_b, count(*)
FROM t1, t2
WHERE t1.column_a = t2.column_a
GROUP BY t1.column_a + 0, t2.column_b

However this approach is fragile as it may silently break when optimizer becomes smarter at detecting identity projections.

@arhimondr
Copy link
Contributor Author

Here's an example PR from Presto: prestodb/presto#13354

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

2 participants