You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
By default Trino partitions data by all available columns for partitioning. For example for JOIN Trino partitions data based on all the columns appearing in the equality condition. For GROUP BY Trino partitions data on all available grouping keys.
However due to the relatively high cost of a partitioning (shuffle) operation Trino may try to avoid unnecessary re-partitions when possible.
For example if data is already partitioned on column_a to perform a certain operation (such as a JOIN or a GROUP BY) Trino may decide not to partition data again to perform a subsequent operation that has column_a as one of the partitioning keys among the other keys. For example consider a query:
SELECT t1.column_a, t2.column_b, count(*)
FROM t1, t2
WHERE t1.column_a = t2.column_a
GROUP BY t1.column_a, t2.column_b
To join t1 with t2 trino has to partition both tables on column_a. To run a subsequent aggregation Trino may decide to either partition again on both, column_a and column_b or to preserve an existing partitioning on column_a and run the subsequent aggregation in place.
By default Trino decided to avoid unnecessary partitioning and run a subsequent operation in place if possible. However it introduces a risk of a memory skew, as for certain values of column_a there could be significantly more values of column_b than for other values of column_a.
In the future Trino should be able to detect such conditions automatically and apply runtime based optimizations to fix this problem automatically.
However today, in the absence of such an optimization it would be great to allow users to instruct Trino to prefer adding an extra exchange to avoid potential skews.
An alternative would be to suggest users adding an identity projection to trick the optimizer:
SELECT t1.column_a, t2.column_b, count(*)
FROM t1, t2
WHERE t1.column_a = t2.column_a
GROUP BY t1.column_a + 0, t2.column_b
However this approach is fragile as it may silently break when optimizer becomes smarter at detecting identity projections.
The text was updated successfully, but these errors were encountered:
By default Trino partitions data by all available columns for partitioning. For example for
JOIN
Trino partitions data based on all the columns appearing in the equality condition. ForGROUP BY
Trino partitions data on all available grouping keys.However due to the relatively high cost of a partitioning (shuffle) operation Trino may try to avoid unnecessary re-partitions when possible.
For example if data is already partitioned on
column_a
to perform a certain operation (such as aJOIN
or aGROUP BY
) Trino may decide not to partition data again to perform a subsequent operation that hascolumn_a
as one of the partitioning keys among the other keys. For example consider a query:To join
t1
witht2
trino has to partition both tables oncolumn_a
. To run a subsequent aggregation Trino may decide to either partition again on both,column_a
andcolumn_b
or to preserve an existing partitioning oncolumn_a
and run the subsequent aggregation in place.By default Trino decided to avoid unnecessary partitioning and run a subsequent operation in place if possible. However it introduces a risk of a memory skew, as for certain values of
column_a
there could be significantly more values ofcolumn_b
than for other values ofcolumn_a
.In the future Trino should be able to detect such conditions automatically and apply runtime based optimizations to fix this problem automatically.
However today, in the absence of such an optimization it would be great to allow users to instruct Trino to prefer adding an extra exchange to avoid potential skews.
An alternative would be to suggest users adding an identity projection to trick the optimizer:
However this approach is fragile as it may silently break when optimizer becomes smarter at detecting identity projections.
The text was updated successfully, but these errors were encountered: