You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The coalesce partitions operator simply reduces the number of partitions to the specified amount.
The target partition count must be >=1
If the target partition count is >= the number of input partitions then this is a no-op and can be optimized out of the plan.
The simplest implementation would be to assign one or more input partitions to each output partition. This works well where the number of input partitions is divisible by the number of output partitions e.g. going from 64 input partitions to 8 output partitions. In other cases, the resulting partitions may have data skew e.g. going from 3 partitions to 2. It would be possible to do the partitioning at the row level but that would add a lot of overhead and the "repartition" operator should be used for that case.
The coalesce partitions operator simply reduces the number of partitions to the specified amount.
The target partition count must be >=1
If the target partition count is >= the number of input partitions then this is a no-op and can be optimized out of the plan.
The simplest implementation would be to assign one or more input partitions to each output partition. This works well where the number of input partitions is divisible by the number of output partitions e.g. going from 64 input partitions to 8 output partitions. In other cases, the resulting partitions may have data skew e.g. going from 3 partitions to 2. It would be possible to do the partitioning at the row level but that would add a lot of overhead and the "repartition" operator should be used for that case.
Reporter: Andy Grove / @andygrove
Note: This issue was originally created as ARROW-10583. Please see the migration documentation for further details.
The text was updated successfully, but these errors were encountered: