Closed
Labels
enhancement: New feature or request
Description
One common performance issue in Spark is the shuffle. The size of the shuffle data can directly affect performance.
One advantage of columnar shuffle is that we can easily use dictionary-based encoding, which is expected to decrease the shuffle data size. The Velox pipeline already supports dictionary-encoded data, but currently all data is flattened after the shuffle. With dictionary shuffle support, the next stage could keep using the dictionary-encoded data, which is expected to save memory as well.
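To illustrate the expected size reduction, here is a minimal sketch in plain Python. It is not the Velox or Gluten serialization format: the byte layout (a 4-byte length prefix per string, a 4-byte index per row) and all function names are assumptions made up for this example.

```python
def flat_size(values):
    # Assumed flat layout: 4-byte length prefix plus UTF-8 bytes per value.
    return sum(4 + len(v.encode("utf-8")) for v in values)


def dictionary_encode(values):
    # Map each distinct value to an integer index, in first-seen order.
    index_of = {}
    dictionary = []
    indices = []
    for v in values:
        if v not in index_of:
            index_of[v] = len(dictionary)
            dictionary.append(v)
        indices.append(index_of[v])
    return dictionary, indices


def dictionary_size(dictionary, indices):
    # Distinct values are written once; each row then costs one
    # assumed 4-byte index.
    return flat_size(dictionary) + 4 * len(indices)


def decode(dictionary, indices):
    # Flattening: what currently happens to all data after the shuffle.
    return [dictionary[i] for i in indices]


# A low-cardinality string column: 1000 rows, 3 distinct values.
column = ["united_states", "germany", "united_states", "japan"] * 250

dictionary, indices = dictionary_encode(column)
flat_bytes = flat_size(column)
dict_bytes = dictionary_size(dictionary, indices)

print(f"flat: {flat_bytes} bytes, dictionary: {dict_bytes} bytes")
```

The saving depends on cardinality: for a column where almost every value is distinct, the dictionary plus the indices can be larger than the flat data, so an implementation would likely need to choose the encoding per column or per batch.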
Status
Done