[SPARK-28607][CORE][SHUFFLE] Don't store partition lengths twice. #25341

mccheah · 2019-08-02T23:48:06Z

What changes were proposed in this pull request?

The shuffle writer API introduced in SPARK-28209 has a flaw that leads to a memory usage regression: the partition lengths end up being tracked in two places. Here, we modify the API slightly to avoid the redundant tracking. The implementation of the shuffle writer plugin is now responsible for tracking the lengths of partitions and propagating them back up to the higher-level shuffle writer as part of the commitAllPartitions API.
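The ownership change described above can be illustrated with a minimal, self-contained sketch. It is not the actual Spark code: `MapOutputWriterSketch`, `InMemoryMapOutputWriter`, `openPartition`, and `writeAll` are hypothetical names invented for illustration, and the assumption that the commit call hands back a `long[]` is a simplification of the API described here. The point is only that the plugin side keeps the single array of lengths and the caller receives it at commit time instead of maintaining a second copy.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PartitionLengthsSketch {

  /** Hypothetical, simplified stand-in for the plugin-side map output writer. */
  interface MapOutputWriterSketch {
    OutputStream openPartition(int partitionId);

    /** Assumption for this sketch: committing returns the per-partition lengths. */
    long[] commitAllPartitions();
  }

  /** Toy plugin implementation: the only place where lengths are tracked. */
  static final class InMemoryMapOutputWriter implements MapOutputWriterSketch {
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
    private final long[] partitionLengths;

    InMemoryMapOutputWriter(int numPartitions) {
      this.partitionLengths = new long[numPartitions];
    }

    @Override
    public OutputStream openPartition(int partitionId) {
      return new OutputStream() {
        @Override
        public void write(int b) {
          sink.write(b);
          partitionLengths[partitionId]++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
          sink.write(buf, off, len);
          partitionLengths[partitionId] += len;
        }
      };
    }

    @Override
    public long[] commitAllPartitions() {
      return partitionLengths;
    }
  }

  /** Caller side: writes one record per partition and keeps no lengths array of its own. */
  static long[] writeAll(MapOutputWriterSketch writer, String[] records) throws IOException {
    for (int i = 0; i < records.length; i++) {
      try (OutputStream out = writer.openPartition(i)) {
        out.write(records[i].getBytes(StandardCharsets.UTF_8));
      }
    }
    // The lengths come back from the plugin as part of the commit.
    return writer.commitAllPartitions();
  }

  public static void main(String[] args) throws IOException {
    long[] lengths = writeAll(new InMemoryMapOutputWriter(3), new String[] {"a", "bb", ""});
    System.out.println(Arrays.toString(lengths));
  }
}
```

In this sketch the caller never allocates an array sized by the number of partitions, which is the redundancy the change is meant to remove.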

How was this patch tested?

Existing unit tests.

mccheah · 2019-08-03T01:11:52Z

core/src/main/java/org/apache/spark/shuffle/api/ShufflePartitionWriter.java

-   * stream might compress or encrypt the bytes before persisting the data to the backing
-   * data store.
-   */
-  long getNumBytesWritten();


I think we _might_ still need this - appears to be used by the sort shuffle writer per https://github.com/apache/spark/pull/25342/files#diff-fe378a929dd1f5c5ac8bff90dab743b1R87... hmm.

vanzin · 2019-08-06

Could you update the metrics after the commit (by adding up all the partition lengths)?

Otherwise it doesn't seem horrible to keep this in the API.

mccheah · 2019-08-16

Let's just keep the API for now.
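For reference, the alternative raised in this thread (deriving the bytes-written metric from the committed partition lengths rather than from getNumBytesWritten) might look roughly like the sketch below. The thread settled on keeping the API, so this is not what was merged; `BytesWrittenFromLengths` and `totalBytesWritten` are hypothetical names, and the sample lengths are made up for illustration.

```java
import java.util.Arrays;

public class BytesWrittenFromLengths {

  /** Sums the per-partition lengths returned at commit time into one byte count. */
  static long totalBytesWritten(long[] partitionLengths) {
    return Arrays.stream(partitionLengths).sum();
  }

  public static void main(String[] args) {
    // Made-up lengths standing in for the array handed back by the commit call.
    long[] partitionLengths = {128L, 0L, 4096L};
    System.out.println(totalBytesWritten(partitionLengths));
  }
}
```

One trade-off of this approach, as far as the sketch goes, is that the metric would only be updated once after the commit rather than as each partition is written.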

SparkQA · 2019-08-03T02:21:13Z

Test build #108588 has finished for PR 25341 at commit 3b26014.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

whatlulumomo

good job~

mccheah · 2019-08-17T00:14:44Z

Addressed comments

SparkQA · 2019-08-17T01:02:14Z

Test build #109244 has finished for PR 25341 at commit 3171ad2.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

mccheah · 2019-08-22T12:33:05Z

@vanzin is this ok to merge?

vanzin · 2019-08-22T15:59:21Z

Don't know. Need time to review and I've been busy.

vanzin · 2019-08-26T17:39:11Z

Merging to master.

xuanyuanking · 2019-08-30T11:29:27Z

core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java

          }
        }
-        lengths[i] = writer.getNumBytesWritten();


Just a quick question here. So after this change, there's no place to call ShufflePartitionWriter.getNumBytesWritten()?

vanzin · 2019-08-30

That method is still there.

The shuffle writer API introduced in SPARK-28209 has a flaw that leads to a memory usage regression - we ended up tracking the partition lengths in two places. Here, we modify the API slightly to avoid redundant tracking. The implementation of the shuffle writer plugin is now responsible for tracking the lengths of partitions, and propagating this back up to the higher shuffle writer as part of the commitAllPartitions API.

Existing unit tests.

Closes apache#25341 from mccheah/dont-redundantly-store-part-lengths.

Authored-by: mcheah <mcheah@palantir.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>

* Bring implementation into closer alignment with upstream. Step to ease merge conflict resolution and build failure problems when we pull in changes from upstream.
* Cherry-pick BypassMergeSortShuffleWriter changes and shuffle writer API changes
* [SPARK-28607][CORE][SHUFFLE] Don't store partition lengths twice
  The shuffle writer API introduced in SPARK-28209 has a flaw that leads to a memory usage regression - we ended up tracking the partition lengths in two places. Here, we modify the API slightly to avoid redundant tracking. The implementation of the shuffle writer plugin is now responsible for tracking the lengths of partitions, and propagating this back up to the higher shuffle writer as part of the commitAllPartitions API.
  Existing unit tests.
  Closes apache#25341 from mccheah/dont-redundantly-store-part-lengths.
  Authored-by: mcheah <mcheah@palantir.com>
  Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
* [SPARK-28571][CORE][SHUFFLE] Use the shuffle writer plugin for the SortShuffleWriter
  Use the shuffle writer APIs introduced in SPARK-28209 in the sort shuffle writer.
  Existing unit tests were changed to use the plugin instead, and they used the local disk version to ensure that there were no regressions.
  Closes apache#25342 from mccheah/shuffle-writer-refactor-sort-shuffle-writer.
  Lead-authored-by: mcheah <mcheah@palantir.com>
  Co-authored-by: mccheah <mcheah@palantir.com>
  Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
* [SPARK-28570][CORE][SHUFFLE] Make UnsafeShuffleWriter use the new API.
* Resolve build issues and remaining semantic conflicts
* More build fixes
* More build fixes
* Attempt to fix build
* More build fixes
* [SPARK-29072] Put back usage of TimeTrackingOutputStream for UnsafeShuffleWriter and ShufflePartitionPairsWriter.
* Address comments
* Import ordering
* Fix stream reference

dongjoon-hyun added SPARK CORE SHUFFLE labels Aug 3, 2019

whatlulumomo approved these changes Aug 15, 2019

Add back getNumBytesWritten API.

3171ad2

vanzin closed this in 2efa6f5 Aug 26, 2019

xuanyuanking reviewed Aug 30, 2019
