Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-28607][CORE][SHUFFLE] Don't store partition lengths twice. #25341

Closed

Conversation

mccheah
Copy link
Contributor

@mccheah mccheah commented Aug 2, 2019

What changes were proposed in this pull request?

The shuffle writer API introduced in SPARK-28209 has a flaw that leads to a memory usage regression - we ended up tracking the partition lengths in two places. Here, we modify the API slightly to avoid redundant tracking. The implementation of the shuffle writer plugin is now responsible for tracking the lengths of partitions, and propagating this back up to the higher shuffle writer as part of the commitAllPartitions API.

How was this patch tested?

Existing unit tests.

The shuffle writer API introduced in SPARK-28209 has a flaw that leads to a memory usage regression - we ended up tracking the partition lengths in two places. Here, we modify the API slightly to avoid redundant tracking. The implementation of the shuffle writer plugin is now responsible for tracking the lengths of partitions, and propagating this back up to the higher shuffle writer as part of the commitAllPartitions API.
* stream might compress or encrypt the bytes before persisting the data to the backing
* data store.
*/
long getNumBytesWritten();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we _might- still need this - appears to be used by the sort shuffle writer per https://github.com/apache/spark/pull/25342/files#diff-fe378a929dd1f5c5ac8bff90dab743b1R87... hmm.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you update the metrics after the commit (by adding up all the partition lengths)?

Otherwise it doesn't seem horrible to keep this in the API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just keep the API for now.

@SparkQA
Copy link

SparkQA commented Aug 3, 2019

Test build #108588 has finished for PR 25341 at commit 3b26014.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link

@whatlulumomo whatlulumomo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good job~

@mccheah
Copy link
Contributor Author

mccheah commented Aug 17, 2019

Addressed comments

@SparkQA
Copy link

SparkQA commented Aug 17, 2019

Test build #109244 has finished for PR 25341 at commit 3171ad2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mccheah
Copy link
Contributor Author

mccheah commented Aug 22, 2019

@vanzin is this ok to merge?

@vanzin
Copy link
Contributor

vanzin commented Aug 22, 2019

Don't know. Need time to review and I've been busy.

@vanzin
Copy link
Contributor

vanzin commented Aug 26, 2019

Merging to master.

@vanzin vanzin closed this in 2efa6f5 Aug 26, 2019
}
}
lengths[i] = writer.getNumBytesWritten();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a quick question here. So after this change, there's no place to call ShufflePartitionWriter.getNumBytesWritten()?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That method is still there.

mccheah added a commit to palantir/spark that referenced this pull request Sep 11, 2019
The shuffle writer API introduced in SPARK-28209 has a flaw that leads to a memory usage regression - we ended up tracking the partition lengths in two places. Here, we modify the API slightly to avoid redundant tracking. The implementation of the shuffle writer plugin is now responsible for tracking the lengths of partitions, and propagating this back up to the higher shuffle writer as part of the commitAllPartitions API.

Existing unit tests.

Closes apache#25341 from mccheah/dont-redundantly-store-part-lengths.

Authored-by: mcheah <mcheah@palantir.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
mccheah added a commit to palantir/spark that referenced this pull request Sep 13, 2019
* Bring implementation into closer alignment with upstream.

Step to ease merge conflict resolution and build failure problems when we pull in changes from upstream.

* Cherry-pick BypassMergeSortShuffleWriter changes and shuffle writer API changes

* [SPARK-28607][CORE][SHUFFLE] Don't store partition lengths twice

The shuffle writer API introduced in SPARK-28209 has a flaw that leads to a memory usage regression - we ended up tracking the partition lengths in two places. Here, we modify the API slightly to avoid redundant tracking. The implementation of the shuffle writer plugin is now responsible for tracking the lengths of partitions, and propagating this back up to the higher shuffle writer as part of the commitAllPartitions API.

Existing unit tests.

Closes apache#25341 from mccheah/dont-redundantly-store-part-lengths.

Authored-by: mcheah <mcheah@palantir.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>

* [SPARK-28571][CORE][SHUFFLE] Use the shuffle writer plugin for the SortShuffleWriter

Use the shuffle writer APIs introduced in SPARK-28209 in the sort shuffle writer.

Existing unit tests were changed to use the plugin instead, and they used the local disk version to ensure that there were no regressions.

Closes apache#25342 from mccheah/shuffle-writer-refactor-sort-shuffle-writer.

Lead-authored-by: mcheah <mcheah@palantir.com>
Co-authored-by: mccheah <mcheah@palantir.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>

* [SPARK-28570][CORE][SHUFFLE] Make UnsafeShuffleWriter use the new API.

* Resolve build issues and remaining semantic conflicts

* More build fixes

* More build fixes

* Attempt to fix build

* More build fixes

* [SPARK-29072] Put back usage of TimeTrackingOutputStream for UnsafeShuffleWriter and ShufflePartitionPairsWriter.

* Address comments

* Import ordering

* Fix stream reference
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants