
[SPARK-28571][CORE][SHUFFLE] Use the shuffle writer plugin for the SortShuffleWriter #25342

Conversation

@mccheah (Contributor) commented Aug 3, 2019

What changes were proposed in this pull request?

Use the shuffle writer APIs introduced in SPARK-28209 in the sort shuffle writer.

How was this patch tested?

Existing unit tests were changed to use the plugin, backed by the local-disk implementation, to verify that there were no regressions.
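
For orientation, the resulting write path in SortShuffleWriter has roughly the shape sketched below. This is a simplified sketch, not the verbatim diff: member names such as shuffleExecutorComponents, dep, sorter, and blockManager are assumed from the surrounding class, and aggregator/metrics/error handling is omitted.

```scala
// Simplified sketch of the plugin-based write path.
override def write(records: Iterator[Product2[K, V]]): Unit = {
  sorter.insertAll(records)

  // Ask the shuffle plugin for a per-map-task output writer instead of writing
  // directly to local disk through the block manager.
  val mapOutputWriter = shuffleExecutorComponents.createMapOutputWriter(
    dep.shuffleId, mapId, context.taskAttemptId(), dep.partitioner.numPartitions)
  val partitionLengths = sorter.writePartitionedMapOutput(dep.shuffleId, mapId, mapOutputWriter)
  mapOutputWriter.commitAllPartitions()
  mapStatus = MapStatus(blockManager.shuffleServerId, partitionLengths)
}
```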

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-28571][CORE]SHUFFLE] Use the shuffle writer plugin for the SortShuffleWriter [SPARK-28571][CORE][SHUFFLE] Use the shuffle writer plugin for the SortShuffleWriter Aug 3, 2019
SparkQA commented Aug 3, 2019

Test build #108591 has finished for PR 25342 at commit 2f9b4ca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun (Member):

Retest this please.

@dongjoon-hyun (Member):

cc @jerryshao

SparkQA commented Aug 5, 2019

Test build #108635 has finished for PR 25342 at commit 2f9b4ca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mccheah (Contributor, Author) commented Aug 5, 2019

@squito @yifeih also

*
* @param blockId block ID to write to. The index file will be blockId.name + ".index".
* @return array of lengths, in bytes, of each partition of the file (used by map output tracker)
* TODO remove this, as this is only used by UnsafeRowSerializerSuite in the SQL project.

Reviewer (Member):

nit. Could you file a JIRA and make this an IDed TODO, please?
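
For reference, an IDed TODO ties the comment to a JIRA ticket so it can be tracked; the ticket that was eventually filed appears later in this thread as SPARK-28764:

```scala
// TODO(SPARK-28764): remove this, as it is only used by UnsafeRowSerializerSuite in the SQL project.
```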

* @return array of lengths, in bytes, of each partition of the file (used by map output tracker)
*/
def writePartitionedMapOutput(
shuffleId: Int, mapId: Int, mapOutputWriter: ShuffleMapOutputWriter): Array[Long] = {

Reviewer (Contributor):

nit: multi-line arg style
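
For readers unfamiliar with the convention being referenced: when a Scala signature does not fit on one line, the Spark style puts each parameter on its own line with a four-space indent, roughly:

```scala
def writePartitionedMapOutput(
    shuffleId: Int,
    mapId: Int,
    mapOutputWriter: ShuffleMapOutputWriter): Array[Long] = {
  // ...
}
```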

}

override def close(): Unit = {
if (isOpen) {

Reviewer (Contributor):

Minor, but if there's an error in open() (e.g. when initializing wrappedStream) this will leave the underlying partitionStream opened.

Maybe this flag isn't needed and you can just check whether the fields are initialized?

@whatlulumomo commented Aug 15, 2019:

The worry is unnecessary, because wrappedStream and objOut must have been initialized successfully if partitionStream was opened as an OutputStream without an exception. And I think the isOpen flag makes the code easier to understand.

@vanzin (Contributor) commented Aug 15, 2019:

because wrappedStream and objOut must have been initialized successfully

That's not necessarily a valid assumption. Compression codecs, e.g., may throw exceptions if the file is corrupt.
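
To make the failure mode concrete: if open() has roughly the shape below (field and helper names are illustrative, not the exact PR code), the codec wrap can throw after partitionStream has already been opened, leaving it open while wrappedStream and objOut are still null.

```scala
// Hypothetical shape of open(); the ordering is what matters, not the exact names.
private def open(): Unit = {
  partitionStream = partitionWriter.openStream()        // succeeds; the stream is now open
  wrappedStream = serializerManager.wrapStream(blockId, partitionStream) // may throw, e.g. a bad codec
  objOut = serializerInstance.serializeStream(wrappedStream)             // never reached on failure
  isOpen = true
}
```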

@gcz2022 commented Aug 8, 2019

Will spill be supported in the series of PRs? @mccheah

curNumBytesWritten = numBytesWritten
}

private class CloseShieldOutputStream(delegate: OutputStream)

Reviewer (Contributor):

What is the usage of this class? Sorry, I can only see the definition here.
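
For readers following along: a close-shield stream is usually a thin wrapper whose close() deliberately does not close the delegate, so that serializer/compression streams layered on top can be closed without closing the underlying partition stream they wrap. A minimal sketch of the idea, not necessarily the PR's exact implementation:

```scala
import java.io.OutputStream

// Swallows close() so that closing an outer (e.g. serialization) stream does not
// close the underlying delegate, whose lifecycle is managed elsewhere.
private class CloseShieldOutputStream(delegate: OutputStream) extends OutputStream {
  override def write(b: Int): Unit = delegate.write(b)
  override def write(b: Array[Byte], off: Int, len: Int): Unit = delegate.write(b, off, len)
  override def flush(): Unit = delegate.flush()
  override def close(): Unit = flush()  // intentionally do not close the delegate
}
```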

@squito (Contributor) commented Aug 8, 2019

@gczsjdy

Will spill be supported in the series of PRs?

No, spill still goes to local disk. Trying to generalize local spills was explicitly out of scope for now.

@gcz2022 commented Aug 12, 2019

Thanks @squito


package org.apache.spark.util.collection

private[spark] trait PairsWriter {

Reviewer:

nit: add docs. Where can this be used?

@mccheah (Contributor, Author):

Also done
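
The documented trait ends up being a small abstraction over "something that accepts key-value pairs", roughly the following, with the doc comment paraphrasing the intent discussed here:

```scala
package org.apache.spark.util.collection

/**
 * An abstraction over anything that the sorter can push key-value pairs into, e.g.
 * DiskBlockObjectWriter for local-disk output or ShufflePartitionPairsWriter for
 * plugin-backed output.
 */
private[spark] trait PairsWriter {
  def write(key: Any, value: Any): Unit
}
```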

* A key-value writer inspired by {@link DiskBlockObjectWriter} that pushes the bytes to an
* arbitrary partition writer instead of writing to local disk through the block manager.
*/
private[spark] class ShufflePartitionPairsWriter(

Reviewer:

Should this instead be in the o.a.s.s package?

@@ -46,7 +47,8 @@ private[spark] class DiskBlockObjectWriter(
     writeMetrics: ShuffleWriteMetricsReporter,
     val blockId: BlockId = null)
   extends OutputStream
-  with Logging {
+  with Logging
+  with PairsWriter {

Reviewer:

nit: add override to one function.

@mccheah (Contributor, Author):

I think this should be done now.

@whatlulumomo left a comment:

good job

@mccheah (Contributor, Author) commented Aug 17, 2019

Addressed comments.

SparkQA commented Aug 17, 2019

Test build #109240 has finished for PR 25342 at commit e99274f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mccheah (Contributor, Author) commented Aug 22, 2019

@vanzin @squito or @dongjoon-hyun - is this good to merge?

SparkQA commented Aug 27, 2019

Test build #109760 has finished for PR 25342 at commit 132bba9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mccheah (Contributor, Author) commented Aug 27, 2019

retest this please

SparkQA commented Aug 27, 2019

Test build #109766 has finished for PR 25342 at commit 132bba9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito (Contributor) left a comment:

Other than Marcelo's comment, looks good.

*
* @param blockId block ID to write to. The index file will be blockId.name + ".index".
* @return array of lengths, in bytes, of each partition of the file (used by map output tracker)
* TODO(SPARK-28764): remove this, as this is only used by UnsafeRowSerializerSuite in the SQL

Reviewer (Contributor):

Can't that test just call sorter.writePartitionedMapOutput(..., new LocalDiskMapOutputWriter(...))? Anyway, fine to leave it for the follow-up JIRA.
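
Sketched out, the suggestion is that the suite could build the default local-disk output writer itself and hand it to the sorter; the class name and constructor arguments below are illustrative only, not verbatim from the PR:

```scala
// Illustrative only: construct the local-disk plugin writer directly in the test
// instead of relying on the helper that the TODO wants to remove.
val mapOutputWriter = new LocalDiskShuffleMapOutputWriter(
  shuffleId, mapId, numPartitions, blockResolver, conf)
val partitionLengths = sorter.writePartitionedMapOutput(shuffleId, mapId, mapOutputWriter)
mapOutputWriter.commitAllPartitions()
```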

presentPrev
}).orElse(Some(e))
}
resolvedException

Reviewer (Contributor):

I can't think of anything wrong here, but it seems safer to be using finally. It's kind of a stretch, but if some (badly implemented) stream throws a RuntimeException instead of an IOException, you wouldn't clean up properly this way. The nesting gets a bit ugly, but you could do this:

def closeIfNonNull[T <: Closeable](x: T): T = {
  if (x != null) x.close()
  null.asInstanceOf[T]
}
Utils.tryWithSafeFinally {
  objOut = closeIfNonNull(objOut)
} {
  // normally closing objOut would close the inner streams as well, but just in case there was
  // an error in initialization etc. we make sure we clean the other streams up too
  Utils.tryWithSafeFinally {
    wrappedStream = closeIfNonNull(wrappedStream)
  } {
    partitionStream = closeIfNonNull(partitionStream)
  }
}

Reviewer (Contributor):

I also prefer Imran's approach. I'm just a tiny bit worried about bad stream implementations that don't have an idempotent close(), since both your code and Imran's are calling it multiple times on certain streams.

Probably ok not to deal with that though.

override def write(key: Any, value: Any): Unit = {
if (!isOpen) {
open()
isOpen = true

Reviewer (Contributor):

Speaking of being nitpicky about error handling, this flag has weird semantics. If you call write and it fails to initialize the streams, and then you call write again, you'll potentially dereference still open streams.

val mapOutputWriter = shuffleExecutorComponents.createMapOutputWriter(
dep.shuffleId, mapId, context.taskAttemptId(), dep.partitioner.numPartitions)
val partitionLengths = sorter.writePartitionedMapOutput(dep.shuffleId, mapId, mapOutputWriter)
mapOutputWriter.commitAllPartitions()

Reviewer (Contributor):

Just checking that you don't need any changes here? Given your other change that made commitAllPartitions return the partition lengths.
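
If that API change were taken into account here, the call site would presumably look more like the sketch below, taking the lengths from commitAllPartitions rather than from writePartitionedMapOutput (see also the SPARK-28607 description further down):

```scala
// Sketch of the call site once commitAllPartitions reports the partition lengths itself.
val mapOutputWriter = shuffleExecutorComponents.createMapOutputWriter(
  dep.shuffleId, mapId, context.taskAttemptId(), dep.partitioner.numPartitions)
sorter.writePartitionedMapOutput(dep.shuffleId, mapId, mapOutputWriter)
val partitionLengths = mapOutputWriter.commitAllPartitions()
```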

SparkQA commented Aug 27, 2019

Test build #109825 has finished for PR 25342 at commit eec363c.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

null.asInstanceOf[T]
}

private def tryCloseOrAddSuppressed(

Reviewer (Contributor):

Not used anymore.

partitionStream = closeIfNonNull(partitionStream)
}
}
isOpen = false

Reviewer (Contributor):

One last comment about error handling. I'll just quote the AutoCloseable documentation instead:

It is strongly advised to relinquish the underlying resources and to internally
mark the resource as closed, prior to throwing the exception. 

Meaning, track whether you've closed the object, not whether it's opened. (isOpen can be replaced with objOut != null.) Then in close() do nothing if the stream has already been closed.
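
Concretely, the suggestion amounts to something like the sketch below, combining the closed-state tracking with the closeIfNonNull helper from Imran's snippet earlier in the thread (field names as in the earlier snippets):

```scala
// Track "closed", not "opened": initialization state is visible from the fields
// themselves, and close() becomes a no-op the second time around.
private var closed = false

override def close(): Unit = {
  if (!closed) {
    closed = true
    Utils.tryWithSafeFinally {
      objOut = closeIfNonNull(objOut)
    } {
      Utils.tryWithSafeFinally {
        wrappedStream = closeIfNonNull(wrappedStream)
      } {
        partitionStream = closeIfNonNull(partitionStream)
      }
    }
  }
}
```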

partitionPairsWriter.write(elem._1, elem._2)
}
}
var threwException = false

Reviewer (Contributor):

Shadowed variable. But I wonder if tryWithSafeFinally isn't better here (and in the "mirror" block above for the no-spill case).

@mccheah (Contributor, Author):

I think this mirrors how UnsafeShuffleWriter and BypassMergeSortShuffleWriter approach these cases, but those are written in Java, so it's harder to use tryWithSafeFinally from there.
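
For reference, the tryWithSafeFinally variant being discussed would look roughly like the sketch below; the iterator and writer names are assumptions, and the null check mirrors the cleanup code quoted later in the review:

```scala
// Sketch: close the pairs writer even if a write throws, without hand-rolling
// the threwException bookkeeping (writer construction omitted).
Utils.tryWithSafeFinally {
  while (elements.hasNext) {
    val elem = elements.next()
    partitionPairsWriter.write(elem._1, elem._2)
  }
} {
  if (partitionPairsWriter != null) {
    partitionPairsWriter.close()
  }
}
```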

SparkQA commented Aug 27, 2019

Test build #109829 has finished for PR 25342 at commit 84cfc29.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 28, 2019

Test build #109835 has finished for PR 25342 at commit 2451185.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 28, 2019

Test build #109840 has finished for PR 25342 at commit 5ba53bd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin (Contributor) left a comment:

Looks ok with a minor thing. Will wait a bit to see if others have any comments.

if (partitionPairsWriter != null) {
partitionPairsWriter.close()
}
if (partitionWriter != null) {

Reviewer (Contributor):

Dead code? Or missing code?

SparkQA commented Aug 28, 2019

Test build #109884 has finished for PR 25342 at commit d483157.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin (Contributor) commented Aug 30, 2019

Alright, no more comments, so merging to master.

@vanzin vanzin closed this in ea90ea6 Aug 30, 2019
mccheah added a commit to palantir/spark that referenced this pull request Sep 11, 2019
…rtShuffleWriter

Use the shuffle writer APIs introduced in SPARK-28209 in the sort shuffle writer.

Existing unit tests were changed to use the plugin instead, and they used the local disk version to ensure that there were no regressions.

Closes apache#25342 from mccheah/shuffle-writer-refactor-sort-shuffle-writer.

Lead-authored-by: mcheah <mcheah@palantir.com>
Co-authored-by: mccheah <mcheah@palantir.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
mccheah added a commit to palantir/spark that referenced this pull request Sep 13, 2019
* Bring implementation into closer alignment with upstream.

Step to ease merge conflict resolution and build failure problems when we pull in changes from upstream.

* Cherry-pick BypassMergeSortShuffleWriter changes and shuffle writer API changes

* [SPARK-28607][CORE][SHUFFLE] Don't store partition lengths twice

The shuffle writer API introduced in SPARK-28209 has a flaw that leads to a memory usage regression - we ended up tracking the partition lengths in two places. Here, we modify the API slightly to avoid redundant tracking. The implementation of the shuffle writer plugin is now responsible for tracking the lengths of partitions, and propagating this back up to the higher shuffle writer as part of the commitAllPartitions API.

Existing unit tests.

Closes apache#25341 from mccheah/dont-redundantly-store-part-lengths.

Authored-by: mcheah <mcheah@palantir.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>

* [SPARK-28571][CORE][SHUFFLE] Use the shuffle writer plugin for the SortShuffleWriter

Use the shuffle writer APIs introduced in SPARK-28209 in the sort shuffle writer.

Existing unit tests were changed to use the plugin instead, and they used the local disk version to ensure that there were no regressions.

Closes apache#25342 from mccheah/shuffle-writer-refactor-sort-shuffle-writer.

Lead-authored-by: mcheah <mcheah@palantir.com>
Co-authored-by: mccheah <mcheah@palantir.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>

* [SPARK-28570][CORE][SHUFFLE] Make UnsafeShuffleWriter use the new API.

* Resolve build issues and remaining semantic conflicts

* More build fixes

* More build fixes

* Attempt to fix build

* More build fixes

* [SPARK-29072] Put back usage of TimeTrackingOutputStream for UnsafeShuffleWriter and ShufflePartitionPairsWriter.

* Address comments

* Import ordering

* Fix stream reference
@@ -157,7 +157,8 @@ private[spark] class SortShuffleManager(conf: SparkConf) extends ShuffleManager
          metrics,
          shuffleExecutorComponents)
      case other: BaseShuffleHandle[K @unchecked, V @unchecked, _] =>
-       new SortShuffleWriter(shuffleBlockResolver, other, mapId, context)
+       new SortShuffleWriter(
+         shuffleBlockResolver, other, mapId, context, shuffleExecutorComponents)

Reviewer:

shuffleBlockResolver is not needed.
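
In other words, the cleanup being suggested would presumably drop the unused parameter, ending up with something like the following (illustrative sketch, not verbatim from a follow-up change):

```scala
case other: BaseShuffleHandle[K @unchecked, V @unchecked, _] =>
  // shuffleBlockResolver is no longer needed once all writes go through the plugin.
  new SortShuffleWriter(other, mapId, context, shuffleExecutorComponents)
```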
