
[SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API #25007

Closed
wants to merge 33 commits

Conversation

mccheah
Contributor

@mccheah mccheah commented Jun 28, 2019

What changes were proposed in this pull request?

As part of the shuffle storage API proposed in SPARK-25299, this introduces an API for persisting shuffle data in arbitrary storage systems.

This patch introduces several concepts:

  • ShuffleDataIO, which is the root of the entire plugin tree that will be proposed over the course of the shuffle API project.
  • ShuffleExecutorComponents - the subset of plugins for managing shuffle-related components for each executor. This will in turn instantiate shuffle readers and writers.
  • ShuffleMapOutputWriter interface - instantiated once per map task. This provides child ShufflePartitionWriter instances for persisting the bytes for each partition in the map task.

The default implementation of these plugins exactly mirrors what was done by the existing shuffle writing code - namely, writing the data to local disk and writing an index file. This patch leverages the APIs in BypassMergeSortShuffleWriter only; follow-up PRs will use the APIs in SortShuffleWriter and UnsafeShuffleWriter, but those are left as future work to minimize the review surface area. A hedged sketch of how the plugin tree fits together follows.
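A minimal sketch of the plugin tree described above, shown in one file for brevity; the method names and signatures here are illustrative approximations of the proposed API, not the exact merged definitions.

import java.io.IOException;
import java.io.OutputStream;

interface ShuffleDataIO {
  // Root of the plugin tree; yields the executor-side components.
  ShuffleExecutorComponents executor();
}

interface ShuffleExecutorComponents {
  void initializeExecutor(String appId, String execId);

  // Invoked once per map task to obtain a writer for that task's output.
  ShuffleMapOutputWriter createMapOutputWriter(
      int shuffleId, int mapId, long mapTaskAttemptId, int numPartitions)
      throws IOException;
}

interface ShuffleMapOutputWriter {
  // One child writer per reduce partition produced by the map task.
  ShufflePartitionWriter getPartitionWriter(int reducePartitionId) throws IOException;

  void commitAllPartitions() throws IOException;

  void abort(Throwable error) throws IOException;
}

interface ShufflePartitionWriter {
  OutputStream openStream() throws IOException;

  long getNumBytesWritten();
}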

How was this patch tested?

New unit tests were added. Micro-benchmarks indicate there's no slowdown in the affected code paths.

Introduces the new Shuffle Writer API. Ported from bloomberg#5.
…524)

Implements the shuffle writer API by writing shuffle files to local disk and using the index block resolver to commit data and write index files.

The logic in `BypassMergeSortShuffleWriter` has been refactored to use the base implementation of the plugin instead.

APIs have been slightly renamed to clarify semantics after considering nuances in how these are to be implemented by other developers.

Follow-up commits are to come for `SortShuffleWriter` and `UnsafeShuffleWriter`.

Ported from bloomberg#6, credits to @ifilonenko.
#532)

* [SPARK-25299] Use the shuffle writer plugin for the SortShuffleWriter.

* Remove unused

* Handle empty partitions properly.

* Adjust formatting

* Don't close streams twice.

Compressed output streams can fail if they are closed more than once.

* Clarify comment
Implements the shuffle locations API as part of SPARK-25299.

This adds an additional field to all `MapStatus` objects: a `MapShuffleLocations` that indicates where a task's map output is stored. This module is optional and implementations of the pluggable shuffle writers and readers can ignore it accordingly.

This API is designed with the use case in mind of future plugin implementations desiring to have the driver store metadata about where shuffle blocks are stored.

There are a few caveats to this design:

- We originally wanted to remove the `BlockManagerId` from `MapStatus` entirely and replace it with this object. However, doing this proves to be very difficult, as many places use the block manager ID for other kinds of shuffle data bookkeeping. As a result, we concede to storing the block manager ID redundantly here. However, the overhead should be minimal: because we cache block manager ids and default map shuffle locations, the two fields in `MapStatus` should point to the same object on the heap. Thus we add `O(M)` storage overhead on the driver, where for each map status we're storing an additional pointer to the same on-heap object. We will run benchmarks against the TPC-DS workload to see if there are significant performance repercussions for this implementation.

- `KryoSerializer` expects `CompressedMapStatus` and `HighlyCompressedMapStatus` to be serialized via reflection, so originally all fields of these classes needed to be registered with Kryo. However, the `MapShuffleLocations` is now pluggable. We think Kryo was previously defaulting to Java serialization anyway, so we now just explicitly tell Kryo to use `ExternalizableSerializer` to deal with these objects. There's a small hack in the serialization protocol that attempts to avoid serializing the same `BlockManagerId` twice in the case that the map shuffle locations is a `DefaultMapShuffleLocations`. A hedged registration sketch follows.
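For illustration, a hedged sketch of the registration described above, assuming Kryo's ExternalizableSerializer; the helper and its Class<?> parameters are placeholders since the real map status classes are private[spark].

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.serializers.ExternalizableSerializer;

class MapStatusKryoRegistration {
  // Serialize map statuses through their Externalizable hooks rather than
  // field reflection, since the pluggable MapShuffleLocations field cannot
  // be enumerated ahead of time.
  static void register(Kryo kryo, Class<?> compressed, Class<?> highlyCompressed) {
    kryo.register(compressed, new ExternalizableSerializer());
    kryo.register(highlyCompressed, new ExternalizableSerializer());
  }
}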
…tion ids (#540)

We originally made the shuffle map output writer API behave like an iterator in fetching the "next" partition writer. However, the shuffle writer implementations tend to skip opening empty partitions, and an iterator-like API would tie us down to opening a partition writer for every single partition, even the empty ones. Here, we go back to using specific partition identifiers, which gives us the freedom to avoid creating writers for empty partitions (sketched below).
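A minimal sketch of the freedom this buys, assuming the writer interfaces described in this PR; partitionIsEmpty and copyPartitionBytes are hypothetical helpers standing in for the real bookkeeping.

import java.io.IOException;
import java.io.OutputStream;

class MapTaskWriteLoop {
  static void writeAllPartitions(ShuffleMapOutputWriter mapOutputWriter, int numPartitions)
      throws IOException {
    for (int p = 0; p < numPartitions; p++) {
      if (partitionIsEmpty(p)) {
        continue;  // no writer is ever created for an empty partition
      }
      ShufflePartitionWriter writer = mapOutputWriter.getPartitionWriter(p);
      try (OutputStream out = writer.openStream()) {
        copyPartitionBytes(p, out);
      }
    }
  }

  // Hypothetical helpers; the real implementations live in the shuffle writers.
  private static boolean partitionIsEmpty(int p) { return false; }
  private static void copyPartitionBytes(int p, OutputStream out) throws IOException {}
}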
…535)

* Propose a new NIO transfer API for partition writing.

This solves the consistency and resource leakage concerns with the first iteration of the API, where it
would not be obvious that the streamable resources created by ShufflePartitionWriter needed to be closed by
ShufflePartitionWriter#close as opposed to closing the resources directly.

This introduces the following adjustments:

- Channel-based writes are separated out to their own module, SupportsTransferTo. This allows the transfer-to
  APIs to be modified independently, and users that only provide output streams can ignore the NIO APIs entirely.
  This also allows us to mark the base ShufflePartitionWriter as a stable API eventually while keeping the NIO
  APIs marked as experimental or developer-api.

- We add APIs that explicitly encode the notion of transferring bytes from one source to another. The partition
  writer returns an instance of TransferrableWritableByteChannel, which has APIs for accepting a
  TransferrableReadableByteChannel and can tell the readable byte channel to transfer its bytes out to some
  destination sink (see the sketch after this list).

- The resources returned by ShufflePartitionWriter are always closed. Internally, DefaultMapOutputWriter keeps
  resources open until commitAllPartitions() is called.
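A hedged sketch of the modules named above; the interface names follow this commit message, but the exact method signatures are illustrative.

import java.io.Closeable;
import java.io.IOException;

// Optional mix-in for partition writers that can hand out a transferrable channel.
interface SupportsTransferTo {
  TransferrableWritableByteChannel openTransferrableChannel() throws IOException;
}

// Destination sink that accepts a readable source and drains it completely.
interface TransferrableWritableByteChannel extends Closeable {
  void copyFrom(TransferrableReadableByteChannel source, long numBytesToCopy)
      throws IOException;
}

// Source that pushes its bytes out to a destination sink.
interface TransferrableReadableByteChannel extends Closeable {
  void transferTo(TransferrableWritableByteChannel destination, long numBytesToTransfer)
      throws IOException;
}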

* Migrate unsafe shuffle writer to use new byte channel API.

* More sane implementation for unsafe

* Fix style

* Address comments

* Fix imports

* Fix build

* Fix more build problems

* Address comments.
@mccheah
Contributor Author

mccheah commented Jun 28, 2019

ok to test

@mccheah
Contributor Author

mccheah commented Jun 28, 2019

@jerryshao @squito @yifeih

@SparkQA

SparkQA commented Jun 28, 2019

Test build #107020 has finished for PR 25007 at commit 3167030.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 28, 2019

Test build #107029 has finished for PR 25007 at commit 70f59db.

  • This patch fails Java style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 29, 2019

Test build #107031 has finished for PR 25007 at commit 3083d86.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 29, 2019

Test build #107032 has finished for PR 25007 at commit 4c3d692.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

* limitations under the License.
*/

package org.apache.spark.api.shuffle;
Contributor

Not sure if it is proper to add the interfaces here under o.a.s.api. Looks like most of the things under the api package are related to RDD functions. How about this package: o.a.s.shuffle.api?

Contributor

+1

mapOutputWriter.commitAllPartitions();
mapStatus = MapStatus$.MODULE$.apply(
blockManager.shuffleServerId(),
partitionLengths);
Contributor

Seems the indentation here is not correct.


/**
* :: Experimental ::
* An interface for giving streams / channels for shuffle writes.
Contributor

nit: should we omit "channel"? There's nothing else in the API referencing it

@SparkQA

SparkQA commented Jul 2, 2019

Test build #107088 has finished for PR 25007 at commit 2421c92.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@squito squito left a comment

For folks that haven't been following along in your fork, can you also give a link to what the complete implementation looks like? Even if that code is not merge-able quality, it can still be helpful to see how the pieces fit together. Also, a link to some new shuffle storage implementation would help.

* limitations under the License.
*/

package org.apache.spark.api.shuffle;
Contributor

+1

TransferrableWritableByteChannel outputChannel = null;
try (FileChannel inputChannel = in.getChannel()) {
  if (writer instanceof SupportsTransferTo) {
    outputChannel = ((SupportsTransferTo) writer).openTransferrableChannel();
Contributor

I remember we discussed this before -- but why doesn't this just return a WritableByteChannel? Whatever the reason is should probably be a comment somewhere

Contributor Author

Again we needed consistency between this writer and the UnsafeShuffleWriter. See palantir#535 (comment)

Contributor Author

And I'm not sure we want to add a comment here until we have the parallel implementation in UnsafeShuffleWriter, which I've broken off into a separate patch. We can add the documentation there so that the comparison is more obvious. Thoughts?


/**
* Copy all bytes from the source readable byte channel into this byte channel.
*
Contributor

Though you mention "copy all", it's probably worth repeating in this comment that this differs from FileChannel.transferTo(), in that this will block until all bytes have been transferred.
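To make the distinction concrete, a sketch of the blocking "copy all" behavior, with an illustrative method name: FileChannel.transferTo() may move fewer bytes than requested, so copying everything means looping.

import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

class BlockingTransfer {
  // Unlike a single transferTo call, this blocks until all bytes have moved.
  static void copyAll(FileChannel in, WritableByteChannel out, long size)
      throws IOException {
    long transferred = 0;
    while (transferred < size) {
      transferred += in.transferTo(transferred, size - transferred, out);
    }
  }
}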

public interface ShuffleExecutorComponents {
  void initializeExecutor(String appId, String execId);

  ShuffleWriteSupport writes();
Contributor

this should have a doc. At the very least, I'd mention that it's called once per ShuffleMapTask

import org.apache.spark.api.shuffle.ShuffleExecutorComponents;
import org.apache.spark.api.shuffle.ShuffleDataIO;

public class DefaultShuffleDataIO implements ShuffleDataIO {
Contributor

this should have a comment that it's implementing the only shuffle storage available with Spark <= 2.4, using local data & index files.

In fact I'm wondering if it should be renamed to LocalShuffleStorageDataIO or something like that ...


+1

import org.apache.spark.storage.TimeTrackingOutputStream;
import org.apache.spark.util.Utils;

public class DefaultShuffleMapOutputWriter implements ShuffleMapOutputWriter {
Contributor

again, a comment here on how this creates the local index & data files (the only option with Spark <= 2.4) would be helpful.


void commitAllPartitions() throws IOException;

void abort(Throwable error) throws IOException;
Contributor

these should have some more docs. E.g., at least saying that one of these is created for the output of each ShuffleMapTask, and that the "partition" being referenced here is the reduce partition, so getPartitionWriter will get called once per reduce partition

@SparkQA

SparkQA commented Jul 25, 2019

Test build #108136 has finished for PR 25007 at commit 9f17b9b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 25, 2019

Test build #108146 has finished for PR 25007 at commit b8b7b8d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

* @since 3.0.0
*/
@Private
public interface ShuffleWriteSupport {

This layer has already been removed. : )

Contributor

@vanzin vanzin left a comment

Looks ok bar the metrics stuff.

* Implementations that intend on combining the bytes for all the partitions written by this
* map task should reuse the same channel instance across all the partition writers provided
* by the parent {@link ShuffleMapOutputWriter}. If one does so, ensure that
* {@link WritableByteChannelWrapper#close()} does not close the resource, since it
Contributor

I think "it" here should be replaced with "the underlying channel" (otherwise it seems to refer to a specific instance of WritableByteChannelWrapper).

* This method is primarily for advanced optimizations where bytes can be copied from the input
* spill files to the output channel without copying data into memory.
* <p>
* The default implementation should be sufficient for most situations. Only override this
Contributor

Actually I'm not sure this is true, if the goal is to actually provide an optimization. In that case, the default implementation is only sufficient if your stream is a FileInputStream (just checked what Channels.newChannel() does).

Otherwise, the wrapper created will copy data into user memory, basically negating the optimization.

(Which maybe is an argument for returning null here and falling back to the normal IO path when that happens.)

Contributor Author

What we're saying is that this kind of low-level optimization isn't the first place to look to improve performance most of the time. So if one has to do the optimization, they should provide the proper override; but the specific optimization isn't a critical factor to consider outside of the local disk implementation.

Contributor

So why not follow my suggestion and return null here by default? It makes it much more clear that this implementation is not needed, and that by default the non-nio path is used.

Contributor Author

It's primarily to avoid returning null from the API - in that case I'd rather return an Optional and make the default Optional.empty().

Contributor

Sure. The main thing is returning something that indicates that this feature is not supported, instead of by default wrapping things a way that might actually hurt performance.
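A sketch of the resolution being discussed, assuming the WritableByteChannelWrapper type from this PR; the default body shown is how the idea could look, not necessarily the final merged code.

import java.io.IOException;
import java.io.OutputStream;
import java.util.Optional;

import org.apache.spark.shuffle.api.WritableByteChannelWrapper;

interface PartitionWriterSketch {
  OutputStream openStream() throws IOException;

  // Optional.empty() signals that NIO transfers are unsupported, so callers
  // fall back to the stream path instead of receiving a wrapper that copies
  // through user memory and negates the optimization.
  default Optional<WritableByteChannelWrapper> openChannelWrapper() throws IOException {
    return Optional.empty();
  }
}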

    int mapId,
    long mapTaskAttemptId,
    int numPartitions,
    ShuffleWriteMetricsReporter mapTaskWriteMetrics) throws IOException;
Contributor

Imran IIRC will only be back next week, so unless you're ok with waiting, probably should remove this and re-add it later after we figure out exactly what's needed.

(0 until NUM_PARTITIONS).foreach { p =>
  val writer = mapOutputWriter.getPartitionWriter(p)
  val outputTempFile = File.createTempFile("channelTemp", "", tempDir)
  val outputTempFileStream = new FileOutputStream(outputTempFile)
Contributor

Files.write(Path, byte[]) is basically a one-line version of these statements.
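For illustration, the suggested one-liner; the file and byte-array names are placeholders standing in for the test's variables.

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

class OneLinerExample {
  static void writeBytes(File outputTempFile, byte[] bytesToWrite) throws IOException {
    // Replaces the createTempFile + FileOutputStream + write/close sequence.
    Files.write(outputTempFile.toPath(), bytesToWrite);
  }
}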


package org.apache.spark.shuffle.api;

import java.io.IOException;
Contributor

nit: white space between different import groups.

@mccheah
Contributor Author

mccheah commented Jul 29, 2019

retest this please

* stream might compress or encrypt the bytes before persisting the data to the backing
* data store.
*/
long getNumBytesWritten();
@hiboyang hiboyang Jul 29, 2019

This class delegates writing to the OutputStream from openStream(). Will getNumBytesWritten() in this class access internal state inside that OutputStream? How about letting the OutputStream track the number of bytes written so this class does not need to access the OutputStream? One possible solution is to add a subclass of OutputStream that tracks the number of bytes, something like the existing TimeTrackingOutputStream class in Spark, which extends OutputStream.

Contributor Author

The idea is that if the implementation also supports creating a custom WritableByteChannel, then the number of bytes written would come from that channel, not the output stream. One could see us having both a custom output stream and an added method on WritableByteChannelWrapper.

Contributor Author

Ah, I also remember why we didn't attach it to the output stream - it's because of the lifecycle. If we have an output stream for the partition that pads bytes upon closing the stream, it's unclear whether one may keep calling methods on the output stream object after it has been closed. That's why we have the contract (sketched below):

  1. Open stream for writing bytes.
  2. Write bytes
  3. Close stream
  4. Get written bytes for that partition, accounting for the fact that the above step closed the stream.
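A minimal sketch compatible with that contract, assuming a counting wrapper in the spirit of the existing PartitionWriterStream: the count lives in a plain field, so step 4 never touches the closed stream.

import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class CountingOutputStream extends FilterOutputStream {
  private long count;

  CountingOutputStream(OutputStream out) {
    super(out);
  }

  @Override
  public void write(int b) throws IOException {
    out.write(b);
    count++;
  }

  @Override
  public void write(byte[] b, int off, int len) throws IOException {
    out.write(b, off, len);
    count += len;
  }

  // Step 4 of the contract: safe to call after close(), reads only a field.
  long getNumBytesWritten() {
    return count;
  }
}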


In this case, the OutputStream returned by openStream() is tightly coupled with ShufflePartitionWriter. Could we merge them together into one class? E.g.

ShufflePartitionWriterStream extends OutputStream {
  void open();
  long getNumBytesWritten();
}

Contributor Author

An OutputStream instance is considered opened as soon as the object exists, which is why OutputStream extends Closeable. As soon as I have a reference to the OutputStream object I can call write on it to push bytes to the sink. So having a separate open method doesn't make sense.

The open method belongs in the ShufflePartitionWriter API, which is effectively what we have with openStream and openChannel.


Oh, I mean the OutputStream returned by openStream() is tightly coupled with ShufflePartitionWriter, thus I suggest merging them together. For example, rename ShufflePartitionWriter to ShufflePartitionWriterStream, which extends OutputStream:

ShufflePartitionWriterStream extends OutputStream {
  void open();
  long getNumBytesWritten();
}

In this case, the user does not need to create a ShufflePartitionWriter and then call its openStream() method to get an OutputStream. Instead, the user creates a ShufflePartitionWriterStream, which is already an OutputStream.

Contributor Author

But again, do we call getNumBytesWritten before or after calling close on this object? If before, does it include the bytes that might be padded in closing the stream? If after, are we going to be invoking methods on a closed resource, and is that reasonable?

  initStream();
  partStream = new PartitionWriterStream(partitionId);
}
return partStream;


I feel a little uncomfortable returning the internal field "partStream" outside of this class. Is it possible to modify the design here to avoid returning an internal field?

Contributor Author

The idea is that we want to share the stream across all partition writes in this implementation. I think returning the internal field represents that paradigm properly.

* guaranteed to be called for every partition id in the above described range. In particular,
* no guarantees are made as to whether or not this method will be called for empty partitions.
*/
ShufflePartitionWriter getPartitionWriter(int reducePartitionId) throws IOException;


Why "calls to this method will be invoked with monotonically increasing reducePartitionIds"? This may cause potential issues in future and cause burden on implementation. for example, if people want to implement multiple partition writers and write shuffle data in parallel. It cannot guarantee monotonically increasing reducePartitionIds.

Contributor Author

People using this will be using it with SortShuffleManager which has a specific algorithm that won't open streams in parallel. If these invariants are broken, it implies the algorithm has changed, in which case we'd need to reconsider these APIs.

}

@Override
public ShufflePartitionWriter getPartitionWriter(int reducePartitionId) throws IOException {


This method looks a little risky. Its name is getPartitionWriter, but it actually modifies this class's internal state. People need to call getPartitionWriter and finish writing for that partition before calling getPartitionWriter again. This may cause confusion to users and may be misused as well.

Contributor Author

I think we can clarify the documentation here, but that behavior is supposed to be part of the contract for these APIs, to remain consistent with the sort shuffle algorithm.

@SparkQA

SparkQA commented Jul 29, 2019

Test build #108342 has finished for PR 25007 at commit b8b7b8d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mccheah
Contributor Author

mccheah commented Jul 29, 2019

@vanzin @squito latest patch addresses a bunch of comments. Please take a look.

@SparkQA

SparkQA commented Jul 30, 2019

Test build #108352 has finished for PR 25007 at commit 06ea01a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@vanzin vanzin left a comment

Looks good, I'll leave it here a bit in case others still have comments.

writeMetrics.incWriteTime(System.nanoTime() - writeStartTime);
}
partitionWriters = null;
return lengths;
}

private void writePartitionedDataWithChannel(
File file, WritableByteChannelWrapper outputChannel) throws IOException {
Contributor

nit: one arg per line

@squito
Contributor

squito commented Jul 30, 2019

lgtm too

@SparkQA

SparkQA commented Jul 30, 2019

Test build #108416 has finished for PR 25007 at commit 7dceec9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Jul 30, 2019

Merging to master. On to the next round...

@vanzin vanzin closed this in abef84a Jul 30, 2019
* <code>spark.shuffle.sort.io.plugin.class</code>.
* @since 3.0.0
*/
@Private
Member

Question from SPARK-28568. Is it an API or not? Looks so given the PR description. @Private is:

  "This should be used only when the standard Scala / Java means of protecting classes are insufficient. In particular, Java has no equivalent of private[spark], so we use this annotation in its place."

So @Private doesn't look like it's meant for APIs. Shall we change it to @Unstable (maybe with an explicit warning)?

Contributor

@HyukjinKwon it'll all eventually be @Experimental, but we decided to start by making it @Private just in case spark 3.0 gets released in the middle. (discussed here: #25007 (comment))

Looks like we forgot to file a follow-up jira about that; I just filed https://issues.apache.org/jira/browse/SPARK-28592

Member

Ah, okie. That's good.
My impression was that @Unstable guarantees less than @Experimental. Maybe we can consider this point as well later.

* @since 3.0.0
*/
@Private
public interface WritableByteChannelWrapper extends Closeable {
Contributor

Why do we only need a wrapper for WritableByteChannel, but not OutputStream?

Contributor Author

We need to return the FileChannel object directly to the caller, because FileChannel#transfer[from|to] checks instanceof on the argument channel in order to decide whether to optimize via zero-copy. Extending FileChannel is nearly impossible since it's an internal JDK abstract class with a lot of methods. But if we return the FileChannel directly, we have no way to shield the channel from being closed so that we can share the same channel resource across partitions.

This has come up in #25007 (comment) and palantir#535 and especially palantir#535 (comment). Given that this has come up as a question a number of times, I wonder if there's a better way we can make the semantics more accessible. I don't see a way to improve the architecture itself, but perhaps better documentation in the right places explaining why we went about this the way we did is warranted. A sketch of the pattern follows.
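A hedged sketch of the pattern being described, assuming the WritableByteChannelWrapper interface from this PR; the class name and field layout are hypothetical.

import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

import org.apache.spark.shuffle.api.WritableByteChannelWrapper;

class SharedFileChannelWrapper implements WritableByteChannelWrapper {
  private final FileChannel channel;  // shared across all partition writers

  SharedFileChannelWrapper(FileChannel channel) {
    this.channel = channel;
  }

  // Expose the real FileChannel so FileChannel#transferTo can detect it via
  // instanceof and take the zero-copy path.
  @Override
  public WritableByteChannel channel() {
    return channel;
  }

  @Override
  public void close() {
    // Intentionally a no-op: the map output writer closes the underlying
    // channel exactly once, in commitAllPartitions().
  }
}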

* provided upon the creation of this map output writer via
* {@link ShuffleExecutorComponents#createMapOutputWriter(int, int, long, int)}.
* <p>
* Calls to this method will be invoked with monotonically increasing reducePartitionIds; each
Contributor

How useful is this? I think we can make Spark shuffle more flexible if we don't guarantee this. Do you have a concrete example of how an implementation can leverage this guarantee?

Contributor

Spark's existing implementation makes this assumption: the index & data files assume partitions are written in sequential order.

Though it would be really easy to change the index format to allow for a random order (just need to include a start and an end offset, rather than having the end be implicit).
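A sketch of the assumption being described, with an illustrative helper name: the existing index file stores only cumulative end offsets, so each partition's start offset is implicit in the previous entry, which only works when partitions arrive in order.

class IndexOffsets {
  // offsets[i] and offsets[i + 1] bound partition i in the data file; a random
  // write order would require storing explicit (start, end) pairs instead.
  static long[] cumulativeOffsets(long[] partitionLengths) {
    long[] offsets = new long[partitionLengths.length + 1];
    for (int i = 0; i < partitionLengths.length; i++) {
      offsets[i + 1] = offsets[i] + partitionLengths[i];
    }
    return offsets;
  }
}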

* @param numPartitions The number of partitions that will be written by the map task. Some of
* these partitions may be empty.
*/
ShuffleMapOutputWriter createMapOutputWriter(
Member

During the fix of SPARK-25341, we need to pass more parameters into the shuffle writer and shuffle block resolver; see #25361 for the quick API change review. Thanks :)

final File file = tempShuffleBlockIdPlusFile._2();
final BlockId blockId = tempShuffleBlockIdPlusFile._1();
partitionWriters[i] =
  blockManager.getDiskWriter(blockId, file, serInstance, fileBufferSize, writeMetrics);

@mccheah Sorry to bring up such an old PR, lol.
But why didn't we have the specific plugin take care of this? This is not spill.

yaooqinn pushed a commit that referenced this pull request Jun 14, 2024
…and deprecate `spark.shuffle.unsafe.file.output.buffer`

### What changes were proposed in this pull request?
Deprecate spark.shuffle.unsafe.file.output.buffer and add a new config spark.shuffle.localDisk.file.output.buffer instead.

### Why are the changes needed?
The old config was designed to be used in UnsafeShuffleWriter, but now it is used by all local shuffle writers through LocalDiskShuffleMapOutputWriter, introduced by #25007.

### Does this PR introduce _any_ user-facing change?
The old config still works, but users are advised to use the new one.

### How was this patch tested?
Passed existing tests.

Closes #39819 from wayneguow/shuffle_output_buffer.

Authored-by: wayneguow <guow93@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>