[SPARK-12007] [network] Avoid copies in the network lib's RPC layer. #9987

vanzin · 2015-11-26T02:06:14Z

This change seems large, but most of it is just replacing byte[]
with ByteBuffer and new byte[] with ByteBuffer.allocate(),
since it changes the network library's API.

The following are parts of the code that actually have meaningful
changes:

The Message implementations were changed to inherit from a new
AbstractMessage that can optionally hold a reference to a body
(in the form of a ManagedBuffer); this is similar to how
ResponseWithBody worked before, except now it's not restricted
to just responses.
The TransportFrameDecoder was pretty much rewritten to avoid
copies as much as possible; it doesn't rely on CompositeByteBuf
to accumulate incoming data anymore, since CompositeByteBuf
has issues when slices are retained. The code now is able to
create frames without having to resort to copying bytes except
for a few bytes (containing the frame length) in very rare cases.
Some minor changes in the SASL layer to convert things back to
byte[] since the JDK SASL API operates on those.

This change seems large, but most of it is just replacing `byte[]` with `ByteBuffer` and `new byte[]` with `ByteBuffer.allocate()`, since it changes the network library's API. The following are parts of the code that actually have meaningful changes: - The Message implementations were changed to inherit from a new AbstractMessage that can optionally hold a reference to a body (in the form of a ManagedBuffer); this is similar to how ResponseWithBody worked before, except now it's not restricted to just responses. - The TransportFrameDecoder was pretty much rewritten to avoid copies as much as possible; it doesn't rely on CompositeByteBuf to accumulate incoming data anymore, since CompositeByteBuf has issues when slices are retained. The code now is able to create frames without having to resort to copying bytes except for a few bytes (containing the frame length) in very rare cases. - Some minor changes in the SASL layer to convert things back to `byte[]` since the JDK SASL API operates on those.

SparkQA · 2015-11-26T06:18:30Z

Test build #46729 has finished for PR 9987 at commit 83acf37.

This patch fails from timeout after a configured wait of 250m.
This patch merges cleanly.
This patch adds no public classes.

vanzin · 2015-11-29T00:31:29Z

retest this please

SparkQA · 2015-11-29T02:09:04Z

Test build #46844 has finished for PR 9987 at commit 83acf37.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2015-11-29T04:57:09Z

Test build #46846 has finished for PR 9987 at commit 31f52e0.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

vanzin · 2015-11-30T17:20:59Z

/cc @rxin @zsxwing

zsxwing · 2015-11-30T18:00:39Z

This is really cool. I was thinking to do this this week :)

zsxwing · 2015-11-30T18:46:12Z

network/common/src/main/java/org/apache/spark/network/protocol/AbstractResponseMessage.java

-public abstract class ResponseWithBody implements ResponseMessage {
-  public final ManagedBuffer body;
-  public final boolean isBodyInFrame;
+public abstract class AbstractResponseMessage extends AbstractMessage implements ResponseMessage {


Looks AbstractResponseMessage is not necessary if we move createFailureResponse to ResponseMessage.

That would require dummy implementations in ChunkFetchFailure, RpcFailure and StreamFailure...

I actually don't see a lot of value in having separate interfaces for ResponseMessage and RequestMessage; this is not Scala, so you can't have a sealed trait and have the compiler help you when you miss something in a match. So you could just have Message with no need for any of the other interfaces / abstract classes. But I didn't want to do that cleanup in this change.

okey, let's keep it.

zsxwing · 2015-11-30T20:04:32Z

LGTM

andrewor14 · 2015-12-01T00:07:50Z

OK, I'd like to merge this into 1.6 as well to further mitigate the scheduler throughput regression.

vanzin · 2015-12-01T00:10:14Z

OK, I'd like to merge this into 1.6 as well to further mitigate the scheduler throughput regression.

Did you measure the effect this change has on the test?

andrewor14 · 2015-12-01T00:14:40Z

I think @zsxwing did some local benchmarks and saw like a 10% improvement

andrewor14 · 2015-12-01T01:20:39Z

Merging into master 1.6, thanks for fixing this @vanzin.

This change seems large, but most of it is just replacing `byte[]` with `ByteBuffer` and `new byte[]` with `ByteBuffer.allocate()`, since it changes the network library's API. The following are parts of the code that actually have meaningful changes: - The Message implementations were changed to inherit from a new AbstractMessage that can optionally hold a reference to a body (in the form of a ManagedBuffer); this is similar to how ResponseWithBody worked before, except now it's not restricted to just responses. - The TransportFrameDecoder was pretty much rewritten to avoid copies as much as possible; it doesn't rely on CompositeByteBuf to accumulate incoming data anymore, since CompositeByteBuf has issues when slices are retained. The code now is able to create frames without having to resort to copying bytes except for a few bytes (containing the frame length) in very rare cases. - Some minor changes in the SASL layer to convert things back to `byte[]` since the JDK SASL API operates on those. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #9987 from vanzin/SPARK-12007. (cherry picked from commit 9bf2120) Signed-off-by: Andrew Or <andrew@databricks.com>

yaooqinn · 2016-02-26T03:25:45Z

@vanzin @zsxwing

16/02/26 10:35:04 WARN TaskSetManager: Lost task 213.0 in stage 1.0 (TID 16971, SZV1000041645): FetchFailed(BlockManagerId(118, 192.168.75.193, 23325), shuffleId=0, mapId=51, reduceId=213, message=
org.apache.spark.shuffle.FetchFailedException: java.lang.RuntimeException: javax.security.sasl.SaslException: DIGEST-MD5: digest response format violation. Mismatched nonce.
        at org.spark-project.guava.base.Throwables.propagate(Throwables.java:160)
        at org.apache.spark.network.sasl.SparkSaslServer.response(SparkSaslServer.java:121)
        at org.apache.spark.network.sasl.SaslRpcHandler.receive(SaslRpcHandler.java:100)
        at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:128)
        at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:99)
        at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:104)
        at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:745)

When I test word count on a larger cluster(100 workers), these exception occurs every time, but on a small cluster never.

Do you have any idea about this?

Thanks!
Kent

vanzin · 2016-02-26T03:57:43Z

@yaooqinn please don't ask questions on github, especially on old, closed PRs. It's confusing.

Fix shuffle tests.

31f52e0

zsxwing reviewed Nov 30, 2015
View reviewed changes

asfgit closed this in 9bf2120 Dec 1, 2015

vanzin deleted the SPARK-12007 branch December 3, 2015 22:21

zsxwing mentioned this pull request Mar 3, 2016

[SPARK-13652][Core]Copy ByteBuffer in sendRpcSync as it will be recycled #11499

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SPARK-12007] [network] Avoid copies in the network lib's RPC layer. #9987

[SPARK-12007] [network] Avoid copies in the network lib's RPC layer. #9987

vanzin commented Nov 26, 2015

SparkQA commented Nov 26, 2015

vanzin commented Nov 29, 2015

SparkQA commented Nov 29, 2015

SparkQA commented Nov 29, 2015

vanzin commented Nov 30, 2015

zsxwing commented Nov 30, 2015

zsxwing Nov 30, 2015

vanzin Nov 30, 2015

zsxwing Nov 30, 2015

zsxwing commented Nov 30, 2015

andrewor14 commented Dec 1, 2015

vanzin commented Dec 1, 2015

andrewor14 commented Dec 1, 2015

andrewor14 commented Dec 1, 2015

yaooqinn commented Feb 26, 2016

vanzin commented Feb 26, 2016

[SPARK-12007] [network] Avoid copies in the network lib's RPC layer. #9987

[SPARK-12007] [network] Avoid copies in the network lib's RPC layer. #9987

Conversation

vanzin commented Nov 26, 2015

SparkQA commented Nov 26, 2015

vanzin commented Nov 29, 2015

SparkQA commented Nov 29, 2015

SparkQA commented Nov 29, 2015

vanzin commented Nov 30, 2015

zsxwing commented Nov 30, 2015

zsxwing Nov 30, 2015

Choose a reason for hiding this comment

vanzin Nov 30, 2015

Choose a reason for hiding this comment

zsxwing Nov 30, 2015

Choose a reason for hiding this comment

zsxwing commented Nov 30, 2015

andrewor14 commented Dec 1, 2015

vanzin commented Dec 1, 2015

andrewor14 commented Dec 1, 2015

andrewor14 commented Dec 1, 2015

yaooqinn commented Feb 26, 2016

vanzin commented Feb 26, 2016