
[SPARK-6420] Driver's Block Manager does not use "spark.driver.host" in Yarn-Client mode #5095

Closed · wants to merge 1 commit

Conversation

marsishandsome

In my cluster, the YARN nodes do not know the client's hostname, so I set "spark.driver.host" to the client's IP address. But in yarn-client mode the driver's Block Manager still uses the hostname instead of "spark.driver.host".
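
For reference, a minimal sketch of the client-side setup (the IP address is a placeholder from the documentation range, not a value from this report):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical yarn-client setup; 192.0.2.10 stands in for the client's real IP.
val conf = new SparkConf()
  .setAppName("example")
  .setMaster("yarn-client")               // Spark 1.x yarn-client mode
  .set("spark.driver.host", "192.0.2.10") // ask the driver to advertise its IP
val sc = new SparkContext(conf)
```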

I got the following error:

```
TaskSetManager: Lost task 1.1 in stage 0.0 (TID 2, hadoop-node1538098): java.io.IOException: Failed to connect to example-hostname
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:191)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156)
at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:78)
at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
at org.apache.spark.network.shuffle.RetryingBlockFetcher.access$200(RetryingBlockFetcher.java:43)
at org.apache.spark.network.shuffle.RetryingBlockFetcher$1.run(RetryingBlockFetcher.java:170)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.UnresolvedAddressException
at sun.nio.ch.Net.checkAddress(Net.java:127)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:644)
at io.netty.channel.socket.nio.NioSocketChannel.doConnect(NioSocketChannel.java:193)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:200)
at io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1029)
at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:496)
at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:481)
at io.netty.channel.ChannelOutboundHandlerAdapter.connect(ChannelOutboundHandlerAdapter.java:47)
at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:496)
at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:481)
at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:463)
at io.netty.channel.DefaultChannelPipeline.connect(DefaultChannelPipeline.java:849)
at io.netty.channel.AbstractChannel.connect(AbstractChannel.java:199)
at io.netty.bootstrap.Bootstrap$2.run(Bootstrap.java:165)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
... 1 more
```

@marsishandsome marsishandsome changed the title Driver's Block Manager does not use "spark.driver.host" in Yarn-Client mode [SPARK-6420] Driver's Block Manager does not use "spark.driver.host" in Yarn-Client mode Mar 19, 2015
@AmplabJenkins

Can one of the admins verify this patch?

@tgravescs
Contributor

@marsishandsome This seems like a bit of an odd use case. If spark.driver.host is something different from the actual host, errors are going to occur. Why do you want to set it yourself rather than have Spark figure it out automatically? Is it just to make switching between modes easier?

@marsishandsome
Author

@tgravescs In our local network, the YARN nodes do not know the client's hostname, so I have to set spark.driver.host to the client's IP address so that the driver advertises its IP address instead of its hostname. But the driver's Block Manager still uses its hostname.

@marsishandsome
Author

@tgravescs You are right. Maybe we should provide two choices, IP and hostname, both figured out automatically by Spark.
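
A small sketch of the distinction being proposed (illustrative only; this is plain java.net.InetAddress, not Spark's actual resolution code):

```scala
import java.net.InetAddress

// The block manager effectively advertises the local hostname today.
val local    = InetAddress.getLocalHost
val hostname = local.getHostName    // e.g. "example-hostname" -- needs DNS on the YARN nodes
val ip       = local.getHostAddress // e.g. "192.0.2.10" -- reachable without DNS

println(s"hostname=$hostname, ip=$ip")
```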

@tgravescs
Contributor

> In our local network, the YARN nodes do not know the client's hostname, so I have to set spark.driver.host to the client's IP address so that the driver advertises its IP address instead of its hostname. But the driver's Block Manager still uses its hostname.

Sorry, I don't follow what you mean. Are you just saying DNS is not properly configured, or that when the client tries to figure out its hostname it comes back null?

If you run the Linux 'hostname' command, what does it list?

@marsishandsome
Author

@tgravescs Yes, DNS in our network is not properly configured.

The YARN nodes cannot connect to the client using the client machine's hostname.

My idea is to let the YARN nodes use the client's IP address to connect to the client machine.

@andrewor14
Contributor

@marsishandsome did you get it working on your network? Is this patch a false alarm?

@marsishandsome
Author

@andrewor14 Yes, it works in my network.
Currently Spark can only find the machine's hostname and use it to communicate with others.
I think Spark should provide another option: use the IP instead of the hostname (either found by Spark or specified by the user).

@yongjiaw
Contributor

yongjiaw commented Apr 5, 2015

You can probably just run "sudo hostname <ip>" to let Spark's block manager pick up the IP you want from the OS hostname.
For a more general solution, you may want to vote for this JIRA issue: https://issues.apache.org/jira/browse/SPARK-5113
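
To sanity-check that workaround, one could verify what the JVM now reports for the local host (a hypothetical check, not from this thread):

```scala
import java.net.InetAddress

// After running `sudo hostname 192.0.2.10` on the client machine (placeholder IP),
// the JVM should report the IP where it previously reported the old hostname.
println(InetAddress.getLocalHost.getHostName) // expected: "192.0.2.10"
```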

@marsishandsome
Author

@yongjiaw Thanks for the advice. As a workaround I set the hostname to the IP address and it works. I like the solution proposed in SPARK-5113.

@srowen
Member

srowen commented Apr 7, 2015

@marsishandsome is this change needed, then, or is that workaround actually the right way to do this?

@marsishandsome
Author

@srowen This change is not needed.

As SPARK-5113 notes, Spark provides many services, both internal and external. Spark should provide a way for users to specify a hostname or an IP address.
