[SparkRDMA] set up SPARK_LOCAL_IP for RoCE network #5
Comments
Hi, |
Since your hostname points to 172.31.101.104, Spark will bind to 172.31.101.104 by default, and SparkRDMA will follow. For example, on my system the master node's RDMA IP address is "192.168.1.12" and the RDMA network device name (as it appears in ifconfig) is "ens2", so this is how these lines work on my setup: The above assumes you are running in standalone mode; let me know if you are running in a different mode.
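A minimal sketch of the idea, not the exact lines from the setup described above: assuming the RDMA interface is "ens2" and that `ip -o -4 addr show dev ens2` prints a line of the usual form, the hypothetical helpers below extract the interface's IPv4 address and format an `export SPARK_LOCAL_IP=...` line that could be placed in `conf/spark-env.sh` on that node. The function names and the sample output are illustrative assumptions.

```python
import re
from typing import Optional


def ipv4_from_ip_output(output: str) -> Optional[str]:
    """Return the first IPv4 address found in `ip -o -4 addr show dev <iface>` output.

    Hypothetical helper: it only looks for an `inet a.b.c.d/prefix` token.
    """
    match = re.search(r"\binet\s+(\d{1,3}(?:\.\d{1,3}){3})/\d+", output)
    return match.group(1) if match else None


def spark_env_line(ip_addr: str) -> str:
    """Format the line that makes Spark bind to the given address."""
    return f"export SPARK_LOCAL_IP={ip_addr}"


# Illustrative output for the "ens2" device (assumed, not captured from a real host).
sample = "2: ens2    inet 192.168.1.12/24 brd 192.168.1.255 scope global ens2"

rdma_ip = ipv4_from_ip_output(sample)
if rdma_ip is not None:
    print(spark_env_line(rdma_ip))
```

Each node would need its own RDMA address here, so the line is generated per host rather than copied verbatim across the cluster.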
@yuvaldeg |
Happy to help! |
How can we set the IP for workers if we submit Spark jobs in YARN cluster mode? @yuvaldeg
Hello,
I am testing SparkRDMA with a Mellanox ConnectX-4 Lx card. I installed Spark 2.2.0 and downloaded the SparkTeraSort sample code. The sample code runs successfully with plain Spark 2.2.0; however, when I run the TeraSort code with the SparkRDMA plugin, it throws an error, shown in the following picture.
![errors1](https://user-images.githubusercontent.com/1718938/34857628-b09cb34e-f743-11e7-9b29-6ad9ca890bc1.png)
Do I need to upgrade libibverbs.so, or do I need to configure the RDMA network for Spark? Please help.