Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-6440][CORE]Handle IPv6 addresses properly when constructing URI #5424

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
2 changes: 1 addition & 1 deletion core/src/main/scala/org/apache/spark/HttpServer.scala
Expand Up @@ -160,7 +160,7 @@ private[spark] class HttpServer(
throw new ServerStateException("Server is not started")
} else {
val scheme = if (securityManager.fileServerSSLOptions.enabled) "https" else "http"
s"$scheme://${Utils.localIpAddress}:$port"
s"$scheme://${Utils.localIpAddressURI}:$port"
}
}
}
Expand Up @@ -53,7 +53,7 @@ class LocalSparkCluster(
/* Start the Master */
val (masterSystem, masterPort, _, _) = Master.startSystemAndActor(localHostname, 0, 0, _conf)
masterActorSystems += masterSystem
val masterUrl = "spark://" + localHostname + ":" + masterPort
val masterUrl = "spark://" + Utils.localHostURI() + ":" + masterPort
val masters = Array(masterUrl)

/* Start the Workers */
Expand Down
Expand Up @@ -46,7 +46,7 @@ private[spark] object TestClient {
def main(args: Array[String]) {
val url = args(0)
val conf = new SparkConf
val (actorSystem, _) = AkkaUtils.createActorSystem("spark", Utils.localIpAddress, 0,
val (actorSystem, _) = AkkaUtils.createActorSystem("spark", Utils.localIpAddressHostname, 0,
conf = conf, securityManager = new SecurityManager(conf))
val desc = new ApplicationDescription("TestClient", Some(1), 512,
Command("spark.deploy.client.TestExecutor", Seq(), Map(), Seq(), Seq(), Seq()), "ignored")
Expand Down
2 changes: 1 addition & 1 deletion core/src/main/scala/org/apache/spark/ui/WebUI.scala
Expand Up @@ -48,7 +48,7 @@ private[spark] abstract class WebUI(
protected val handlers = ArrayBuffer[ServletContextHandler]()
protected val pageToHandlers = new HashMap[WebUIPage, ArrayBuffer[ServletContextHandler]]
protected var serverInfo: Option[ServerInfo] = None
protected val localHostName = Utils.localHostName()
protected val localHostName = Utils.localHostURI()
protected val publicHostName = Option(conf.getenv("SPARK_PUBLIC_DNS")).getOrElse(localHostName)
private val className = Utils.getFormattedClassName(this)

Expand Down
34 changes: 24 additions & 10 deletions core/src/main/scala/org/apache/spark/util/Utils.scala
Expand Up @@ -34,6 +34,7 @@ import scala.util.Try
import scala.util.control.{ControlThrowable, NonFatal}

import com.google.common.io.{ByteStreams, Files}
import com.google.common.net.InetAddresses
import com.google.common.util.concurrent.ThreadFactoryBuilder
import org.apache.commons.lang3.SystemUtils
import org.apache.hadoop.conf.Configuration
Expand Down Expand Up @@ -789,13 +790,14 @@ private[spark] object Utils extends Logging {
* Get the local host's IP address in dotted-quad format (e.g. 1.2.3.4).
* Note, this is typically not used from within core spark.
*/
lazy val localIpAddress: String = findLocalIpAddress()
lazy val localIpAddressHostname: String = getAddressHostName(localIpAddress)
lazy val localIpAddress: InetAddress = findLocalInetAddress()
lazy val localIpAddressHostname: String = localIpAddress.getHostName
lazy val localIpAddressURI: String = InetAddresses.toUriString(localIpAddress)

private def findLocalIpAddress(): String = {
private def findLocalInetAddress(): InetAddress = {
val defaultIpOverride = System.getenv("SPARK_LOCAL_IP")
if (defaultIpOverride != null) {
defaultIpOverride
InetAddress.getByName(defaultIpOverride)
} else {
val address = InetAddress.getLocalHost
if (address.isLoopbackAddress) {
Expand All @@ -806,23 +808,28 @@ private[spark] object Utils extends Logging {
// It's more proper to pick ip address following system output order.
val activeNetworkIFs = NetworkInterface.getNetworkInterfaces.toList
val reOrderedNetworkIFs = if (isWindows) activeNetworkIFs else activeNetworkIFs.reverse

for (ni <- reOrderedNetworkIFs) {
for (addr <- ni.getInetAddresses if !addr.isLinkLocalAddress &&
!addr.isLoopbackAddress && addr.isInstanceOf[Inet4Address]) {
val addresses = ni.getInetAddresses.toList
.filter { addr => !addr.isLinkLocalAddress && !addr.isLoopbackAddress }
if ( !addresses.isEmpty ) {
val addr = addresses.find( _.isInstanceOf[Inet4Address] ).getOrElse(addresses.head)
// because of Inet6Address.toHostName may add interface at the end if it knows about it
val strippedAddress = InetAddress.getByAddress(addr.getAddress)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tell me more about what this fixes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's all about finding our external IP address: old logic was to use external ipv4 || 127.0.0.1.
That transformes it to ipv4 || ipv6 || 127.0.0.1

Whenewer we don't have any ipv4 it's better to use ipv6 than 127.0.0.1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, though I don't think that was the thrust of this JIRA, which was simply to format IPv6 addresses correctly. I think the reasoning makes some sense though.

Nits: can you filterNot(addr => addr.isLinkLocalAddress || addr.isLoopbackAddress)
and if (addresses.notEmpty) {? and find(_.isInstanceOf[Inet4Address])? Just questions of style in this block.

It might be worth a comment to the effect of your reply above here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, i agree with that. Should i make additional commit or change existing one?
*nonEmpty

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just push more commits to this branch

// We've found an address that looks reasonable!
logWarning("Your hostname, " + InetAddress.getLocalHost.getHostName + " resolves to" +
" a loopback address: " + address.getHostAddress + "; using " + addr.getHostAddress +
" instead (on interface " + ni.getName + ")")
" a loopback address: " + address.getHostAddress + "; using " +
strippedAddress.getHostAddress + " instead (on interface " + ni.getName + ")")
logWarning("Set SPARK_LOCAL_IP if you need to bind to another address")
return addr.getHostAddress
return strippedAddress
}
}
logWarning("Your hostname, " + InetAddress.getLocalHost.getHostName + " resolves to" +
" a loopback address: " + address.getHostAddress + ", but we couldn't find any" +
" external IP address!")
logWarning("Set SPARK_LOCAL_IP if you need to bind to another address")
}
address.getHostAddress
address
}
}

Expand All @@ -845,6 +852,13 @@ private[spark] object Utils extends Logging {
customHostname.getOrElse(localIpAddressHostname)
}

/**
* Get the local machine's URI.
*/
def localHostURI(): String = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intended to be an IP address or host name? looks like IP address. But then some of the changes above are changing a host name to an IP address. That might be OK but it's not obvious? what's the motivation there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is named localHostURI in compliance with localHostName function name.
The main idea is to use ip if and only if we don't have customHostname ( like in localHostName ).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I'm wondering when you wouldn't want to use the "URI safe" name? they're identical in almost every case. Hm. I suppose the safer thing is to only format for URIs where, well, creating URIs. I'll have to dig into this more but generally want to minimize the change here and narrowly address the formatting issue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how to use "URI safe" name, something like localHostNameURISafe or what?
It's good to format URIs only in places where we need it, but currently we have more than one bug because of it.
However we can use something like InetAddresses.toUriString(InetAddress.getByName(Utils.localHostName())) to get URI in place without fixing Utils module, but i think that special URI method may prevent URI-based bugs in future due to its existence.

customHostname.getOrElse(localIpAddressURI)
}

def getAddressHostName(address: String): String = {
InetAddress.getByName(address).getHostName
}
Expand Down