New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-20547][REPL]Throw RemoteClassLoadedError for transient errors in ExecutorClassLoader #24683
Conversation
private def getClassFileInputStreamFromSparkRPC(path: String): InputStream = { | ||
val channel = env.rpcEnv.openChannel(s"$classUri/$path") | ||
val channel = env.rpcEnv.openChannel(s"$classUri/${urlEncode(path)}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fixes an issue in the current ExecutorClassLoader: it doesn't encode the path.
Before this PR, when we try to fetch some class that contains special characters, it will always throw URISyntaxException: Illegal character in path ...
which will be converted to ClassNotFoundException
at last.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to cover this with a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's covered by create encoder in executors
. Reverting this line would cause this test fail..
Test build #105704 has finished for PR 24683 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, only minor things found.
|
||
/** | ||
* An error when we cannot load a class due to exceptions. We don't know if this class exists, so | ||
* throw a special one that's not [[LinkageError]] nor [[ClassNotFoundException]] to make JVM retry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: neither [[LinkageError]] nor
private def getClassFileInputStreamFromSparkRPC(path: String): InputStream = { | ||
val channel = env.rpcEnv.openChannel(s"$classUri/$path") | ||
val channel = env.rpcEnv.openChannel(s"$classUri/${urlEncode(path)}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to cover this with a test.
@@ -115,23 +127,31 @@ class ExecutorClassLoader( | |||
} | |||
} | |||
|
|||
// See org.apache.spark.network.server.TransportRequestHandler.processStreamRequest. | |||
private val STREAM_NOT_FOUND_REGEX = s"Stream '.*' was not found.".r.pattern |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this exists in TransportRequestHandler
+ here as well and the test mocked it I think nobody will ever realize the problem if the message in TransportRequestHandler
changed (only on cluster).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nonexistent class and transient errors should cause different errors
doesn't mock TransportRequestHandler
. It would fail if someone changed the error message.
I think nobody will ever realize the problem if the message in
TransportRequestHandler
changed (only on cluster).
Good point. I added a comment near the error message so that people can notice the error message string is used here.
throw new ClassNotFoundException(path, e) | ||
case NonFatal(e) => | ||
// scalastyle:off throwerror |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the decision point making this Error
instead of Exception
? Error
is more like an unrecoverable problem to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RemoteClassLoaderError
wraps both Error and Exception. It's better to make it an Error to not convert an Error to an Exception. In addition, since in Spark, most of codes use NonFatal
so this new Error will still be caught and handled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Test build #105800 has finished for PR 24683 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Thanks. Merging to master. |
assert(e.getMessage.contains("ThisIsAClassName")) | ||
// RemoteClassLoaderError must not be LinkageError nor ClassNotFoundException. Otherwise, | ||
// JVM will cache it and doesn't retry to load a class. | ||
assert(!e.isInstanceOf[LinkageError] && !e.isInstanceOf[ClassNotFoundException]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zsxwing I just noticed this generates a compiler warning. RemoteClassLoaderError
can't be an instance of either of those. Did you mean to check the cause
of this exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srowen this line is to ensure we will never make RemoteClassLoaderError
extend LinkageError
or ClassNotFoundException
in future. Can we turn the warning off for this check, or just rewrite it to something like assert(e.getClass.getSimpleName != "LinkageError" && e.getClass.getSimpleName != "ClassNotFoundException")
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. Ideally we would avoid the warning. You wouldn't be able to check its name but can check isAssignableFrom without a warning.
What changes were proposed in this pull request?
ExecutorClassLoader
'sfindClass
may fail to fetch a class due to transient exceptions. For example, when a task is interrupted, ifExecutorClassLoader
is fetching a class, you may seeInterruptedException
orIOException
wrapped byClassNotFoundException
, even if this class can be loaded. Then the result offindClass
will be cached by JVM, and later when the same class is being loaded in the same executor, it will just throw NoClassDefFoundError even if the class can be loaded.I found JVM only caches
LinkageError
andClassNotFoundException
. Hence in this PR, I changed ExecutorClassLoader to throwRemoteClassLoadedError
if we cannot get a response from driver.How was this patch tested?
New unit tests.