Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-20547][REPL]Throw RemoteClassLoadedError for transient errors in ExecutorClassLoader #24683

Closed
wants to merge 3 commits into from

Conversation

zsxwing
Copy link
Member

@zsxwing zsxwing commented May 22, 2019

What changes were proposed in this pull request?

ExecutorClassLoader's findClass may fail to fetch a class due to transient exceptions. For example, when a task is interrupted, if ExecutorClassLoader is fetching a class, you may see InterruptedException or IOException wrapped by ClassNotFoundException, even if this class can be loaded. Then the result of findClass will be cached by JVM, and later when the same class is being loaded in the same executor, it will just throw NoClassDefFoundError even if the class can be loaded.

I found JVM only caches LinkageError and ClassNotFoundException. Hence in this PR, I changed ExecutorClassLoader to throw RemoteClassLoadedError if we cannot get a response from driver.

How was this patch tested?

New unit tests.

private def getClassFileInputStreamFromSparkRPC(path: String): InputStream = {
val channel = env.rpcEnv.openChannel(s"$classUri/$path")
val channel = env.rpcEnv.openChannel(s"$classUri/${urlEncode(path)}")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes an issue in the current ExecutorClassLoader: it doesn't encode the path.

Before this PR, when we try to fetch some class that contains special characters, it will always throw URISyntaxException: Illegal character in path ... which will be converted to ClassNotFoundException at last.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to cover this with a test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's covered by create encoder in executors. Reverting this line would cause this test fail..

@SparkQA
Copy link

SparkQA commented May 22, 2019

Test build #105704 has finished for PR 24683 at commit 9395052.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • test(\"nonexistent class and transient errors should cause different errors\")
  • \"\"\"public class TestClassA implements java.io.Serializable
  • \"\"\"public class TestClassB implements java.io.Serializable
  • class ReplSuite extends SparkFunSuite with BeforeAndAfterAll
  • |case class Foo(s: String)

Copy link
Contributor

@gaborgsomogyi gaborgsomogyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, only minor things found.


/**
* An error when we cannot load a class due to exceptions. We don't know if this class exists, so
* throw a special one that's not [[LinkageError]] nor [[ClassNotFoundException]] to make JVM retry
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: neither [[LinkageError]] nor

private def getClassFileInputStreamFromSparkRPC(path: String): InputStream = {
val channel = env.rpcEnv.openChannel(s"$classUri/$path")
val channel = env.rpcEnv.openChannel(s"$classUri/${urlEncode(path)}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to cover this with a test.

@@ -115,23 +127,31 @@ class ExecutorClassLoader(
}
}

// See org.apache.spark.network.server.TransportRequestHandler.processStreamRequest.
private val STREAM_NOT_FOUND_REGEX = s"Stream '.*' was not found.".r.pattern
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this exists in TransportRequestHandler + here as well and the test mocked it I think nobody will ever realize the problem if the message in TransportRequestHandler changed (only on cluster).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nonexistent class and transient errors should cause different errors doesn't mock TransportRequestHandler. It would fail if someone changed the error message.

I think nobody will ever realize the problem if the message in TransportRequestHandler changed (only on cluster).

Good point. I added a comment near the error message so that people can notice the error message string is used here.

throw new ClassNotFoundException(path, e)
case NonFatal(e) =>
// scalastyle:off throwerror
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the decision point making this Error instead of Exception? Error is more like an unrecoverable problem to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RemoteClassLoaderError wraps both Error and Exception. It's better to make it an Error to not convert an Error to an Exception. In addition, since in Spark, most of codes use NonFatal so this new Error will still be caught and handled.

Copy link
Member

@gatorsmile gatorsmile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@SparkQA
Copy link

SparkQA commented May 26, 2019

Test build #105800 has finished for PR 24683 at commit 0b5cb48.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@gaborgsomogyi gaborgsomogyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@zsxwing
Copy link
Member Author

zsxwing commented May 28, 2019

Thanks. Merging to master.

@asfgit asfgit closed this in 04f142d May 28, 2019
@zsxwing zsxwing deleted the SPARK-20547-fix branch May 28, 2019 20:16
assert(e.getMessage.contains("ThisIsAClassName"))
// RemoteClassLoaderError must not be LinkageError nor ClassNotFoundException. Otherwise,
// JVM will cache it and doesn't retry to load a class.
assert(!e.isInstanceOf[LinkageError] && !e.isInstanceOf[ClassNotFoundException])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zsxwing I just noticed this generates a compiler warning. RemoteClassLoaderError can't be an instance of either of those. Did you mean to check the cause of this exception?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srowen this line is to ensure we will never make RemoteClassLoaderError extend LinkageError or ClassNotFoundException in future. Can we turn the warning off for this check, or just rewrite it to something like assert(e.getClass.getSimpleName != "LinkageError" && e.getClass.getSimpleName != "ClassNotFoundException")?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. Ideally we would avoid the warning. You wouldn't be able to check its name but can check isAssignableFrom without a warning.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants