Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPARK-5358: Rework the classloader impelementation. #4166

Closed
wants to merge 3 commits into from

Conversation

MattWhelan
Copy link

The fundamental issue is that you can't change the delegation scheme
without overriding loadClass (rather than findClass). And, if you
override loadClass, you kind of have to do it in Java, because you need
a static initializer call to register yourself as parallel-capable.

This is an alternative to PR #4165. That PR requires Java 1.7. This
one sticks with 1.6, and implements the classloader as a trait, because
it can.

Matt Whelan added 3 commits January 22, 2015 12:01
The fundamental issue is that you can't change the delegation scheme
without overriding loadClass (rather than findClass). And, if you
override loadClass, you kind of have to do it in Java, because you need
a static initializer call to register yourself as parallel-capable.
The fundamental issue is that you can't change the delegation scheme
without overriding loadClass (rather than findClass). And, if you
override loadClass, you kind of have to do it in Java, because you need
a static initializer call to register yourself as parallel-capable.

This is an alternative to PR apache#4165.  That PR requires Java 1.7.  This
one sticks with 1.6, and implements the classloader as a trait, because
it can.
@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@MattWhelan
Copy link
Author

BTW, I spent a few minutes pondering the deadlock scenario, and the delegation changes you see here. I'm pretty sure we're safe with 1.6-style coarse locking, because no class from a parent classloader will ever be able to refer to any of the child classloader's classes (statically - instances don't really matter). We don't have funky cyclic references between CLs, so I don't think we can have funky deadlocks.

@stephenh
Copy link
Contributor

stephenh commented Feb 4, 2015

@MattWhelan I believe you that loadClass is preferable to findClass, but can you articulate why? And maybe as a comment on this SO post so that future people that copy/paste from the 1st answer can know as well? :-)

http://stackoverflow.com/questions/5445511/how-do-i-create-a-parent-last-child-first-classloader-in-java-or-how-to-overr

Tangentially, I'd eventually like to port/copy/paste the Hadoop ApplicationClassLoader (which I believe they copy/pasted from Jetty) so that we can ignore any org.apache.spark.* classes that might accidentally be in the user's jar:

https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ApplicationClassLoader.java

I'm also half-tempted to copy/paste their ApplicationClassLoader into a separate utility project, so that going forward Jetty/Hadoop/Spark/etc. could all reuse the same tested functionality instead of copy/pasting the code around. But then I'd have to publish to maven central, it'd be another dependency to worry about, etc., etc...

* (cache, self, parent). This lets this CL "hide" or "override"
* class defs that also exist in the parent loader.
*/
private[spark] trait GreedyClassLoader extends ClassLoader {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than making this a trait, can you just modify ChildExecutorURLClassLoader to have this logic? We don't need it anywhere else at this point, and this is the main intended goal of ChildExecutorURLClassLoader (to do greedy loading like this).

@pwendell
Copy link
Contributor

pwendell commented Feb 8, 2015

If you bring this one up to date we may be able to slip it into 1.3. Just let me know, thanks!

@MattWhelan
Copy link
Author

Hey, sorry I was away for a bit.

After an extended discussion with Vanzin on his #3233, I'm convinced that PR covers the problem that this one solves. Since that's been merged, let's go with that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants