
Fix for TOREE-430 #129

Closed
wants to merge 1 commit into from

Conversation

Myllyenko (Contributor)

To fix TOREE-430, the REPL-generated classes must be made visible to a classloader.

The only working way I found is to set the spark.repl.class.outputDir system property (and, consequently, the corresponding Spark configuration property) to the path of the REPL compiler's output directory. To achieve that, I had to move the SparkIMain initialisation before the creation of the SparkContext.
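For background on why this ordering matters: a SparkConf constructed with its default loadDefaults = true copies every JVM system property whose key starts with "spark." into the configuration, so setting the system property before the SparkContext is created is enough for the value to reach the Spark configuration. A minimal standalone sketch of that mechanism (the directory path below is only a placeholder, not what Toree actually uses):

```scala
import org.apache.spark.SparkConf

object OutputDirPropagationSketch {
  def main(args: Array[String]): Unit = {
    // Stand-in for what the patched interpreter does once SparkIMain is initialised;
    // the directory path here is only an example value.
    System.setProperty("spark.repl.class.outputDir", "/tmp/repl-classes-example")

    // new SparkConf() (loadDefaults = true by default) copies every JVM system
    // property whose key starts with "spark." into the configuration, so a
    // SparkContext created after this point will carry the property.
    val conf = new SparkConf()
    println(conf.get("spark.repl.class.outputDir")) // prints /tmp/repl-classes-example
  }
}
```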

```diff
@@ -61,6 +61,7 @@ trait ScalaInterpreterSpecific { this: ScalaInterpreter =>
   ): SparkIMain = {
     val s = new SparkIMain(settings, out)
     s.initializeSynchronous()
+    System.setProperty("spark.repl.class.outputDir", s.getClassOutputDirectory.getAbsolutePath)
```
Member
Could we pass this as conf while creating the Spark Session?
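Roughly, the suggestion would look like the sketch below (hypothetical: the appName is made up, and it assumes the output directory is already known at the point where the session is built, which is exactly the sticking point discussed in the reply):

```scala
import org.apache.spark.sql.SparkSession

object SessionConfSketch {
  // Hypothetical sketch: `classOutputDir` stands for the REPL compiler's output
  // directory, which would need to be known before this point (the ordering
  // problem discussed in the reply below).
  def buildSession(classOutputDir: String): SparkSession =
    SparkSession.builder()
      .appName("toree-kernel")
      .config("spark.repl.class.outputDir", classOutputDir)
      .getOrCreate()
}
```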

Myllyenko (Contributor Author) · Aug 11, 2017

I've tried to figure out how to do that without heavy code modifications but haven't succeeded.

It's worth noting that Toree currently sets this property in a similar manner in the Scala 2.11 environment. In Scala 2.11 it can be passed directly to SparkConf with ease, because the value required for spark.repl.class.outputDir becomes known very early (when org.apache.spark.repl.Main is initialised).

In a Scala 2.10 environment, however, things are different: spark.repl.class.outputDir is only determined when org.apache.spark.repl.SparkIMain is initialised, that is, after the creation of ScalaInterpreter in Toree. So, to pass spark.repl.class.outputDir directly to SparkConf, we would have to initialise SparkIMain earlier than Toree's Kernel.

This applies to both Spark 2 and Spark 1.6.
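For illustration, here is a rough sketch of the Scala 2.11 situation described above (assuming Spark 2.x, where org.apache.spark.repl.Main exposes the output directory as a public outputDir field created when the Main object is initialised):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.repl.Main

object Scala211ConfSketch {
  // Rough sketch of the Scala 2.11 case (assumes Spark 2.x, where Main.outputDir
  // is created eagerly when the Main object is first referenced): the directory
  // is known before Toree creates its interpreter, so it can be placed into
  // SparkConf directly instead of going through a system property.
  def confWithReplOutputDir(): SparkConf =
    new SparkConf()
      .set("spark.repl.class.outputDir", Main.outputDir.getAbsolutePath)
}
```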

@lresende (Member) commented Oct 5, 2017

@Myllyenko Could you please rebase and update your PR so we can make sure we have a clean build before merging the update?

@Myllyenko (Contributor Author)

I started working on an improved version of this modification (with a cleaner split between the 2.10 and 2.11 specifics). Unfortunately, I haven't had enough time to finish it yet, but I hope it will be ready soon.

…erty to a path to an output directory of a REPL compiler in the Scala 2.10 environment.

(cherry picked from commit 72261a6)

# Conflicts:
#	scala-interpreter/src/main/scala/org/apache/toree/kernel/interpreter/scala/ScalaInterpreter.scala
@Myllyenko (Contributor Author) commented Oct 25, 2017

I've rebased this PR.
I'll open new PRs for the modifications I mentioned in my previous comment as they are unrelated to the issue resolved here.

@Myllyenko (Contributor Author)

This PR is no longer relevant as it is Scala 2.10-specific.

@Myllyenko closed this Nov 13, 2017