Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SI-6502 Reenables loading jars into the running REPL (regression in 2.10) #4051

Merged
merged 8 commits into from Nov 18, 2014

Conversation

@heathermiller
Copy link
Member

@heathermiller heathermiller commented Oct 14, 2014

Fixes SI-6502, reenables loading jars into the running REPL (regression in 2.10). This PR allows adding a jar to the compile and runtime classpaths without resetting the REPL state (crucial for Spark SPARK-3257).

This follows the lead taken by @som-snytt in PR #3986, which differentiates two jar-loading behaviors (muddled by cp):

  • adding jars and replaying REPL expressions (using replay)
  • adding jars without resetting the REPL (deprecated cp, introduced require)
    This PR implements require (left unimplemented in #3986)

This PR is a simplification of a similar approach taken by @gkossakowski in #3884. In this attempt, we check first to make sure that a jar is only added if it only contains new classes/traits/objects, otherwise we emit an error. This differs from the old invalidation approach which also tracked deleted classpath entries.

@heathermiller
Copy link
Member Author

@heathermiller heathermiller commented Oct 14, 2014

Review by anyone interested. Especially those I pummeled with questions: @dragos, @phaller, @gkossakowski

@scala-jenkins scala-jenkins added this to the 2.11.5 milestone Oct 14, 2014
@heathermiller heathermiller changed the title Fixes SI-6502, reenables loading jars into the running REPL (regression in 2.10) SI-6502 Reenables loading jars into the running REPL (regression in 2.10) Oct 14, 2014
@retronym
Copy link
Member

@retronym retronym commented Oct 14, 2014

It is probably best to submit a squashed PR and point reviewers to the fine grained commits on a branch. That means less work for our build 🐱 :)

* @return A pair consisting of
* - a list of invalidated packages
* - a list of of packages that should have been invalidated but were not because
* they are system packages.

This comment has been minimized.

@retronym

retronym Oct 14, 2014
Member

Rather than documenting this, how about encapsulating the return values in a case class?

This comment has been minimized.

@heathermiller

heathermiller Oct 15, 2014
Author Member

Actually, there’s no need for return values, I realize. So this now returns Unit.

val arr = readFully(is)
val clazz = cloader.classOf(arr)
try {
Class.forName(clazz.getName(), false, intp.classLoader)

This comment has been minimized.

@retronym

retronym Oct 14, 2014
Member

Should document why we pass initialize = false here ("we don't want to execute class initializers")

This comment has been minimized.

@som-snytt

som-snytt Oct 15, 2014
Contributor

intp.classLoader.tryToLoadClass contrasts with tryToInitializeClass, in ScalaClassLoader.

This comment has been minimized.

@heathermiller

heathermiller Oct 15, 2014
Author Member

Done 'n documented, thanks!

while (entries.hasMoreElements()) {
val entry = entries.nextElement()
// skip directories and manifests
if (!entry.isDirectory() && !entry.getName().endsWith("MF")) {

This comment has been minimized.

@retronym

retronym Oct 14, 2014
Member

What about other resources in the JAR (e.g. .properties)? I suppose the code below will work out okay, but it seems cleaner invert this condition to entry.isFile && entry.getName.endsWith(".class").

This comment has been minimized.

@heathermiller

heathermiller Oct 15, 2014
Author Member

Thanks, done!

echo("Added '%s'. Your new classpath is:\n\"%s\"".format(f.path, totalClasspath))
replay()
intp.addUrlsToClassPath(f.toURI.toURL)
echo("Added '%s'. Your new classpath is:\n\"%s\"".format(f.path, intp.global.classPath.asClasspathString))

This comment has been minimized.

@retronym

retronym Oct 14, 2014
Member

This might be a scary wall of text that most users don't want to see, maybe demote the full classpath to debug logging.

This comment has been minimized.

@heathermiller

heathermiller Oct 15, 2014
Author Member

Sure thing. Spark spits out the whole classpath, so that’s why I kept it. But I’ve now got it echoing just what’s been added, and the full classpath is now left for debug.

@@ -221,7 +221,7 @@ class ILoop(in0: Option[BufferedReader], protected val out: JPrintWriter)
nullary("power", "enable power user mode", powerCmd),
nullary("quit", "exit the interpreter", () => Result(keepRunning = false, None)),
cmd("replay", "[options]", "reset the repl and replay all previous commands", replayCommand),
//cmd("require", "<path>", "add a jar or directory to the classpath", require), // TODO
cmd("require", "<path>", "add a jar or directory to the classpath", require),

This comment has been minimized.

@retronym

retronym Oct 14, 2014
Member

"add a jar to the classpath".

This comment has been minimized.

@heathermiller

heathermiller Oct 15, 2014
Author Member

Done :), thanks for catching this!

@heathermiller heathermiller force-pushed the heathermiller:repl-cp-fix2 branch from b765b76 to a835156 Oct 14, 2014
@heathermiller
Copy link
Member Author

@heathermiller heathermiller commented Oct 14, 2014

commits squashed :)

@retronym
Copy link
Member

@retronym retronym commented Oct 14, 2014

I'd like to see this built on top of our existing abstractions for JAR file handling (AbstractFile).

scala> def flatten(f: AbstractFile): Iterator[AbstractFile] = if (f.isClassContainer) f.iterator.flatMap(flatten) else Iterator(f)
flatten: (f: scala.reflect.io.AbstractFile)Iterator[scala.reflect.io.AbstractFile]

scala> val jar = AbstractFile.getDirectory(new java.io.File("/code/scala/build/pack/lib/scala-library.jar"))
jar: scala.reflect.io.AbstractFile = /code/scala/build/pack/lib/scala-library.jar

scala> flatten(jar).map(_.path).take(15).mkString("\n")
res7: String =
rootdoc.txt
META-INF/MANIFEST.MF
library.properties
scala/deprecatedName.class
scala/Function1$mcVI$sp$class.class
scala/Function0$mcJ$sp$class.class
scala/Predef$StringAdd$.class
scala/Function2$mcDID$sp$class.class
scala/Function2$mcJJD$sp.class
scala/Product2$mcJD$sp$class.class
scala/Product2$mcDD$sp$class.class
scala/Function2$mcFJI$sp$class.class
scala/PartialFunction$OrElse.class
scala/Product19$class.class
scala/Enumeration$Value.class

As an added bonus, you can then probably support directory based classpath entries without writing additional code:

scala> val dir = AbstractFile.getDirectory(new java.io.File("/code/scala/build/quick/classes/library"))
dir: scala.reflect.io.AbstractFile = PlainFile(PlainFile(), PlainFile(), PlainFile(PlainFile(PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile()), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(PlainFile(), PlainFile()), PlainFile(), PlainFile()), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), PlainFile(), Plain...


scala> flatten(dir).map(_.path).take(15).mkString("\n")
res6: String =
/code/scala/build/quick/classes/library/library.properties
/code/scala/build/quick/classes/library/rootdoc.txt
/code/scala/build/quick/classes/library/scala/annotation/Annotation.class
/code/scala/build/quick/classes/library/scala/annotation/bridge.class
/code/scala/build/quick/classes/library/scala/annotation/ClassfileAnnotation.class
/code/scala/build/quick/classes/library/scala/annotation/compileTimeOnly.class
/code/scala/build/quick/classes/library/scala/annotation/elidable$.class
/code/scala/build/quick/classes/library/scala/annotation/elidable.class
/code/scala/build/quick/classes/library/scala/annotation/implicitNotFound.class
/code/scala/build/quick/classes/library/scala/annotation/meta/beanGetter.class
/code/scala/build/quick/classes/library/scala/annotation/meta...
@@ -82,6 +86,9 @@ class IMain(@BeanProperty val factory: ScriptEngineFactory, initialSettings: Set
private var _classLoader: util.AbstractFileClassLoader = null // active classloader
private val _compiler: ReplGlobal = newCompiler(settings, reporter) // our private compiler

private trait ExposeAddUrl extends URLClassLoader { def addNewUrl(url: URL) = this.addURL(url) }
private var _runtimeClassLoader: URLClassLoader with ExposeAddUrl = null // wrapper exposing addURL

This comment has been minimized.

@som-snytt

som-snytt Oct 15, 2014
Contributor

ScalaClassLoader.URLClassLoader also widens addURL.

This comment has been minimized.

@heathermiller

heathermiller Oct 16, 2014
Author Member

Good catch, fixed, thanks!


/** Is given package class a system package class that cannot be invalidated?
*/
private def isSystemPackageClass(pkg: Symbol) =

This comment has been minimized.

@retronym

retronym Oct 15, 2014
Member

private def isSystemPackageClass(pkg: Symbol) = pkg == RootClass || (pkg.hasTransOwner(ScalaPackageClass) && !pkg.hasTransOwner(ScalaToolsPackageClass)

Noting that:

scala> ScalaPackageClass.hasTransOwner(ScalaPackageClass)
res4: Boolean = true

This comment has been minimized.

@heathermiller

heathermiller Oct 15, 2014
Author Member

Done, thanks!

pkg == RootClass ||
pkg == definitions.ScalaPackageClass || {
val pkgname = pkg.fullName
(pkgname startsWith "scala.") && !(pkgname startsWith "scala.tools")

This comment has been minimized.

@retronym

retronym Oct 15, 2014
Member

Why the exception for scala.tools?

This comment has been minimized.

@heathermiller

heathermiller Oct 15, 2014
Author Member

This is actually logic left over from @gkossakowski's original invalidateClasspathEntries code. I’m not certain why scala.tools is excepted. I can remove this – but thought I was abiding by some inexplanable wisdom haha.

if (elems.size == 1) elems.head
else new MergedClassPath(elems, classPath.context)
val oldEntries = mkClassPath(subst.keys)
val newEntries = mkClassPath(subst.values)

This comment has been minimized.

@retronym

retronym Oct 15, 2014
Member

Is it a problem if that oldEntries and newEntries might have a in a non-deterministic order (given that subst is not a linked map)?

This comment has been minimized.

@heathermiller

heathermiller Oct 16, 2014
Author Member

I’m not sure, actually. To be safe, I just changed subst to be a TreeMap instead and defined an implicit ordering. So, this should force things to be in a deterministic order now.

@heathermiller
Copy link
Member Author

@heathermiller heathermiller commented Oct 15, 2014

RE: @retronym:

I'd like to see this built on top of our existing abstractions for JAR file handling (AbstractFile).

Good idea. Done

Though it’ll still take a bit more work to be able to additionally deal with directories. This switch isn’t enough – I’m still having a ton of trouble reading in classes from dirs (for some reason, actual source is expected.)

@heathermiller
Copy link
Member Author

@heathermiller heathermiller commented Oct 16, 2014

All feedback addressed, I believe.

exists = true
} catch {
case _: ClassNotFoundException => /* do nothing */
}

This comment has been minimized.

@som-snytt

som-snytt Oct 16, 2014
Contributor

intp.classLoader tryToLoadClass class.getName foreach (_ => exists = true) is what I meant to suggest, or wait:

if ((intp.classLoader tryToLoadClass class.getName).isDefined) exists = true.

Did you know scalac came up with the idea for Seinfeld on tv? It inferred a Show[Nothing].

This comment has been minimized.

@heathermiller

heathermiller Oct 16, 2014
Author Member

That may be shorter, but is it clearer?

This comment has been minimized.

@som-snytt

som-snytt Oct 16, 2014
Contributor

I dunno. I hate for convenience API to go unused. Also, maybe from Effective Java, and from paulp dogma, two funcs foo and notfoo are preferred to f(foo = true), or respectively, Foo and NotFoo with f implementations. But to clarify, recently I said somewhere that using guava's ImmutableMap.of() is not clearer than Collections.emptyMap(), just because everyone knows the core API.

In general, though, an API that reduces braces is a win. Here, I have to wonder if catching CNFE suffices. The ScalaCL also catches SecurityEx. Should it? I hope so, but in any case I don't want to think about that right now.

Off the record, I also said to myself, I thought I just looked at loading a class from a byte array, do we really need to introduce a class for that? But I didn't have a 30-second response. That's why it's nice to promote existing enhancements like ScalaClassLoader. But I don't always prefer DRY over quick'n'dirty re-implementations that avoid extraneous dependencies.

Sorry for the tldr, it's a long Hulu commercial during once upon a time.

This comment has been minimized.

@heathermiller

heathermiller Oct 16, 2014
Author Member

Ge-done-t. :)

val clazz = cloader.classOf(arr)
if ((intp.classLoader tryToLoadClass clazz.getName).isDefined) exists = true
}
}

This comment has been minimized.

@retronym

retronym Oct 17, 2014
Member

This section can still do with some polish.

def classNameOf(classFile: AbstractFile): String = cloader.classOf(classFile.toByteArray).getName
def alreadyDefined(clsName: String) = intp.classLoader.tryToLoadClass(clsName).isDefined
val exists = entries.filter(_.hasExtension("class")).map(classNameOf).exists(alreadyDefined)

You can then delete readFully.

} else if (exists) {
echo("The path '" + f + "' cannot be loaded, because existing classpath entries conflict.")
} else echo("The path '" + f + "' doesn't seem to exist.")
}

This comment has been minimized.

@retronym

retronym Oct 17, 2014
Member

This conditional seems awkwardly arranged.

How about:

if (!f.exists) echo(s"The path '$f' doesn't seem to exist.")
else if (exists) echo(s"The path '$f' cannot be loaded, because existing classpath entries conflict.") // TODO tell me which one
else ...
@retronym
Copy link
Member

@retronym retronym commented Oct 17, 2014

How has this been tested? Are we able to add automated tests?

@heathermiller
Copy link
Member Author

@heathermiller heathermiller commented Oct 17, 2014

Manually, where I take a bunch of sometimes overlapping jars and load them in different ways and try to use them. If you can suggest a way for me to automatically test arbitrary jar loading (I add random jars to the test suite...? Not sure that makes sense) I'd be happy to add a test case.

@retronym
Copy link
Member

@retronym retronym commented Oct 17, 2014

Sometime manual testing is okay, but you should include a transcript of the test process in the pull request.

We do have a mechanism for using JARs in tests:

% find test  | grep desired
test/benchmarks/lib/jsr166_and_extra.jar.desired.sha1
test/files/codelib/code.jar.desired.sha1
test/files/lib/annotations.jar.desired.sha1
test/files/lib/enums.jar.desired.sha1
test/files/lib/genericNest.jar.desired.sha1
test/files/lib/jsoup-1.3.1.jar.desired.sha1
test/files/lib/macro210.jar.desired.sha1
test/files/lib/methvsfield.jar.desired.sha1
test/files/lib/nest.jar.desired.sha1
test/files/speclib/instrumented.jar.desired.sha1

One needs authorization to upload the JARs into the repository (in Github we just have the signature, and the ANT build downloads them). I haven't personally done this before and don't have the credentials.

I know that @xeno-by has done this, though. For example, in b10f45a, he depends on JAR in a new test case.

An alternative, which may be better in this case, is to use a run test case to programmatically generate the JAR files. Take a look at test/files/run/t6440b.scala for an example of compiling a few batches of code. You could use s"-d ${testOutput.toFile.getPath}/jar1.jar" as the options to a global instance to emit some code to a JAR.

Once the JARs were prepared, you could programatically drive the interpreter.

I know this represents another half a day of hacking about, but this feature seems subtle enough to warrant that.

@xeno-by
Copy link
Member

@xeno-by xeno-by commented Oct 17, 2014

Can tell you guys how to do this stuff. We could do this via email.

@mpociecha
Copy link
Member

@mpociecha mpociecha commented Oct 17, 2014

@heathermiller Do you mean certain concrete jars? Or maybe is only this overlapping important? Theoretically you could generate some jars (see https://github.com/mpociecha/scala/blob/flat-classpath/test/files/run/various-flat-classpath-types.scala).

@heathermiller
Copy link
Member Author

@heathermiller heathermiller commented Oct 24, 2014

Hey guys, thanks for all the pointers. I think all comments are now addressed (clean ups as suggested by @retronym applied, and 4 tests that programmatically generate jars are now included)

@retronym
Copy link
Member

@retronym retronym commented Oct 24, 2014

LGTM!

@retronym
Copy link
Member

@retronym retronym commented Nov 5, 2014

That sounds okay to me. But perhaps we could review and discuss that as followup pull request?

@heathermiller
Copy link
Member Author

@heathermiller heathermiller commented Nov 6, 2014

I can break it out if you'd like – though it complicates this companion/concurrent PR to Spark that depends on this PR being merged :/

@retronym
Copy link
Member

@retronym retronym commented Nov 6, 2014

No problems, feel free to add it as an extra commit to this PR.

Looking more closely at mergeUrlsIntoClassPath, I think there are a few improvements.

  • It only uses platform.classpath, how about it a member on ClassPath to make it usable without a JavaPlatform?
  • The if/else/else to create the right sort of AbstractFile looks like it should be factored out to a new factory method on AbstractFile. AbstractFile.getURL is currently unused within the compiler codebase, and might be a good candidate to generalize/document to meet the needs.
- Moves mergeUrlsIntoClassPath from Global into ClassPath
- Revises and documents AbstractFile.getURL
@heathermiller
Copy link
Member Author

@heathermiller heathermiller commented Nov 10, 2014

All good ideas! I just addressed your feedback and added a new commit with the suggested refactorings. Let me know if anything looks amiss!

Thanks :)

@gkossakowski
Copy link
Member

@gkossakowski gkossakowski commented Nov 10, 2014

Hey guys! I'm a bit late to the party (my Mac was down for almost 2 weeks) but I'll try to give my feedback in next 2-3 days.

@retronym
Copy link
Member

@retronym retronym commented Nov 10, 2014

LGTM

A note for next time: It would be easier to review if that were two commits: one that refactored the method in-situ, and a second that moved it. That way we can see the diff more clearly.

@retronym
Copy link
Member

@retronym retronym commented Nov 12, 2014

PLS REBUILD ALL

@scala-jenkins
Copy link

@scala-jenkins scala-jenkins commented Nov 12, 2014

(kitty-note-to-self: ignore 62669006)
🐱 Roger! Rebuilding pr-scala for 8192571, a84abd0, f65c430, 04ee526, d045dde, be3eb58, 9e56c7a, 24a2ef9. 🚨

@heathermiller
Copy link
Member Author

@heathermiller heathermiller commented Nov 12, 2014

True, good point. Duly noted :)

@heathermiller
Copy link
Member Author

@heathermiller heathermiller commented Nov 12, 2014

Thanks again for the detailed review!

@heathermiller
Copy link
Member Author

@heathermiller heathermiller commented Nov 17, 2014

Ping on this – just wondering when this will be merged? I ask because this PR is blocking Spark from moving to 2.11 (as things stood when I was in CA 1.5 weeks ago).

@heathermiller
Copy link
Member Author

@heathermiller heathermiller commented Nov 17, 2014

The sooner we get a SNAPSHOT or something, the better – gives us more time to have the new REPL support in the Spark 1.2 branch and more time to have this tested before 1.2 is released.

@retronym
Copy link
Member

@retronym retronym commented Nov 17, 2014

@gkossakowski Do you still plan to review this? If you don't have time in the next day or two, I propose we merge this as is.

@gkossakowski
Copy link
Member

@gkossakowski gkossakowski commented Nov 17, 2014

Unfortunately, I won't have time for detailed review. The infra work and dealing with backlog is consuming all of my time. I trust review powers of @retronym, though. :)

There's only one thing that concerns me: we are adding a bunch of logic back to Global. I know originally the classpath invalidation logic was in Global but that was a mistake. I think this logic should live in a separate class with properly spelled dependencies. I'd propose moving this logic into a class that is referenced from Global through composition. If Spark expects certain methods to live directly in Global, let's just add deprecated forwarders. This would be part of the quest for reducing Global's surface area.

The refactoring work can be done in separate PR (there's no need to block this one any further!) but I'd to see it happen. @heathermiller, WDYT?

@gkossakowski
Copy link
Member

@gkossakowski gkossakowski commented Nov 17, 2014

See also #4060 that is directly related to this work. It would benefit from clearly delineating the boundary between classpath invalidation and the rest of the compiler.

@heathermiller
Copy link
Member Author

@heathermiller heathermiller commented Nov 18, 2014

Maybe later, sure, but this PR has been blocking me for over a month now. Would be nice to merge it...

retronym added a commit that referenced this pull request Nov 18, 2014
SI-6502 Reenables loading jars into the running REPL (regression in 2.10)
@retronym retronym merged commit b2ba80a into scala:2.11.x Nov 18, 2014
1 check passed
1 check passed
default pr-scala Took 91 min.
Details
@heathermiller
Copy link
Member Author

@heathermiller heathermiller commented Nov 18, 2014

Thanks very much!! :)

lrytz added a commit to lrytz/scala that referenced this pull request Mar 22, 2016
:require was re-incarnated in scala#4051,
it seems to be used by the spark repl. This commit makes it work when
using the flat classpath representation.
lrytz added a commit to lrytz/scala that referenced this pull request Mar 22, 2016
:require was re-incarnated in scala#4051,
it seems to be used by the spark repl. This commit makes it work when
using the flat classpath representation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

8 participants
You can’t perform that action at this time.