Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-7527][Core]Fix createNullValue to return the correct null values and REPL mode detection #6735

Closed
wants to merge 4 commits into from

Conversation

zsxwing
Copy link
Member

@zsxwing zsxwing commented Jun 10, 2015

The root cause of SPARK-7527 is createNullValue returns an incompatible value Byte(0) for char and boolean.

This PR fixes it and corrects the class name of the main class, and also adds an unit test to demonstrate it.

@zsxwing
Copy link
Member Author

zsxwing commented Jun 10, 2015

/cc @andrewor14 @srowen @preeze

@SparkQA
Copy link

SparkQA commented Jun 10, 2015

Test build #34557 has finished for PR 6735 at commit 5f92dc1.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Copy link
Member Author

zsxwing commented Jun 10, 2015

retest this please

@SparkQA
Copy link

SparkQA commented Jun 10, 2015

Test build #34564 has finished for PR 6735 at commit 5f92dc1.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -109,7 +109,16 @@ private[spark] object ClosureCleaner extends Logging {

private def createNullValue(cls: Class[_]): AnyRef = {
if (cls.isPrimitive) {
new java.lang.Byte(0: Byte) // Should be convertible to any primitive type
if (cls == java.lang.Boolean.TYPE) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use match syntax here?
I see why boolean needs a special case, but a byte widens to a char in Java so I don't think that's necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following code will fail:

case class X(x: Char)

object Foo1 extends App {
  val c = classOf[X]
  val x = c.getConstructors()(0)
  println(x)
  println(x.newInstance(new java.lang.Byte(0: Byte)))
}

I don't know why char is also a special case.

@srowen
Copy link
Member

srowen commented Jun 10, 2015

Interesting, I'd love to hear that this is the fix. The current failure is:

[info] - clean complicated nested non-serializable closures *** FAILED *** (15 milliseconds)
[info]   Expected exception org.apache.spark.SparkException to be thrown, but java.lang.reflect.InvocationTargetException was thrown. (ClosureCleanerSuite2.scala:98)
...
[info]   Cause: java.lang.NullPointerException:
[info]   at org.apache.spark.util.ClosureCleanerSuite2$$anonfun$36.<init>(ClosureCleanerSuite2.scala:512)

@zsxwing
Copy link
Member Author

zsxwing commented Jun 10, 2015

It's because enclosingObject in instantiateClass is null. The bytecodes in constructor will check the first parameter:

  public org.apache.spark.streaming.StreamingContext$$anonfun$binaryRecordsStream$1(org.apache.spark.streaming.StreamingContext, java.lang.String, int);
    Code:
       0: aload_1
       1: ifnonnull     12
       4: new           #119                // class java/lang/NullPointerException
       7: dup
       8: invokespecial #122                // Method java/lang/NullPointerException."<init>":()V
      11: athrow
      12: aload_0
      13: aload_1
      14: putfield      #27                 // Field $outer:Lorg/apache/spark/streaming/StreamingContext;
      17: aload_0
      18: aload_2
      19: putfield      #57                 // Field directory$2:Ljava/lang/String;
      22: aload_0
      23: iload_3
      24: putfield      #49                 // Field recordLength$1:I
      27: aload_0
      28: invokespecial #123                // Method scala/runtime/AbstractFunction0."<init>":()V
      31: return

Before this fix, Utils.isInInterpreter always returns true because of the wrong class name... Now the code path of Utils.isInInterpreter == false is activated.

I'm looking for a fix for this NullPointerException.

@zsxwing
Copy link
Member Author

zsxwing commented Jun 10, 2015

The issue is that if a closure is cleaned twice, when cleaning the closure at the second time, the $outer of this closure may have already been set to null. Then when we call instantiateClass for this closure, enclosingObject will be null because we cannot find a parent (https://github.com/zsxwing/spark/blob/SPARK-7527/core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala#L264).

Because the compiler always checks the first parameter in the constructor, so NullPointerException is thrown.

Since the code for Utils.isInInterpreter == false is not used and nobody reports any issue, how about just removing it and always using the code for Utils.isInInterpreter == true?

@srowen
Copy link
Member

srowen commented Jun 10, 2015

Heh, well it's supposed to be used when in the spark-shell I suppose. However I agree that it seems to have not been used for a long time, accidentally.

I don't know the history of this bit of code. I am not sure about the assumption in the comments that the constructor has no effects -- did it mean a case class? It sort of looks like an optimization, to try to call the constructor directly via reflection.

Well... I think it wouldn't hurt to try removing it as you say and see what happens?

@SparkQA
Copy link

SparkQA commented Jun 10, 2015

Test build #34595 has finished for PR 6735 at commit bbdb271.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 10, 2015

Test build #34594 has finished for PR 6735 at commit b0a0e7e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

case java.lang.Boolean.TYPE => new java.lang.Boolean(false)
case java.lang.Character.TYPE => new java.lang.Character('\0')
case java.lang.Void.TYPE =>
// This should not happen because `Foo(void x) {}` cannot be compiled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: cannot be compiled -> does not compile

@andrewor14
Copy link
Contributor

LGTM. All of my comments are documentation related. I will fix these on merge myself.

This is going into master.

@@ -1804,15 +1804,10 @@ private[spark] object Utils extends Logging {

lazy val isInInterpreter: Boolean = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just delete this now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep this for now. I remember there's some discussion saying that this should be abstracted to a shared method so it can be used elsewhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in fact #6734 uses it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, now I remember that. Now the logic is correct, hopefully.

@asfgit asfgit closed this in e90c9d9 Jun 10, 2015
@zsxwing zsxwing deleted the SPARK-7527 branch June 11, 2015 03:49
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
…lues and REPL mode detection

The root cause of SPARK-7527 is `createNullValue` returns an incompatible value `Byte(0)` for `char` and `boolean`.

This PR fixes it and corrects the class name of the main class, and also adds an unit test to demonstrate it.

Author: zsxwing <zsxwing@gmail.com>

Closes apache#6735 from zsxwing/SPARK-7527 and squashes the following commits:

bbdb271 [zsxwing] Use pattern match in createNullValue
b0a0e7e [zsxwing] Remove the noisy in the test output
903e269 [zsxwing] Remove the code for Utils.isInInterpreter == false
5f92dc1 [zsxwing] Fix createNullValue to return the correct null values and REPL mode detection
asfgit pushed a commit that referenced this pull request Oct 8, 2015
…lues and REPL mode detection

The root cause of SPARK-7527 is `createNullValue` returns an incompatible value `Byte(0)` for `char` and `boolean`.

This PR fixes it and corrects the class name of the main class, and also adds an unit test to demonstrate it.

Author: zsxwing <zsxwing@gmail.com>

Closes #6735 from zsxwing/SPARK-7527 and squashes the following commits:

bbdb271 [zsxwing] Use pattern match in createNullValue
b0a0e7e [zsxwing] Remove the noisy in the test output
903e269 [zsxwing] Remove the code for Utils.isInInterpreter == false
5f92dc1 [zsxwing] Fix createNullValue to return the correct null values and REPL mode detection
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants