Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-13456][SQL][follow-up] lazily generate the outer pointer for case class defined in REPL #11931

Closed
wants to merge 4 commits into from

Conversation

cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

In #11410, we missed a corner case: define the inner class and use it in Dataset at the same time by using paste mode. For this case, the inner class and the Dataset are inside same line object, when we build the Dataset, we try to get outer pointer from line object, and it will fail because the line object is not initialized yet.

https://issues.apache.org/jira/browse/SPARK-13456?focusedCommentId=15209174&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15209174 is an example for this corner case.

This PR make the process of getting outer pointer from line object lazy, so that we can successfully build the Dataset and finish initializing the line object.

How was this patch tested?

new test in repl suite.

@cloud-fan
Copy link
Contributor Author

cc @marmbrus @liancheng @retronym

@liancheng
Copy link
Contributor

Haven't tried it, but for the test, does it work by simply passing a multi-line string starting with line :paste?

@SparkQA
Copy link

SparkQA commented Mar 24, 2016

Test build #54021 has finished for PR 11931 at commit 05cec9a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

@liancheng another problem is: how to simulate CTRL+D with string? We need to exit paste mode finally.

@cloud-fan
Copy link
Contributor Author

Did some experiment locally, we can use ; to combine multiple statements, it will generate same line object as paste mode does, and can reproduce this bug for master.

test("define case class and create Dataset together") {
val output = runInterpreter("local-cluster[1,1,1024]",
"""
|case class A(value: Int); sqlContext.createDataset(Seq(A(1)))(newProductEncoder[A])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use Seq(A(1)).toDS() here, but it exposes another bug of our spark shell: we can't find implicit for DatasetHolder. Will file a JIRA later.

@liancheng
Copy link
Contributor

@cloud-fan ASCII code of CTRL-D is 4. I tried to add a dummy test case in ReplSuite as follows and it works:

  test("foo") {
    val output = runInterpreter("local",
      """:paste
        |val a = 1
        |println(a)
      """.stripMargin + 4.toChar.toString
    )
    println(
      s"""###################
         |$output
         |###################
       """.stripMargin)
  }

@cloud-fan
Copy link
Contributor Author

It works! Many thanks to @liancheng !

@@ -59,6 +59,10 @@ class ReplSuite extends SparkFunSuite {
return out.toString
}

// Simulate the paste mode in Scala REPL.
def runInterpreterInPasteMode(master: String, input: String): String =
runInterpreter(master, ":paste\n" + input + 4.toChar)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment to explain that 4.toChar is CTRL-D.

@SparkQA
Copy link

SparkQA commented Mar 24, 2016

Test build #54036 has finished for PR 11931 at commit 752cdef.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • test(\"define case class and create Dataset together\")
    • |case class A(value: Int); sqlContext.createDataset(Seq(A(1)))(newProductEncoder[A])

@liancheng
Copy link
Contributor

LGTM pending Jenkins.

@SparkQA
Copy link

SparkQA commented Mar 24, 2016

Test build #54039 has finished for PR 11931 at commit 281758f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • test(\"define case class and create Dataset together with paste mode\")
    • |case class TestClass(value: Int)

@SparkQA
Copy link

SparkQA commented Mar 24, 2016

Test build #54043 has finished for PR 11931 at commit f7e1f25.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

retest this please.

@SparkQA
Copy link

SparkQA commented Mar 25, 2016

Test build #54130 has finished for PR 11931 at commit f7e1f25.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Mar 25, 2016

Test build #54146 has finished for PR 11931 at commit f7e1f25.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liancheng
Copy link
Contributor

Merging to master.

@asfgit asfgit closed this in e9b6e7d Mar 25, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants