[SPARK-13419][SQL] Update SubquerySuite to use checkAnswer for validation #12269

Closed
wants to merge 1 commit into from

lresende
Member

@lresende lresende commented Apr 9, 2016

What changes were proposed in this pull request?

Change SubquerySuite to validate test results using the checkAnswer helper method.
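
To illustrate the kind of validation a helper like this centralizes, here is a minimal, self-contained sketch of an order-insensitive result comparison. It is a simplified stand-in and not Spark's checkAnswer implementation; `AnswerCheck`, `SimpleRow`, and `compare` are hypothetical names introduced only for this example, and sorting rows by their string form is an assumption made to keep the sketch short.

```scala
// Simplified stand-in for an order-insensitive result check.
// This is NOT Spark's checkAnswer; all names here are hypothetical.
object AnswerCheck {
  // A row is modelled as a plain sequence of values.
  type SimpleRow = Seq[Any]

  // Sort rows by their string form so row order does not affect the comparison.
  private def normalize(rows: Seq[SimpleRow]): Seq[SimpleRow] =
    rows.sortBy(_.mkString("|"))

  // Returns None when the results match, or Some(message) describing the mismatch.
  def compare(actual: Seq[SimpleRow], expected: Seq[SimpleRow]): Option[String] = {
    val a = normalize(actual)
    val e = normalize(expected)
    if (a == e) None
    else Some(s"Results do not match.\nExpected: $e\nActual:   $a")
  }
}
```

With this sketch, `AnswerCheck.compare(Seq(Seq(4)), Seq(Seq(4)))` returns `None`, while a mismatch returns a message that lists both result sets, which tends to be easier to diagnose than a bare assertion failure.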

How was this patch tested?

Existing tests

@SparkQA

SparkQA commented Apr 9, 2016

Test build #55416 has finished for PR 12269 at commit d0652c9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 13, 2016

Test build #55691 has finished for PR 12269 at commit e58cf23.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@lresende lresende changed the title [SPARK-13419] Update SubquerySuite to use checkAnswer to validate res… [SPARK-13419][SQL] Update SubquerySuite to use checkAnswer to validate res… Apr 16, 2016
@davies
Contributor

davies commented Apr 19, 2016

cc @liancheng, who knows more about SQL generation

@lresende
Member Author

@davies @liancheng
While converting these tests, I noticed that the following test:

test("uncorrelated scalar subquery on a DataFrame generated query") {
  val df = Seq((1, "one"), (2, "two"), (3, "three")).toDF("key", "value")
  df.registerTempTable("subqueryData")

  checkAnswer(
    sql("select (select key from subqueryData where key > 2 order by key limit 1) + 1"),
    Array(Row(4))
  )
}

fails because the two plans being compared differ, e.g.:

Project [(subquery#9 + 1) AS (scalarsubquery() + 1)#11]
:  +- SubqueryAlias subquery#9
:     +- GlobalLimit 1
:        +- LocalLimit 1
:           +- Sort [key#5 ASC], true
:              +- Project [key#5]
:                 +- Filter (key#5 > 2)
:                    +- SubqueryAlias subqueryData
:                       +- Project [_1#2 AS key#5,_2#3 AS value#6]
:                          +- LocalRelation [_1#2,_2#3], [[0,1,1800000003,656e6f],[0,2,1800000003,6f7774],[0,3,1800000005,6565726874]]
+- OneRowRelation$

Project [(subquery#9 + 1) AS (scalarsubquery() + 1)#11]
:  +- SubqueryAlias subquery#9
:     +- GlobalLimit 1
:        +- LocalLimit 1
:           +- Sort [key#5 ASC], true
:              +- Project [key#5]
:                 +- Filter (key#5 > 2)
:                    +- SubqueryAlias subqueryData
:                       +- Project [_1#2 AS key#5,_2#3 AS value#6]
:                          +- LocalRelation [_1#2,_2#3], [null,null,null]
+- OneRowRelation$

While debugging, I tracked this down to TreeNode.parseToJson, where the missing data falls into the case _ branch and is not properly added to the JSON. The actual fix might belong somewhere else, though.
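
The symptom above (row data present in one plan, [null,null,null] in the other) is the kind of result a catch-all case in a serializer tends to produce. The following is a toy sketch of that failure mode only; it is not Spark's TreeNode.parseToJson, and `CatchAllDemo`, `toJsonValue`, and `toJsonArray` are hypothetical names made up for this illustration.

```scala
// Toy illustration only: this is not Spark's TreeNode code.
object CatchAllDemo {
  // Hypothetical serializer: known field types are encoded, anything else
  // falls into the catch-all case and is emitted as null.
  def toJsonValue(field: Any): String = field match {
    case i: Int     => i.toString
    case s: String  => "\"" + s + "\""
    case b: Boolean => b.toString
    case _          => "null" // unhandled types silently lose their data
  }

  // Encodes a sequence of fields as a JSON-like array string.
  def toJsonArray(fields: Seq[Any]): String =
    fields.map(toJsonValue).mkString("[", ",", "]")
}
```

For example, `CatchAllDemo.toJsonArray(Seq(1, "one", Array[Byte](1, 2)))` yields `[1,"one",null]`: the byte array has no dedicated case, so its contents are dropped without any error being raised.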

Any thoughts on this issue?

@lresende
Member Author

@davies @liancheng it looks like this issue has been resolved after rebasing onto the latest code. I am going to wait for a build to complete to double-check.

@SparkQA

SparkQA commented Apr 20, 2016

Test build #56307 has finished for PR 12269 at commit 79b49b7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@lresende lresende changed the title [SPARK-13419][SQL] Update SubquerySuite to use checkAnswer to validate res… [SPARK-13419][SQL] Update SubquerySuite to use checkAnswer for validation Apr 20, 2016
@davies
Contributor

davies commented Apr 20, 2016

LGTM,
Merging this into master, thanks!
