[SPARK-10613] [SPARK-10624] [SQL] Reduce LocalNode tests dependency on SQLContext #8764

andrewor14 · 2015-09-15T08:19:54Z

Instead of relying on DataFrames to verify our answers, we can just use simple arrays. This significantly simplifies the test logic for LocalNodes and removes much of the code duplicated from SparkPlanTest.

This also fixes an additional issue, SPARK-10624, in which the output of TakeOrderedAndProjectNode is not actually ordered.

This commit refactors DummyNode to take in data from LocalRelation. Then it rewrites FilterNodeSuite to make it read from DummyNode instead of from a DataFrame. Future commits will cover other LocalNode test suites.
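
To illustrate the idea of checking a node against a plain array, here is a minimal, self-contained sketch. The `Node` trait, `DummyNode`, and `FilterNode` below are simplified, hypothetical stand-ins written for this example; they are not the actual Spark classes, and they assume an iterator-style open/next/fetch/close contract.

```scala
object LocalNodeArrayTestSketch {
  // Simplified stand-in for an iterator-style operator (hypothetical, not the Spark trait).
  trait Node[T] {
    def open(): Unit
    def next(): Boolean
    def fetch(): T
    def close(): Unit

    /** Runs the node to completion and returns everything it produced. */
    def collect(): Seq[T] = {
      open()
      val out = Seq.newBuilder[T]
      try { while (next()) out += fetch() } finally close()
      out.result()
    }
  }

  // Leaf node that replays an in-memory array, so no DataFrame is needed as input.
  class DummyNode[T](data: Array[T]) extends Node[T] {
    private var index = -1
    def open(): Unit = { index = -1 }
    def next(): Boolean = { index += 1; index < data.length }
    def fetch(): T = data(index)
    def close(): Unit = ()
  }

  // Passes through only the child's rows that satisfy the predicate.
  class FilterNode[T](child: Node[T], predicate: T => Boolean) extends Node[T] {
    private var current: Option[T] = None
    def open(): Unit = child.open()
    def next(): Boolean = {
      current = None
      while (current.isEmpty && child.next()) {
        val row = child.fetch()
        if (predicate(row)) current = Some(row)
      }
      current.isDefined
    }
    def fetch(): T = current.get
    def close(): Unit = child.close()
  }

  def main(args: Array[String]): Unit = {
    val input = Array((1, "a"), (2, "b"), (3, "c"), (4, "d"))
    val isEven = (row: (Int, String)) => row._1 % 2 == 0
    // The expected answer comes from the array itself rather than from a DataFrame.
    val actual = new FilterNode(new DummyNode(input), isEven).collect()
    assert(actual == input.filter(isEven).toSeq)
  }
}
```

The expected result is computed with ordinary collection operations on the same array, which is what keeps a test like this independent of any SQLContext.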


andrewor14 · 2015-09-15T08:20:49Z

@zsxwing please have a look in the meantime. Thanks.

andrewor14 · 2015-09-15T08:21:42Z

sql/core/src/main/scala/org/apache/spark/sql/execution/local/SampleNode.scala

-        (new PoissonSampler[InternalRow](upperBound - lowerBound, useGapSamplingIfPossible = false),
-          // Use the seed for partition 0 like PartitionwiseSampledRDD to generate the same result
-          // of DataFrame
-          random.nextLong())


@zsxwing I had to remove this to make testing deterministic. Looking at this further, I still don't see the point of introducing another layer of randomness here. What change in behavior does this entail?

zsxwing · 2015-09-15

I was using DataFrame.sample to test SampleNode, and this logic mimicked the behavior of DataFrame.sample(withReplacement = true). Since DataFrame is no longer used to test it, I agree that we can remove this tricky logic.
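
As a rough illustration of why seeding matters for these tests, the sketch below uses a simple Bernoulli-style sampler driven by `java.util.Random`. It is a stand-in written for this example, not Spark's PoissonSampler, and the `sample` helper is hypothetical. It shows that a fixed seed reproduces the same sample, and that a seed derived through an extra `Random` has to be derived the same way by any test that wants to predict the output.

```scala
import java.util.Random

object SeededSamplingSketch {
  /** Keeps each element independently with probability `fraction` (hypothetical helper). */
  def sample[T](data: Seq[T], fraction: Double, seed: Long): Seq[T] = {
    val rng = new Random(seed)
    data.filter(_ => rng.nextDouble() < fraction)
  }

  def main(args: Array[String]): Unit = {
    val data = (1 to 100).toList
    val seed = 42L

    // Same seed and same input: both runs select exactly the same elements.
    assert(sample(data, 0.3, seed) == sample(data, 0.3, seed))

    // If the sampler's seed is derived through an extra Random, a test has to
    // repeat that derivation to compute the expected rows.
    val derivedSeed = new Random(seed).nextLong()
    assert(sample(data, 0.3, derivedSeed) == sample(data, 0.3, new Random(seed).nextLong()))
  }
}
```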

SparkQA · 2015-09-15T10:30:35Z

Test build #42479 has finished for PR 8764 at commit 0030ba0.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

zsxwing · 2015-09-15T14:40:04Z

Fix TakeOrderedAndProjectSuite, where tests are currently ignored

I hadn't noticed that java.util.PriorityQueue.iterator does not guarantee any particular order. Could you fix this line:

spark/sql/core/src/main/scala/org/apache/spark/sql/execution/local/TakeOrderedAndProjectNode.scala

Line 53 in e626ac5

iterator = queue.iterator

?
It should be queue.toArray.sorted(ord).iterator.
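
The point about iteration order can be shown with a small standalone sketch. It uses `java.util.PriorityQueue` directly with String elements; the `sortedContents` helper and the element type are assumptions made for this example and are not taken from TakeOrderedAndProjectNode.

```scala
import java.util.PriorityQueue

object PriorityQueueOrderSketch {
  // Ordering used both by the queue and by the final sort.
  // scala.math.Ordering extends java.util.Comparator, so the queue accepts it directly.
  val ord: Ordering[String] = Ordering.String

  /** Snapshot of the queue's contents, explicitly sorted by `ord` (hypothetical helper). */
  def sortedContents(queue: PriorityQueue[String]): Array[String] = {
    // The iterator walks the underlying heap in no particular order,
    // so the explicit sort is what establishes the ordering.
    val it = queue.iterator()
    val buf = Array.newBuilder[String]
    while (it.hasNext) buf += it.next()
    buf.result().sorted(ord)
  }

  def main(args: Array[String]): Unit = {
    val queue = new PriorityQueue[String](11, ord)
    Seq("e", "a", "d", "b", "c").foreach(s => queue.add(s))

    // Only the head of the queue is guaranteed to be the smallest element.
    assert(queue.peek() == "a")
    println(sortedContents(queue).mkString(", "))
  }
}
```

Sorting a snapshot leaves the queue itself untouched; repeatedly calling poll() would also yield elements in order, but it drains the queue.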


andrewor14 · 2015-09-15T22:51:46Z

Alright, as of the latest commit I believe this patch is basically ready. @zsxwing I fixed the take ordered issue we discussed, filed a new JIRA SPARK-10624 for it, and included the fix in this patch.

andrewor14 · 2015-09-16T00:24:18Z

Looks like the relevant tests have already passed. Merging into master.

zsxwing · 2015-09-16T00:25:20Z

Thanks, @andrewor14. I will update #8769 to match your changes.

SparkQA · 2015-09-16T00:39:00Z

Test build #42510 has finished for PR 8764 at commit 3bd5ac7.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- abstract class LocalNode(conf: SQLConf) extends QueryPlan[LocalNode] with Logging

SparkQA · 2015-09-16T01:25:54Z

Test build #1761 has finished for PR 8764 at commit 3bd5ac7.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- abstract class LocalNode(conf: SQLConf) extends QueryPlan[LocalNode] with Logging

SparkQA · 2015-09-16T01:28:32Z

Test build #1763 has finished for PR 8764 at commit 3bd5ac7.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- abstract class LocalNode(conf: SQLConf) extends QueryPlan[LocalNode] with Logging

SparkQA · 2015-09-16T01:30:31Z

Test build #1762 has finished for PR 8764 at commit 3bd5ac7.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- abstract class LocalNode(conf: SQLConf) extends QueryPlan[LocalNode] with Logging

Andrew Or added 4 commits September 14, 2015 18:37

Rewrite FilterNodeSuite to use LocalRelations

7740f5c


Intersect, Project, and Limit

10fc109

TakeOrderedAndProject + Sample

a93a260

Merge branch 'master' of github.com:apache/spark into sql-local-tests…

0030ba0

…-cleanup Conflicts: sql/core/src/main/scala/org/apache/spark/sql/execution/local/LocalNode.scala sql/core/src/test/scala/org/apache/spark/sql/execution/local/LocalNodeTest.scala

andrewor14 reviewed Sep 15, 2015

Andrew Or added 5 commits September 15, 2015 12:04

HashJoinNodeSuite

473e3eb

NestedLoopJoinNodeSuite

8364d23

ExpandNode

36fb038

Delete all obsolete code in LocalNodeTest

060e5e6

Fix TakeOrderedAndProjectNode

372ab5f

andrewor14 changed the title ~~[SPARK-10613] [SQL] Reduce LocalNode tests dependency on SQLContext~~ [SPARK-10613] [SPARK-10624] [SQL] Reduce LocalNode tests dependency on SQLContext Sep 15, 2015

Merge branch 'master' of github.com:apache/spark into sql-local-tests…

3bd5ac7

…-cleanup

andrewor14 force-pushed the sql-local-tests-cleanup branch from 62f0afd to 3bd5ac7 Compare September 15, 2015 22:50

asfgit closed this in 35a19f3 Sep 16, 2015

andrewor14 deleted the sql-local-tests-cleanup branch September 16, 2015 00:26

andrewor14 mentioned this pull request Oct 7, 2015

[SPARK-10887] [SQL] Build HashedRelation outside of HashJoinNode. #8953

Closed
