[SPARK-10613] [SPARK-10624] [SQL] Reduce LocalNode tests dependency on SQLContext #8764

Closed
wants to merge 10 commits into from

andrewor14
Contributor

Instead of relying on DataFrames to verify our answers, we can just use simple arrays. This significantly simplifies the test logic for LocalNodes and reduces a lot of code duplicated from SparkPlanTest.

This also fixes an additional issue SPARK-10624 where the output of TakeOrderedAndProjectNode is not actually ordered.
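As a rough illustration of the array-based testing idea (a sketch, not the code in this patch): the `Node` trait below is a hypothetical, simplified iterator-style interface, and this `DummyNode` and `FilterNode` are stand-ins written for the example rather than the real classes. The point is only that the expected answer can be computed from a plain array with ordinary collection operations, with no SQLContext or DataFrame involved.

```scala
import scala.collection.mutable.ArrayBuffer

object LocalNodeSketch {
  // Hypothetical, simplified iterator-style node; not the real LocalNode API.
  trait Node[T] {
    def open(): Unit
    def next(): Boolean
    def fetch(): T
    def close(): Unit

    // Drain the node into an in-memory sequence for easy comparison in tests.
    def collect(): Seq[T] = {
      val buf = ArrayBuffer.empty[T]
      open()
      try {
        while (next()) buf += fetch()
      } finally {
        close()
      }
      buf.toList
    }
  }

  // Leaf node that simply replays a fixed, in-memory sequence.
  class DummyNode[T](data: Seq[T]) extends Node[T] {
    private var iter: Iterator[T] = Iterator.empty
    private var current: Option[T] = None

    def open(): Unit = { iter = data.iterator }
    def next(): Boolean = {
      if (iter.hasNext) { current = Some(iter.next()); true }
      else { current = None; false }
    }
    def fetch(): T =
      current.getOrElse(throw new NoSuchElementException("fetch() called before next()"))
    def close(): Unit = { iter = Iterator.empty }
  }

  // Passes through only the child's elements that satisfy the predicate.
  class FilterNode[T](child: Node[T], predicate: T => Boolean) extends Node[T] {
    def open(): Unit = child.open()
    def next(): Boolean = {
      var found = false
      while (!found && child.next()) {
        found = predicate(child.fetch())
      }
      found
    }
    def fetch(): T = child.fetch()
    def close(): Unit = child.close()
  }
}
```

A test in this style builds the input as an array, runs it through the node, and compares against `input.filter(...)` directly.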

Andrew Or added 4 commits September 14, 2015 18:37
This commit refactors DummyNode to take in data from LocalRelation.
Then it rewrites FilterNodeSuite to make it read from DummyNode
instead of from a DataFrame. Future commits will cover other
LocalNode test suites.
…-cleanup

Conflicts:
	sql/core/src/main/scala/org/apache/spark/sql/execution/local/LocalNode.scala
	sql/core/src/test/scala/org/apache/spark/sql/execution/local/LocalNodeTest.scala
@andrewor14
Contributor Author

@zsxwing please have a look in the meantime. Thanks.

(new PoissonSampler[InternalRow](upperBound - lowerBound, useGapSamplingIfPossible = false),
// Use the seed for partition 0 like PartitionwiseSampledRDD to generate the same result
// of DataFrame
random.nextLong())
Contributor Author

@zsxwing I had to remove this to make testing deterministic. Looking at this further I still don't see the point of introducing another layer of randomness here. What change in behavior does this entail?

Member

I was using DataFrame.sample to test SampleNode and it mocked the behavior of DataFrame.sample(withReplacement = true). Since you don't use DataFrame to test it now, I agree that we can remove this tricky logic.
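To make the determinism point concrete, here is a minimal sketch of seeded sampling with replacement. It does not use Spark's `PoissonSampler`; the `poisson` and `sampleWithReplacement` helpers are hypothetical names for this example, and the Poisson draw uses Knuth's simple multiplication method. The only claim is that when a single seed drives all the randomness, repeated runs produce identical output, which is what a test without a DataFrame reference result needs.

```scala
import scala.util.Random

object SeededSamplingSketch {
  // Knuth's method for drawing a Poisson(lambda) count; adequate for small lambda.
  def poisson(lambda: Double, rng: Random): Int = {
    require(lambda >= 0.0, "lambda must be non-negative")
    val limit = math.exp(-lambda)
    var k = 0
    var p = rng.nextDouble()
    while (p > limit) {
      k += 1
      p *= rng.nextDouble()
    }
    k
  }

  // Hypothetical helper: each element is emitted a Poisson(fraction) number of times.
  // All randomness comes from the single seed, so the result is reproducible.
  def sampleWithReplacement[T](data: Seq[T], fraction: Double, seed: Long): Seq[T] = {
    val rng = new Random(seed)
    data.flatMap(item => Seq.fill(poisson(fraction, rng))(item))
  }
}
```

Calling `sampleWithReplacement` twice with the same data, fraction, and seed returns the same sequence, so a test can assert on it directly.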

@SparkQA

SparkQA commented Sep 15, 2015

Test build #42479 has finished for PR 8764 at commit 0030ba0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Member

zsxwing commented Sep 15, 2015

Fix TakeOrderedAndProjectSuite, where tests are currently ignored

I didn't notice that java.util.PriorityQueue.iterator doesn't guarantee the order. Could you fix this line? It should be queue.toArray.sorted(ord).iterator.
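A small sketch of the ordering issue, using `java.util.PriorityQueue` directly with boxed integers (the `sortedContents` helper is a hypothetical name for this example, and the real code operates on rows with its own ordering). The queue's iterator walks the backing heap in no guaranteed order, so the contents are copied out and sorted explicitly before being iterated.

```scala
import java.util.PriorityQueue

object PriorityQueueOrderSketch {
  // Hypothetical helper: returns the queue's contents in ascending order.
  // Iterating the queue directly would follow heap layout, not sorted order.
  def sortedContents(values: Seq[Int]): Seq[Int] = {
    val queue = new PriorityQueue[Integer]()
    values.foreach(v => queue.add(Int.box(v)))
    // Copy into an array first, then sort; mirrors queue.toArray.sorted(ord).iterator.
    val copied: Array[Int] = queue.toArray(new Array[Integer](0)).map(_.intValue)
    copied.sorted(Ordering.Int).iterator.toList
  }
}
```

Only `poll()` is guaranteed to return elements in priority order, but it drains the queue; sorting a copy leaves the queue intact.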

@andrewor14 andrewor14 changed the title from "[SPARK-10613] [SQL] Reduce LocalNode tests dependency on SQLContext" to "[SPARK-10613] [SPARK-10624] [SQL] Reduce LocalNode tests dependency on SQLContext" on Sep 15, 2015
@andrewor14
Contributor Author

Alright, as of the latest commit I believe this patch is basically ready. @zsxwing I fixed the take ordered issue we discussed, filed a new JIRA SPARK-10624 for it, and included the fix in this patch.

@andrewor14
Contributor Author

Looks like the relevant tests have already passed. Merging into master.

@zsxwing
Member

zsxwing commented Sep 16, 2015

Thanks @andrewor14, I will update #8769 as per your changes.

@asfgit asfgit closed this in 35a19f3 Sep 16, 2015
@andrewor14 andrewor14 deleted the sql-local-tests-cleanup branch September 16, 2015 00:26
@SparkQA

SparkQA commented Sep 16, 2015

Test build #42510 has finished for PR 8764 at commit 3bd5ac7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class LocalNode(conf: SQLConf) extends QueryPlan[LocalNode] with Logging

@SparkQA

SparkQA commented Sep 16, 2015

Test build #1761 has finished for PR 8764 at commit 3bd5ac7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class LocalNode(conf: SQLConf) extends QueryPlan[LocalNode] with Logging

@SparkQA

SparkQA commented Sep 16, 2015

Test build #1763 has finished for PR 8764 at commit 3bd5ac7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class LocalNode(conf: SQLConf) extends QueryPlan[LocalNode] with Logging

@SparkQA

SparkQA commented Sep 16, 2015

Test build #1762 has finished for PR 8764 at commit 3bd5ac7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class LocalNode(conf: SQLConf) extends QueryPlan[LocalNode] with Logging
