Fixed a typo in the comments in RangePartitioner by dorx · Pull Request #1473 · apache/spark

dorx · 2014-07-18T00:25:03Z

Checked with Holden, the original author according to the log, who confirmed
that the code is right and the comment is wrong.

SparkQA · 2014-07-18T00:28:01Z

QA tests have started for PR 1473. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16796/consoleFull

SparkQA · 2014-07-18T02:07:34Z

QA results for PR 1473:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16796/consoleFull

rxin · 2014-07-18T06:26:11Z

core/src/main/scala/org/apache/spark/Partitioner.scala

rxin: Actually, 1000 seems like a pretty large number for a linear scan. How about 64 or 128?

dorx: That's why I noticed it in the first place. Would changing this number have unintended effects on people who are currently using the RangePartitioner?

rxin: I don't think so. If anything, it should make it faster.
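
For readers following the threshold discussion, here is a minimal sketch of the trade-off, written in Python rather than Spark's Scala. It is not the code in Partitioner.scala: the names `get_partition`, `range_bounds`, and `LINEAR_SCAN_THRESHOLD` are hypothetical, and the sketch only assumes that a key is mapped to the index of the first bound that is not smaller than it.

```python
from bisect import bisect_left

# Hypothetical constant; the thread debates 1000 versus 64 or 128.
LINEAR_SCAN_THRESHOLD = 128


def get_partition(range_bounds, key, threshold=LINEAR_SCAN_THRESHOLD):
    """Return the index of the first bound >= key, or len(range_bounds) if none.

    range_bounds must be sorted in ascending order.
    """
    if len(range_bounds) <= threshold:
        # Few bounds: a simple linear scan, O(n) comparisons.
        partition = 0
        while partition < len(range_bounds) and key > range_bounds[partition]:
            partition += 1
        return partition
    # Many bounds: binary search, O(log n) comparisons.
    return bisect_left(range_bounds, key)
```

Both branches are meant to return the same index for any key, so in this sketch lowering the threshold only changes how many comparisons are made, not which partition a key lands in.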

SparkQA · 2014-07-18T06:43:06Z

QA tests have started for PR 1473. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16816/consoleFull

SparkQA · 2014-07-18T07:58:10Z

QA results for PR 1473:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16816/consoleFull

dorx · 2014-07-18T18:52:02Z

A superficial look at the failed unit tests suggests that some Spark SQL optimizations rely on 1000 being the sequential scan threshold. @rxin @marmbrus

marmbrus · 2014-07-18T19:00:15Z

It appears to me that the range partitioner is not correctly using the provided ordering in the case where it uses a binary search.
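
A hedged illustration of how such a mismatch could surface, in Python rather than Spark's Scala. This is not the Spark implementation; the function names and the descending comparator are hypothetical. The sketch assumes the bounds are sorted under a caller-supplied ordering, and compares a linear scan that honors that ordering with a binary search that silently falls back to the natural ordering.

```python
from bisect import bisect_left


def descending(a, b):
    """Comparator for descending order: positive when a sorts after b."""
    return (b > a) - (b < a)


def linear_lookup(bounds, key, cmp):
    """Linear scan that honors the caller's ordering."""
    i = 0
    while i < len(bounds) and cmp(key, bounds[i]) > 0:
        i += 1
    return i


def binary_lookup_natural(bounds, key):
    """Binary search that ignores the caller's ordering (the suspected problem)."""
    return bisect_left(bounds, key)


def binary_lookup_with_ordering(bounds, key, cmp):
    """Binary search that uses the same comparator as the linear scan."""
    lo, hi = 0, len(bounds)
    while lo < hi:
        mid = (lo + hi) // 2
        if cmp(key, bounds[mid]) > 0:
            lo = mid + 1
        else:
            hi = mid
    return lo


bounds = [30, 20, 10]  # sorted under the descending ordering
print(linear_lookup(bounds, 25, descending))                # 1
print(binary_lookup_with_ordering(bounds, 25, descending))  # 1
print(binary_lookup_natural(bounds, 25))                    # 3
```

Under these assumptions the two search paths disagree only when the supplied ordering differs from the natural one, which would be consistent with tests passing while the linear scan is in use and failing once a lower threshold routes lookups through the binary search.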

rxin · 2014-07-20T07:20:19Z

I filed a JIRA: https://issues.apache.org/jira/browse/SPARK-2598

rxin · 2014-07-20T07:43:27Z

@dorx can you close this PR? #1500 includes the change here.

Fixed a typo in the comments in RangePartitioner

901a595

rxin reviewed Jul 18, 2014
changed naive search bound to 128 instead of 1000

6419568

dorx closed this Jul 20, 2014

dorx deleted the typoInPartitioner branch July 20, 2014 19:13

dorx wants to merge 2 commits into apache:master from dorx:typoInPartitioner