[SPARK-7150] SparkContext.range() and SQLContext.range() #6230

Closed
davies wants to merge 10 commits

davies commented May 18, 2015

This PR is based on #6081, thanks @adrian-wang.

Closes #6081

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

SparkQA commented May 18, 2015

Test build #32984 has started for PR 6230 at commit 789eda5.

:param numPartitions: the number of partitions of the DataFrame
:return: A new DataFrame

>>> sqlContext.range(1, 7, 2).collect()

Contributor

Can we add a test for large ints (i.e. > 32 bits)?

Contributor

Might make sense to have that in tests.py.

Contributor Author

Done.
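
One way to sanity-check the large-integer case raised above, without a running Spark cluster, is to compute the expected element count for a range whose bounds exceed 32 bits. This is only a sketch: `expected_range_count` is a hypothetical helper, not part of PySpark, and it assumes `range()` follows the half-open `[start, end)` semantics of Python's built-in `range`. A real test in tests.py would compare such a value against the count of the DataFrame returned by `sqlContext.range(...)`.

```python
def expected_range_count(start, end, step=1):
    """Count the integers in the half-open interval [start, end) taken every `step`.

    Hypothetical helper for illustration; not part of PySpark.
    """
    if step == 0:
        raise ValueError("step must not be zero")
    if step > 0:
        span = end - start
    else:
        # Mirror a descending range onto an ascending one.
        span, step = start - end, -step
    # Ceiling division; an empty or inverted range yields 0.
    return max(0, (span + step - 1) // step)


# Bounds well beyond the signed 32-bit limit (2**31 - 1).
big_end = 1 << 40
big_step = 1 << 39

print(expected_range_count(0, big_end, big_step))
print(expected_range_count(1, 7, 2))
```

Pure integer arithmetic is used here so that the helper itself never depends on a fixed-width integer type.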

rxin commented May 18, 2015

Can we update the title to "SparkContext.range() and SQLContext.range()" ?

rxin commented May 18, 2015

LGTM other than the unit test.

SparkQA commented May 18, 2015

Test build #32984 has finished for PR 6230 at commit 789eda5.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32984/

davies changed the title from "[SPARK-7150] add range() api" to "[SPARK-7150] SparkContext.range() and SQLContext.range()" on May 18, 2015

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32997/

rxin commented May 18, 2015

Jenkins, retest this please.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

SparkQA commented May 18, 2015

Test build #33000 has started for PR 6230 at commit d3ce5fe.

"""
if numPartitions is None:
    numPartitions = self._sc.defaultParallelism
jdf = self._ssql_ctx.range(int(start), int(end), int(step), int(numPartitions))

Contributor

This will make the parameters unpredictable and lead to exceptions.

Contributor Author

If start or end is invalid, you will get an exception anyway. By converting them in Python, we get a Python-style exception (failed to convert to int) rather than a Py4J exception (failed to find a method to call); the latter is much harder for most users to understand.

Contributor

You are right.
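
The trade-off discussed in this thread can be illustrated without Spark. The sketch below uses a hypothetical helper, `coerce_range_args` (not part of PySpark), to show that calling `int()` up front turns bad input into an ordinary Python `ValueError` or `TypeError`. It also shows one possible reading of the reviewer's concern: `int()` silently truncates floats, so `7.9` becomes `7`.

```python
def coerce_range_args(start, end, step=1, num_partitions=1):
    """Coerce every argument with int() before anything is handed to the JVM.

    Hypothetical helper for illustration; not part of PySpark.
    """
    return int(start), int(end), int(step), int(num_partitions)


# Values that convert, including a numeric string and a float that gets truncated.
print(coerce_range_args("1", 7.9, 2, 4))

# Invalid values fail in Python with a familiar exception type.
for bad in ("abc", None):
    try:
        coerce_range_args(bad, 10)
    except (ValueError, TypeError) as exc:
        print(type(exc).__name__)
```

Whether silent truncation is acceptable is a design choice; a stricter variant could reject non-integral floats instead of converting them.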

SparkQA commented May 18, 2015

Test build #33000 has finished for PR 6230 at commit d3ce5fe.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33000/

davies commented May 19, 2015

@adrian-wang @rxin Is this ready to merge?

rxin commented May 19, 2015

Merging. Thanks @adrian-wang and @davies.

asfgit closed this in c2437de on May 19, 2015

asfgit pushed a commit that referenced this pull request on May 19, 2015:
This PR is based on #6081, thanks adrian-wang.

Closes #6081

Author: Daoyuan Wang <daoyuan.wang@intel.com>
Author: Davies Liu <davies@databricks.com>

Closes #6230 from davies/range and squashes the following commits:

d3ce5fe [Davies Liu] add tests
789eda5 [Davies Liu] add range() in Python
4590208 [Davies Liu] Merge commit 'refs/pull/6081/head' of github.com:apache/spark into range
cbf5200 [Daoyuan Wang] let's add python support in a separate PR
f45e3b2 [Daoyuan Wang] remove redundant toLong
617da76 [Daoyuan Wang] fix safe marge for corner cases
867c417 [Daoyuan Wang] fix
13dbe84 [Daoyuan Wang] update
bd998ba [Daoyuan Wang] update comments
d3a0c1b [Daoyuan Wang] add range api()

(cherry picked from commit c2437de)
Signed-off-by: Reynold Xin <rxin@databricks.com>
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015