[SPARK-20281][SQL] Print the identical Range parameters of SparkContext APIs and SQL in explain #17670
Conversation
Test build #75900 has finished for PR 17670 at commit
I think the change should rather be here where the built-in table-valued function
@gatorsmile WDYT?
val scRange = sqlContext.range(10)
val sqlRange = sqlContext.sql("SELECT * FROM range(10)")
assert(explainStr(scRange) === explainStr(sqlRange))
}
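The `explainStr` helper used in the test snippet above is not shown in this excerpt; one plausible sketch (a hypothetical implementation for illustration, not the PR's actual code) would capture the physical-plan string of a Dataset so two plans can be compared textually:

```scala
import org.apache.spark.sql.Dataset

// Hypothetical helper: render a Dataset's physical plan as a string,
// so the SparkContext-API and SQL variants can be compared with ===.
def explainStr(df: Dataset[_]): String =
  df.queryExecution.executedPlan.toString
```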
I think this test case is not needed.
I'll revert
As @jaceklaskowski said, it would be good to fill
okay, I'll fix soon. Thanks!
Looking around the related code, I think we cannot easily set
Test build #75971 has started for PR 17670 at commit
@@ -527,7 +527,7 @@ class SparkSession private(
  @Experimental
  @InterfaceStability.Evolving
  def range(start: Long, end: Long, step: Long): Dataset[java.lang.Long] = {
    range(start, end, step, numPartitions = sparkContext.defaultParallelism)
How about reverting the changes in this file? We can make the PR small enough. We can backport it to 2.2
ya, good to me. I'll revert
Ok, I am fine to keep the existing way.
LGTM except a comment.
Test build #75974 has started for PR 17670 at commit
better to open another pr to backport into v2.2?
Jenkins, retest this please. |
Test build #75975 has finished for PR 17670 at commit
ping |
Thanks! Merging to master/2.2
…xt APIs and SQL in explain

## What changes were proposed in this pull request?

This pr modified code to print the identical `Range` parameters of SparkContext APIs and SQL in `explain` output. In the current master, they internally use `defaultParallelism` for `splits` by default though, they print different strings in explain output;

```
scala> spark.range(4).explain
== Physical Plan ==
*Range (0, 4, step=1, splits=Some(8))

scala> sql("select * from range(4)").explain
== Physical Plan ==
*Range (0, 4, step=1, splits=None)
```

## How was this patch tested?

Added tests in `SQLQuerySuite` and modified some results in the existing tests.

Author: Takeshi Yamamuro <yamamuro@apache.org>

Closes #17670 from maropu/SPARK-20281.

(cherry picked from commit 48d760d)
Signed-off-by: Xiao Li <gatorsmile@gmail.com>
What changes were proposed in this pull request?

This PR modifies the code to print the identical `Range` parameters of SparkContext APIs and SQL in `explain` output. In the current master, although both internally use `defaultParallelism` for `splits` by default, they print different strings in explain output.

How was this patch tested?

Added tests in `SQLQuerySuite` and modified some results in the existing tests.
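The behavior discussed above can be reproduced with a minimal sketch (assuming a local `SparkSession`; the exact `splits` value in the plan depends on the machine's core count):

```scala
import org.apache.spark.sql.SparkSession

object RangeExplainDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("range-explain-demo")
      .getOrCreate()

    // With this change, both forms print the same splits value in the
    // physical plan, e.g. *Range (0, 4, step=1, splits=Some(8)),
    // instead of splits=None for the SQL table-valued function.
    spark.range(4).explain()
    spark.sql("SELECT * FROM range(4)").explain()

    spark.stop()
  }
}
```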