[SPARK-19658] [SQL] Set NumPartitions of RepartitionByExpression In Parser #16988
Test build #73125 has finished for PR 16988 at commit
Test build #73130 has finished for PR 16988 at commit
how about we set the
(" sort by a, b desc", basePlan.sortBy('a.asc, 'b.desc)),
(" distribute by a, b", basePlan.distribute('a, 'b)()),
(" distribute by a sort by b", basePlan.distribute('a)().sortBy('b.asc)),
(" cluster by a, b", basePlan.distribute('a, 'b)().sortBy('a.asc, 'b.asc))
These three test cases are moved to SparkSqlParserSuite.scala
@cloud-fan Great idea!
Test build #73286 has finished for PR 16988 at commit
Test build #73288 has finished for PR 16988 at commit
ctx: QueryOrganizationContext,
expressions: Seq[Expression],
query: LogicalPlan): LogicalPlan = {
RepartitionByExpression(expressions, query, conf.numShufflePartitions)
nit: indent is wrong here
Test build #73289 has finished for PR 16988 at commit
LGTM
Test build #73293 has finished for PR 16988 at commit
thanks, merging to master!
…rser

### What changes were proposed in this pull request?

Currently, if `NumPartitions` is not set in RepartitionByExpression, we will set it using `spark.sql.shuffle.partitions` during Planner. However, this is not following the general resolution process. This PR is to set it in `Parser` and then `Optimizer` can use the value for plan optimization.

### How was this patch tested?

Added a test case.

Author: Xiao Li <gatorsmile@gmail.com>

Closes apache#16988 from gatorsmile/resolveRepartition.
What changes were proposed in this pull request?
Currently, if `NumPartitions` is not set in RepartitionByExpression, we will set it using `spark.sql.shuffle.partitions` during Planner. However, this is not following the general resolution process. This PR is to set it in `Parser` and then `Optimizer` can use the value for plan optimization.
How was this patch tested?
Added a test case.
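
To illustrate the idea in the description, here is a minimal, self-contained sketch in plain Scala. It is a toy model, not Spark's Catalyst code: `Relation`, `Conf`, and `PlanBuilder` are hypothetical stand-ins, expressions are simplified to strings, and only the name `RepartitionByExpression` and the `conf.numShufflePartitions` lookup mirror the snippet reviewed above. The point it tries to show is that the plan node carries a concrete partition count from the moment it is built, so later phases never see an unset value.

```scala
// Toy model only: these types imitate, but are NOT, Spark's Catalyst classes.
sealed trait LogicalPlan

// Hypothetical leaf node standing in for a table scan.
case class Relation(name: String) extends LogicalPlan

// numPartitions is a plain Int here: it must be known when the node is created.
case class RepartitionByExpression(
    partitionExpressions: Seq[String],
    child: LogicalPlan,
    numPartitions: Int) extends LogicalPlan {
  require(numPartitions > 0, s"Number of partitions ($numPartitions) must be positive.")
}

// Hypothetical stand-in for the session configuration
// (the value behind spark.sql.shuffle.partitions).
case class Conf(numShufflePartitions: Int)

// Hypothetical stand-in for the parser's plan builder.
class PlanBuilder(conf: Conf) {
  // Resolve the partition count while building the plan,
  // rather than deferring the lookup to a later planning phase.
  def withRepartitionByExpression(
      expressions: Seq[String],
      query: LogicalPlan): LogicalPlan =
    RepartitionByExpression(expressions, query, conf.numShufflePartitions)
}

object Demo {
  def main(args: Array[String]): Unit = {
    val builder = new PlanBuilder(Conf(numShufflePartitions = 200))
    val plan = builder.withRepartitionByExpression(Seq("a", "b"), Relation("t"))
    // The node already holds the configured value right after "parsing".
    assert(plan == RepartitionByExpression(Seq("a", "b"), Relation("t"), 200))
    println(plan)
  }
}
```

In this sketch an optimizer rule could pattern-match on `RepartitionByExpression(_, _, n)` and rely on `n` being set, which is the benefit the description attributes to moving the assignment into the parser.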