-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-38391][SPARK-38768][SQL][FOLLOWUP] Add comments for pushLimit
and pushTopN
of PushDownUtils
#36092
Conversation
…and pushTopN of PushDownUtils
ping @huaxingao cc @cloud-fan |
* | ||
* @return the tuple of Boolean. The first Boolean value represents whether to push down, and | ||
* the second Boolean value represents whether to push down partially which means to | ||
* keep the `Limit`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we be a little more clear and say something like
...the second Boolean value represents whether to push down partially, which means Spark will keep the Limit and do it again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
* | ||
* @return the tuple of Boolean. The first Boolean value represents whether to push down, and | ||
* the second Boolean value represents whether to push down partially which means to | ||
* keep the `Sort` and `Limit`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as the above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Merged to master/3.3.
…` and `pushTopN` of `PushDownUtils` ### What changes were proposed in this pull request? `pushLimit` and `pushTopN` of `PushDownUtils` returns tuple of boolean. It will be good to explain what the boolean value represents. ### Why are the changes needed? Make DS V2 API more friendly to developers. ### Does this PR introduce _any_ user-facing change? 'No'. Just update comments. ### How was this patch tested? N/A Closes #36092 from beliefer/SPARK-38391_SPARK-38768_followup. Authored-by: Jiaan Geng <beliefer@163.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit c4397cb) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun Thank you. @huaxingao Thank you for your review. |
…` and `pushTopN` of `PushDownUtils` ### What changes were proposed in this pull request? `pushLimit` and `pushTopN` of `PushDownUtils` returns tuple of boolean. It will be good to explain what the boolean value represents. ### Why are the changes needed? Make DS V2 API more friendly to developers. ### Does this PR introduce _any_ user-facing change? 'No'. Just update comments. ### How was this patch tested? N/A Closes apache#36092 from beliefer/SPARK-38391_SPARK-38768_followup. Authored-by: Jiaan Geng <beliefer@163.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
… to data source (#474) * [SPARK-38768][SQL] Remove `Limit` from plan if complete push down limit to data source ### What changes were proposed in this pull request? Currently, Spark supports push down limit to data source. If limit could pushed down and Data source only have one partition, DS V2 still do limit again. This PR want remove `Limit` from plan if complete push down limit to data source. ### Why are the changes needed? Improve performance. ### Does this PR introduce _any_ user-facing change? 'No'. New feature. ### How was this patch tested? Tests updated. Closes apache#36043 from beliefer/SPARK-38768. Authored-by: Jiaan Geng <beliefer@163.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> * [SPARK-38391][SPARK-38768][SQL][FOLLOWUP] Add comments for `pushLimit` and `pushTopN` of `PushDownUtils` ### What changes were proposed in this pull request? `pushLimit` and `pushTopN` of `PushDownUtils` returns tuple of boolean. It will be good to explain what the boolean value represents. ### Why are the changes needed? Make DS V2 API more friendly to developers. ### Does this PR introduce _any_ user-facing change? 'No'. Just update comments. ### How was this patch tested? N/A Closes apache#36092 from beliefer/SPARK-38391_SPARK-38768_followup. Authored-by: Jiaan Geng <beliefer@163.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> * [SPARK-37960][SQL][FOLLOWUP] Make the testing CASE WHEN query more reasonable ### What changes were proposed in this pull request? Some testing CASE WHEN queries are not carefully written and do not make sense. In the future, the optimizer may get smarter and get rid of the CASE WHEN completely, and then we loose test coverage. This PR updates some CASE WHEN queries to make them more reasonable. ### Why are the changes needed? future-proof test coverage. ### Does this PR introduce _any_ user-facing change? 'No'. ### How was this patch tested? N/A Closes apache#36125 from beliefer/SPARK-37960_followup3. Authored-by: Jiaan Geng <beliefer@163.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> * update spark version Co-authored-by: Jiaan Geng <beliefer@163.com>
… to data source (#474) * [SPARK-38768][SQL] Remove `Limit` from plan if complete push down limit to data source Currently, Spark supports push down limit to data source. If limit could pushed down and Data source only have one partition, DS V2 still do limit again. This PR want remove `Limit` from plan if complete push down limit to data source. Improve performance. 'No'. New feature. Tests updated. Closes apache#36043 from beliefer/SPARK-38768. Authored-by: Jiaan Geng <beliefer@163.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> * [SPARK-38391][SPARK-38768][SQL][FOLLOWUP] Add comments for `pushLimit` and `pushTopN` of `PushDownUtils` `pushLimit` and `pushTopN` of `PushDownUtils` returns tuple of boolean. It will be good to explain what the boolean value represents. Make DS V2 API more friendly to developers. 'No'. Just update comments. N/A Closes apache#36092 from beliefer/SPARK-38391_SPARK-38768_followup. Authored-by: Jiaan Geng <beliefer@163.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> * [SPARK-37960][SQL][FOLLOWUP] Make the testing CASE WHEN query more reasonable Some testing CASE WHEN queries are not carefully written and do not make sense. In the future, the optimizer may get smarter and get rid of the CASE WHEN completely, and then we loose test coverage. This PR updates some CASE WHEN queries to make them more reasonable. future-proof test coverage. 'No'. N/A Closes apache#36125 from beliefer/SPARK-37960_followup3. Authored-by: Jiaan Geng <beliefer@163.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> * update spark version Co-authored-by: Jiaan Geng <beliefer@163.com>
… to data source (#474) * [SPARK-38768][SQL] Remove `Limit` from plan if complete push down limit to data source Currently, Spark supports push down limit to data source. If limit could pushed down and Data source only have one partition, DS V2 still do limit again. This PR want remove `Limit` from plan if complete push down limit to data source. Improve performance. 'No'. New feature. Tests updated. Closes apache#36043 from beliefer/SPARK-38768. Authored-by: Jiaan Geng <beliefer@163.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> * [SPARK-38391][SPARK-38768][SQL][FOLLOWUP] Add comments for `pushLimit` and `pushTopN` of `PushDownUtils` `pushLimit` and `pushTopN` of `PushDownUtils` returns tuple of boolean. It will be good to explain what the boolean value represents. Make DS V2 API more friendly to developers. 'No'. Just update comments. N/A Closes apache#36092 from beliefer/SPARK-38391_SPARK-38768_followup. Authored-by: Jiaan Geng <beliefer@163.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> * [SPARK-37960][SQL][FOLLOWUP] Make the testing CASE WHEN query more reasonable Some testing CASE WHEN queries are not carefully written and do not make sense. In the future, the optimizer may get smarter and get rid of the CASE WHEN completely, and then we loose test coverage. This PR updates some CASE WHEN queries to make them more reasonable. future-proof test coverage. 'No'. N/A Closes apache#36125 from beliefer/SPARK-37960_followup3. Authored-by: Jiaan Geng <beliefer@163.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> * update spark version Co-authored-by: Jiaan Geng <beliefer@163.com>
What changes were proposed in this pull request?
pushLimit
andpushTopN
ofPushDownUtils
returns tuple of boolean. It will be good to explain what the boolean value represents.Why are the changes needed?
Make DS V2 API more friendly to developers.
Does this PR introduce any user-facing change?
'No'.
Just update comments.
How was this patch tested?
N/A