[SPARK-33372][SQL] Fix InSet bucket pruning #30279
Kubernetes integration test starting
Kubernetes integration test status success
Test build #130728 has finished for PR 30279 at commit
Thanks, merging to master/3.0!
### What changes were proposed in this pull request?

This PR fixes `InSet` bucket pruning, because the values of an `InSet` should not be `Literal`s:
https://github.com/apache/spark/blob/cbd3fdea62dab73fc4a96702de8fd1f07722da66/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala#L253-L255

### Why are the changes needed?

Bug fix.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Unit test and manual test:

```scala
// Write a table bucketed into 100 buckets on column "a".
spark.sql("select id as a, id as b from range(10000)").write.bucketBy(100, "a").saveAsTable("t")
// Filter on the bucket column with an 11-value IN list.
spark.sql("select * from t where a in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)").show
```

| Before this PR | After this PR |
| -- | -- |
| ![image](https://user-images.githubusercontent.com/5399861/98380788-fb120980-2083-11eb-8fae-4e21ad873e9b.png) | ![image](https://user-images.githubusercontent.com/5399861/98381095-5ba14680-2084-11eb-82ca-2d780c85305c.png) |

Closes #30279 from wangyum/SPARK-33372.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 69799c5)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Late LGTM. Nice catch.
Hi, @cloud-fan, @wangyum, and @maropu.
@wangyum can you open a PR for 2.4? |
### What changes were proposed in this pull request?

This is a backport of #30279.

This PR fixes `InSet` bucket pruning, because the values of an `InSet` should not be `Literal`s:
https://github.com/apache/spark/blob/cbd3fdea62dab73fc4a96702de8fd1f07722da66/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala#L253-L255

### Why are the changes needed?

Bug fix.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Unit test.

Closes #30308 from wangyum/SPARK-33372-2.4.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Yuming Wang <yumwang@ebay.com>
What changes were proposed in this pull request?
This PR fixes `InSet` bucket pruning, because the values of an `InSet` should not be `Literal`s:
spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala, lines 253 to 255 in cbd3fde
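To make the failure mode concrete, here is a minimal sketch in plain Scala that does not depend on Spark. It assumes that the pre-fix pruning code only accepted an `InSet` whose elements were all `Literal` expressions, which is how the description above reads. `Attr`, `Lit`, `InSetExpr`, `bucketId`, `bucketsBefore`, and `bucketsAfter` are hypothetical stand-ins for illustration, not Spark's actual classes or methods.

```scala
// Simplified stand-ins for Catalyst expressions; these are NOT Spark's real classes.
sealed trait Expr
final case class Attr(name: String) extends Expr
final case class Lit(value: Any) extends Expr
// As in the description above, the set holds plain values rather than literal expressions.
final case class InSetExpr(child: Expr, hset: Set[Any]) extends Expr

object BucketPruningSketch {
  // Hypothetical bucket function used only for this sketch; Spark's real
  // bucket hashing is different.
  def bucketId(value: Any, numBuckets: Int): Int =
    Math.floorMod(value.hashCode, numBuckets)

  // Assumed pre-fix behavior: the guard requires every element to be a Lit.
  // A non-empty set of plain values never satisfies it, so no buckets are pruned.
  def bucketsBefore(e: Expr, col: String, numBuckets: Int): Option[Set[Int]] = e match {
    case InSetExpr(Attr(name), hset)
        if name == col && hset.forall(_.isInstanceOf[Lit]) =>
      Some(hset.collect { case Lit(v) => bucketId(v, numBuckets) })
    case _ => None
  }

  // Assumed post-fix behavior: treat the elements as plain values and hash them directly.
  def bucketsAfter(e: Expr, col: String, numBuckets: Int): Option[Set[Int]] = e match {
    case InSetExpr(Attr(name), hset) if name == col =>
      Some(hset.map(v => bucketId(v, numBuckets)))
    case _ => None
  }
}
```

With a filter such as `InSetExpr(Attr("a"), (1 to 11).toSet[Any])`, `bucketsBefore` yields `None` (every bucket would be scanned), while `bucketsAfter` yields a set of candidate buckets to read.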
Why are the changes needed?
Fix bug.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Unit test and manual test: