[SPARK-10539][SQL]Project should not be pushed down through Intersect or Except #8742
Test build #42419 has finished for PR 8742 at commit
Test build #42432 has finished for PR 8742 at commit
cc @yhuai for review.
val rewrites = buildRewrites(e)
Except(
  Project(projectList, left),
  Project(projectList.map(pushToRight(_, rewrites)), right))
Can we add comments in this class to explain why we cannot push down projections? For filter pushdown, if the condition contains non-deterministic expressions, pushing the filter down is not safe in some cases. That should not happen here because of #7446, but it is still worth thinking about whether there is any case where filter pushdown is unsafe. If we determine that filter pushdown is safe, let's add comments explaining why.
@yhuai, thanks for your comment. I didn't consider the effect of non-deterministic filters on pushdown when I was doing this. I will think about it and add comments soon.
Can you add comments here explaining why we cannot push down projections and why we can push down filters?
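To illustrate the filter side of this question, here is a minimal sketch that models relations as plain Scala sets rather than Catalyst plans. The object name, the sample rows, and the `keep` predicate are all made up for illustration, and the predicate is assumed to be deterministic. Under those assumptions, filtering before or after the set operation gives the same rows.

```scala
// Hypothetical model: relations are plain Scala sets, not Spark logical plans.
object SetOpFilterPushdown {
  type Row = (Int, String)

  val left: Set[Row] = Set((1, "a"), (2, "b"), (3, "c"))
  val right: Set[Row] = Set((1, "c"), (2, "b"))

  // A deterministic predicate: the same row always gets the same answer.
  val keep: Row => Boolean = row => row._1 >= 2

  // Filter applied above Except versus pushed into both children.
  val exceptThenFilter: Set[Row] = left.diff(right).filter(keep)
  val filterThenExcept: Set[Row] = left.filter(keep).diff(right.filter(keep))

  // Filter applied above Intersect versus pushed into both children.
  val intersectThenFilter: Set[Row] = left.intersect(right).filter(keep)
  val filterThenIntersect: Set[Row] = left.filter(keep).intersect(right.filter(keep))
}
```

A filter only drops whole rows and never changes the columns being compared, so a row's membership in either child is decided the same way on both sides of the rewrite. This sketch says nothing about non-deterministic predicates.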
ping :)
@yjshen The fix is good. Can you address the comments?
…ct or Except #8742 Intersect and Except are both set operators, and they use all the columns to compare equality between rows. When their Project parent is pushed down, the relations they are based on change, so the transformation is not equivalent. JIRA: https://issues.apache.org/jira/browse/SPARK-10539 I added some comments based on the fix of #8742. Author: Yijie Shen <henry.yijieshen@gmail.com> Author: Yin Huai <yhuai@databricks.com> Closes #8823 from yhuai/fix_set_optimization.
…ct or Except #8742 Intersect and Except are both set operators, and they use all the columns to compare equality between rows. When their Project parent is pushed down, the relations they are based on change, so the transformation is not equivalent. JIRA: https://issues.apache.org/jira/browse/SPARK-10539 I added some comments based on the fix of #8742. Author: Yijie Shen <henry.yijieshen@gmail.com> Author: Yin Huai <yhuai@databricks.com> Closes #8823 from yhuai/fix_set_optimization. (cherry picked from commit c6f8135) Signed-off-by: Yin Huai <yhuai@databricks.com> Conflicts: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
Thanks @yhuai, I'll close this one.
…ct or Except apache#8742 Intersect and Except are both set operators, and they use all the columns to compare equality between rows. When their Project parent is pushed down, the relations they are based on change, so the transformation is not equivalent. JIRA: https://issues.apache.org/jira/browse/SPARK-10539 I added some comments based on the fix of apache#8742. Author: Yijie Shen <henry.yijieshen@gmail.com> Author: Yin Huai <yhuai@databricks.com> Closes apache#8823 from yhuai/fix_set_optimization. (cherry picked from commit c6f8135) Signed-off-by: Yin Huai <yhuai@databricks.com> Conflicts: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala (cherry picked from commit 3df52cc)
Intersect and Except are both set operators, and they use all the columns to compare equality between rows. When their Project parent is pushed down, the relations they are based on change, so the transformation is not equivalent.
JIRA: https://issues.apache.org/jira/browse/SPARK-10539
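The non-equivalence can be seen on a tiny example. The sketch below models relations as plain Scala sets instead of Catalyst logical plans. The object name, the sample rows, and the `project` helper are hypothetical and only stand in for a Project that keeps the first column.

```scala
// Hypothetical model: relations are plain Scala sets, not Spark logical plans.
object SetOpProjectPushdown {
  type Row = (Int, String)

  val left: Set[Row] = Set((1, "a"), (2, "b"))
  val right: Set[Row] = Set((1, "c"), (2, "b"))

  // Stand-in for Project: keep only the first column.
  def project(rows: Set[Row]): Set[Int] = rows.map(_._1)

  // Original plan: compare full rows first, then project.
  val exceptThenProject: Set[Int] = project(left.diff(right))
  // Pushed-down plan: project both children, then compare the narrowed rows.
  val projectThenExcept: Set[Int] = project(left).diff(project(right))

  val intersectThenProject: Set[Int] = project(left.intersect(right))
  val projectThenIntersect: Set[Int] = project(left).intersect(project(right))
}
```

Here `(1, "a")` and `(1, "c")` differ as full rows but collide once the second column is dropped. For Except the original plan yields `Set(1)` while the pushed-down plan yields an empty set. For Intersect the original plan yields `Set(2)` while the pushed-down plan yields `Set(1, 2)`.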