-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-33733][SQL] PullOutNondeterministic should check and collect deterministic field #30703
Conversation
Test build #132553 has finished for PR 30703 at commit
|
Kubernetes integration test starting |
Kubernetes integration test status failure |
retest this please |
Kubernetes integration test starting |
Kubernetes integration test status failure |
Test build #132557 has finished for PR 30703 at commit
|
Kubernetes integration test starting |
Kubernetes integration test status failure |
Test build #132566 has finished for PR 30703 at commit
|
retest this please |
val leafNondeterministic = expr.collect { case n: Nondeterministic => n } | ||
val leafNondeterministic = expr.collect { | ||
case n: Nondeterministic => n | ||
case e: CallMethodViaReflection => e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fix itself looks fine. Just my nit comment: naming java_method('java.lang.Math', 'abs', c1)
"_nondeterministic " in a plan looks a bit weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for simple test, like the empty scala udf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ok. I misunderstood it a bit. nvm.
Kubernetes integration test starting |
Kubernetes integration test status success |
Test build #132607 has finished for PR 30703 at commit
|
cc @cloud-fan |
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
Show resolved
Hide resolved
Kubernetes integration test starting |
Kubernetes integration test status success |
Test build #132648 has finished for PR 30703 at commit
|
cc @cloud-fan @maropu thanks for review ! |
thanks, merging to master/3.1! |
…eterministic field ### What changes were proposed in this pull request? The deterministic field is wider than `NonDerterministic`, we should keep same range between pull out and check analysis. ### Why are the changes needed? For example ``` select * from values(1), (4) as t(c1) order by java_method('java.lang.Math', 'abs', c1) ``` We will get exception since `java_method` deterministic field is false but not a `NonDeterministic` ``` Exception in thread "main" org.apache.spark.sql.AnalysisException: nondeterministic expressions are only allowed in Project, Filter, Aggregate or Window, found: java_method('java.lang.Math', 'abs', t.`c1`) ASC NULLS FIRST in operator Sort [java_method(java.lang.Math, abs, c1#1) ASC NULLS FIRST], true ;; ``` ### Does this PR introduce _any_ user-facing change? Yes. ### How was this patch tested? Add test. Closes #30703 from ulysses-you/SPARK-33733. Authored-by: ulysses-you <ulyssesyou18@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit 839d689) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@ulysses-you can you open backport PRs for 3.0/2.4? thanks! |
@cloud-fan will do it. |
…ect deterministic field backport [#30703](#30703) for branch-2.4. ### What changes were proposed in this pull request? The deterministic field is wider than `NonDerterministic`, we should keep same range between pull out and check analysis. ### Why are the changes needed? For example ``` select * from values(1), (4) as t(c1) order by java_method('java.lang.Math', 'abs', c1) ``` We will get exception since `java_method` deterministic field is false but not a `NonDeterministic` ``` Exception in thread "main" org.apache.spark.sql.AnalysisException: nondeterministic expressions are only allowed in Project, Filter, Aggregate or Window, found: java_method('java.lang.Math', 'abs', t.`c1`) ASC NULLS FIRST in operator Sort [java_method(java.lang.Math, abs, c1#1) ASC NULLS FIRST], true ;; ``` ### Does this PR introduce _any_ user-facing change? Yes. ### How was this patch tested? Add test. Closes #30772 from ulysses-you/SPARK-33733-branch-2.4. Authored-by: ulysses-you <ulyssesyou18@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…ect deterministic field backport [#30703](#30703) for branch-3.0. ### What changes were proposed in this pull request? The deterministic field is wider than `NonDerterministic`, we should keep same range between pull out and check analysis. ### Why are the changes needed? For example ``` select * from values(1), (4) as t(c1) order by java_method('java.lang.Math', 'abs', c1) ``` We will get exception since `java_method` deterministic field is false but not a `NonDeterministic` ``` Exception in thread "main" org.apache.spark.sql.AnalysisException: nondeterministic expressions are only allowed in Project, Filter, Aggregate or Window, found: java_method('java.lang.Math', 'abs', t.`c1`) ASC NULLS FIRST in operator Sort [java_method(java.lang.Math, abs, c1#1) ASC NULLS FIRST], true ;; ``` ### Does this PR introduce _any_ user-facing change? Yes. ### How was this patch tested? Add test. Closes #30771 from ulysses-you/SPARK-33733-branch-3.0. Authored-by: ulysses-you <ulyssesyou18@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
late lgtm. thanks for fixing it, @ulysses-you |
What changes were proposed in this pull request?
The deterministic field is wider than
NonDerterministic
, we should keep same range between pull out and check analysis.Why are the changes needed?
For example
We will get exception since
java_method
deterministic field is false but not aNonDeterministic
Does this PR introduce any user-facing change?
Yes.
How was this patch tested?
Add test.