-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-38030][SQL][3.2] Canonicalization should not remove nullability of AttributeReference dataType #35446
Conversation
…y of AttributeReference dataType
cc @cloud-fan and @viirya |
@@ -42,7 +42,7 @@ object Canonicalize { | |||
/** Remove names and nullability from types, and names from `GetStructField`. */ | |||
private[expressions] def ignoreNamesTypes(e: Expression): Expression = e match { | |||
case a: AttributeReference => | |||
AttributeReference("none", a.dataType.asNullable)(exprId = a.exprId) | |||
AttributeReference("none", a.dataType)(exprId = a.exprId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ur, I realized that we should fix line 42 too. Could you fix it master branch first as a follow-up of SPARK-38030, @shardulm94 ?
- Remove names and nullability from types
+ Remove names from types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! The canonicalization code has change substantiantially between 3.2 and master due to #34883. This comment no longer exists in the master branch so it should not be an issue. I will fix it in 3.2 and 3.1 PRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Thank you for checking, @shardulm94 .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM (Pending CI)
Can one of the admins verify this patch? |
…y of AttributeReference dataType This is a backport of #35332 to branch 3.2 ### What changes were proposed in this pull request? Canonicalization of AttributeReference should not remove nullability information of its dataType. ### Why are the changes needed? SPARK-38030 lists an issue where canonicalization of cast resulted in an unresolved expression, thus causing query failure. The issue was that the child AttributeReference's dataType was converted to nullable during canonicalization and hence the Cast's `checkInputDataTypes` fails. Although the exact repro listed in SPARK-38030 no longer works in branch-3.2 due to an unrelated change (details in the JIRA), some other codepaths which depend on canonicalized representations can trigger the same issue. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added unit test to ensure that canonicalization preserves nullability of AttributeReference and does not result in an unresolved cast Closes #35446 from shardulm94/SPARK-38030-3.2. Authored-by: Shardul Mahadik <smahadik@linkedin.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Merged to branch-3.2. |
…y of AttributeReference dataType This is a backport of apache#35332 to branch 3.2 ### What changes were proposed in this pull request? Canonicalization of AttributeReference should not remove nullability information of its dataType. ### Why are the changes needed? SPARK-38030 lists an issue where canonicalization of cast resulted in an unresolved expression, thus causing query failure. The issue was that the child AttributeReference's dataType was converted to nullable during canonicalization and hence the Cast's `checkInputDataTypes` fails. Although the exact repro listed in SPARK-38030 no longer works in branch-3.2 due to an unrelated change (details in the JIRA), some other codepaths which depend on canonicalized representations can trigger the same issue. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added unit test to ensure that canonicalization preserves nullability of AttributeReference and does not result in an unresolved cast Closes apache#35446 from shardulm94/SPARK-38030-3.2. Authored-by: Shardul Mahadik <smahadik@linkedin.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit d62735d)
This is a backport of #35332 to branch 3.2
What changes were proposed in this pull request?
Canonicalization of AttributeReference should not remove nullability information of its dataType.
Why are the changes needed?
SPARK-38030 lists an issue where canonicalization of cast resulted in an unresolved expression, thus causing query failure. The issue was that the child AttributeReference's dataType was converted to nullable during canonicalization and hence the Cast's
checkInputDataTypes
fails. Although the exact repro listed in SPARK-38030 no longer works in branch-3.2 due to an unrelated change (details in the JIRA), some other codepaths which depend on canonicalized representations can trigger the same issue.Does this PR introduce any user-facing change?
No
How was this patch tested?
Added unit test to ensure that canonicalization preserves nullability of AttributeReference and does not result in an unresolved cast