[SPARK-50068][SQL] Refactor TypeCoercion and AnsiTypeCoercion to separate single node transformations
#48596
dtenedor
left a comment
Thanks for the refactor!! This is covered by existing tests. The PR generally looks good, just a couple of naming and comment changes.
dtenedor
left a comment
LGTM after responding to remaining comments
MaxGekk
left a comment
I believe this PR contains too many changes. Let's split it and make the changes step by step. @cloud-fan @dongjoon-hyun @HyukjinKwon WDYT?
yeah
Reverted renaming changes
This reverts commit 1b6fd9d8b3f366df9582bda467e72aa95eecb740.
  def apply(expression: Expression): Expression = {
    decimalAndDecimal()
      .orElse(integralAndDecimalLiteral)
      .orElse(nondecimalAndDecimal(conf.literalPickMinimumPrecision))
      .lift(expression).getOrElse(expression)
  }
@cloud-fan I changed all partial functions to regular apply methods. For consistency, I also changed this DecimalPrecision method from partial to regular. Can we do it this way?
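For readers unfamiliar with the pattern under discussion, here is a minimal sketch of turning a chain of partial functions into a regular `apply` method that returns its input unchanged when no rule matches. The expression types and rule names (`Expr`, `IntLit`, `DecLit`, `Add`, `widenLeft`, `widenRight`) are hypothetical stand-ins, not Spark's Catalyst classes; only `orElse`, `lift`, and `getOrElse` are taken from the Scala standard library.

```scala
// Hypothetical, simplified expression tree; not Spark's Catalyst classes.
sealed trait Expr
final case class IntLit(value: Int) extends Expr
final case class DecLit(value: BigDecimal) extends Expr
final case class Add(left: Expr, right: Expr) extends Expr

object PartialToTotalSketch {
  // Each partial rule is defined only for the node shapes it handles.
  val widenLeft: PartialFunction[Expr, Expr] = {
    case Add(IntLit(l), r: DecLit) => Add(DecLit(BigDecimal(l)), r)
  }

  val widenRight: PartialFunction[Expr, Expr] = {
    case Add(l: DecLit, IntLit(r)) => Add(l, DecLit(BigDecimal(r)))
  }

  // Regular (total) method: try the rules in order and fall back to
  // the unchanged input when none of them is defined for it.
  def apply(expression: Expr): Expr =
    widenLeft.orElse(widenRight).lift(expression).getOrElse(expression)
}
```

Callers of such a method no longer need to check `isDefinedAt`; an expression that no rule covers simply passes through.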
import org.apache.spark.sql.connector.catalog.procedures.BoundProcedure
import org.apache.spark.sql.types.DataType

abstract class TypeCoercionBase extends TypeCoercionHelper {
to confirm, do we just move it from TypeCoercion.scala to here without any change?
Exactly
 * Type coercion helper that matches against function expression in order to type coerce function
 * argument types to expected types.
 */
object FunctionArgumentTypeCoercion {
why are they in the helper instead of individual files?
This is because these objects depend on methods that are defined in Helper but are only implemented in concrete TypeCoercion objects. This makes it so that the inheritance chain needs to be Helper -> FunctionArgumentTypeCoercion -> TypeCoercion/AnsiTypeCoercion. So I think all these helper objects need to be bunched into a single class so I just left them at the root as it seemed to be the cleanest solution. Don't really love it, so if you have any suggestions we should change it.
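The dependency described in this reply can be illustrated with a toy model: an object nested in an abstract helper calls a method that the helper only declares, and that only the concrete coercion objects implement. All names below (`CoercionHelperSketch`, `ArgumentCoercionSketch`, `widerType`, and so on) are hypothetical, types are modelled as plain strings, and the widening rules are invented for illustration; they are not Spark's actual coercion semantics.

```scala
// Toy model of the layout described above; every name is hypothetical.
abstract class CoercionHelperSketch {
  // Declared here, implemented only by the concrete coercion objects.
  def widerType(a: String, b: String): Option[String]

  // Nested object: it can call widerType because it lives inside the
  // helper instance, which is why it is not placed in its own file.
  object ArgumentCoercionSketch {
    def apply(argType: String, expected: String): String =
      widerType(argType, expected).getOrElse(argType)
  }
}

abstract class CoercionBaseSketch extends CoercionHelperSketch

object DefaultCoercionSketch extends CoercionBaseSketch {
  // Invented permissive rule set: int also widens to string.
  def widerType(a: String, b: String): Option[String] = (a, b) match {
    case (x, y) if x == y => Some(x)
    case ("int", "long") | ("long", "int") => Some("long")
    case ("int", "string") | ("string", "int") => Some("string")
    case _ => None
  }
}

object StrictCoercionSketch extends CoercionBaseSketch {
  // Invented stricter rule set: no widening to string.
  def widerType(a: String, b: String): Option[String] = (a, b) match {
    case (x, y) if x == y => Some(x)
    case ("int", "long") | ("long", "int") => Some("long")
    case _ => None
  }
}
```

In this sketch the same nested object behaves differently depending on which concrete object it is reached through, for example `DefaultCoercionSketch.ArgumentCoercionSketch("int", "string")` versus `StrictCoercionSketch.ArgumentCoercionSketch("int", "string")`.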
thanks, merging to master!
What changes were proposed in this pull request?
This PR proposes a refactor of the TypeCoercion rule in order to separate the logic for transforming a single node into separate files. The refactor does the following:
- Split TypeCoercionBase into TypeCoercionBase and TypeCoercionHelper, in separate abstract classes
- TypeCoercionHelper contains declarations of all utility virtual methods that are later overridden in TypeCoercion and AnsiTypeCoercion
- transform methods that rely on these virtual methods are refactored so that the check whether children have been resolved remains as is, while the other match cases are refactored and moved into separate objects inside TypeCoercionHelper
- transform methods that don't depend on these virtual methods are refactored into separate files in a similar manner to the previous point
- TypeCoercion and AnsiTypeCoercion remain in TypeCoercionBase.
- TypeCoercionBase inherits TypeCoercionHelper
- TypeCoercion and AnsiTypeCoercion inherit TypeCoercionBase
This PR doesn't refactor type coercion rules that don't have a separate transform method but traverse the tree from their apply method. This will be done in separate PRs.
Why are the changes needed?
Refactoring to support the Analyzer++ effort
Does this PR introduce any user-facing change?
How was this patch tested?
Existing tests
Was this patch authored or co-authored using generative AI tooling?
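
Regarding the refactoring described under "What changes were proposed in this pull request?": the idea of keeping the children-resolved check in the rule's transform method while moving the per-node match cases into a separate object can be sketched roughly as below. This is a toy model under stated assumptions. The tree types (`Node`, `IntCol`, `ToDouble`, `Div`), the rule, and the coercion it performs are all hypothetical and do not reproduce Spark's Catalyst API or its actual coercion rules.

```scala
// Hypothetical mini expression tree; Spark's Catalyst classes are not used.
sealed trait Node { def resolved: Boolean }
final case class IntCol(name: String, resolved: Boolean = true) extends Node
final case class ToDouble(child: Node) extends Node {
  def resolved: Boolean = child.resolved
}
final case class Div(left: Node, right: Node) extends Node {
  def resolved: Boolean = left.resolved && right.resolved
}

// Single-node transformation: no traversal, no resolution checks.
// After the split, an object like this could live in its own file.
object DivisionCoercionSketch {
  def apply(node: Node): Node = node match {
    case Div(l: IntCol, r: IntCol) => Div(ToDouble(l), ToDouble(r))
    case other => other
  }
}

// The rule keeps the tree traversal and the children-resolved check.
object DivisionRuleSketch {
  private def childrenResolved(node: Node): Boolean = node match {
    case Div(l, r) => l.resolved && r.resolved
    case ToDouble(c) => c.resolved
    case _ => true
  }

  // Bottom-up traversal: rebuild children first, then visit the node.
  def transform(node: Node): Node = {
    val rebuilt = node match {
      case Div(l, r) => Div(transform(l), transform(r))
      case ToDouble(c) => ToDouble(transform(c))
      case leaf => leaf
    }
    rebuilt match {
      // Skip nodes whose children are not resolved yet.
      case n if !childrenResolved(n) => n
      // Delegate everything else to the single-node transformation.
      case n => DivisionCoercionSketch(n)
    }
  }
}
```

One consequence of this layout is that `DivisionCoercionSketch` can be called on a single node without going through the rule's traversal at all.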