-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-43529][SQL] Support general constant expressions as CREATE/REPLACE TABLE OPTIONS values #41191
Conversation
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
Outdated
Show resolved
Hide resolved
Note @MaxGekk @gengliangwang it says the GitHub actions CI failed, but actually this was just an unrelated PySpark flake and they actually passed :) |
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @gengliangwang for your review! Please take another look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @gengliangwang for your review! Please take another look.
Update on this: I thought of a simpler way to break apart this work into smaller pieces so we can make gradual improvements. Let me make a commit and then I can ping this thread again. |
move constant folding to analysis
This is done, I have deferred the constant folding logic to the analyzer. This is ready for another look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @gengliangwang, thanks again for your reviews, this is ready again :)
val dt = value.dataType | ||
value match { | ||
case Literal(null, _) => | ||
"null" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it be null
or "null"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be "null", this part computes the string value for the table option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But string can be null too. If it is "null"
, then the literal should be Literal("null", StringType)
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @gengliangwang I responded to your comments. I think this is ready to merge now, if you agree.
val dt = value.dataType | ||
value match { | ||
case Literal(null, _) => | ||
"null" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be "null", this part computes the string value for the table option.
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
Outdated
Show resolved
Hide resolved
The main commit responding to the last round of code review comments is here: 65f9a92. |
(Note, this is passing all CI again.) |
val dt = value.dataType | ||
value match { | ||
case Literal(null, _) => | ||
"null" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"null" | |
null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except for the last comment. Sorry for the slow response, I was focusing on another project.
No worries, thanks for the careful reviews :) |
This has been iterated for many rounds. I will make a follow-up PR to further make the code simpler. |
…related plans ### What changes were proposed in this pull request? Follow-up of #41191 to clean up the code in UnresolvedTableSpec and related plans: * Rename `OptionsListExpressions` as `OptionList` * Rename `trait TableSpec` as `TableSpecBase` * Rename `ResolvedTableSpec` as `TableSpec`, make sure all the physical plans are using `TableSpec` instead of `TableSpecBase`. * Move option list expressions to UnresolvedTableSpec, so that all the specs are in one class. * Make UnaryExpression an `UnaryExpression`, so that transforming with `mapExpressions` will transform it and the option list expressions in its child * Restore the signatures of class `CreateTable`, `CreateTableAsSelect`, `ReplaceTable` and `ReplaceTableAsSelect` ### Why are the changes needed? Make the code implementation simpler ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing tests Closes #41549 from gengliangwang/optionsFollowUp. Authored-by: Gengliang Wang <gengliang@apache.org> Signed-off-by: Gengliang Wang <gengliang@apache.org>
…related plans ### What changes were proposed in this pull request? Follow-up of apache/spark#41191 to clean up the code in UnresolvedTableSpec and related plans: * Rename `OptionsListExpressions` as `OptionList` * Rename `trait TableSpec` as `TableSpecBase` * Rename `ResolvedTableSpec` as `TableSpec`, make sure all the physical plans are using `TableSpec` instead of `TableSpecBase`. * Move option list expressions to UnresolvedTableSpec, so that all the specs are in one class. * Make UnaryExpression an `UnaryExpression`, so that transforming with `mapExpressions` will transform it and the option list expressions in its child * Restore the signatures of class `CreateTable`, `CreateTableAsSelect`, `ReplaceTable` and `ReplaceTableAsSelect` ### Why are the changes needed? Make the code implementation simpler ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing tests Closes #41549 from gengliangwang/optionsFollowUp. Authored-by: Gengliang Wang <gengliang@apache.org> Signed-off-by: Gengliang Wang <gengliang@apache.org>
…LACE TABLE OPTIONS values ### What changes were proposed in this pull request? This PR updates the SQL compiler to support general constnat expressions in the syntax for CREATE/REPLACE TABLE OPTIONS values, rather than restricting to a few types of literals only. * The analyzer now checks that the provided expressions are in fact `foldable`, and throws an error message otherwise. * This error message that users encounter in these cases improves from a general "syntax error at or near <location>" to instead indicate that the syntax is valid, but only constant expressions are supported in these contexts. ### Why are the changes needed? This makes it easier to provide OPTIONS lists in SQL, supporting use cases like concatenating strings with `||`. ### Does this PR introduce _any_ user-facing change? Yes, the SQL syntax changes. ### How was this patch tested? This PR adds new unit test coverage. Closes apache#41191 from dtenedor/expression-properties. Authored-by: Daniel Tenedorio <daniel.tenedorio@databricks.com> Signed-off-by: Gengliang Wang <gengliang@apache.org>
…related plans ### What changes were proposed in this pull request? Follow-up of apache#41191 to clean up the code in UnresolvedTableSpec and related plans: * Rename `OptionsListExpressions` as `OptionList` * Rename `trait TableSpec` as `TableSpecBase` * Rename `ResolvedTableSpec` as `TableSpec`, make sure all the physical plans are using `TableSpec` instead of `TableSpecBase`. * Move option list expressions to UnresolvedTableSpec, so that all the specs are in one class. * Make UnaryExpression an `UnaryExpression`, so that transforming with `mapExpressions` will transform it and the option list expressions in its child * Restore the signatures of class `CreateTable`, `CreateTableAsSelect`, `ReplaceTable` and `ReplaceTableAsSelect` ### Why are the changes needed? Make the code implementation simpler ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing tests Closes apache#41549 from gengliangwang/optionsFollowUp. Authored-by: Gengliang Wang <gengliang@apache.org> Signed-off-by: Gengliang Wang <gengliang@apache.org>
What changes were proposed in this pull request?
This PR updates the SQL compiler to support general constnat expressions in the syntax for CREATE/REPLACE TABLE OPTIONS values, rather than restricting to a few types of literals only.
foldable
, and throws an error message otherwise.Why are the changes needed?
This makes it easier to provide OPTIONS lists in SQL, supporting use cases like concatenating strings with
||
.Does this PR introduce any user-facing change?
Yes, the SQL syntax changes.
How was this patch tested?
This PR adds new unit test coverage.