[SPARK-49566][SQL] Add SQL pipe syntax for the SET operator #48940
cc @cloud-fan @gengliangwang here is the |
sql/core/src/test/resources/sql-tests/results/pipe-operators.sql.out
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
respond to code review comments
dtenedor
left a comment
Thanks @cloud-fan for your review! Please take another look.
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
sql/core/src/test/resources/sql-tests/results/pipe-operators.sql.out
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
dtenedor
left a comment
Thanks again @cloud-fan for your reviews!!
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
dtenedor
left a comment
Thanks @gengliangwang for your review!
sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
respond to code review comments
dtenedor
left a comment
Thanks @cloud-fan for your review again.
case class UnresolvedStarExcept(target: Option[Seq[String]], excepts: Seq[Seq[String]])
case class UnresolvedStarExceptOrReplace(
    target: Option[Seq[String]],
    excepts: Seq[Seq[String]],
The excepts here can be nested columns, so technically we can support nested columns in the SET clause. This can be a followup work as an improvement.
@cloud-fan yes, and also we could support SELECT * REPLACE(<column_name>, <new_expression>) as well (some other SQL engines support this). We can keep these features in mind for adding to Spark in the future.
thanks, merging to master!
@HyukjinKwon @allisonwang-db shall we implement |
What changes were proposed in this pull request?
This PR adds SQL pipe syntax support for the SET operator.
This operator removes one or more existing columns from the input table and replaces each one with a new computed column whose value is the result of evaluating the specified expression.
This is equivalent to `SELECT * EXCEPT (name), <newExpressions> AS name` in the SQL compiler. It is provided as a convenience feature, and some of its functionality overlaps with lateral column aliases. For example:
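To make the stated equivalence concrete, here is a minimal Python sketch that builds the `SELECT * EXCEPT (...), ... AS ...` text for a list of SET assignments. It only models the rewrite as worded in this description. It is not Spark's implementation, `desugar_set` is a hypothetical helper name, and the real operator's column ordering and validation may differ.

```python
def desugar_set(assignments):
    """Build the SELECT form that the description gives as equivalent to SET.

    `assignments` is a list of (column_name, expression_sql) pairs, e.g.
    [("x", "x + 1")] for a pipe step written as `SET x = x + 1`.
    This is an illustrative model only, not Spark code.
    """
    # Sketch-only guard: an empty assignment list has no meaningful rewrite.
    if not assignments:
        raise ValueError("SET needs at least one assignment")
    # Columns being replaced are dropped from the star expansion...
    excepted = ", ".join(name for name, _ in assignments)
    # ...and each one is re-added as a computed column under the same name.
    projected = ", ".join(f"{expr} AS {name}" for name, expr in assignments)
    return f"SELECT * EXCEPT ({excepted}), {projected}"


print(desugar_set([("x", "x + 1")]))
# SELECT * EXCEPT (x), x + 1 AS x
```

With two assignments, such as `[("x", "x + 1"), ("y", "upper(y)")]`, the sketch lists both names in the `EXCEPT` clause and appends both computed columns.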
Why are the changes needed?
The SQL pipe operator syntax will let users compose queries in a more flexible fashion.
Does this PR introduce any user-facing change?
Yes, see above.
How was this patch tested?
This PR adds a few unit test cases but relies mostly on golden file test coverage. I chose this approach to make sure the answers are correct as the feature is implemented, and so that we can inspect the analyzer output plans to confirm they look right as well.
Was this patch authored or co-authored using generative AI tooling?
No