[SPARK-48545][SQL] Create to_avro and from_avro SQL functions to match DataFrame equivalents #46977
Thanks @allisonwang-db for your review; I followed through on your comments.
Looks good! cc @cloud-fan
LGTM if CI passes
cc @cloud-fan the CI is passing now :)
Thanks, merging to master.
…nd from_avro functions but Avro is not loaded by default

### What changes were proposed in this pull request?

This PR updates the new `to_avro` and `from_avro` SQL functions added in #46977 to return reasonable errors when Avro is not loaded by default.

### Why are the changes needed?

According to the [Apache Spark Avro Data Source Guide](https://spark.apache.org/docs/latest/sql-data-sources-avro.html), Avro is not loaded into Spark by default. With this change, users who call the `to_avro` or `from_avro` SQL functions in this case get reasonable error messages with instructions telling them what to do, rather than obscure Java `ClassNotFoundException`s.

### Does this PR introduce _any_ user-facing change?

Yes, see above.

### How was this patch tested?

This PR adds golden file based test coverage.

### Was this patch authored or co-authored using generative AI tooling?

No GitHub copilot this time.

Closes #47063 from dtenedor/to-from-avro-error-not-loaded.

Authored-by: Daniel Tenedorio <daniel.tenedorio@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
What changes were proposed in this pull request?
This PR creates two new SQL functions "to_avro" and "from_avro" to match existing DataFrame equivalents.
For example:
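A minimal sketch of how the two functions might be called from SQL, not the PR's own example. It assumes the call shapes `to_avro(expr, jsonFormatSchema)` and `from_avro(expr, jsonFormatSchema, options)` with an empty `map()` for the options. The schema, the `id` field, and the helper `round_trip_query` are hypothetical names chosen for illustration. The snippet only builds the query text, so it runs without Spark.

```python
import json

# Hypothetical Avro schema, used only for illustration: a record with one int field.
AVRO_SCHEMA = json.dumps({
    "type": "record",
    "name": "struct",
    "fields": [{"name": "id", "type": "int"}],
})


def round_trip_query(schema_json: str) -> str:
    """Return a SQL query that encodes a struct with to_avro and decodes it with from_avro."""
    # The schema is embedded as a single-quoted SQL literal, so this sketch
    # rejects quotes and backslashes rather than guessing at escaping rules.
    if "'" in schema_json or "\\" in schema_json:
        raise ValueError("schema must not contain single quotes or backslashes")
    return (
        # Assumed argument order: value, JSON schema, then (for from_avro) an options map.
        f"SELECT from_avro(to_avro(s, '{schema_json}'), '{schema_json}', map()) AS decoded\n"
        "FROM (SELECT named_struct('id', 1) AS s)"
    )


if __name__ == "__main__":
    print(round_trip_query(AVRO_SCHEMA))
```

The resulting string could be passed to `spark.sql(...)` or pasted into a SQL shell, provided the Avro module is loaded in that session, since Avro is not loaded into Spark by default.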
Why are the changes needed?
This brings parity between SQL and DataFrame APIs in Apache Spark.
Does this PR introduce any user-facing change?
Yes, see above.
How was this patch tested?
This PR adds extra unit tests, and I also checked that the functions work with spark-shell.
Was this patch authored or co-authored using generative AI tooling?
No GitHub copilot usage this time