[SEDONA-276][SEDONA-277] Support Spark 3.4 #825
Conversation
@umartin @Kimahriman Martin and Adam, any comments on this since this introduces changes to the underlying POM design?
Looks fine to me. Is it worth dropping tests for 3.0? Or only doing the cross Scala/jdk tests for the latest version and dropping for 3.3?
Looks good to me. Nice work!
I agree with Adam. We can drop the cross Scala/JDK tests for 3.3 and only run them for 3.4. Also, for Python tests, we should only keep multiple Python version tests for Spark 3.4. Just a heads-up: this PR by default still builds Sedona-Spark against Spark 3.3 (not 3.4). We should change it to Spark 3.4 after we release Sedona 1.4.1; as a maintenance release, Sedona 1.4.1 is not supposed to change the underlying Spark version. Regarding dropping support for Spark 3.0, I will start a discussion on our mailing list and gauge the impact. If we'd like to proceed, that should be done in another PR.
Did you read the Contributor Guide?
Is this PR related to a JIRA ticket?
What changes were proposed in this PR?
Support Spark 3.4
Introduction
This patch added support for Spark 3.4 by building separate artifacts for this minor version. Later Spark minor versions will also be supported in this way, to work around changes in Spark's internal APIs.
Run the following command to build artifacts for Spark 3.4:
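The concrete command was not captured in this extract. A plausible invocation, assuming the Spark version is selected through Maven properties (the property names here are guesses; check the repository's `pom.xml` for the actual profile ids), might look like:

```shell
# Hypothetical build command for the Spark 3.4 artifacts; -Dspark/-Dscala
# property names are assumptions, not confirmed by this PR text.
mvn clean install -DskipTests -Dspark=3.4 -Dscala=2.12
```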
Run the following command to build artifacts for Spark 3.0 to 3.3:
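Again, the exact command is missing from this extract; under the same assumed property names, building the Spark 3.0–3.3 compatible artifacts might look like:

```shell
# Hypothetical build command for the Spark 3.0-3.3 artifacts; property
# names are assumptions -- consult the project's pom.xml.
mvn clean install -DskipTests -Dspark=3.0 -Dscala=2.12
```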
We can also simply run

```
mvn clean install
```

in this case, since the default setup is to build artifacts for `3.0_2.12`.
Implementation
This patch divided `sedona-sql` into several modules:
- `sedona-sql-common`: common Sedona SQL code compatible with all Spark versions
- `sedona-sql`: depends on `sedona-sql-common` and contains code for specific Spark minor versions. These sources live in `sql/spark-3.x` directories, and a Maven profile selects which directory to build artifacts from.
Tuned Dependency Management
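As an illustration of the profile-guided source selection described in this PR, a profile could add the version-specific directory with `build-helper-maven-plugin` roughly like this (the profile id, property names, and paths are illustrative guesses, not the actual POM contents):

```xml
<profile>
  <id>spark-3.4</id>
  <activation>
    <property><name>spark</name><value>3.4</value></property>
  </activation>
  <build>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>build-helper-maven-plugin</artifactId>
        <executions>
          <execution>
            <id>add-spark-version-sources</id>
            <phase>generate-sources</phase>
            <goals><goal>add-source</goal></goals>
            <configuration>
              <sources>
                <!-- compile only the Spark-3.4-specific sources -->
                <source>spark-3.4</source>
              </sources>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</profile>
```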
We've also tuned the dependency management of profile-guided dependencies. These dependencies are now declared explicitly in the submodules instead of relying on the `dependencyManagement` section of the parent POM, and their versions are substituted with constants in the published POMs.
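To illustrate the version substitution (this sketch is an assumption about the mechanism, e.g. via `flatten-maven-plugin`, not a quote from the actual POMs): a dependency whose version is a profile-set property in the source POM would carry a resolved constant in the published POM.

```xml
<!-- Source POM (illustrative): version comes from a profile property -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_${scala.compat.version}</artifactId>
  <version>${spark.version}</version>
  <scope>provided</scope>
</dependency>

<!-- Published POM (illustrative): properties substituted with constants -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.12</artifactId>
  <version>3.4.0</version>
  <scope>provided</scope>
</dependency>
```

Declaring these dependencies in the submodules keeps each published artifact's POM self-describing, so downstream consumers resolve the correct Spark version without activating any profile.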
How was this patch tested?
Added Spark 3.4 tests to GitHub Workflow.
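A workflow matrix covering the new version might look roughly like the following excerpt (job names and exact versions are illustrative, not the actual workflow file):

```yaml
# Hypothetical GitHub Actions matrix excerpt; entries are assumptions.
strategy:
  matrix:
    include:
      - spark: 3.4.0
        scala: 2.12.15
      - spark: 3.3.0
        scala: 2.12.15
```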
Did this PR include necessary documentation updates?