
support spark 3.0 #59

Merged
merged 1 commit into from
Sep 11, 2020

Conversation

thesuperzapper
Collaborator

@thesuperzapper thesuperzapper commented Aug 6, 2020

This PR implements support for Spark 3.0 (Scala 2.12)

To get this merged we will need to:

  • Confirm whether we have broken support for older versions of Spark and, if so, update the docs.
  • Validate that this works in real deployments, not just unit tests.

@thesuperzapper
Collaborator Author

All sbt tests pass except com.github.saurfang.sas.spark.DefaultSourceSuite, which fails because it thinks path is not being passed to com.github.saurfang.sas.spark.DefaultSource.checkPath() by the CREATE TABLE ... step.

Feel free to tinker around with it if anyone wants.

Contributor

@Tagar Tagar left a comment


After changing CREATE TEMPORARY TABLE to CREATE TEMPORARY VIEW, all tests pass.
Thanks for the improvement.
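
For context, the fix is a one-line change to the DDL used in the test suite: Spark 3.0 no longer accepts the long-deprecated CREATE TEMPORARY TABLE ... USING form, and CREATE TEMPORARY VIEW is the replacement. A sketch of the change (the view name and path below are illustrative, not taken from the actual test code):

```sql
-- Spark 2.x form (rejected by Spark 3.0):
CREATE TEMPORARY TABLE sas_table
USING com.github.saurfang.sas.spark
OPTIONS (path "data/example.sas7bdat");

-- Spark 3.0 replacement:
CREATE TEMPORARY VIEW sas_table
USING com.github.saurfang.sas.spark
OPTIONS (path "data/example.sas7bdat");
```
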

@thesuperzapper
Collaborator Author

thesuperzapper commented Aug 19, 2020

Can someone check whether the jar compiled for Spark 3.0.0 is usable with Spark 2.4.X (if you compile for Scala 2.11, obviously)?

NOTE: run sbt +assembly to compile for multiple versions
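
For reference, sbt +assembly runs the assembly task once per entry in crossScalaVersions. A minimal sketch of the relevant build.sbt settings (the version numbers and the Provided scoping are assumptions for illustration, not this repo's actual build file):

```scala
// build.sbt -- illustrative cross-build settings (not the repo's actual file).
// `sbt +assembly` builds one assembly per crossScalaVersions entry.
scalaVersion := "2.12.10"
crossScalaVersions := Seq("2.11.12", "2.12.10")

// Pick the Spark version per Scala binary version
// (Spark 3.x publishes no Scala 2.11 artifacts).
libraryDependencies += "org.apache.spark" %% "spark-sql" % {
  CrossVersion.partialVersion(scalaVersion.value) match {
    case Some((2, 11)) => "2.4.7"
    case _             => "3.0.1"
  }
} % Provided
```
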

@thesuperzapper
Collaborator Author

Actually, that will fail, because Maven Central has no Scala 2.11 jars for Spark 3.0.0.

I think we should consider dropping support for Spark 2.X with version 3.X of this library.

(And just tell people to use 2.X if they use Spark 2.X)

@thesuperzapper
Collaborator Author

@saurfang thoughts about dropping Spark 2.X?

@Tagar
Contributor

Tagar commented Aug 19, 2020

my +1 cent on dropping Spark 2.x

Scala 2.12 support in Spark 2.4 was experimental, per the release notes: https://spark.apache.org/releases/spark-release-2-4-0.html
Databricks, for example, never released a runtime with Spark 2.x and Scala 2.12.

@srowen can comment here better.

If somebody needs to use spark-sas7bdat with Spark 2.x, they can just reference the current version. The new version would then be for Spark 3.x

Thanks!

@srowen

srowen commented Aug 19, 2020

So, I think ideally you simply set up the build matrix to test Scala 2.11 + Spark 2.4, and Scala 2.12 + Spark 3. That is trivial to declare with Travis; I can show you examples.

If this is a library that doesn't change much, it's coherent to say users of Spark 2 can use current versions and users of Spark 3 can use future versions, and just deal with it. However, I think the code cross-compiles readily.

What I would suggest dropping now is support for Scala 2.10 and Spark 2.3 or earlier. No real point going forward.
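
The matrix srowen describes can be declared in a few lines of Travis configuration. A hedged sketch (the SPARK_VERSION variable name and version numbers are assumptions about how the build wires the Spark version through, not taken from this repo):

```yaml
# .travis.yml -- illustrative build matrix, one job per Scala/Spark pairing.
language: scala
matrix:
  include:
    - scala: 2.11.12
      env: SPARK_VERSION=2.4.7   # assumed env var consumed by build.sbt
    - scala: 2.12.10
      env: SPARK_VERSION=3.0.1
script:
  - sbt ++$TRAVIS_SCALA_VERSION test
```
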

@Tagar
Contributor

Tagar commented Aug 19, 2020

+1 to what Sean said - this looks like a much better option

@saurfang
Owner

saurfang commented Aug 20, 2020

+1 on setting up a build matrix for 2.4 and 3.0 if we can fix the tests to work for both. Otherwise I'm at peace with dropping support for 2.x, since I don't think we have any plans for major features anyway.

@pkolli-caredx

@saurfang Could you please build the JAR? I don't have permission to build the JAR.

@srowen

srowen commented Aug 20, 2020

@pkolli-caredx what do you mean you don't have permission? You can pull this branch and build it. If it's hard I could do it for you, but it should be that simple.

@pkolli-caredx

@srowen Could you please build the JAR and share it? That would be great.

@srowen

srowen commented Aug 20, 2020

@thesuperzapper thesuperzapper changed the title [wip] add support for spark 3.0 support spark 3.0 Sep 10, 2020
@thesuperzapper thesuperzapper marked this pull request as ready for review September 10, 2020 07:34
@thesuperzapper
Collaborator Author

@saurfang got this working with a proper test matrix in Travis; just need you to:

  1. merge this
  2. release 3.0.0 and update NEWS.md
    • there will now be two .jar files: one for Spark 3 / Scala 2.12 and one for Spark 2 / Scala 2.11

@saurfang
Owner

Much appreciated as always, @thesuperzapper.
I will get the release going this weekend.

@saurfang saurfang merged commit c771652 into saurfang:master Sep 11, 2020