
Support Spark 3.0.2, 3.1.1 #830

Closed · wants to merge 2 commits

Conversation

suhsteve (Member) commented Mar 4, 2021

Add support for Spark 3.0.2 and 3.1.1

Addresses #827 and #811

@@ -30,7 +30,7 @@ public DeltaTableTests(DeltaFixture fixture)
 /// Run the end-to-end scenario from the Delta Quickstart tutorial.
 /// </summary>
 /// <see cref="https://docs.delta.io/latest/quick-start.html"/>
-[SkipIfSparkVersionIsLessThan(Versions.V2_4_2)]
+[SkipIfSparkVersionIsNotInRange(Versions.V2_4_2, Versions.V3_1_1)]
suhsteve (Member, Author) commented:
Disable Delta tests whose code path creates an org.apache.spark.sql.catalyst.expressions.Alias object; Delta 0.8.0 is not compatible with Spark 3.1.1 (delta-io/delta#594).
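For illustration, a minimal sketch of how such a version-range skip attribute can be built on xunit's FactAttribute. This is an assumption about the attribute's shape, not the repo's actual code; SparkSettings.Version is an assumed helper exposing the Spark version under test.

using System;
using Xunit;

// Hypothetical sketch: skips the test unless the running Spark version
// falls in [minInclusive, maxExclusive).
public sealed class SkipIfSparkVersionIsNotInRange : FactAttribute
{
    public SkipIfSparkVersionIsNotInRange(string minInclusive, string maxExclusive)
    {
        Version version = SparkSettings.Version; // assumed test-settings helper
        if (version < new Version(minInclusive) || version >= new Version(maxExclusive))
        {
            // Assigning Skip makes xunit report the test as skipped.
            Skip = $"Ignored on Spark {version}: outside [{minInclusive}, {maxExclusive})";
        }
    }
}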

@@ -85,6 +80,12 @@ public void TestSignaturesV2_3_X()
 dfw.Text($"{tempDir.Path}TestTextPath");

 dfw.Csv($"{tempDir.Path}TestCsvPath");

+dfw.Option("path", tempDir.Path).SaveAsTable("TestTable");
suhsteve (Member, Author) commented:
Setting the "path" option and then calling save with a path is not supported by default unless the spark.sql.legacy.pathOptionBehavior.enabled conf is set, so the order of these calls is reversed in the test.

https://github.com/apache/spark/blob/bb9abb098a1e26c220546616d655380c479b0e42/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala#L289
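A hedged sketch of the failure mode (df, tempDir, and spark are assumed from the surrounding test; this is not the test's exact code). Options accumulate on a reused DataFrameWriter, so once "path" is set, any later call that passes its own path throws on Spark 3.1.1:

DataFrameWriter dfw = df.Write().Mode(SaveMode.Overwrite);
dfw.Csv($"{tempDir.Path}TestCsvPath");                      // fine: no "path" option set yet
dfw.Option("path", tempDir.Path).SaveAsTable("TestTable");  // sets "path"; now runs last in the test
// dfw.Text($"{tempDir.Path}TestTextPath");                 // would now throw an AnalysisException on 3.1.1
// Opt-out using the conf named above:
// spark.Conf().Set("spark.sql.legacy.pathOptionBehavior.enabled", "true");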

@@ -63,6 +62,7 @@ public void TestSignaturesV2_3_X()
dsr.Parquet(Path.Combine(TestEnvironment.ResourceDirectory, "users.parquet")));
Assert.IsType<DataFrame>
(dsr.Text(Path.Combine(TestEnvironment.ResourceDirectory, "people.txt")));
Assert.IsType<DataFrame>(dsr.Format("json").Option("path", jsonFilePath).Load());
suhsteve (Member, Author) commented:
Setting the "path" option and then calling load with a path is not supported by default unless the spark.sql.legacy.pathOptionBehavior.enabled conf is set, so the order of these calls is reversed in the test.

https://github.com/apache/spark/blob/bb9abb098a1e26c220546616d655380c479b0e42/sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala#L259
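The reader side fails the same way; a hedged sketch assuming a reused reader instance (spark and jsonFilePath as in the test, illustrating the check in the linked source):

DataFrameReader dfr = spark.Read().Format("json");
DataFrame fromArg = dfr.Load(jsonFilePath);                  // fine: path passed directly
DataFrame fromOpt = dfr.Option("path", jsonFilePath).Load(); // fine: option only, no path argument
// dfr.Load(jsonFilePath);                                   // would now throw: "path" option is already set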

suhsteve changed the title from "[WIP] Support Spark 3.0.2, 3.1.1" to "Support Spark 3.0.2, 3.1.1" on Mar 4, 2021
Niharikadutta (Collaborator) commented:

@suhsteve Could you add a description of the changes made as a part of this PR?

suhsteve (Member, Author) commented Mar 9, 2021

> @suhsteve Could you add a description of the changes made as a part of this PR?

Done.

imback82 (Contributor) left a comment:

A couple of things:

Can you split these into the following PRs?

  • Support 3.0.2
  • PR for just creating a new microsoft-spark-3.1 JAR (no need to build it, just copy the files).
  • PR for supporting Spark 3.1 (.NET changes + build).

BTW, is this backward compatible? I see lots of filtered tests. We should make this backward compatible if possible; otherwise, we should release 2.0.

suhsteve (Member, Author) commented Mar 9, 2021

> A couple of things:
>
> Can you split these into the following PRs?
>
>   • Support 3.0.2
>   • PR for just creating a new microsoft-spark-3.1 JAR (no need to build it, just copy the files).
>   • PR for supporting Spark 3.1 (.NET changes + build).
>
> BTW, is this backward compatible? I see lots of filtered tests. We should make this backward compatible if possible; otherwise, we should release 2.0.

  • I can split it up into different PRs if that's preferred.

  • I added comments in the pipeline yml explaining why we skip tests for forward/backward compat:

3.0.2
We skip all forward compat tests because 3.0.2 is not in the set of versions supported by the 1.0.0 jar (microsoft-spark-3-0_2.12-1.0.0.jar):

private val supportedSparkVersions = Set[String]("3.0.0", "3.0.1")

3.1.1
The 1.0.0 Worker is not backward compatible with 3.1.* because its version checks fall through to

_ => throw new NotSupportedException($"Spark {version} not supported.")
and
_ => throw new NotSupportedException($"Spark {_version} not supported.")
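To make the failure mode concrete, a hypothetical sketch of that kind of version dispatch (the case labels and return values are invented for illustration; only the default arm mirrors the lines quoted above). Any version string the switch does not know falls through to the default arm and throws, so a 1.0.0 Worker launched against Spark 3.1.* fails immediately.

using System;

// Hypothetical dispatch shape; only the default arm mirrors the worker's code.
static string PickCodePath(string version) =>
    version switch
    {
        "2.4" => "code path for Spark 2.4.*",
        "3.0" => "code path for Spark 3.0.0/3.0.1",
        _ => throw new NotSupportedException($"Spark {version} not supported.")
    };

Console.WriteLine(PickCodePath("3.0")); // works on a 1.0.0 Worker
try
{
    Console.WriteLine(PickCodePath("3.1"));
}
catch (NotSupportedException e)
{
    Console.WriteLine(e.Message); // "Spark 3.1 not supported." — 1.0.0 predates 3.1
}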

Forward compat tests for 3.1 cannot be run because when we check out the 1.0.0 version of the repo, there is no microsoft-spark-3-1_2.12-1.0.0.jar built.

imback82 (Contributor) commented Mar 9, 2021

OK, I will check again with new PRs. So you are saying we are good to go with 1.1, right?

suhsteve closed this on Mar 9, 2021