[feat]Support spark3.3+ and spark2.2- compile #4301

Merged 5 commits on Mar 8, 2023

GuoPhilipse (Member) commented Mar 5, 2023

What is the purpose of the change

1. Since Spark 3.3.0, JdbcUtils#createConnectionFactory has been moved; the code can be changed to support Spark 3.3+.
2. OrcFileFormat may be missing (in versions before Spark 2.2.1); the code can be changed to support lower Spark versions.
3. HiveTableRelation may be missing (in versions before Spark 2.2.1); the code can be changed to support lower Spark versions.
4. querytimeout may be missing (in versions before Spark 2.4.0); the code can be changed to support lower Spark versions.
5. authtoken may be missing (in versions before Spark 2.4.0); the code can be changed to support lower Spark versions.
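
As an illustration of the version check behind item 4, here is a minimal sketch of gating a JDBC option on the Spark version. `SparkVersionGate` and its methods are hypothetical names, not code from this PR; the only outside facts assumed are the `queryTimeout` option name and the 2.4.0 threshold stated above.

```scala
// Hypothetical helper for illustration; not taken from the Linkis code base.
object SparkVersionGate {

  /** Leniently parse "major.minor.patch"; missing or non-numeric parts become 0. */
  def parse(version: String): (Int, Int, Int) = {
    val parts = version
      .split("[.-]")
      .take(3)
      .map { part =>
        val digits = part.takeWhile(_.isDigit)
        if (digits.isEmpty) 0 else digits.toInt
      }
      .padTo(3, 0)
    (parts(0), parts(1), parts(2))
  }

  /** True when `current` is the same as or newer than `minimum`. */
  def atLeast(current: String, minimum: String): Boolean =
    Ordering[(Int, Int, Int)].gteq(parse(current), parse(minimum))

  /** Add queryTimeout only when the Spark version is assumed to understand it. */
  def jdbcOptions(
      sparkVersion: String,
      base: Map[String, String],
      timeoutSeconds: Int
  ): Map[String, String] =
    if (atLeast(sparkVersion, "2.4.0")) base + ("queryTimeout" -> timeoutSeconds.toString)
    else base
}
```

Comparing parsed tuples rather than raw strings avoids ordering "2.10.0" before "2.4.0".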

Related issues/PRs

Related issues: #4298
Related pr: #4301

Brief change log

1. Rewrite JdbcUtils#createConnectionFactory in Linkis to support higher Spark versions (the method createConnectionFactory was moved in https://issues.apache.org/jira/browse/SPARK-38361).
2. Use reflection to keep the OrcFileFormat code compatible with higher Spark versions.
3. Comment out HiveTableRelation to keep the code compatible with lower Spark versions (for details, compare
https://github.com/apache/spark/blob/v2.2.1/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala
and
https://github.com/apache/spark/blob/v2.2.0/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala).
4. Check the Spark version to keep the querytimeout code compatible with lower Spark versions.
5. Use a higher py4j version to keep the authtoken code compatible with lower PySpark versions.

Please note: if compilation fails on netty for lower Spark versions, try compiling the source code with -Dnetty.version=4.1.51.Final
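
The reflection approach in the change log can be sketched roughly as follows: look a class up by name at runtime so that the code still compiles when the class is absent from a given Spark version. `OptionalClass` is a hypothetical name, and the two OrcFileFormat class names are assumptions used only for illustration; the PR's actual lookup may differ.

```scala
import scala.util.Try

// Hypothetical helper for illustration; not taken from the Linkis code base.
object OptionalClass {

  /** Load a class by name, returning None when it is not on the classpath. */
  private def load(name: String): Option[Class[_]] =
    Try[Class[_]](Class.forName(name)).toOption

  /** Return the first candidate class that can be loaded, if any. */
  def firstAvailable(candidates: Seq[String]): Option[Class[_]] =
    candidates.iterator.map(load).collectFirst { case Some(cls) => cls }

  /** Assumed class names; which one exists depends on the Spark build in use. */
  def orcFileFormatClass: Option[Class[_]] =
    firstAvailable(
      Seq(
        "org.apache.spark.sql.execution.datasources.orc.OrcFileFormat",
        "org.apache.spark.sql.hive.orc.OrcFileFormat"
      )
    )
}
```

Because the class is referenced only as a string, no import of it is needed at compile time; callers have to handle the `None` case.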

Checklist

  • I have read the Contributing Guidelines on pull requests.
  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the Linkis mailing list first)
  • If this is a code change: I have written unit tests to fully verify the new behavior.

@@ -187,12 +187,21 @@
<artifactId>linkis-rpc</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>net.sf.py4j</groupId>
<artifactId>py4j</artifactId>
Contributor

py4j should not be introduced as a separate dependency here: the Spark jars already include it, so adding it separately will cause conflicts with lower versions.

Member Author

I have set the dependency scope to provided.

@peacewong
Contributor

@rarexixi

@@ -58,7 +60,7 @@ class JdbcSink extends DataCalcSink[JdbcSinkConfig] with Logging {
.repartition(1)
.foreachPartition((_: Iterator[Row]) => {
val jdbcOptions = new JDBCOptions(options)
- val conn: Connection = JdbcUtils.createConnectionFactory(jdbcOptions)()
+ val conn: Connection = createConnectionFactory(jdbcOptions)()
Member

Replace with DriverManager.getConnection(config.getUrl, config.getUser, config.getPassword) here.
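
For reference, the suggestion amounts to something like the sketch below, which opens the connection through plain JDBC instead of a Spark helper. `JdbcSinkSettings` is a hypothetical stand-in that models only the three getters named in the comment, and `PlainJdbc` is likewise an illustrative name.

```scala
import java.sql.{Connection, DriverManager}

// Hypothetical stand-in for the sink configuration; only the three getters
// named in the review comment are modelled here.
final case class JdbcSinkSettings(getUrl: String, getUser: String, getPassword: String)

object PlainJdbc {

  /** Open a connection directly through DriverManager; the caller must close it. */
  def open(config: JdbcSinkSettings): Connection =
    DriverManager.getConnection(config.getUrl, config.getUser, config.getPassword)

  /** Run `body` with a fresh connection and always close it afterwards. */
  def withConnection[A](config: JdbcSinkSettings)(body: Connection => A): A = {
    val conn = open(config)
    try body(conn)
    finally conn.close()
  }
}
```

This has no dependency on Spark-internal classes, but it only works when the JDBC driver for the URL is visible to DriverManager in the JVM that runs the code.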

break()
}
})
}
Member

Is this right?

peacewong (Contributor) left a comment

LGTM.

rarexixi (Member) commented Mar 8, 2023

LGTM.

@rarexixi rarexixi merged commit 2c5abd7 into apache:dev-1.4.0 Mar 8, 2023
@GuoPhilipse GuoPhilipse deleted the supportspark2.2 branch March 10, 2023 14:02
GuoPhilipse added a commit to GuoPhilipse/incubator-linkis that referenced this pull request Mar 20, 2023
peacewong pushed a commit that referenced this pull request Mar 21, 2023
aiceflower added a commit to aiceflower/linkis that referenced this pull request Mar 24, 2023

4 participants