[GLUTEN-547] Resolve Jar dependency that conflicts with Apache Spark #1030

Merged: 12 commits merged on Mar 7, 2023

Conversation

CodingCat
Contributor

What changes were proposed in this pull request?

This PR proposes shading protobuf in Gluten core to avoid conflicts between its protobuf version and the one shipped with the Spark distribution.
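
For readers unfamiliar with shading, a minimal sketch of a protobuf relocation with maven-shade-plugin follows. It is an illustration under assumptions, not the configuration merged in this PR: the plugin version is the one quoted in the review below, and the `shadedPattern` prefix is hypothetical, modeled on the Arrow relocation quoted later in this conversation.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.4.1</version>
  <executions>
    <execution>
      <!-- run the shade goal when the module is packaged -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- protobuf's Java package, which Spark also ships -->
            <pattern>com.google.protobuf</pattern>
            <!-- hypothetical prefix (assumption): bundled protobuf classes
                 and references to them are rewritten to this package -->
            <shadedPattern>io-glutenproject.shaded.com.google.protobuf</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a relocation like this, the bundled protobuf classes live under a different package than the copy on Spark's classpath, so the two versions no longer collide.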

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

Tested in our production environment.

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

@github-actions

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[Gluten-${ISSUES_ID}] ${detailed message}

See also:

@CodingCat
Contributor Author

pom.xml Outdated
<artifactId>maven-shade-plugin</artifactId>
<version>3.4.1</version>
<configuration>
  <finalName>${jar.assembly.name.prefix}-spark${sparkbundle.version}_${scala.binary.version}-${project.version}-jar-with-dependencies</finalName>
Contributor

Could you add the operating system version here, e.g. ubuntu22.04?

Contributor Author

Do we want to make this change here? I saw that the assembly plugin configuration originally didn't contain this jar.assembly.name.prefix.

package/pom.xml Outdated
Comment on lines 81 to 88
  <pattern>org.apache.arrow</pattern>
  <shadedPattern>io-glutenproject.shaded.org.apache.arrow</shadedPattern>
  <!-- arrow's C wrapper refers to the original class path, so we should not relocate here -->
  <excludes>
    <exclude>org.apache.arrow.c.*</exclude>
    <exclude>org.apache.arrow.c.jni.*</exclude>
  </excludes>
</relocation>
Member

👍

So Spark doesn't refer to arrow-c-data jar, right?

Contributor Author

I don't think it does, at least not right now.

@PHILO-HE
Contributor

The CI check failure was caused by another commit and has been fixed. Please rebase your branch. Thanks!

@CodingCat
Contributor Author

After rebasing, it failed in other places; updating the branch again...

@zhztheplayer changed the title from "[GLUTEN-547]resolve package conflicts" to "[GLUTEN-547] Resolve Jar dependency conflicts with Apache Spark" on Feb 28, 2023
@zhztheplayer
Member

CI's integration testing is looking for a bundle jar, gluten-package-1.0.0-SNAPSHOT.jar, which was installed during Gluten's mvn install. Please ensure the jar is still generated, or update the bundled jar dependency of the integration testing tool gluten-it.

@weiting-chen added the core (works for Gluten Core) label on Mar 1, 2023
@zhztheplayer changed the title from "[GLUTEN-547] Resolve Jar dependency conflicts with Apache Spark" to "[GLUTEN-547] Resolve Jar dependency that conflicts with Apache Spark" on Mar 2, 2023
@zhztheplayer
Member

Also, we need to update the doc:
https://github.com/oap-project/gluten#35-jar-conflicts

@CodingCat
Contributor Author

Sorry for the delay, I will come back to this soon. I'm just stuck with my day job.

@CodingCat
Contributor Author

OK, after trying different things, I came up with the following solution:

  1. Build the shaded jar under the name gluten-package-1.0.0-SNAPSHOT.jar. This will be an uber jar containing everything, and it will be published to the local Maven repository.
  2. Also use the exec-maven plugin to copy gluten-package-1.0.0-SNAPSHOT.jar to ${jar.assembly.name.prefix}-spark${sparkbundle.version}_${scala.binary.version}-${project.version}-jar-with-dependencies.jar. This keeps the file name consistent with the original artifact generated by the assembly plugin, in case someone relies on that file name.
  3. Change gluten-it by removing in pom.xml, so that it correctly refers to the jar generated in step 1.
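
The copy in step 2 could be sketched roughly as below. This is an illustration under assumptions rather than the merged configuration: the plugin version, the execution id, the `package` phase binding, and the use of `cp` (which requires a Unix-like build host) are all hypothetical choices, and both paths assume the jars live in `${project.build.directory}`.

```xml
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <!-- version is an assumption; use whatever the project already pins -->
  <version>3.1.0</version>
  <executions>
    <execution>
      <!-- hypothetical id; must run after the shade goal, so this plugin
           should be declared after maven-shade-plugin in the same POM -->
      <id>copy-bundle-jar</id>
      <phase>package</phase>
      <goals>
        <goal>exec</goal>
      </goals>
      <configuration>
        <executable>cp</executable>
        <arguments>
          <!-- source: the shaded uber jar from step 1 -->
          <argument>${project.build.directory}/gluten-package-1.0.0-SNAPSHOT.jar</argument>
          <!-- target: the file name the assembly plugin used to produce -->
          <argument>${project.build.directory}/${jar.assembly.name.prefix}-spark${sparkbundle.version}_${scala.binary.version}-${project.version}-jar-with-dependencies.jar</argument>
        </arguments>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Plugins bound to the same phase run in the order they are declared in the POM, which is why the declaration order matters here.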

package/pom.xml Outdated (resolved)
@zhztheplayer merged commit 2611bbe into apache:main on Mar 7, 2023
@zhztheplayer
Member

Thanks @CodingCat !

@CodingCat deleted the resolve_package_conflicts branch on March 7, 2023 23:33
izchen added a commit to izchen/gluten that referenced this pull request Mar 11, 2023