Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jar conflicts on Arrow and Protobuf between Vanilla Spark and Gluten #789

Closed
zhztheplayer opened this issue Jan 3, 2023 · 9 comments
Closed
Labels
bug Something isn't working velox backend works for Velox backend

Comments

@zhztheplayer
Copy link
Member

I see we have extended a PR to fix this then revert it.

cc6d264
74c3678

@CodingCat We can discuss what have to do about the issue in this thread

@zhztheplayer zhztheplayer added the bug Something isn't working label Jan 3, 2023
@CodingCat
Copy link
Contributor

CodingCat commented Jan 4, 2023

yeah, as mentioned by @zzcclp , the original PR leads to the project loading failure in IDEA and therefore the workflow of people used to using that for development is broken

the root cause analysis is following:

  1. maven "implements" cross-module based on jar. i.e. if module B depends on module A, module A is packed as a jar first which is taken by B as a dependency
  2. IDEA "implements" cross-module dependency by simply compiling classes
  3. maven shade plugin relocates classes when packing the jar

so when people uses IDEA to develop with the changed code from "import x.y.z" to "import shade.x.y.z", IDEA cannot find the dependency and the whole project is broken

I am proposing

  1. use gradle for the whole project as gradle is much better to handle such issues and based on my experience of managing a complicated JAVA/Scala mono repo , you can always find some solution with gradle for various issues
  2. developers have to get a hacky workflow by building jars first and then load to IDEA ... it is with the minimum change, but may bring more complexities and frictions when , e.g. , run a unit test with a updated gluten-core jar...
  3. don't touch...if we cannot get the agreement on solution, we have to push it to later...but TBH, it is just the worst choice to me

cc @PHILO-HE

@zhztheplayer
Copy link
Member Author

Was this proved a correct usage of dependency shading?
cc6d264#diff-5c796f2425c83beae6bafec60f1fca955c4f21f2b22a8961ca87110bd1bd6906L18-L24

Some questions here:

  1. What happened in IDEA if at least running "mvn install" once?
  2. What happened if we shade on the assembly jar rather than gluten-core jar?

@CodingCat
Copy link
Contributor

Was this proved a correct usage of dependency shading?

you mean whether we are referring the right class? yes, I run it in our environment successfully without the issue

What happened in IDEA if at least running "mvn install" once?

this is basically my second proposal, I am OK with this, but everyone need to be onboarded to this approach, e.g. if you changed something in gluten-core and want to run a test in velox-backend to verify the change, you need to remember local publish the jar...otherwise you are likely got confused

@CodingCat
Copy link
Contributor

What happened if we shade on the assembly jar rather than gluten-core jar?

this is a good question, I was discussing with @PHILO-HE . The reason I discard here is that, I found the eventual jar will contain two copies of pb, one for relocated, the other is not relocated....I put shade plugin in root pom.xml, in theory, every module should get relocated when being packed into a jar, but still....I checked mvn dependency:tree, everything looks fine so I suspect if it is a bug of shade plugin

image

image

@zhztheplayer
Copy link
Member Author

I put shade plugin in root pom.xml

Does this make sense to apply shade plugin on the bundled jar generated by maven-assembly-plugin? e.g. https://github.com/oap-project/gluten/blob/main/package/velox/spark32/pom.xml

I recalled that (correct me if I was wrong) maven-shade-plugin could do the similar work with maven-assembly-plugin to generate a bundled jar but with some classes to be shaded.

@CodingCat
Copy link
Contributor

CodingCat commented Jan 5, 2023

so I spent another few hours in this problem, I don't think anything new is found, please check the following

  1. I repeated what my original PR did. As expected, we still fell into the problem of IDEA setup, unless we explicitly add the built gluten-core jar to the dependency of gluten-data (FYI @PHILO-HE , mvn install before running the test doesn't work here). This approach will bring a lot of unknowns, e.g. when you explicitly add a gluten-core.jar but simultaneously keep the cross-module dependency between gluten-core and gluten-data, which one will take precedence is unclear.

  2. only add shade plugin in package/velox directory, it doesn't work at all. The protobuf classes are not relocated at all (I think the reason is that shade plugin only takes care of the dependency defined in the same pom.xml, in this case, since protobuf is brought by gluten-core so that it cannot be covered)

  3. add shade plugin in the root pom.xml, same as 1, we need to deal with the IDEA setup problem

  4. I think we should learn from other projects and how they handle the similar issues. There are two examples:

Flink (https://github.com/apache/flink/blob/03c0f155c2f5198bf1fda35de43afc34a4b12f6e/flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/HttpRequestHandler.java). Flink uses maven, it refers to the shaded netty classes but still makes IDEA happy. The reason is that it publishes a relocated netty library https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-netty/4.1.65.Final-14.0 , everything looks fine in this case. (all packages shaded and published for flink are at https://github.com/apache/flink-shaded)

Iceberg (https://github.com/apache/iceberg/blob/0c79e40f8772672daedd45db516e003d9569e038/build.gradle#L203-L242). Iceberg uses gradle, so it can easily define a shaded project just for guava and then refer it in other modules, https://github.com/apache/iceberg/blob/a878ad7cd61b68cf17ead57c3aa7c2b3dae5ab6f/spark/v3.3/build.gradle#L54

I THINK it should have been clear that we have only two options

  1. publish shaded protobuf like Flink and change the code in gluten repo accordingly (maintenance overhead for the shaded protobuf, etc. are unavoidable)
  2. use gradle like Iceberg (bite the bullet of transition overhead but pay less for the future maintenance )

@jinchengchenghh
Copy link
Contributor

jinchengchenghh commented Jan 11, 2023

Maybe you can shade in the file package/pom.xml like hudi https://github.com/apache/hudi/blob/master/packaging/hudi-spark-bundle/pom.xml#L213-L214

@zhztheplayer
Copy link
Member Author

zhztheplayer commented Jan 11, 2023

@CodingCat I've create a PR as proof-of-concept to use shade plugin instead of assembly plugin:

#837

Would you like to check on the changes to see if it works on your side?

@zhztheplayer
Copy link
Member Author

2611bbe

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working velox backend works for Velox backend
Projects
None yet
Development

No branches or pull requests

4 participants