[SUPPORT] Should we shade all aws dependencies to avoid class conflicts? #4474
Comments
I seem to have run into the same problem in #4475. Is there a quick way to solve it? |
In our internal Hudi version we shade the AWS dependencies. You can add a new relocation and build a new bundle package. For example, to shade the AWS dependencies for Spark, add the following to packaging/hudi-spark-bundle/pom.xml <!-- line 185-->
<relocation>
<pattern>com.amazonaws.</pattern>
<shadedPattern>${spark.bundle.spark.shade.prefix}com.amazonaws.</shadedPattern>
</relocation> |
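For context, here is a minimal sketch of where such a relocation usually sits inside a maven-shade-plugin configuration. The surrounding plugin block is an assumption about the general layout of a shaded bundle pom, not a copy of the actual hudi-spark-bundle file; only the relocation itself comes from the comment above.

```xml
<!-- Sketch only: the enclosing plugin/execution layout is assumed, not taken from the Hudi pom. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- Existing relocations stay as they are; the AWS one is added alongside them. -->
          <relocation>
            <pattern>com.amazonaws.</pattern>
            <shadedPattern>${spark.bundle.spark.shade.prefix}com.amazonaws.</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

If the AWS artifacts are actually included in the bundle, listing the rebuilt jar's contents (for example with `jar tf`) should show those classes under the prefixed package rather than under `com/amazonaws/`.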
I tried it, but the following exception was still thrown.
|
Not the same exception? |
Yes, but it works fine. |
@xushiyan @zhedoubushishi It looks like this is a common case for many users. Does shading these dependencies have any side effects? If not, I'm willing to raise a ticket to solve this. |
@boneanxs +1 for shading. We already do that for other dependencies (global search |
After some investigation, I'm curious why the flink bundle removed the amazonaws packages in HUDI-2803 (the JIRA doesn't have much description) and now only includes hudi-aws, per PR 4127. Can we follow flink and do the same for spark-bundle and utilities-bundle? (I'm afraid flink might miss some dependencies when using the DynamoDB-based lock.) @codope @xushiyan Also, could you please add me as a contributor to the project? I cannot assign myself; my username is Bone An. Thanks in advance. |
Hi, I solved my issue by removing the AWS deps from the pom file (#4442). Are there any plans to add AWS deps to support more functions in the future? And why were they added in the first place? |
@boneanxs @a0x Thanks for sharing the info and ideas. I've filed https://issues.apache.org/jira/browse/HUDI-3157 |
Thanks for bringing up this issue. My initial idea is to relocate the AWS jars with a Hudi prefix to avoid jar conflicts. If we simply remove the shading for the AWS jars, users will need to pass the AWS jars on the Spark/Flink classpath manually when they use the AWS DynamoDB/CloudWatch features. |
Looks like flink-bundle already removed this: HUDI-2803
After some discussion, we think we should keep cloud providers' jars out of the open source bundle jars. Any cloud provider can create its own specific hudi module and hudi bundle jars. (like I've pivoted this ticket to removing bundle deps to align with the flink bundle changes. https://issues.apache.org/jira/browse/HUDI-3157
@zhedoubushishi To make the bundle a bit easier for users, as I suggested above, please consider adding an AWS-specific hudi bundle to resolve the dependency problem. I hope this aligns with your thoughts too. |
Closing the GitHub issue as we have a tracking JIRA. Thank you, folks, for chiming in. |
Could this be included in the future separate AWS bundle? |
Since HUDI-2314 introduced support for the DynamoDB-based lock, can we shade all AWS dependencies in all our bundled jars (spark, flink)? Many users import their own AWS packages alongside hudi, which can cause class conflicts like the following error:
I'm not sure whether shading these jars will introduce other issues or not. @zhedoubushishi Can you take a look at this issue?