Skip to content

[HUDI-1289] Remove shading pattern for hbase dependencies in hudi-spark-bundle#2147

Merged
vinothchandar merged 1 commit intoapache:masterfrom
rmpifer:hbase-shade-fix
Oct 15, 2020
Merged

[HUDI-1289] Remove shading pattern for hbase dependencies in hudi-spark-bundle#2147
vinothchandar merged 1 commit intoapache:masterfrom
rmpifer:hbase-shade-fix

Conversation

@rmpifer
Copy link
Contributor

@rmpifer rmpifer commented Oct 5, 2020

Tips

What is the purpose of the pull request

Hbase index currently does not work due to relocation when only using hudi-spark-bundle

Brief change log

  • Update hudi-spark-bundle pom to not relocate hbase and htrace pattern
  • Remove codec relocation as this is not included in bundle which was causing error

Verify this pull request

  • Manually verified the change by running an insert on hbase index table through spark-shell using hudi-spark-bundle

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@vinothchandar vinothchandar self-assigned this Oct 5, 2020
@vinothchandar
Copy link
Member

@rmpifer as a quick check, is it possible to shade all the deps of Hbase, leaving hbase classes themselves unshaded? The most concern we have is around guava etc conflicting with what spark/presto use

@vinothchandar
Copy link
Member

@rmpifer if you can confirm the above, we can land this. otherwise LGTM

@umehrot2
Copy link
Contributor

@rmpifer A couple of points:

  • As @vinothchandar mentioned, it would be worth exploring if by just removing the dependency relocation and still continuing to shade, helps avoid the issues with Hbase index, and at the same time not break bootstrap code.
  • If we do go ahead with removing relocation for Hbase, we may want to remove the relocation in hudi-hadoop-mr-bundle and hudi-presto-bundle to avoid any other issues this might cause. One such issue we ran into with bootstrap was that Hbase was writing the KeyValue Comparator class name in HFile footer. At read time it would expect to see the exact same class. However this was resolved by creating our own comparator class for Hbase. https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/common/bootstrap/index/HFileBootstrapIndex.java#L584
  • Lets fix the commit message. We are not removing shading, but avoiding relocation as part of shading process.

@rmpifer
Copy link
Contributor Author

rmpifer commented Oct 14, 2020

@vinothchandar Sorry I've been caught up in some other obligations. We would have to explicitly add the dependencies we want to shade from hbase. If the biggest concern is around guava conflicts I think we may just want to include this for now.

@umehrot2 I'm ok with updating this in hudi-hadoop-mr-bundle and hudi-presto-bundle as well as long as there are no other concerns there

…f guava in hadoop, spark, and presto bundles
Copy link
Member

@vinothchandar vinothchandar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach looks good to me.

@vinothchandar vinothchandar merged commit a44f668 into apache:master Oct 15, 2020
@bhasudha
Copy link
Contributor

@rmpifer We might need to remove this hbase relocation from hudi-utilities-bundle as well. Ran into this issue when using DeltaStreamer without HBASE index and in EMR 5.31.0 with security configs. The job failed as soon as it started. Saw errors similar to #2100 . But was able to quickly verify that on removing this relocation from the hudi-utilities-bundle bundle, the job ran fine.

prashantwason pushed a commit to prashantwason/incubator-hudi that referenced this pull request Feb 22, 2021
…dd shading of guava in hadoop, spark, and presto bundles (apache#2147)

- Update hudi-spark-bundle pom to not relocate hbase and htrace pattern
- Remove codec relocation as this is not included in bundle which was causing error
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants