fix: Enhance hudi-azure-bundle #18472

Merged
yihua merged 4 commits into apache:master from linliu-code:add_azure_bundle
May 19, 2026

Conversation

@linliu-code (Collaborator) commented Apr 6, 2026

Describe the issue this Pull Request addresses

#18471

When running Hudi Spark jobs on Azure (ADLS Gen2), the Azure Storage SDK's Netty and Reactor dependencies conflict with Spark's bundled Netty, causing runtime NoSuchMethodError and StacklessClosedChannelException during lock acquisition. Specifically, reactor-netty-http calls HttpClientCodec.<init>(HttpDecoderConfig, boolean, boolean) — a constructor that only exists in Netty 4.1.94+ — but Spark's older Netty HttpClientCodec is loaded instead. This makes the Azure-based StorageBasedLockProvider (added in #17951) unusable in Spark environments.
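
The failure mode described above can be checked on a given classpath with a small reflective probe. The following is a minimal sketch, not part of this PR: the class `ConstructorProbe` and its status strings are hypothetical, and the fully qualified name of `HttpDecoderConfig` (assumed here to live in `io.netty.handler.codec.http`) is an assumption. It only reports whether a constructor with the given parameter types is visible to the current class loader.

```java
// Hypothetical diagnostic helper; not part of Hudi or of this PR.
final class ConstructorProbe {

    // Resolves a parameter type name, mapping the primitives used in this sketch.
    static Class<?> resolve(String name, ClassLoader loader) throws ClassNotFoundException {
        switch (name) {
            case "boolean": return boolean.class;
            case "int":     return int.class;
            default:        return Class.forName(name, false, loader);
        }
    }

    // Returns "ok", "class-not-found", "parameter-type-not-found", or "constructor-not-found".
    static String probe(String className, String... paramTypeNames) {
        ClassLoader loader = ConstructorProbe.class.getClassLoader();
        Class<?> target;
        try {
            target = Class.forName(className, false, loader);
        } catch (ClassNotFoundException | LinkageError e) {
            return "class-not-found";
        }
        Class<?>[] params = new Class<?>[paramTypeNames.length];
        for (int i = 0; i < paramTypeNames.length; i++) {
            try {
                params[i] = resolve(paramTypeNames[i], loader);
            } catch (ClassNotFoundException e) {
                return "parameter-type-not-found";
            }
        }
        try {
            target.getDeclaredConstructor(params);
            return "ok";
        } catch (NoSuchMethodException e) {
            return "constructor-not-found";
        }
    }

    public static void main(String[] args) {
        // Assumed package for HttpDecoderConfig; adjust if it differs in your Netty version.
        System.out.println(probe(
            "io.netty.handler.codec.http.HttpClientCodec",
            "io.netty.handler.codec.http.HttpDecoderConfig", "boolean", "boolean"));
    }
}
```

Running the probe inside the Spark driver would indicate which Netty variant wins class loading there; any result other than `ok` is consistent with the conflict reported in this PR, though it does not by itself prove the cause.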

Additionally, hudi-azure-bundle as it stands on master does not include reactor-netty, the Azure identity transitive deps, or Netty/Reactor relocations, so users still have to manage those jars manually on the Spark classpath.

Summary and Changelog

This PR is purely additive on top of master's existing hudi-azure-bundle skeleton. All changes are in packaging/hudi-azure-bundle/pom.xml:

  • Includes added to the shaded jar:
    • com.nimbusds:* and net.minidev:* — Azure identity transitive deps required for DefaultAzureCredential and related auth providers.
    • io.projectreactor.netty:* — the actual reactor-netty HTTP client used by the Azure SDK.
    • org.reactivestreams:reactive-streams — required by Reactor.
  • Shading relocations added to isolate Netty/Reactor from Spark's classpath:
    • io.netty.* → org.apache.hudi.io.netty.*
    • io.projectreactor.* → org.apache.hudi.io.projectreactor.*
    • reactor.* → org.apache.hudi.reactor.*
    • org.reactivestreams.* → org.apache.hudi.org.reactivestreams.*
  • Avro compile dependency added so the bundle resolves Avro classes referenced by Hudi APIs without relying on the host classpath.
  • Minor: two whitespace fixes in the license header.
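
To make the effect of the relocations listed above concrete, here is a minimal sketch that applies the same four prefix mappings to class names. It is an illustration only: `RelocationSketch` is a hypothetical class, it models relocation as plain prefix replacement, and it is not how the shade plugin is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the package relocations listed in this PR.
final class RelocationSketch {

    // Prefix rules taken from the PR description, in the order listed there.
    static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("io.netty.", "org.apache.hudi.io.netty.");
        RULES.put("io.projectreactor.", "org.apache.hudi.io.projectreactor.");
        RULES.put("reactor.", "org.apache.hudi.reactor.");
        RULES.put("org.reactivestreams.", "org.apache.hudi.org.reactivestreams.");
    }

    // Rewrites a class name with the first matching rule; other names pass through unchanged.
    static String relocate(String className) {
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            if (className.startsWith(rule.getKey())) {
                return rule.getValue() + className.substring(rule.getKey().length());
            }
        }
        return className;
    }

    public static void main(String[] args) {
        System.out.println(relocate("io.netty.handler.codec.http.HttpClientCodec"));
        System.out.println(relocate("org.apache.spark.SparkContext"));
    }
}
```

Because the bundled copies get names under `org.apache.hudi`, they no longer share a fully qualified name with the `io.netty` classes that Spark ships, which is what lets both versions coexist in one JVM.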

Note: this PR was rebased onto current master. The original AzureStorageLockClient commits were dropped because master already has an implementation of that class via #17951; this PR now contains only the bundle module additions.

Impact

  • Enables Hudi's StorageBasedLockProvider to work reliably on Azure/ADLS Gen2 in Spark environments.
  • Eliminates the need to manually place reactor-netty, reactor-core, reactive-streams, and netty-resolver-dns jars on the Spark classpath — everything is self-contained and relocated in the bundle.
  • No impact on existing AWS or GCP bundles — reactor-netty is only included in the Azure bundle.

Risk Level

Low

  • This PR only modifies packaging/hudi-azure-bundle/pom.xml; no source changes.
  • Shading Netty with relocation is a well-established pattern (used by HBase, gRPC, Snowflake, and DataHub in this same codebase).
  • One area to monitor: Netty native transports (epoll/kqueue) reference class names via JNI — however, Azure SDK's HTTP client uses reactor-netty over NIO, so relocated Netty classes function correctly.

Documentation Update

none

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

@linliu-code changed the title from "feat: Add azure bundle" to "feat: Add hudi-azure-bundle" Apr 6, 2026
@linliu-code marked this pull request as ready for review April 6, 2026 17:21
@github-actions bot added the "size:L" label (PR with lines of changes in (300, 1000]) Apr 6, 2026
@yihua (Contributor) left a comment

🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

Style & Readability Review — One code reuse issue: URI parsing and container validation logic is duplicated between readObject() and writeObject() methods.

} else {
logger.error("Error reading JSON config file: {}", filePath, e);
}
return Option.empty();

🤖 nit: this URI parsing and container validation (lines 288–297) is duplicated from readObject(). Could you extract into a private helper method?

@yihua (Contributor) left a comment

🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

Thanks for contributing! The overall structure of AzureStorageLockClient is clean and follows the S3/GCS pattern well. There's one functional bug worth addressing in the ETag handling before merging: the quote-stripping in readCurrentLockFile creates an inconsistency with the ETag format expected by BlobRequestConditions.setIfMatch(), which will break the "take over expired lock" scenario. Details in the inline comment.

String eTag = response.getHeaders().getValue("ETag");
if (eTag != null) {
// Azure returns ETags wrapped in quotes, remove them
eTag = eTag.replaceAll("^\"|\"$", "");

🤖 This quote-stripping creates an ETag format inconsistency that breaks conditional writes for the expired-lock takeover path. BlockBlobItem.getETag() (used in createOrUpdateLockFileInternal) returns the ETag WITH surrounding double-quotes (e.g. "0x8D4A"), which is exactly what BlobRequestConditions.setIfMatch() expects — it passes the value directly to the If-Match header, so the quotes must be present for a valid HTTP conditional request. By stripping them here, any setIfMatch call using an ETag sourced from readCurrentLockFile sends If-Match: 0x8D4A (unquoted), while a call using an ETag from a write sends If-Match: "0x8D4A" (correctly quoted). Removing the replaceAll and keeping the raw header value would align both paths and avoid this failure mode.
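
The inconsistency described in this comment can be reproduced with plain string handling. The sketch below is illustrative only: `ETagSketch` and `ifMatchHeader` are hypothetical names, the sample ETag value is the one quoted in the comment, and no Azure SDK call is made. It shows only how the regex from the reviewed snippet changes the value that would end up in an `If-Match` header.

```java
// Hypothetical illustration of the ETag mismatch; no Azure SDK types are used.
final class ETagSketch {

    // Same regex as the reviewed snippet: drops one leading and one trailing double quote.
    static String stripQuotes(String eTag) {
        return eTag.replaceAll("^\"|\"$", "");
    }

    // Renders the header line a conditional request would carry for the given ETag.
    static String ifMatchHeader(String eTag) {
        return "If-Match: " + eTag;
    }

    public static void main(String[] args) {
        String raw = "\"0x8D4A\"";                            // ETag as returned, quotes included
        System.out.println(ifMatchHeader(raw));               // read path without stripping
        System.out.println(ifMatchHeader(stripQuotes(raw)));  // read path with stripping
    }
}
```

The two printed headers differ only in the quotes, which is the mismatch the comment says would break the conditional write when an expired lock is taken over.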

@github-actions bot added the "size:S" label (PR with lines of changes in (10, 100]) and removed the "size:L" label (PR with lines of changes in (300, 1000]) May 15, 2026
@hudi-agent (Contributor) left a comment

🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

Thanks for the contribution! This PR turns hudi-azure-bundle into a proper shaded fat jar with Netty/Reactor relocation, mirroring the AWS and GCP bundles. There is one question on a build-config divergence from the sibling bundles that is worth a second look; please see the inline comment. There are also a couple of spots where a short explanatory comment would save future readers some head-scratching. After that, this should be ready for a Hudi committer or PMC member to take it from here.

Comment thread packaging/hudi-azure-bundle/pom.xml Outdated
<resource>
<directory>src/main/resources</directory>
</resource>
<resource>

🤖 Is there a reason to add src/test/resources to the main <resources> block? Maven's standard layout already exposes it for the test phase, and adding it here will package any test resources into the production shaded jar. The sibling hudi-aws-bundle and hudi-gcp-bundle only declare src/main/resources. Could you drop this entry unless something specifically requires it?

- AI-generated; verify before applying. React 👍/👎 to flag quality.

Comment thread packaging/hudi-azure-bundle/pom.xml Outdated
<directory>src/main/resources</directory>
</resource>
<resource>
<directory>src/test/resources</directory>

🤖 nit: including src/test/resources in the main build resources is unusual — could you add a comment explaining why it's needed here (e.g. which runtime config files live there that the bundle requires)?

- AI-generated; verify before applying. React 👍/👎 to flag quality.

Reply (Contributor):

Good catch. This should be removed.

@yihua changed the title from "feat: Add hudi-azure-bundle" to "fix: Enhance hudi-azure-bundle" May 19, 2026
Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
@hudi-bot (Collaborator)

CI report:

Bot commands
@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

@yihua (Contributor) left a comment

LGTM

@yihua merged commit f67dc26 into apache:master May 19, 2026
62 of 64 checks passed
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.20%. Comparing base (4035f70) to head (bdc66e5).
⚠️ Report is 15 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18472      +/-   ##
============================================
+ Coverage     68.14%   68.20%   +0.05%     
- Complexity    29105    29241     +136     
============================================
  Files          2518     2525       +7     
  Lines        141221   141667     +446     
  Branches      17534    17588      +54     
============================================
+ Hits          96237    96625     +388     
- Misses        37068    37088      +20     
- Partials       7916     7954      +38     
Flag Coverage Δ
common-and-other-modules 44.34% <ø> (-0.07%) ⬇️
hadoop-mr-java-client 44.98% <ø> (-0.03%) ⬇️
spark-client-hadoop-common 48.29% <ø> (-0.03%) ⬇️
spark-java-tests 48.84% <ø> (-0.15%) ⬇️
spark-scala-tests 44.94% <ø> (+0.04%) ⬆️
utilities 37.46% <ø> (-0.15%) ⬇️

Flags with carried forward coverage won't be shown.
see 36 files with indirect coverage changes

yihua pushed a commit that referenced this pull request May 20, 2026
dwshmilyss pushed a commit to dwshmilyss/hudi that referenced this pull request May 21, 2026
Labels

size:S PR with lines of changes in (10, 100]

5 participants