
chore(deps): Pin AWS v1 SDK BOM to short-circuit transitive version #18619

Merged
voonhous merged 1 commit into apache:master from voonhous:fix-dep-walk-for-cold-builds
Apr 29, 2026

@voonhous
Member

@voonhous voonhous commented Apr 27, 2026

Describe the issue this Pull Request addresses

amazon-kinesis-deaggregator:1.0.3 (added in #18224) transitively pulls in aws-lambda-java-events:1.1.0, whose POM declares its aws-java-sdk-* dependencies with open-ended version ranges:

<version>[1.10.5,)</version>

Maven resolves these literally: it fetches maven-metadata.xml for every affected artifact, then downloads every intermediate POM (1.11.35, 1.11.36, ... 1.11.49x, ...) to walk the version graph. A clean build pulls hundreds of aws-java-sdk-s3-1.11.NNN.pom files just for graph traversal; the resolved jar is never on the classpath.
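
For illustration, a range-based declaration of the kind described above might look like the following. This is a sketch, not a copy of the aws-lambda-java-events:1.1.0 POM: the choice of aws-java-sdk-s3 as the artifact is an assumption based on the POM files mentioned above, and only the version string is taken from this description.

```xml
<!-- Hypothetical excerpt; the artifact shown is an assumed example. -->
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk-s3</artifactId>
  <!-- Open-ended range: any version from 1.10.5 upward satisfies it,
       so Maven has to consult the published version list to choose one. -->
  <version>[1.10.5,)</version>
</dependency>
```

Because the range has no upper bound, the set of candidate versions grows each time a new SDK release is published.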

Side effects:

  • CI cold builds slow down by minutes per module that touches hudi-utilities.
  • Resolution is non-deterministic since it picks "latest at resolve time" within the open range.

Summary and Changelog

Pin com.amazonaws:aws-java-sdk-bom in the root <dependencyManagement>. The BOM imports a fixed version for every com.amazonaws:aws-java-sdk-* artifact, so Maven's <dependencyManagement> short-circuits the transitive ranges before the walk starts.

Changes (pom.xml):

  • New property <aws.sdk.v1.version>1.12.797</aws.sdk.v1.version> next to the existing v2 SDK property. The version matches the highest aws-java-sdk-core already in the resolved graph, so the change is a behavioral no-op: it adopts the SDK version the build was already converging on.
  • Import aws-java-sdk-bom (scope import, type pom) at the top of <dependencyManagement> with an inline comment explaining the rationale.
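
The two changes listed above could look roughly like this in the root pom.xml. This is a sketch reconstructed from the description, not the actual diff: the property name, version, coordinates, type, and scope come from the text above, while the comment wording and the surrounding elements are assumptions.

```xml
<properties>
  <!-- Pinned AWS SDK v1 version, placed next to the existing v2 SDK property. -->
  <aws.sdk.v1.version>1.12.797</aws.sdk.v1.version>
</properties>

<dependencyManagement>
  <dependencies>
    <!-- Import the AWS SDK v1 BOM so every com.amazonaws:aws-java-sdk-*
         artifact gets a single managed version. Managed versions take
         precedence over versions declared by transitive dependencies,
         including open-ended ranges such as [1.10.5,). -->
    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk-bom</artifactId>
      <version>${aws.sdk.v1.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
    <!-- ... existing managed dependencies ... -->
  </dependencies>
</dependencyManagement>
```

With this layout, a future SDK bump would only need to touch the aws.sdk.v1.version property.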

Impact

  • Build time: clean mvn invocations that touch hudi-utilities (or any reactor including it) skip the AWS SDK range walk. Hundreds of POM downloads per fresh ~/.m2 go away.
  • Reproducibility: v1 SDK version is now pinned, not floating. Future bumps go through a deliberate version property change.
  • Runtime: no change. The pinned version (1.12.797) is what the build was already resolving to.
  • Public API: none.

Risk Level

low.

  • BOM <scope>import</scope> only affects dependency resolution; no code change.
  • Pinned version matches what was already on the classpath, so no behavioral drift.
  • v1 SDK 1.12.x is API-stable (maintenance line, security patches only).
  • Verified locally: mvn dependency:tree -pl hudi-utilities -am finishes without the AWS SDK range-walk download flood.

Documentation Update

none.

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

…ange walk

amazon-kinesis-deaggregator (added in apache#18224) pulls aws-lambda-java-events
1.1.0, whose POM declares aws-java-sdk-* deps with soft ranges like
[1.10.5,). Maven resolves these by walking every published patch version,
producing hundreds of POM downloads per clean build. Importing aws-java-sdk-bom
in dependencyManagement overrides the ranges with a single deterministic
version, eliminating the walk.
@voonhous voonhous changed the title chore(deps): Pin AWS v1 SDK BOM to short-circuit transitive version-r… chore(deps): Pin AWS v1 SDK BOM to short-circuit transitive version Apr 27, 2026
Contributor

@hudi-agent hudi-agent left a comment


🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

Thanks for the contribution! This PR pins the AWS v1 SDK BOM in root <dependencyManagement> to short-circuit transitive soft-version-range resolution (avoiding the AWS SDK patch-version walk during clean builds). No issues flagged from this automated pass — a Hudi committer or PMC member can take it from here for a final review.

cc @yihua

@voonhous voonhous requested review from rahil-c and yihua April 27, 2026 21:09
@github-actions github-actions Bot added the size:S PR with lines of changes in (10, 100] label Apr 27, 2026
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.04%. Comparing base (5c73bc0) to head (e99b76a).
⚠️ Report is 3 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18619      +/-   ##
============================================
+ Coverage     68.03%   68.04%   +0.01%     
- Complexity    28906    28920      +14     
============================================
  Files          2518     2518              
  Lines        140594   140598       +4     
  Branches      17420    17420              
============================================
+ Hits          95652    95674      +22     
+ Misses        37089    37073      -16     
+ Partials       7853     7851       -2     
Flag                          Coverage Δ
common-and-other-modules      44.36% <ø> (+0.01%) ⬆️
hadoop-mr-java-client         44.83% <ø> (+<0.01%) ⬆️
spark-client-hadoop-common    48.42% <ø> (+0.01%) ⬆️
spark-java-tests              48.62% <ø> (-0.02%) ⬇️
spark-scala-tests             44.69% <ø> (-0.01%) ⬇️
utilities                     37.72% <ø> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.
see 21 files with indirect coverage changes


@hudi-bot
Collaborator

CI report:

Bot commands

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

@voonhous voonhous merged commit 426cbb8 into apache:master Apr 29, 2026
67 checks passed
rahil-c pushed a commit to rahil-c/hudi that referenced this pull request Apr 29, 2026
…ange walk (apache#18619)