chore(docker): silence Dockerfile lint warnings across image set #18664

Merged
danny0405 merged 1 commit into apache:master from voonhous:fix-docker-warnings on Jun 4, 2026

@voonhous

Member

Describe the issue this Pull Request addresses

BuildKit emits three classes of lint warnings across the Hudi docker/ tree. Two of them flag real bugs (silently empty env vars, broken signal handling); the third is stylistic:

  • LegacyKeyValueFormat: ENV KEY VALUE form is deprecated.
  • UndefinedVar: pre-FROM ARGs do not carry into the build stage, so ENV HADOOP_DN_PORT ${HADOOP_DN_PORT} etc. expanded to empty strings.
  • JSONArgsRecommended: CMD startup.sh runs through /bin/sh -c, so the shell wrapper receives SIGTERM instead of the script, which breaks graceful shutdown of the metastore/hiveserver.
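
For illustration, a minimal Dockerfile that would trigger all three checks might look like the sketch below. The base image name `example/hadoop-base` and the `HADOOP_VERSION` value are hypothetical placeholders, not taken from the Hudi images; only the instruction shapes and the 50075 default follow the description above.

```dockerfile
# Hypothetical "before" sketch; image name and version are placeholders.
ARG HADOOP_VERSION=2.8.4
ARG HADOOP_DN_PORT=50075
FROM example/hadoop-base:${HADOOP_VERSION}

# LegacyKeyValueFormat: the whitespace-separated ENV form is deprecated.
# UndefinedVar: HADOOP_DN_PORT was declared before FROM, so it is out of
# scope inside the stage and ${HADOOP_DN_PORT} expands to an empty string.
ENV HADOOP_DN_PORT ${HADOOP_DN_PORT}

# JSONArgsRecommended: shell form is executed as /bin/sh -c "startup.sh",
# so the container's main process is the shell wrapper, not the script.
CMD startup.sh
```

Note that the pre-FROM `ARG HADOOP_VERSION` is still usable in the `FROM` line itself; it is only inside the build stage that pre-FROM arguments stop being visible.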

Summary and Changelog

  • Rewrite ENV KEY VALUE to ENV KEY=VALUE across 13 Dockerfiles. Quoted values and ${VAR} expansion are preserved.
  • Re-declare HADOOP_DN_PORT, HADOOP_WEBHDFS_PORT, and HADOOP_HISTORY_PORT as ARG after the FROM line in datanode, namenode, and historyserver, so the corresponding ENV expansions actually receive the build-arg values.
  • hive_base: switch CMD startup.sh to JSON-array form CMD ["startup.sh"] so SIGTERM is delivered directly to the script.
  • Drop trailing whitespace from a couple of ARG HADOOP_VERSION lines.
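
The shape of these changes can be sketched as follows. This is an assumed reconstruction, not a copy of the actual diff: the base image name `example/hadoop-base` and the `HADOOP_VERSION` value are hypothetical, and the real Dockerfiles contain many more instructions.

```dockerfile
# Hypothetical "after" sketch; image name and version are placeholders.
ARG HADOOP_VERSION=2.8.4
ARG HADOOP_DN_PORT=50075
FROM example/hadoop-base:${HADOOP_VERSION}

# Re-declaring the ARG without a value inside the stage brings the
# pre-FROM default (or any --build-arg override) into scope.
ARG HADOOP_DN_PORT

# KEY=VALUE form; ${VAR} expansion now sees the re-declared ARG.
ENV HADOOP_DN_PORT=${HADOOP_DN_PORT}

# Exec (JSON-array) form: the script is started without a /bin/sh -c wrapper.
CMD ["startup.sh"]
```

With the exec form the script itself becomes the container's main process, so whether the underlying service stops cleanly still depends on how `startup.sh` handles or forwards the signal, which this PR description does not show.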

Files touched: base, base_java11, base_java17, datanode, historyserver, hive_base, namenode, prestobase, spark_base, sparkadhoc, sparkmaster, sparkworker, trinobase.

Impact

  • Build: BuildKit lint warnings cleared.
  • Runtime: previously empty HADOOP_DN_PORT / HADOOP_WEBHDFS_PORT / HADOOP_HISTORY_PORT env vars are now populated from the corresponding ARGs (default values unchanged: 50075 / 50070 / 8188).
  • Runtime: hive metastore/hiveserver containers now receive SIGTERM directly instead of through /bin/sh -c, so docker stop shuts them down cleanly.
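
One way to check the env var behavior in isolation is a small throwaway Dockerfile like the sketch below. It is not part of this PR: the `busybox` base image and the `RUN test -n` guard are assumptions chosen to keep the example self-contained, and only the variable name and its 50075 default come from the description above.

```dockerfile
# Throwaway reproduction; not one of the Hudi images.
ARG HADOOP_DN_PORT=50075
FROM busybox:1.36

# Bring the pre-FROM ARG into the stage, then export it as an env var.
ARG HADOOP_DN_PORT
ENV HADOOP_DN_PORT=${HADOOP_DN_PORT}

# Fail the build if the value is empty, which is what happens when the
# ARG re-declaration above is missing.
RUN test -n "${HADOOP_DN_PORT}"

# Print the value when the container starts.
CMD ["sh", "-c", "echo HADOOP_DN_PORT=$HADOOP_DN_PORT"]
```

Removing the in-stage `ARG HADOOP_DN_PORT` line should make the `RUN test -n` step fail, which mirrors the silently empty value the original images were producing.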

Risk Level

low. Docker build only. The behavior changes (env var population, signal delivery) are corrections, not new features, and defaults match prior intent.

Documentation Update

none.

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

Three classes of BuildKit warnings are addressed in the Hudi docker tree:

- LegacyKeyValueFormat: rewrite all `ENV KEY VALUE` to `ENV KEY=VALUE`
  across 13 Dockerfiles. Quoted values and ${VAR} expansion preserved.
- UndefinedVar: re-declare HADOOP_DN_PORT, HADOOP_WEBHDFS_PORT, and
  HADOOP_HISTORY_PORT as ARG after the FROM line in datanode, namenode,
  and historyserver. Pre-FROM ARGs do not propagate into the build stage,
  so the corresponding ENV expansions were silently empty.
- JSONArgsRecommended: change `CMD startup.sh` to `CMD ["startup.sh"]`
  in hive_base so SIGTERM is delivered to the script directly instead of
  through a /bin/sh -c wrapper.

No runtime behavior change beyond the three port env vars now being
populated and the metastore/hiveserver receiving shutdown signals correctly.

@hudi-agent left a comment

Contributor

🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

No reviewable code files in this PR.

cc @yihua

@github-actions (bot) added the size:M (PR with lines of changes in (100, 300]) label on Apr 30, 2026
@hudi-bot

Collaborator

CI report:

@voonhous added this to the release-1.2.0 milestone on May 13, 2026
@yihua removed this from the release-1.2.0 milestone on May 15, 2026
@voonhous closed this on Jun 3, 2026
@voonhous reopened this on Jun 3, 2026
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.13%. Comparing base (38db5ed) to head (e222324).
⚠️ Report is 113 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18664      +/-   ##
============================================
+ Coverage     68.06%   69.13%   +1.07%     
- Complexity    28922    32101    +3179     
============================================
  Files          2518     2548      +30     
  Lines        140574   153878   +13304     
  Branches      17419    20520    +3101     
============================================
+ Hits          95682   106384   +10702     
- Misses        37036    39104    +2068     
- Partials       7856     8390     +534     
Flag Coverage Δ
common-and-other-modules 45.21% <ø> (+0.84%) ⬆️
hadoop-mr-java-client 45.76% <ø> (+0.87%) ⬆️
spark-client-hadoop-common 48.05% <ø> (-0.39%) ⬇️
spark-java-tests 49.30% <ø> (+0.66%) ⬆️
spark-scala-tests 44.96% <ø> (+0.26%) ⬆️
utilities 38.66% <ø> (+0.97%) ⬆️

see 283 files with indirect coverage changes

@danny0405 merged commit c40f765 into apache:master on Jun 4, 2026
120 checks passed
@voonhous deleted the fix-docker-warnings branch on June 4, 2026 at 05:07

6 participants