
[MINOR] Make SparkCatalogMetaStoreClient.setMetaConf a no-op#18652

Merged
danny0405 merged 1 commit into apache:master from prashantwason:spark-catalog-metastore-client-setmetaconf-noop on Apr 30, 2026

Conversation

@prashantwason
Member

Describe the issue this Pull Request addresses

Closes #18651

SparkCatalogMetaStoreClient (#18203) throws UnsupportedOperationException from setMetaConf. HoodieHiveSyncClient calls IMetaStoreClient.setMetaConf unconditionally at construction time (HoodieHiveSyncClient.java:119) to forward hive.metastore.callerContext.* properties for audit/tracing. As a result, any sync client constructed with SparkCatalogMetaStoreClient throws before any catalog operation can run, defeating the purpose for which SparkCatalogMetaStoreClient was introduced (avoiding the Hive-on-Spark classloader split during Hive sync).
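The failure mode can be sketched in miniature. The interface and class names below echo the real ones (IMetaStoreClient, SparkCatalogMetaStoreClient), but the bodies are illustrative stand-ins, not actual Hudi code:

```java
import java.util.Map;

// Minimal stand-in for the construction-time failure described above.
// Names echo the real classes; this is an illustrative sketch only.
public class SetMetaConfDemo {

    // Stand-in for the relevant slice of IMetaStoreClient.
    interface MetaStoreClient {
        void setMetaConf(String key, String value);
    }

    // Pre-PR behavior: SparkCatalogMetaStoreClient rejected the call outright.
    static class ThrowingClient implements MetaStoreClient {
        @Override
        public void setMetaConf(String key, String value) {
            throw new UnsupportedOperationException("setMetaConf");
        }
    }

    // Simplified version of the sync client's constructor logic: it forwards
    // hive.metastore.callerContext.* properties unconditionally.
    static String tryConstruct(MetaStoreClient client, Map<String, String> props) {
        try {
            for (Map.Entry<String, String> e : props.entrySet()) {
                if (e.getKey().startsWith("hive.metastore.callerContext.")) {
                    client.setMetaConf(e.getKey(), e.getValue());
                }
            }
            return "constructed";
        } catch (UnsupportedOperationException ex) {
            return "failed: " + ex.getMessage();
        }
    }

    public static void main(String[] args) {
        // Construction fails before any catalog operation can run.
        System.out.println(tryConstruct(new ThrowingClient(),
            Map.of("hive.metastore.callerContext.app", "hudi-sync")));
    }
}
```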

Summary and Changelog

Make SparkCatalogMetaStoreClient.setMetaConf a no-op. HoodieHiveSyncClient.setMetaConf only forwards hive.metastore.callerContext.* properties, which are diagnostic metadata sent to a remote thrift HMS. With Spark's external catalog there is no remote HMS to receive them; dropping the call is the correct semantic for a non-thrift catalog backend.

Files changed:

  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hive/SparkCatalogMetaStoreClient.scala: setMetaConf now returns Unit without throwing.

Other unsupported methods (e.g. getMetaConf, getTables, getDatabases) continue to throw UnsupportedOperationException because they have meaningful semantics that the Spark catalog implementation does not provide. Only setMetaConf is changed because:

  1. Its sole caller in Hudi (HoodieHiveSyncClient.setMetaConf) only forwards diagnostic context properties.
  2. The Spark catalog has no remote endpoint to receive those properties.
  3. Throwing here is strictly worse than dropping silently — it prevents any other catalog operation from running.
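Given those three points, the fix amounts to accepting the call and dropping the value. The actual edit is in Scala (SparkCatalogMetaStoreClient.scala); the sketch below renders the same idea in Java with illustrative names, as an assumption-laden approximation rather than the real code:

```java
import java.util.Map;

// Hedged sketch of the post-PR behavior; not the actual Hudi/Scala code.
public class NoOpSetMetaConfDemo {

    interface MetaStoreClient {
        void setMetaConf(String key, String value);
    }

    // Post-PR behavior: the Spark-catalog-backed client silently accepts
    // caller-context properties, since there is no remote thrift HMS to
    // forward them to.
    static class SparkCatalogClient implements MetaStoreClient {
        @Override
        public void setMetaConf(String key, String value) {
            // no-op: diagnostic-only properties are dropped for a non-thrift backend
        }
    }

    // Simplified constructor-time forwarding of caller-context properties.
    static String tryConstruct(MetaStoreClient client, Map<String, String> props) {
        try {
            for (Map.Entry<String, String> e : props.entrySet()) {
                if (e.getKey().startsWith("hive.metastore.callerContext.")) {
                    client.setMetaConf(e.getKey(), e.getValue());
                }
            }
            return "constructed";
        } catch (UnsupportedOperationException ex) {
            return "failed: " + ex.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryConstruct(new SparkCatalogClient(),
            Map.of("hive.metastore.callerContext.app", "hudi-sync")));
        // prints "constructed"
    }
}
```

The design choice here is narrow on purpose: only the method whose sole Hudi caller forwards diagnostic metadata becomes a no-op, while methods with real semantics keep throwing so misuse stays loud.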

Impact

Unblocks hoodie.datasource.hive_sync.use_spark_catalog=true. No behavior change for callers that didn't enable this config. No behavior change for the Spark catalog operations themselves (the dropped properties are diagnostic-only, not behavioral).

Risk Level

low

Single-method behavior change in a class introduced two months ago (#18203) that is opt-in via a config defaulting to false. Users with the config disabled see no change. Users with the config enabled were previously unable to use the feature at all, so the only direction the change can go is "from broken to working".

Documentation Update

none

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

HoodieHiveSyncClient invokes IMetaStoreClient.setMetaConf at construction
time to forward hive.metastore.callerContext.* properties to the
metastore for audit/tracing. With Spark's external catalog there is no
remote HMS to receive those values, so throwing UnsupportedOperationException
unconditionally breaks every sync client that uses
SparkCatalogMetaStoreClient (it fails before any actual catalog operation
can run).

Accept the call silently. The caller-context properties are diagnostic
metadata; dropping them is the correct semantic for a non-thrift catalog
backend.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions github-actions Bot added the size:XS PR with lines of changes in <= 10 label Apr 29, 2026
@hudi-bot
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

Contributor

@hudi-agent hudi-agent left a comment


🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

Thanks for the contribution! This PR makes SparkCatalogMetaStoreClient.setMetaConf a no-op so that HoodieHiveSyncClient construction doesn't fail when forwarding diagnostic hive.metastore.callerContext.* properties to a non-thrift Spark catalog backend. No issues flagged from this automated pass — a Hudi committer or PMC member can take it from here for a final review.

cc @yihua

@danny0405 danny0405 merged commit 7443856 into apache:master Apr 30, 2026
62 of 63 checks passed
@codecov-commenter

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 68.06%. Comparing base (8d348cc) to head (9361c8b).
⚠️ Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
...e/spark/sql/hive/SparkCatalogMetaStoreClient.scala 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##             master   #18652   +/-   ##
=========================================
  Coverage     68.06%   68.06%           
- Complexity    28909    28910    +1     
=========================================
  Files          2518     2518           
  Lines        140572   140572           
  Branches      17422    17422           
=========================================
+ Hits          95684    95687    +3     
  Misses        37032    37032           
+ Partials       7856     7853    -3     
Flag Coverage Δ
common-and-other-modules 44.36% <0.00%> (-0.01%) ⬇️
hadoop-mr-java-client 44.95% <ø> (-0.03%) ⬇️
spark-client-hadoop-common 48.44% <ø> (+<0.01%) ⬆️
spark-java-tests 48.63% <0.00%> (-0.01%) ⬇️
spark-scala-tests 44.70% <0.00%> (+<0.01%) ⬆️
utilities 37.70% <0.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
...e/spark/sql/hive/SparkCatalogMetaStoreClient.scala 29.92% <0.00%> (ø)

... and 9 files with indirect coverage changes



Labels

size:XS PR with lines of changes in <= 10


Development

Successfully merging this pull request may close these issues.

SparkCatalogMetaStoreClient.setMetaConf throws UnsupportedOperationException, blocking all Hudi Hive sync that uses it
