feat(hudi-sync): Publish HUDI version to Hive metastore (allowing users to infer which HUDI client jar to use for a given dataset) #18307
Merged
nsivabalan merged 1 commit into apache:master on Mar 13, 2026
Codecov Report
@@             Coverage Diff             @@
##             master   #18307    +/-   ##
=========================================
  Coverage     57.27%   57.27%
- Complexity    18639    18651    +12
=========================================
  Files          1956     1956
  Lines        107069   107086    +17
  Branches      13255    13255
=========================================
+ Hits          61324    61336    +12
- Misses        39939    39945     +6
+ Partials       5806     5805     -1
nsivabalan (Collaborator) approved these changes on Mar 13, 2026
Describe the issue this Pull Request addresses
During hive sync, the Hudi writer version is not published to the Hive Metastore (HMS) table properties. This makes it difficult for downstream consumers and platform tooling to determine which version of the Hudi writer library produced the data for a given table.
Publishing this version lets users infer which Hudi jar version to use when writing to the dataset. This helps when a user is performing a rolling Hudi version upgrade across all their datasets and runs a table service platform that must infer which Hudi jar to use before running a table service against a dataset.
#17954
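As a rough illustration of the consumer side described above, the sketch below shows how a table service platform might read the published property and pick a jar. It is not code from this PR: the class name, the `bundleJarFor` helper, the jar naming scheme, and the version strings are all hypothetical, and the table properties are modeled as a plain `Map` (in a real deployment they would presumably come from the HMS table's parameters). Only the property name `hudi_writer_version` is taken from the PR description.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class WriterVersionLookup {
    // Property name as given in this PR's description.
    static final String WRITER_VERSION_KEY = "hudi_writer_version";

    /** Returns the published writer version, or empty if the table has not been synced since the change. */
    static Optional<String> writerVersion(Map<String, String> tableParams) {
        String value = tableParams.get(WRITER_VERSION_KEY);
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(value.trim());
    }

    /**
     * Hypothetical jar-selection rule: use the published version when present,
     * otherwise fall back to a platform default. The jar name pattern is an assumption.
     */
    static String bundleJarFor(Map<String, String> tableParams, String fallbackVersion) {
        String version = writerVersion(tableParams).orElse(fallbackVersion);
        return "hudi-utilities-bundle-" + version + ".jar";
    }

    public static void main(String[] args) {
        // Table that has been synced after the property was introduced (example version only).
        Map<String, String> synced = new HashMap<>();
        synced.put(WRITER_VERSION_KEY, "1.1.0");
        System.out.println(bundleJarFor(synced, "0.14.1"));

        // Table not yet re-synced: the property is absent, so the fallback applies.
        System.out.println(bundleJarFor(new HashMap<>(), "0.14.1"));
    }
}
```

The explicit fallback matters during a rolling upgrade, since tables only receive the property on their next sync and a consumer has to handle its absence.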
Summary and Changelog
Publish the Hudi writer version as a table property (`hudi_writer_version`) in HMS during hive sync.
- Add an `updateHoodieWriterVersion(String tableName)` default method to the `HoodieMetaSyncOperations` interface
- Implement `updateHoodieWriterVersion` in `HoodieHiveSyncClient`, which reads the current Hudi version via `HoodieVersion.get()` and sets it as a table parameter in HMS
- Call `updateHoodieWriterVersion` in `HiveSyncTool.syncHoodieTable` after updating the last commit time synced
- Extend `TestHiveSyncTool` to validate that the new table property is present
Impact
A new table property, `hudi_writer_version`, will be set on HMS tables during every hive sync. This is a metadata-only change with no impact on the storage format or read/write path. Existing tables will get the property populated on their next sync.
Risk Level
low: The change only adds a single HMS `alter_table` call per sync to set a table-level property. No existing behavior is modified. If the call fails, it throws a clear exception consistent with existing error handling in the sync client.
Documentation Update
none