HIVE-28930: Implement a metastore service that expires iceberg table snapshots periodically #5786
@deniskuzZ: this is the reusable, general part of the Iceberg table maintenance service (it contains no query history bits). I would appreciate a review once you have time for it.
In general LGTM, minor comments
+1, pending tests
What changes were proposed in this pull request?
This patch introduces a metastore task, implemented as a MetastoreTaskThread, that periodically expires snapshots of Iceberg tables. The tables to process are selected by configuration: catalog name, database pattern, and table pattern. The configuration was inspired by the partition management task.
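To illustrate the selection logic described above, here is a minimal, dependency-free sketch in Java. It is not the code from this patch: `SnapshotExpirySketch`, `TableId`, `matches`, and `runOnce` are hypothetical names, and the pattern syntax (`*` as a wildcard, `|` between alternatives) is an assumption made for the example. The actual expiry is passed in as a callback so the sketch does not depend on Hive or Iceberg classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.regex.Pattern;

// Hypothetical sketch: selects tables by catalog name, database pattern and
// table pattern, then hands each match to an "expire" callback.
class SnapshotExpirySketch {

  /** Hypothetical table identifier; not a Hive or Iceberg class. */
  static final class TableId {
    final String catalog;
    final String database;
    final String table;

    TableId(String catalog, String database, String table) {
      this.catalog = catalog;
      this.database = database;
      this.table = table;
    }
  }

  /** Assumed pattern syntax: '*' matches any run of characters, '|' separates alternatives. */
  static boolean matches(String glob, String name) {
    for (String alternative : glob.split("\\|")) {
      String[] literals = alternative.split("\\*", -1);
      StringBuilder regex = new StringBuilder();
      for (int i = 0; i < literals.length; i++) {
        if (i > 0) {
          regex.append(".*"); // each '*' in the glob becomes '.*'
        }
        regex.append(Pattern.quote(literals[i])); // everything else is literal
      }
      if (Pattern.matches(regex.toString(), name)) {
        return true;
      }
    }
    return false;
  }

  /** One pass of the task: apply the callback to every matching table and return the matches. */
  static List<TableId> runOnce(List<TableId> allTables, String catalog, String dbPattern,
      String tablePattern, Consumer<TableId> expire) {
    List<TableId> processed = new ArrayList<>();
    for (TableId id : allTables) {
      if (id.catalog.equals(catalog)
          && matches(dbPattern, id.database)
          && matches(tablePattern, id.table)) {
        expire.accept(id);
        processed.add(id);
      }
    }
    return processed;
  }

  public static void main(String[] args) {
    List<TableId> tables = Arrays.asList(
        new TableId("hive", "sales", "orders"),
        new TableId("hive", "hr", "employees"));
    // The callback only logs here; a real task would expire snapshots instead.
    runOnce(tables, "hive", "*", "ord*",
        id -> System.out.println("would expire snapshots of "
            + id.catalog + "." + id.database + "." + id.table));
  }
}
```

In a real metastore task, one pass like `runOnce` would be invoked at the configured run frequency, and the callback would perform the snapshot expiry on each selected Iceberg table.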
Patch contents:
Why are the changes needed?
This service acts as a convenient helper for maintaining Iceberg tables, which otherwise requires the user to run explicit HiveQL statements.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Unit tests added.
Manual testing is also possible, as the patch adds MiniHS2 capability and fixes for running metastore tasks in remote mode. Example command: