Closed as not planned
Description
After reading this mailing-list thread (https://www.mail-archive.com/dev@iceberg.apache.org/msg01416.html), I turned on write.metadata.delete-after-commit.enabled for my tables, which are written to by a streaming job.
ALTER TABLE hive.db_name.table_name SET TBLPROPERTIES ('write.metadata.delete-after-commit.enabled'='true')
But even after enabling it, the table's footprint on S3 is still very large, because all of the metadata files are still there. Do I need to enable some other property too?
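A minimal sketch for checking what the latest metadata file actually tracks, assuming it has been downloaded locally as JSON. The field names `properties`, `metadata-log`, and `snapshots` come from the Iceberg table spec. The helper name `summarize_metadata` is hypothetical, and the fallback defaults (`false` and `100`) are assumptions taken from the Iceberg configuration docs, where `write.metadata.previous-versions-max` is the companion property that bounds how many previous metadata files are tracked. The delete-after-commit setting is documented as removing only the oldest tracked metadata files, so manifests, manifest lists, and data files referenced by old snapshots would likely not be affected by it.

```python
import json


def summarize_metadata(metadata: dict) -> dict:
    """Summarize the parts of an Iceberg table metadata document that
    relate to metadata-file retention.

    `metadata` is the parsed content of a vN.metadata.json file.
    """
    props = metadata.get("properties", {})
    return {
        # Property set in the ALTER TABLE statement above.
        # Assumed default when absent: "false".
        "delete_after_commit": props.get(
            "write.metadata.delete-after-commit.enabled", "false"
        ),
        # Companion property; assumed default when absent: 100.
        "previous_versions_max": int(
            props.get("write.metadata.previous-versions-max", "100")
        ),
        # Number of previous metadata files still tracked by this version.
        "tracked_previous_metadata_files": len(metadata.get("metadata-log", [])),
        # Number of snapshots still retained by the table.
        "snapshots": len(metadata.get("snapshots", [])),
    }


def summarize_metadata_file(path: str) -> dict:
    """Load a local copy of a metadata file and summarize it."""
    with open(path, encoding="utf-8") as f:
        return summarize_metadata(json.load(f))
```

Calling `summarize_metadata_file` with the path to a local copy of the latest metadata file returns the four values as a dict. If the snapshot count is in the thousands, most of the remaining files are probably tied to snapshots rather than to old metadata.json versions.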
Latest metadata file content
{
  "format-version" : 2,
  "table-uuid" : "7e6356d7-2e6e-40f0-a462-073b4cbd40fc",
  "location" : "S3_LOCATION",
  "last-sequence-number" : 4072,
  "last-updated-ms" : 1627992967421,
  "last-column-id" : 16,
  "schema" : {
    "type" : "struct",
    "fields" : [...]
  },
  "default-spec-id" : 0,
  "partition-specs" : [ {
    "spec-id" : 0,
    "fields" : [ ]
  } ],
  "default-sort-order-id" : 0,
  "sort-orders" : [ {
    "order-id" : 0,
    "fields" : [ ]
  } ],
  "properties" : {
    "engine.hive.enabled" : "true",
    "write.format.default" : "parquet",
    "write.parquet.compression-codec" : "snappy",
    "write.metadata.delete-after-commit.enabled" : "true"
  },
  "current-snapshot-id" : 6552119266625920959,
  ...
