Bug Description
What happened:
I was using archive_commits with retain_commits = 200 because multiple writers write into one table. I noticed that the documentation for this parameter says:
Archiving of instants is batched in best-effort manner, to pack more instants into a single archive log. This config controls such archival batch size.
But if you check the code, the HoodieWriteConfig is built as:
HoodieWriteConfig config = HoodieWriteConfig.newBuilder().withPath(basePath)
    .withArchivalConfig(HoodieArchivalConfig.newBuilder().archiveCommitsWith(minCommits, maxCommits).build())
    .withCleanConfig(HoodieCleanConfig.newBuilder().retainCommits(commitsRetained).build())
    .withEmbeddedTimelineServerEnabled(false)
    .withMetadataConfig(HoodieMetadataConfig.newBuilder().enable(enableMetadata).build())
    .build();
I think it should map to the config COMMITS_ARCHIVAL_BATCH_SIZE, because the current logs show:
2026-03-31 11:15:39.232 [Thread-10] WARN org.apache.hudi.config.HoodieWriteConfig - Increase hoodie.keep.min.commits=4 to be greater than hoodie.cleaner.commits.retained=200 (there is risk of incremental pull missing data from few instants based on the current configuration). The Hudi archiver will automatically adjust the configuration regardless.
but the code should point to:
public static final ConfigProperty<String> COMMITS_ARCHIVAL_BATCH_SIZE = ConfigProperty
    .key("hoodie.commits.archival.batch")
    .defaultValue(String.valueOf(10))
    .markAdvanced()
    .withDocumentation("Archiving of instants is batched in best-effort manner, to pack more instants into a single"
        + " archive log. This config controls such archival batch size.");
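To make the mismatch concrete, here is a minimal sketch in plain Java. It is not Hudi code: the class ArchivalConfigSketch and the helper minCommitsTooLow are hypothetical names, and the comparison is only my reading of the warning text above. It uses the values from the log line and shows that 200 ends up under hoodie.cleaner.commits.retained, while hoodie.commits.archival.batch is a separate key that the builder call above does not set (so I assume it stays at its default of 10).

```java
import java.util.Properties;

public class ArchivalConfigSketch {

    // Keys copied from the log line and the ConfigProperty definition in this report.
    static final String MIN_COMMITS = "hoodie.keep.min.commits";
    static final String CLEANER_RETAINED = "hoodie.cleaner.commits.retained";
    static final String ARCHIVAL_BATCH = "hoodie.commits.archival.batch";

    // Hypothetical helper (not part of Hudi): returns true when the condition
    // described by the warning holds, i.e. min commits is not greater than
    // the number of commits retained by the cleaner.
    static boolean minCommitsTooLow(Properties props) {
        int minCommits = Integer.parseInt(props.getProperty(MIN_COMMITS));
        int retained = Integer.parseInt(props.getProperty(CLEANER_RETAINED));
        return minCommits <= retained;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(MIN_COMMITS, "4");          // value reported in the log
        props.setProperty(CLEANER_RETAINED, "200");   // where retain_commits = 200 ended up
        props.setProperty(ARCHIVAL_BATCH, "10");      // assumed default, never overridden

        System.out.println("warning condition holds: " + minCommitsTooLow(props));
        System.out.println("archival batch size: " + props.getProperty(ARCHIVAL_BATCH));
    }
}
```

If the parameter were mapped the way the documentation describes, I would expect 200 to appear under hoodie.commits.archival.batch instead, and the warning condition would depend only on the min commits and cleaner settings.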
Environment
Hudi version:
Query engine: (Spark/Flink/Trino etc)
Relevant configs:
Logs and Stack Trace
No response