feat(compact): Implement AppendCompactCoordinator for append-only unaware-bucket table compaction #238

Merged
lucasfang merged 8 commits into alibaba:main from lxy-9602:add-append-compact-coordinator
Apr 21, 2026

@lxy-9602
Collaborator

Purpose

Linked issue: #93

Implement AppendCompactCoordinator for synchronous compaction of append-only unaware-bucket tables (bucket=-1).

Key components:

  • AppendCompactCoordinator: Entry point with static Run method. Loads schema, validates table type, scans small files, generates compact tasks via bin-packing (FileBin), and executes them synchronously.
  • AppendCompactTask: Represents a single compaction task for a partition's file group. Reads source files and rewrites them into a single compacted file via AppendOnlyFileStoreWrite.
  • FileBin + PackFiles: Bin-packing algorithm that sorts files by size ascending, then flushes a bin when EnoughContent (totalFileSize >= targetFileSize * 2) or EnoughInputFiles (count >= minFileNum).
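The packing rule in the last bullet can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the PR: `FileMeta`, `Add`, and `Drain` are hypothetical names, only the two thresholds come from the description above, and the handling of leftover files that meet neither threshold is a guess.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a data file's metadata; only the size matters here.
struct FileMeta {
    int64_t file_size;  // bytes
};

// Accumulates files until one of the two flush conditions from the description holds.
class FileBin {
 public:
    FileBin(int64_t target_file_size, std::size_t min_file_num)
        : target_file_size_(target_file_size), min_file_num_(min_file_num) {}

    void Add(const FileMeta& file) {
        files_.push_back(file);
        total_file_size_ += file.file_size;
    }

    // totalFileSize >= targetFileSize * 2
    bool EnoughContent() const { return total_file_size_ >= target_file_size_ * 2; }

    // count >= minFileNum
    bool EnoughInputFiles() const { return files_.size() >= min_file_num_; }

    // Hands the accumulated files to the caller and resets the bin.
    std::vector<FileMeta> Drain() {
        total_file_size_ = 0;
        return std::exchange(files_, {});
    }

 private:
    int64_t target_file_size_;
    std::size_t min_file_num_;
    int64_t total_file_size_ = 0;
    std::vector<FileMeta> files_;
};

// Sorts by size ascending, then flushes the bin whenever either condition is met.
std::vector<std::vector<FileMeta>> PackFiles(std::vector<FileMeta> files,
                                             int64_t target_file_size,
                                             std::size_t min_file_num) {
    std::sort(files.begin(), files.end(),
              [](const FileMeta& a, const FileMeta& b) { return a.file_size < b.file_size; });

    std::vector<std::vector<FileMeta>> groups;
    FileBin bin(target_file_size, min_file_num);
    for (const FileMeta& file : files) {
        bin.Add(file);
        if (bin.EnoughContent() || bin.EnoughInputFiles()) {
            groups.push_back(bin.Drain());
        }
    }
    // Assumption: files still in the bin meet neither threshold and are left for a later run.
    return groups;
}
```

Sorting ascending means the smallest files are grouped first, so a bin tends to reach the file-count threshold before the size threshold; each returned group would correspond to one AppendCompactTask.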

TODO: Deletion vectors (DV) are not yet supported in UNAWARE_BUCKET mode.

Tests

AppendCompactCoordinatorTest

API and Format

Add AppendCompactCoordinator::Run()

Documentation

Generative AI tooling

Generated-by: Claude-4.6-Opus


Copilot AI left a comment


Pull request overview

Implements a new synchronous compaction entry point for append-only tables in UNAWARE_BUCKET mode (bucket = -1), scanning the latest snapshot for small files and rewriting them into compacted outputs.

Changes:

  • Added AppendCompactCoordinator::Run() public API to scan small files, bin-pack them into tasks, and synchronously execute compaction rewrites.
  • Introduced AppendCompactTask to perform the rewrite via AppendOnlyFileStoreWrite::CompactRewrite() and emit CommitMessage/CompactIncrement.
  • Added comprehensive unit tests for coordinator behavior (partition filtering, validation failures, external paths, size thresholds, schema evolution).
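To make the validate-then-scan part of the overview concrete, here is a rough sketch of how small files might be selected and grouped by partition before packing. Everything in it is an assumption for illustration: `FileEntry`, `CollectSmallFiles`, and the `small_file_threshold` parameter are hypothetical, the real coordinator reads files from the latest snapshot rather than from a vector, and it very likely reports errors through its own status type rather than exceptions.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified view of one data file in the latest snapshot.
struct FileEntry {
    std::string partition;  // e.g. "dt=2026-04-21"
    std::string file_name;
    int64_t file_size;      // bytes
};

// UNAWARE_BUCKET mode is identified by bucket = -1.
constexpr int kUnawareBucket = -1;

// Rejects tables that are not in unaware-bucket mode, then returns the files
// smaller than the (hypothetical) threshold, grouped by partition.
std::map<std::string, std::vector<FileEntry>> CollectSmallFiles(
    int bucket, const std::vector<FileEntry>& snapshot_files, int64_t small_file_threshold) {
    if (bucket != kUnawareBucket) {
        throw std::invalid_argument("this sketch only handles bucket = -1");
    }
    std::map<std::string, std::vector<FileEntry>> by_partition;
    for (const FileEntry& file : snapshot_files) {
        if (file.file_size < small_file_threshold) {
            by_partition[file.partition].push_back(file);
        }
    }
    return by_partition;
}
```

Each partition's list would then be bin-packed into file groups, with one compaction task per group, executed one after another.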

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/paimon/core/operation/append_only_file_store_write.h | Exposes CompactRewrite as a public method for coordinator/task usage. |
| include/paimon/append/append_compact_coordinator.h | Adds new public API for append-only unaware-bucket compaction. |
| src/paimon/core/append/append_compact_coordinator.cpp | Implements snapshot scan, bin-packing, task generation, and synchronous execution. |
| src/paimon/core/append/append_compact_task.{h,cpp} | Defines/executes a single compaction rewrite task and builds commit messages. |
| src/paimon/core/append/append_compact_coordinator_test.cpp | Adds UT coverage for coordinator end-to-end scenarios and validations. |
| src/paimon/CMakeLists.txt | Wires new sources and test into the build. |

Comment thread src/paimon/core/append/append_compact_coordinator.cpp
Comment thread src/paimon/core/append/append_compact_coordinator.cpp
Comment thread src/paimon/core/append/append_compact_task.cpp
Comment thread src/paimon/core/append/append_compact_coordinator_test.cpp
Comment thread include/paimon/append/append_compact_coordinator.h
Comment thread src/paimon/core/append/append_compact_coordinator.cpp Outdated
Collaborator

@lucasfang lucasfang left a comment


+1

@lucasfang lucasfang merged commit 0521dca into alibaba:main Apr 21, 2026
9 checks passed