Skip to content

[core] Fix dedicated-format bundle write path#7598

Open
QuakeWang wants to merge 2 commits into
apache:masterfrom
QuakeWang:fix/dedicated-bundle-path
Open

[core] Fix dedicated-format bundle write path#7598
QuakeWang wants to merge 2 commits into
apache:masterfrom
QuakeWang:fix/dedicated-bundle-path

Conversation

@QuakeWang
Copy link
Copy Markdown
Contributor

Purpose

Fix the dedicated-format writeBundle path so bundle writes remain correct when data is fanned out to main/blob/vector writers.

This change:

  • materializes only non-replayable bundles in the dedicated fan-out path instead of always materializing them;
  • introduces explicit opt-in bundle capabilities in common (ReplayableBundleRecords and ProjectableBundleRecords) without changing the base BundleRecords contract;
  • allows projection to preserve typed bundle implementations such as ArrowBundleRecords;
  • preserves row-level side effects in bundle writes, including sequence number updates and file-index writing.

Tests

Added or updated tests in paimon-core:

  • ProjectedFileWriterTest
  • BundleAwareRowDataRollingFileWriterTest
  • DedicatedFormatRollingFileWriterTest
  • DedicatedFormatRollingFileWriterVectorTest

These cover:

  • replayable/projectable bundle pass-through and fallback behavior;
  • sequence-number side effects for bundle writes;
  • dedicated-format bundle writes with main-file index side effects;
  • external-storage blob fallback for bundle writes;
  • single-use bundle writes for blob/vector dedicated paths.

@JingsongLi
Copy link
Copy Markdown
Contributor

Can you explain what went wrong?

@QuakeWang
Copy link
Copy Markdown
Contributor Author

Can you explain what went wrong?

The old dedicated-format writeBundle path was still functionally fine in the simple row-by-row fallback, but it dropped bundle semantics completely by iterating the bundle and calling write(row).

That becomes problematic in the dedicated fan-out path, where one logical bundle has to be written to projected main/blob/vector writers. BundleRecords itself does not guarantee replayability, so the same bundle cannot be safely reused across multiple child writers. At the same time, if we forward the bundle directly to restore pass-through, we can bypass row-level side effects in RowDataFileWriter, such as sequence-number updates and main-file index writing. We also lose typed / bundle-aware fast paths such as ArrowBundleRecords.

This patch makes those constraints explicit: replayable bundles can be passed through safely, non-replayable bundles are materialized once, and dedicated row-data writers preserve the row-level side effects after bundle writes.

So the issue was not that the old row-by-row fallback always corrupted data. The issue was that the dedicated-format bundle path could not preserve bundle semantics safely and correctly once a single bundle needed to be fanned out to multiple writers.

Copy link
Copy Markdown
Contributor

@JingsongLi JingsongLi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing the dedicated-format bundle write path. This is a complex change touching the append writer hierarchy. Comments:

Design:

  1. ReplayableBundleRecords / ProjectableBundleRecords interfaces: The opt-in capability pattern is clean. It avoids strengthening the base BundleRecords contract and allows runtime capability checks. This is the right extension pattern for the existing architecture.

  2. Materialization strategy: "Materialize only non-replayable bundles" is correct — if the bundle supports replay, we can iterate it multiple times without copying. The MaterializedBundleRecords fallback for non-replayable bundles ensures correctness.

  3. ArrowBundleRecords implementing ProjectableBundleRecords: This preserves the Arrow fast path through projection, which avoids row-by-row materialization. The project() implementation using rowType.project(projection) is elegant.

Concerns:

  1. BundlePassThroughWriter interface: This introduces yet another interface in the writer hierarchy. The writer abstraction is getting quite layered: SingleFileWriterRowDataFileWriterBundleAwareRowDataFileWriter → implements BundlePassThroughWriter. Ensure this doesn't become a maintenance burden.

  2. Sequence number side effects: The PR description mentions "preserves row-level side effects in bundle writes, including sequence number updates." This is critical — if the sequence number counter isn't advanced correctly for bundle writes, it could cause data consistency issues during compaction. The test BundleAwareRowDataRollingFileWriterTest should explicitly verify sequence numbers are correct after bundle writes.

  3. File index writing: "file-index writing" side effects — does the main-file index get updated correctly when the blob writer uses the bundle path? The DedicatedFormatRollingFileWriterVectorTest should cover this.

  4. 1054 lines added: The change is large. Could the interface additions (ReplayableBundleRecords, ProjectableBundleRecords) be a separate preparatory PR?

Good test coverage with 4 dedicated test classes. Please confirm sequence number correctness in tests.

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
@QuakeWang
Copy link
Copy Markdown
Contributor Author

@JingsongLi Thanks for the review.

I added explicit coverage for the side effects:

  • BundleAwareRowDataRollingFileWriterTest verifies the sequence counter and min/max sequence numbers after bundle writes.
  • DedicatedFormatRollingFileWriterVectorTest verifies sequence side effects for the normal/blob/vector fan-out path.
  • It also verifies the main-file index path by enabling bitmap index on f0, writing a single-use bundle, and evaluating the embedded index.

I kept ReplayableBundleRecords / ProjectableBundleRecords in this PR because they are part of the fix: replayability is needed for safe fan-out, and projectability keeps the Arrow bundle fast path.

The remaining failed CI is a timeout in SinkSavepointITCase.testRecoverFromSavepoint, which looks unrelated to this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants