[lake/tiering] add table dropped handling and nullable write results#2920

Open
beryllw wants to merge 2 commits into apache:main from beryllw:tiering-droptbl

@beryllw beryllw commented Mar 24, 2026

Purpose

Linked issue: close #2498

This PR improves Lake Tiering's handling of dropped tables and adds cancellation support for tiering operations.

Brief change log

Dropped Table Handling

  • When a table is dropped during active tiering, the TieringSplitReader now gracefully completes all
    in-progress splits with empty results instead of failing
  • Added handleTableDropped() to mark dropped tables and trigger force completion
  • Dropped tables are properly cleaned up from LakeTableTieringManager to prevent resource leaks
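The dropped-table flow above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `DroppedTableSketch`, `SplitTracker`, `SplitResult`, and `forceCompleteDroppedTables` are hypothetical names, and only `handleTableDropped` is borrowed from the change log. It assumes that an "empty result" is modelled as a null payload.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: these are NOT the real Fluss classes.
class DroppedTableSketch {

    /** Result emitted for a split; a null payload stands for "nothing was written". */
    static final class SplitResult {
        final long tableId;
        final String splitId;
        final String payload; // null when the split was force-completed

        SplitResult(long tableId, String splitId, String payload) {
            this.tableId = tableId;
            this.splitId = splitId;
            this.payload = payload;
        }
    }

    /** Tracks in-progress splits per table and force-completes them after a drop. */
    static final class SplitTracker {
        private final Map<Long, Set<String>> inProgress = new HashMap<>();
        private final Set<Long> droppedTables = new HashSet<>();

        void startSplit(long tableId, String splitId) {
            inProgress.computeIfAbsent(tableId, id -> new LinkedHashSet<>()).add(splitId);
        }

        /** Only marks the table; the splits are drained on the next fetch. */
        void handleTableDropped(long tableId) {
            droppedTables.add(tableId);
        }

        /** Completes every in-progress split of a dropped table with an empty result. */
        List<SplitResult> forceCompleteDroppedTables() {
            List<SplitResult> results = new ArrayList<>();
            for (long tableId : droppedTables) {
                // Removing the entry also releases the per-table state.
                Set<String> splits = inProgress.remove(tableId);
                if (splits == null) {
                    continue;
                }
                for (String splitId : splits) {
                    results.add(new SplitResult(tableId, splitId, null));
                }
            }
            droppedTables.clear();
            return results;
        }

        boolean isTracked(long tableId) {
            return inProgress.containsKey(tableId);
        }
    }
}
```

In this sketch the drop is only recorded when it is detected, and the splits are drained at the start of the next fetch, so the reader never fails mid-poll.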

Cancellation Support

  • Added cancelled flag to TableBucketWriteResult to distinguish between normal and cancelled tiering rounds
  • Committers skip commit processing for cancelled results and report back to the coordinator
  • Lake writers are closed without completing when tiering is cancelled, discarding uncommitted data
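A minimal sketch of the cancellation path, under stated assumptions: `Envelope` is a hypothetical stand-in for `TableBucketWriteResult` (reduced to a table id, a nullable write result, and the cancelled flag), and `LakeWriterLike` and `Committer` are illustrative types rather than the interfaces touched by this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: simplified stand-ins, not the real Fluss types.
class CancellationSketch {

    /** Stand-in for a per-bucket write result with a nullable payload. */
    static final class Envelope<T> {
        final long tableId;
        final T writeResult; // null when nothing was written or the round was cancelled
        final boolean cancelled;

        Envelope(long tableId, T writeResult, boolean cancelled) {
            this.tableId = tableId;
            this.writeResult = writeResult;
            this.cancelled = cancelled;
        }
    }

    /** Minimal writer contract assumed for this sketch. */
    interface LakeWriterLike<T> {
        T complete();

        void close();
    }

    /** On cancellation the writer is closed without complete(), discarding uncommitted data. */
    static <T> Envelope<T> finish(long tableId, LakeWriterLike<T> writer, boolean cancelled) {
        try {
            T result = cancelled ? null : writer.complete();
            return new Envelope<>(tableId, result, cancelled);
        } finally {
            writer.close();
        }
    }

    /** Skips commit work for cancelled results and records them for reporting. */
    static final class Committer<T> {
        final List<T> committed = new ArrayList<>();
        final List<Long> reportedCancelled = new ArrayList<>();

        void handle(Envelope<T> result) {
            if (result.cancelled) {
                // Stand-in for "report back to the coordinator".
                reportedCancelled.add(result.tableId);
                return;
            }
            if (result.writeResult != null) {
                committed.add(result.writeResult);
            }
        }
    }
}
```

Keeping the flag separate from the nullable payload lets the committer tell a cancelled round apart from a normal round that simply wrote nothing.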

Tests

API and Format

Documentation

@beryllw beryllw force-pushed the tiering-droptbl branch 3 times, most recently from c74237d to ef27034 Compare March 25, 2026 05:04

@leekeiabstraction leekeiabstraction left a comment

TY for the PR, left a comment.

*/
public class TableBucketWriteResult<WriteResult> implements Serializable {

    private static final long serialVersionUID = 1L;

@leekeiabstraction leekeiabstraction Mar 25, 2026

As a cancelled field is added, shouldn't the serialVersionUID be bumped as well? Similarly, the serialiser needs to handle each version of TableBucketWriteResult differently.
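For illustration, version-aware deserialization could look roughly like the sketch below. Everything here is assumed: the `Result` type, the version constants, and the byte layout are hypothetical and far simpler than the real TableBucketWriteResult serializer. The point is only that bytes written before the flag existed are read with a default of false.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch: the layout and version numbers are assumptions.
class VersionedSerializerSketch {
    static final int VERSION_1 = 1; // layout: tableId
    static final int VERSION_2 = 2; // layout: tableId, cancelled
    static final int CURRENT_VERSION = VERSION_2;

    static final class Result {
        final long tableId;
        final boolean cancelled;

        Result(long tableId, boolean cancelled) {
            this.tableId = tableId;
            this.cancelled = cancelled;
        }
    }

    /** Always writes the current (version 2) layout. */
    static byte[] serialize(Result result) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(result.tableId);
            out.writeBoolean(result.cancelled);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Dispatches on the version the bytes were written with. */
    static Result deserialize(int version, byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            long tableId = in.readLong();
            switch (version) {
                case VERSION_1:
                    // Old bytes carry no flag, so the round counts as not cancelled.
                    return new Result(tableId, false);
                case VERSION_2:
                    return new Result(tableId, in.readBoolean());
                default:
                    throw new IOException("Unknown serializer version: " + version);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```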

@luoyuxia luoyuxia left a comment

Haven't finished the review yet, but left some comments.

return forceCompleteDroppedTable();
}

checkSplitOrStartNext();

If we move to a split that belongs to the dropped table, will getOrMoveToTable throw an exception and fail the job? Maybe we can guard against this case.
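One possible shape for such a guard, as a sketch only: `tryMoveToTable`, `TableMissingException`, and the `LongConsumer` callback are hypothetical stand-ins, and the real getOrMoveToTable signature and exception type may differ.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.LongConsumer;

// Hypothetical sketch: not the real TieringSplitReader API.
class MoveToTableGuardSketch {

    /** Stand-in for whatever exception signals that the table no longer exists. */
    static final class TableMissingException extends RuntimeException {
        TableMissingException(String message) {
            super(message);
        }
    }

    private final Set<Long> droppedTables = new HashSet<>();

    /** Returns true if the reader moved to the table, false if the table was dropped. */
    boolean tryMoveToTable(long tableId, LongConsumer getOrMoveToTable) {
        if (droppedTables.contains(tableId)) {
            return false; // already known to be dropped, skip the lookup
        }
        try {
            getOrMoveToTable.accept(tableId);
            return true;
        } catch (TableMissingException e) {
            // Record the drop instead of failing the job; its splits can then be
            // force-completed on the next fetch.
            droppedTables.add(tableId);
            return false;
        }
    }

    boolean isDropped(long tableId) {
        return droppedTables.contains(tableId);
    }
}
```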

ScanRecords scanRecords;
try {
    scanRecords = currentLogScanner.poll(pollTimeout);
} catch (TableNotExistException e) {

IIUC, we don't need to catch this exception; we can just wait for the split to be force-completed in the next fetch.
Also, as discussed before, poll won't throw TableNotExistException.

Development

Successfully merging this pull request may close these issues.

TieringService will stuck when drop the table that is tiering

3 participants