Skip to content

[fix](compcation) fix mow compaction delete bitmap handling#63167

Closed
mrhhsg wants to merge 1 commit into
apache:branch-4.1from
mrhhsg:fix_compaction_41
Closed

[fix](compcation) fix mow compaction delete bitmap handling#63167
mrhhsg wants to merge 1 commit into
apache:branch-4.1from
mrhhsg:fix_compaction_41

Conversation

@mrhhsg
Copy link
Copy Markdown
Member

@mrhhsg mrhhsg commented May 12, 2026

What problem does this PR solve?

Fix incorrect delete bitmap handling for cluster-key unique MOW tables during cumulative compaction.

Problem

test_compact_with_seq2 can fail under fuzzy mode when cumulative compaction produces a rowset like [2-11] while a newer visible rowset such as [12-12] remains outside the compaction output. In that shape, duplicate unique keys can
remain visible.

There are two related gaps:

  1. For cluster-key MOW tables, the primary key index contains rowid. The previous merge-index delete bitmap calculation used a seek optimization that could skip rows with the same unique key inside the same segment.
  2. Existing compaction rowid conversion only transfers delete bitmap entries that already exist on compaction input rowsets. It does not recompute delete bitmap for newer visible rowsets against the newly generated compaction output
    rowset.

Changes

  • Disable the seek optimization in MergeIndexDeleteBitmapCalculator when rowid is part of the primary key index, so in-segment duplicate unique keys are not skipped.
  • After merging compaction output delete bitmap, recalculate delete bitmap for visible rowsets newer than the compaction output version against the new compaction output rowset before replacing input rowsets.

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@mrhhsg mrhhsg requested a review from yiguolei as a code owner May 12, 2026 08:23
@mrhhsg
Copy link
Copy Markdown
Member Author

mrhhsg commented May 12, 2026

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@mrhhsg mrhhsg force-pushed the fix_compaction_41 branch 7 times, most recently from b48cab6 to 1c13941 Compare May 12, 2026 21:55
@mrhhsg
Copy link
Copy Markdown
Member Author

mrhhsg commented May 13, 2026

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review result: request changes.

Critical checkpoints:

  • Correctness: one regression test change filters out an inserted row from the pre/post compaction assertions, which weakens coverage and can hide the variant compaction behavior that this suite is meant to validate.
  • Concurrency/locking: the storage-side delete bitmap changes were reviewed for the modified paths; no additional blocking concurrency issue was identified.
  • Error handling: no new unchecked error path was identified in the modified storage code.
  • Tests: the MOW delete-bitmap regression coverage is relevant, but the unrelated variant test expectation change should be fixed or justified without dropping the row from the assertion.

Existing review context: no existing inline review threads were present. User focus: no additional user-provided focus points. Repository instructions: no AGENTS.md or repository-local code-review skill file was found within the current directory, so this review followed the explicit PR review prompt and standard code-review criteria.

qt_sql_2 "select k, cast(v['a'] as array<int>) from ${tableName} where size(cast(v['a'] as array<int>)) > 0 order by k"
qt_sql_3 "select k, v['a'], cast(v['b'] as string) from ${tableName} where length(cast(v['b'] as string)) > 4 order by k"
qt_sql_3 "select k, v['a'], cast(v['b'] as string) from ${tableName} where length(cast(v['b'] as string)) > 4 and k != 1029 order by k"
qt_sql_5 "select cast(v['b'] as string), cast(v['b']['c'] as string) from ${tableName} where cast(v['b'] as string) != 'null' and cast(v['b'] as string) != '{}' order by k desc, 1, 2 limit 10;"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change drops the inserted k = 1029 row from both the before- and after-compaction checks instead of preserving the expected behavior. The same row is still inserted above and still appears in the full qt_sql_1/qt_sql_11 outputs, so filtering it here weakens this regression test and can hide a compaction/variant extraction regression for that key. Since this PR does not change variant query semantics, please keep the original predicate (or add a separate, justified test case) rather than excluding this row from the assertion.

@mrhhsg mrhhsg force-pushed the fix_compaction_41 branch 3 times, most recently from 4c36c05 to 7d5d887 Compare May 13, 2026 05:21
@mrhhsg
Copy link
Copy Markdown
Member Author

mrhhsg commented May 13, 2026

run buildall

@mrhhsg
Copy link
Copy Markdown
Member Author

mrhhsg commented May 13, 2026

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting changes. I found a blocking correctness issue in the core MOW delete-bitmap path. Critical checkpoints: correctness is not satisfied because cluster-key same-segment duplicates can keep the wrong row; regression coverage is incomplete because the new regression does not cover same-segment duplicate ordering; diff hygiene is not satisfied due to a trailing-whitespace failure from git diff --check. Existing review context: the variant sparse-limit assertion weakening is already covered by the existing inline thread, so I did not duplicate it. User focus: no additional user-provided focus points were supplied.

}
_last_key = cur_key.to_string();
if (_rowid_length > 0) {
// For cluster-key MOW tables, the primary key index contains rowid and
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the seek skip, but it now preserves the first rowid encountered in a segment and deletes later rows with the same unique key. For cluster-key MOW the rowid suffix is encoded ascending (SegmentWriter::_encode_rowid), so duplicates within one segment are visited from smaller to larger rowid. When _last_key matches on the second duplicate, calculate_one() returns that later row as deleted, leaving the older row visible. The intended MOW behavior is to keep the newest row (larger rowid within the same segment, absent a higher sequence value), so this can still return stale data for exactly the same same-segment duplicate case this PR is trying to fix.

-- !audit_log_schema --
query_id varchar(48) Yes true \N
query_id varchar(128) Yes true \N
time datetime(3) Yes true \N
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line introduces trailing whitespace, and git diff --check reports it as an error. Please regenerate or trim the expected output so whitespace checks pass.

@mrhhsg mrhhsg force-pushed the fix_compaction_41 branch from 7d5d887 to ee0271f Compare May 13, 2026 09:46
@mrhhsg mrhhsg closed this May 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants