Skip to content

[fix](variant) Skip full footer scan when constructing VariantStatsCaculator#62819

Merged
eldenmoon merged 1 commit intoapache:masterfrom
csun5285:fix/variant-stats-calculator-init
Apr 29, 2026
Merged

[fix](variant) Skip full footer scan when constructing VariantStatsCaculator#62819
eldenmoon merged 1 commit intoapache:masterfrom
csun5285:fix/variant-stats-calculator-init

Conversation

@csun5285
Copy link
Copy Markdown
Contributor

@csun5285 csun5285 commented Apr 24, 2026

Problem

In vertical compaction, SegmentWriter::init() is called multiple times against the same writer (key columns first, then each value-column group). The footer accumulates across calls, so every additional init() re-scans an ever-larger footer — including entries from prior init()s that the current column_ids cannot address.

Before After
Single construction O(footer accumulated size) O(this init's column_ids size)
N vertical-compaction init() calls O(N²) O(N)

Fix

Snapshot _footer.columns_size() before _create_writers appends new entries and pass it to VariantStatsCaculator as footer_column_offset. The constructor only walks [offset, end) — its own slice.

When performing compaction with 10,000 columns in the variant, this method accounts for around 8% of CPU usage. After changing it to O(N), this section has disappeared from the flame graph.

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

…culator

SegmentWriter::init() can run multiple times against the same writer
(vertical compaction's key columns + per value-column-group calls), and
the footer accumulates entries across calls. The calculator was scanning
the whole footer on every construction, so each additional init() walked
an ever-larger footer that included entries it cannot address via the
init's own `column_ids`.

Snapshot the footer size before _create_writers appends new entries and
pass it to VariantStatsCaculator as `footer_column_offset`, so the
constructor only scans its own slice. Per-init() construction cost goes
from O(footer accumulated size) to O(this init's column_ids size); the
total cost across N vertical-compaction init() calls drops from O(N^2)
to O(N).

All existing behavior is preserved (including the defensive
Status::NotFound on missing footer entries). One new unit test covers
the offset case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@csun5285
Copy link
Copy Markdown
Contributor Author

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (4/4) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.43% (26782/37494)
Line Coverage 53.79% (279652/519859)
Region Coverage 47.07% (214603/455963)
Branch Coverage 50.43% (97260/192877)

@csun5285
Copy link
Copy Markdown
Contributor Author

/review

@github-actions
Copy link
Copy Markdown
Contributor

OpenCode automated review failed and did not complete.

Error: Review step was failure (possibly timeout or cancelled)
Workflow run: https://github.com/apache/doris/actions/runs/24952118629

Please inspect the workflow logs and rerun the review after the underlying issue is resolved.

@eldenmoon
Copy link
Copy Markdown
Member

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not find blocking issues in this PR.

Critical checkpoints:

  • Goal / correctness: The change is narrowly targeted at the repeated SegmentWriter::init() reuse path in vertical compaction / segcompaction. Scoping VariantStatsCaculator to the footer slice added by the current init() matches how SegmentWriter accumulates footer entries across groups, so the fix addresses the reported stale-scan behavior without changing the downstream stats update logic.
  • Scope: The modification is small and focused: one caller-side offset snapshot, one constructor parameter, and one unit test.
  • Concurrency: No new concurrency risk is introduced here. The touched logic runs on the per-writer compaction path and the SegmentWriter reuse is serialized by the surrounding compaction flow.
  • Lifecycle: The review-sensitive lifecycle detail is the persistent _footer across repeated clear() / init() cycles; this patch handles that explicitly by snapshotting the footer size before the current group appends its own metas.
  • Config / compatibility: No config, protocol, or storage-format compatibility changes.
  • Parallel paths: The fix covers both known repeated-init paths that reuse SegmentWriter (VerticalBetaRowsetWriter and segcompaction), because both go through SegmentWriter::init(col_ids, is_key).
  • Special conditions: The new footer_column_offset condition is justified and documented in code.
  • Test coverage: The added unit test captures the intended footer-offset behavior and checks that stats land on the current slice instead of stale entries from a prior init. I could not run the BE UT in this runner because thirdparty/installed/bin/protoc is missing, so my validation here is code inspection only.
  • Observability: No additional observability appears necessary for this localized fix.
  • Data / persistence: No transaction, persistence, or metadata replay semantics are changed beyond which in-memory footer entries are indexed during compaction.
  • Performance: The optimization claim is consistent with the implementation; the constructor scan is reduced from accumulated footer size to the current init slice, which removes the repeated full-footer walk.
  • Other issues: None found in the reviewed diff.

User focus points: No additional user-provided focus points were supplied for this review.

Copy link
Copy Markdown
Member

@eldenmoon eldenmoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions Bot added the approved Indicates a PR has been approved by one committer. label Apr 27, 2026
@github-actions
Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Copy Markdown
Contributor

PR approved by anyone and no changes requested.

@eldenmoon eldenmoon merged commit ffefcae into apache:master Apr 29, 2026
35 of 36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants