branch-4.0: [fix](be fe) pick #62263 #61647 #63816

Open
Yukang-Lian wants to merge 2 commits into apache:branch-4.0 from Yukang-Lian:codex/pick-62263-61647-4.0

@Yukang-Lian (Collaborator) commented May 28, 2026

pick #62263
pick #61647

Yukang-Lian and others added 2 commits May 28, 2026 15:30
…e batch size estimation (apache#62263)

- When vertical compaction runs for the first time on a tablet (no
historical sampling data), `estimate_batch_size()` previously returned a
hardcoded value of 992, which could cause OOM for wide tables or be too
conservative for narrow tables
- This change uses `ColumnMetaPB.raw_data_bytes` from segment footer to
compute a per-row size estimate for the first compaction.
`raw_data_bytes` records the original data size before encoding, which
closely approximates runtime `Block::bytes()`
- Historical sampling now uses `Block::allocated_bytes()` instead of
`bytes()`, because allocated memory (`capacity()`) reflects the real
footprint more accurately than logical size (`size()`)
- Subsequent compactions with historical sampling data are completely
unchanged
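
The `size()` vs `capacity()` distinction behind the switch to `allocated_bytes()` can be illustrated with a toy column backed by `std::vector`. This is a minimal sketch rather than Doris code: `ToyColumn` and `make_reserved_column` are hypothetical names, and the real `Block` aggregates many column types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a column; not the Doris Block/IColumn API.
struct ToyColumn {
    std::vector<int32_t> data;

    // Logical size: only the elements currently stored.
    std::size_t bytes() const { return data.size() * sizeof(int32_t); }

    // Memory footprint: everything the vector has reserved.
    std::size_t allocated_bytes() const { return data.capacity() * sizeof(int32_t); }
};

// Builds a column that reserved room for `reserved` values but holds only `used`.
inline ToyColumn make_reserved_column(std::size_t reserved, std::size_t used) {
    assert(used <= reserved);
    ToyColumn col;
    col.data.reserve(reserved);
    col.data.resize(used);
    return col;
}
```

A column that reserved 1024 slots but stores 10 values reports 40 logical bytes while holding at least 4096 bytes of memory, which is the gap a memory-based batch size estimate needs to account for.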

| Column type | Estimation strategy |
|------------|-------------------|
| Scalar (INT/VARCHAR etc.) | `raw_data_bytes / rows_with_data` + structural compensation (+1 null map, +8 offset) |
| Complex (ARRAY/MAP/STRUCT) | `raw_data_bytes / rows_with_data`, no compensation (already includes recursive sub-writer data) |
| VARIANT (root/subcolumn) | Fallback to 992 (`raw_data_bytes=0 // TODO` in writer) |
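
The table can be read as a per-column formula summed into a per-row size. The sketch below is a hedged illustration, not the code from the PR. `ColumnFooter`, `ColumnKind`, `estimate_row_bytes`, and `estimate_batch_size_sketch` are hypothetical names. Applying the +1 and +8 compensation only to nullable and variable-length columns, dividing a memory budget by the row size, clamping the result, and falling back for the whole batch when any column is VARIANT are all assumptions, since the description does not spell out those details.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical mirror of the footer fields this sketch needs.
enum class ColumnKind { Scalar, Complex, Variant };

struct ColumnFooter {
    ColumnKind kind;
    uint64_t raw_data_bytes;  // pre-encoding size recorded in the segment footer
    uint64_t rows_with_data;  // rows that contributed to raw_data_bytes
    bool nullable;            // assumption: adds 1 null-map byte per row
    bool variable_length;     // assumption: adds 8 offset bytes per row
};

constexpr int64_t kFallbackBatchSize = 992;  // value named in the PR description

// Returns estimated bytes per row, or -1 when the footer cannot be used.
inline int64_t estimate_row_bytes(const std::vector<ColumnFooter>& cols) {
    int64_t total = 0;
    for (const auto& c : cols) {
        // VARIANT writers record raw_data_bytes = 0, so no estimate is possible.
        if (c.kind == ColumnKind::Variant || c.rows_with_data == 0) return -1;
        int64_t per_row = static_cast<int64_t>(c.raw_data_bytes / c.rows_with_data);
        if (c.kind == ColumnKind::Scalar) {
            if (c.nullable) per_row += 1;         // null map entry
            if (c.variable_length) per_row += 8;  // offset entry
        }
        // Complex columns get no compensation: sub-writer data is already included.
        total += per_row;
    }
    return total;
}

// Assumption: batch size = memory budget / row size, clamped to [min_batch, max_batch].
inline int64_t estimate_batch_size_sketch(const std::vector<ColumnFooter>& cols,
                                          int64_t memory_budget_bytes,
                                          int64_t min_batch, int64_t max_batch) {
    const int64_t row_bytes = estimate_row_bytes(cols);
    if (row_bytes <= 0) return kFallbackBatchSize;
    const int64_t rows = memory_budget_bytes / row_bytes;
    return std::max(min_batch, std::min(max_batch, rows));
}
```

Under these assumptions a wide table produces a large per-row size and therefore a small batch, while a narrow table is clamped at the upper limit, which matches the behavior the checklist in this description expects.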

- Footer collection only runs on first compaction (no historical
sampling data)
- Skipped entirely when `compaction_batch_size` is manually set
- OOM backoff and sparse optimization paths are untouched
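
The three conditions above amount to a small precedence rule for where the batch size comes from. The following sketch states that rule under assumptions: `BatchSizeSource` and `choose_batch_size_source` are hypothetical names, and treating a positive `compaction_batch_size` as "manually set" is an assumption about the config's convention.

```cpp
#include <cstdint>

// Hypothetical labels for where the batch size comes from.
enum class BatchSizeSource { ManualConfig, HistoricalSampling, SegmentFooter };

// Assumption: a positive configured value means the operator set it manually.
inline BatchSizeSource choose_batch_size_source(int64_t configured_batch_size,
                                                bool has_historical_sample) {
    if (configured_batch_size > 0) {
        return BatchSizeSource::ManualConfig;  // footer collection skipped entirely
    }
    if (has_historical_sample) {
        return BatchSizeSource::HistoricalSampling;  // existing path, unchanged
    }
    return BatchSizeSource::SegmentFooter;  // first compaction only
}
```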

- [ ] Wide table (200+ columns) first compaction does not OOM
- [ ] Narrow table first compaction batch_size is close to upper limit
- [ ] Multi-round compaction: first round uses footer, subsequent rounds
use historical sampling
- [ ] Variant columns fall back to 992
- [ ] Sparse optimization is not affected
- [ ] `TestFirstCompactionUsesFooterEstimation` unit test passes

(cherry picked from commit dd15d4f)

This ports two enterprise delete fixes to upstream master:

- Avoid poisoning delete latch status for non-INVALID_ARGUMENT realtime push failures
- Make PushTask.failedWithMsg count down delete latch without forcing non-OK latch status
- Remove the extra half-timeout wait after delete already reaches QUORUM_FINISHED
- Add a FE unit test for delete push failure handling in MasterImpl
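
The latch handling in the first two bullets can be sketched as follows. The FE code is Java, so this C++ version is only a language-neutral illustration. `ToyDeleteLatch`, `PushFailure`, and `on_push_failed` are hypothetical names, the class is single-threaded for brevity, and the assumption that an INVALID_ARGUMENT failure still marks the latch as failed is inferred from the wording of the first bullet rather than confirmed by it.

```cpp
#include <cassert>

// Hypothetical failure categories for a realtime push task.
enum class PushFailure { InvalidArgument, Other };

// Toy countdown latch that also carries an overall status (not thread-safe).
class ToyDeleteLatch {
public:
    explicit ToyDeleteLatch(int count) : remaining_(count) { assert(count >= 0); }

    // Counts down without touching the status.
    void count_down() {
        if (remaining_ > 0) --remaining_;
    }

    // Counts down and marks the whole delete as failed.
    void count_down_with_error() {
        ok_ = false;
        count_down();
    }

    bool done() const { return remaining_ == 0; }
    bool ok() const { return ok_; }

private:
    int remaining_;
    bool ok_ = true;
};

// Assumption: only INVALID_ARGUMENT failures set a non-OK latch status.
inline void on_push_failed(ToyDeleteLatch& latch, PushFailure failure) {
    if (failure == PushFailure::InvalidArgument) {
        latch.count_down_with_error();
    } else {
        latch.count_down();  // still releases waiters, status stays OK
    }
}
```

In this sketch a failed replica always releases its slot on the latch, so waiters are not left blocked until the timeout, while a non-INVALID_ARGUMENT failure leaves the final outcome to whatever check decides whether the delete succeeded.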

(cherry picked from commit 78da3c9)
@hello-stephen (Contributor)

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Yukang-Lian changed the title from "[fix](be fe) Pick #62263 and #61647 into branch-4.0" to "branch-4.0: [fix](be fe) pick #62263 #61647" on May 28, 2026
@Yukang-Lian marked this pull request as ready for review May 28, 2026 08:07
@Yukang-Lian (Collaborator, Author)

run buildall

@hello-stephen (Contributor)

BE Regression && UT Coverage Report

Increment line coverage 89.42% (93/104) 🎉

Increment coverage report
Complete coverage report

| Category | Coverage |
|----------|----------|
| Function Coverage | 71.46% (25513/35700) |
| Line Coverage | 54.30% (270260/497724) |
| Region Coverage | 51.81% (223633/431625) |
| Branch Coverage | 53.27% (96305/180791) |
