[fix](auto partition) keep load row metrics monotonic for auto partition #64109
run buildall
/review
Review result: no blocking issues found in this PR.
Critical checkpoint conclusions:
- Goal and proof: The PR aims to keep global BE load row/byte metrics monotonic for auto-partition batching while preserving RuntimeState progress compensation. The code accomplishes this by removing negative global metric deltas when rows are moved into the batching block and by skipping global metric increments while replaying the internal batched block. No targeted test is added, so the remaining risk is regression coverage for this metric-only behavior.
- Scope and clarity: The change is small and focused on the three relevant BE writer/distribution locations.
- Concurrency: The modified code does not introduce new shared state or new concurrency. The `_deal_batched` flag is used in the existing writer/replay flow, and the new local snapshot is taken after `_send_new_partition_batch()`, which preserves the intended distinction between recursive replay writes and normal source writes.
- Lifecycle/static initialization: No new lifecycle ownership, static initialization, or release-order behavior is introduced.
- Configuration: No configuration item is added.
- Compatibility: No wire protocol, storage format, or function symbol compatibility impact was found.
- Parallel paths: Both `VTabletWriter` and `VTabletWriterV2` are updated consistently. I also checked the row-distribution path where the negative delta was previously emitted.
- Conditional checks: The new condition only guards global DorisMetrics increments and follows the existing `_deal_batched` replay state. RuntimeState row/byte accounting remains unchanged.
- Tests: No tests are added or modified. Given this is metric accounting, a targeted auto-partition load metric regression would reduce risk, but I did not find a functional blocker in the implementation.
- Observability: The change directly corrects observability metrics and does not require additional logs or metrics.
- Transaction/persistence/data writes: The PR does not change transaction visibility, rowset writes, persistence, or data routing semantics.
- FE/BE variable passing: No new FE/BE variables or thrift fields are added.
- Performance: The added boolean snapshot and conditional metric increments are negligible; no performance issue found.
- Other issues: No duplicate existing review threads were present, and no additional user-provided focus points were specified.
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
TPC-H: Total hot run time: 29427 ms
TPC-DS: Total hot run time: 170258 ms
### What problem does this PR solve?

When loading into an auto partition table, BE may temporarily batch rows whose target partitions do not exist yet. The previous logic first incremented `doris_be_load_rows` when receiving the input block, then decremented it when those rows were moved into the auto-partition batching block, and finally incremented it again when the batched rows were replayed after partition creation.

Because `doris_be_load_rows` is exposed as a counter, the negative adjustment made the metric non-monotonic and could show row count drops during auto partition loads.

This PR keeps the existing `RuntimeState` row compensation logic unchanged, but stops applying negative deltas to the global BE load row/byte metrics. It also skips global metric increments when replaying the internal batched block, so each source row is counted once.
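
The accounting change described above can be illustrated with a small standalone model. This is a minimal sketch, not the Doris implementation: `Counter`, `is_monotonic`, `simulate_old`, and `simulate_new` are hypothetical names introduced here, and the local `deal_batched` flag only mimics the role the `_deal_batched` replay state plays in the real writers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a global counter such as doris_be_load_rows.
// It records the value after every update so the history can be inspected.
struct Counter {
    int64_t value = 0;
    std::vector<int64_t> samples;

    void add(int64_t delta) {
        value += delta;
        samples.push_back(value);
    }
};

// Returns true if the recorded values never decrease.
bool is_monotonic(const Counter& c) {
    for (std::size_t i = 1; i < c.samples.size(); ++i) {
        if (c.samples[i] < c.samples[i - 1]) {
            return false;
        }
    }
    return true;
}

// Old accounting as described in the PR text: count on receipt, subtract
// when rows move into the batching block, count again on replay.
Counter simulate_old(int64_t input_rows, int64_t batched_rows) {
    Counter c;
    c.add(input_rows);     // input block received
    c.add(-batched_rows);  // rows moved into the batching block
    c.add(batched_rows);   // batched rows replayed after partition creation
    return c;
}

// New accounting as described in the PR text: count on receipt, emit no
// negative delta, and skip the global increment while replaying.
Counter simulate_new(int64_t input_rows, int64_t batched_rows) {
    Counter c;
    bool deal_batched = false;  // assumption: models the replay state flag

    auto write = [&](int64_t rows) {
        if (!deal_batched) {
            c.add(rows);  // only normal source writes touch the global counter
        }
    };

    write(input_rows);    // normal source write
    deal_batched = true;
    write(batched_rows);  // replay of the internal batched block: not counted
    deal_batched = false;
    return c;
}
```

With 100 input rows of which 30 are batched, `simulate_old(100, 30)` records the values 100, 70, 100, so a scrape between updates can observe a drop. `simulate_new(100, 30)` records a single update to 100. Both end at the same total, and only the second history is monotonic.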