
[fix](insert) fix INSERT job statistics lost in show load after FE restart #62331

Merged
liaoxin01 merged 1 commit into apache:master from sollhui:fix_insert_job_statistic
Apr 16, 2026

Conversation

@sollhui
Contributor

@sollhui sollhui commented Apr 10, 2026

Problem

After an FE restart, SHOW LOAD for finished INSERT jobs shows all-zero
JobDetails (ScannedRows, LoadBytes, etc.), even though the values
were correct before the restart.

Fix

InsertLoadJob

  • Add @SerializedName("jdj") private String jobDetailsJson to capture a
    JSON snapshot of loadStatistic at the moment the job finishes.
  • In setJobProperties() (called when the job transitions to
    FINISHED/CANCELLED), snapshot loadStatistic.toJson() into
    jobDetailsJson.
  • Override getJobDetailsJson() to return the persisted snapshot when
    available, falling back to the live loadStatistic during execution.

LoadJob

  • Extract protected String getJobDetailsJson() (defaults to
    loadStatistic.toJson()) so subclasses can override the stats source
    used by getShowInfoUnderLock().
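The InsertLoadJob and LoadJob changes above can be pictured with a minimal, self-contained sketch. It is not the Doris code: the `Sketch*` classes, `finish()`, `persistedSnapshot()`, `replay()` and `showJobDetails()` are hypothetical stand-ins, Gson serialization is left out, and the sketch assumes (as the Problem section implies) that the live counters come back zeroed after replay.

```java
class SnapshotSketch {
    public static void main(String[] args) {
        SketchInsertLoadJob job = new SketchInsertLoadJob();
        job.loadStatistic.scannedRows = 3;
        job.loadStatistic.loadBytes = 42;
        job.finish();
        // Simulate a restart: only the persisted snapshot is carried over.
        SketchInsertLoadJob replayed = SketchInsertLoadJob.replay(job.persistedSnapshot());
        System.out.println(replayed.showJobDetails());
    }
}

// Simplified stand-in for LoadStatistic; only the JSON rendering matters here.
class SketchLoadStatistic {
    long scannedRows;
    long loadBytes;

    synchronized String toJson() {
        return "{\"ScannedRows\":" + scannedRows + ",\"LoadBytes\":" + loadBytes + "}";
    }
}

// Base job: the JobDetails column is read through an overridable hook.
class SketchLoadJob {
    protected SketchLoadStatistic loadStatistic = new SketchLoadStatistic();

    // Default source of JobDetails: the live counters.
    protected String getJobDetailsJson() {
        return loadStatistic.toJson();
    }

    // Plays the role of getShowInfoUnderLock() for the JobDetails column only.
    String showJobDetails() {
        return getJobDetailsJson();
    }
}

class SketchInsertLoadJob extends SketchLoadJob {
    // The PR persists this field under the serialized name "jdj".
    private String jobDetailsJson;

    // Plays the role of setJobProperties(): freeze the counters when the job ends.
    void finish() {
        jobDetailsJson = loadStatistic.toJson();
    }

    String persistedSnapshot() {
        return jobDetailsJson;
    }

    // Hypothetical replay helper: the snapshot survives, the counters start at zero.
    static SketchInsertLoadJob replay(String persistedSnapshot) {
        SketchInsertLoadJob job = new SketchInsertLoadJob();
        job.jobDetailsJson = persistedSnapshot;
        return job;
    }

    @Override
    protected String getJobDetailsJson() {
        // Prefer the frozen snapshot; fall back to live counters while running.
        return jobDetailsJson != null ? jobDetailsJson : loadStatistic.toJson();
    }
}
```

Keeping the fallback in the override means a job that is still running reports its live counters, while a finished or replayed job reports the frozen snapshot.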

OlapInsertExecutor

  • Remove the !Config.enable_nereids_load guard in afterExec() so that
    recordFinishedLoadJob is always called when jobId != -1. This ensures
    the job is persisted to the edit log and its statistics are captured
    regardless of the config value.
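A rough sketch of the before/after control flow described for afterExec(), under stated assumptions: `enableNereidsLoad` stands in for `Config.enable_nereids_load`, the one-argument `recordFinishedLoadJob` is a simplified placeholder rather than the real signature, and the exact shape of the original condition is inferred from the description only.

```java
import java.util.ArrayList;
import java.util.List;

class GuardSketch {
    // Stand-in for Config.enable_nereids_load.
    static boolean enableNereidsLoad = false;
    // Stand-in for the edit log: ids of jobs that were journaled.
    static final List<Long> journaledJobIds = new ArrayList<>();

    // Simplified placeholder; the real method takes more arguments.
    static void recordFinishedLoadJob(long jobId) {
        journaledJobIds.add(jobId);
    }

    // Guarded variant: nothing is journaled when the config flag is on.
    static void afterExecGuarded(long jobId) {
        if (!enableNereidsLoad && jobId != -1) {
            recordFinishedLoadJob(jobId);
        }
    }

    // Unguarded variant: every job with a valid id is journaled.
    static void afterExecUnguarded(long jobId) {
        if (jobId != -1) {
            recordFinishedLoadJob(jobId);
        }
    }

    public static void main(String[] args) {
        enableNereidsLoad = true;
        afterExecGuarded(1L);
        System.out.println("guarded journal size: " + journaledJobIds.size());
        afterExecUnguarded(1L);
        System.out.println("unguarded journal size: " + journaledJobIds.size());
    }
}
```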

Test

Added test_insert_statistic_after_fe_restart docker regression test that:

  1. Inserts rows via INSERT INTO ... SELECT
  2. Reads and records JobDetails from SHOW LOAD
  3. Restarts FE
  4. Asserts that ScannedRows, LoadBytes, FileNumber, and FileSize are
    unchanged after the restart
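The final assertion amounts to a field-by-field comparison of the two JobDetails values. The sketch below is illustrative only and is not the regression suite itself: it assumes JobDetails is a flat JSON object whose four fields are integers, the helper names are hypothetical, and the sample values in `main` are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JobDetailsCompare {
    // The four fields the test description says must survive the restart.
    static final String[] FIELDS = {"ScannedRows", "LoadBytes", "FileNumber", "FileSize"};

    // Extract one integer field from a flat JSON object (assumed format).
    static long field(String json, String name) {
        Matcher m = Pattern.compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*(-?\\d+)").matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("missing field: " + name);
        }
        return Long.parseLong(m.group(1));
    }

    // True when all four fields match between the two snapshots.
    static boolean unchanged(String before, String after) {
        for (String f : FIELDS) {
            if (field(before, f) != field(after, f)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical sample values, not taken from a real SHOW LOAD run.
        String before = "{\"ScannedRows\":3,\"LoadBytes\":27,\"FileNumber\":0,\"FileSize\":0}";
        String zeroed = "{\"ScannedRows\":0,\"LoadBytes\":0,\"FileNumber\":0,\"FileSize\":0}";
        System.out.println(unchanged(before, before));
        System.out.println(unchanged(before, zeroed));
    }
}
```

Comparing individual fields rather than the raw strings keeps the check independent of key order or extra fields in JobDetails.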

@Thearas
Contributor

Thearas commented Apr 10, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@sollhui
Contributor Author

sollhui commented Apr 10, 2026

run buildall

@sollhui sollhui force-pushed the fix_insert_job_statistic branch from c5cfec3 to 5ec7282 Compare April 10, 2026 07:55
@sollhui
Contributor Author

sollhui commented Apr 10, 2026

run buildall

@sollhui
Contributor Author

sollhui commented Apr 10, 2026

/review

Contributor

@github-actions github-actions bot left a comment

Finding:

  1. InsertLoadJob now persists a snapshot of loadStatistic, but the patch does not actually remove the !Config.enable_nereids_load gate from the finished-job journaling path. In the current tree, both fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/OlapInsertExecutor.java:330 and fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/RemoteOlapInsertExecutor.java:307 still skip recordFinishedLoadJob(...) when enable_nereids_load=true. In that configuration the new jobDetailsJson field is never populated or written to the edit log, so SHOW LOAD will still lose INSERT job statistics after FE restart. The new regression test also runs only with the default config, so it does not cover this live path.

Critical checkpoint conclusions:

  • Goal of the task: Partially achieved. The persisted snapshot itself is correct, but the config-enabled Nereids insert path that must create/journal the finished InsertLoadJob remains unfixed, so the stated goal is not met for all supported configurations.
  • Is the modification small/clear/focused: Yes.
  • Concurrency: No new lock-order or thread-safety issue found in the added code; LoadStatistic.toJson() is synchronized and no new shared mutable structure was introduced.
  • Lifecycle/static initialization: No special lifecycle or static-init issue found. InsertLoadJob remains registered in Gson's runtime type adapter, so the new field can replay once the object is journaled.
  • Configuration items: No new config added, but existing enable_nereids_load handling is still incorrect for this fix.
  • Compatibility: The added serialized field is backward-compatible.
  • Parallel code paths: Not fully handled. The Nereids local and remote insert executors still have the old guard.
  • Special conditional checks: The remaining !Config.enable_nereids_load condition is no longer valid for the stated behavior.
  • Test coverage: Insufficient for the bug described in the PR because the regression test does not exercise enable_nereids_load=true.
  • Observability: No additional observability needed for this change.
  • Transaction/persistence: The new field is fine once persisted, but the missing finished-job journal write on the guarded path still causes restart-time metadata loss.
  • Data writes/modifications: No table-data correctness issue introduced by the new serialization itself.
  • FE/BE variable passing: Not applicable.
  • Performance: No material concern.
  • Other issues: None beyond the missing Nereids-path fix above.

Summary opinion: needs follow-up before merge because the main correctness issue is still reproducible when enable_nereids_load=true.

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 14.29% (1/7) 🎉
Increment coverage report
Complete coverage report

Contributor

@liaoxin01 liaoxin01 left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 13, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@liaoxin01
Contributor

/review

Contributor

@github-actions github-actions bot left a comment

Found one blocking issue.

  1. fe/fe-core/src/main/java/org/apache/doris/load/loadv2/InsertLoadJob.java / fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/OlapInsertExecutor.java:330 / .../RemoteOlapInsertExecutor.java:307: the new jobDetailsJson snapshot only helps if the finished InsertLoadJob is actually persisted via LoadManager.recordFinishedLoadJob() -> EditLog.logCreateLoadJob(loadJob). This patch does not remove the existing !Config.enable_nereids_load guard in either executor, even though the PR description says it does. When enable_nereids_load=true, AbstractInsertExecutor only creates an in-memory InsertLoadJob and adds it to LoadManager; it is neither journaled nor registered as a transaction callback. After an FE restart, SHOW LOAD still cannot recover this snapshot in that mode. The new regression test also does not cover that path because enable_nereids_load defaults to false.
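The persistence concern raised here can be illustrated with a toy model of journaling and replay. This is only a sketch under assumptions: `ReplaySketch` and its methods are hypothetical, a map stands in for the edit log, and "restart" means rebuilding the in-memory job table purely from journaled entries.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class ReplaySketch {
    // Stand-in for the edit log: the only state that survives a restart.
    private final Map<Long, String> journal = new LinkedHashMap<>();
    // Stand-in for the in-memory job table held by the load manager.
    private Map<Long, String> jobs = new HashMap<>();

    // Job is registered in memory but never journaled.
    void addInMemoryOnly(long jobId, String jobDetailsJson) {
        jobs.put(jobId, jobDetailsJson);
    }

    // Job is registered in memory and also written to the journal.
    void recordFinished(long jobId, String jobDetailsJson) {
        jobs.put(jobId, jobDetailsJson);
        journal.put(jobId, jobDetailsJson);
    }

    // Restart: drop in-memory state and rebuild it from the journal alone.
    void restart() {
        jobs = new HashMap<>(journal);
    }

    // Returns null when the job is unknown after replay.
    String show(long jobId) {
        return jobs.get(jobId);
    }

    public static void main(String[] args) {
        ReplaySketch fe = new ReplaySketch();
        fe.addInMemoryOnly(1L, "{\"ScannedRows\":3}");
        fe.recordFinished(2L, "{\"ScannedRows\":5}");
        fe.restart();
        System.out.println("job 1 after restart: " + fe.show(1L));
        System.out.println("job 2 after restart: " + fe.show(2L));
    }
}
```

In this model a snapshot field on the job object helps only for jobs that reach the journal, which is the dependency the finding describes.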

Critical checkpoint conclusions:

  • Goal of the task: preserve finished INSERT JobDetails across FE restart. The code only accomplishes this on the recordFinishedLoadJob() path; it is not correct for enable_nereids_load=true, and the added test does not prove that mode.
  • Small, clear, focused change: yes. The patch is small and localized.
  • Concurrency: no new concurrency pattern was introduced; existing locking in LoadJob is unchanged.
  • Lifecycle / static initialization: no special lifecycle or static-init issue found.
  • Configuration items: no new config was added, but the existing enable_nereids_load switch still leaves a parallel path unfixed.
  • Compatibility: no storage-format or symbol compatibility issue found.
  • Functionally parallel code paths: applicable, and not fully handled. Both local and remote Nereids insert executors still keep the old guard.
  • Special conditional checks: the !Config.enable_nereids_load condition is now the key incorrect gate and is not justified for persisted SHOW LOAD state.
  • Test coverage: insufficient. The new regression test is useful for the default path, but it misses enable_nereids_load=true; no replay-focused FE unit test was added either.
  • Observability: adequate for this small change.
  • Transaction / persistence: blocking issue present. The persisted snapshot depends on a journal write path that still does not run in Nereids-load mode.
  • Data writes / modifications: no direct table-data correctness issue identified.
  • FE-BE variable passing: not applicable here.
  • Performance: no material performance concern found in the patch itself.
  • Other issues: the PR description and actual code diverge on the claimed guard removal.

Overall opinion: REQUEST_CHANGES because the advertised fix is incomplete and still loses SHOW LOAD job details in the enable_nereids_load=true execution path.

@liaoxin01 liaoxin01 dismissed github-actions[bot]’s stale review April 16, 2026 03:31

enable_nereids_load is useless.

@liaoxin01 liaoxin01 merged commit b6178dc into apache:master Apr 16, 2026
30 of 31 checks passed
github-actions bot pushed a commit that referenced this pull request Apr 16, 2026
…start (#62331)

github-actions bot pushed a commit that referenced this pull request Apr 16, 2026
…start (#62331)

yiguolei pushed a commit that referenced this pull request Apr 16, 2026
… after FE restart #62331 (#62546)

Cherry-picked from #62331

Co-authored-by: hui lai <laihui@selectdb.com>

Labels

approved, dev/3.1.x, dev/4.0.x, dev/4.1.x, reviewed

4 participants