Skip to content

pkg/dxf/importinto, pkg/executor: show conflict progress in SHOW IMPORT JOB#67551

Merged
ti-chi-bot[bot] merged 11 commits into
pingcap:masterfrom
D3Hunter:codex/show-import-job-conflict-progress
Apr 13, 2026
Merged

pkg/dxf/importinto, pkg/executor: show conflict progress in SHOW IMPORT JOB#67551
ti-chi-bot[bot] merged 11 commits into
pingcap:masterfrom
D3Hunter:codex/show-import-job-conflict-progress

Conversation

@D3Hunter
Copy link
Copy Markdown
Contributor

@D3Hunter D3Hunter commented Apr 3, 2026

What problem does this PR solve?

Issue Number: ref #66019

Problem Summary:

SHOW IMPORT JOB did not expose meaningful progress for global-sort conflict handling steps (collect-conflicts and conflict-resolution). The progress columns were effectively byte-oriented and did not reflect conflict processing work units.

What changed and how does it work?

This PR wires conflict-step progress into the existing runtime reporting path without changing SHOW IMPORT JOB schema:

  • Extend import task summary with conflict-step totals:
    • CollectConflictsSummary
    • ResolveConflictsSummary
  • Compute and persist conflict-step totals (conflict KV count) in planner/scheduler transitions.
  • Add realtime processed-count updates in conflict executors by counting consumed conflict KV pairs.
  • Extend runtime formatting in RuntimeInfo:
    • conflict steps render Processed / Total as <n> conflicts
    • conflict steps render speed as <n> conflicts/s
    • non-conflict steps keep existing byte-based formatting.
  • Update SHOW IMPORT JOB rendering to consume runtime speed formatter (SpeedStr).
  • Extend tests to verify both runtime formatter behavior and SQL-visible SHOW IMPORT JOB output for conflict steps.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)

this is how it displays in SHOW

mysql> show import job 2;
+--------+-----------+-----------------------------+--------------+----------+---------------------+---------+------------------+---------------+----------------+----------------------------+----------------------------+----------+------------+---------------------+---------------------+-------------------------+---------------------+-----------------------+----------------+--------------+
| Job_ID | Group_Key | Data_Source                 | Target_Table | Table_ID | Phase               | Status  | Source_File_Size | Imported_Rows | Result_Message | Create_Time                | Start_Time                 | End_Time | Created_By | Last_Update_Time    | Cur_Step            | Cur_Step_Processed_Size | Cur_Step_Total_Size | Cur_Step_Progress_Pct | Cur_Step_Speed | Cur_Step_ETA |
+--------+-----------+-----------------------------+--------------+----------+---------------------+---------+------------------+---------------+----------------+----------------------------+----------------------------+----------+------------+---------------------+---------------------+-------------------------+---------------------+-----------------------+----------------+--------------+
|      2 | NULL      | s3://mybucket/conflicts.csv | `test`.`t`   |        7 | resolving-conflicts | running | 491B             |             0 |                | 2026-04-08 16:11:03.101617 | 2026-04-08 16:11:03.638522 | NULL     | root@%     | 2026-04-08 16:11:43 | conflict-resolution | 75 conflicts            | 99 conflicts        | 75                    | 4 conflicts/s  | 00:00:06     |
+--------+-----------+-----------------------------+--------------+----------+---------------------+---------+------------------+---------------+----------------+----------------------------+----------------------------+----------+------------+---------------------+---------------------+-------------------------+---------------------+-----------------------+----------------+--------------+
  • No need to test
    • I checked and no code files have been changed.

Unit test commands run:

  • ./tools/check/failpoint-go-test.sh pkg/dxf/importinto -run 'TestShowImportProgress|TestGetTaskImportedRows' -count=1
  • ./tools/check/failpoint-go-test.sh pkg/executor -run TestFillOneImportJobInfo -count=1
  • make lint

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

Summary by CodeRabbit

  • New Features

    • Show "collect-conflicts" and "conflict-resolution" steps in import progress with dedicated counts.
  • Bug Fixes

    • Progress, totals and speed display now use processed-count metrics (rows/conflicts) instead of byte-based values for accurate reporting.
  • Refactor

    • Progress reporting unified to record generic "processed" units; conflict steps formatted as " conflicts" / " conflicts/s".
  • Tests

    • Updated tests and fixtures to expect processed-count semantics across affected flows.

@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented Apr 3, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. labels Apr 3, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 3, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6ffb4834-885e-4c1f-a08f-54f5a50f4e77

📥 Commits

Reviewing files that changed from the base of the PR and between 6d2cb13 and 26658cc.

📒 Files selected for processing (6)
  • pkg/dxf/framework/taskexecutor/execute/interface.go
  • pkg/dxf/importinto/collect_conflicts.go
  • pkg/dxf/importinto/conflict_resolution.go
  • pkg/dxf/importinto/conflictedkv/handler.go
  • pkg/dxf/importinto/planner.go
  • pkg/executor/importer/import.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/dxf/importinto/conflictedkv/handler.go
  • pkg/dxf/importinto/conflict_resolution.go
  • pkg/dxf/framework/taskexecutor/execute/interface.go

📝 Walkthrough

Walkthrough

Replaced per-byte Bytes counters with generic Processed counters across progress types, collectors, handlers, executors, tests, and UI. Wired an execute.Collector through conflicted-kv handlers/deleters so executors can accumulate processed/accepted counts; added conflict-step summaries and display formatting.

Changes

Cohort / File(s) Summary
Core progress types & tests
pkg/dxf/framework/taskexecutor/execute/interface.go, pkg/dxf/framework/taskexecutor/execute/interface_test.go
Renamed BytesProcessed in Progress/SubtaskSummary; updated Update/Reset/GetSpeed semantics and test fixtures (JSON tag bytes preserved).
Backfill & index metering
pkg/ddl/backfilling_clean_s3.go, pkg/ddl/backfilling_read_index.go, pkg/dxf/framework/storage/table_test.go
Aggregations and meter reporting switched to summary.Processed instead of summary.Bytes; tests adjusted accordingly.
Conflict collection & resolution executors
pkg/dxf/importinto/collect_conflicts.go, pkg/dxf/importinto/conflict_resolution.go
Executors now implement execute.Collector (compile-time assertion) and add Accepted()/Processed() to increment e.summary.Processed; executors passed into conflicted-kv components.
Conflicted-KV handlers / collectors / deleters
pkg/dxf/importinto/conflictedkv/collector.go, pkg/dxf/importinto/conflictedkv/deleter.go, pkg/dxf/importinto/conflictedkv/handler.go, pkg/dxf/importinto/conflictedkv/collector_test.go, pkg/dxf/importinto/conflictedkv/deleter_test.go, pkg/dxf/importinto/conflictedkv/handler_test.go, pkg/dxf/importinto/conflictedkv/BUILD.bazel
NewCollector, NewDeleter, NewBaseHandler signatures gain execute.Collector param; handler uses NoopCollector when nil and calls collector.Processed(1,0) after each successful Handle. Tests and Bazel deps updated.
Import job runtime, planning & scheduler
pkg/dxf/importinto/job.go, pkg/dxf/importinto/job_testkit_test.go, pkg/dxf/importinto/planner.go, pkg/dxf/importinto/scheduler.go, pkg/dxf/importinto/task_executor.go
RuntimeInfo and job progress now read Processed for progress and format conflict-step labels via new helpers; planner/scheduler set conflict row counts; executors record Processed.
Importer summary & UI
pkg/executor/importer/import.go, pkg/executor/show.go, pkg/executor/show_test.go, pkg/executor/chunk_process_testkit_test.go
Added CollectConflictsSummary and ResolveConflictsSummary to exported Summary. UI uses RuntimeInfo.SpeedStr(); tests updated to use Processed.
Lightning backend & related tests
pkg/lightning/backend/external/merge.go, pkg/lightning/backend/external/onefile_writer_test.go, pkg/lightning/backend/external/sort_test.go, tests/.../global_sort_test.go, tests/.../global_sort_test.go
Collector updates increment summary.Processed; multiple tests updated to assert Processed/ProcessedCnt instead of Bytes.
Misc tests & build changes
pkg/dxf/framework/storage/table_test.go, pkg/executor/BUILD.bazel, various test files under tests/
Numerous tests updated to use Processed; Bazel test dependencies adjusted to include new package references.

Sequence Diagram(s)

sequenceDiagram
    participant Exec as Step Executor
    participant Del as ConflictedDeleter
    participant Hdl as BaseHandler
    participant Col as execute.Collector

    Exec->>Del: create deleter (pass Col = Exec)
    Del->>Hdl: init handler (pass Col)
    loop per KV pair
        Hdl->>Hdl: Handle(kvPair)
        Hdl->>Col: Processed(1, 0)
        Col-->>Exec: Exec.summary.Processed += 1
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

component/import, component/lightning, ok-to-test

Suggested reviewers

  • GMHDBJD
  • joechenrh
  • hawkingrei

Poem

🐰 Hopped from Bytes to Processed today,
I tallied conflicts along the way,
Collectors counted, handlers cheered,
Progress leapt — no metric feared,
Import hops on, metrics play!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 26.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The PR title clearly and specifically identifies the main change: adding conflict progress display to SHOW IMPORT JOB across the specified packages.
Description check ✅ Passed The PR description is comprehensive and follows the template structure. It includes issue reference, problem summary, detailed explanation of changes, test checklist with unit and manual tests completed, and appropriate release notes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.4)

Command failed


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ti-chi-bot ti-chi-bot Bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 3, 2026
@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 3, 2026

Hi @D3Hunter. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot Bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 7, 2026
@D3Hunter D3Hunter marked this pull request as ready for review April 8, 2026 07:40
@ti-chi-bot ti-chi-bot Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 8, 2026
@pantheon-ai
Copy link
Copy Markdown

pantheon-ai Bot commented Apr 8, 2026

Review Complete

Findings: 0 issues
Posted: 0
Duplicates/Skipped: 0

ℹ️ Learn more details on Pantheon AI.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 8, 2026

Codecov Report

❌ Patch coverage is 73.33333% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.8656%. Comparing base (ee3439d) to head (26658cc).
⚠️ Report is 50 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #67551        +/-   ##
================================================
+ Coverage   77.7488%   79.8656%   +2.1167%     
================================================
  Files          1959       1992        +33     
  Lines        543393     556288     +12895     
================================================
+ Hits         422482     444283     +21801     
+ Misses       120070     110535      -9535     
- Partials        841       1470       +629     
Flag Coverage Δ
integration 47.1237% <0.0000%> (+10.9489%) ⬆️
unit 77.1216% <73.3333%> (+0.7366%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 61.5065% <ø> (ø)
parser ∅ <ø> (∅)
br 66.7335% <ø> (+5.7196%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
pkg/ddl/backfilling_read_index.go (1)

492-494: Document that Processed is index-KV bytes in this step.

SubtaskSummary.Processed is step-dependent now, but the cleanup/metering path later consumes this field as index_kv_bytes. A short comment here would make that contract obvious and reduce future misuse.

As per coding guidelines, comments SHOULD explain non-obvious intent, constraints, invariants, concurrency guarantees, SQL/compatibility contracts, or important performance trade-offs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/ddl/backfilling_read_index.go` around lines 492 - 494, Add a short
comment in the distTaskRowCntCollector.Processed method (or immediately above
the SubtaskSummary.Processed update) documenting that the bytes argument and
SubtaskSummary.Processed represent index-KV bytes for this step; explicitly
state the contract that later cleanup/metering will consume this field as
index_kv_bytes so future readers won't misuse it. Ensure the comment names the
involved symbols: distTaskRowCntCollector.Processed and
SubtaskSummary.Processed.
pkg/dxf/framework/taskexecutor/execute/interface.go (1)

205-211: Document the Processed invariants on the collector API.

GetSpeedInTimeRange() assumes this counter is cumulative and uses one stable unit for the whole subtask. The new comment only says the meaning “may vary by scenario”, which leaves future executors room to mix units or reset semantics mid-subtask and silently corrupt speed/ETA. Please lock that constraint down here, and clarify whether rows still means the physical row count during conflict handling.

Possible doc tweak
 // Processed is used collects metrics.
-// `processed` is the number of processed units, and `rows` is the number of rows processed.
-// The meaning of `processed` may vary by scenario, for example:
+// `processed` is the cumulative display unit for the current subtask. It must
+// remain monotonic and use the same unit for the whole subtask because runtime
+// speed/ETA are derived from deltas in this field.
+// `rows` is the physical row count processed by the executor.
+// The meaning of `processed` may vary by scenario, for example:
As per coding guidelines, comments SHOULD explain non-obvious intent, constraints, invariants, concurrency guarantees, SQL/compatibility contracts, or important performance trade-offs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/dxf/framework/taskexecutor/execute/interface.go` around lines 205 - 211,
Update the Processed comment on the collector API (the Processed(processed, rows
int64) method) to state explicitly that `processed` is a monotonically
increasing cumulative counter for the whole subtask and must use a single stable
unit for the entire subtask lifetime (never reset or switch units mid-subtask),
and that GetSpeedInTimeRange depends on this invariant; also clarify that `rows`
is the physical row count during conflict handling (and otherwise represents the
same cumulative-per-subtask semantic), and mention any concurrency expectations
(e.g., callers may update this from different goroutines but updates must be
monotonic).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/dxf/framework/taskexecutor/execute/interface.go`:
- Around line 89-91: The persisted `bytes` JSON key is being reused for
non-bytes counts; add an explicit unit field and keep the legacy `bytes` tag to
avoid silent misinterpretation: add a new string field (e.g. ProcessedUnit
`json:"processed_unit,omitempty"`) alongside the existing Processed int64 (leave
its json:"bytes,omitempty" tag for backward compatibility) and update all
writers (e.g. conflict step code that sets Processed) to also set ProcessedUnit
to "bytes" or "count" as appropriate; update readers to consult ProcessedUnit
before treating Processed as a byte size. Ensure the same change is applied to
the other analogous fields mentioned in the review (lines ~100-105).

---

Nitpick comments:
In `@pkg/ddl/backfilling_read_index.go`:
- Around line 492-494: Add a short comment in the
distTaskRowCntCollector.Processed method (or immediately above the
SubtaskSummary.Processed update) documenting that the bytes argument and
SubtaskSummary.Processed represent index-KV bytes for this step; explicitly
state the contract that later cleanup/metering will consume this field as
index_kv_bytes so future readers won't misuse it. Ensure the comment names the
involved symbols: distTaskRowCntCollector.Processed and
SubtaskSummary.Processed.

In `@pkg/dxf/framework/taskexecutor/execute/interface.go`:
- Around line 205-211: Update the Processed comment on the collector API (the
Processed(processed, rows int64) method) to state explicitly that `processed` is
a monotonically increasing cumulative counter for the whole subtask and must use
a single stable unit for the entire subtask lifetime (never reset or switch
units mid-subtask), and that GetSpeedInTimeRange depends on this invariant; also
clarify that `rows` is the physical row count during conflict handling (and
otherwise represents the same cumulative-per-subtask semantic), and mention any
concurrency expectations (e.g., callers may update this from different
goroutines but updates must be monotonic).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 3055b4f7-91a0-4d1a-8435-d453493bcf6d

📥 Commits

Reviewing files that changed from the base of the PR and between 1fc24d3 and 986c042.

📒 Files selected for processing (25)
  • pkg/ddl/backfilling_clean_s3.go
  • pkg/ddl/backfilling_read_index.go
  • pkg/dxf/framework/storage/table_test.go
  • pkg/dxf/framework/taskexecutor/execute/interface.go
  • pkg/dxf/framework/taskexecutor/execute/interface_test.go
  • pkg/dxf/importinto/collect_conflicts.go
  • pkg/dxf/importinto/conflict_resolution.go
  • pkg/dxf/importinto/conflictedkv/collector.go
  • pkg/dxf/importinto/conflictedkv/collector_test.go
  • pkg/dxf/importinto/conflictedkv/deleter.go
  • pkg/dxf/importinto/conflictedkv/deleter_test.go
  • pkg/dxf/importinto/conflictedkv/handler.go
  • pkg/dxf/importinto/conflictedkv/handler_test.go
  • pkg/dxf/importinto/job.go
  • pkg/dxf/importinto/job_testkit_test.go
  • pkg/dxf/importinto/planner.go
  • pkg/dxf/importinto/scheduler.go
  • pkg/dxf/importinto/task_executor.go
  • pkg/executor/importer/chunk_process_testkit_test.go
  • pkg/executor/importer/import.go
  • pkg/executor/show.go
  • pkg/executor/show_test.go
  • pkg/lightning/backend/external/merge.go
  • pkg/lightning/backend/external/onefile_writer_test.go
  • pkg/lightning/backend/external/sort_test.go

Comment on lines +89 to +91
// Processed stores generic progress units, such as bytes processed or rows processed.
// Keep the JSON tag as "bytes" for backward compatibility with persisted summaries.
Processed int64 `json:"bytes,omitempty"`
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't overload persisted bytes with a different unit.

Keeping the legacy JSON tag preserves decoding, but it also changes the meaning of persisted state. Conflict steps will now write counts under bytes, so any step-agnostic or mixed-version reader that still formats this field as a size will silently report bogus progress/speed. This should be versioned explicitly, or paired with an explicit unit, instead of reusing the bytes key.

Also applies to: 100-105

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/dxf/framework/taskexecutor/execute/interface.go` around lines 89 - 91,
The persisted `bytes` JSON key is being reused for non-bytes counts; add an
explicit unit field and keep the legacy `bytes` tag to avoid silent
misinterpretation: add a new string field (e.g. ProcessedUnit
`json:"processed_unit,omitempty"`) alongside the existing Processed int64 (leave
its json:"bytes,omitempty" tag for backward compatibility) and update all
writers (e.g. conflict step code that sets Processed) to also set ProcessedUnit
to "bytes" or "count" as appropriate; update readers to consult ProcessedUnit
before treating Processed as a byte size. Ensure the same change is applied to
the other analogous fields mentioned in the review (lines ~100-105).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it serves due as a progressed data unit, either bytes or rows

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@D3Hunter, understood — the Processed / bytes field is intentionally a generic progress-unit counter (bytes during encode/merge, conflict-row count during conflict steps), and the step-aware formatting in RuntimeInfo already dispatches on step type to present the correct unit to consumers. I'll note this for future reviews.


✏️ Learnings added
Learnt from: D3Hunter
Repo: pingcap/tidb PR: 67551
File: pkg/dxf/framework/taskexecutor/execute/interface.go:89-91
Timestamp: 2026-04-08T08:19:36.935Z
Learning: In `pkg/dxf/framework/taskexecutor/execute/interface.go`, `SubtaskSummary.Processed` (JSON tag `"bytes"`) and `Progress.Processed` (JSON tag `"bytes"`) are intentionally generic "progressed data unit" fields: they store bytes during encode/merge steps and conflict-row counts during `ImportStepCollectConflicts` / `ImportStepConflictResolution` steps. The `bytes` JSON tag is kept for backward compatibility only. Correct unit interpretation is handled by the step-aware `RuntimeInfo` rendering layer in `pkg/dxf/importinto/job.go`. Do not flag this dual-use as a bug or request a separate unit field.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: joechenrh
Repo: pingcap/tidb PR: 67054
File: pkg/lightning/mydump/parquet_parser.go:542-545
Timestamp: 2026-03-17T06:01:45.974Z
Learning: In `pkg/lightning/mydump/parquet_parser.go`, `ParquetParser.lastRow.SkipCast` intentionally aliases the mutable `pp.skipCast` buffer (same as `Row.Row` aliasing the pool datum slice). Both are only valid until the next `ReadRow()` call. This is by design: callers (e.g., the encode path in `pkg/executor/importer/chunk_process.go`) consume `LastRow().SkipCast` immediately before the next read. Do not flag the direct assignment `pp.lastRow.SkipCast = pp.skipCast` as an aliasing bug.

Learnt from: joechenrh
Repo: pingcap/tidb PR: 66878
File: tests/realtikvtest/importintotest/import_into_test.go:253-267
Timestamp: 2026-03-11T06:26:14.065Z
Learning: In pingcap/tidb, PR `#66878` (cherry-pick of `#58401` to release-7.5) also bundles the follow-up nil-check fix from PR `#63146`. `TestOnUpdateColumn` in `tests/realtikvtest/importintotest/import_into_test.go` was cherry-picked from PR `#63146` and tests that `IMPORT INTO` on a table with an `ON UPDATE CURRENT_TIMESTAMP` column does NOT panic due to a nil `onDup` map in `resolveGeneratedColumns`. It is NOT intended to cover the transitive generated-column / index-inconsistency fix from `#58401/`#58400.

Learnt from: CR
Repo: pingcap/tidb PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-12T12:48:57.065Z
Learning: Applies to pkg/executor/** : For SQL behavior changes in executor, perform targeted unit test plus relevant integration test.

Learnt from: joechenrh
Repo: pingcap/tidb PR: 66878
File: pkg/planner/core/logical_plan_builder.go:6455-6472
Timestamp: 2026-03-11T06:29:00.122Z
Learning: Ensure code reviews verify the VirtualAssignmentsOffset semantics: the planner sets Update.VirtualAssignmentsOffset = len(update.List). The executor should only apply OrderedList[:VirtualAssignmentsOffset] when composing new rows, and only after the 'changed' check should it group/evaluate OrderedList[VirtualAssignmentsOffset:] per table. This pattern applies to files under pkg/planner/core and pkg/executor (e.g., common_plans.go and update.go). Reviewers should check that updates respect slicing behavior, that the offset is consistently derived from the planner, and that downstream code does not bypass the offset when creating new rows. Add tests validating both branches: the slice before the offset for new rows, and the per-table handling of the slice after the offset.

Copy link
Copy Markdown

@pantheon-ai pantheon-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Code looks good. No issues found.

@ingress-bot
Copy link
Copy Markdown

🔍 Starting code review for this PR...

Copy link
Copy Markdown

@ingress-bot ingress-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This review was generated by AI and should be verified by a human reviewer.
Manual follow-up is recommended before merge.

Summary

  • Total findings: 7
  • Inline comments: 7
  • Summary-only findings (no inline anchor): 0
Findings (highest risk first)

⚠️ [Major] (4)

  1. Conflict progress contract says rows, but the implementation records conflict KV pairs (pkg/dxf/framework/taskexecutor/execute/interface.go:210, pkg/dxf/importinto/conflictedkv/handler.go:128, pkg/dxf/importinto/conflictedkv/handler_test.go:235, pkg/ingestor/engineapi/engine.go:76)
  2. conflictedkv now leaks taskexecutor collector contract across package boundaries (pkg/dxf/importinto/conflictedkv/handler.go:88, pkg/dxf/importinto/conflictedkv/collector.go:117, pkg/dxf/importinto/conflictedkv/deleter.go:76, pkg/dxf/importinto/collect_conflicts.go:71, pkg/dxf/importinto/conflict_resolution.go:51)
  3. Per-record shared progress atomic creates a conflict-handling hotspot (pkg/dxf/importinto/conflictedkv/handler.go:128, pkg/dxf/importinto/collect_conflicts.go:204, pkg/dxf/importinto/conflict_resolution.go:143)
  4. SHOW IMPORT JOB changes existing size/speed field units during rolling upgrade (pkg/dxf/importinto/job.go:234, pkg/dxf/importinto/job.go:249, pkg/executor/show.go:2569)

🟡 [Minor] (3)

  1. Retry/replay parity for new conflict progress accounting is not covered (pkg/dxf/importinto/conflictedkv/handler.go:128, pkg/dxf/importinto/conflictedkv/handler_test.go:142, pkg/dxf/importinto/conflictedkv/handler_test.go:235)
  2. Conflict-step totals have no fallback when reading pre-upgrade task metadata (pkg/dxf/importinto/job.go:344, pkg/executor/importer/import.go:379)
  3. Summary comment now overstates unit symmetry after adding conflict-step summaries (pkg/executor/importer/import.go:371, pkg/executor/importer/import.go:379, pkg/dxf/importinto/planner.go:717)

Comment thread pkg/dxf/framework/taskexecutor/execute/interface.go Outdated
Comment thread pkg/dxf/importinto/conflictedkv/handler.go
Comment thread pkg/dxf/importinto/conflictedkv/handler.go
Comment thread pkg/dxf/importinto/job.go
Comment thread pkg/dxf/importinto/conflictedkv/handler.go
Comment thread pkg/dxf/importinto/job.go
Comment thread pkg/executor/importer/import.go
@D3Hunter
Copy link
Copy Markdown
Contributor Author

D3Hunter commented Apr 8, 2026

/retest

@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 8, 2026

@D3Hunter: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@D3Hunter
Copy link
Copy Markdown
Contributor Author

D3Hunter commented Apr 8, 2026

/retest

@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 8, 2026

@D3Hunter: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@D3Hunter
Copy link
Copy Markdown
Contributor Author

D3Hunter commented Apr 8, 2026

/retest

@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 8, 2026

@D3Hunter: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@D3Hunter
Copy link
Copy Markdown
Contributor Author

D3Hunter commented Apr 9, 2026

/retest

@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 9, 2026

@D3Hunter: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@D3Hunter
Copy link
Copy Markdown
Contributor Author

D3Hunter commented Apr 9, 2026

/retest

@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 9, 2026

@D3Hunter: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@D3Hunter
Copy link
Copy Markdown
Contributor Author

D3Hunter commented Apr 9, 2026

/retest

@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 9, 2026

@D3Hunter: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot Bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Apr 11, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented Apr 13, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: GMHDBJD, joechenrh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added lgtm approved and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Apr 13, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented Apr 13, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-04-11 07:42:18.871615638 +0000 UTC m=+1201344.076975695: ☑️ agreed by joechenrh.
  • 2026-04-13 09:37:40.248078914 +0000 UTC m=+1381065.453438971: ☑️ agreed by GMHDBJD.

@D3Hunter
Copy link
Copy Markdown
Contributor Author

/retest

@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 13, 2026

@D3Hunter: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@D3Hunter
Copy link
Copy Markdown
Contributor Author

/retest

@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 13, 2026

@D3Hunter: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot Bot merged commit 1589e75 into pingcap:master Apr 13, 2026
38 checks passed
@D3Hunter D3Hunter deleted the codex/show-import-job-conflict-progress branch April 13, 2026 14:41
premal pushed a commit to premal/tidb that referenced this pull request Apr 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved lgtm release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants