executor, planner: improve INSERT performance for tables with many virtual generated columns by bb7133 · Pull Request #67917 · pingcap/tidb

bb7133 · 2026-04-20T18:57:22Z

What problem does this PR solve?

Issue Number: close #67916

Problem Summary:

INSERT into a table with many virtual generated columns (GENERATED ALWAYS AS (...) VIRTUAL) is significantly slower than equivalent tables without generated columns. On a table with 150 virtual GCs, inserting 10,000 rows took 36.69s vs 0.68s without GCs (~54x regression). MySQL 8.2 handles the same workload with negligible overhead (~1.1x).

Root causes identified via CPU profiling:

1. Executor — O(G×C) allocations in fillRow() (~79% of CPU)

MutRowFromDatums(row) was called inside the GC evaluation loop, allocating a full copy of the row for every generated column. For G generated columns and C total columns, this is O(G×C) allocations per inserted row.

2. Planner — redundant Clone()+rewrite() in resolveGeneratedColumns()

Every GC expression was cloned and rewritten during prepare, including trivially constant expressions like AS (NULL) that require no rewriting.

3. SetDatum unsafe for null-allocated columns (fixed-size types)

MutRow.SetDatum for KindMysqlTime, KindMysqlDuration, and KindMysqlDecimal wrote directly to col.data[0] without checking the buffer length. A null-allocated column has zero-length data, causing a panic. This prevented using SetDatum to keep the MutRow in sync after each GC, forcing a full MutRowFromDatums rebuild instead.

What is changed and how it works?

1. Executor (pkg/executor/insert_common.go):

Hoist MutRowFromDatums(row) before the GC loop (one allocation per row). After each GC evaluation, call mutRow.SetDatum(colIdx, row[colIdx]) to update only the changed column in-place — O(1) per GC instead of O(C).
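
The effect of the hoist can be modelled outside TiDB. The sketch below is a toy, not the actual `fillRow` code: `mutRow`, `mutRowFromDatums`, `setDatum`, `genCol`, `buildChain`, `measure`, and the `copied` counter are all hypothetical stand-ins, and plain `int64` slots replace `types.Datum`. It counts how many column slots get copied when the row snapshot is rebuilt for every generated column versus built once and patched in place.

```go
package main

import "fmt"

// mutRow is a hypothetical stand-in for chunk.MutRow: a mutable snapshot of one row.
type mutRow struct{ cols []int64 }

// copied counts column slots copied into snapshots, a rough proxy for allocation work.
var copied int

// mutRowFromDatums copies the whole row, so each call costs O(C).
func mutRowFromDatums(row []int64) *mutRow {
	copied += len(row)
	return &mutRow{cols: append([]int64(nil), row...)}
}

// setDatum patches a single slot in place, so each call costs O(1).
func (m *mutRow) setDatum(i int, v int64) { m.cols[i] = v }

// genCol is a toy generated column: eval computes slot idx from the snapshot.
type genCol struct {
	idx  int
	eval func(m *mutRow) int64
}

// fillRowRebuild mirrors the old shape: a fresh snapshot per generated column.
func fillRowRebuild(row []int64, gcs []genCol) {
	for _, gc := range gcs {
		m := mutRowFromDatums(row)
		row[gc.idx] = gc.eval(m)
	}
}

// fillRowHoisted mirrors the new shape: one snapshot, then in-place updates so
// later generated columns still see earlier results.
func fillRowHoisted(row []int64, gcs []genCol) {
	m := mutRowFromDatums(row)
	for _, gc := range gcs {
		row[gc.idx] = gc.eval(m)
		m.setDatum(gc.idx, row[gc.idx])
	}
}

// buildChain puts g generated columns in the last g slots of a c-wide row
// (c must exceed g); each one evaluates to the previous slot plus one.
func buildChain(g, c int) []genCol {
	gcs := make([]genCol, 0, g)
	for i := c - g; i < c; i++ {
		prev := i - 1
		gcs = append(gcs, genCol{idx: i, eval: func(m *mutRow) int64 { return m.cols[prev] + 1 }})
	}
	return gcs
}

// measure runs one fill strategy and reports the last slot and the slots copied.
func measure(fill func([]int64, []genCol), g, c int) (last int64, slots int) {
	copied = 0
	row := make([]int64, c)
	row[c-g-1] = 10 // the base column that the chain starts from
	fill(row, buildChain(g, c))
	return row[c-1], copied
}

func main() {
	for _, s := range []struct {
		name string
		fill func([]int64, []genCol)
	}{{"rebuild", fillRowRebuild}, {"hoisted", fillRowHoisted}} {
		last, slots := measure(s.fill, 150, 151)
		fmt.Printf("%s: last=%d copied=%d\n", s.name, last, slots)
	}
}
```

In this model, 150 chained generated columns on a 151-column row make the rebuild variant copy 150 × 151 = 22,650 slots, while the hoisted variant copies 151. Both variants produce the same final row, because the in-place update keeps the snapshot in sync for dependent columns.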

2. Planner (pkg/planner/core/planbuilder.go):

Add a fast path in resolveGeneratedColumns() for pure value literal expressions (*driver.ValueExpr): skip Clone()+rewrite() and directly construct a *expression.Constant.
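
As a rough illustration of the control flow only, the fast path can be sketched with toy types. Everything here (`node`, `valueExpr`, `funcCallExpr`, `constant`, `resolver`) is hypothetical and stands in for the parser and expression packages; the sketch deliberately ignores type, collation, and nullability handling, which the real `rewrite()` path takes care of.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a parsed generated-column expression.
type node interface{ isNode() }

// valueExpr models a pure literal such as NULL or 42.
type valueExpr struct{ datum any }

// funcCallExpr models anything that still needs full rewriting, e.g. LOWER(name).
type funcCallExpr struct {
	name string
	arg  string
}

func (valueExpr) isNode()    {}
func (funcCallExpr) isNode() {}

// constant models the resolved form of a literal.
type constant struct{ value any }

// resolver counts how many expressions go through the expensive path.
type resolver struct{ rewrites int }

// rewrite stands in for Clone()+rewrite(); here it only counts the call.
func (r *resolver) rewrite(n node) any {
	r.rewrites++
	return n
}

// resolve builds a constant directly for literals and rewrites everything else.
func (r *resolver) resolve(exprs []node) []any {
	out := make([]any, 0, len(exprs))
	for _, e := range exprs {
		if v, ok := e.(valueExpr); ok {
			out = append(out, constant{value: v.datum}) // fast path
			continue
		}
		out = append(out, r.rewrite(e)) // slow path
	}
	return out
}

func main() {
	exprs := []node{valueExpr{nil}, valueExpr{nil}, funcCallExpr{"LOWER", "name"}}
	r := &resolver{}
	out := r.resolve(exprs)
	fmt.Printf("resolved=%d rewrites=%d\n", len(out), r.rewrites)
}
```

Only the non-literal expression reaches `rewrite` in this model, which is why a table whose generated columns are all `AS (NULL)` would skip the expensive path entirely.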

3. Chunk (pkg/util/chunk/mutrow.go):

Fix SetDatum for the three fixed-size types to grow the column data buffer when needed (when transitioning from a null-allocated column). This makes SetDatum safe to call for any datum kind, enabling the in-place update in fillRow.
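
The failure mode and the guard can be reproduced in isolation. This is a minimal sketch under stated assumptions: `column`, `newNullColumn`, `setDurationUnchecked`, `setDuration`, and `panics` are hypothetical names, and the write uses `encoding/binary` rather than the unsafe pointer cast in `mutrow.go`. It shows an 8-byte write into a zero-length buffer panicking, and the same write succeeding once the buffer is grown first.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// column is a hypothetical stand-in for a MutRow column backed by a byte buffer.
type column struct {
	data   []byte
	isNull bool
}

// newNullColumn models a column built from a NULL datum: its buffer is empty.
func newNullColumn() *column { return &column{data: []byte{}, isNull: true} }

// setDurationUnchecked writes 8 bytes without checking the buffer length and
// panics on a null-allocated column.
func (c *column) setDurationUnchecked(nanos int64) {
	binary.LittleEndian.PutUint64(c.data, uint64(nanos))
	c.isNull = false
}

// setDuration grows the buffer first, so it is safe on a null-allocated column.
func (c *column) setDuration(nanos int64) {
	if len(c.data) < 8 {
		c.data = make([]byte, 8)
	}
	binary.LittleEndian.PutUint64(c.data, uint64(nanos))
	c.isNull = false
}

// duration reads the stored value back.
func (c *column) duration() int64 { return int64(binary.LittleEndian.Uint64(c.data)) }

// panics reports whether f panicked.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	fmt.Println("unchecked panics:", panics(func() { newNullColumn().setDurationUnchecked(5) }))
	c := newNullColumn()
	fmt.Println("guarded panics:", panics(func() { c.setDuration(5) }))
	fmt.Println("stored:", c.duration())
}
```

The length check costs one comparison per write and only allocates on the NULL to non-NULL transition, so a column that already holds a value keeps reusing its buffer.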

Performance Results

End-to-end (local tidb-server, 10k-row CTE INSERT, 150 virtual GCs):

| Workload | Before | After | Speedup |
| --- | --- | --- | --- |
| `obj_gc` (150 literal `AS (NULL)` GCs) | 36.69s | 1.64s | ~22x |
| `obj_nogc` (baseline) | 0.68s | 0.90s | - |

Go benchmark (mock store, master branch):

| Benchmark | Before patch | After patch | MySQL 8.2 |
| --- | --- | --- | --- |
| NoGC (150 plain INT cols) | - | 240 µs | 177 µs |
| LiteralGC `AS (NULL)` (150 GCs) | ~14 ms* | 507 µs | 159 µs |
| NonLiteralGC `AS (LOWER(name))` (150 GCs) | ~14 ms* | 1,188 µs | 193 µs |

*Before-patch estimate based on Codex's 36.69s / 10k rows.

The non-literal GC case (e.g. LOWER(name)) improves ~12x compared to the original. There remains a gap vs MySQL 8.2 for non-literal GCs (1.2ms vs 0.2ms) which is left for future work.

Check List

Tests

Unit test (TestInsert* in pkg/executor pass)
Unit test (TestMutRow* in pkg/util/chunk pass)
Manual test (benchmark above)

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

INSERT into tables with many virtual generated columns is now significantly faster. The fix eliminates O(G×C) row allocations per inserted row (where G = number of generated columns, C = total columns), achieving ~22x speedup on tables with 150 literal virtual generated columns, and ~12x speedup for non-literal virtual generated columns (e.g. `GENERATED ALWAYS AS (LOWER(name))`).

Summary by CodeRabbit

Performance Improvements
- Faster evaluation of generated columns during INSERT, improving INSERT throughput on wide tables with many virtual/generated columns.
- Reduced per-row allocation when writing fixed-size MySQL/numeric types, improving update/insert performance and memory behavior.
Bug Fixes
- Corrected buffer allocation for temporal/decimal types to prevent write errors.
Tests
- Added benchmarks exercising INSERT on very wide tables to track performance.

pantheon-ai · 2026-04-20T18:57:28Z

@bb7133 I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details.

⏳ This process typically takes 10-30 minutes depending on the complexity of the changes.

coderabbitai · 2026-04-20T18:57:34Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 977ceccd-792c-4669-939d-0704d45600fc

📥 Commits

Reviewing files that changed from the base of the PR and between 4f981e1 and 71cecd5.

📒 Files selected for processing (5)

pkg/executor/BUILD.bazel
pkg/executor/bench_gencol_test.go
pkg/executor/insert_common.go
pkg/planner/core/planbuilder.go
pkg/util/chunk/mutrow.go

✅ Files skipped from review due to trivial changes (2)

pkg/executor/BUILD.bazel
pkg/executor/bench_gencol_test.go

🚧 Files skipped from review as they are similar to previous changes (3)

pkg/util/chunk/mutrow.go
pkg/executor/insert_common.go
pkg/planner/core/planbuilder.go

📝 Walkthrough

Reused a single MutRow during generated-column evaluation, added a planner fast-path to emit constants for literal generated expressions, sized MutRow buffers defensively for fixed-size types, and added three benchmarks exercising INSERT on wide tables with/without virtual generated columns.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Benchmarks `pkg/executor/bench_gencol_test.go` | Added three Go benchmarks measuring INSERT performance for: no generated columns, many literal virtual generated columns, and many non-literal virtual generated columns (dependent on a base column). |
| Executor: generated-column evaluation `pkg/executor/insert_common.go` | Create one `chunk.MutRow` once in `fillRow` and reuse it for all generated-column Eval calls; update `mutRow` in-place after each generated-column computation so subsequent generated columns see prior results. |
| Planner: generated-column resolution `pkg/planner/core/planbuilder.go` | Fast-path in `resolveGeneratedColumns`: when a generated expression is a literal (`*driver.ValueExpr`), build an `expression.Constant` directly and skip Clone+rewrite and the temporary `allowBuildCastArray` toggle. |
| MutRow buffer sizing `pkg/util/chunk/mutrow.go` | `MutRow.SetDatum` now pre-allocates `col.data` to fixed sizes for certain MySQL/numeric kinds (int, float, time, duration, decimal) before performing unsafe/fixed-size writes. |
| Build config `pkg/executor/BUILD.bazel` | Added `bench_gencol_test.go` to the `executor_test` go_test `srcs`. |

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client (TestKit)
  participant Planner as Planner (resolveGeneratedColumns)
  participant Executor as Executor (fillRow / GenExprs)
  participant MutRow as chunk.MutRow
  participant Storage as KV/Store

  Client->>Planner: CREATE/PREPARE table (with generated cols)
  Planner-->>Client: resolved generated-column exprs (constants for literals)
  Client->>Executor: INSERT row(s)
  Executor->>MutRow: MutRowFromDatums(row) (single allocation)
  loop for each generated column
    Executor->>MutRow: Eval GenExprs[i] using mutRow.ToRow()
    MutRow-->>Executor: evaluated datum
    Executor->>MutRow: SetDatum(colIdx, datum) (in-place)
  end
  Executor->>Storage: write final row
  Storage-->>Executor: ack
  Executor-->>Client: result

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

planner/core: update missing virtual columns in update and insert (#58401) #66878 — Modifies resolveGeneratedColumns; touches the same planner code path.
planner/core: update missing virtual columns in update and insert (#58401) #64752 — Also changes planner handling of generated columns; overlaps with this PR's planner changes.

Suggested reviewers

qw4990
Benjamin2037
D3Hunter

Poem

🐰 One hop, one reuse, no needless copy to hide,
MutRow keeps the row while generated values collide,
Literals turned constant, rewrite skipped away,
Buffers sized snugly so unsafe writes obey,
Benchmarks clap softly — INSERTs stride. ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main performance improvement: speeding up INSERT operations for tables with many virtual generated columns. |
| Description check | ✅ Passed | The description is comprehensive, following the template structure with problem summary, detailed explanation of changes, performance results, and checklist completion. |
| Linked Issues check | ✅ Passed | All code changes directly address the objectives from `#67916`: eliminating O(G×C) allocations in fillRow() [67916], skipping redundant Clone()+rewrite() for literals [67916], and fixing SetDatum safety. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to the linked issue `#67916`: executor changes eliminate per-GC allocations, planner changes add literal fast path, chunk changes fix SetDatum safety, and benchmarks validate the improvements. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.4)

Command failed

ti-chi-bot · 2026-04-20T20:07:46Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign guo-shaoge for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS
pkg/planner/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

pkg/executor/insert_common.go (1)
719-755: ⚠️ Potential issue | 🟡 Minor

Add regression test for dependent fixed-size-type generated columns with NULL evaluation.

TestInsertDuplicateToGeneratedColumns (insert_test.go:725) covers chained DATETIME dependent generated columns with non-NULL values, exercising the mutRow refresh path. However, the code comment at lines 751–755 documents a critical constraint for fixed-size types (DECIMAL, TIME, DURATION): when a generated column evaluates to NULL, mutRow is intentionally NOT refreshed to avoid panicking buffers originally allocated for null. This NULL case is not tested.

Add a regression test for INSERT with chained fixed-size-type generated columns, specifically covering NULL evaluation: for example, a DECIMAL GC referencing another DECIMAL GC that evaluates to NULL, to ensure the skip-refresh logic operates correctly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/executor/insert_common.go` around lines 719 - 755, Add a regression test
that exercises the "do not refresh mutRow on NULL for fixed-size types" path:
create a table with chained generated columns of a fixed-size type (e.g.,
DECIMAL) where the inner generated column can evaluate to NULL and the outer GC
references it; in a new test (e.g., TestInsertGeneratedFixedSizeNullChain)
perform an INSERT that causes the inner GC to be NULL and assert the INSERT
succeeds (no panic) and the stored values/warnings match expectations; place the
test near TestInsertDuplicateToGeneratedColumns and use the same test harness
utilities to construct the table, run the INSERT, and verify results so the
skip-refresh logic for mutRow (code around mutRow := chunk.MutRowFromDatums(row)
and the if !row[colIdx].IsNull() refresh) is covered.

🧹 Nitpick comments (1)

pkg/executor/bench_gencol_test.go (1)
46-47: Avoid the internal tracker name in the benchmark comment.

GTOC-8384 is less useful to future readers than a public regression description.
Proposed comment tweak
-// generated columns (GENERATED ALWAYS AS (NULL) VIRTUAL) — the GTOC-8384 pattern.
+// generated columns (GENERATED ALWAYS AS (NULL) VIRTUAL), matching the wide-GC INSERT regression.
As per coding guidelines, code should remain maintainable for future readers with basic TiDB familiarity.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/executor/bench_gencol_test.go` around lines 46 - 47, The benchmark
comment for BenchmarkInsertWideTableWithGC contains an internal tracker
reference (GTOC-8384); remove that internal ID and replace it with a concise
public description of the regression/behavior being tested (e.g.,
"regression/performance scenario for INSERT into a table with 150 virtual
generated columns") so the comment reads clearly for future readers without
internal ticket IDs.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/executor/bench_gencol_test.go`:
- Line 1: This new file's license header is out of date; update the top-of-file
header in bench_gencol_test.go to include the standard TiDB Apache-2.0 license
block copied from a nearby Go source and change the copyright year to the
current year (2026), ensuring the full multi-line header (copyright + Apache 2.0
notice) matches project conventions.
- Around line 1-64: The new test file bench_gencol_test.go was not added to the
Bazel srcs list for the pkg/executor go_test target; update the package metadata
by running make bazel_prepare and committing the generated changes, or manually
edit the pkg/executor/BUILD.bazel file to add "bench_gencol_test.go" to the srcs
array for the go_test target (place it alphabetically between
batch_point_get_test.go and benchmark_test.go) so Bazel includes the new test.

In `@pkg/planner/core/planbuilder.go`:
- Around line 4026-4028: The fast-path that converts a GeneratedExpr ValueExpr
into an expression.Constant using column.FieldType.Clone() loses the special
handling performed by rewrite()/expression_rewriter.go for *driver.ValueExpr
(nullability, datum type/repertoire and UTF8MB4 collation), so restore
equivalent semantics: detect the *driver.ValueExpr case in the fast path (where
you currently check valExpr, ok :=
column.GeneratedExpr.Internal().(*driver.ValueExpr)) and apply the same
adjustments the rewrite() path does to produce the Constant (adjust nullability
and call the same datum/type/collation transformations used in
expression_rewriter.go:1557-1620) instead of unconditionally using
column.FieldType.Clone(); alternatively, restrict the fast-path to only the NULL
literal case and leave all other ValueExprs to the rewrite() path.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 2e0236ab-72d8-4a18-b20d-73f0ef562f0b

📥 Commits

Reviewing files that changed from the base of the PR and between 7f3e45f and f3cf61c.

📒 Files selected for processing (3)

pkg/executor/bench_gencol_test.go
pkg/executor/insert_common.go
pkg/planner/core/planbuilder.go

codecov · 2026-04-20T20:30:37Z

Codecov Report

❌ Patch coverage is 91.17647% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.1566%. Comparing base (7f3e45f) to head (71cecd5).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files

@@               Coverage Diff                @@
##             master     #67917        +/-   ##
================================================
- Coverage   77.8055%   77.1566%   -0.6489%     
================================================
  Files          1983       1965        -18     
  Lines        549119     549149        +30     
================================================
- Hits         427245     423705      -3540     
- Misses       120954     125441      +4487     
+ Partials        920          3       -917

| Flag | Coverage Δ | |
| --- | --- | --- |
| integration | `40.8845% <91.1764%> (+1.0872%)` | ⬆️ |

Flags with carried forward coverage won't be shown.

| Components | Coverage Δ | |
| --- | --- | --- |
| dumpling | `61.5065% <ø> (ø)` | |
| parser | `∅ <ø> (∅)` | |
| br | `50.0681% <ø> (-13.0316%)` | ⬇️ |

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

pkg/util/chunk/mutrow.go (1)

305-328: ⚠️ Potential issue | 🔴 Critical

Cover all fixed-size SetDatum and SetValue writes from originally-NULL columns.

The new guards fix Time/Duration/Decimal, but KindInt64/KindUint64/KindFloat64 and KindFloat32 still write into col.data without ensuring it is long enough. When a column is created from a NULL datum via MutRowFromDatums, it receives a zero-length buffer via newMutRowVarLenColumn(0). Calling SetDatum or SetValue with a numeric type on such a column will panic trying to write to that zero-length buffer. Also update SetValue (lines 258–266), which has the same unguarded numeric writes. When growing from NULL state, preserve the fixed-column invariant (elemBuf = data and offsets = nil).

Suggested direction

+func ensureMutRowFixedLenColumn(col *Column, size int) {
+	if len(col.data) < size {
+		col.data = make([]byte, size)
+	} else {
+		col.data = col.data[:size]
+	}
+	col.elemBuf = col.data
+	col.offsets = nil
+}
+
 func (mr MutRow) SetDatum(colIdx int, d types.Datum) {
 	col := mr.c.columns[colIdx]
 	cleanColOfMutRow(col)
 	if d.IsNull() {
 		return
 	}
 	switch d.Kind() {
 	case types.KindInt64, types.KindUint64, types.KindFloat64:
-		binary.LittleEndian.PutUint64(mr.c.columns[colIdx].data, d.GetUint64())
+		ensureMutRowFixedLenColumn(col, 8)
+		binary.LittleEndian.PutUint64(col.data, d.GetUint64())
 	case types.KindFloat32:
-		binary.LittleEndian.PutUint32(mr.c.columns[colIdx].data, math.Float32bits(d.GetFloat32()))
+		ensureMutRowFixedLenColumn(col, 4)
+		binary.LittleEndian.PutUint32(col.data, math.Float32bits(d.GetFloat32()))
 	case types.KindString, types.KindBytes, types.KindBinaryLiteral:
 		setMutRowBytes(col, d.GetBytes())
 	case types.KindMysqlTime:
-		if len(col.data) < sizeTime {
-			col.data = make([]byte, sizeTime)
-		}
+		ensureMutRowFixedLenColumn(col, sizeTime)
 		*(*types.Time)(unsafe.Pointer(&col.data[0])) = d.GetMysqlTime()
 	case types.KindMysqlDuration:
-		if len(col.data) < 8 {
-			col.data = make([]byte, 8)
-		}
+		ensureMutRowFixedLenColumn(col, 8)
 		*(*int64)(unsafe.Pointer(&col.data[0])) = int64(d.GetMysqlDuration().Duration)
 	case types.KindMysqlDecimal:
-		if len(col.data) < types.MyDecimalStructSize {
-			col.data = make([]byte, types.MyDecimalStructSize)
-		}
+		ensureMutRowFixedLenColumn(col, types.MyDecimalStructSize)
 		*(*types.MyDecimal)(unsafe.Pointer(&col.data[0])) = *d.GetMysqlDecimal()

 func (mr MutRow) SetValue(colIdx int, val any) {
 	col := mr.c.columns[colIdx]
 	cleanColOfMutRow(col)
 	if val == nil {
 		return
 	}
 	switch x := val.(type) {
 	case int:
-		binary.LittleEndian.PutUint64(col.data, uint64(x))
+		ensureMutRowFixedLenColumn(col, 8)
+		binary.LittleEndian.PutUint64(col.data, uint64(x))
 	case int64:
-		binary.LittleEndian.PutUint64(col.data, uint64(x))
+		ensureMutRowFixedLenColumn(col, 8)
+		binary.LittleEndian.PutUint64(col.data, uint64(x))
 	case uint64:
-		binary.LittleEndian.PutUint64(col.data, x)
+		ensureMutRowFixedLenColumn(col, 8)
+		binary.LittleEndian.PutUint64(col.data, x)
 	case float64:
-		binary.LittleEndian.PutUint64(col.data, math.Float64bits(x))
+		ensureMutRowFixedLenColumn(col, 8)
+		binary.LittleEndian.PutUint64(col.data, math.Float64bits(x))
 	case float32:
-		binary.LittleEndian.PutUint32(col.data, math.Float32bits(x))
+		ensureMutRowFixedLenColumn(col, 4)
+		binary.LittleEndian.PutUint32(col.data, math.Float32bits(x))

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/util/chunk/mutrow.go` around lines 305 - 328, Numeric fixed-size writes
in SetDatum and SetValue may write into a zero-length buffer for columns created
from NULL (via newMutRowVarLenColumn(0)); add length checks and grow col.data
before writing for the 8-byte kinds (KindInt64, KindUint64, KindFloat64,
KindMysqlDuration) and 4-byte KindFloat32 using make([]byte, 8) or make([]byte,
4) as appropriate, then perform binary.LittleEndian.PutUint64/PutUint32 or the
float32 bit conversion; when you grow a NULL column preserve the fixed-column
invariant by ensuring elemBuf = data and offsets = nil on the column struct
(same changes applied in both SetDatum and SetValue for
mr.c.columns[colIdx].data / col.data).

♻️ Duplicate comments (1)

pkg/planner/core/planbuilder.go (1)

4023-4035: ⚠️ Potential issue | 🟠 Major

Preserve ValueExpr rewrite semantics in the fast path.

This shortcut still bypasses the driver.ValueExpr handling in rewrite(). For non-NULL literal generated columns, using column.FieldType.Clone() directly can change constant typing/collation/nullability. Either mirror the rewrite logic here or restrict the shortcut to the targeted NULL literal case.

Safer narrow fast path for the targeted NULL case

-		if valExpr, ok := column.GeneratedExpr.Internal().(*driver.ValueExpr); ok {
-			expr = &expression.Constant{Value: valExpr.Datum, RetType: column.FieldType.Clone()}
+		if valExpr, ok := column.GeneratedExpr.Internal().(*driver.ValueExpr); ok && valExpr.Datum.Kind() == types.KindNull {
+			retType := valExpr.Type.Clone()
+			retType.DelFlag(mysql.NotNullFlag)
+			value := valExpr.Datum
+			value.SetValue(value.GetValue(), retType)
+			expr = &expression.Constant{Value: value, RetType: retType}
 		} else {

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/planbuilder.go` around lines 4023 - 4035, The fast-path that
converts a driver.ValueExpr into expression.Constant using
column.FieldType.Clone() bypasses the rewrite() handling and thus can change
typing/collation/nullability; update the logic in planbuilder.go so either (A)
only take the fast path when valExpr.Datum is a NULL literal (i.e. check
valExpr.Datum.Kind() == types.KindNull) or (B) preserve rewrite semantics by invoking
b.rewrite(ctx, column.GeneratedExpr.Clone(), mockPlan, nil, true) even when
column.GeneratedExpr.Internal() is a *driver.ValueExpr (so the driver.ValueExpr
branch in rewrite runs); use the existing symbols (column.GeneratedExpr,
driver.ValueExpr, expression.Constant, column.FieldType.Clone(), and b.rewrite)
to locate and implement the chosen fix.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: eea7a0eb-6c1a-4c54-b1bf-92f9aede7053

📥 Commits

Reviewing files that changed from the base of the PR and between f3cf61c and 4f981e1.

📒 Files selected for processing (5)

pkg/executor/BUILD.bazel
pkg/executor/bench_gencol_test.go
pkg/executor/insert_common.go
pkg/planner/core/planbuilder.go
pkg/util/chunk/mutrow.go

✅ Files skipped from review due to trivial changes (1)

pkg/executor/BUILD.bazel

🚧 Files skipped from review as they are similar to previous changes (1)

pkg/executor/insert_common.go

…rtual generated columns

For a table with G virtual generated columns and C total columns, the original fillRow() called MutRowFromDatums(row) inside the GC evaluation loop, which is O(G×C) allocations per row. CPU profiling showed this accounted for ~79% of execution time on wide-GC tables (GTOC-8384).

Executor fix: hoist MutRowFromDatums() before the loop (one allocation per row). Rebuild the MutRow only when a GC produces a non-null value, because null is already the zero state in the pre-allocated MutRow, and SetDatum panics for fixed-size types (Time, Duration, Decimal) when the column data buffer was originally null-allocated (zero length).

Planner fix: add a fast path in resolveGeneratedColumns() for pure value literal expressions (driver.ValueExpr, e.g. AS (NULL) VIRTUAL), skipping the expensive Clone()+rewrite() call and directly constructing a Constant.

Performance on a 150-virtual-GC table (10k-row CTE insert):
Before: 36.69s
After: 1.64s (~22x speedup)

Close pingcap#67916

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ti-chi-bot · 2026-04-21T06:21:29Z

[FORMAT CHECKER NOTIFICATION]

Notice: To remove the do-not-merge/needs-tests-checked label, please finish the tests, then check the finished items in the description.

For example:

Tests

Unit test

Integration test

Manual test (add detailed scripts or steps below)

No code

📖 For more info, you can check the "Contribute Code" section in the development guide.

ti-chi-bot · 2026-04-21T06:59:34Z

@bb7133: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-lightning-integration-test | `71a64a6` | link | true | `/test pull-lightning-integration-test` |
| pull-br-integration-test | `71a64a6` | link | true | `/test pull-br-integration-test` |
| pull-unit-test-ddlv1 | `71a64a6` | link | true | `/test pull-unit-test-ddlv1` |
| pull-unit-test-next-gen | `71cecd5` | link | true | `/test pull-unit-test-next-gen` |
| idc-jenkins-ci-tidb/unit-test | `71cecd5` | link | true | `/test unit-test` |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

ti-chi-bot Bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/needs-tests-checked labels Apr 20, 2026

ti-chi-bot Bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. component/dumpling This is related to Dumpling of TiDB. component/statistics sig/planner SIG: Planner labels Apr 20, 2026

bb7133 force-pushed the improve-insert-virtual-gencol-perf branch from 71a64a6 to f3cf61c Compare April 20, 2026 20:07

ti-chi-bot Bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 20, 2026

coderabbitai Bot reviewed Apr 20, 2026

View reviewed changes

Comment thread pkg/executor/bench_gencol_test.go Outdated

Comment thread pkg/executor/bench_gencol_test.go Outdated

Comment thread pkg/planner/core/planbuilder.go

bb7133 force-pushed the improve-insert-virtual-gencol-perf branch 4 times, most recently from c8523f8 to 4f981e1 Compare April 20, 2026 21:06

coderabbitai Bot reviewed Apr 20, 2026

View reviewed changes

bb7133 force-pushed the improve-insert-virtual-gencol-perf branch from 4f981e1 to 71cecd5 Compare April 21, 2026 06:10
