ddl, table: skip pre-check SELECT for widening modify-column changes#67994
ddl, table: skip pre-check SELECT for widening modify-column changes#67994joechenrh wants to merge 1 commit intopingcap:masterfrom
Conversation
|
Hi @joechenrh. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughConditionally skips data pre-validation during MODIFY COLUMN index reorganization for certain char/varchar flen widenings, adds a failpoint at check entry, tightens NULL handling during index casting, and adds five tests covering reorg, rollback, and widening behaviors. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
713e6a2 to
d74a639
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #67994 +/- ##
================================================
+ Coverage 77.7904% 78.6992% +0.9087%
================================================
Files 1982 2002 +20
Lines 549742 559286 +9544
================================================
+ Hits 427647 440154 +12507
+ Misses 121175 116877 -4298
- Partials 920 2255 +1335
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/ddl/modify_column_test.go`:
- Around line 785-815: The test inserts a value ('too_long_to_fit_varchar_10')
that's longer than the source CHAR(20) and can fail before the ALTER reaches the
index-reorg cast; replace that string with a value between 11 and 20 bytes
(e.g., 11–20 ASCII chars) so it fits CHAR(20) but exceeds VARCHAR(10), keeping
the rest of the setup (the insert, the ALTER in castIndexValuesToChangingTypes
path, and the testfailpoint that captures gotTp) unchanged; update the inserted
tuple in the test block that currently inserts into t and assert behavior as
before.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 1241af33-889b-4746-9cdd-84980727de51
📒 Files selected for processing (3)
pkg/ddl/modify_column.gopkg/ddl/modify_column_test.gopkg/table/tables/index.go
d74a639 to
7ece496
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
pkg/ddl/modify_column_test.go (1)
776-806:⚠️ Potential issue | 🔴 CriticalUse a value that fits
CHAR(20)but notVARCHAR(10).
'too_long_to_fit_varchar_10'is 26 bytes, so the setup insert can fail before the ALTER reaches the intended modify-column path. Use an 11–20 byte value to isolate the target-cast failure.🐛 Proposed test fixture fix
- // 'too_long_to_fit_varchar_10' is 26 bytes, over the target varchar(10) - // limit. The cast in castIndexValuesToChangingTypes must reject it. - tk.MustExec("insert into t values ('ok'), ('too_long_to_fit_varchar_10')") + // 'too_long_11' fits the source char(20), but is over the target + // varchar(10) limit. The cast must reject it during modify-column + // validation. + tk.MustExec("insert into t values ('ok'), ('too_long_11')") @@ tk.MustExec("admin check table t") tk.MustQuery("select a from t order by a"). - Check(testkit.Rows("ok", "too_long_to_fit_varchar_10")) + Check(testkit.Rows("ok", "too_long_11"))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/ddl/modify_column_test.go` around lines 776 - 806, The inserted test row uses a 26-byte string which can fail earlier than the intended cast check; change the insert values in the test setup (the tk.MustExec call that inserts into table t) so the second value is between 11 and 20 bytes (fits CHAR(20) but exceeds VARCHAR(10)) to force castIndexValuesToChangingTypes to reject it and exercise the ModifyTypeIndexReorg path; keep the rest of the test (the testfailpoint.EnableCall on github.com/pingcap/tidb/pkg/ddl/getModifyColumnType, the ALTER statement, and subsequent assertions on tblInfo/col/indices) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@pkg/ddl/modify_column_test.go`:
- Around line 776-806: The inserted test row uses a 26-byte string which can
fail earlier than the intended cast check; change the insert values in the test
setup (the tk.MustExec call that inserts into table t) so the second value is
between 11 and 20 bytes (fits CHAR(20) but exceeds VARCHAR(10)) to force
castIndexValuesToChangingTypes to reject it and exercise the
ModifyTypeIndexReorg path; keep the rest of the test (the
testfailpoint.EnableCall on github.com/pingcap/tidb/pkg/ddl/getModifyColumnType,
the ALTER statement, and subsequent assertions on tblInfo/col/indices)
unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d1640ef4-d785-4f85-a60a-89455f294c19
📒 Files selected for processing (7)
pkg/ddl/modify_column.gopkg/ddl/modify_column_test.gopkg/lightning/mydump/parquet_parser_test.gopkg/table/column.gopkg/table/column_test.gopkg/table/tables/index.gopkg/table/tables/tables.go
✅ Files skipped from review due to trivial changes (1)
- pkg/table/tables/index.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/ddl/modify_column.go
7ece496 to
58a9719
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/ddl/modify_column.go`:
- Around line 68-83: The integer branch in dataValuesFitNewType (the
isIntegerChange block) is effectively unreachable on the ModifyTypeIndexReorg
path because getModifyColumnType/needIndexReorg route unsigned-flag changes away
from that path; either delete the integer branch to keep the helper's contract
accurate, or keep it but add a targeted unit test exercising an integer widening
scenario (e.g., tinyint→int with an index) plus a clear comment in
dataValuesFitNewType explaining why this defensive check is kept; update/restore
tests that assert ModifyTypeIndexReorg behavior accordingly and reference
dataValuesFitNewType, isIntegerChange, getModifyColumnType, needIndexReorg, and
ModifyTypeIndexReorg when making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 60c81f69-2fe1-42df-8916-29169c6e954f
📒 Files selected for processing (7)
pkg/ddl/modify_column.gopkg/ddl/modify_column_test.gopkg/lightning/mydump/parquet_parser_test.gopkg/table/column.gopkg/table/column_test.gopkg/table/tables/index.gopkg/table/tables/tables.go
✅ Files skipped from review due to trivial changes (1)
- pkg/table/column_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/table/tables/tables.go
- pkg/ddl/modify_column_test.go
58a9719 to
0af40d6
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/ddl/modify_column.go (1)
68-83: Doc comment slightly mislocates theNotNullFlagguard.The comment claims "the NotNullFlag guard in
CastColumnValueWithStrictModecatches existing NULLs during reorg," butCastColumnValueWithStrictModeitself doesn't reject NULLs — the guard is implemented right after the call, incastIndexValuesToChangingTypes(pkg/table/tables/index.golines 142–143). Consider tightening the wording so future readers can find the guard.📝 Proposed wording tweak
-// dataValuesFitNewType reports whether every stored value of oldCol is -// guaranteed to fit newCol so the pre-check SQL can be skipped on the -// ModifyTypeIndexReorg path. Only integer and char/varchar flen widening -// qualify; NULL -> NOT NULL is still safe because the NotNullFlag guard in -// CastColumnValueWithStrictMode catches existing NULLs during reorg. +// dataValuesFitNewType reports whether every stored value of oldCol is +// guaranteed to fit newCol so the pre-check SQL can be skipped on the +// ModifyTypeIndexReorg path. Only integer and char/varchar flen widening +// qualify; NULL -> NOT NULL is still safe because castIndexValuesToChangingTypes +// (pkg/table/tables/index.go) rejects NULLs against NotNullFlag when the +// changing index is populated during reorg.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/ddl/modify_column.go` around lines 68 - 83, Update the doc comment on dataValuesFitNewType to correct the location of the NotNullFlag guard: state that NULL rejection is enforced after calling CastColumnValueWithStrictMode (not inside it) and point to the guard implemented in castIndexValuesToChangingTypes where NotNullFlag is checked; keep the rest of the explanation about safe type changes intact and only adjust wording to reference CastColumnValueWithStrictMode and castIndexValuesToChangingTypes specifically.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/table/tables/index.go`:
- Around line 140-144: castIndexValuesToChangingTypes currently rejects
pre-existing NULLs unconditionally (returning ErrColumnCantNull), which blocks
DELETE/UPDATE during StateWriteOnly; change the function signature to accept an
explicit boolean (e.g., isReorgPhase or enforceNotNullForReorg) and only perform
the IsNull + HasNotNullFlag check when that flag is true; update callers
(GenIndexKey, GenIndexValue) to pass false for normal DML paths and have the
reorg code/path pass true so the NULL rejection only applies to the reorg
enforcement path (retain UseChangingType behavior otherwise).
---
Nitpick comments:
In `@pkg/ddl/modify_column.go`:
- Around line 68-83: Update the doc comment on dataValuesFitNewType to correct
the location of the NotNullFlag guard: state that NULL rejection is enforced
after calling CastColumnValueWithStrictMode (not inside it) and point to the
guard implemented in castIndexValuesToChangingTypes where NotNullFlag is
checked; keep the rest of the explanation about safe type changes intact and
only adjust wording to reference CastColumnValueWithStrictMode and
castIndexValuesToChangingTypes specifically.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 98f9de79-03b9-4201-9a56-8c3cb171ff3f
📒 Files selected for processing (3)
pkg/ddl/modify_column.gopkg/ddl/modify_column_test.gopkg/table/tables/index.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/ddl/modify_column_test.go
d43dcfd to
605bc57
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
pkg/ddl/modify_column_test.go (1)
771-801:⚠️ Potential issue | 🟠 MajorTest input still exceeds source
CHAR(20)— INSERT will fail before the ALTER runs.
'too_long_to_fit_varchar_10'is 26 characters, which exceeds the source column typechar(20). Under TiDB's defaultsql_mode(which includesSTRICT_TRANS_TABLES), theINSERTat line 773 should error with "Data too long for column 'a'" before control ever reaches theALTERunder test. The subsequent assertion at lines 800-801 that expectstoo_long_to_fit_varchar_10to be present intis also inconsistent with that. Please use an 11–20 byte value so the setup succeeds and the failure isolates to the target-type cast duringModifyTypeIndexReorg.🐛 Proposed fix
- // 'too_long_to_fit_varchar_10' is 26 bytes, over the target varchar(10) - // limit. The cast in castIndexValuesToChangingTypes must reject it. - tk.MustExec("insert into t values ('ok'), ('too_long_to_fit_varchar_10')") + // 'too_long_11' fits the source char(20) but exceeds the target + // varchar(10); the cast in castIndexValuesToChangingTypes must reject it + // during index reorg. + tk.MustExec("insert into t values ('ok'), ('too_long_11')") @@ tk.MustExec("admin check table t") tk.MustQuery("select a from t order by a"). - Check(testkit.Rows("ok", "too_long_to_fit_varchar_10")) + Check(testkit.Rows("ok", "too_long_11"))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/ddl/modify_column_test.go` around lines 771 - 801, The test inserts a 26-byte string that exceeds the source column CHAR(20) so the INSERT fails before the ALTER; replace the literal 'too_long_to_fit_varchar_10' used in tk.MustExec("insert into t values ('ok'), ('too_long_to_fit_varchar_10')") with a shorter string of length 11–20 bytes (e.g., an 11-byte value) so the insert succeeds and the failure is triggered by the ALTER/ModifyTypeIndexReorg path (also update the preceding comment to reflect the new length constraint).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@pkg/ddl/modify_column_test.go`:
- Around line 771-801: The test inserts a 26-byte string that exceeds the source
column CHAR(20) so the INSERT fails before the ALTER; replace the literal
'too_long_to_fit_varchar_10' used in tk.MustExec("insert into t values ('ok'),
('too_long_to_fit_varchar_10')") with a shorter string of length 11–20 bytes
(e.g., an 11-byte value) so the insert succeeds and the failure is triggered by
the ALTER/ModifyTypeIndexReorg path (also update the preceding comment to
reflect the new length constraint).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: ccb5e8e8-92dc-4ad6-86aa-5c8bfd0967da
📒 Files selected for processing (3)
pkg/ddl/modify_column.gopkg/ddl/modify_column_test.gopkg/table/tables/index.go
|
/retest |
|
@joechenrh: PRs from untrusted users cannot be marked as trusted with DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
605bc57 to
97b125f
Compare
97b125f to
71a21e6
Compare
|
🔍 Starting code review for this PR... |
When ALTER TABLE ... MODIFY COLUMN is classified as ModifyTypeIndexReorg (for example char(N) -> varchar(M) with utf8mb4_bin + a secondary index), the StateDeleteOnly handler runs a synchronous SELECT ... FROM t WHERE LENGTH(col) > N OR col IS NULL LIMIT 1 through the internal session pool. On large tables this SELECT has no timeout, pins the cluster GC safepoint for its entire duration, blocks subsequent DDL on the same table, and is invisible to SHOW PROCESSLIST. getModifyColumnType only routes isCharChange cases into ModifyTypeIndexReorg (integer signedness flips go to ModifyTypeReorg, same-signedness integer changes resolve to noReorgDataStrict or NoReorgWithCheck; decimal, temporal, enum/set are similarly handled before this path). So the only widening case that reaches this handler is char/varchar flen widening, and for that case no existing row can possibly violate the new type: the pre-check is pure overhead. Gate the pre-check in doModifyColumnIndexReorg's StateDeleteOnly branch on `isCharChange(oldCol, newCol) && newCol.GetFlen() >= oldCol.GetFlen()`; otherwise keep the existing pre-check. In pkg/table/tables/index.go, extend castIndexValuesToChangingTypes with a NULL guard right after the strict cast. The underlying Datum.ConvertTo converts NULL to NULL without error, so when the new column type carries NotNullFlag this guard is what protects the race window between the statement-build NULL check (GetModifiableColumnJob calls checkForNullValue) and the DDL job's StateNone handler setting PreventNullInsertFlag. A NULL row that lands in that window would otherwise slip through the reorg silently. Add failpoint hook checkModifyColumnDataEntry at the top of checkModifyColumnData so tests can observe whether the pre-check was reached. Tests: - TestModifyColumnIndexReorgWideningSkipsPrecheck covers the widening happy path; checkModifyColumnData is never entered and the ALTER succeeds. - TestModifyColumnIndexReorgRangeViolation preserves the narrowing failure path (pre-check still runs and catches the oversized row). - TestModifyColumnIndexReorgRollbackOnReorgPanic pins the reorg-phase rollback path end-to-end via errorMockPanic. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
71a21e6 to
46da87f
Compare
ingress-bot
left a comment
There was a problem hiding this comment.
This review was generated by AI and should be verified by a human reviewer.
Manual follow-up is recommended before merge.
Summary
- Total findings: 7
- Inline comments: 7
- Summary-only findings (no inline anchor): 0
Findings (highest risk first)
🚨 [Blocker] (2)
- Delete-only precheck skip is not version-gated for NULL-to-NOT-NULL widening (pkg/ddl/modify_column.go:1208, pkg/table/tables/index.go:142)
- Flen-widening fast path skips VARCHAR-to-CHAR trailing-space validation (pkg/ddl/modify_column.go:1208, pkg/ddl/modify_column.go:867, pkg/table/column.go:385)
⚠️ [Major] (2)
- NULL-to-NOT-NULL failures are deferred into reorg for flen-widening index alters (pkg/ddl/modify_column.go:1208, pkg/ddl/modify_column.go:873, pkg/table/tables/index.go:142)
- Widening fast-path changes NULL->NOT NULL failure contract for indexed char/varchar (pkg/ddl/modify_column.go:1208, pkg/table/tables/index.go:142)
🟡 [Minor] (3)
- Index-reorg precheck policy is split across classification, execution, and cast enforcement (pkg/ddl/modify_column.go:1208, pkg/ddl/modify_column.go:129, pkg/table/tables/index.go:142)
widensFlenname and branch comment understate the actual skip-precheck gate (pkg/ddl/modify_column.go:1208)castIndexValuesToChangingTypesno longer signals its validation side effect (pkg/table/tables/index.go:142)
| // getModifyColumnType, so this is the one case where no existing | ||
| // row can violate. NULL -> NOT NULL is caught by the NotNullFlag | ||
| // guard in castIndexValuesToChangingTypes during reorg. | ||
| widensFlen := isCharChange(oldCol, args.Column) && args.Column.GetFlen() >= oldCol.GetFlen() |
There was a problem hiding this comment.
🚨 [Blocker] Delete-only precheck skip is not version-gated for NULL-to-NOT-NULL widening
Impact
During rolling-upgrade owner handoff, a char/varchar widening NULL -> NOT NULL modify-column job reaches StateWriteOnly without running the NULL precheck and then resumes on an older owner that lacks the new reorg NULL guard. The job then finalizes a NOT NULL schema over preexisting NULL rows, which breaks mixed-version upgrade and rollback safety.
Scope
pkg/ddl/modify_column.go:1208—worker.doModifyColumnIndexReorgpkg/table/tables/index.go:142—index.castIndexValuesToChangingTypes
Evidence
widensFlen skips checkModifyColumnData for all char/varchar widening cases in StateDeleteOnly, and there is no job.Version or capability gate around that branch. The skip now relies on the new NotNullFlag rejection in castIndexValuesToChangingTypes, but a resumed job in StateWriteOnly does not rerun the delete-only precheck, so cross-version owner takeover depends on this new guard being present on every owner binary.
Change request
Do not skip delete-only precheck for NULL -> NOT NULL transitions, even when widensFlen is true, or add an explicit version/capability gate before enabling this optimization. Ensure unsupported mixed-version resume paths fail fast with an actionable error instead of relying on implicit owner-version behavior. Add a compatibility-focused test that forces owner handoff between StateDeleteOnly and StateWriteOnly with preexisting NULL rows.
| // getModifyColumnType, so this is the one case where no existing | ||
| // row can violate. NULL -> NOT NULL is caught by the NotNullFlag | ||
| // guard in castIndexValuesToChangingTypes during reorg. | ||
| widensFlen := isCharChange(oldCol, args.Column) && args.Column.GetFlen() >= oldCol.GetFlen() |
There was a problem hiding this comment.
🚨 [Blocker] Flen-widening fast path skips VARCHAR-to-CHAR trailing-space validation
Impact
For indexed VARCHAR -> CHAR conversions on _bin collations with new_flen >= old_flen, rows ending with spaces are no longer rejected in strict mode and the DDL can succeed after trimming those values.
This changes the canonical accept/reject behavior and can silently alter user-visible stored values.
Scope
pkg/ddl/modify_column.go:1208—(*worker).doModifyColumnIndexReorgpkg/ddl/modify_column.go:867—buildCheckSQLFromModifyColumnpkg/table/column.go:385—castColumnValue
Evidence
widensFlen now bypasses checkModifyColumnData for all char/varchar widen-or-equal transitions in StateDeleteOnly.
The skipped checker is the only path that enforces trailing-space rejection for VARCHAR -> CHAR via LIKE '% ', while the reorg path uses CastColumnValueWithStrictMode, which trims trailing spaces for TypeString through truncateTrailingSpaces without returning an error.
Change request
Keep the widening optimization only for pure length-range checks, but do not skip the VARCHAR -> CHAR trailing-space validation path.
A minimal fix is to exclude oldTp == mysql.TypeVarchar && newTp == mysql.TypeString from widensFlen skip handling (or split precheck so this semantic check always runs).
| // getModifyColumnType, so this is the one case where no existing | ||
| // row can violate. NULL -> NOT NULL is caught by the NotNullFlag | ||
| // guard in castIndexValuesToChangingTypes during reorg. | ||
| widensFlen := isCharChange(oldCol, args.Column) && args.Column.GetFlen() >= oldCol.GetFlen() |
There was a problem hiding this comment.
⚠️ [Major] NULL-to-NOT-NULL failures are deferred into reorg for flen-widening index alters
Impact
Indexed CHAR/VARCHAR alters that widen flen and add NOT NULL now enter WriteReorganization even when the table already contains NULL rows that must fail.
The job spends backfill scan/write resources before returning the same deterministic invalid-data failure, which creates avoidable long-running DDL work and load spikes on larger tables.
Scope
pkg/ddl/modify_column.go:1208—(*worker).doModifyColumnIndexReorgpkg/ddl/modify_column.go:873—buildCheckSQLFromModifyColumnpkg/table/tables/index.go:142—(*index).castIndexValuesToChangingTypes
Evidence
doModifyColumnIndexReorg skips checkModifyColumnData(...) entirely when widensFlen is true, so the delete-only precheck no longer runs in that branch. buildCheckSQLFromModifyColumn includes the IS NULL predicate for isNullToNotNullChange, but that early check is bypassed by the widening guard. NULL rejection is instead raised from castIndexValuesToChangingTypes during reorg backfill via table.ErrColumnCantNull.
Change request
Keep the widening optimization for length/range checks, but retain a null-only precheck when isNullToNotNullChange(oldCol, args.Column) is true. This preserves early failure before reorg for already-invalid datasets while still avoiding the unnecessary widening-length pre-scan.
| // getModifyColumnType, so this is the one case where no existing | ||
| // row can violate. NULL -> NOT NULL is caught by the NotNullFlag | ||
| // guard in castIndexValuesToChangingTypes during reorg. | ||
| widensFlen := isCharChange(oldCol, args.Column) && args.Column.GetFlen() >= oldCol.GetFlen() |
There was a problem hiding this comment.
⚠️ [Major] Widening fast-path changes NULL->NOT NULL failure contract for indexed char/varchar
Impact
For indexed char/varchar widening with NULL -> NOT NULL, existing NULL rows now fail in reorg with [table:1048] instead of the prior pre-check error [ddl:1138].
During rolling upgrade, the same DDL can return different error codes depending on which binary version executes the relevant job stage, which breaks repeatable upgrade-time failure handling.
Scope
pkg/ddl/modify_column.go:1208—doModifyColumnIndexReorgpkg/table/tables/index.go:142—castIndexValuesToChangingTypes
Evidence
doModifyColumnIndexReorg now skips checkModifyColumnData when widensFlen is true, so the old pre-check path is bypassed for widening (pkg/ddl/modify_column.go:1208-1210).
checkModifyColumnData is the branch that returns dbterror.ErrInvalidUseOfNull (pkg/ddl/modify_column.go:835), while the new fallback emits table.ErrColumnCantNull from castIndexValuesToChangingTypes (pkg/table/tables/index.go:142-143).
Change request
Keep the widening optimization for length checks, but preserve error-contract parity for NULL -> NOT NULL by still running the NULL pre-check (or by translating the reorg null failure to ErrInvalidUseOfNull).
Add a focused regression case in TestModifyColumnIndexReorgPrecheck that inserts NULL data, performs the widening+NOT NULL alter, and asserts stable error code plus rollback result across repeated runs.
| // getModifyColumnType, so this is the one case where no existing | ||
| // row can violate. NULL -> NOT NULL is caught by the NotNullFlag | ||
| // guard in castIndexValuesToChangingTypes during reorg. | ||
| widensFlen := isCharChange(oldCol, args.Column) && args.Column.GetFlen() >= oldCol.GetFlen() |
There was a problem hiding this comment.
🟡 [Minor] Index-reorg precheck policy is split across classification, execution, and cast enforcement
Impact
The rule for skipping checkModifyColumnData during ModifyTypeIndexReorg now lives in three separate locations across two packages. This parallel contract raises maintenance cost and makes policy updates harder to keep consistent because no single function owns the full decision boundary.
Scope
pkg/ddl/modify_column.go:1208—doModifyColumnIndexReorgpkg/ddl/modify_column.go:129—getModifyColumnTypepkg/table/tables/index.go:142—castIndexValuesToChangingTypes
Evidence
doModifyColumnIndexReorg introduces widensFlen to gate checkModifyColumnData, while the same behavior depends on getModifyColumnType selecting ModifyTypeIndexReorg and on castIndexValuesToChangingTypes enforcing NotNullFlag at reorg time. The new comment in this branch explicitly ties these cross-location assumptions together, which shows the policy is coordinated by convention instead of a shared helper contract.
Change request
Extract a dedicated DDL-level helper for index-reorg precheck eligibility and use it from the state-machine branch instead of embedding the policy inline. Keep castIndexValuesToChangingTypes as runtime enforcement, but make the dependency explicit through a named contract and add a focused test that asserts the precheck decision matrix in one place.
| // getModifyColumnType, so this is the one case where no existing | ||
| // row can violate. NULL -> NOT NULL is caught by the NotNullFlag | ||
| // guard in castIndexValuesToChangingTypes during reorg. | ||
| widensFlen := isCharChange(oldCol, args.Column) && args.Column.GetFlen() >= oldCol.GetFlen() |
There was a problem hiding this comment.
🟡 [Minor] widensFlen name and branch comment understate the actual skip-precheck gate
Impact
The StateDeleteOnly path skips checkModifyColumnData for both widening and equal-length CHAR/VARCHAR changes because the gate is args.Column.GetFlen() >= oldCol.GetFlen(). The identifier widensFlen and the surrounding wording describe widening only, so readers get an inaccurate contract for when pre-check is bypassed.
Scope
pkg/ddl/modify_column.go:1208—doModifyColumnIndexReorg
Evidence
widensFlen is computed as isCharChange(oldCol, args.Column) && args.Column.GetFlen() >= oldCol.GetFlen(), which includes the equality case. The adjacent comment labels the rule as flen widening and does not state that equal length also skips the pre-check.
Change request
Rename the gate to reflect non-narrowing semantics, such as nonNarrowingFlen, or tighten the condition to strict widening if that is the intended policy. Update the nearby comment to explicitly document the exact skip condition (>= or >) so branch behavior and text stay aligned.
| } | ||
| // Strict cast passes NULL through; reject it here when the new type | ||
| // carries NotNullFlag so reorg surfaces existing NULL rows. | ||
| if indexedValues[i].IsNull() && mysql.HasNotNullFlag(tblCol.ChangingFieldType.GetFlag()) { |
There was a problem hiding this comment.
🟡 [Minor] castIndexValuesToChangingTypes no longer signals its validation side effect
Impact
castIndexValuesToChangingTypes now enforces a NOT NULL constraint and returns ErrColumnCantNull in addition to performing type conversion. A cast-only name hides that semantic check and obscures the error contract that GenIndexKey and GenIndexValue now inherit from this helper.
Scope
pkg/table/tables/index.go:142—castIndexValuesToChangingTypes
Evidence
Inside castIndexValuesToChangingTypes, the new branch indexedValues[i].IsNull() && mysql.HasNotNullFlag(tblCol.ChangingFieldType.GetFlag()) returns table.ErrColumnCantNull. That branch introduces explicit constraint validation beyond casting, but the function name still communicates conversion only.
Change request
Either rename this helper to include both behaviors, for example castAndValidateIndexValuesForChangingTypes, or split validation into a separate helper with an intent-focused name. Keep a short function-level comment that states both conversion and NOT NULL enforcement so callers can read the contract correctly.
What problem does this PR solve?
Issue Number: ref #68013
Problem Summary:
ALTER TABLE ... MODIFY COLUMNon theModifyTypeIndexReorgpath runs a synchronous pre-check SELECT inStateDeleteOnlythat has no timeout, pins the cluster GC safepoint for its whole duration, blocks subsequent DDL on the same table, and is invisible toSHOW PROCESSLIST. For widening changes (for examplechar(60) -> varchar(70)) this scan is pure overhead — no existing row can violate.What changed and how does it work?
ModifyTypeIndexReorgis only reachable viaisCharChangeingetModifyColumnType. Gate the pre-check SELECT indoModifyColumnIndexReorg'sStateDeleteOnlybranch onisCharChange(oldCol, newCol) && newCol.GetFlen() >= oldCol.GetFlen(); skip it for widening, keep it otherwise.Add a NULL guard in
castIndexValuesToChangingTypesright after the strict cast:Datum.ConvertTopasses NULL through, so when the new type carriesNotNullFlagthe guard rejects existing NULL rows during reorg. This keepsNULL -> NOT NULLsafe on the widening skip path.Add a
checkModifyColumnDataEntryfailpoint at the top ofcheckModifyColumnDataso tests can observe whether the pre-check was reached.Check List
Tests
Three unit tests cover: widening skip (pre-check not entered), narrowing with an oversized row (pre-check catches), and reorg-phase rollback via
errorMockPanic.Side effects
Documentation
Release note