fix insert ignore panic by aptend · Pull Request #22754 · matrixorigin/matrixone

aptend · 2025-11-03T03:15:38Z

User description

What type of PR is this?

Which issue(s) this PR fixes:

What this PR does / why we need it:

Fixes panic index out of range [29] with length 29 in INSERT IGNORE with duplicate primary keys.

Root cause: When Batches.Shrink() removes duplicate rows (35→23), InputBatchRowCount was not updated, causing bitmap initialization with wrong size.

Test: Added test case with 35 rows (12 duplicates) to verify deduplication works correctly.

PR Type

Bug fix

Description

Fix panic in INSERT IGNORE with duplicate primary keys
Update InputBatchRowCount after Batches.Shrink removes duplicates
Add comprehensive test case with 35 rows and 12 duplicates

Diagram Walkthrough

flowchart LR
  A["INSERT IGNORE with duplicates"] --> B["Batches.Shrink removes duplicates"]
  B --> C["InputBatchRowCount not updated"]
  C --> D["Bitmap initialization fails"]
  D --> E["Index out of range panic"]
  B --> F["Update InputBatchRowCount"]
  F --> G["Correct bitmap size"]
  G --> H["Successful insert"]

File Walkthrough

Relevant files

Bug fix

hashmap_util.go `Update InputBatchRowCount after shrinking batches` pkg/sql/colexec/hashmap_util/hashmap_util.go Update `InputBatchRowCount` after `Batches.Shrink()` removes duplicate rows Ensures bitmap initialization uses correct row count Prevents index out of range panic when handling INSERT IGNORE with duplicates	+4/-0

Tests

insert_ignore.sql `Add regression test for INSERT IGNORE panic` test/distributed/cases/dml/insert/insert_ignore.sql Add test case for INSERT IGNORE with many duplicate primary keys Insert 35 rows with 12 duplicates to reproduce the panic scenario Verify deduplication results in 23 unique rows Include detailed comments explaining the test purpose and expected behavior	+58/-0
insert_ignore.result `Add expected results for INSERT IGNORE test` test/distributed/cases/dml/insert/insert_ignore.result Add expected test results for the new INSERT IGNORE test case Verify final row count is 23 after deduplication Confirm panic no longer occurs with duplicate primary keys	+46/-0

qodo-code-review · 2025-11-03T03:16:06Z

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance

⚪

Concurrency assumption

Description: Possible logic error risk if concurrent use of hb.Batches or hb.InputBatchRowCount occurs
without synchronization; however, based on context it appears single-threaded in build
path and not an exploitable security issue.
hashmap_util.go [407-416]

Referred Code

if hb.IsDedup && hb.OnDuplicateAction == plan.Node_IGNORE {
	err := hb.Batches.Shrink(hb.IgnoreRows, proc)
	if err != nil {
		return err
	}
	// Update InputBatchRowCount to reflect the actual row count after shrinking
	// This is critical because IgnoreRows removed duplicate rows, so the actual
	// row count in batches is now less than the original InputBatchRowCount
	hb.InputBatchRowCount = hb.Batches.RowCount()
}

Ticket Compliance

🟡

🎫 #22733

🟢	Reproduce and fix the panic "index out of range [29] with length 29" occurring during INSERT IGNORE with duplicate primary keys.
	Ensure the underlying cause related to incorrect row count after deduplication is addressed so bitmap initialization uses the correct size.
	Add a test case that demonstrates the issue with many duplicates and verifies no panic occurs and the correct final row count is achieved.
⚪	Validate via runtime execution that no panic occurs under varied workloads beyond the provided test (e.g., different batch sizes, data types, and distributed settings).

Codebase Duplication Compliance

⚪

Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance

🟢

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

⚪

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No audit logs: The new logic updating hb.InputBatchRowCount after shrink performs a critical data-path
action without any corresponding audit or diagnostic logging, making it unclear in
production when and how many rows were deduplicated.

Referred Code

	// Update InputBatchRowCount to reflect the actual row count after shrinking
	// This is critical because IgnoreRows removed duplicate rows, so the actual
	// row count in batches is now less than the original InputBatchRowCount
	hb.InputBatchRowCount = hb.Batches.RowCount()
}

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Missing context: While shrink errors are returned, the new update to InputBatchRowCount lacks validation or
safeguards (e.g., ensuring row count is non-negative and consistent), and no contextual
error/log is provided if inconsistencies arise.

Referred Code

	// Update InputBatchRowCount to reflect the actual row count after shrinking
	// This is critical because IgnoreRows removed duplicate rows, so the actual
	// row count in batches is now less than the original InputBatchRowCount
	hb.InputBatchRowCount = hb.Batches.RowCount()
}

Compliance status legend

🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

qodo-code-review · 2025-11-03T03:16:30Z

PR Code Suggestions ✨

No code suggestions found for the PR.

@ouyuanning

Fixes panic `index out of range [29] with length 29` in `INSERT IGNORE` with duplicate primary keys. **Root cause**: When `Batches.Shrink()` removes duplicate rows (35→23), `InputBatchRowCount` was not updated, causing bitmap initialization with wrong size. **Test**: Added test case with 35 rows (12 duplicates) to verify deduplication works correctly. Approved by: @ouyuanning, @heni02

fix shrink

ad0f18f

aptend requested review from aunjgr, heni02 and ouyuanning as code owners November 3, 2025 03:15

aptend temporarily deployed to ci November 3, 2025 03:15 — with GitHub Actions Inactive

aptend had a problem deploying to ci November 3, 2025 03:15 — with GitHub Actions Error

aptend temporarily deployed to ci November 3, 2025 03:15 — with GitHub Actions Inactive

matrix-meow added the size/M Denotes a PR that changes [100,499] lines label Nov 3, 2025

qodo-code-review Bot added the Review effort 2/5 label Nov 3, 2025

mergify Bot added the kind/bug Something isn't working label Nov 3, 2025

ouyuanning approved these changes Nov 3, 2025

View reviewed changes

heni02 approved these changes Nov 3, 2025

View reviewed changes

mergify Bot added the queued label Nov 3, 2025

Merge branch 'main' into fix-panic-insert-ignore

8f0d607

mergify Bot temporarily deployed to ci November 3, 2025 04:10 Inactive

mergify Bot merged commit 92aa291 into matrixorigin:main Nov 3, 2025
19 checks passed

mergify Bot temporarily deployed to ci November 3, 2025 05:00 Inactive

mergify Bot had a problem deploying to ci November 3, 2025 05:00 Failure

mergify Bot removed the queued label Nov 3, 2025

mergify Bot temporarily deployed to ci November 3, 2025 05:06 Inactive

mergify Bot temporarily deployed to ci November 3, 2025 05:09 Inactive

mergify Bot had a problem deploying to ci November 3, 2025 05:28 Failure

mergify Bot temporarily deployed to ci November 3, 2025 05:41 Inactive

mergify Bot temporarily deployed to ci November 3, 2025 06:18 Inactive

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix insert ignore panic#22754

fix insert ignore panic#22754
mergify[bot] merged 2 commits into
matrixorigin:mainfrom
aptend:fix-panic-insert-ignore

aptend commented Nov 3, 2025 •

edited by qodo-code-review Bot

Loading

Uh oh!

qodo-code-review Bot commented Nov 3, 2025

Uh oh!

qodo-code-review Bot commented Nov 3, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

aptend commented Nov 3, 2025 • edited by qodo-code-review Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

User description

What type of PR is this?

Which issue(s) this PR fixes:

What this PR does / why we need it:

PR Type

Description

Diagram Walkthrough

File Walkthrough

Uh oh!

qodo-code-review Bot commented Nov 3, 2025

PR Compliance Guide 🔍

Uh oh!

qodo-code-review Bot commented Nov 3, 2025

PR Code Suggestions ✨

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

aptend commented Nov 3, 2025 •

edited by qodo-code-review Bot

Loading