Skip to content

fix insert ignore panic#22754

Merged
mergify[bot] merged 2 commits into
matrixorigin:mainfrom
aptend:fix-panic-insert-ignore
Nov 3, 2025
Merged

fix insert ignore panic#22754
mergify[bot] merged 2 commits into
matrixorigin:mainfrom
aptend:fix-panic-insert-ignore

Conversation

@aptend
Copy link
Copy Markdown
Contributor

@aptend aptend commented Nov 3, 2025

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #22733

What this PR does / why we need it:

Fixes panic index out of range [29] with length 29 in INSERT IGNORE with duplicate primary keys.

Root cause: When Batches.Shrink() removes duplicate rows (35→23), InputBatchRowCount was not updated, causing bitmap initialization with wrong size.

Test: Added test case with 35 rows (12 duplicates) to verify deduplication works correctly.


PR Type

Bug fix


Description

  • Fix panic in INSERT IGNORE with duplicate primary keys

  • Update InputBatchRowCount after Batches.Shrink removes duplicates

  • Add comprehensive test case with 35 rows and 12 duplicates


Diagram Walkthrough

flowchart LR
  A["INSERT IGNORE with duplicates"] --> B["Batches.Shrink removes duplicates"]
  B --> C["InputBatchRowCount not updated"]
  C --> D["Bitmap initialization fails"]
  D --> E["Index out of range panic"]
  B --> F["Update InputBatchRowCount"]
  F --> G["Correct bitmap size"]
  G --> H["Successful insert"]
Loading

File Walkthrough

Relevant files
Bug fix
hashmap_util.go
Update InputBatchRowCount after shrinking batches               

pkg/sql/colexec/hashmap_util/hashmap_util.go

  • Update InputBatchRowCount after Batches.Shrink() removes duplicate
    rows
  • Ensures bitmap initialization uses correct row count
  • Prevents index out of range panic when handling INSERT IGNORE with
    duplicates
+4/-0     
Tests
insert_ignore.sql
Add regression test for INSERT IGNORE panic                           

test/distributed/cases/dml/insert/insert_ignore.sql

  • Add test case for INSERT IGNORE with many duplicate primary keys
  • Insert 35 rows with 12 duplicates to reproduce the panic scenario
  • Verify deduplication results in 23 unique rows
  • Include detailed comments explaining the test purpose and expected
    behavior
+58/-0   
insert_ignore.result
Add expected results for INSERT IGNORE test                           

test/distributed/cases/dml/insert/insert_ignore.result

  • Add expected test results for the new INSERT IGNORE test case
  • Verify final row count is 23 after deduplication
  • Confirm panic no longer occurs with duplicate primary keys
+46/-0   

@qodo-code-review
Copy link
Copy Markdown

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Concurrency assumption

Description: Possible logic error risk if concurrent use of hb.Batches or hb.InputBatchRowCount occurs
without synchronization; however, based on context it appears single-threaded in build
path and not an exploitable security issue.
hashmap_util.go [407-416]

Referred Code
if hb.IsDedup && hb.OnDuplicateAction == plan.Node_IGNORE {
	err := hb.Batches.Shrink(hb.IgnoreRows, proc)
	if err != nil {
		return err
	}
	// Update InputBatchRowCount to reflect the actual row count after shrinking
	// This is critical because IgnoreRows removed duplicate rows, so the actual
	// row count in batches is now less than the original InputBatchRowCount
	hb.InputBatchRowCount = hb.Batches.RowCount()
}
Ticket Compliance
🟡
🎫 #22733
🟢 Reproduce and fix the panic "index out of range [29] with length 29" occurring during
INSERT IGNORE with duplicate primary keys.
Ensure the underlying cause related to incorrect row count after deduplication is
addressed so bitmap initialization uses the correct size.
Add a test case that demonstrates the issue with many duplicates and verifies no panic
occurs and the correct final row count is achieved.
Validate via runtime execution that no panic occurs under varied workloads beyond the
provided test (e.g., different batch sizes, data types, and distributed settings).
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No audit logs: The new logic updating hb.InputBatchRowCount after shrink performs a critical data-path
action without any corresponding audit or diagnostic logging, making it unclear in
production when and how many rows were deduplicated.

Referred Code
	// Update InputBatchRowCount to reflect the actual row count after shrinking
	// This is critical because IgnoreRows removed duplicate rows, so the actual
	// row count in batches is now less than the original InputBatchRowCount
	hb.InputBatchRowCount = hb.Batches.RowCount()
}
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Missing context: While shrink errors are returned, the new update to InputBatchRowCount lacks validation or
safeguards (e.g., ensuring row count is non-negative and consistent), and no contextual
error/log is provided if inconsistencies arise.

Referred Code
	// Update InputBatchRowCount to reflect the actual row count after shrinking
	// This is critical because IgnoreRows removed duplicate rows, so the actual
	// row count in batches is now less than the original InputBatchRowCount
	hb.InputBatchRowCount = hb.Batches.RowCount()
}
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@mergify mergify Bot added the kind/bug Something isn't working label Nov 3, 2025
@qodo-code-review
Copy link
Copy Markdown

PR Code Suggestions ✨

No code suggestions found for the PR.

@mergify mergify Bot added the queued label Nov 3, 2025
@mergify mergify Bot merged commit 92aa291 into matrixorigin:main Nov 3, 2025
19 checks passed
@mergify mergify Bot removed the queued label Nov 3, 2025
fengttt pushed a commit to fengttt/matrixone that referenced this pull request May 7, 2026
Fixes panic `index out of range [29] with length 29` in `INSERT IGNORE` with duplicate primary keys.

**Root cause**: When `Batches.Shrink()` removes duplicate rows (35→23), `InputBatchRowCount` was not updated, causing bitmap initialization with wrong size.

**Test**: Added test case with 35 rows (12 duplicates) to verify deduplication works correctly.

Approved by: @ouyuanning, @heni02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/bug Something isn't working Review effort 2/5 size/M Denotes a PR that changes [100,499] lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants