fix: prevent OOM in PKPersistedBetween under high-concurrency lock checks (cherry-pick to 4.0-dev) #24380
Merged
heni02 merged 3 commits on May 13, 2026
…ecks

Under 1000 concurrent TPCC NEW_ORDER transactions, each LockOp triggers PKPersistedBetween to verify primary key conflicts. This path loads object metadata, bloom filters, and block PK columns for all objects changed since the transaction's snapshot. With many concurrent writers on the same table, this produces a thundering herd of block I/O that exhausts mpool capacity.

Three mitigations:
1. Early exit when changed objects exceed threshold (64): conservatively returns "may be modified" to avoid loading hundreds of object meta/BF.
2. Early exit when candidate blocks exceed threshold (32): avoids reading too many blocks that passed zonemap + bloom filter checks.
3. Global semaphore (capacity 16) on the block I/O phase: caps peak concurrent mpool allocations from LoadColumns.

Also extends the secondary index selectivity guard in getIndexForNonEquiCond to cover in_range operators (previously only single range ops were guarded), preventing non-selective in_range conditions from triggering full index table scans on large tables.

Fixes matrixorigin#24348

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- TestPkCheckSemaphore_LimitsConcurrency: verifies the global semaphore actually limits concurrent goroutines to its capacity (16)
- TestPkCheckSemaphore_RespectsContextCancellation: verifies a cancelled context causes immediate return without blocking
- TestIsRangeOp: verifies the expanded range operator classification covers in_range in addition to single comparison ops

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
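To make the range operator classification concrete, here is a hedged sketch of what a check like the one covered by TestIsRangeOp might look like. Only `in_range` is named in the commit message; the comparison operator spellings, the `skipIndexForCond` helper, and its selectivity parameters are assumptions for illustration and may not match `apply_indices.go`.

```go
package main

import "fmt"

// isRangeOp reports whether a function name denotes a range-style predicate.
// The single comparison spellings are assumed; "in_range" is from the commit.
func isRangeOp(name string) bool {
	switch name {
	case "<", "<=", ">", ">=", "in_range":
		return true
	default:
		return false
	}
}

// skipIndexForCond is a hypothetical selectivity guard: a range-style
// condition that matches too large a fraction of rows should not use the
// secondary index, since that would scan most of the index table.
func skipIndexForCond(op string, selectivity, maxSelectivity float64) bool {
	return isRangeOp(op) && selectivity > maxSelectivity
}

func main() {
	for _, op := range []string{"<", "in_range", "="} {
		fmt.Printf("%s range=%v skip=%v\n",
			op, isRangeOp(op), skipIndexForCond(op, 0.5, 0.1))
	}
}
```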
1. [Must Fix] Semaphore scope narrowed to candidateBlks loop only.
Previously used `defer` which held the slot through tombstone I/O,
causing unintended rate-limiting. Now explicitly released after the
block loop, so tombstonePKExistsInRange is not throttled.
2. [Should Fix] Counter semantics: bail-out paths now use a dedicated
TxnPKChangeCheckBailoutCounter ("bailout" label) instead of reusing
TxnPKChangeCheckChangedCounter. Dashboards can distinguish real
conflicts from protective early exits.
3. [Should Fix] IO counter moved after semaphore acquire. Previously
incremented before the select, so ctx cancellation would produce a
phantom IO count.
4. [Should Fix] Objects cap now only counts cObjs (inserted objects),
not delObjs. Deleted objects don't contain conflicting PKs and
shouldn't inflate the threshold check.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ouyuanning
approved these changes
May 13, 2026
aptend
approved these changes
May 13, 2026
XuPeng-SH
approved these changes
May 13, 2026
gouhongshen
approved these changes
May 13, 2026
What type of PR is this?
Which issue(s) does this PR fix or relate to?
Cherry-pick of #24373 to 4.0-dev branch.
Fixes #24348
What this PR does / why we need it:
Under 1000 concurrent TPCC NEW_ORDER transactions, each `LockOp` triggers `PKPersistedBetween` to verify primary key conflicts. This path loads object metadata, bloom filters, and block PK columns for all objects changed since the transaction's snapshot. With many concurrent writers on the same table (e.g., `stock`), this produces a thundering herd of block I/O that exhausts mpool capacity, leading to OOM.

Changes

- `pkg/vm/engine/disttae/txn_table.go`: three mitigations in `PKPersistedBetween`:
  1. Early exit when changed objects exceed the threshold (64).
  2. Early exit when candidate blocks exceed the threshold (32).
  3. Global semaphore (capacity 16) around the block I/O phase (`LoadColumns`).

  The early exits return `true` to skip expensive I/O.
- `pkg/sql/plan/apply_indices.go`: extend the selectivity guard in `getIndexForNonEquiCond` to cover `in_range` operators.

🤖 Generated with Claude Code