Skip to content

fix: prevent OOM in PKPersistedBetween under high-concurrency lock checks (cherry-pick to 4.0-dev)#24380

Merged
heni02 merged 3 commits into
matrixorigin:4.0-devfrom
jiangxinmeng1:cherry-pick/tpcc-oom-4.0-dev
May 13, 2026
Merged

fix: prevent OOM in PKPersistedBetween under high-concurrency lock checks (cherry-pick to 4.0-dev)#24380
heni02 merged 3 commits into
matrixorigin:4.0-devfrom
jiangxinmeng1:cherry-pick/tpcc-oom-4.0-dev

Conversation

@jiangxinmeng1
Copy link
Copy Markdown
Contributor

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) does this PR fix or relate to?

Cherry-pick of #24373 to 4.0-dev branch.

Fixes #24348

What this PR does / why we need it:

Under 1000 concurrent TPCC NEW_ORDER transactions, each LockOp triggers PKPersistedBetween to verify primary key conflicts. This path loads object metadata, bloom filters, and block PK columns for all changed objects since the transaction's snapshot. With many concurrent writers on the same table (e.g., stock), this produces a thundering-herd of block I/O that exhausts mpool capacity → OOM.

Changes

pkg/vm/engine/disttae/txn_table.go — Three mitigations in PKPersistedBetween:

Mechanism Threshold Effect
Changed objects cap 64 Conservatively return true to skip expensive I/O
Candidate blocks cap 32 Avoid reading too many blocks that passed zonemap + BF
Global semaphore 16 concurrent Cap peak mpool allocations from LoadColumns

pkg/sql/plan/apply_indices.go — Extend selectivity guard in getIndexForNonEquiCond to cover in_range operators.

🤖 Generated with Claude Code

jiangxinmeng1 and others added 3 commits May 13, 2026 17:54
…ecks

Under 1000 concurrent TPCC NEW_ORDER transactions, each LockOp triggers
PKPersistedBetween to verify primary key conflicts. This path loads object
metadata, bloom filters, and block PK columns for all changed objects since
the transaction's snapshot. With many concurrent writers on the same table,
this produces a thundering-herd of block I/O that exhausts mpool capacity.

Three mitigations:
1. Early exit when changed objects exceed threshold (64) — conservatively
   returns "may be modified" to avoid loading hundreds of object meta/BF.
2. Early exit when candidate blocks exceed threshold (32) — avoids reading
   too many blocks that passed zonemap + bloom filter checks.
3. Global semaphore (capacity 16) on the block I/O phase — caps peak
   concurrent mpool allocations from LoadColumns.

Also extends the secondary index selectivity guard in getIndexForNonEquiCond
to cover in_range operators (previously only single range ops were guarded),
preventing non-selective in_range conditions from triggering full index table
scans on large tables.

Fixes matrixorigin#24348

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- TestPkCheckSemaphore_LimitsConcurrency: verifies the global semaphore
  actually limits concurrent goroutines to its capacity (16)
- TestPkCheckSemaphore_RespectsContextCancellation: verifies cancelled
  context causes immediate return without blocking
- TestIsRangeOp: verifies the expanded range operator classification
  covers in_range in addition to single comparison ops

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1. [Must Fix] Semaphore scope narrowed to candidateBlks loop only.
   Previously used `defer` which held the slot through tombstone I/O,
   causing unintended rate-limiting. Now explicitly released after the
   block loop, so tombstonePKExistsInRange is not throttled.

2. [Should Fix] Counter semantics: bail-out paths now use a dedicated
   TxnPKChangeCheckBailoutCounter ("bailout" label) instead of reusing
   TxnPKChangeCheckChangedCounter. Dashboards can distinguish real
   conflicts from protective early exits.

3. [Should Fix] IO counter moved after semaphore acquire. Previously
   incremented before the select, so ctx cancellation would produce a
   phantom IO count.

4. [Should Fix] Objects cap now only counts cObjs (inserted objects),
   not delObjs. Deleted objects don't contain conflicting PKs and
   shouldn't inflate the threshold check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@qodo-code-review
Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@heni02 heni02 merged commit 9217573 into matrixorigin:4.0-dev May 13, 2026
22 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/bug Something isn't working size/M Denotes a PR that changes [100,499] lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants