Skip to content

Fix allocate of MemoryBlock#17581

Merged
Wei-hao-Li merged 3 commits intomasterfrom
lwh/fixMemoryBlock
Apr 30, 2026
Merged

Fix allocate of MemoryBlock#17581
Wei-hao-Li merged 3 commits intomasterfrom
lwh/fixMemoryBlock

Conversation

@Wei-hao-Li
Copy link
Copy Markdown
Collaborator

@Wei-hao-Li Wei-hao-Li commented Apr 30, 2026

This PR fixes a memory concurrent allocate bug in AtomicLongMemoryBlock:
AtomicLongMemoryBlock.allocate() used AtomicLong.updateAndGet() with a lambda that had an external side effect (result.set(true)).

updateAndGet() is CAS-retry based: when CAS fails, the lambda can be executed multiple times. Under contention, this sequence could happen:

  1. First lambda run sees enough memory, sets result=true, computes next.
  2. CAS fails because another thread changed usedMemoryInBytes.
  3. Lambda runs again with a new value and may decide allocation should fail (no increment).
  4. Method still returns result=true from the earlier side effect.

So the caller may observe “allocation succeeded” while usedMemoryInBytes was not actually increased (phantom allocation success).
After enough such mismatches, release paths can try to release more than recorded memory, triggering warnings like “release is larger than memory cost.”

Signed-off-by: Weihao Li <18110526956@163.com>
Signed-off-by: Weihao Li <18110526956@163.com>
Copy link
Copy Markdown
Contributor

@JackieTien97 JackieTien97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

The core fix is correct — replacing updateAndGet + AtomicBoolean side-effect with a hand-written CAS loop eliminates the phantom allocation bug. The ramBytesUsed() fix for the missing queryId field is also correct and relevant since RamUsageEstimator.sizeOfObject() delegates to ramBytesUsed() for Accountable objects.

Two minor suggestions as inline comments.

return memCost;
}
result.set(true);
return memCost + sizeInByte;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

release() 方法中的 LOGGER.warn() 同样是 updateAndGet lambda 内的副作用,CAS 重试时可能导致:

  1. 虚假告警:第一次 lambda 执行时 sizeInByte > memCost 触发 WARN,CAS 失败后重试时 memCost 已变大(其他线程 allocate 了),release 实际成功 — 但 WARN 已经打出
  2. 重复告警:同一次 release() 调用打出多条 WARN

不影响正确性,但建议也改为手写 CAS 循环,与 allocate() 风格保持一致:

public long release(long sizeInByte) {
    long prev, next;
    do {
        prev = usedMemoryInBytes.get();
        if (sizeInByte > prev) {
            LOGGER.warn("The memory cost to be released is larger than the memory cost of memory block {}", this);
            next = 0;
        } else {
            next = prev - sizeInByte;
        }
    } while (!usedMemoryInBytes.compareAndSet(prev, next));
    return next;
}

@JackieTien97
Copy link
Copy Markdown
Contributor

补充一个小建议:

release() 方法中的 LOGGER.warn() 同样是 updateAndGet lambda 内的副作用。虽然不影响正确性,但 CAS 重试时可能出现虚假告警(第一次 lambda 执行 sizeInByte > memCost 触发 WARN,CAS 失败重试时 memCost 已变大导致 release 实际成功,但 WARN 已打出)或同一次 release 调用打出多条 WARN。建议也改为手写 CAS 循环,与 allocate() 风格一致:

public long release(long sizeInByte) {
    long prev, next;
    do {
        prev = usedMemoryInBytes.get();
        if (sizeInByte > prev) {
            LOGGER.warn("The memory cost to be released is larger than the memory cost of memory block {}", this);
            next = 0;
        } else {
            next = prev - sizeInByte;
        }
    } while (!usedMemoryInBytes.compareAndSet(prev, next));
    return next;
}

Signed-off-by: Weihao Li <18110526956@163.com>
@Wei-hao-Li Wei-hao-Li merged commit 1cec558 into master Apr 30, 2026
27 checks passed
@JackieTien97 JackieTien97 deleted the lwh/fixMemoryBlock branch April 30, 2026 03:23
@sonarqubecloud
Copy link
Copy Markdown

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 30, 2026

Codecov Report

❌ Patch coverage is 95.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 39.96%. Comparing base (b7eb193) to head (1cec558).
⚠️ Report is 7 commits behind head on master.

Files with missing lines Patch % Lines
.../apache/iotdb/db/queryengine/plan/Coordinator.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17581      +/-   ##
============================================
- Coverage     39.96%   39.96%   -0.01%     
  Complexity     2547     2547              
============================================
  Files          5175     5175              
  Lines        348396   348417      +21     
  Branches      44531    44537       +6     
============================================
+ Hits         139239   139243       +4     
- Misses       209157   209174      +17     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants