Skip to content

Fix eviction queue race in buffer manager#425

Merged
adsharma merged 1 commit into
mainfrom
fix-large-aggregate-leak-flake
Apr 27, 2026
Merged

Fix eviction queue race in buffer manager#425
adsharma merged 1 commit into
mainfrom
fix-large-aggregate-leak-flake

Conversation

@adsharma

Copy link
Copy Markdown
Contributor

Fixes: #350

Earlier fix was insufficient. There is another race.

@adsharma

Copy link
Copy Markdown
Contributor Author

This commit fixes a race condition where two threads could simultaneously handle the same EVICTED page, causing inconsistent state between the eviction queue and page states. The fix introduces atomic operations and proper versioning to ensure thread safety.

Race Condition Analysis

The issue occurs when multiple threads scan the eviction queue and encounter the same evicted page candidate. Previously, the clear() operation wasn't atomic with respect to the expected candidate value, allowing race conditions where one thread could clear a candidate while another was still processing it.

Key Changes

1. Atomic tryClear Method

Added EvictionQueue::tryClear() which uses compare_exchange_strong to only clear the candidate if it matches the expected value buffer_manager.cpp:64-76 . This prevents clearing a candidate that another thread has already modified.

2. Version Increment in resetToEvicted()

Modified PageState::resetToEvicted() to increment the version when transitioning to EVICTED state page_state.h:87-94 . This ensures that stale reads can be detected after a page has been evicted and potentially reloaded.

3. Safer Eviction Logic

In evictPages(), when handling evicted candidates, the code now:

  • Uses tryClear() instead of clear() to ensure atomicity
  • Calls resetToEvicted() on the page state to properly increment the version buffer_manager.cpp:288-298

In tryEvictPage(), reordered operations to clear the eviction queue before resetting the page state buffer_manager.cpp:445-455 .

Thread Safety Improvements

The fix ensures that:

  1. Only one thread can successfully clear a specific eviction candidate
  2. Page state versions are properly incremented during eviction
  3. The eviction queue and page states remain consistent even under concurrent access

This prevents scenarios where two threads could think they both successfully evicted the same page, leading to memory leaks or corruption.

Notes

The version increment in resetToEvicted() is particularly important because it prevents optimistic reads from returning stale data after a page has been evicted and potentially reloaded by another thread buffer_manager.h:168-172 .

LargeAggregateLeakTest was still flaky after the earlier in-memory lock leak
fix because eviction could race when cleaning up stale queue entries.

evictPages() scans queue slots speculatively. If another thread evicts the
same page in tryEvictPage(), clears the slot, and publishes EVICTED before the
scanner reaches its cleanup path, the scanner can still observe the old
candidate and call EvictionQueue::clear() on a slot that is already empty.
That trips the UNREACHABLE_CODE assertion in clear() and causes the test to
fail instead of returning the expected buffer manager OOM.

Fix the concurrent cleanup path by adding EvictionQueue::tryClear(), which
only clears the slot if it still contains the exact candidate that was
observed during the scan. Keep clear() assertive for owned/single-threaded
paths.

Also tighten the page-state transitions around eviction:
- increment the page-state version in resetToEvicted() to avoid ABA on
  repeated EVICTED transitions
- clear the queue slot before publishing EVICTED in tryEvictPage(), so other
  threads do not observe an EVICTED-but-still-enqueued page

Verified locally with a rebuilt e2e_test and a 20-iteration bounded loop of
agg~hash_leak.LargeAggregateLeakTest, all passing without the assertion or a
hang.
@adsharma adsharma force-pushed the fix-large-aggregate-leak-flake branch from 3d04b98 to 59aec7b Compare April 27, 2026 16:02
@adsharma adsharma merged commit eff87c1 into main Apr 27, 2026
4 checks passed
@adsharma adsharma deleted the fix-large-aggregate-leak-flake branch April 27, 2026 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ci: LargeAggregateLeakTest is flaky

1 participant