Fix eviction queue race in buffer manager#425
Conversation
|
This commit fixes a race condition where two threads could simultaneously handle the same EVICTED page, causing inconsistent state between the eviction queue and page states. The fix introduces atomic operations and proper versioning to ensure thread safety. Race Condition AnalysisThe issue occurs when multiple threads scan the eviction queue and encounter the same evicted page candidate. Previously, the clear() operation wasn't atomic with respect to the expected candidate value, allowing race conditions where one thread could clear a candidate while another was still processing it. Key Changes1. Atomic tryClear MethodAdded EvictionQueue::tryClear() which uses compare_exchange_strong to only clear the candidate if it matches the expected value buffer_manager.cpp:64-76 . This prevents clearing a candidate that another thread has already modified. 2. Version Increment in resetToEvicted()Modified PageState::resetToEvicted() to increment the version when transitioning to EVICTED state page_state.h:87-94 . This ensures that stale reads can be detected after a page has been evicted and potentially reloaded. 3. Safer Eviction LogicIn evictPages(), when handling evicted candidates, the code now:
In tryEvictPage(), reordered operations to clear the eviction queue before resetting the page state buffer_manager.cpp:445-455 . Thread Safety ImprovementsThe fix ensures that:
This prevents scenarios where two threads could think they both successfully evicted the same page, leading to memory leaks or corruption. NotesThe version increment in resetToEvicted() is particularly important because it prevents optimistic reads from returning stale data after a page has been evicted and potentially reloaded by another thread buffer_manager.h:168-172 . |
LargeAggregateLeakTest was still flaky after the earlier in-memory lock leak fix because eviction could race when cleaning up stale queue entries. evictPages() scans queue slots speculatively. If another thread evicts the same page in tryEvictPage(), clears the slot, and publishes EVICTED before the scanner reaches its cleanup path, the scanner can still observe the old candidate and call EvictionQueue::clear() on a slot that is already empty. That trips the UNREACHABLE_CODE assertion in clear() and causes the test to fail instead of returning the expected buffer manager OOM. Fix the concurrent cleanup path by adding EvictionQueue::tryClear(), which only clears the slot if it still contains the exact candidate that was observed during the scan. Keep clear() assertive for owned/single-threaded paths. Also tighten the page-state transitions around eviction: - increment the page-state version in resetToEvicted() to avoid ABA on repeated EVICTED transitions - clear the queue slot before publishing EVICTED in tryEvictPage(), so other threads do not observe an EVICTED-but-still-enqueued page Verified locally with a rebuilt e2e_test and a 20-iteration bounded loop of agg~hash_leak.LargeAggregateLeakTest, all passing without the assertion or a hang.
3d04b98 to
59aec7b
Compare
Fixes: #350
Earlier fix was insufficient. There is another race.