Skip to content

Replace pthread mutex with coro::Mutex in CursorManager#416

Merged
xiexiaoy merged 1 commit intoeloqdata:mainfrom
xiexiaoy:fix_cursor_manager_mutex
Jan 9, 2026
Merged

Replace pthread mutex with coro::Mutex in CursorManager#416
xiexiaoy merged 1 commit intoeloqdata:mainfrom
xiexiaoy:fix_cursor_manager_mutex

Conversation

@xiexiaoy
Copy link
Collaborator

@xiexiaoy xiexiaoy commented Jan 8, 2026

When scan range more than one batch, client use cursor to get next batch. Collection owns a CursorManager, which maps cursor id to cursor object. In Mongo, CursorManager is protected by partitioned pthread mutex. In EloqDoc, command is executed in a coroutine, those pthread mutex should be replaced with corotine mutex.

Summary by CodeRabbit

  • Refactor
    • Optimized internal cursor and executor management systems to enhance concurrent operation handling and improve overall system efficiency in high-concurrency environments.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Jan 8, 2026

Walkthrough

This pull request introduces conditional compilation based on D_USE_CORO_SYNC to enable alternative synchronization mechanisms across the query execution and cursor management subsystems. The changes replace partitioned mutex structures and registration tokens with coro-aware equivalents when the flag is enabled.

Changes

Cohort / File(s) Summary
Synchronization Primitive Abstraction
src/mongo/db/commands/authentication_commands.cpp
License header formatting standardized; conditional includes switch between mutex.h and coro_sync.h based on D_USE_CORO_SYNC. Runtime nonce generation now uses implicit lock guard template deduction instead of explicit SimpleMutex type; mutex member type changes from SimpleMutex to coro::Mutex under coro-sync mode.
Cursor Manager Public API Refactoring
src/mongo/db/cursor_manager.h, src/mongo/db/cursor_manager.cpp
registerExecutor() signature changes: non-coro-sync path returns PartitionId, coro-sync path returns void. Internal data structures switch from Partitioned containers with SimpleMutex _registrationLock to flat unordered_set/unordered_map with separate coro::Mutex members. Method deregisterAndDestroyCursor() conditionally hidden under coro-sync mode.
Plan Executor Registration State Management
src/mongo/db/query/plan_executor.h, src/mongo/db/query/plan_executor.cpp
Registration tracking replaces token-based approach: non-coro-sync retains _registrationToken; coro-sync introduces boolean _registrated flag. Methods unsetRegistered(), getRegistrationToken(), and setRegistrationToken() branch execution paths or mark as unreachable based on compilation mode.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • lzxddz

Poem

🐰 A tale of two paths, now gated with care,
Coro and mutex in conditional pair,
Tokens to flags, partitions laid bare,
One code, two dances—sync everywhere!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 7.89% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: replacing pthread mutexes with coro::Mutex in CursorManager, which is the core objective across all modified files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@xiexiaoy xiexiaoy linked an issue Jan 8, 2026 that may be closed by this pull request
@xiexiaoy xiexiaoy force-pushed the fix_cursor_manager_mutex branch 3 times, most recently from 3d70e15 to 3a5239f Compare January 8, 2026 10:59
@xiexiaoy xiexiaoy force-pushed the fix_cursor_manager_mutex branch from 3a5239f to fcf6a82 Compare January 8, 2026 11:07
@xiexiaoy xiexiaoy marked this pull request as ready for review January 8, 2026 11:07
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @src/mongo/db/cursor_manager.cpp:
- Around line 824-826: CursorManager::numCursors currently calls
_cursorMap.size(), but in the non-coro build _cursorMap is a
std::unique_ptr<Partitioned<...>> so you must dereference it; change the call to
use _cursorMap->size() to invoke Partitioned::size() on the pointed-to object
(refer to CursorManager::numCursors and _cursorMap/Partitioned).
- Line 438: The destructor currently calls invariant(_cursorMap.empty()) which
fails in the non-coro build because _cursorMap is a std::unique_ptr; change the
call to invariant(_cursorMap->empty()) so it dereferences the unique_ptr and
calls Partitioned::empty(); update the destructor code that references
_cursorMap.empty() to use _cursorMap->empty() to match other usages like
_cursorMap->lockAllPartitions().
🧹 Nitpick comments (1)
src/mongo/db/cursor_manager.h (1)

303-318: Consider updating the comment to document coro-sync lock ordering.

The existing comment (lines 303-315) describes the lock ordering for the non-coro path with _registrationLock, _registeredPlanExecutors, and _cursorMap mutexes. In the coro-sync path, _registrationLock is removed and replaced with _registeredPlanExecutorsMutex and _cursorMapMutex. Consider adding a note about the simplified lock ordering in the coro-sync path for maintainability.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ede59db and fcf6a82.

📒 Files selected for processing (5)
  • src/mongo/db/commands/authentication_commands.cpp
  • src/mongo/db/cursor_manager.cpp
  • src/mongo/db/cursor_manager.h
  • src/mongo/db/query/plan_executor.cpp
  • src/mongo/db/query/plan_executor.h
🧰 Additional context used
🧬 Code graph analysis (3)
src/mongo/db/query/plan_executor.h (1)
src/mongo/db/query/plan_executor.cpp (4)
  • reset (265-311)
  • reset (265-272)
  • PlanExecutor (228-263)
  • PlanExecutor (344-346)
src/mongo/db/cursor_manager.cpp (1)
src/mongo/db/cursor_manager.h (1)
  • CursorManager (85-332)
src/mongo/db/cursor_manager.h (1)
src/mongo/db/coro_sync.h (2)
  • coro (30-115)
  • Mutex (42-55)
🪛 Cppcheck (2.19.0)
src/mongo/db/cursor_manager.cpp

[error] 161-161: #error "Unsupported compiler family"

(preprocessorErrorDirective)

🔇 Additional comments (20)
src/mongo/db/query/plan_executor.cpp (3)

252-257: LGTM - Registration logic is correctly branched.

The coro-sync path properly registers the executor and sets the _registrated flag, while the non-coro path continues using token-based registration. This aligns with the CursorManager API changes.


283-287: LGTM - Reset correctly clears registration state.

The reset method properly clears _registrationToken or _registrated depending on the build configuration before re-registering.


738-754: LGTM - Dispose logic correctly handles both paths.

The coro-sync path checks _registrated instead of _registrationToken, deregisters the executor, and properly clears the flag to prevent double-deregistration.

src/mongo/db/commands/authentication_commands.cpp (2)

61-65: LGTM - Conditional include correctly selects synchronization header.

The include guards properly select the appropriate mutex implementation based on the D_USE_CORO_SYNC flag.


229-238: LGTM - Mutex type correctly switched for coro-sync mode.

The _randMutex member type is appropriately switched between SimpleMutex and coro::Mutex. The stdx::lock_guard without explicit template parameter relies on CTAD and should work correctly with both mutex types.

src/mongo/db/cursor_manager.cpp (9)

113-118: LGTM - GlobalCursorIdCache mutex type correctly switched.

The mutex type for _mutex is appropriately switched between SimpleMutex and coro::Mutex based on the D_USE_CORO_SYNC flag.


427-433: LGTM - CursorManager constructor correctly initializes _cursorMap.

The coro-sync path initializes _cursorMap as a direct stdx::unordered_map while the non-coro path uses a unique_ptr<Partitioned<...>>. The initialization is properly branched.


455-474: LGTM - invalidateAll executor iteration correctly handles both paths.

The coro-sync path iterates directly over _registeredPlanExecutors with a lock guard, while the non-coro path uses partitioned locking. Both paths correctly mark executors as killed.


624-637: LGTM - registerExecutor correctly branched for both paths.

The coro-sync path uses a simpler void return with direct set insertion under lock, while the non-coro path continues with token-based registration. This aligns with the PlanExecutor changes.


674-683: LGTM - pinCursor cleanup correctly handles killed cursor disposal.

In the coro-sync path, the cursor is properly removed from the map, the lock is released before disposal, and the error is returned. This avoids holding locks during potentially expensive disposal operations.


882-888: LGTM - registerCursor correctly uses appropriate locking.

The coro-sync path uses _cursorMapMutex for the critical section, while the non-coro path uses _registrationLock. Both protect cursor ID allocation and map insertion atomically.


910-915: LGTM - deregisterCursor correctly handles both paths.

The coro-sync path acquires _cursorMapMutex before erasing, while the non-coro path relies on the Partitioned container's internal locking via the erase method.


917-932: LGTM - deregisterAndDestroyCursor correctly gated behind non-coro path.

This helper is only needed in the non-coro path where partitioned locking is used. The coro-sync path inlines the erase and dispose logic directly at call sites.


564-574: No deadlock risk: Lock acquisition follows the documented ordering.

The code at lines 564-569 correctly acquires _registeredPlanExecutorsMutex before _cursorMapMutex, which aligns with the explicit lock ordering documented in the header file (lines 306-315): "_registeredPlanExecutors" mutexes must be acquired before "_cursorMap" mutexes. No other code path acquires these locks in the opposite order, and invalidateDocument is the only function holding both locks simultaneously. The sequential acquisition is safe and follows established protocol.

Likely an incorrect or invalid review comment.

src/mongo/db/query/plan_executor.h (3)

474-480: LGTM - unsetRegistered correctly handles both paths.

Properly clears the registration token in non-coro path and sets _registrated = false in coro path.


482-501: LGTM - Token methods correctly marked unreachable in coro path.

The getRegistrationToken() and setRegistrationToken() methods correctly use MONGO_UNREACHABLE in the coro-sync path since token-based registration is not used. This will catch any erroneous calls at runtime.


616-622: LGTM - Member declaration correctly branched.

The _registrationToken optional is replaced with a simple _registrated boolean in the coro-sync path, properly initialized to false.

src/mongo/db/cursor_manager.h (3)

46-50: LGTM - Conditional includes correctly select synchronization primitives.

The header correctly includes either the traditional mutex header or the coroutine-aware mutex header based on the D_USE_CORO_SYNC flag.


151-156: LGTM - registerExecutor API correctly branched.

The coro-sync path returns void since token-based registration is not used, while the non-coro path retains the PartitionId return type.


320-331: LGTM - Data members correctly branched for coro-sync mode.

The coro-sync path replaces the partitioned containers with simple mutex-protected containers (_registeredPlanExecutorsMutex + unordered_set and _cursorMapMutex + unordered_map). This is appropriate for coroutine-based execution where the partitioning optimization may not be needed.

// All cursors and PlanExecutors should have been deleted already.
invariant(_registeredPlanExecutors.empty());
invariant(_cursorMap->empty());
invariant(_cursorMap.empty());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's see the context around line 438 in cursor_manager.cpp
head -n 450 src/mongo/db/cursor_manager.cpp | tail -n 20

Repository: eloqdata/eloqdoc

Length of output: 749


🏁 Script executed:

#!/bin/bash
# Search for _cursorMap declarations and conditionals
rg -n "_cursorMap" src/mongo/db/cursor_manager.cpp | head -30

Repository: eloqdata/eloqdoc

Length of output: 1723


🏁 Script executed:

#!/bin/bash
# Check the structure of the file to understand coro vs non-coro paths
rg -n "ifdef.*CORO\|if.*coro\|Partitioned" src/mongo/db/cursor_manager.cpp -i | head -20

Repository: eloqdata/eloqdoc

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Find the Partitioned class definition
fd -e h -e hpp | xargs rg -l "class Partitioned" | head -5

Repository: eloqdata/eloqdoc

Length of output: 99


🏁 Script executed:

#!/bin/bash
# Find the header file for CursorManager
fd -name "cursor_manager.h" src/

Repository: eloqdata/eloqdoc

Length of output: 290


🏁 Script executed:

#!/bin/bash
# Check _cursorMap member declaration and any conditionals in the header
rg -n "_cursorMap" src/mongo/db/cursor_manager.h -B2 -A2

Repository: eloqdata/eloqdoc

Length of output: 1652


🏁 Script executed:

#!/bin/bash
# Check the Partitioned class definition for empty() method
cat src/mongo/db/catalog/util/partitioned.h | head -100

Repository: eloqdata/eloqdoc

Length of output: 4019


🏁 Script executed:

#!/bin/bash
# Search for empty() method in Partitioned class
rg -n "empty\(\)" src/mongo/db/catalog/util/partitioned.h

Repository: eloqdata/eloqdoc

Length of output: 293


🏁 Script executed:

#!/bin/bash
# Check the ifdef guards to understand the conditional compilation
rg -B10 "unique_ptr.*Partitioned" src/mongo/db/cursor_manager.h | head -30

Repository: eloqdata/eloqdoc

Length of output: 588


🏁 Script executed:

#!/bin/bash
# Get more context on the Partitioned empty() method signature
sed -n '200,220p' src/mongo/db/catalog/util/partitioned.h

Repository: eloqdata/eloqdoc

Length of output: 918


🏁 Script executed:

#!/bin/bash
# Check if there are any operator overloads in unique_ptr that might allow .empty()
# or check if the code actually compiles with both paths
rg "ifdef.*CORO" src/mongo/db/cursor_manager.h -B5 -A20 | head -50

Repository: eloqdata/eloqdoc

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Verify if unique_ptr has an empty() method or what it does
rg -A5 "class unique_ptr" /usr/include/c++* 2>/dev/null | head -20 || echo "Checking MongoDB code for how unique_ptr is used with empty()"
rg "unique_ptr.*\.empty\(\)" src/ | head -5

Repository: eloqdata/eloqdoc

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Check if there are other similar patterns in cursor_manager.cpp to understand the fix
rg "_cursorMap\->" src/mongo/db/cursor_manager.cpp | head -3

Repository: eloqdata/eloqdoc

Length of output: 264


Change _cursorMap.empty() to _cursorMap->empty() in the destructor.

In the non-coro path, _cursorMap is std::unique_ptr<Partitioned<...>>, so calling _cursorMap.empty() directly will not compile. The -> dereference operator is needed to access the Partitioned::empty() method. The coro path uses a direct unordered_map, which explains why this issue only manifests in non-coro builds. Other usages in the file correctly use _cursorMap-> syntax (e.g., _cursorMap->lockAllPartitions()).

🤖 Prompt for AI Agents
In @src/mongo/db/cursor_manager.cpp at line 438, The destructor currently calls
invariant(_cursorMap.empty()) which fails in the non-coro build because
_cursorMap is a std::unique_ptr; change the call to
invariant(_cursorMap->empty()) so it dereferences the unique_ptr and calls
Partitioned::empty(); update the destructor code that references
_cursorMap.empty() to use _cursorMap->empty() to match other usages like
_cursorMap->lockAllPartitions().

Comment on lines 824 to 826
size_t CursorManager::numCursors() const {
return _cursorMap->size();
return _cursorMap.size();
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, check the Partitioned header for size() method
rg -n "size\(\)" src/mongo/db/catalog/util/partitioned.h -B2 -A5

Repository: eloqdata/eloqdoc

Length of output: 1627


🏁 Script executed:

# Check the CursorManager header to see _cursorMap declaration
rg -n "_cursorMap" src/mongo/db/cursor_manager.h -B2 -A2

Repository: eloqdata/eloqdoc

Length of output: 1652


🏁 Script executed:

# Check the current numCursors implementation in cursor_manager.cpp
rg -n "numCursors" src/mongo/db/cursor_manager.cpp -B2 -A5

Repository: eloqdata/eloqdoc

Length of output: 260


🏁 Script executed:

# Check if there are conditional compilation directives for coro/non-coro paths
rg -n "ENABLE_EXECUTION_CONTROL_FLOW|MONGO_CORO" src/mongo/db/cursor_manager.h | head -20

Repository: eloqdata/eloqdoc

Length of output: 42


Fix compilation error: _cursorMap.size() requires arrow operator in non-coro path.

In the non-coro path, _cursorMap is std::unique_ptr<Partitioned<...>>. Calling .size() on the unique_ptr directly will not compile; use _cursorMap->size() to dereference and access the pointed-to object's size() method. The Partitioned class correctly implements size() by accumulating sizes across all partitions, but the accessor syntax must match the type being used.

🤖 Prompt for AI Agents
In @src/mongo/db/cursor_manager.cpp around lines 824 - 826,
CursorManager::numCursors currently calls _cursorMap.size(), but in the non-coro
build _cursorMap is a std::unique_ptr<Partitioned<...>> so you must dereference
it; change the call to use _cursorMap->size() to invoke Partitioned::size() on
the pointed-to object (refer to CursorManager::numCursors and
_cursorMap/Partitioned).

@xiexiaoy xiexiaoy requested a review from lzxddz January 9, 2026 03:06
@xiexiaoy xiexiaoy merged commit 4122e2a into eloqdata:main Jan 9, 2026
3 checks passed
@xiexiaoy xiexiaoy deleted the fix_cursor_manager_mutex branch January 9, 2026 03:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: IndexScan throughput downgrade to ~zero

2 participants