-
Notifications
You must be signed in to change notification settings - Fork 0
Handle Unity reload gaps by retrying plugin session resolution #98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughThis PR adds domain reload resilience to the MCP server by implementing a bounded-wait retry loop for plugin session resolution, updating response handling to support MCPResponse objects with hints, and introducing comprehensive integration and unit tests to validate behavior during Unity domain reloads. Changes
Sequence DiagramsequenceDiagram
participant Client as Client/Plugin
participant Server as MCP Server
participant PluginHub as PluginHub
participant Registry as PluginRegistry
Client->>Server: send_command_with_retry(command)
Server->>PluginHub: _resolve_session_id()
alt Session Available
PluginHub->>PluginHub: _try_once() finds session
PluginHub-->>Server: return session
Server->>Client: execute command
Client-->>Server: response
else Domain Reload In Progress
PluginHub->>PluginHub: _try_once() no session
PluginHub->>PluginHub: start deadline-based wait loop
Note over PluginHub: poll with sleep_seconds interval<br/>until deadline or session appears
par Waiting for Reconnection
Client->>Client: Unity domain reload
Client->>Server: reconnect
end
PluginHub->>Registry: check for reconnected session
alt Reconnected in Time
PluginHub->>PluginHub: session restored
PluginHub-->>Server: return session (with log)
Server->>Client: retry command
Client-->>Server: response
else Timeout Exceeded
PluginHub->>PluginHub: raise RuntimeError
PluginHub-->>Server: error
Server-->>Client: fail with timeout error
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
|
@coderabbitai review this please |
|
@coderabbitai please review this |
|
@dsarno, I'll review the changes in this PR for you. This looks like an important enhancement to handle Unity domain reload scenarios more robustly with retry logic and improved detection mechanisms. ✅ Actions performedReview triggered.
|
|
@dsarno, I'll review the changes in this PR. This enhancement adds important retry logic for handling Unity domain reload scenarios, which should make the plugin more robust when Unity reloads assemblies. ✅ Actions performedReview triggered.
|
This wasn't already reviewed. IT needs a first review. |
Greptile OverviewGreptile SummaryThis PR implements retry logic to gracefully handle Unity domain reload gaps. When Unity reloads its domain, the WebSocket plugin disconnects and reconnects within seconds. Previously, MCP calls during this window would fail immediately with "No Unity plugins are currently connected" errors. Now, Key Changes:
Testing Coverage:
Confidence Score: 4/5
Important Files ChangedFile Analysis
Sequence DiagramsequenceDiagram
participant Client as MCP Client
participant PluginHub as PluginHub
participant Registry as PluginRegistry
participant Unity as Unity Plugin
Note over Unity: Domain Reload Starts
Unity->>PluginHub: Disconnect WebSocket
PluginHub->>Registry: unregister(session_id)
Client->>PluginHub: send_command_for_instance()
PluginHub->>PluginHub: _resolve_session_id()
PluginHub->>Registry: get_session_id_by_hash(unity_instance)
Registry-->>PluginHub: None (no session)
PluginHub->>Registry: list_sessions()
Registry-->>PluginHub: {} (empty)
Note over PluginHub: Start retry loop<br/>(max_retries × sleep_seconds)
loop Retry until session appears or timeout
PluginHub->>PluginHub: await asyncio.sleep(sleep_seconds)
PluginHub->>Registry: list_sessions()
Registry-->>PluginHub: {} (still empty)
end
Note over Unity: Domain Reload Completes
Unity->>PluginHub: Connect WebSocket
Unity->>PluginHub: Register (session_id, project_hash)
PluginHub->>Registry: register(session_id, project_name, project_hash)
PluginHub->>Registry: list_sessions()
Registry-->>PluginHub: {session_id: PluginSession}
PluginHub-->>PluginHub: session_id found!
PluginHub->>Unity: send_command(session_id, command_type, params)
Unity-->>PluginHub: command result
PluginHub-->>Client: success response
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
6 files reviewed, 1 comment
| return next(iter(sessions.keys())) | ||
|
|
||
| session_id = await _try_once() | ||
| deadline = time.monotonic() + (max_retries * sleep_seconds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: the deadline calculation should account for the time taken by the first _try_once() call
currently if _try_once() takes significant time, the retry loop gets less time than intended
| deadline = time.monotonic() + (max_retries * sleep_seconds) | |
| session_id = await _try_once() | |
| wait_started_monotonic = time.monotonic() | |
| deadline = wait_started_monotonic + (max_retries * sleep_seconds) |
Prompt To Fix With AI
This is a comment left during a code review.
Path: Server/plugin_hub.py
Line: 240:240
Comment:
**logic:** the deadline calculation should account for the time taken by the first `_try_once()` call
currently if `_try_once()` takes significant time, the retry loop gets less time than intended
```suggestion
session_id = await _try_once()
wait_started_monotonic = time.monotonic()
deadline = wait_started_monotonic + (max_retries * sleep_seconds)
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is being reviewed by Cursor Bugbot
Details
You are on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle.
To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.
TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/DomainReloadResilienceTests.cs.meta
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (7)
Server/models.py (1)
11-13: MCPResponse.hint extension looks good; consider documenting expected valuesThe optional
hintfield is a clean, backward-compatible way to surface client guidance like"retry". To keep things self-documenting as more hints are introduced, consider centralizing/documenting the allowed values (e.g., in a docstring or constants/enum) so downstream code doesn’t rely on ad-hoc string literals.Server/plugin_hub.py (1)
209-215: Session resolution wait/retry loop looks solid; minor docstring/lint cleanupThe new
_resolve_session_idbehavior is well-structured:
- Prefers the requested
unity_instancewhen available, otherwise falls back to the first active session.- Uses config-driven
reload_max_retries/reload_retry_mswith a monotonic deadline andasyncio.sleep, so it’s cooperative with the event loop.- Logs both the start of waiting and restoration/timeout, which should help debug reload issues.
Two small polish points:
- The docstring contains
in‑flightwith a non-breaking hyphen, which Ruff flags (RUF002). Replacing it with a normal-avoids that lint noise.- If you want to keep linters fully quiet on TRY003, you could eventually factor the
"No Unity plugins are currently connected"message into a dedicated exception type, but that feels optional given this is already a clear, internal error.Functionally this looks ready to ship.
Also applies to: 219-273
TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/DomainReloadResilienceTests.cs (1)
18-242: Good coverage of reload scenarios; be aware of timing-based flakiness from fixed waitsThe edit-mode tests do a nice job exercising the critical paths: creating scripts to trigger domain reloads, issuing rapid
read_consolecalls, and asserting both “all succeed” and “≥80% succeed” behavior under stress.One thing to watch is the reliance on fixed
WaitForSecondsdelays (0.05–0.2s). On slower or heavily loaded CI machines, domain reloads can take longer than these hard-coded waits, which may introduce intermittent failures. If these tests ever start flaking, consider:
- Swapping some of the fixed sleeps for short
yield return nullloops with an overall timeout, or- Using a higher-level readiness signal (e.g., polling a status/hint that indicates reload completion) instead of pure wall-clock delays.
As written they’re fine to merge; this is just something to keep in mind for long-term stability.
Server/unity_connection.py (1)
717-755: Alignsend_command_with_retry’s return typing with actual behaviorThe retry loop correctly:
- Uses
config.reload_max_retries/reload_retry_msby default.- Sleeps based on either a
retry_after_mshint from dict responses or the configured delay.- Preserves the final structured response when retries are exhausted.
However, in practice
conn.send_commandcan returnMCPResponseinstances (e.g., preflight “Unity is reloading; please retry”, or theparams is Noneguard), and when_is_reloading_responseis false or you exhaustmax_retries, thatMCPResponseis returned directly. Theasync_send_command_with_retrywrapper already reflects this by advertisingdict[str, Any] | MCPResponse, butsend_command_with_retryis still annotated asDict[str, Any].To avoid surprises for callers and static type checkers, consider one of:
- Broadening the sync signature and docstring to
dict[str, Any] | MCPResponse, matching the async wrapper; or- Normalizing
MCPResponseto a plain dict before returning (e.g., viamodel_dump()), if you’d prefer a pure-dict API at this layer.Functionally it’s fine; this is mostly about type clarity and predictability for callers.
Also applies to: 757-789
Server/tests/integration/test_domain_reload_resilience.py (3)
51-66: Restore originalPluginHubstate instead of setting globals toNoneDirectly setting
PluginHub._registryandPluginHub._locktoNonein the cleanup blocks can leak into other tests and cause surprising failures if they rely on the original values.Consider saving and restoring the previous state instead:
- # Configure PluginHub with our mock - PluginHub._registry = mock_registry - PluginHub._lock = asyncio.Lock() + # Configure PluginHub with our mock, preserving original state + original_registry = PluginHub._registry + original_lock = PluginHub._lock + PluginHub._registry = mock_registry + PluginHub._lock = asyncio.Lock() @@ - finally: - # Clean up - PluginHub._registry = None - PluginHub._lock = None + finally: + # Clean up: restore original state + PluginHub._registry = original_registry + PluginHub._lock = original_lock(and similarly in
test_plugin_hub_fails_after_timeout).Also applies to: 84-99
102-155: Tighten up lint issues infake_send_commandand the loopRuff’s hints here are valid and easy to address without changing behavior:
fake_send_commanddoesn’t useargs/kwargs.- The loop index
iisn’t used.You can make this intent explicit and silence the warnings:
- async def fake_send_command(*args, **kwargs): + async def fake_send_command(*_args, **_kwargs): @@ - for i in range(5): + for _ in range(5):Everything else in this simulated reload/read_console stress test looks good.
157-214: Align the default-session assertion with the test’s expectationThe docstring and comment say the default call “should return first available session,” but the assertion allows either session:
# Should return first available session assert session_id in ["session-1", "session-2"]If
_resolve_session_idis expected to deterministically choose the first available session, consider tightening this to:- # Should return first available session - assert session_id in ["session-1", "session-2"] + # Should return first available session + assert session_id == "session-1"This will better guard against regressions in the selection behavior while still keeping the test simple.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
Server/models.py(1 hunks)Server/plugin_hub.py(2 hunks)Server/tests/integration/test_domain_reload_resilience.py(1 hunks)Server/unity_connection.py(1 hunks)TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/DomainReloadResilienceTests.cs(1 hunks)TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/DomainReloadResilienceTests.cs.meta(1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.5)
Server/tests/integration/test_domain_reload_resilience.py
117-117: Unused function argument: args
(ARG001)
117-117: Unused function argument: kwargs
(ARG001)
136-136: Loop control variable i not used within loop body
(B007)
Server/plugin_hub.py
214-214: Docstring contains ambiguous ‑ (NON-BREAKING HYPHEN). Did you mean - (HYPHEN-MINUS)?
(RUF002)
217-217: Avoid specifying long messages outside the exception class
(TRY003)
271-271: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Cursor Bugbot
🔇 Additional comments (3)
Server/unity_connection.py (1)
693-714: Centralized reload detection via_is_reloading_responselooks correctThe helper cleanly unifies the two cases you care about:
MCPResponsefrom preflight/transport, with an explicit"retry"hint taking precedence, then a fallback text scan overmessage/error.- Raw Unity dict payloads, checking both
state == "reloading"and"reload"in message/error.This should make it much easier to evolve reload signaling without scattering heuristics around the codebase.
Server/tests/integration/test_domain_reload_resilience.py (2)
15-62: Good coverage of the reconnection retry behaviorThis test nicely verifies that
_resolve_session_idactually waits and retries until a session appears, and asserts both the resolved ID and that multiple list calls occurred. Looks solid and aligned with the new bounded‑wait behavior.
69-100: Timeout behavior test is clear and focusedThis test clearly forces the “no sessions ever” path, overrides the reload retry config to keep it fast, and asserts the expected error message. The structure is straightforward and correctly scoped with the
patch('plugin_hub.config')context.
| fileFormatVersion: 2 | ||
| guid: $(uuidgen | tr -d '-' | tr '[:upper:]' '[:lower:]' | cut -c1-32) | ||
| MonoImporter: | ||
| externalObjects: {} | ||
| serializedVersion: 2 | ||
| defaultReferences: [] | ||
| executionOrder: 0 | ||
| icon: {instanceID: 0} | ||
| userData: | ||
| assetBundleName: | ||
| assetBundleVariant: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace shell command in guid with an actual Unity GUID
The guid field currently contains a shell pipeline:
guid: $(uuidgen | tr -d '-' | tr '[:upper:]' '[:lower:]' | cut -c1-32)Unity expects a concrete GUID string here, not a command substitution. As-is, the .meta file is invalid and Unity may ignore or regenerate it, which can break the link between the test asset and its metadata.
Update this to a fixed GUID generated by Unity (or copied from a Unity-created .meta), for example:
-fileFormatVersion: 2
-guid: $(uuidgen | tr -d '-' | tr '[:upper:]' '[:lower:]' | cut -c1-32)
+fileFormatVersion: 2
+guid: 0123456789abcdef0123456789abcdef # replace with actual Unity-generated GUIDBest is to let Unity create the .meta by reimporting the asset and commit the resulting file.
🤖 Prompt for AI Agents
In
TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/DomainReloadResilienceTests.cs.meta
lines 1-11, the guid field contains a shell command substitution instead of a
concrete Unity GUID; replace the entire guid line with an actual 32-character
lowercase hex GUID string (or remove the file and let Unity reimport the asset
to generate a correct .meta), then commit the resulting .meta so the asset GUID
is valid and stable.
| if not sessions: | ||
| return None | ||
| # Deterministic order: rely on insertion ordering | ||
| return next(iter(sessions.keys())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Wrong Unity instance selected when specific instance unavailable
When a specific unity_instance is requested but not found, _try_once falls through to the fallback logic and returns any available session instead of None. This causes commands intended for a specific Unity project to be routed to a different project. The function should return None immediately after get_session_id_by_hash returns None for a specific instance, allowing the retry loop to wait for the correct instance to reconnect rather than silently connecting to the wrong project.
|
This is a really good fix @dsarno , thank you |
Note
Adds retry-based plugin session resolution and reload-aware response detection, with new integration and Unity tests for domain reload resilience.
PluginHub._resolve_session_id: waits/retries for plugin reconnection during domain reloads (configurable viareload_max_retries/reload_retry_ms), prefers requestedunity_instance, falls back to first available session, and logs outcomes.MCPResponse: adds optionalhint(e.g.,"retry") for client handling.unity_connection._is_reloading_response: now recognizes both raw dicts andMCPResponse(includeshintdetection);send_command_with_retryreturn type updated accordingly.read_consolecalls and script creation stress cases.Written by Cursor Bugbot for commit 5d7418f. This will update automatically on new commits. Configure here.
Summary by CodeRabbit
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.