Conversation

@dsarno
Owner

@dsarno dsarno commented Nov 20, 2025

Note

Adds retry-based plugin session resolution and reload-aware response detection, with new integration and Unity tests for domain reload resilience.

  • Server resilience:
    • PluginHub._resolve_session_id: waits/retries for plugin reconnection during domain reloads (configurable via reload_max_retries/reload_retry_ms), prefers requested unity_instance, falls back to first available session, and logs outcomes.
    • MCPResponse: adds optional hint (e.g., "retry") for client handling.
    • unity_connection._is_reloading_response: now recognizes both raw dicts and MCPResponse (includes hint detection); send_command_with_retry return type updated accordingly.
  • Tests:
    • Python integration tests covering reconnection wait, timeout behavior, and instance preference.
    • Unity EditMode tests simulating domain reloads with rapid read_console calls and script creation stress cases.

Written by Cursor Bugbot for commit 5d7418f. This will update automatically on new commits.

Summary by CodeRabbit

  • New Features

    • Responses now support optional client-facing hints for enhanced guidance.
    • Enhanced plugin session recovery with configurable retry logic during Unity domain reloads, providing improved stability and connection resilience.
  • Tests

    • Added comprehensive integration and unit tests validating domain reload resilience, including reconnection timeout handling, session preference scenarios, and stress testing conditions.

@coderabbitai

coderabbitai bot commented Nov 20, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

This PR adds domain reload resilience to the MCP server by implementing a bounded-wait retry loop for plugin session resolution, updating response handling to support MCPResponse objects with hints, and introducing comprehensive integration and unit tests to validate behavior during Unity domain reloads.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Response Model Enhancement (Server/models.py) | Added optional hint: str \| None = None field to MCPResponse model to enable client-facing hints in responses. |
| Plugin Session Retry Logic (Server/plugin_hub.py) | Replaced immediate fallback with a deadline-based bounded-wait retry loop in _resolve_session_id. Introduces configurable retry parameters (max_retries, retry_ms, sleep_seconds) sourced from config. Adds deterministic session selection logic preferring specific Unity instances and includes debug logging for waiting, restoration, and reconnection failures. |
| Response Handling Updates (Server/unity_connection.py) | Updated _is_reloading_response to accept generic object type instead of dict, now handles both dict payloads and MCPResponse objects by checking hint field for "retry" and composing messages from resp.message and resp.error. Expanded send_command_with_retry signature with keyword-only parameters: instance_id, max_retries, and retry_ms. |
| Server Integration Tests (Server/tests/integration/test_domain_reload_resilience.py) | New test module containing five tests validating PluginHub session resolution resilience: waiting for plugin reconnection, timeout behavior, stress-read scenarios during simulated reload, and Unity instance preference handling. Uses AsyncMock stubs and direct registry manipulation for test isolation. |
| Unity Client Tests (TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/DomainReloadResilienceTests.cs) | New edit-mode test suite validating MCP read_console resilience during and after domain reloads. Includes setup/teardown for temp asset management and multiple test cases covering script-triggered reloads, rapid console reads, and sequential domain reload scenarios. |
| Test Metadata (TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/DomainReloadResilienceTests.cs.meta) | Unity MonoImporter metadata file for the new test class. |
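
To make the new hint field concrete, here is a minimal sketch that uses a standard-library dataclass as a stand-in for the real model. The actual MCPResponse in Server/models.py may be defined differently; the class name MCPResponseSketch and the success and data fields are assumptions for illustration, while hint, message, and error are the names mentioned in this thread.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class MCPResponseSketch:
    """Stand-in for MCPResponse (hypothetical shape, not the project's model)."""

    success: bool                 # assumed field
    message: str | None = None
    error: str | None = None
    data: Any | None = None       # assumed field
    hint: str | None = None       # optional client-facing hint, e.g. "retry"


# A response that tells the client the failure is transient.
reloading = MCPResponseSketch(
    success=False,
    error="Unity is reloading; please retry",
    hint="retry",
)

# Call sites that never set a hint keep working, because the field defaults to None.
ok = MCPResponseSketch(success=True, message="done")
```

Because the field is optional with a None default, existing constructors and consumers are unaffected unless they choose to read it.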

Sequence Diagram

sequenceDiagram
    participant Client as Client/Plugin
    participant Server as MCP Server
    participant PluginHub as PluginHub
    participant Registry as PluginRegistry
    
    Client->>Server: send_command_with_retry(command)
    Server->>PluginHub: _resolve_session_id()
    
    alt Session Available
        PluginHub->>PluginHub: _try_once() finds session
        PluginHub-->>Server: return session
        Server->>Client: execute command
        Client-->>Server: response
    else Domain Reload In Progress
        PluginHub->>PluginHub: _try_once() no session
        PluginHub->>PluginHub: start deadline-based wait loop
        Note over PluginHub: poll with sleep_seconds interval<br/>until deadline or session appears
        
        par Waiting for Reconnection
            Client->>Client: Unity domain reload
            Client->>Server: reconnect
        end
        
        PluginHub->>Registry: check for reconnected session
        alt Reconnected in Time
            PluginHub->>PluginHub: session restored
            PluginHub-->>Server: return session (with log)
            Server->>Client: retry command
            Client-->>Server: response
        else Timeout Exceeded
            PluginHub->>PluginHub: raise RuntimeError
            PluginHub-->>Server: error
            Server-->>Client: fail with timeout error
        end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Attention areas:
    • Server/plugin_hub.py: Verify retry loop correctness, deadline calculation, and edge cases around session selection logic and logging verbosity.
    • Server/unity_connection.py: Review type signature changes to _is_reloading_response and ensure MCPResponse hint handling aligns with response object contract.
    • Integration test mocking: Verify AsyncMock stubs correctly simulate plugin registry and session behavior; confirm test isolation via direct registry/lock manipulation.
    • Cross-layer integration: Confirm flow from send_command_with_retry through PluginHub retry logic and back through updated response handling.

Poem

🐰 A rabbit's ode to resilience

When domains reload and connections dissolve,
The retry loop waits with patience to solve,
With hints in responses and sessions restored,
The MCP hops onward—no crash to deplore! 🌱

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically summarizes the main objective of the changeset: adding retry logic for plugin session resolution during Unity domain reloads. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@dsarno
Owner Author

dsarno commented Nov 20, 2025

@coderabbitai review this please

@dsarno
Owner Author

dsarno commented Nov 20, 2025

@coderabbitai please review this

@coderabbitai

coderabbitai bot commented Nov 20, 2025

@dsarno, I'll review the changes in this PR for you. This looks like an important enhancement to handle Unity domain reload scenarios more robustly with retry logic and improved detection mechanisms.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai

coderabbitai bot commented Nov 20, 2025

@dsarno, I'll review the changes in this PR. This enhancement adds important retry logic for handling Unity domain reload scenarios, which should make the plugin more robust when Unity reloads assemblies.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@dsarno
Owner Author

dsarno commented Nov 20, 2025

@dsarno, I'll review the changes in this PR. This enhancement adds important retry logic for handling Unity domain reload scenarios, which should make the plugin more robust when Unity reloads assemblies.

✅ Actions performed
Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

This wasn't already reviewed. It needs a first review.

@greptile-apps

greptile-apps bot commented Nov 20, 2025

Greptile Overview

Greptile Summary

This PR implements retry logic to gracefully handle Unity domain reload gaps. When Unity reloads its domain, the WebSocket plugin disconnects and reconnects within seconds. Previously, MCP calls during this window would fail immediately with "No Unity plugins are currently connected" errors. Now, PluginHub._resolve_session_id waits up to ~10 seconds (configurable via reload_max_retries and reload_retry_ms) for the plugin to reconnect, making the reload transparent to clients.
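
The bounded wait described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in Server/plugin_hub.py: resolve_with_wait and try_once are hypothetical names, and the default values are chosen only to give a window of about 10 seconds.

```python
import asyncio
import time


async def resolve_with_wait(try_once, max_retries=20, retry_ms=500):
    """Poll `try_once` until it yields a session id or the window closes.

    `try_once` is a hypothetical async callable returning a session id or None.
    """
    sleep_seconds = retry_ms / 1000.0
    session_id = await try_once()
    # Start the clock after the first attempt so the loop gets the full window.
    deadline = time.monotonic() + max_retries * sleep_seconds
    while session_id is None and time.monotonic() < deadline:
        await asyncio.sleep(sleep_seconds)  # yields to the event loop while waiting
        session_id = await try_once()
    if session_id is None:
        raise RuntimeError("No Unity plugins are currently connected")
    return session_id


async def _demo():
    attempts = {"count": 0}

    async def flaky_lookup():
        # Simulate a plugin that has reconnected by the third poll.
        attempts["count"] += 1
        return "session-1" if attempts["count"] >= 3 else None

    session = await resolve_with_wait(flaky_lookup, max_retries=10, retry_ms=10)
    return session, attempts["count"]


result = asyncio.run(_demo())
```

Using time.monotonic() rather than wall-clock time keeps the deadline immune to system clock adjustments, and asyncio.sleep lets other coroutines (including the reconnecting WebSocket handler) run during the wait.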

Key Changes:

  • Added hint field to MCPResponse model for structured retry hints (e.g., hint="retry" during reloads)
  • Implemented bounded retry loop in PluginHub._resolve_session_id that polls for plugin reconnection
  • Enhanced _is_reloading_response to handle both dict and MCPResponse types
  • Added comprehensive Python integration tests and Unity C# stress tests

Testing Coverage:

  • Tests verify reconnection waiting, timeout behavior, and instance preference
  • Unity tests simulate rapid script creation (triggering domain reloads) + concurrent console reads

Confidence Score: 4/5

  • Safe to merge with one minor timing issue to address
  • The implementation is well-tested with both Python integration tests and Unity editor tests. The retry logic correctly handles plugin reconnection during domain reloads. However, there's a minor timing calculation issue where the deadline is computed before the first _try_once() call completes, which could slightly reduce the effective retry window
  • Pay attention to Server/plugin_hub.py - verify the deadline timing logic works as intended in your environment

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| Server/models.py | 5/5 | Added optional hint field to MCPResponse for retry hints during Unity reload |
| Server/plugin_hub.py | 4/5 | Implemented retry logic in _resolve_session_id to wait for plugin reconnection during domain reloads with configurable timeout |
| Server/unity_connection.py | 5/5 | Enhanced _is_reloading_response to handle both dict and MCPResponse types, supporting the new hint field |
| Server/tests/integration/test_domain_reload_resilience.py | 5/5 | Added comprehensive integration tests for domain reload resilience covering reconnection wait, timeout, and instance preference |

Sequence Diagram

sequenceDiagram
    participant Client as MCP Client
    participant PluginHub as PluginHub
    participant Registry as PluginRegistry
    participant Unity as Unity Plugin

    Note over Unity: Domain Reload Starts
    Unity->>PluginHub: Disconnect WebSocket
    PluginHub->>Registry: unregister(session_id)
    
    Client->>PluginHub: send_command_for_instance()
    PluginHub->>PluginHub: _resolve_session_id()
    PluginHub->>Registry: get_session_id_by_hash(unity_instance)
    Registry-->>PluginHub: None (no session)
    PluginHub->>Registry: list_sessions()
    Registry-->>PluginHub: {} (empty)
    
    Note over PluginHub: Start retry loop<br/>(max_retries × sleep_seconds)
    
    loop Retry until session appears or timeout
        PluginHub->>PluginHub: await asyncio.sleep(sleep_seconds)
        PluginHub->>Registry: list_sessions()
        Registry-->>PluginHub: {} (still empty)
    end
    
    Note over Unity: Domain Reload Completes
    Unity->>PluginHub: Connect WebSocket
    Unity->>PluginHub: Register (session_id, project_hash)
    PluginHub->>Registry: register(session_id, project_name, project_hash)
    
    PluginHub->>Registry: list_sessions()
    Registry-->>PluginHub: {session_id: PluginSession}
    PluginHub-->>PluginHub: session_id found!
    
    PluginHub->>Unity: send_command(session_id, command_type, params)
    Unity-->>PluginHub: command result
    PluginHub-->>Client: success response

@greptile-apps greptile-apps bot left a comment

6 files reviewed, 1 comment

return next(iter(sessions.keys()))

session_id = await _try_once()
deadline = time.monotonic() + (max_retries * sleep_seconds)

logic: the deadline calculation should account for the time taken by the first _try_once() call

currently if _try_once() takes significant time, the retry loop gets less time than intended

Suggested change
-deadline = time.monotonic() + (max_retries * sleep_seconds)
+session_id = await _try_once()
+wait_started_monotonic = time.monotonic()
+deadline = wait_started_monotonic + (max_retries * sleep_seconds)
Prompt To Fix With AI
This is a comment left during a code review.
Path: Server/plugin_hub.py
Line: 240:240

Comment:
**logic:** the deadline calculation should account for the time taken by the first `_try_once()` call

currently if `_try_once()` takes significant time, the retry loop gets less time than intended

```suggestion
        session_id = await _try_once()
        wait_started_monotonic = time.monotonic()
        deadline = wait_started_monotonic + (max_retries * sleep_seconds)
```

How can I resolve this? If you propose a fix, please make it concise.
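
The timing concern in this comment comes down to simple arithmetic, sketched below. The function name and parameters are hypothetical and exist only to show how the placement of the deadline calculation changes the time left for the retry loop.

```python
def effective_wait_window(max_retries, sleep_seconds, first_attempt_seconds,
                          deadline_before_first_attempt):
    """Return the seconds left for the retry loop once the first attempt finishes."""
    budget = max_retries * sleep_seconds
    if deadline_before_first_attempt:
        # The first attempt consumes part of the budget before the loop starts.
        return max(0.0, budget - first_attempt_seconds)
    # Deadline taken after the first attempt: the loop keeps the whole budget.
    return budget


# Illustrative numbers: 20 retries at 0.5 s each, with a slow 1.5 s first attempt.
shortened = effective_wait_window(20, 0.5, 1.5, deadline_before_first_attempt=True)
full = effective_wait_window(20, 0.5, 1.5, deadline_before_first_attempt=False)
```

With these illustrative numbers the loop gets 8.5 seconds in the first case and the full 10 seconds in the second.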

@cursor cursor bot left a comment

This PR is being reviewed by Cursor Bugbot

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (7)
Server/models.py (1)

11-13: MCPResponse.hint extension looks good; consider documenting expected values

The optional hint field is a clean, backward-compatible way to surface client guidance like "retry". To keep things self-documenting as more hints are introduced, consider centralizing/documenting the allowed values (e.g., in a docstring or constants/enum) so downstream code doesn’t rely on ad-hoc string literals.
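
One way to centralize the allowed values, sketched under the assumption that "retry" is currently the only hint in use. ResponseHint and wants_retry are hypothetical names and are not part of the PR.

```python
from enum import Enum


class ResponseHint(str, Enum):
    """Hypothetical catalogue of hint values; only "retry" appears in this PR."""

    RETRY = "retry"


def wants_retry(hint):
    """Return True when a raw hint string (or None) asks the client to retry."""
    return hint == ResponseHint.RETRY
```

Because the enum mixes in str, members compare equal to plain strings, so payloads that already carry "retry" as a literal keep working while new code can refer to ResponseHint.RETRY.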

Server/plugin_hub.py (1)

209-215: Session resolution wait/retry loop looks solid; minor docstring/lint cleanup

The new _resolve_session_id behavior is well-structured:

  • Prefers the requested unity_instance when available, otherwise falls back to the first active session.
  • Uses config-driven reload_max_retries / reload_retry_ms with a monotonic deadline and asyncio.sleep, so it’s cooperative with the event loop.
  • Logs both the start of waiting and restoration/timeout, which should help debug reload issues.

Two small polish points:

  • The docstring contains in‑flight with a non-breaking hyphen, which Ruff flags (RUF002). Replacing it with a normal - avoids that lint noise.
  • If you want to keep linters fully quiet on TRY003, you could eventually factor the "No Unity plugins are currently connected" message into a dedicated exception type, but that feels optional given this is already a clear, internal error.

Functionally this looks ready to ship.

Also applies to: 219-273

TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/DomainReloadResilienceTests.cs (1)

18-242: Good coverage of reload scenarios; be aware of timing-based flakiness from fixed waits

The edit-mode tests do a nice job exercising the critical paths: creating scripts to trigger domain reloads, issuing rapid read_console calls, and asserting both “all succeed” and “≥80% succeed” behavior under stress.

One thing to watch is the reliance on fixed WaitForSeconds delays (0.05–0.2s). On slower or heavily loaded CI machines, domain reloads can take longer than these hard-coded waits, which may introduce intermittent failures. If these tests ever start flaking, consider:

  • Swapping some of the fixed sleeps for short yield return null loops with an overall timeout, or
  • Using a higher-level readiness signal (e.g., polling a status/hint that indicates reload completion) instead of pure wall-clock delays.

As written they’re fine to merge; this is just something to keep in mind for long-term stability.

Server/unity_connection.py (1)

717-755: Align send_command_with_retry’s return typing with actual behavior

The retry loop correctly:

  • Uses config.reload_max_retries / reload_retry_ms by default.
  • Sleeps based on either a retry_after_ms hint from dict responses or the configured delay.
  • Preserves the final structured response when retries are exhausted.

However, in practice conn.send_command can return MCPResponse instances (e.g., preflight “Unity is reloading; please retry”, or the params is None guard), and when _is_reloading_response is false or you exhaust max_retries, that MCPResponse is returned directly. The async_send_command_with_retry wrapper already reflects this by advertising dict[str, Any] | MCPResponse, but send_command_with_retry is still annotated as Dict[str, Any].

To avoid surprises for callers and static type checkers, consider one of:

  • Broadening the sync signature and docstring to dict[str, Any] | MCPResponse, matching the async wrapper; or
  • Normalizing MCPResponse to a plain dict before returning (e.g., via model_dump()), if you’d prefer a pure-dict API at this layer.

Functionally it’s fine; this is mostly about type clarity and predictability for callers.

Also applies to: 757-789
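
The second option (normalizing to a plain dict) could look roughly like the sketch below. FakeResponse is a stand-in that only mimics a Pydantic-style model_dump() method, and to_plain_dict is a hypothetical helper; neither exists in the codebase under review.

```python
from typing import Any


class FakeResponse:
    """Stand-in for an MCPResponse-like object exposing model_dump()."""

    def __init__(self, **fields: Any) -> None:
        self._fields = fields

    def model_dump(self) -> dict:
        return dict(self._fields)


def to_plain_dict(resp: Any) -> dict:
    """Return dicts unchanged and convert model-like objects via model_dump()."""
    if isinstance(resp, dict):
        return resp
    dump = getattr(resp, "model_dump", None)
    if callable(dump):
        return dump()
    raise TypeError(f"Unsupported response type: {type(resp).__name__}")


from_dict = to_plain_dict({"success": True})
from_model = to_plain_dict(FakeResponse(success=False, hint="retry"))
```

Normalizing at this layer gives callers a single return type, at the cost of losing attribute access and model validation further up the stack; broadening the annotation instead keeps the object intact.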

Server/tests/integration/test_domain_reload_resilience.py (3)

51-66: Restore original PluginHub state instead of setting globals to None

Directly setting PluginHub._registry and PluginHub._lock to None in the cleanup blocks can leak into other tests and cause surprising failures if they rely on the original values.

Consider saving and restoring the previous state instead:

-    # Configure PluginHub with our mock
-    PluginHub._registry = mock_registry
-    PluginHub._lock = asyncio.Lock()
+    # Configure PluginHub with our mock, preserving original state
+    original_registry = PluginHub._registry
+    original_lock = PluginHub._lock
+    PluginHub._registry = mock_registry
+    PluginHub._lock = asyncio.Lock()
@@
-    finally:
-        # Clean up
-        PluginHub._registry = None
-        PluginHub._lock = None
+    finally:
+        # Clean up: restore original state
+        PluginHub._registry = original_registry
+        PluginHub._lock = original_lock

(and similarly in test_plugin_hub_fails_after_timeout).

Also applies to: 84-99
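
The same save-and-restore pattern can be factored into a reusable context manager so each test does not repeat it. This is a sketch: HubSketch stands in for PluginHub and patched_hub is a hypothetical helper name.

```python
import asyncio
from contextlib import contextmanager


class HubSketch:
    """Stand-in for PluginHub's class-level state (hypothetical values)."""

    _registry = "original-registry"
    _lock = "original-lock"


@contextmanager
def patched_hub(hub, registry):
    """Temporarily swap the hub's registry and lock, restoring both on exit."""
    original_registry, original_lock = hub._registry, hub._lock
    hub._registry = registry
    hub._lock = asyncio.Lock()
    try:
        yield hub
    finally:
        # Runs even if the test body raises, so no state leaks into other tests.
        hub._registry, hub._lock = original_registry, original_lock


with patched_hub(HubSketch, registry="mock-registry"):
    inside = HubSketch._registry

after = HubSketch._registry
```

unittest.mock.patch.object offers the same restore-on-exit behavior and may be preferable where the tests already use unittest.mock.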


102-155: Tighten up lint issues in fake_send_command and the loop

Ruff’s hints here are valid and easy to address without changing behavior:

  • fake_send_command doesn’t use args/kwargs.
  • The loop index i isn’t used.

You can make this intent explicit and silence the warnings:

-    async def fake_send_command(*args, **kwargs):
+    async def fake_send_command(*_args, **_kwargs):
@@
-    for i in range(5):
+    for _ in range(5):

Everything else in this simulated reload/read_console stress test looks good.


157-214: Align the default-session assertion with the test’s expectation

The docstring and comment say the default call “should return first available session,” but the assertion allows either session:

# Should return first available session
assert session_id in ["session-1", "session-2"]

If _resolve_session_id is expected to deterministically choose the first available session, consider tightening this to:

-        # Should return first available session
-        assert session_id in ["session-1", "session-2"]
+        # Should return first available session
+        assert session_id == "session-1"

This will better guard against regressions in the selection behavior while still keeping the test simple.
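
Whether "session-1" is a safe expectation depends on registration order. The sketch below uses a plain dict to show what "first available" means under insertion ordering; the registry's real container is not shown in this thread, so treating it as a plain dict is an assumption.

```python
# Plain dicts preserve insertion order (guaranteed since Python 3.7).
sessions = {}
sessions["session-1"] = {"project": "A"}
sessions["session-2"] = {"project": "B"}

first_before = next(iter(sessions))  # the earliest-registered session

# Simulate session-1 dropping and re-registering, as after a domain reload.
del sessions["session-1"]
sessions["session-1"] = {"project": "A"}

first_after = next(iter(sessions))   # session-2 is now the oldest entry
```

A session that reconnects moves to the end of the ordering, so a strict equality assertion holds only when the test controls the order in which sessions are registered.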

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9af504c and fd54f3f.

📒 Files selected for processing (6)
  • Server/models.py (1 hunks)
  • Server/plugin_hub.py (2 hunks)
  • Server/tests/integration/test_domain_reload_resilience.py (1 hunks)
  • Server/unity_connection.py (1 hunks)
  • TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/DomainReloadResilienceTests.cs (1 hunks)
  • TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/DomainReloadResilienceTests.cs.meta (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.5)
Server/tests/integration/test_domain_reload_resilience.py

117-117: Unused function argument: args

(ARG001)


117-117: Unused function argument: kwargs

(ARG001)


136-136: Loop control variable i not used within loop body

(B007)

Server/plugin_hub.py

214-214: Docstring contains ambiguous (NON-BREAKING HYPHEN). Did you mean - (HYPHEN-MINUS)?

(RUF002)


217-217: Avoid specifying long messages outside the exception class

(TRY003)


271-271: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cursor Bugbot
🔇 Additional comments (3)
Server/unity_connection.py (1)

693-714: Centralized reload detection via _is_reloading_response looks correct

The helper cleanly unifies the two cases you care about:

  • MCPResponse from preflight/transport, with an explicit "retry" hint taking precedence, then a fallback text scan over message/error.
  • Raw Unity dict payloads, checking both state == "reloading" and "reload" in message/error.

This should make it much easier to evolve reload signaling without scattering heuristics around the codebase.
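
The two cases above can be sketched as a single predicate. This is an approximation written from the description in this comment, not the body of _is_reloading_response; the function name is hypothetical and SimpleNamespace stands in for MCPResponse.

```python
from types import SimpleNamespace
from typing import Any


def is_reloading_response(resp: Any) -> bool:
    """Return True when a response looks like 'Unity is reloading, try again'."""
    if isinstance(resp, dict):
        # Raw Unity payload: explicit state first, then a text scan.
        if resp.get("state") == "reloading":
            return True
        text = f"{resp.get('message') or ''} {resp.get('error') or ''}"
        return "reload" in text.lower()
    # Object path: an explicit "retry" hint takes precedence over text matching.
    if getattr(resp, "hint", None) == "retry":
        return True
    message = getattr(resp, "message", None) or ""
    error = getattr(resp, "error", None) or ""
    return "reload" in f"{message} {error}".lower()


hinted = SimpleNamespace(hint="retry", message=None, error=None)
texty = SimpleNamespace(hint=None, message="Unity is reloading; please retry", error=None)
normal = SimpleNamespace(hint=None, message="done", error=None)
```

Checking the structured hint before falling back to substring matching means the text heuristics can be retired later without changing callers.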

Server/tests/integration/test_domain_reload_resilience.py (2)

15-62: Good coverage of the reconnection retry behavior

This test nicely verifies that _resolve_session_id actually waits and retries until a session appears, and asserts both the resolved ID and that multiple list calls occurred. Looks solid and aligned with the new bounded‑wait behavior.


69-100: Timeout behavior test is clear and focused

This test clearly forces the “no sessions ever” path, overrides the reload retry config to keep it fast, and asserts the expected error message. The structure is straightforward and correctly scoped with the patch('plugin_hub.config') context.

Comment on lines 1 to 11
fileFormatVersion: 2
guid: $(uuidgen | tr -d '-' | tr '[:upper:]' '[:lower:]' | cut -c1-32)
MonoImporter:
  externalObjects: {}
  serializedVersion: 2
  defaultReferences: []
  executionOrder: 0
  icon: {instanceID: 0}
  userData:
  assetBundleName:
  assetBundleVariant:

⚠️ Potential issue | 🔴 Critical

Replace shell command in guid with an actual Unity GUID

The guid field currently contains a shell pipeline:

guid: $(uuidgen | tr -d '-' | tr '[:upper:]' '[:lower:]' | cut -c1-32)

Unity expects a concrete GUID string here, not a command substitution. As-is, the .meta file is invalid and Unity may ignore or regenerate it, which can break the link between the test asset and its metadata.

Update this to a fixed GUID generated by Unity (or copied from a Unity-created .meta), for example:

-fileFormatVersion: 2
-guid: $(uuidgen | tr -d '-' | tr '[:upper:]' '[:lower:]' | cut -c1-32)
+fileFormatVersion: 2
+guid: 0123456789abcdef0123456789abcdef  # replace with actual Unity-generated GUID

Best is to let Unity create the .meta by reimporting the asset and commit the resulting file.

🤖 Prompt for AI Agents
In
TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/DomainReloadResilienceTests.cs.meta
lines 1-11, the guid field contains a shell command substitution instead of a
concrete Unity GUID; replace the entire guid line with an actual 32-character
lowercase hex GUID string (or remove the file and let Unity reimport the asset
to generate a correct .meta), then commit the resulting .meta so the asset GUID
is valid and stable.
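
If the file has to be written by a script rather than by Unity, the value the shell pipeline was apparently meant to produce (32 lowercase hex characters) can be generated and checked as in the sketch below. new_meta_guid is a hypothetical helper, and letting Unity create the .meta file remains the route recommended above.

```python
import re
import uuid


def new_meta_guid() -> str:
    """Return a random 32-character lowercase hex string with no dashes."""
    return uuid.uuid4().hex


guid = new_meta_guid()

# Sanity check before writing the value into a .meta file.
is_valid = re.fullmatch(r"[0-9a-f]{32}", guid) is not None
```

The important point is that the value is evaluated once and committed as a literal, so the asset keeps the same GUID on every machine.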

if not sessions:
    return None
# Deterministic order: rely on insertion ordering
return next(iter(sessions.keys()))

Bug: Wrong Unity instance selected when specific instance unavailable

When a specific unity_instance is requested but not found, _try_once falls through to the fallback logic and returns any available session instead of None. This causes commands intended for a specific Unity project to be routed to a different project. The function should return None immediately after get_session_id_by_hash returns None for a specific instance, allowing the retry loop to wait for the correct instance to reconnect rather than silently connecting to the wrong project.
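
The behavior proposed here can be sketched as follows. The function and its arguments are hypothetical simplifications: plain dicts stand in for the registry, and hash_to_session plays the role of get_session_id_by_hash.

```python
def try_once(sessions, hash_to_session, unity_instance=None):
    """Resolve a session id for one attempt, or None if the caller should wait."""
    if unity_instance is not None:
        # A specific instance was requested: never substitute another project.
        # Returning None lets the retry loop wait for that instance to reconnect.
        return hash_to_session.get(unity_instance)
    if not sessions:
        return None
    # No preference given: fall back to the first registered session.
    return next(iter(sessions))


sessions = {"session-2": {"project": "B"}}
hash_to_session = {"hash-b": "session-2"}

wrong_project = try_once(sessions, hash_to_session, unity_instance="hash-a")
right_project = try_once(sessions, hash_to_session, unity_instance="hash-b")
no_preference = try_once(sessions, hash_to_session)
```

In this sketch a request for hash-a yields None while only project B is connected, so the command is never routed to the wrong editor.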

@msanatan

This is a really good fix @dsarno , thank you

@dsarno dsarno merged commit 2c3268d into use-uvx Nov 21, 2025
3 of 4 checks passed