Skip to content

Conversation

@dsarno
Copy link
Collaborator

@dsarno dsarno commented Jan 4, 2026

Summary by Sourcery

Introduce bounded, time-based retry behavior for Unity connections and plugin sessions to avoid long waits when editors or sessions are unavailable.

Enhancements:

  • Add response reason extraction helper to classify Unity reload and session-unavailable states across transports.
  • Bound Unity command reload retries by a configurable maximum wait duration and standardize short retry intervals with structured logging.
  • Cap plugin session resolution wait time with an environment-configurable deadline and improved diagnostic logging.
  • Return structured retryable error responses when Unity sessions are unavailable so callers can handle fast-fail and retry consistently.
  • Refine refresh_unity tool behavior to only mark true editor reload or missing-session cases as transient disconnects.

Summary by CodeRabbit

  • Bug Fixes
    • Improved connection reliability during Unity reloads and session loss, reducing false recoveries and handling error responses more safely.
  • Improvements
    • More predictable retry behavior with bounded per-attempt delays and configurable overall wait limits.
    • Faster, clearer handling when no Unity session is available, yielding consistent retry hints instead of generic failures.

✏️ Tip: You can customize this high-level summary in your review settings.

dsarno added 20 commits January 2, 2026 11:56
… stall prevention

- Add TestRunnerNoThrottle.cs: Sets editor to 'No Throttling' mode during test runs
  with SessionState persistence across domain reload
- Add run_tests_async and get_test_job tools for non-blocking test execution
- Add TestJobManager for async test job tracking with progress monitoring
- Add ForceSynchronousImport to all AssetDatabase.Refresh() calls to prevent stalls
- Mark DomainReloadResilienceTests as [Explicit] with documentation explaining
  the test infrastructure limitation (internal coroutine waits vs MCP socket polling)
- MCP workflow is unaffected - socket messages provide external stimulus that
  keeps Unity responsive even when backgrounded
- Remove unused Newtonsoft.Json.Linq import from TestJobManager
- Add throttling to SessionState persistence (once per second) to reduce overhead
- Critical job state changes (start/finish) still persist immediately
- Fix duplicate XML summary tag in DomainReloadResilienceTests
- Add run_tests_async and get_test_job to main README tools list
- Document background stall limitation for domain reload tests in DEV readme
Run [Explicit] domain_reload tests in their own job using -testCategory
Combines into single job with two test steps to reuse cached Library
- Fix TOCTOU race in TestJobManager.StartJob (single lock scope for check-and-set)
- Store TestRunnerApi reference with HideAndDontSave to prevent GC/serialization issues
- run_tests_async is now marked as preferred for long-running suites
- run_tests description notes it blocks and suggests async alternative
…sults

- manage_asset, manage_gameobject, manage_scene now check preflight return
  value and propagate busy/retry signals to clients (fixes Sourcery #1)
- TestJobManager.FinalizeCurrentJobFromRunFinished now sets job status to
  Failed when resultPayload.Failed > 0, not always Succeeded (fixes Sourcery #2)
When 'Force fresh server install' is enabled, uvx uses --no-cache --refresh
which rebuilds the package and takes significantly longer to start.

- Increase timeout from 10s to 45s when dev mode is enabled
- Add informative log message explaining the longer startup time
- Show actual timeout value in warning message
Apply same logic as FinalizeCurrentJobFromRunFinished: check result.Failed > 0
to correctly mark jobs as Failed when tests fail, even in the fallback path
when RunFinished callback is not delivered.
@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Jan 4, 2026

Reviewer's Guide

Implements bounded, reason-aware retry behavior for Unity command and session handling, including an overall max-wait budget, improved retry interval handling, structured retry reasons, and consistent behavior across legacy transport, plugin hub, and refresh_unity tool.

Sequence diagram for PluginHub session resolution and retry response

sequenceDiagram
    actor Client
    participant PluginHub as PluginHub
    participant Registry as PluginRegistry
    participant UnityEditor as UnityEditor

    Client->>PluginHub: send_command_for_instance(unity_instance, command_type, params)
    PluginHub->>PluginHub: _resolve_session_id(unity_instance)
    PluginHub->>Registry: lookup sessions for unity_instance
    Registry-->>PluginHub: sessions, counts

    alt No sessions available
        PluginHub->>PluginHub: max_wait_s from UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S
        PluginHub->>PluginHub: retry_ms from config, sleep_seconds = clamp(retry_ms/1000, 0.05, 0.25)
        PluginHub->>PluginHub: deadline = time.monotonic() + max_wait_s
        PluginHub->>PluginHub: log No plugin session available, waiting up to max_wait_s

        loop Until session restored or deadline reached
            PluginHub->>PluginHub: await asyncio.sleep(sleep_seconds)
            PluginHub->>Registry: _try_once()
            Registry-->>PluginHub: session_id, session_count
        end

        alt session_id restored
            PluginHub->>PluginHub: log Plugin session restored
        else No session_id after wait
            PluginHub->>PluginHub: log No Unity plugin reconnected within max_wait_s
            PluginHub->>Client: raise RuntimeError("No Unity plugins are currently connected")
        end
    else Session already available
        PluginHub->>PluginHub: return session_id
    end

    Note over Client,PluginHub: send_command_for_instance handles RuntimeError

    PluginHub->>PluginHub: try: session_id = await _resolve_session_id(unity_instance)
    alt RuntimeError with "No Unity plugins are currently connected"
        PluginHub->>PluginHub: log Unity session unavailable, returning retry
        PluginHub-->>Client: MCPResponse(success=False, error="Unity session not available, please retry", hint=retry, data={reason: no_unity_session, retry_after_ms: 250})
    else Other errors
        PluginHub-->>Client: propagate exception
    end
Loading

File-Level Changes

Change Details Files
Introduce structured extraction of Unity response reasons and reuse it for reload detection and logging.
  • Add _extract_response_reason helper to normalize reason detection from MCPResponse and raw dict payloads, including support for embedded data.reason and message/error text heuristics.
  • Refactor _is_reloading_response to delegate to _extract_response_reason and only check for a standardized "reloading" reason.
  • Ensure non-matching responses return None instead of Falsey booleans to support richer reason handling.
Server/src/transport/legacy/unity_connection.py
Bound Unity command reload retries by a maximum wait duration, adjust sleep behavior, and add detailed logging plus a structured fallback error when the budget is exceeded.
  • Read UNITY_MCP_RELOAD_MAX_WAIT_S from the environment with a safe float parse and non-negative clamping.
  • Track the start time of reload waiting and break the retry loop when the max wait budget is exhausted or zero.
  • Normalize retry_after_ms extraction from both top-level and nested data fields, clamp per-iteration sleep between 50ms and 250ms, and log each retry with command, instance, reason, and sleep values.
  • On exit, log whether the wait completed or exceeded the budget and, if still reloading, return a standardized MCPResponse with hint=retry, reason=reloading, and a bounded retry_after_ms instead of looping indefinitely.
Server/src/transport/legacy/unity_connection.py
Bound Unity plugin session resolution by a maximum wait duration and align sleep behavior and logging with the new reload policy.
  • Replace retry-count-based deadline with a max_wait_s budget derived from UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S and clamp to non-negative values.
  • Adjust sleep interval to respect reload_retry_ms but clamp between 50ms and 250ms for consistency with command retry behavior.
  • Update debug and warning logs to use parameterized logging, reference the new max wait budget, and report actual wait time on success or failure.
  • Keep existing behavior for multi-session error handling while ensuring failure path clearly logs when no plugin reconnects within the budget.
Server/src/transport/plugin_hub.py
Surface a structured, retryable error when no Unity plugin sessions are available and treat that reason consistently in the refresh_unity tool.
  • Wrap _resolve_session_id in send_command_for_instance to intercept 'No Unity plugins are currently connected' RuntimeErrors, logging a debug message and returning an MCPResponse-like dict with success=False, hint=retry, reason=no_unity_session, and a fixed retry_after_ms.
  • Update refresh_unity to extract response.data.reason, include it in retryability decisions, and only mark recovered_from_disconnect when the reason is not 'reloading' or 'no_unity_session', avoiding premature success signaling while Unity is still unavailable.
  • Minor cleanup of trailing whitespace/blank lines in refresh_unity.
Server/src/transport/plugin_hub.py
Server/src/services/tools/refresh_unity.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 4, 2026

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Adds normalized extraction of response reasons, refines recoverability for Unity reload/no-session cases, implements time-budgeted reload waiting with per-attempt bounded delays and logging, and introduces bounded session resolution/readiness probing plus a NoUnitySessionError and retry MCPResponse handling.

Changes

Cohort / File(s) Summary
Refresh Unity recoverability
Server/src/services/tools/refresh_unity.py
Extracts reason from responses via new extractor and sets recovered_from_disconnect = True only when reason not in {"reloading","no_unity_session"}; preserves early-return behavior for non-wait/non-retryable errors.
Unity connection: reason extraction & reload wait
Server/src/transport/legacy/unity_connection.py
Adds `_extract_response_reason(resp) -> str
Plugin hub: session resolve & readiness probing
Server/src/transport/plugin_hub.py
Adds NoUnitySessionError. Implements UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S and UNITY_MCP_SESSION_READY_WAIT_SECONDS with validation/clamping; replaces retry-count dead-ends with time-based deadlines, clamps per-attempt sleeps to 0.05–0.25s, returns retry MCPResponses on no-session or readiness timeout, and updates logging to parameterized messages.

Sequence Diagrams

sequenceDiagram
    participant Client
    participant UnityConn as unity_connection
    participant Unity as Unity Process

    Client->>UnityConn: send_command_with_retry()
    Note over UnityConn: detect reloading -> record wait_start_time
    UnityConn->>Unity: send command
    Unity-->>UnityConn: MCPResponse (may indicate reloading)

    rect rgba(200,230,201,0.4)
        Note over UnityConn: Time-budgeted reload wait (max_wait_s)
        loop while elapsed < max_wait_s and response indicates reloading
            UnityConn->>UnityConn: reason = _extract_response_reason(response)
            UnityConn->>UnityConn: retry_after_ms = clamp(50..250)
            UnityConn->>UnityConn: sleep(retry_after_ms)
            UnityConn->>Unity: retry send command
            Unity-->>UnityConn: MCPResponse
        end
    end

    alt Budget exceeded (still reloading)
        UnityConn-->>Client: MCPResponse(reloading, reason, retry hint)
    else Recovered or non-reloading
        UnityConn-->>Client: final MCPResponse
    end
Loading
sequenceDiagram
    participant Client
    participant PluginHub as plugin_hub
    participant UnityConn as unity_connection
    participant Unity as Unity Process

    Client->>PluginHub: send_command_for_instance()

    alt No Unity plugins available
        rect rgba(255,224,178,0.35)
            PluginHub->>PluginHub: wait up to resolve_max_wait_s (clamped)
            PluginHub-->>Client: MCPResponse(retry)  <-- NoUnitySessionError handled
        end
    else Plugin exists but session not ready
        rect rgba(187,222,251,0.35)
            PluginHub->>UnityConn: readiness probe (bounded max wait)
            UnityConn->>Unity: ping
            alt Probe succeeds
                Unity-->>UnityConn: pong
                PluginHub->>UnityConn: send_command()
                UnityConn->>Unity: command
                Unity-->>UnityConn: response
                PluginHub-->>Client: response
            else Probe times out
                PluginHub-->>Client: MCPResponse(retry)
            end
        end
    else Session ready
        PluginHub->>UnityConn: send_command()
        UnityConn->>Unity: command
        Unity-->>UnityConn: response
        PluginHub-->>Client: response
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🐰 a rabbit nibbling on code,

counts each hop and every ping,
reasons found in sleepy logs,
waits bounded by gentle clocks,
then hops back with a happy spring.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Codex/implement bounded retry policy for unity' directly describes the main change: introducing bounded retry policies for Unity connections with configurable timeouts and structured error handling.
✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 2 issues, and left some high level feedback:

  • When parsing UNITY_MCP_RELOAD_MAX_WAIT_S and UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S, consider catching ValueError specifically and logging invalid values rather than suppressing all exceptions with a broad except, so misconfiguration is easier to diagnose.
  • In refresh_unity, the data/reason extraction can be simplified (the second isinstance(data, dict) check is redundant after the first), and you might reuse _extract_response_reason instead of duplicating the reason parsing logic.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- When parsing UNITY_MCP_RELOAD_MAX_WAIT_S and UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S, consider catching ValueError specifically and logging invalid values rather than suppressing all exceptions with a broad except, so misconfiguration is easier to diagnose.
- In refresh_unity, the data/reason extraction can be simplified (the second isinstance(data, dict) check is redundant after the first), and you might reuse _extract_response_reason instead of duplicating the reason parsing logic.

## Individual Comments

### Comment 1
<location> `Server/src/transport/legacy/unity_connection.py:689-698` </location>
<code_context>
def _extract_response_reason(resp: object) -> str | None:
</code_context>

<issue_to_address>
**issue (bug_risk):** Normalize `reason` values (including from `data`) to avoid case-sensitive mismatches.

`_extract_response_reason` returns `data["reason"]` as-is, but callers (e.g. `_is_reloading_response`, `refresh_unity`) compare against lowercase strings like `"reloading"` / `"no_unity_session"`. If Unity returns different casing (e.g. `"Reloading"`), the checks will fail. Normalizing all reasons to lowercase in `_extract_response_reason` (both `data["reason"]` and the hard-coded values) would make these comparisons reliable and case-insensitive.
</issue_to_address>

### Comment 2
<location> `Server/src/services/tools/refresh_unity.py:50-55` </location>
<code_context>
     if isinstance(response, dict) and not response.get("success", True):
         hint = response.get("hint")
         err = (response.get("error") or response.get("message") or "")
+        data = response.get("data") if isinstance(response.get("data"), dict) else {}
+        reason = data.get("reason") if isinstance(data, dict) else None
         is_retryable = (hint == "retry") or ("disconnected" in str(err).lower())
         if (not wait_for_ready) or (not is_retryable):
             return MCPResponse(**response)
-        recovered_from_disconnect = True
+        if reason not in {"reloading", "no_unity_session"}:
+            recovered_from_disconnect = True

</code_context>

<issue_to_address>
**question (bug_risk):** Double-check the new `recovered_from_disconnect` condition; it may change semantics for some retryable cases.

Previously, any retryable error (`hint == "retry"` or `"disconnected" in err`) would set `recovered_from_disconnect = True`. Now that only happens when `reason` is *not* `"reloading"` or `"no_unity_session"`, so those two cases will now leave `recovered_from_disconnect = False` and may change downstream behavior that relies on that flag.

If the goal is to treat `"reloading"` / `"no_unity_session"` as a distinct transient category, consider an explicit `reason`-based branch (with a clear fallback for unknown reasons) instead of a negative check, so the semantics are easier to reason about and extend.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines 50 to 55
data = response.get("data") if isinstance(response.get("data"), dict) else {}
reason = data.get("reason") if isinstance(data, dict) else None
is_retryable = (hint == "retry") or ("disconnected" in str(err).lower())
if (not wait_for_ready) or (not is_retryable):
return MCPResponse(**response)
recovered_from_disconnect = True
if reason not in {"reloading", "no_unity_session"}:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question (bug_risk): Double-check the new recovered_from_disconnect condition; it may change semantics for some retryable cases.

Previously, any retryable error (hint == "retry" or "disconnected" in err) would set recovered_from_disconnect = True. Now that only happens when reason is not "reloading" or "no_unity_session", so those two cases will now leave recovered_from_disconnect = False and may change downstream behavior that relies on that flag.

If the goal is to treat "reloading" / "no_unity_session" as a distinct transient category, consider an explicit reason-based branch (with a clear fallback for unknown reasons) instead of a negative check, so the semantics are easier to reason about and extend.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
Server/src/transport/legacy/unity_connection.py (1)

754-759: Consider narrowing the exception handler.

The broad Exception catch for parsing UNITY_MCP_RELOAD_MAX_WAIT_S is acceptable given the fallback default, but you could narrow it to ValueError for clarity since that's what float() raises on invalid input.

🔎 Proposed refinement
     try:
         max_wait_s = float(os.environ.get(
             "UNITY_MCP_RELOAD_MAX_WAIT_S", "2.0"))
-    except Exception:
+    except ValueError:
         max_wait_s = 2.0
Server/src/transport/plugin_hub.py (2)

365-370: Consider catching ValueError instead of bare Exception.

For environment variable parsing, float() raises ValueError on invalid input. Catching a specific exception type improves clarity and avoids masking unexpected errors.

Proposed fix
         try:
             max_wait_s = float(
                 os.environ.get("UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S", "2.0"))
-        except Exception:
+        except ValueError:
             max_wait_s = 2.0

453-468: String-based exception matching is fragile.

Checking "No Unity plugins are currently connected" in str(exc) couples this code to the exact wording of the error message at line 442. If that message is ever modified, this condition silently fails to match, causing the exception to propagate instead of returning a retry response.

Consider introducing a dedicated exception type (e.g., NoUnitySessionError) to make matching explicit and refactoring-safe.

Proposed approach
# Near PluginDisconnectedError definition (line 33)
class NoUnitySessionError(RuntimeError):
    """Raised when no Unity plugin session is available."""
-            raise RuntimeError("No Unity plugins are currently connected")
+            raise NoUnitySessionError("No Unity plugins are currently connected")
         try:
             session_id = await cls._resolve_session_id(unity_instance)
-        except RuntimeError as exc:
-            if "No Unity plugins are currently connected" in str(exc):
+        except NoUnitySessionError:
                 logger.debug(
                     "Unity session unavailable; returning retry: command=%s instance=%s",
                     command_type,
                     unity_instance or "default",
                 )
                 return MCPResponse(
                     success=False,
                     error="Unity session not available; please retry",
                     hint="retry",
                     data={"reason": "no_unity_session", "retry_after_ms": 250},
                 ).model_dump()
-            raise
Server/src/services/tools/refresh_unity.py (1)

50-51: Optional: Redundant type check on line 51.

The check isinstance(data, dict) on line 51 is redundant because line 50 guarantees that data is always a dict (either from response.get("data") if it's a dict, or an empty dict {}).

🔎 Proposed simplification
 data = response.get("data") if isinstance(response.get("data"), dict) else {}
-reason = data.get("reason") if isinstance(data, dict) else None
+reason = data.get("reason")
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6bbf137 and 600e353.

📒 Files selected for processing (3)
  • Server/src/services/tools/refresh_unity.py
  • Server/src/transport/legacy/unity_connection.py
  • Server/src/transport/plugin_hub.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-05T18:23:12.349Z
Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 368
File: MCPForUnity/UnityMcpServer~/src/resources/menu_items.py:15-15
Timestamp: 2025-11-05T18:23:12.349Z
Learning: In Unity MCP, the `name` parameter in the `mcp_for_unity_resource` decorator is the external API name exposed to MCP clients (LLMs, AI agents). The command string passed to `async_send_command_with_retry` or `async_send_with_unity_instance` (e.g., "get_menu_items") is the internal command identifier that must match the C# side. These are decoupled, allowing external API naming to evolve independently of internal command routing.

Applied to files:

  • Server/src/transport/plugin_hub.py
  • Server/src/transport/legacy/unity_connection.py
🧬 Code graph analysis (1)
Server/src/transport/legacy/unity_connection.py (1)
Server/src/transport/plugin_hub.py (1)
  • send_command (133-210)
🪛 Ruff (0.14.10)
Server/src/transport/plugin_hub.py

368-368: Do not catch blind exception: Exception

(BLE001)

Server/src/transport/legacy/unity_connection.py

757-757: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Sourcery review
🔇 Additional comments (6)
Server/src/transport/legacy/unity_connection.py (3)

689-715: Well-structured response normalization helper.

The _extract_response_reason function effectively normalizes both MCPResponse objects and raw dict payloads. The fallback logic that infers "reloading" from message text (lines 697-698, 710-711) is pragmatic for handling legacy responses.

One consideration: the substring check for "reload" in error messages could theoretically match unrelated errors (e.g., "Failed to reload configuration"), though this is unlikely in practice given the Unity bridge context.


717-723: Clean refactoring.

The delegation to _extract_response_reason simplifies this predicate and improves maintainability.


763-825: Excellent time-budgeted retry implementation.

The reload-wait logic is well-designed with:

  • Configurable timeout via environment variable with safe clamping
  • Dynamic delay adjustment using Unity's retry_after_ms hint (lines 781-786)
  • Bounded sleep range (50-250ms) to prevent thrashing or long stalls
  • Detailed logging for debugging and observability
  • Graceful degradation returning a structured MCPResponse when the budget is exceeded

The tracking variables (wait_started, reason) and the final logging (lines 801-824) provide good operational visibility into reload behavior.

Server/src/transport/plugin_hub.py (2)

401-427: LGTM!

The deadline computation and parameterized log formatting are well-implemented. Using %-style placeholders in logging calls defers string interpolation until the message is actually emitted, which is both more efficient and a best practice.


475-501: Readiness probe logic is sound.

The while/else construct correctly distinguishes between a successful ping (breaks out) and timeout exhaustion (else branch returns retry). The clamped upper bound of 30s prevents runaway waits.

For consistency with the earlier suggestion, the exception handling at lines 477-479 and 486-487 could also be narrowed, though bare Exception at line 486 is defensible since any probe failure should be treated as "not ready."

Server/src/services/tools/refresh_unity.py (1)

55-56: Logic is correct, but clarify the comment.

The logic at lines 55-56 is actually intentional and correct, not inverted. The code distinguishes between:

  • Expected disconnect reasons ("reloading", "no_unity_session"): Return the original response with the reason so callers know why the disconnect occurred.
  • Unexpected disconnect reasons: Mark as "recovered" and return a success message.

Both "reloading" and "no_unity_session" are transient states from Unity's startup/reload cycle (verified in unity_connection.py and plugin_hub.py), so they should propagate the specific reason to clients rather than obscure it with a generic "recovered" message.

However, the code would benefit from a clarifying comment explaining this distinction. The current comment (line 44-46) mentions that reloading is legitimate, but doesn't explain why "reloading" and "no_unity_session" are excluded from being marked as "recovered". Consider adding a comment like:

# "reloading" and "no_unity_session" are expected transient states;
# only unexpected disconnects get marked as "recovered".

Likely an incorrect or invalid review comment.

Address code review feedback:

- Catch ValueError specifically (instead of broad Exception) when parsing
  UNITY_MCP_RELOAD_MAX_WAIT_S, UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S, and
  UNITY_MCP_SESSION_READY_WAIT_SECONDS, with logging for easier diagnosis
  of misconfiguration

- Normalize reason values to lowercase in _extract_response_reason() to
  avoid case-sensitive mismatches in comparisons

- Simplify refresh_unity.py by removing redundant isinstance check and
  reusing _extract_response_reason instead of duplicating reason parsing
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
Server/src/transport/legacy/unity_connection.py (1)

689-729: LGTM! Case-insensitive reason extraction implemented correctly.

The helper functions properly normalize all reason strings to lowercase (lines 700, 713), including hard-coded values like "reloading", enabling reliable case-insensitive comparisons by callers. This addresses the past review concern about case-sensitive mismatches.

Server/src/services/tools/refresh_unity.py (1)

54-55: LGTM! Intentional refinement of recoverability semantics.

This change aligns with the PR objective to distinguish between true disconnects and transient Unity reload/session-unavailable states. The conditional logic correctly identifies only non-reload, non-missing-session retryable errors as disconnect recoveries, enabling more precise downstream handling.

🧹 Nitpick comments (1)
Server/src/transport/legacy/unity_connection.py (1)

805-805: Optional: Remove redundant max(0.0, ...) guard.

Since sleep_ms is already clamped to [50, 250] on line 796, the max(0.0, ...) guard here is redundant.

🔎 Proposed refactor
-time.sleep(max(0.0, sleep_ms / 1000.0))
+time.sleep(sleep_ms / 1000.0)
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 600e353 and 4e1b090.

📒 Files selected for processing (3)
  • Server/src/services/tools/refresh_unity.py
  • Server/src/transport/legacy/unity_connection.py
  • Server/src/transport/plugin_hub.py
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-05T18:23:12.349Z
Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 368
File: MCPForUnity/UnityMcpServer~/src/resources/menu_items.py:15-15
Timestamp: 2025-11-05T18:23:12.349Z
Learning: In Unity MCP, the `name` parameter in the `mcp_for_unity_resource` decorator is the external API name exposed to MCP clients (LLMs, AI agents). The command string passed to `async_send_command_with_retry` or `async_send_with_unity_instance` (e.g., "get_menu_items") is the internal command identifier that must match the C# side. These are decoupled, allowing external API naming to evolve independently of internal command routing.

Applied to files:

  • Server/src/transport/legacy/unity_connection.py
  • Server/src/transport/plugin_hub.py
  • Server/src/services/tools/refresh_unity.py
📚 Learning: 2025-10-13T13:27:23.040Z
Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 316
File: TestProjects/UnityMCPTests/Assets/Tests/EditMode/Resources.meta:1-8
Timestamp: 2025-10-13T13:27:23.040Z
Learning: UnityMcpBridge is a legacy project kept for backwards compatibility; MCPForUnity is the only active Unity plugin project. GUID collisions between UnityMcpBridge and MCPForUnity are acceptable.

Applied to files:

  • Server/src/services/tools/refresh_unity.py
📚 Learning: 2025-10-13T13:41:00.086Z
Learnt from: JohanHoltby
Repo: CoplayDev/unity-mcp PR: 309
File: MCPForUnity/Editor/Helpers/ServerInstaller.cs:478-508
Timestamp: 2025-10-13T13:41:00.086Z
Learning: In the MCPForUnityTools feature (MCPForUnity/Editor/Helpers/ServerInstaller.cs), the design intentionally forces users to have only one .py file per MCPForUnityTools folder to keep file tracking simple. Package-style tools (subdirectories with __init__.py) are not supported.

Applied to files:

  • Server/src/services/tools/refresh_unity.py
🧬 Code graph analysis (3)
Server/src/transport/legacy/unity_connection.py (1)
Server/src/transport/plugin_hub.py (1)
  • send_command (133-210)
Server/src/transport/plugin_hub.py (1)
Server/tests/integration/test_helpers.py (3)
  • warning (39-40)
  • error (46-47)
  • model_dump (10-13)
Server/src/services/tools/refresh_unity.py (1)
Server/src/transport/legacy/unity_connection.py (2)
  • async_send_command_with_retry (837-869)
  • _extract_response_reason (689-719)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Sourcery review
🔇 Additional comments (6)
Server/src/transport/legacy/unity_connection.py (1)

759-773: LGTM! Robust environment variable parsing.

The parsing logic correctly handles invalid values with a warning fallback (lines 762-767) and clamps to non-negative values (line 768). The initialization of wait tracking variables is sound.

Server/src/services/tools/refresh_unity.py (1)

13-13: LGTM! Proper use of centralized reason extraction.

The import and usage of _extract_response_reason correctly normalizes Unity response reasons, enabling consistent handling across the codebase.

Also applies to: 50-50

Server/src/transport/plugin_hub.py (4)

364-376: LGTM! Consistent environment variable parsing pattern.

The parsing, error handling, and clamping logic for UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S follows the same robust pattern as in unity_connection.py. The sleep interval is properly bounded to [0.05, 0.25] seconds.


405-405: LGTM! Proper time-based deadline and structured logging.

The deadline calculation uses the new max_wait_s budget (line 405), and all logging statements correctly use parameterized formatting for performance and clarity.

Also applies to: 419-431, 440-443


457-472: LGTM! Structured fast-fail for unavailable sessions.

The fast-fail path correctly returns a structured MCPResponse with reason="no_unity_session" instead of raising an exception. This enables callers to handle the unavailable state with consistent retry semantics, aligning with the broader reason-based retry model introduced in this PR.


479-509: LGTM! Bounded readiness probe prevents premature fast-fail commands.

The readiness probe correctly uses ping/pong to verify Unity's main thread is responsive before dispatching fast-fail commands. The bounded wait (clamped to [0, 30] seconds) and structured retry response on timeout align with the PR's goal of avoiding long waits while maintaining reliability.

Comment on lines +789 to +826
delay_ms = retry_ms
if isinstance(response, dict):
retry_after = response.get("retry_after_ms")
if retry_after is None and isinstance(response.get("data"), dict):
retry_after = response["data"].get("retry_after_ms")
if retry_after is not None:
delay_ms = int(retry_after)
sleep_ms = max(50, min(int(delay_ms), 250))
logger.debug(
"Unity reload wait retry: command=%s instance=%s reason=%s retry_after_ms=%s sleep_ms=%s",
command_type,
instance_id or "default",
reason or "reloading",
delay_ms,
sleep_ms,
)
time.sleep(max(0.0, sleep_ms / 1000.0))
retries += 1
response = conn.send_command(command_type, params)
reason = _extract_response_reason(response)

if wait_started is not None:
waited = time.monotonic() - wait_started
if _is_reloading_response(response):
logger.debug(
"Unity reload wait exceeded budget: command=%s instance=%s waited_s=%.3f",
command_type,
instance_id or "default",
waited,
)
return MCPResponse(
success=False,
error="Unity is reloading; please retry",
hint="retry",
data={
"reason": "reloading",
"retry_after_ms": min(250, max(50, retry_ms)),
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Inconsistent retry_after_ms calculation.

Line 796 clamps int(delay_ms) to [50, 250], but line 825 clamps the original retry_ms parameter. This means the returned retry_after_ms (line 825) may differ from the actual sleep duration used in retries (line 796), especially when Unity provides a custom retry_after_ms in the response (lines 791-795).

For consistency and accuracy, consider using the same computed sleep_ms value:

🔎 Proposed fix
 sleep_ms = max(50, min(int(delay_ms), 250))
 logger.debug(
     "Unity reload wait retry: command=%s instance=%s reason=%s retry_after_ms=%s sleep_ms=%s",
     command_type,
     instance_id or "default",
     reason or "reloading",
     delay_ms,
     sleep_ms,
 )
 time.sleep(max(0.0, sleep_ms / 1000.0))
 retries += 1
 response = conn.send_command(command_type, params)
 reason = _extract_response_reason(response)

 if wait_started is not None:
     waited = time.monotonic() - wait_started
     if _is_reloading_response(response):
         logger.debug(
             "Unity reload wait exceeded budget: command=%s instance=%s waited_s=%.3f",
             command_type,
             instance_id or "default",
             waited,
         )
         return MCPResponse(
             success=False,
             error="Unity is reloading; please retry",
             hint="retry",
             data={
                 "reason": "reloading",
-                "retry_after_ms": min(250, max(50, retry_ms)),
+                "retry_after_ms": sleep_ms,
             },
         )
🤖 Prompt for AI Agents
In Server/src/transport/legacy/unity_connection.py around lines 789 to 826, the
computed sleep_ms (which clamps delay_ms to [50,250]) is used for the actual
sleep but the returned retry_after_ms is computed from the original retry_ms
parameter, causing inconsistency; change the returned retry_after_ms to use the
same clamped value (sleep_ms) used for sleeping (converted to int milliseconds)
so the response accurately reflects the actual retry delay.

- Add upper bound (30s) to UNITY_MCP_RELOAD_MAX_WAIT_S to prevent
  misconfiguration from causing excessive waits
- Add upper bound (30s) to UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S for
  consistency with readiness probe
- Introduce NoUnitySessionError custom exception to replace fragile
  string matching in send_command_for_instance

Addresses code review feedback for bounded retry policy PR.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
Server/src/transport/legacy/unity_connection.py (1)

790-828: Address the inconsistent retry_after_ms calculation.

As noted in a previous review, line 826 returns retry_after_ms computed from the original retry_ms parameter, but line 797 uses the clamped sleep_ms for the actual sleep delay. When Unity provides a custom retry_after_ms in the response (lines 791-796), these values can diverge, making the returned retry_after_ms inaccurate.

For consistency and accuracy, the returned retry_after_ms should reflect the actual sleep duration used.

🔎 Proposed fix
                 "reason": "reloading",
-                "retry_after_ms": min(250, max(50, retry_ms)),
+                "retry_after_ms": sleep_ms,
             },
         )
🧹 Nitpick comments (1)
Server/src/transport/plugin_hub.py (1)

37-39: Consider defining the default error message in the exception class.

While the current implementation is functional, the static analysis hint at line 451 is valid. Moving the error message to the exception class improves reusability and follows Python best practices.

🔎 Proposed refactor
 class NoUnitySessionError(RuntimeError):
-    """Raised when no Unity plugins are available."""
+    """Raised when no Unity plugins are available."""
+    
+    def __init__(self, message: str = "No Unity plugins are currently connected"):
+        super().__init__(message)

Then at line 451, simplify to:

-            raise NoUnitySessionError("No Unity plugins are currently connected")
+            raise NoUnitySessionError()
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4e1b090 and f4d2637.

📒 Files selected for processing (2)
  • Server/src/transport/legacy/unity_connection.py
  • Server/src/transport/plugin_hub.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-05T18:23:12.349Z
Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 368
File: MCPForUnity/UnityMcpServer~/src/resources/menu_items.py:15-15
Timestamp: 2025-11-05T18:23:12.349Z
Learning: In Unity MCP, the `name` parameter in the `mcp_for_unity_resource` decorator is the external API name exposed to MCP clients (LLMs, AI agents). The command string passed to `async_send_command_with_retry` or `async_send_with_unity_instance` (e.g., "get_menu_items") is the internal command identifier that must match the C# side. These are decoupled, allowing external API naming to evolve independently of internal command routing.

Applied to files:

  • Server/src/transport/legacy/unity_connection.py
  • Server/src/transport/plugin_hub.py
🧬 Code graph analysis (1)
Server/src/transport/plugin_hub.py (1)
Server/tests/integration/test_helpers.py (1)
  • model_dump (10-13)
🪛 Ruff (0.14.10)
Server/src/transport/plugin_hub.py

451-451: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Sourcery review
🔇 Additional comments (8)
Server/src/transport/plugin_hub.py (4)

368-381: LGTM! Well-structured environment variable handling.

The bounded retry implementation correctly:

  • Parses and validates the environment variable with appropriate error handling
  • Clamps max_wait_s to [0, 30] to prevent misconfiguration
  • Bounds sleep intervals to [0.05, 0.25] seconds
  • Uses parameterized logging for debugging

410-451: LGTM! Consistent time-based deadline and logging.

The implementation correctly:

  • Uses time.monotonic() for deadline calculations (immune to system clock changes)
  • Applies parameterized logging consistently throughout the wait loop
  • Raises the new NoUnitySessionError exception for structured error handling

462-475: LGTM! Structured error handling for unavailable sessions.

The exception handling correctly:

  • Catches the new NoUnitySessionError exception
  • Returns a structured MCPResponse with retry semantics
  • Includes diagnostic data (reason and retry_after_ms) for clients
  • Uses appropriate logging

483-491: LGTM! Consistent environment variable handling.

The readiness wait configuration follows the same robust pattern as session resolution:

  • Validates environment variable with appropriate exception handling
  • Clamps to [0, 30] to prevent misconfiguration
  • Uses parameterized logging for diagnostics
Server/src/transport/legacy/unity_connection.py (4)

689-720: LGTM! Robust response reason extraction.

The implementation correctly:

  • Normalizes all reason values to lowercase for case-insensitive comparisons
  • Handles both MCPResponse objects and raw dict responses
  • Extracts reasons from both data["reason"] and message/error text
  • Maps reload-related responses to the canonical "reloading" string
  • Returns None for non-reloadable cases

This addresses the previous review concern about case-sensitive mismatches.


722-728: LGTM! Clean semantic wrapper.

The function provides a clear, semantic interface for checking reload status by leveraging the normalized reason extraction.


759-769: LGTM! Consistent bounded retry configuration.

The environment variable handling correctly:

  • Parses and validates UNITY_MCP_RELOAD_MAX_WAIT_S with appropriate error handling
  • Clamps to [0, 30] to prevent excessive waits from misconfiguration
  • Uses parameterized logging for diagnostics
  • Follows the same pattern as plugin_hub.py for consistency

773-789: LGTM! Well-implemented time-budgeted retry loop.

The retry implementation correctly:

  • Uses time.monotonic() for accurate elapsed time measurement
  • Tracks wait start time and enforces the time budget
  • Provides extensive diagnostic logging at each stage (start, per-retry, exceeded, completed)
  • Returns structured MCPResponse with retry hints when the budget is exceeded
  • Clamps per-retry sleep to [50, 250]ms for bounded delays

The observability and bounded behavior align well with the PR objectives.

Also applies to: 807-835

@dsarno dsarno merged commit 96b81ca into CoplayDev:main Jan 4, 2026
2 checks passed
@dsarno dsarno deleted the codex/implement-bounded-retry-policy-for-unity branch January 4, 2026 22:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant