Codex/implement bounded retry policy for unity #510

dsarno · 2026-01-04T20:25:37Z

Summary by Sourcery

Introduce bounded, time-based retry behavior for Unity connections and plugin sessions to avoid long waits when editors or sessions are unavailable.

Enhancements:

Add response reason extraction helper to classify Unity reload and session-unavailable states across transports.
Bound Unity command reload retries by a configurable maximum wait duration and standardize short retry intervals with structured logging.
Cap plugin session resolution wait time with an environment-configurable deadline and improved diagnostic logging.
Return structured retryable error responses when Unity sessions are unavailable so callers can handle fast-fail and retry consistently.
Refine refresh_unity tool behavior to only mark true editor reload or missing-session cases as transient disconnects.

Summary by CodeRabbit

Bug Fixes
- Improved connection reliability during Unity reloads and session loss, reducing false recoveries and handling error responses more safely.
Improvements
- More predictable retry behavior with bounded per-attempt delays and configurable overall wait limits.
- Faster, clearer handling when no Unity session is available, yielding consistent retry hints instead of generic failures.


… stall prevention
- Add TestRunnerNoThrottle.cs: Sets editor to 'No Throttling' mode during test runs with SessionState persistence across domain reload
- Add run_tests_async and get_test_job tools for non-blocking test execution
- Add TestJobManager for async test job tracking with progress monitoring
- Add ForceSynchronousImport to all AssetDatabase.Refresh() calls to prevent stalls
- Mark DomainReloadResilienceTests as [Explicit] with documentation explaining the test infrastructure limitation (internal coroutine waits vs MCP socket polling)
- MCP workflow is unaffected - socket messages provide external stimulus that keeps Unity responsive even when backgrounded

- Remove unused Newtonsoft.Json.Linq import from TestJobManager
- Add throttling to SessionState persistence (once per second) to reduce overhead
- Critical job state changes (start/finish) still persist immediately
- Fix duplicate XML summary tag in DomainReloadResilienceTests

- Add run_tests_async and get_test_job to main README tools list
- Document background stall limitation for domain reload tests in DEV readme

Run [Explicit] domain_reload tests in their own job using -testCategory

Combines into single job with two test steps to reuse cached Library

- Fix TOCTOU race in TestJobManager.StartJob (single lock scope for check-and-set)
- Store TestRunnerApi reference with HideAndDontSave to prevent GC/serialization issues

- run_tests_async is now marked as preferred for long-running suites
- run_tests description notes it blocks and suggests async alternative

…sults
- manage_asset, manage_gameobject, manage_scene now check preflight return value and propagate busy/retry signals to clients (fixes Sourcery #1)
- TestJobManager.FinalizeCurrentJobFromRunFinished now sets job status to Failed when resultPayload.Failed > 0, not always Succeeded (fixes Sourcery #2)

When 'Force fresh server install' is enabled, uvx uses --no-cache --refresh which rebuilds the package and takes significantly longer to start.
- Increase timeout from 10s to 45s when dev mode is enabled
- Add informative log message explaining the longer startup time
- Show actual timeout value in warning message

Apply same logic as FinalizeCurrentJobFromRunFinished: check result.Failed > 0 to correctly mark jobs as Failed when tests fail, even in the fallback path when RunFinished callback is not delivered.

sourcery-ai · 2026-01-04T20:25:44Z

Reviewer's Guide

Implements bounded, reason-aware retry behavior for Unity command and session handling, including an overall max-wait budget, improved retry interval handling, structured retry reasons, and consistent behavior across legacy transport, plugin hub, and refresh_unity tool.

Sequence diagram for PluginHub session resolution and retry response

sequenceDiagram
    actor Client
    participant PluginHub as PluginHub
    participant Registry as PluginRegistry
    participant UnityEditor as UnityEditor

    Client->>PluginHub: send_command_for_instance(unity_instance, command_type, params)
    PluginHub->>PluginHub: _resolve_session_id(unity_instance)
    PluginHub->>Registry: lookup sessions for unity_instance
    Registry-->>PluginHub: sessions, counts

    alt No sessions available
        PluginHub->>PluginHub: max_wait_s from UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S
        PluginHub->>PluginHub: retry_ms from config, sleep_seconds = clamp(retry_ms/1000, 0.05, 0.25)
        PluginHub->>PluginHub: deadline = time.monotonic() + max_wait_s
        PluginHub->>PluginHub: log No plugin session available, waiting up to max_wait_s

        loop Until session restored or deadline reached
            PluginHub->>PluginHub: await asyncio.sleep(sleep_seconds)
            PluginHub->>Registry: _try_once()
            Registry-->>PluginHub: session_id, session_count
        end

        alt session_id restored
            PluginHub->>PluginHub: log Plugin session restored
        else No session_id after wait
            PluginHub->>PluginHub: log No Unity plugin reconnected within max_wait_s
            PluginHub->>Client: raise RuntimeError("No Unity plugins are currently connected")
        end
    else Session already available
        PluginHub->>PluginHub: return session_id
    end

    Note over Client,PluginHub: send_command_for_instance handles RuntimeError

    PluginHub->>PluginHub: try: session_id = await _resolve_session_id(unity_instance)
    alt RuntimeError with "No Unity plugins are currently connected"
        PluginHub->>PluginHub: log Unity session unavailable, returning retry
        PluginHub-->>Client: MCPResponse(success=False, error="Unity session not available, please retry", hint=retry, data={reason: no_unity_session, retry_after_ms: 250})
    else Other errors
        PluginHub-->>Client: propagate exception
    end
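The flow in the diagram above can be sketched in a few lines of asyncio. This is a minimal illustration under stated assumptions, not the code in `plugin_hub.py`: `resolve_session_id`, `send_command`, and the `try_once` callable are hypothetical names, and `NoUnitySessionError` borrows the exception name proposed later in this review (the diagram itself still raises a plain `RuntimeError`). The 0.05 to 0.25 second clamp, the `time.monotonic()` deadline, and the retry payload fields come from the diagram.

```python
import asyncio
import time


class NoUnitySessionError(RuntimeError):
    """Raised when no session appears within the wait budget (name assumed)."""


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


async def resolve_session_id(try_once, max_wait_s: float = 2.0, retry_ms: int = 250) -> str:
    """Poll try_once() until it returns a session id or the deadline passes."""
    session_id = try_once()
    if session_id is not None:
        return session_id

    # Per-attempt sleep is clamped so a bad config cannot spin or stall.
    sleep_seconds = clamp(retry_ms / 1000.0, 0.05, 0.25)
    # A monotonic clock keeps the deadline immune to wall-clock adjustments.
    deadline = time.monotonic() + max(0.0, max_wait_s)
    while time.monotonic() < deadline:
        await asyncio.sleep(sleep_seconds)
        session_id = try_once()
        if session_id is not None:
            return session_id
    raise NoUnitySessionError("No Unity plugins are currently connected")


async def send_command(try_once, max_wait_s: float = 2.0) -> dict:
    """Turn the no-session error into a structured, retryable payload."""
    try:
        session_id = await resolve_session_id(try_once, max_wait_s=max_wait_s)
    except NoUnitySessionError:
        return {
            "success": False,
            "error": "Unity session not available; please retry",
            "hint": "retry",
            "data": {"reason": "no_unity_session", "retry_after_ms": 250},
        }
    # Placeholder for dispatching the real command once a session exists.
    return {"success": True, "data": {"session_id": session_id}}
```

A caller that receives `hint == "retry"` can back off for `retry_after_ms` and try again, instead of blocking for an unbounded time inside the hub.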

File-Level Changes

Change	Details	Files
Introduce structured extraction of Unity response reasons and reuse it for reload detection and logging.	Add _extract_response_reason helper to normalize reason detection from MCPResponse and raw dict payloads, including support for embedded data.reason and message/error text heuristics. Refactor _is_reloading_response to delegate to _extract_response_reason and only check for a standardized "reloading" reason. Ensure non-matching responses return None instead of Falsey booleans to support richer reason handling.	`Server/src/transport/legacy/unity_connection.py`
Bound Unity command reload retries by a maximum wait duration, adjust sleep behavior, and add detailed logging plus a structured fallback error when the budget is exceeded.	Read UNITY_MCP_RELOAD_MAX_WAIT_S from the environment with a safe float parse and non-negative clamping. Track the start time of reload waiting and break the retry loop when the max wait budget is exhausted or zero. Normalize retry_after_ms extraction from both top-level and nested data fields, clamp per-iteration sleep between 50ms and 250ms, and log each retry with command, instance, reason, and sleep values. On exit, log whether the wait completed or exceeded the budget and, if still reloading, return a standardized MCPResponse with hint=retry, reason=reloading, and a bounded retry_after_ms instead of looping indefinitely.	`Server/src/transport/legacy/unity_connection.py`
Bound Unity plugin session resolution by a maximum wait duration and align sleep behavior and logging with the new reload policy.	Replace retry-count-based deadline with a max_wait_s budget derived from UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S and clamp to non-negative values. Adjust sleep interval to respect reload_retry_ms but clamp between 50ms and 250ms for consistency with command retry behavior. Update debug and warning logs to use parameterized logging, reference the new max wait budget, and report actual wait time on success or failure. Keep existing behavior for multi-session error handling while ensuring failure path clearly logs when no plugin reconnects within the budget.	`Server/src/transport/plugin_hub.py`
Surface a structured, retryable error when no Unity plugin sessions are available and treat that reason consistently in the refresh_unity tool.	Wrap _resolve_session_id in send_command_for_instance to intercept 'No Unity plugins are currently connected' RuntimeErrors, logging a debug message and returning an MCPResponse-like dict with success=False, hint=retry, reason=no_unity_session, and a fixed retry_after_ms. Update refresh_unity to extract response.data.reason, include it in retryability decisions, and only mark recovered_from_disconnect when the reason is not 'reloading' or 'no_unity_session', avoiding premature success signaling while Unity is still unavailable. Minor cleanup of trailing whitespace/blank lines in refresh_unity.	`Server/src/transport/plugin_hub.py` `Server/src/services/tools/refresh_unity.py`
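To make the first table row concrete, here is a rough sketch of what a reason-extraction helper along those lines might look like. It is not the implementation in `unity_connection.py`: the function names drop the leading underscore to mark them as illustrative, the `_field` accessor is hypothetical, and the lowercase normalization reflects the follow-up review feedback rather than the first revision of the PR.

```python
from typing import Any, Optional


def _field(resp: Any, name: str) -> Any:
    """Read a field from either a dict payload or an attribute-style object."""
    if isinstance(resp, dict):
        return resp.get(name)
    return getattr(resp, name, None)


def extract_response_reason(resp: Any) -> Optional[str]:
    """Return a normalized reason string, or None when nothing matches."""
    data = _field(resp, "data")
    if isinstance(data, dict):
        reason = data.get("reason")
        if isinstance(reason, str) and reason.strip():
            return reason.strip().lower()
    # Heuristic fallback (assumed): older payloads only mention reload in text.
    text = f"{_field(resp, 'message') or ''} {_field(resp, 'error') or ''}".lower()
    if "reload" in text:
        return "reloading"
    return None


def is_reloading_response(resp: Any) -> bool:
    """Reload detection simply delegates to the extractor."""
    return extract_response_reason(resp) == "reloading"
```

Returning `None` rather than `False` for non-matching responses lets callers distinguish "no reason given" from a specific reason such as `no_unity_session`.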

coderabbitai · 2026-01-04T20:26:06Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Adds normalized extraction of response reasons, refines recoverability for Unity reload/no-session cases, implements time-budgeted reload waiting with per-attempt bounded delays and logging, and introduces bounded session resolution/readiness probing plus a NoUnitySessionError and retry MCPResponse handling.

Changes

Cohort / File(s)	Summary
Refresh Unity recoverability `Server/src/services/tools/refresh_unity.py`	Extracts `reason` from responses via new extractor and sets `recovered_from_disconnect = True` only when `reason` not in `{"reloading","no_unity_session"}`; preserves early-return behavior for non-wait/non-retryable errors.
Unity connection: reason extraction & reload wait `Server/src/transport/legacy/unity_connection.py`	Adds `_extract_response_reason(resp) -> str | None` …
Plugin hub: session resolve & readiness probing `Server/src/transport/plugin_hub.py`	Adds `NoUnitySessionError`. Implements `UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S` and `UNITY_MCP_SESSION_READY_WAIT_SECONDS` with validation/clamping; replaces retry-count dead-ends with time-based deadlines, clamps per-attempt sleeps to 0.05–0.25s, returns retry MCPResponses on no-session or readiness timeout, and updates logging to parameterized messages.

Sequence Diagrams

sequenceDiagram
    participant Client
    participant UnityConn as unity_connection
    participant Unity as Unity Process

    Client->>UnityConn: send_command_with_retry()
    Note over UnityConn: detect reloading -> record wait_start_time
    UnityConn->>Unity: send command
    Unity-->>UnityConn: MCPResponse (may indicate reloading)

    rect rgba(200,230,201,0.4)
        Note over UnityConn: Time-budgeted reload wait (max_wait_s)
        loop while elapsed < max_wait_s and response indicates reloading
            UnityConn->>UnityConn: reason = _extract_response_reason(response)
            UnityConn->>UnityConn: retry_after_ms = clamp(50..250)
            UnityConn->>UnityConn: sleep(retry_after_ms)
            UnityConn->>Unity: retry send command
            Unity-->>UnityConn: MCPResponse
        end
    end

    alt Budget exceeded (still reloading)
        UnityConn-->>Client: MCPResponse(reloading, reason, retry hint)
    else Recovered or non-reloading
        UnityConn-->>Client: final MCPResponse
    end

sequenceDiagram
    participant Client
    participant PluginHub as plugin_hub
    participant UnityConn as unity_connection
    participant Unity as Unity Process

    Client->>PluginHub: send_command_for_instance()

    alt No Unity plugins available
        rect rgba(255,224,178,0.35)
            PluginHub->>PluginHub: wait up to resolve_max_wait_s (clamped)
            PluginHub-->>Client: MCPResponse(retry)  <-- NoUnitySessionError handled
        end
    else Plugin exists but session not ready
        rect rgba(187,222,251,0.35)
            PluginHub->>UnityConn: readiness probe (bounded max wait)
            UnityConn->>Unity: ping
            alt Probe succeeds
                Unity-->>UnityConn: pong
                PluginHub->>UnityConn: send_command()
                UnityConn->>Unity: command
                Unity-->>UnityConn: response
                PluginHub-->>Client: response
            else Probe times out
                PluginHub-->>Client: MCPResponse(retry)
            end
        end
    else Session ready
        PluginHub->>UnityConn: send_command()
        UnityConn->>Unity: command
        Unity-->>UnityConn: response
        PluginHub-->>Client: response
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🐰 a rabbit nibbling on code,

counts each hop and every ping,
reasons found in sleepy logs,
waits bounded by gentle clocks,
then hops back with a happy spring.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Codex/implement bounded retry policy for unity' directly describes the main change: introducing bounded retry policies for Unity connections with configurable timeouts and structured error handling.

sourcery-ai

Hey - I've found 2 issues, and left some high level feedback:

When parsing UNITY_MCP_RELOAD_MAX_WAIT_S and UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S, consider catching ValueError specifically and logging invalid values rather than suppressing all exceptions with a broad except, so misconfiguration is easier to diagnose.
In refresh_unity, the data/reason extraction can be simplified (the second isinstance(data, dict) check is redundant after the first), and you might reuse _extract_response_reason instead of duplicating the reason parsing logic.

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `Server/src/transport/legacy/unity_connection.py:689-698` </location>
<code_context>
def _extract_response_reason(resp: object) -> str | None:
</code_context>

<issue_to_address>
**issue (bug_risk):** Normalize `reason` values (including from `data`) to avoid case-sensitive mismatches.

`_extract_response_reason` returns `data["reason"]` as-is, but callers (e.g. `_is_reloading_response`, `refresh_unity`) compare against lowercase strings like `"reloading"` / `"no_unity_session"`. If Unity returns different casing (e.g. `"Reloading"`), the checks will fail. Normalizing all reasons to lowercase in `_extract_response_reason` (both `data["reason"]` and the hard-coded values) would make these comparisons reliable and case-insensitive.
</issue_to_address>

### Comment 2
<location> `Server/src/services/tools/refresh_unity.py:50-55` </location>
<code_context>
     if isinstance(response, dict) and not response.get("success", True):
         hint = response.get("hint")
         err = (response.get("error") or response.get("message") or "")
+        data = response.get("data") if isinstance(response.get("data"), dict) else {}
+        reason = data.get("reason") if isinstance(data, dict) else None
         is_retryable = (hint == "retry") or ("disconnected" in str(err).lower())
         if (not wait_for_ready) or (not is_retryable):
             return MCPResponse(**response)
-        recovered_from_disconnect = True
+        if reason not in {"reloading", "no_unity_session"}:
+            recovered_from_disconnect = True

</code_context>

<issue_to_address>
**question (bug_risk):** Double-check the new `recovered_from_disconnect` condition; it may change semantics for some retryable cases.

Previously, any retryable error (`hint == "retry"` or `"disconnected" in err`) would set `recovered_from_disconnect = True`. Now that only happens when `reason` is *not* `"reloading"` or `"no_unity_session"`, so those two cases will now leave `recovered_from_disconnect = False` and may change downstream behavior that relies on that flag.

If the goal is to treat `"reloading"` / `"no_unity_session"` as a distinct transient category, consider an explicit `reason`-based branch (with a clear fallback for unknown reasons) instead of a negative check, so the semantics are easier to reason about and extend.
</issue_to_address>

Server/src/transport/legacy/unity_connection.py

sourcery-ai · 2026-01-04T20:26:52Z

Server/src/services/tools/refresh_unity.py

+        data = response.get("data") if isinstance(response.get("data"), dict) else {}
+        reason = data.get("reason") if isinstance(data, dict) else None
        is_retryable = (hint == "retry") or ("disconnected" in str(err).lower())
        if (not wait_for_ready) or (not is_retryable):
            return MCPResponse(**response)
-        recovered_from_disconnect = True
+        if reason not in {"reloading", "no_unity_session"}:


question (bug_risk): Double-check the new recovered_from_disconnect condition; it may change semantics for some retryable cases.

Previously, any retryable error (hint == "retry" or "disconnected" in err) would set recovered_from_disconnect = True. Now that only happens when reason is not "reloading" or "no_unity_session", so those two cases will now leave recovered_from_disconnect = False and may change downstream behavior that relies on that flag.

If the goal is to treat "reloading" / "no_unity_session" as a distinct transient category, consider an explicit reason-based branch (with a clear fallback for unknown reasons) instead of a negative check, so the semantics are easier to reason about and extend.
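One way to read this suggestion is sketched below: name the expected transient reasons explicitly and give unknown reasons a clear fallback. The constant and function names are hypothetical and this is not the code that was merged; it only restates the condition from the diff in positive form.

```python
from typing import Optional

# Reasons that mean "Unity is still coming back", not "we lost the connection".
TRANSIENT_REASONS = frozenset({"reloading", "no_unity_session"})


def classify_retryable(reason: Optional[str]) -> str:
    """Return 'transient' for expected reload/no-session states, else 'disconnect'."""
    if reason in TRANSIENT_REASONS:
        return "transient"
    # Fallback: unknown or missing reasons keep the pre-PR behavior.
    return "disconnect"


def recovered_from_disconnect(hint: Optional[str], err: object, reason: Optional[str]) -> bool:
    """Mirror the retryability check from the diff, then branch on the reason."""
    is_retryable = hint == "retry" or "disconnected" in str(err).lower()
    if not is_retryable:
        return False
    return classify_retryable(reason) == "disconnect"
```

Adding a new transient reason then means extending `TRANSIENT_REASONS` in one place rather than editing a negated condition.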

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (4)

Server/src/transport/legacy/unity_connection.py (1)
754-759: Consider narrowing the exception handler.

The broad Exception catch for parsing UNITY_MCP_RELOAD_MAX_WAIT_S is acceptable given the fallback default, but you could narrow it to ValueError for clarity since that's what float() raises on invalid input.
🔎 Proposed refinement
     try:
         max_wait_s = float(os.environ.get(
             "UNITY_MCP_RELOAD_MAX_WAIT_S", "2.0"))
-    except Exception:
+    except ValueError:
         max_wait_s = 2.0
Server/src/transport/plugin_hub.py (2)
365-370: Consider catching ValueError instead of bare Exception.

For environment variable parsing, float() raises ValueError on invalid input. Catching a specific exception type improves clarity and avoids masking unexpected errors.
Proposed fix
         try:
             max_wait_s = float(
                 os.environ.get("UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S", "2.0"))
-        except Exception:
+        except ValueError:
             max_wait_s = 2.0
453-468: String-based exception matching is fragile.

Checking "No Unity plugins are currently connected" in str(exc) couples this code to the exact wording of the error message at line 442. If that message is ever modified, this condition silently fails to match, causing the exception to propagate instead of returning a retry response.

Consider introducing a dedicated exception type (e.g., NoUnitySessionError) to make matching explicit and refactoring-safe.
Proposed approach
# Near PluginDisconnectedError definition (line 33)
class NoUnitySessionError(RuntimeError):
    """Raised when no Unity plugin session is available."""
-            raise RuntimeError("No Unity plugins are currently connected")
+            raise NoUnitySessionError("No Unity plugins are currently connected")
         try:
             session_id = await cls._resolve_session_id(unity_instance)
-        except RuntimeError as exc:
-            if "No Unity plugins are currently connected" in str(exc):
+        except NoUnitySessionError:
                 logger.debug(
                     "Unity session unavailable; returning retry: command=%s instance=%s",
                     command_type,
                     unity_instance or "default",
                 )
                 return MCPResponse(
                     success=False,
                     error="Unity session not available; please retry",
                     hint="retry",
                     data={"reason": "no_unity_session", "retry_after_ms": 250},
                 ).model_dump()
-            raise
Server/src/services/tools/refresh_unity.py (1)
50-51: Optional: Redundant type check on line 51.

The check isinstance(data, dict) on line 51 is redundant because line 50 guarantees that data is always a dict (either from response.get("data") if it's a dict, or an empty dict {}).
🔎 Proposed simplification
 data = response.get("data") if isinstance(response.get("data"), dict) else {}
-reason = data.get("reason") if isinstance(data, dict) else None
+reason = data.get("reason")

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6bbf137 and 600e353.

📒 Files selected for processing (3)

Server/src/services/tools/refresh_unity.py
Server/src/transport/legacy/unity_connection.py
Server/src/transport/plugin_hub.py

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-11-05T18:23:12.349Z

Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 368
File: MCPForUnity/UnityMcpServer~/src/resources/menu_items.py:15-15
Timestamp: 2025-11-05T18:23:12.349Z
Learning: In Unity MCP, the `name` parameter in the `mcp_for_unity_resource` decorator is the external API name exposed to MCP clients (LLMs, AI agents). The command string passed to `async_send_command_with_retry` or `async_send_with_unity_instance` (e.g., "get_menu_items") is the internal command identifier that must match the C# side. These are decoupled, allowing external API naming to evolve independently of internal command routing.

Applied to files:

Server/src/transport/plugin_hub.py
Server/src/transport/legacy/unity_connection.py

🧬 Code graph analysis (1)

Server/src/transport/legacy/unity_connection.py (1)

Server/src/transport/plugin_hub.py (1)

send_command (133-210)

🪛 Ruff (0.14.10)

Server/src/transport/plugin_hub.py

368-368: Do not catch blind exception: Exception

(BLE001)

Server/src/transport/legacy/unity_connection.py

757-757: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Sourcery review

🔇 Additional comments (6)

Server/src/transport/legacy/unity_connection.py (3)

689-715: Well-structured response normalization helper.

The _extract_response_reason function effectively normalizes both MCPResponse objects and raw dict payloads. The fallback logic that infers "reloading" from message text (lines 697-698, 710-711) is pragmatic for handling legacy responses.

One consideration: the substring check for "reload" in error messages could theoretically match unrelated errors (e.g., "Failed to reload configuration"), though this is unlikely in practice given the Unity bridge context.

717-723: Clean refactoring.

The delegation to _extract_response_reason simplifies this predicate and improves maintainability.

763-825: Excellent time-budgeted retry implementation.

The reload-wait logic is well-designed with:

Configurable timeout via environment variable with safe clamping

Dynamic delay adjustment using Unity's retry_after_ms hint (lines 781-786)

Bounded sleep range (50-250ms) to prevent thrashing or long stalls

Detailed logging for debugging and observability

Graceful degradation returning a structured MCPResponse when the budget is exceeded

The tracking variables (wait_started, reason) and the final logging (lines 801-824) provide good operational visibility into reload behavior.

Server/src/transport/plugin_hub.py (2)

401-427: LGTM!

The deadline computation and parameterized log formatting are well-implemented. Using %-style placeholders in logging calls defers string interpolation until the message is actually emitted, which is both more efficient and a best practice.
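The point about deferred interpolation can be checked with a small standalone experiment. The logger name and the `CountingStr` helper are made up for this sketch and are unrelated to the project's logging setup.

```python
import logging


class CountingStr:
    """Counts how many times it is converted to a string."""

    def __init__(self) -> None:
        self.calls = 0

    def __str__(self) -> str:
        self.calls += 1
        return "session-1"


logger = logging.getLogger("plugin_hub_demo")
logger.setLevel(logging.WARNING)  # DEBUG records are filtered out
logger.propagate = False
logger.addHandler(logging.NullHandler())

value = CountingStr()

# Parameterized call: the level check fails first, so %s is never applied.
logger.debug("Plugin session restored: %s", value)
calls_after_parameterized = value.calls

# f-string call: the message is built before logging even sees it.
logger.debug(f"Plugin session restored: {value}")
calls_after_fstring = value.calls
```

After running this, `calls_after_parameterized` is 0 and `calls_after_fstring` is 1, which is the saving the comment refers to when debug logging is disabled.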

475-501: Readiness probe logic is sound.

The while/else construct correctly distinguishes between a successful ping (breaks out) and timeout exhaustion (else branch returns retry). The clamped upper bound of 30s prevents runaway waits.

For consistency with the earlier suggestion, the exception handling at lines 477-479 and 486-487 could also be narrowed, though bare Exception at line 486 is defensible since any probe failure should be treated as "not ready."
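For readers unfamiliar with `while`/`else`, the readiness probe described above might be structured like this sketch. `wait_until_ready`, the `ping` coroutine, the `interval_s` parameter, the error text, and the `session_not_ready` reason string are all assumptions for illustration; only the 30 second cap, the retry hint, and the break-versus-else distinction come from the review.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional


async def wait_until_ready(
    ping: Callable[[], Awaitable[bool]],
    max_wait_s: float,
    interval_s: float = 0.1,
) -> Optional[dict]:
    """Return None once ping() succeeds, or a retry payload on timeout."""
    max_wait_s = max(0.0, min(max_wait_s, 30.0))  # cap prevents runaway waits
    deadline = time.monotonic() + max_wait_s
    while time.monotonic() < deadline:
        try:
            if await ping():
                break  # ready: skips the else branch below
        except Exception:
            # Any probe failure is treated as "not ready yet".
            pass
        await asyncio.sleep(interval_s)
    else:
        # Runs only when the loop ended without break, i.e. the deadline passed.
        return {
            "success": False,
            "error": "Unity session not ready; please retry",  # wording assumed
            "hint": "retry",
            "data": {"reason": "session_not_ready", "retry_after_ms": 250},
        }
    return None
```

The `else` clause avoids a separate "did we time out?" flag: the only way to reach it is for the loop condition to become false.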
Server/src/services/tools/refresh_unity.py (1)
55-56: Logic is correct, but clarify the comment.

The logic at lines 55-56 is actually intentional and correct, not inverted. The code distinguishes between:

Expected disconnect reasons ("reloading", "no_unity_session"): Return the original response with the reason so callers know why the disconnect occurred.

Unexpected disconnect reasons: Mark as "recovered" and return a success message.

Both "reloading" and "no_unity_session" are transient states from Unity's startup/reload cycle (verified in unity_connection.py and plugin_hub.py), so they should propagate the specific reason to clients rather than obscure it with a generic "recovered" message.

However, the code would benefit from a clarifying comment explaining this distinction. The current comment (line 44-46) mentions that reloading is legitimate, but doesn't explain why "reloading" and "no_unity_session" are excluded from being marked as "recovered". Consider adding a comment like:
# "reloading" and "no_unity_session" are expected transient states;
# only unexpected disconnects get marked as "recovered".
Likely an incorrect or invalid review comment.

Address code review feedback:
- Catch ValueError specifically (instead of broad Exception) when parsing UNITY_MCP_RELOAD_MAX_WAIT_S, UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S, and UNITY_MCP_SESSION_READY_WAIT_SECONDS, with logging for easier diagnosis of misconfiguration
- Normalize reason values to lowercase in _extract_response_reason() to avoid case-sensitive mismatches in comparisons
- Simplify refresh_unity.py by removing redundant isinstance check and reusing _extract_response_reason instead of duplicating reason parsing

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (2)

Server/src/transport/legacy/unity_connection.py (1)

689-729: LGTM! Case-insensitive reason extraction implemented correctly.

The helper functions properly normalize all reason strings to lowercase (lines 700, 713), including hard-coded values like "reloading", enabling reliable case-insensitive comparisons by callers. This addresses the past review concern about case-sensitive mismatches.

Server/src/services/tools/refresh_unity.py (1)

54-55: LGTM! Intentional refinement of recoverability semantics.

This change aligns with the PR objective to distinguish between true disconnects and transient Unity reload/session-unavailable states. The conditional logic correctly identifies only non-reload, non-missing-session retryable errors as disconnect recoveries, enabling more precise downstream handling.

🧹 Nitpick comments (1)

Server/src/transport/legacy/unity_connection.py (1)
805-805: Optional: Remove redundant max(0.0, ...) guard.

Since sleep_ms is already clamped to [50, 250] on line 796, the max(0.0, ...) guard here is redundant.
🔎 Proposed refactor
-time.sleep(max(0.0, sleep_ms / 1000.0))
+time.sleep(sleep_ms / 1000.0)

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 600e353 and 4e1b090.

📒 Files selected for processing (3)

Server/src/services/tools/refresh_unity.py
Server/src/transport/legacy/unity_connection.py
Server/src/transport/plugin_hub.py

🧰 Additional context used

🧠 Learnings (3)

📚 Learning: 2025-11-05T18:23:12.349Z

Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 368
File: MCPForUnity/UnityMcpServer~/src/resources/menu_items.py:15-15
Timestamp: 2025-11-05T18:23:12.349Z
Learning: In Unity MCP, the `name` parameter in the `mcp_for_unity_resource` decorator is the external API name exposed to MCP clients (LLMs, AI agents). The command string passed to `async_send_command_with_retry` or `async_send_with_unity_instance` (e.g., "get_menu_items") is the internal command identifier that must match the C# side. These are decoupled, allowing external API naming to evolve independently of internal command routing.

Applied to files:

Server/src/transport/legacy/unity_connection.py
Server/src/transport/plugin_hub.py
Server/src/services/tools/refresh_unity.py

📚 Learning: 2025-10-13T13:27:23.040Z

Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 316
File: TestProjects/UnityMCPTests/Assets/Tests/EditMode/Resources.meta:1-8
Timestamp: 2025-10-13T13:27:23.040Z
Learning: UnityMcpBridge is a legacy project kept for backwards compatibility; MCPForUnity is the only active Unity plugin project. GUID collisions between UnityMcpBridge and MCPForUnity are acceptable.

Applied to files:

Server/src/services/tools/refresh_unity.py

📚 Learning: 2025-10-13T13:41:00.086Z

Learnt from: JohanHoltby
Repo: CoplayDev/unity-mcp PR: 309
File: MCPForUnity/Editor/Helpers/ServerInstaller.cs:478-508
Timestamp: 2025-10-13T13:41:00.086Z
Learning: In the MCPForUnityTools feature (MCPForUnity/Editor/Helpers/ServerInstaller.cs), the design intentionally forces users to have only one .py file per MCPForUnityTools folder to keep file tracking simple. Package-style tools (subdirectories with __init__.py) are not supported.

Applied to files:

Server/src/services/tools/refresh_unity.py

🧬 Code graph analysis (3)

Server/src/transport/legacy/unity_connection.py (1)

Server/src/transport/plugin_hub.py (1)

send_command (133-210)

Server/src/transport/plugin_hub.py (1)

Server/tests/integration/test_helpers.py (3)

warning (39-40)

error (46-47)

model_dump (10-13)

Server/src/services/tools/refresh_unity.py (1)

Server/src/transport/legacy/unity_connection.py (2)

async_send_command_with_retry (837-869)

_extract_response_reason (689-719)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Sourcery review

🔇 Additional comments (6)

Server/src/transport/legacy/unity_connection.py (1)

759-773: LGTM! Robust environment variable parsing.

The parsing logic correctly handles invalid values with a warning fallback (lines 762-767) and clamps to non-negative values (line 768). The initialization of wait tracking variables is sound.

Server/src/services/tools/refresh_unity.py (1)

13-13: LGTM! Proper use of centralized reason extraction.

The import and usage of _extract_response_reason correctly normalizes Unity response reasons, enabling consistent handling across the codebase.

Also applies to: 50-50

Server/src/transport/plugin_hub.py (4)

364-376: LGTM! Consistent environment variable parsing pattern.

The parsing, error handling, and clamping logic for UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S follows the same robust pattern as in unity_connection.py. The sleep interval is properly bounded to [0.05, 0.25] seconds.

405-405: LGTM! Proper time-based deadline and structured logging.

The deadline calculation uses the new max_wait_s budget (line 405), and all logging statements correctly use parameterized formatting for performance and clarity.

Also applies to: 419-431, 440-443

457-472: LGTM! Structured fast-fail for unavailable sessions.

The fast-fail path correctly returns a structured MCPResponse with reason="no_unity_session" instead of raising an exception. This enables callers to handle the unavailable state with consistent retry semantics, aligning with the broader reason-based retry model introduced in this PR.

479-509: LGTM! Bounded readiness probe prevents premature fast-fail commands.

The readiness probe correctly uses ping/pong to verify Unity's main thread is responsive before dispatching fast-fail commands. The bounded wait (clamped to [0, 30] seconds) and structured retry response on timeout align with the PR's goal of avoiding long waits while maintaining reliability.
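A bounded readiness probe of the kind described here might look like the sketch below. The `ping` callable, the function name `wait_until_ready`, and the polling interval are placeholders for illustration; only the [0, 30] second clamp comes from the review text.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def wait_until_ready(
    ping: Callable[[], Awaitable[bool]],
    max_wait_s: float,
    interval_s: float = 0.1,
) -> bool:
    """Poll `ping` until it reports ready or the bounded budget runs out."""
    budget = max(0.0, min(max_wait_s, 30.0))  # clamp the budget to [0, 30] seconds
    deadline = time.monotonic() + budget
    while True:
        if await ping():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            # Budget exhausted: the caller can turn this into a structured retry response.
            return False
        await asyncio.sleep(min(interval_s, remaining))
```

The probe always pings at least once, so a budget of 0 degrades to a single check rather than skipping the probe entirely.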

coderabbitai · 2026-01-04T21:07:13Z

Server/src/transport/legacy/unity_connection.py

+        delay_ms = retry_ms
+        if isinstance(response, dict):
+            retry_after = response.get("retry_after_ms")
+            if retry_after is None and isinstance(response.get("data"), dict):
+                retry_after = response["data"].get("retry_after_ms")
+            if retry_after is not None:
+                delay_ms = int(retry_after)
+        sleep_ms = max(50, min(int(delay_ms), 250))
+        logger.debug(
+            "Unity reload wait retry: command=%s instance=%s reason=%s retry_after_ms=%s sleep_ms=%s",
+            command_type,
+            instance_id or "default",
+            reason or "reloading",
+            delay_ms,
+            sleep_ms,
+        )
+        time.sleep(max(0.0, sleep_ms / 1000.0))
        retries += 1
        response = conn.send_command(command_type, params)
+        reason = _extract_response_reason(response)
+
+    if wait_started is not None:
+        waited = time.monotonic() - wait_started
+        if _is_reloading_response(response):
+            logger.debug(
+                "Unity reload wait exceeded budget: command=%s instance=%s waited_s=%.3f",
+                command_type,
+                instance_id or "default",
+                waited,
+            )
+            return MCPResponse(
+                success=False,
+                error="Unity is reloading; please retry",
+                hint="retry",
+                data={
+                    "reason": "reloading",
+                    "retry_after_ms": min(250, max(50, retry_ms)),
+                },


⚠️ Potential issue | 🟡 Minor

Inconsistent retry_after_ms calculation.

Line 796 clamps int(delay_ms) to [50, 250], but line 825 clamps the original retry_ms parameter. This means the returned retry_after_ms (line 825) may differ from the actual sleep duration used in retries (line 796), especially when Unity provides a custom retry_after_ms in the response (lines 791-795).

For consistency and accuracy, consider using the same computed sleep_ms value:

🔎 Proposed fix

         sleep_ms = max(50, min(int(delay_ms), 250))
         logger.debug(
             "Unity reload wait retry: command=%s instance=%s reason=%s retry_after_ms=%s sleep_ms=%s",
             command_type,
             instance_id or "default",
             reason or "reloading",
             delay_ms,
             sleep_ms,
         )
         time.sleep(max(0.0, sleep_ms / 1000.0))
         retries += 1
         response = conn.send_command(command_type, params)
         reason = _extract_response_reason(response)

     if wait_started is not None:
         waited = time.monotonic() - wait_started
         if _is_reloading_response(response):
             logger.debug(
                 "Unity reload wait exceeded budget: command=%s instance=%s waited_s=%.3f",
                 command_type,
                 instance_id or "default",
                 waited,
             )
             return MCPResponse(
                 success=False,
                 error="Unity is reloading; please retry",
                 hint="retry",
                 data={
                     "reason": "reloading",
-                    "retry_after_ms": min(250, max(50, retry_ms)),
+                    "retry_after_ms": sleep_ms,
                 },
             )

🤖 Prompt for AI Agents

In Server/src/transport/legacy/unity_connection.py around lines 789 to 826, the computed sleep_ms (which clamps delay_ms to [50,250]) is used for the actual sleep but the returned retry_after_ms is computed from the original retry_ms parameter, causing inconsistency; change the returned retry_after_ms to use the same clamped value (sleep_ms) used for sleeping (converted to int milliseconds) so the response accurately reflects the actual retry delay.
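A small worked example makes the divergence concrete. The input values below (a `retry_ms` of 250 and a Unity-supplied `retry_after_ms` of 100) are chosen purely for illustration, and `clamp_ms` is a hypothetical helper that mirrors the clamp expression in the diff.

```python
def clamp_ms(value: int, low: int = 50, high: int = 250) -> int:
    """Clamp a millisecond delay to [low, high], as the retry loop does."""
    return max(low, min(int(value), high))


retry_ms = 250              # caller-supplied default delay (illustrative value)
unity_retry_after_ms = 100  # delay suggested by Unity in its response (illustrative value)

sleep_ms = clamp_ms(unity_retry_after_ms)      # what the loop actually sleeps: 100
reported_before = min(250, max(50, retry_ms))  # what the response reported before the fix: 250
reported_after = sleep_ms                      # what the proposed fix reports: 100
```

One caveat, stated as a possibility rather than a confirmed bug: `sleep_ms` is assigned inside the loop body, so if the surrounding code can reach the return without executing that assignment (for example when the wait budget is 0), the name would need to be initialized before the loop.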

- Add upper bound (30s) to UNITY_MCP_RELOAD_MAX_WAIT_S to prevent misconfiguration from causing excessive waits
- Add upper bound (30s) to UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S for consistency with readiness probe
- Introduce NoUnitySessionError custom exception to replace fragile string matching in send_command_for_instance

Addresses code review feedback for bounded retry policy PR.

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

Server/src/transport/legacy/unity_connection.py (1)
790-828: Address the inconsistent retry_after_ms calculation.

As noted in a previous review, line 826 returns retry_after_ms computed from the original retry_ms parameter, but line 797 uses the clamped sleep_ms for the actual sleep delay. When Unity provides a custom retry_after_ms in the response (lines 791-796), these values can diverge, making the returned retry_after_ms inaccurate.

For consistency and accuracy, the returned retry_after_ms should reflect the actual sleep duration used.
🔎 Proposed fix
                 "reason": "reloading",
-                "retry_after_ms": min(250, max(50, retry_ms)),
+                "retry_after_ms": sleep_ms,
             },
         )

🧹 Nitpick comments (1)

Server/src/transport/plugin_hub.py (1)
37-39: Consider defining the default error message in the exception class.

While the current implementation is functional, the static analysis hint at line 451 is valid. Moving the error message to the exception class improves reusability and follows Python best practices.
🔎 Proposed refactor
 class NoUnitySessionError(RuntimeError):
-    """Raised when no Unity plugins are available."""
+    """Raised when no Unity plugins are available."""
+    
+    def __init__(self, message: str = "No Unity plugins are currently connected"):
+        super().__init__(message)
Then at line 451, simplify to:
-            raise NoUnitySessionError("No Unity plugins are currently connected")
+            raise NoUnitySessionError()

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4e1b090 and f4d2637.

📒 Files selected for processing (2)

Server/src/transport/legacy/unity_connection.py
Server/src/transport/plugin_hub.py

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-11-05T18:23:12.349Z

Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 368
File: MCPForUnity/UnityMcpServer~/src/resources/menu_items.py:15-15
Timestamp: 2025-11-05T18:23:12.349Z
Learning: In Unity MCP, the `name` parameter in the `mcp_for_unity_resource` decorator is the external API name exposed to MCP clients (LLMs, AI agents). The command string passed to `async_send_command_with_retry` or `async_send_with_unity_instance` (e.g., "get_menu_items") is the internal command identifier that must match the C# side. These are decoupled, allowing external API naming to evolve independently of internal command routing.

Applied to files:

Server/src/transport/legacy/unity_connection.py
Server/src/transport/plugin_hub.py

🧬 Code graph analysis (1)

Server/src/transport/plugin_hub.py (1)

Server/tests/integration/test_helpers.py (1)

model_dump (10-13)

🪛 Ruff (0.14.10)

Server/src/transport/plugin_hub.py

451-451: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Sourcery review

🔇 Additional comments (8)

Server/src/transport/plugin_hub.py (4)

368-381: LGTM! Well-structured environment variable handling.

The bounded retry implementation correctly:

Parses and validates the environment variable with appropriate error handling

Clamps max_wait_s to [0, 30] to prevent misconfiguration

Bounds sleep intervals to [0.05, 0.25] seconds

Uses parameterized logging for debugging
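The parse, fall back, and clamp pattern listed above can be sketched as a small helper. The names `read_bounded_seconds` and `clamp_sleep_s` are hypothetical, and the default value is left to the caller because the review does not state it; only the [0, 30] and [0.05, 0.25] bounds are taken from the text.

```python
import logging
import os

logger = logging.getLogger(__name__)


def read_bounded_seconds(name: str, default: float, upper: float = 30.0) -> float:
    """Read a float number of seconds from the environment, clamped to [0, upper]."""
    raw = os.environ.get(name)
    value = default
    if raw is not None:
        try:
            value = float(raw)
        except ValueError:
            # Parameterized logging: arguments are only formatted if the record is emitted.
            logger.warning("Invalid %s=%r; using default %.1fs", name, raw, default)
            value = default
    return max(0.0, min(value, upper))


def clamp_sleep_s(interval_s: float) -> float:
    """Bound a per-iteration sleep to [0.05, 0.25] seconds."""
    return max(0.05, min(interval_s, 0.25))
```

A call such as `read_bounded_seconds("UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S", default=2.0)` would then yield a usable budget even when the variable is unset, malformed, negative, or far too large; the `2.0` here is an arbitrary illustrative default.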

410-451: LGTM! Consistent time-based deadline and logging.

The implementation correctly:

Uses time.monotonic() for deadline calculations (immune to system clock changes)

Applies parameterized logging consistently throughout the wait loop

Raises the new NoUnitySessionError exception for structured error handling

462-475: LGTM! Structured error handling for unavailable sessions.

The exception handling correctly:

Catches the new NoUnitySessionError exception

Returns a structured MCPResponse with retry semantics

Includes diagnostic data (reason and retry_after_ms) for clients

Uses appropriate logging
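As a rough sketch of this pattern, the snippet below converts the exception into a structured retry response. The `MCPResponse` dataclass is a simplified stand-in for the project's model, and `resolve_session`, `send_for_instance`, and the `retry_after_ms` value of 250 are assumptions for illustration, not the code in plugin_hub.py.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


class NoUnitySessionError(RuntimeError):
    """Raised when no Unity plugins are available."""


@dataclass
class MCPResponse:
    """Simplified stand-in for the project's response model."""
    success: bool
    error: Optional[str] = None
    hint: Optional[str] = None
    data: Dict[str, Any] = field(default_factory=dict)


def resolve_session(sessions: Dict[str, str], instance: Optional[str]) -> str:
    """Pick a session id, raising a typed error when none are connected."""
    if not sessions:
        raise NoUnitySessionError("No Unity plugins are currently connected")
    if instance in sessions:
        return sessions[instance]
    return next(iter(sessions.values()))


def send_for_instance(sessions: Dict[str, str], instance: Optional[str], command: str) -> MCPResponse:
    """Return a structured retry response instead of propagating the exception."""
    try:
        session_id = resolve_session(sessions, instance)
    except NoUnitySessionError as exc:
        return MCPResponse(
            success=False,
            error=str(exc),
            hint="retry",
            data={"reason": "no_unity_session", "retry_after_ms": 250},
        )
    return MCPResponse(success=True, data={"session_id": session_id, "command": command})
```

Catching a dedicated exception type keeps the fast-fail branch independent of the wording of the error message, which is the fragility the earlier string matching had.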

483-491: LGTM! Consistent environment variable handling.

The readiness wait configuration follows the same robust pattern as session resolution:

Validates environment variable with appropriate exception handling

Clamps to [0, 30] to prevent misconfiguration

Uses parameterized logging for diagnostics

Server/src/transport/legacy/unity_connection.py (4)

689-720: LGTM! Robust response reason extraction.

The implementation correctly:

Normalizes all reason values to lowercase for case-insensitive comparisons

Handles both MCPResponse objects and raw dict responses

Extracts reasons from both data["reason"] and message/error text

Maps reload-related responses to the canonical "reloading" string

Returns None for non-reloadable cases

This addresses the previous review concern about case-sensitive mismatches.
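The behaviour listed above can be approximated with the sketch below. It is a minimal illustration under stated assumptions, not the body of `_extract_response_reason`: the name `extract_reason`, the attribute names read from non-dict responses, and the substring test for reload text are all guesses at a plausible shape.

```python
from typing import Any, Optional


def extract_reason(response: Any) -> Optional[str]:
    """Return a lowercase reason string, or None when no reason is present."""
    if not isinstance(response, dict):
        # Response objects: read the relevant attributes into a dict view.
        response = {
            "data": getattr(response, "data", None),
            "message": getattr(response, "message", None),
            "error": getattr(response, "error", None),
        }
    data = response.get("data")
    if isinstance(data, dict) and isinstance(data.get("reason"), str):
        # An explicit reason wins; normalize case so callers can compare directly.
        return data["reason"].strip().lower()
    text = f"{response.get('message') or ''} {response.get('error') or ''}".lower()
    if "reload" in text:
        # Map reload-related text to the canonical reason string.
        return "reloading"
    return None
```

Because every return path yields either a lowercase string or None, a caller can write `reason == "reloading"` without its own case handling.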

722-728: LGTM! Clean semantic wrapper.

The function provides a clear, semantic interface for checking reload status by leveraging the normalized reason extraction.

759-769: LGTM! Consistent bounded retry configuration.

The environment variable handling correctly:

Parses and validates UNITY_MCP_RELOAD_MAX_WAIT_S with appropriate error handling

Clamps to [0, 30] to prevent excessive waits from misconfiguration

Uses parameterized logging for diagnostics

Follows the same pattern as plugin_hub.py for consistency

773-789: LGTM! Well-implemented time-budgeted retry loop.

The retry implementation correctly:

Uses time.monotonic() for accurate elapsed time measurement

Tracks wait start time and enforces the time budget

Provides extensive diagnostic logging at each stage (start, per-retry, exceeded, completed)

Returns structured MCPResponse with retry hints when the budget is exceeded

Clamps per-retry sleep to [50, 250]ms for bounded delays

The observability and bounded behavior align well with the PR objectives.

Also applies to: 807-835
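A time-budgeted retry loop along these lines is sketched below, using plain dicts in place of the project's response types. The function names, the injectable `sleep` parameter, and the dict-based reload check are assumptions for the example; the [50, 250] ms clamp and the `retry_after_ms` lookup follow the diff quoted earlier in this review. The sketch initializes `sleep_ms` before the loop and reports that same value when the budget is exceeded.

```python
import time
from typing import Any, Callable, Dict, Optional


def is_reloading(response: Dict[str, Any]) -> bool:
    """Hypothetical check: the response carries reason == "reloading"."""
    data = response.get("data")
    return isinstance(data, dict) and data.get("reason") == "reloading"


def send_with_reload_budget(
    send: Callable[[], Dict[str, Any]],
    max_wait_s: float,
    retry_ms: int = 250,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Retry `send` while Unity reports a reload, within a total time budget."""
    response = send()
    wait_started: Optional[float] = None
    sleep_ms = max(50, min(int(retry_ms), 250))  # defined even if no sleep happens
    while is_reloading(response):
        if wait_started is None:
            wait_started = time.monotonic()
        if time.monotonic() - wait_started >= max_wait_s:
            # Budget exceeded: report the delay that was actually in use.
            return {
                "success": False,
                "error": "Unity is reloading; please retry",
                "hint": "retry",
                "data": {"reason": "reloading", "retry_after_ms": sleep_ms},
            }
        delay_ms = retry_ms
        retry_after = response.get("retry_after_ms")
        if retry_after is None and isinstance(response.get("data"), dict):
            retry_after = response["data"].get("retry_after_ms")
        if retry_after is not None:
            delay_ms = int(retry_after)
        sleep_ms = max(50, min(int(delay_ms), 250))  # per-retry clamp to [50, 250] ms
        sleep(sleep_ms / 1000.0)
        response = send()
    return response
```

Passing `sleep` as a parameter is only a convenience for testing the loop without real delays; a production version would more likely call `time.sleep` directly.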

dsarno added 20 commits January 2, 2026 11:56

Add editor readiness v2, refresh tool, and preflight guards

95b6210

Detect external package changes and harden refresh retry

3d58862

docs: add async test tools to README, document domain reload limitation

3550f8f

- Add run_tests_async and get_test_job to main README tools list
- Document background stall limitation for domain reload tests in DEV readme

ci: add separate job for domain reload tests

7d75a18

Run [Explicit] domain_reload tests in their own job using -testCategory

ci: run domain reload tests in same job as regular tests

5aa22d9

Combines into single job with two test steps to reuse cached Library

fix: address coderabbit review issues

6e7de9e

- Fix TOCTOU race in TestJobManager.StartJob (single lock scope for check-and-set)
- Store TestRunnerApi reference with HideAndDontSave to prevent GC/serialization issues

docs: update tool descriptions to prefer run_tests_async

bb77db4

- run_tests_async is now marked as preferred for long-running suites
- run_tests description notes it blocks and suggests async alternative

docs: update README screenshot to v8.6 UI

16b8d00

docs: add v8.6 UI screenshot

f554eec

docs: update v8.6 UI screenshot

a8f34cd

docs: update v8.6 UI screenshot

a3977f6

docs: update v8.6 UI screenshot

3b7a4f8

Update README for MCP version and instructions for v8.7

766112b

fix: derive job status from test results in FinalizeFromTask fallback

e5774bb

Apply same logic as FinalizeCurrentJobFromRunFinished: check result.Failed > 0 to correctly mark jobs as Failed when tests fail, even in the fallback path when RunFinished callback is not delivered.

Bound Unity reload/session waits

39aa7d0

Merge upstream/main into codex/implement-bounded-retry-policy-for-unity

600e353

sourcery-ai bot reviewed Jan 4, 2026

View reviewed changes

coderabbitai bot reviewed Jan 4, 2026

View reviewed changes

dsarno merged commit 96b81ca into CoplayDev:main Jan 4, 2026
2 checks passed

dsarno deleted the codex/implement-bounded-retry-policy-for-unity branch January 4, 2026 22:57
