Codex/implement bounded retry policy for unity #510
Conversation
… stall prevention
- Add TestRunnerNoThrottle.cs: sets editor to 'No Throttling' mode during test runs with SessionState persistence across domain reload
- Add run_tests_async and get_test_job tools for non-blocking test execution
- Add TestJobManager for async test job tracking with progress monitoring
- Add ForceSynchronousImport to all AssetDatabase.Refresh() calls to prevent stalls
- Mark DomainReloadResilienceTests as [Explicit] with documentation explaining the test infrastructure limitation (internal coroutine waits vs MCP socket polling)
- MCP workflow is unaffected: socket messages provide external stimulus that keeps Unity responsive even when backgrounded

- Remove unused Newtonsoft.Json.Linq import from TestJobManager
- Add throttling to SessionState persistence (once per second) to reduce overhead
- Critical job state changes (start/finish) still persist immediately
- Fix duplicate XML summary tag in DomainReloadResilienceTests

- Add run_tests_async and get_test_job to main README tools list
- Document background stall limitation for domain reload tests in DEV readme
Run [Explicit] domain_reload tests in their own job using -testCategory
Combines into single job with two test steps to reuse cached Library
- Fix TOCTOU race in TestJobManager.StartJob (single lock scope for check-and-set)
- Store TestRunnerApi reference with HideAndDontSave to prevent GC/serialization issues

- run_tests_async is now marked as preferred for long-running suites
- run_tests description notes it blocks and suggests async alternative

…sults
- manage_asset, manage_gameobject, manage_scene now check preflight return value and propagate busy/retry signals to clients (fixes Sourcery #1)
- TestJobManager.FinalizeCurrentJobFromRunFinished now sets job status to Failed when resultPayload.Failed > 0, not always Succeeded (fixes Sourcery #2)

When 'Force fresh server install' is enabled, uvx uses --no-cache --refresh which rebuilds the package and takes significantly longer to start.
- Increase timeout from 10s to 45s when dev mode is enabled
- Add informative log message explaining the longer startup time
- Show actual timeout value in warning message
Apply same logic as FinalizeCurrentJobFromRunFinished: check result.Failed > 0 to correctly mark jobs as Failed when tests fail, even in the fallback path when RunFinished callback is not delivered.
Reviewer's Guide
Implements bounded, reason-aware retry behavior for Unity command and session handling, including an overall max-wait budget, improved retry interval handling, structured retry reasons, and consistent behavior across legacy transport, plugin hub, and refresh_unity tool.
Sequence diagram for PluginHub session resolution and retry response
sequenceDiagram
actor Client
participant PluginHub as PluginHub
participant Registry as PluginRegistry
participant UnityEditor as UnityEditor
Client->>PluginHub: send_command_for_instance(unity_instance, command_type, params)
PluginHub->>PluginHub: _resolve_session_id(unity_instance)
PluginHub->>Registry: lookup sessions for unity_instance
Registry-->>PluginHub: sessions, counts
alt No sessions available
PluginHub->>PluginHub: max_wait_s from UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S
PluginHub->>PluginHub: retry_ms from config, sleep_seconds = clamp(retry_ms/1000, 0.05, 0.25)
PluginHub->>PluginHub: deadline = time.monotonic() + max_wait_s
PluginHub->>PluginHub: log No plugin session available, waiting up to max_wait_s
loop Until session restored or deadline reached
PluginHub->>PluginHub: await asyncio.sleep(sleep_seconds)
PluginHub->>Registry: _try_once()
Registry-->>PluginHub: session_id, session_count
end
alt session_id restored
PluginHub->>PluginHub: log Plugin session restored
else No session_id after wait
PluginHub->>PluginHub: log No Unity plugin reconnected within max_wait_s
PluginHub->>Client: raise RuntimeError("No Unity plugins are currently connected")
end
else Session already available
PluginHub->>PluginHub: return session_id
end
Note over Client,PluginHub: send_command_for_instance handles RuntimeError
PluginHub->>PluginHub: try: session_id = await _resolve_session_id(unity_instance)
alt RuntimeError with "No Unity plugins are currently connected"
PluginHub->>PluginHub: log Unity session unavailable, returning retry
PluginHub-->>Client: MCPResponse(success=False, error="Unity session not available, please retry", hint=retry, data={reason: no_unity_session, retry_after_ms: 250})
else Other errors
PluginHub-->>Client: propagate exception
end
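Read as code, the bounded wait in the diagram is a monotonic deadline plus a clamped per-attempt sleep. The sketch below is a minimal illustration of that shape, not the PR's actual implementation; the environment variable name and the 50-250 ms clamp come from the guide above, while `try_resolve` is a hypothetical stand-in for the registry lookup.

```python
import asyncio
import os
import time


async def resolve_with_budget(try_resolve, retry_ms: int = 100):
    """Poll try_resolve() until it yields a session id or the wait budget is spent."""
    try:
        max_wait_s = float(os.environ.get("UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S", "2.0"))
    except ValueError:
        max_wait_s = 2.0
    max_wait_s = max(0.0, max_wait_s)                    # never negative
    sleep_s = min(max(retry_ms / 1000.0, 0.05), 0.25)    # per-attempt delay clamped to 50-250 ms

    deadline = time.monotonic() + max_wait_s
    session_id = await try_resolve()
    while session_id is None and time.monotonic() < deadline:
        await asyncio.sleep(sleep_s)
        session_id = await try_resolve()
    return session_id  # None tells the caller to surface a "please retry" response
```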
File-Level Changes
Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.
📝 Walkthrough
Adds normalized extraction of response reasons, refines recoverability for Unity reload/no-session cases, implements time-budgeted reload waiting with per-attempt bounded delays and logging, and introduces bounded session resolution/readiness probing plus a NoUnitySessionError and retry MCPResponse handling.
Changes
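For tool callers, the practical takeaway from the walkthrough is the retry contract: a failed response carries hint="retry" plus data.reason (for example "reloading" or "no_unity_session") and data.retry_after_ms. A hedged client-side sketch of honoring that contract follows; the field names are taken from the walkthrough, and `send_once` is a hypothetical transport call.

```python
import time


def send_with_retry(send_once, max_attempts: int = 5):
    """send_once() is a hypothetical callable returning a dict-shaped MCP response."""
    resp = send_once()
    for _ in range(max_attempts - 1):
        retryable = (not resp.get("success", True)) and resp.get("hint") == "retry"
        if not retryable:
            break
        data = resp.get("data") or {}
        # reason is e.g. "reloading" or "no_unity_session"; retry_after_ms is a server hint.
        delay_ms = int(data.get("retry_after_ms", 250))
        time.sleep(min(max(delay_ms, 50), 250) / 1000.0)
        resp = send_once()
    return resp
```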
Sequence Diagrams
sequenceDiagram
participant Client
participant UnityConn as unity_connection
participant Unity as Unity Process
Client->>UnityConn: send_command_with_retry()
Note over UnityConn: detect reloading -> record wait_start_time
UnityConn->>Unity: send command
Unity-->>UnityConn: MCPResponse (may indicate reloading)
rect rgba(200,230,201,0.4)
Note over UnityConn: Time-budgeted reload wait (max_wait_s)
loop while elapsed < max_wait_s and response indicates reloading
UnityConn->>UnityConn: reason = _extract_response_reason(response)
UnityConn->>UnityConn: retry_after_ms = clamp(50..250)
UnityConn->>UnityConn: sleep(retry_after_ms)
UnityConn->>Unity: retry send command
Unity-->>UnityConn: MCPResponse
end
end
alt Budget exceeded (still reloading)
UnityConn-->>Client: MCPResponse(reloading, reason, retry hint)
else Recovered or non-reloading
UnityConn-->>Client: final MCPResponse
end
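In the legacy path sketched above, the same budget idea applies synchronously: keep resending while the response indicates reloading, sleeping a clamped retry_after_ms between attempts. A simplified sketch, not the module's actual function; `send_once` and `is_reloading` are hypothetical stand-ins.

```python
import time


def send_with_reload_budget(send_once, is_reloading, max_wait_s: float = 2.0):
    """Retry send_once() while Unity reports it is reloading, within a time budget."""
    response = send_once()
    deadline = time.monotonic() + max(0.0, max_wait_s)
    while is_reloading(response) and time.monotonic() < deadline:
        data = response.get("data") if isinstance(response, dict) else None
        data = data if isinstance(data, dict) else {}
        delay_ms = int(data.get("retry_after_ms", 100))
        time.sleep(min(max(delay_ms, 50), 250) / 1000.0)  # clamp each sleep to 50-250 ms
        response = send_once()
    return response  # still "reloading" here means the caller should answer with a retry hint
```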
sequenceDiagram
participant Client
participant PluginHub as plugin_hub
participant UnityConn as unity_connection
participant Unity as Unity Process
Client->>PluginHub: send_command_for_instance()
alt No Unity plugins available
rect rgba(255,224,178,0.35)
PluginHub->>PluginHub: wait up to resolve_max_wait_s (clamped)
PluginHub-->>Client: MCPResponse(retry) <-- NoUnitySessionError handled
end
else Plugin exists but session not ready
rect rgba(187,222,251,0.35)
PluginHub->>UnityConn: readiness probe (bounded max wait)
UnityConn->>Unity: ping
alt Probe succeeds
Unity-->>UnityConn: pong
PluginHub->>UnityConn: send_command()
UnityConn->>Unity: command
Unity-->>UnityConn: response
PluginHub-->>Client: response
else Probe times out
PluginHub-->>Client: MCPResponse(retry)
end
end
else Session ready
PluginHub->>UnityConn: send_command()
UnityConn->>Unity: command
Unity-->>UnityConn: response
PluginHub-->>Client: response
end
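The readiness-probe branch above amounts to pinging Unity until a deadline passes and treating any probe failure as "not ready". A rough async sketch under that assumption; the `ping` coroutine is hypothetical.

```python
import asyncio
import time


async def wait_until_ready(ping, max_wait_s: float = 2.0, interval_s: float = 0.1) -> bool:
    """Return True once ping() succeeds, False when the bounded wait is exhausted."""
    deadline = time.monotonic() + min(max(max_wait_s, 0.0), 30.0)
    while time.monotonic() < deadline:
        try:
            if await ping():          # hypothetical ping/pong round trip to Unity
                return True
        except Exception:
            pass                      # any probe failure counts as "not ready yet"
        await asyncio.sleep(interval_s)
    return False                      # caller should return a retry MCPResponse
```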
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
Hey - I've found 2 issues, and left some high level feedback:
- When parsing UNITY_MCP_RELOAD_MAX_WAIT_S and UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S, consider catching ValueError specifically and logging invalid values rather than suppressing all exceptions with a broad except, so misconfiguration is easier to diagnose.
- In refresh_unity, the data/reason extraction can be simplified (the second isinstance(data, dict) check is redundant after the first), and you might reuse _extract_response_reason instead of duplicating the reason parsing logic.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- When parsing UNITY_MCP_RELOAD_MAX_WAIT_S and UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S, consider catching ValueError specifically and logging invalid values rather than suppressing all exceptions with a broad except, so misconfiguration is easier to diagnose.
- In refresh_unity, the data/reason extraction can be simplified (the second isinstance(data, dict) check is redundant after the first), and you might reuse _extract_response_reason instead of duplicating the reason parsing logic.
## Individual Comments
### Comment 1
<location> `Server/src/transport/legacy/unity_connection.py:689-698` </location>
<code_context>
def _extract_response_reason(resp: object) -> str | None:
</code_context>
<issue_to_address>
**issue (bug_risk):** Normalize `reason` values (including from `data`) to avoid case-sensitive mismatches.
`_extract_response_reason` returns `data["reason"]` as-is, but callers (e.g. `_is_reloading_response`, `refresh_unity`) compare against lowercase strings like `"reloading"` / `"no_unity_session"`. If Unity returns different casing (e.g. `"Reloading"`), the checks will fail. Normalizing all reasons to lowercase in `_extract_response_reason` (both `data["reason"]` and the hard-coded values) would make these comparisons reliable and case-insensitive.
</issue_to_address>
### Comment 2
<location> `Server/src/services/tools/refresh_unity.py:50-55` </location>
<code_context>
if isinstance(response, dict) and not response.get("success", True):
hint = response.get("hint")
err = (response.get("error") or response.get("message") or "")
+ data = response.get("data") if isinstance(response.get("data"), dict) else {}
+ reason = data.get("reason") if isinstance(data, dict) else None
is_retryable = (hint == "retry") or ("disconnected" in str(err).lower())
if (not wait_for_ready) or (not is_retryable):
return MCPResponse(**response)
- recovered_from_disconnect = True
+ if reason not in {"reloading", "no_unity_session"}:
+ recovered_from_disconnect = True
</code_context>
<issue_to_address>
**question (bug_risk):** Double-check the new `recovered_from_disconnect` condition; it may change semantics for some retryable cases.
Previously, any retryable error (`hint == "retry"` or `"disconnected" in err`) would set `recovered_from_disconnect = True`. Now that only happens when `reason` is *not* `"reloading"` or `"no_unity_session"`, so those two cases will now leave `recovered_from_disconnect = False` and may change downstream behavior that relies on that flag.
If the goal is to treat `"reloading"` / `"no_unity_session"` as a distinct transient category, consider an explicit `reason`-based branch (with a clear fallback for unknown reasons) instead of a negative check, so the semantics are easier to reason about and extend.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Actionable comments posted: 0
🧹 Nitpick comments (4)
Server/src/transport/legacy/unity_connection.py (1)
754-759: Consider narrowing the exception handler.
The broad `Exception` catch for parsing `UNITY_MCP_RELOAD_MAX_WAIT_S` is acceptable given the fallback default, but you could narrow it to `ValueError` for clarity, since that's what `float()` raises on invalid input.
🔎 Proposed refinement:
```diff
 try:
     max_wait_s = float(os.environ.get(
         "UNITY_MCP_RELOAD_MAX_WAIT_S", "2.0"))
-except Exception:
+except ValueError:
     max_wait_s = 2.0
```
Server/src/transport/plugin_hub.py (2)
365-370: Consider catching ValueError instead of bare Exception.
For environment variable parsing, `float()` raises `ValueError` on invalid input. Catching a specific exception type improves clarity and avoids masking unexpected errors.
Proposed fix:
```diff
 try:
     max_wait_s = float(
         os.environ.get("UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S", "2.0"))
-except Exception:
+except ValueError:
     max_wait_s = 2.0
```
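Both of these nitpicks reduce to the same parse-then-clamp pattern, which could be factored into one helper. A sketch under the assumption that only the variable name and default differ per call site; the helper itself is hypothetical and not part of the PR.

```python
import logging
import os

logger = logging.getLogger(__name__)


def float_env(name: str, default: float, lo: float = 0.0, hi: float = 30.0) -> float:
    """Parse a float env var, logging (not hiding) bad values, then clamp to [lo, hi]."""
    raw = os.environ.get(name, str(default))
    try:
        value = float(raw)
    except ValueError:
        logger.warning("Invalid %s=%r; falling back to %s", name, raw, default)
        value = default
    return min(max(value, lo), hi)


# e.g. max_wait_s = float_env("UNITY_MCP_RELOAD_MAX_WAIT_S", 2.0)
```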
453-468: String-based exception matching is fragile.
Checking `"No Unity plugins are currently connected" in str(exc)` couples this code to the exact wording of the error message at line 442. If that message is ever modified, this condition silently fails to match, causing the exception to propagate instead of returning a retry response.
Consider introducing a dedicated exception type (e.g., `NoUnitySessionError`) to make matching explicit and refactoring-safe.
Proposed approach:
```diff
 # Near PluginDisconnectedError definition (line 33)
+class NoUnitySessionError(RuntimeError):
+    """Raised when no Unity plugin session is available."""

-    raise RuntimeError("No Unity plugins are currently connected")
+    raise NoUnitySessionError("No Unity plugins are currently connected")

     try:
         session_id = await cls._resolve_session_id(unity_instance)
-    except RuntimeError as exc:
-        if "No Unity plugins are currently connected" in str(exc):
+    except NoUnitySessionError:
         logger.debug(
             "Unity session unavailable; returning retry: command=%s instance=%s",
             command_type,
             unity_instance or "default",
         )
         return MCPResponse(
             success=False,
             error="Unity session not available; please retry",
             hint="retry",
             data={"reason": "no_unity_session", "retry_after_ms": 250},
         ).model_dump()
-        raise
```
Server/src/services/tools/refresh_unity.py (1)
50-51: Optional: Redundant type check on line 51.
The check `isinstance(data, dict)` on line 51 is redundant because line 50 guarantees that `data` is always a dict (either from `response.get("data")` if it's a dict, or an empty dict `{}`).
🔎 Proposed simplification:
```diff
 data = response.get("data") if isinstance(response.get("data"), dict) else {}
-reason = data.get("reason") if isinstance(data, dict) else None
+reason = data.get("reason")
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
Server/src/services/tools/refresh_unity.py
Server/src/transport/legacy/unity_connection.py
Server/src/transport/plugin_hub.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-05T18:23:12.349Z
Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 368
File: MCPForUnity/UnityMcpServer~/src/resources/menu_items.py:15-15
Timestamp: 2025-11-05T18:23:12.349Z
Learning: In Unity MCP, the `name` parameter in the `mcp_for_unity_resource` decorator is the external API name exposed to MCP clients (LLMs, AI agents). The command string passed to `async_send_command_with_retry` or `async_send_with_unity_instance` (e.g., "get_menu_items") is the internal command identifier that must match the C# side. These are decoupled, allowing external API naming to evolve independently of internal command routing.
Applied to files:
Server/src/transport/plugin_hub.py
Server/src/transport/legacy/unity_connection.py
🧬 Code graph analysis (1)
Server/src/transport/legacy/unity_connection.py (1)
Server/src/transport/plugin_hub.py (1)
send_command(133-210)
🪛 Ruff (0.14.10)
Server/src/transport/plugin_hub.py
368-368: Do not catch blind exception: Exception
(BLE001)
Server/src/transport/legacy/unity_connection.py
757-757: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Sourcery review
🔇 Additional comments (6)
Server/src/transport/legacy/unity_connection.py (3)
689-715: Well-structured response normalization helper.
The `_extract_response_reason` function effectively normalizes both MCPResponse objects and raw dict payloads. The fallback logic that infers "reloading" from message text (lines 697-698, 710-711) is pragmatic for handling legacy responses.
One consideration: the substring check for "reload" in error messages could theoretically match unrelated errors (e.g., "Failed to reload configuration"), though this is unlikely in practice given the Unity bridge context.
717-723: Clean refactoring.
The delegation to `_extract_response_reason` simplifies this predicate and improves maintainability.
763-825: Excellent time-budgeted retry implementation.
The reload-wait logic is well-designed with:
- Configurable timeout via environment variable with safe clamping
- Dynamic delay adjustment using Unity's `retry_after_ms` hint (lines 781-786)
- Bounded sleep range (50-250ms) to prevent thrashing or long stalls
- Detailed logging for debugging and observability
- Graceful degradation returning a structured `MCPResponse` when the budget is exceeded
The tracking variables (`wait_started`, `reason`) and the final logging (lines 801-824) provide good operational visibility into reload behavior.
Server/src/transport/plugin_hub.py (2)
401-427: LGTM!
The deadline computation and parameterized log formatting are well-implemented. Using `%`-style placeholders in logging calls defers string interpolation until the message is actually emitted, which is both more efficient and a best practice.
475-501: Readiness probe logic is sound.
The `while`/`else` construct correctly distinguishes between a successful ping (breaks out) and timeout exhaustion (else branch returns retry). The clamped upper bound of 30s prevents runaway waits.
For consistency with the earlier suggestion, the exception handling at lines 477-479 and 486-487 could also be narrowed, though bare `Exception` at line 486 is defensible since any probe failure should be treated as "not ready."
Server/src/services/tools/refresh_unity.py (1)
55-56: Logic is correct, but clarify the comment.
The logic at lines 55-56 is actually intentional and correct, not inverted. The code distinguishes between:
- Expected disconnect reasons ("reloading", "no_unity_session"): return the original response with the reason so callers know why the disconnect occurred.
- Unexpected disconnect reasons: mark as "recovered" and return a success message.
Both "reloading" and "no_unity_session" are transient states from Unity's startup/reload cycle (verified in unity_connection.py and plugin_hub.py), so they should propagate the specific reason to clients rather than obscure it with a generic "recovered" message.
However, the code would benefit from a clarifying comment explaining this distinction. The current comment (lines 44-46) mentions that reloading is legitimate, but doesn't explain why "reloading" and "no_unity_session" are excluded from being marked as "recovered". Consider adding a comment like:
```python
# "reloading" and "no_unity_session" are expected transient states;
# only unexpected disconnects get marked as "recovered".
```
Likely an incorrect or invalid review comment.
Address code review feedback:
- Catch ValueError specifically (instead of broad Exception) when parsing UNITY_MCP_RELOAD_MAX_WAIT_S, UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S, and UNITY_MCP_SESSION_READY_WAIT_SECONDS, with logging for easier diagnosis of misconfiguration
- Normalize reason values to lowercase in _extract_response_reason() to avoid case-sensitive mismatches in comparisons
- Simplify refresh_unity.py by removing redundant isinstance check and reusing _extract_response_reason instead of duplicating reason parsing
Actionable comments posted: 1
♻️ Duplicate comments (2)
Server/src/transport/legacy/unity_connection.py (1)
689-729: LGTM! Case-insensitive reason extraction implemented correctly.
The helper functions properly normalize all reason strings to lowercase (lines 700, 713), including hard-coded values like "reloading", enabling reliable case-insensitive comparisons by callers. This addresses the past review concern about case-sensitive mismatches.
Server/src/services/tools/refresh_unity.py (1)
54-55: LGTM! Intentional refinement of recoverability semantics.
This change aligns with the PR objective to distinguish between true disconnects and transient Unity reload/session-unavailable states. The conditional logic correctly identifies only non-reload, non-missing-session retryable errors as disconnect recoveries, enabling more precise downstream handling.
🧹 Nitpick comments (1)
Server/src/transport/legacy/unity_connection.py (1)
805-805: Optional: Remove redundant max(0.0, ...) guard.
Since `sleep_ms` is already clamped to [50, 250] on line 796, the `max(0.0, ...)` guard here is redundant.
🔎 Proposed refactor:
```diff
-time.sleep(max(0.0, sleep_ms / 1000.0))
+time.sleep(sleep_ms / 1000.0)
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
Server/src/services/tools/refresh_unity.py
Server/src/transport/legacy/unity_connection.py
Server/src/transport/plugin_hub.py
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-05T18:23:12.349Z
Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 368
File: MCPForUnity/UnityMcpServer~/src/resources/menu_items.py:15-15
Timestamp: 2025-11-05T18:23:12.349Z
Learning: In Unity MCP, the `name` parameter in the `mcp_for_unity_resource` decorator is the external API name exposed to MCP clients (LLMs, AI agents). The command string passed to `async_send_command_with_retry` or `async_send_with_unity_instance` (e.g., "get_menu_items") is the internal command identifier that must match the C# side. These are decoupled, allowing external API naming to evolve independently of internal command routing.
Applied to files:
Server/src/transport/legacy/unity_connection.py
Server/src/transport/plugin_hub.py
Server/src/services/tools/refresh_unity.py
📚 Learning: 2025-10-13T13:27:23.040Z
Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 316
File: TestProjects/UnityMCPTests/Assets/Tests/EditMode/Resources.meta:1-8
Timestamp: 2025-10-13T13:27:23.040Z
Learning: UnityMcpBridge is a legacy project kept for backwards compatibility; MCPForUnity is the only active Unity plugin project. GUID collisions between UnityMcpBridge and MCPForUnity are acceptable.
Applied to files:
Server/src/services/tools/refresh_unity.py
📚 Learning: 2025-10-13T13:41:00.086Z
Learnt from: JohanHoltby
Repo: CoplayDev/unity-mcp PR: 309
File: MCPForUnity/Editor/Helpers/ServerInstaller.cs:478-508
Timestamp: 2025-10-13T13:41:00.086Z
Learning: In the MCPForUnityTools feature (MCPForUnity/Editor/Helpers/ServerInstaller.cs), the design intentionally forces users to have only one .py file per MCPForUnityTools folder to keep file tracking simple. Package-style tools (subdirectories with __init__.py) are not supported.
Applied to files:
Server/src/services/tools/refresh_unity.py
🧬 Code graph analysis (3)
Server/src/transport/legacy/unity_connection.py (1)
Server/src/transport/plugin_hub.py (1)
send_command(133-210)
Server/src/transport/plugin_hub.py (1)
Server/tests/integration/test_helpers.py (3)
warning(39-40), error(46-47), model_dump(10-13)
Server/src/services/tools/refresh_unity.py (1)
Server/src/transport/legacy/unity_connection.py (2)
async_send_command_with_retry(837-869), _extract_response_reason(689-719)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Sourcery review
🔇 Additional comments (6)
Server/src/transport/legacy/unity_connection.py (1)
759-773: LGTM! Robust environment variable parsing.
The parsing logic correctly handles invalid values with a warning fallback (lines 762-767) and clamps to non-negative values (line 768). The initialization of wait tracking variables is sound.
Server/src/services/tools/refresh_unity.py (1)
13-13: LGTM! Proper use of centralized reason extraction.
The import and usage of `_extract_response_reason` correctly normalizes Unity response reasons, enabling consistent handling across the codebase.
Also applies to: 50-50
Server/src/transport/plugin_hub.py (4)
364-376: LGTM! Consistent environment variable parsing pattern.
The parsing, error handling, and clamping logic for `UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S` follows the same robust pattern as in `unity_connection.py`. The sleep interval is properly bounded to [0.05, 0.25] seconds.
405-405: LGTM! Proper time-based deadline and structured logging.
The deadline calculation uses the new `max_wait_s` budget (line 405), and all logging statements correctly use parameterized formatting for performance and clarity.
Also applies to: 419-431, 440-443
457-472: LGTM! Structured fast-fail for unavailable sessions.
The fast-fail path correctly returns a structured `MCPResponse` with `reason="no_unity_session"` instead of raising an exception. This enables callers to handle the unavailable state with consistent retry semantics, aligning with the broader reason-based retry model introduced in this PR.
479-509: LGTM! Bounded readiness probe prevents premature fast-fail commands.
The readiness probe correctly uses ping/pong to verify Unity's main thread is responsive before dispatching fast-fail commands. The bounded wait (clamped to [0, 30] seconds) and structured retry response on timeout align with the PR's goal of avoiding long waits while maintaining reliability.
```python
        delay_ms = retry_ms
        if isinstance(response, dict):
            retry_after = response.get("retry_after_ms")
            if retry_after is None and isinstance(response.get("data"), dict):
                retry_after = response["data"].get("retry_after_ms")
            if retry_after is not None:
                delay_ms = int(retry_after)
        sleep_ms = max(50, min(int(delay_ms), 250))
        logger.debug(
            "Unity reload wait retry: command=%s instance=%s reason=%s retry_after_ms=%s sleep_ms=%s",
            command_type,
            instance_id or "default",
            reason or "reloading",
            delay_ms,
            sleep_ms,
        )
        time.sleep(max(0.0, sleep_ms / 1000.0))
        retries += 1
        response = conn.send_command(command_type, params)
        reason = _extract_response_reason(response)

    if wait_started is not None:
        waited = time.monotonic() - wait_started
        if _is_reloading_response(response):
            logger.debug(
                "Unity reload wait exceeded budget: command=%s instance=%s waited_s=%.3f",
                command_type,
                instance_id or "default",
                waited,
            )
            return MCPResponse(
                success=False,
                error="Unity is reloading; please retry",
                hint="retry",
                data={
                    "reason": "reloading",
                    "retry_after_ms": min(250, max(50, retry_ms)),
                },
```
Inconsistent retry_after_ms calculation.
Line 796 clamps int(delay_ms) to [50, 250], but line 825 clamps the original retry_ms parameter. This means the returned retry_after_ms (line 825) may differ from the actual sleep duration used in retries (line 796), especially when Unity provides a custom retry_after_ms in the response (lines 791-795).
For consistency and accuracy, consider using the same computed sleep_ms value:
🔎 Proposed fix
sleep_ms = max(50, min(int(delay_ms), 250))
logger.debug(
"Unity reload wait retry: command=%s instance=%s reason=%s retry_after_ms=%s sleep_ms=%s",
command_type,
instance_id or "default",
reason or "reloading",
delay_ms,
sleep_ms,
)
time.sleep(max(0.0, sleep_ms / 1000.0))
retries += 1
response = conn.send_command(command_type, params)
reason = _extract_response_reason(response)
if wait_started is not None:
waited = time.monotonic() - wait_started
if _is_reloading_response(response):
logger.debug(
"Unity reload wait exceeded budget: command=%s instance=%s waited_s=%.3f",
command_type,
instance_id or "default",
waited,
)
return MCPResponse(
success=False,
error="Unity is reloading; please retry",
hint="retry",
data={
"reason": "reloading",
- "retry_after_ms": min(250, max(50, retry_ms)),
+ "retry_after_ms": sleep_ms,
},
)
🤖 Prompt for AI Agents
In Server/src/transport/legacy/unity_connection.py around lines 789 to 826, the
computed sleep_ms (which clamps delay_ms to [50,250]) is used for the actual
sleep but the returned retry_after_ms is computed from the original retry_ms
parameter, causing inconsistency; change the returned retry_after_ms to use the
same clamped value (sleep_ms) used for sleeping (converted to int milliseconds)
so the response accurately reflects the actual retry delay.
- Add upper bound (30s) to UNITY_MCP_RELOAD_MAX_WAIT_S to prevent misconfiguration from causing excessive waits
- Add upper bound (30s) to UNITY_MCP_SESSION_RESOLVE_MAX_WAIT_S for consistency with readiness probe
- Introduce NoUnitySessionError custom exception to replace fragile string matching in send_command_for_instance
Addresses code review feedback for bounded retry policy PR.
Actionable comments posted: 0
♻️ Duplicate comments (1)
Server/src/transport/legacy/unity_connection.py (1)
790-828: Address the inconsistent retry_after_ms calculation.
As noted in a previous review, line 826 returns `retry_after_ms` computed from the original `retry_ms` parameter, but line 797 uses the clamped `sleep_ms` for the actual sleep delay. When Unity provides a custom `retry_after_ms` in the response (lines 791-796), these values can diverge, making the returned `retry_after_ms` inaccurate.
For consistency and accuracy, the returned `retry_after_ms` should reflect the actual sleep duration used.
🔎 Proposed fix:
```diff
                     "reason": "reloading",
-                    "retry_after_ms": min(250, max(50, retry_ms)),
+                    "retry_after_ms": sleep_ms,
                 },
             )
```
🧹 Nitpick comments (1)
Server/src/transport/plugin_hub.py (1)
37-39: Consider defining the default error message in the exception class.
While the current implementation is functional, the static analysis hint at line 451 is valid. Moving the error message to the exception class improves reusability and follows Python best practices.
🔎 Proposed refactor
```diff
 class NoUnitySessionError(RuntimeError):
-    """Raised when no Unity plugins are available."""
+    """Raised when no Unity plugins are available."""
+
+    def __init__(self, message: str = "No Unity plugins are currently connected"):
+        super().__init__(message)
```
Then at line 451, simplify to:
```diff
-    raise NoUnitySessionError("No Unity plugins are currently connected")
+    raise NoUnitySessionError()
```
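A quick usage sketch of the suggested default-message constructor: the common path raises with no arguments, while unusual paths can still pass a custom message. The call sites shown here are hypothetical.

```python
class NoUnitySessionError(RuntimeError):
    """Raised when no Unity plugins are available."""

    def __init__(self, message: str = "No Unity plugins are currently connected"):
        super().__init__(message)


try:
    raise NoUnitySessionError()                     # default message at the common call site
except NoUnitySessionError as exc:
    print(exc)                                      # -> No Unity plugins are currently connected

try:
    raise NoUnitySessionError("No session for instance 'editor-2'")  # hypothetical custom message
except NoUnitySessionError as exc:
    print(exc)
```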
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
Server/src/transport/legacy/unity_connection.py
Server/src/transport/plugin_hub.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-05T18:23:12.349Z
Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 368
File: MCPForUnity/UnityMcpServer~/src/resources/menu_items.py:15-15
Timestamp: 2025-11-05T18:23:12.349Z
Learning: In Unity MCP, the `name` parameter in the `mcp_for_unity_resource` decorator is the external API name exposed to MCP clients (LLMs, AI agents). The command string passed to `async_send_command_with_retry` or `async_send_with_unity_instance` (e.g., "get_menu_items") is the internal command identifier that must match the C# side. These are decoupled, allowing external API naming to evolve independently of internal command routing.
Applied to files:
Server/src/transport/legacy/unity_connection.py
Server/src/transport/plugin_hub.py
🧬 Code graph analysis (1)
Server/src/transport/plugin_hub.py (1)
Server/tests/integration/test_helpers.py (1)
model_dump(10-13)
🪛 Ruff (0.14.10)
Server/src/transport/plugin_hub.py
451-451: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Sourcery review
🔇 Additional comments (8)
Server/src/transport/plugin_hub.py (4)
368-381: LGTM! Well-structured environment variable handling.
The bounded retry implementation correctly:
- Parses and validates the environment variable with appropriate error handling
- Clamps max_wait_s to [0, 30] to prevent misconfiguration
- Bounds sleep intervals to [0.05, 0.25] seconds
- Uses parameterized logging for debugging
410-451: LGTM! Consistent time-based deadline and logging.
The implementation correctly:
- Uses
time.monotonic() for deadline calculations (immune to system clock changes)
- Applies parameterized logging consistently throughout the wait loop
- Raises the new
NoUnitySessionError exception for structured error handling
462-475: LGTM! Structured error handling for unavailable sessions.
The exception handling correctly:
- Catches the new
NoUnitySessionErrorexception- Returns a structured
MCPResponsewith retry semantics- Includes diagnostic data (reason and retry_after_ms) for clients
- Uses appropriate logging
483-491: LGTM! Consistent environment variable handling.
The readiness wait configuration follows the same robust pattern as session resolution:
- Validates environment variable with appropriate exception handling
- Clamps to [0, 30] to prevent misconfiguration
- Uses parameterized logging for diagnostics
Server/src/transport/legacy/unity_connection.py (4)
689-720: LGTM! Robust response reason extraction.
The implementation correctly:
- Normalizes all reason values to lowercase for case-insensitive comparisons
- Handles both
MCPResponse objects and raw dict responses
- Extracts reasons from both
data["reason"]and message/error text- Maps reload-related responses to the canonical
"reloading"string- Returns
None for non-reloadable cases
This addresses the previous review concern about case-sensitive mismatches.
722-728: LGTM! Clean semantic wrapper.
The function provides a clear, semantic interface for checking reload status by leveraging the normalized reason extraction.
759-769: LGTM! Consistent bounded retry configuration.
The environment variable handling correctly:
- Parses and validates
UNITY_MCP_RELOAD_MAX_WAIT_S with appropriate error handling
- Uses parameterized logging for diagnostics
- Follows the same pattern as
plugin_hub.pyfor consistency
773-789: LGTM! Well-implemented time-budgeted retry loop.The retry implementation correctly:
- Uses
time.monotonic()for accurate elapsed time measurement- Tracks wait start time and enforces the time budget
- Provides extensive diagnostic logging at each stage (start, per-retry, exceeded, completed)
- Returns structured
MCPResponsewith retry hints when the budget is exceeded- Clamps per-retry sleep to [50, 250]ms for bounded delays
The observability and bounded behavior align well with the PR objectives.
Also applies to: 807-835
Summary by Sourcery
Introduce bounded, time-based retry behavior for Unity connections and plugin sessions to avoid long waits when editors or sessions are unavailable.
Enhancements:
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.