@dsarno
Collaborator

@dsarno dsarno commented Aug 8, 2025

Overview

This PR hardens the Unity MCP Bridge and streamlines the end‑to‑end setup. It removes the old Git‑based server installer in favor of an embedded Python server, reduces perceived downtime across domain reloads, and improves the Unity Editor UX for configuration and troubleshooting.

Behavior Changes (Before → After)

  • Ports could hop on reload (e.g., 6401/6402) → Reuses the same per‑project port (e.g., 6400) with micro bind‑retries
  • Bridge sometimes didn’t restart after compiles → Reliable auto‑restart with short retry loop and socket options for all tools.
  • Editor window could show stale status → UI reflects live bridge state; status survives domain reloads
  • Client setup noisy/manual → Cleaner UI with one “Auto‑Setup” button, clear status, and fewer logs by default
  • Server installed via Git after package import → Python server is embedded in the package; no network fetch for users

Why

  • Eliminate port hopping and connection gaps during domain reloads
  • Make installation self‑contained and robust on first import and upgrades
  • Provide a single, obvious way to repair local Python environments (.venv mismatch, system Python version changes)
  • Warn if Python is not detected and offer an install link
  • Increase clarity in the Editor window while gating verbose logs behind a global toggle

Key Changes

Bridge lifecycle: UnityMcpBridge/Editor/UnityMcpBridge.cs

  • Static init starts bridge quickly; writes a heartbeat file indicating unity_port and reloading state
  • Start/Stop guarded by locks; short micro‑retry loop on AddressAlreadyInUse
  • Socket options for resiliency: ReuseAddress and Linger(0) to reduce bind failures from sockets lingering in TIME_WAIT
  • Periodic heartbeat written roughly every 0.5 s to ~/.unity-mcp/unity-mcp-status-<hash>.json
  • Listener errors handled gracefully; ping fast‑path for health checks
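
For readers on the client side, here is a minimal Python sketch of how the heartbeat file could be consumed. It is not the code in this PR. The `unity_port` key comes from the description above; the `reloading` key name, the 2-second staleness threshold, and the `read_latest_status` helper are assumptions made for illustration.

```python
import glob
import json
import os
import time

# Directory named in the PR description; the status file name carries a per-project hash.
STATUS_DIR = os.path.expanduser("~/.unity-mcp")


def read_latest_status(status_dir=STATUS_DIR, max_age_s=2.0, now=None):
    """Return the newest status dict, or None if it is missing, stale, or unreadable.

    Hypothetical helper: the staleness threshold is an assumption, chosen to be
    a few multiples of the ~0.5 s heartbeat interval.
    """
    now = time.time() if now is None else now
    pattern = os.path.join(status_dir, "unity-mcp-status-*.json")
    try:
        # max() raises ValueError when no status file exists yet.
        newest = max(glob.glob(pattern), key=os.path.getmtime)
        if now - os.path.getmtime(newest) > max_age_s:
            return None  # heartbeat stopped; treat the bridge as down
        with open(newest, "r", encoding="utf-8") as fh:
            return json.load(fh)
    except (OSError, ValueError):
        # Best-effort read: a missing or half-written file is not fatal.
        return None
```

A caller could check `status.get("reloading")` before sending a command and back off briefly instead of reporting a hard failure.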

Port management: UnityMcpBridge/Editor/Helpers/PortManager.cs

  • Prefers persisted project port (case‑insensitive project path match)
  • Returns stored project port even if transiently busy; Start uses micro‑retry to avoid port flips
  • Reads/writes both hashed per‑project file and legacy file for back‑compat
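
The selection rule above can be sketched in Python as follows. This is an illustration of the described behavior, not a port of PortManager.cs. The `project_path` key, the `choose_port` name, and the injected `is_free` / `find_free` callables are hypothetical; `unity_port` and the 6400 default come from the description.

```python
import os


def _same_path(a, b):
    # Case-insensitive comparison of normalized paths, as the description states.
    return os.path.normpath(a).lower() == os.path.normpath(b).lower()


def choose_port(stored, project_path, is_free, find_free, default_port=6400):
    """Prefer the port stored for this project, even if it is transiently busy.

    `stored` is the parsed port file (or None). A busy stored port is still
    returned because the caller is expected to retry the bind briefly rather
    than flip to a new port.
    """
    if stored and stored.get("unity_port"):
        if _same_path(stored.get("project_path", ""), project_path):
            return stored["unity_port"]
    # No usable stored port: fall back to the default, then to any free port.
    return default_port if is_free(default_port) else find_free()
```

Returning the stored port without probing it is the design choice that keeps the port sticky: the decision to move is deferred to the bind step, which knows whether the retries actually failed.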

Embedded server installer: UnityMcpBridge/Editor/Helpers/ServerInstaller.cs

  • Install path is user‑writable (macOS: ~/Library/Application Support/UnityMCP), no Git required
  • New RepairPythonEnvironment() deletes .venv and runs uv sync to rebuild a clean env
  • Robust UV detection (common paths, PATH scan, which uv), plus EditorPrefs override (UnityMCP.UvPath)
  • Fallback logic: if installed path missing, uses embedded/dev source to repair
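
As a rough illustration of the discovery order, here is a Python sketch. The real logic lives in C# in ServerInstaller.cs; the precedence shown (override first, then known locations, then PATH) and the `find_uv` helper are assumptions, and no candidate paths are hard-coded because the PR does not list them here.

```python
import os
import shutil


def find_uv(override=None, candidates=(), path_lookup=shutil.which):
    """Return a usable uv executable path, or None if nothing is found.

    `override` stands in for the EditorPrefs value (UnityMCP.UvPath);
    `candidates` stands in for the list of common install locations.
    """
    explicit = ([override] if override else []) + list(candidates)
    for candidate in explicit:
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # shutil.which performs the PATH scan, similar to `which uv`.
    return path_lookup("uv")
```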

Editor UI: UnityMcpBridge/Editor/Windows/UnityMcpEditorWindow.cs

  • 2×2 layout with consistent sizing; header‑level “Show Debug Logs” toggle (globally gates verbose logs)
  • “Server Status” shows Installed/Installed (Embedded)/Not Installed
  • One‑click “Repair Python Env” with tooltip; fewer logs by default
  • Python detection warning with a link to official installers: https://www.python.org/downloads/
  • Dev‑mode aware path resolution via FindPackagePythonDirectory() so dev clones don’t appear misconfigured (in other words, all you have to do to swap to a dev clone is change the manifest.json file in Unity)

Python server (vendored)

  • Embedded at UnityMcpBridge/UnityMcpServer/src
  • Remove .python-version pin; set requires-python = ">=3.10" in pyproject.toml
  • uv.lock updated

Stability/UX Improvements

  • Sticky per‑project port across reloads and play‑mode transitions
  • Bridge restart is fast and reliable (micro‑retry loop); reduced perceived downtime for clients
  • Status survives domain reloads via heartbeat and UI refresh logic
  • Clear feedback for missing Python with direct link to installer
  • “Repair Python Env” fixes stale/mismatched venvs without manual steps

Compatibility

  • Backwards‑compatible: legacy port files still supported
  • No public tool API/schema changes
  • UPM distribution includes embedded server; local dev continues to work with file: manifests

Risks & Mitigations

  • UV discovery may miss unusual installations → PATH and which scan + EditorPrefs override
  • Rare bind failures after reload → micro‑retries in Start; prefers stored port to avoid flip‑flop
  • Heartbeat write is best‑effort; failures are non‑fatal

How to Verify (Quick)

  1. Import the package (UPM) and open the MCP window; confirm bridge starts and shows the port (e.g., 6400)
  2. Make a trivial script change to trigger a domain reload → bridge restarts; port unchanged
  3. Click “Repair Python Env” → completes without error; any uv errors appear in the Console
  4. From a client, run read_console across reloads → remains responsive

Follow‑ups (Optional)

  • Consider a background auto‑repair if .venv import errors are detected at runtime
  • Optional: expose a manual UV path field next to the Repair button if override is frequently needed

dsarno added 15 commits August 7, 2025 15:53
…stop on domain reload; start/stop locking; per-project sticky ports + brief release wait; Python discovery scans hashed+legacy files and probes; editor window live status refresh.
…ename auto-run toggle to client section ("Auto-connect to MCP Clients"); rename button to "Run Client Setup"; fix dev-mode status by using FindPackagePythonDirectory() for Claude/Desktop path checks
…nstaller

- Switch ServerInstaller to embedded copy-only (no network)
- Simplify Editor UI server status to 'Installed (Embedded)'
- Vendor UnityMcpServer/src into UnityMcpBridge/UnityMcpServer/src for UPM distribution
- Keep bridge recompile robustness (heartbeat + sticky port)
…paths case-insensitively to prevent sticky-port drift across reloads
… port and let bind micro-retry handle release to avoid port swapping on recompiles
…etup with Connected ✓ state; add Debug Logs toggle and gate verbose logs

fix(bridge): reuse stored port in StartAutoConnect; guard listener stop to avoid ObjectDisposedException

chore(clients): reorder dropdown to Cursor, Claude Code, Windsurf, Claude Desktop, VSCode
…opies embedded server, adds RepairPythonEnvironment() (deletes .venv, runs 'uv sync'); robust uv path discovery; macOS install path -> Application Support
- UI: Server Status shows Installed(Embedded); Python missing warning with install link; Repair button tooltip; header Show Debug Logs; cleaned layout
- Python: unpin .python-version; set requires-python >=3.10 in both pyprojects
- Dev: improved package/dev path resolution
…hind Debug Logs toggle; improve Python and UV detection on Windows (flex versions, where.exe/Path scan); tidy installer messages
@dsarno
Collaborator Author

dsarno commented Aug 9, 2025

From GPT 5 Pro: Here’s a focused review of PR #192 (“Unity MCP: bridge stability + embedded server”), based on the PR thread and the key commits/diffs. I’ve grouped notes into what looks great, risks/edge cases, and actionable suggestions, then closed with a quick test plan you can run before merge.

Note from PR author @dsarno : Great feedback, not sure it's all necessary for a merge, but I can work on some of it for a follow-up PR unless anyone feels strongly. Edit: Ended up hardening the reload and retry/debounce functionality, much better than in the original commit, see comment below.


What’s solid

  1. Sticky, per‑project ports + brief release wait
    PortManager now prefers the previously stored port, waits briefly during domain reloads, and only falls back when necessary. This directly targets the “port flipping on recompiles” pain. Nice touch adding a short wait and a quick socket probe to reduce churn. ([GitHub]1)

  2. Embedded Python server instead of Git-based installer
    Switching ServerInstaller to copy from an embedded UnityMcpServer/src removes network flakiness and makes installs deterministic. The code paths to detect embedded/dev layouts plus status/UI tweaks are a clear simplification. ([GitHub]2)

  3. Repair flow for Python env + better uv discovery
    Adding RepairPythonEnvironment() (delete .venv, unpin .python-version, run uv sync) and robust uv path discovery (EditorPrefs override + several common locations + PATH probing) will save users time when Python breaks. Also moving macOS install location to Application Support is appropriate for a user-writable target. ([GitHub]3)

  4. Small, user-facing polish
    Button rename (“Bind to Clients”) and other editor status/UX cleanups make the window less confusing. ([GitHub]4)

  5. Scope and intent match the PR description
    The PR’s emphasis on recompile/play-mode resilience, start/stop locking, sticky ports, and vendoring the server is consistent with the diffs/commits reviewed. ([GitHub]5)


Risks & edge cases to consider

  1. Main-thread stalls from networking checks
    IsPortUsedByUnityMcp does TcpClient.ConnectAsync(...).Wait(100). Even at 100 ms, this is a synchronous wait that can hitch the editor if called on the main thread during domain reload or window repaint. Consider a non-blocking pattern or doing this in a background task with a short timeout. ([GitHub]1)

  2. Heuristic for “port is MCP” may produce false positives
    Right now the probe only checks that something connects on the port. If another local process temporarily owned the same port, you might incorrectly “wait for MCP” and stall. A minimal handshake (see suggestion #2) would make the check definitive. ([GitHub]1)

  3. Sticky-port logic during hot transitions
    You’ve reinforced “prefer stored port” (micro-retry then bind) to defend against swap-flapping—which is great—but keep an eye on scenarios where another process snipes the port between stop/start. Ensure the bind path has a bounded retry/backoff and logs why it finally moved. ([GitHub]6)

  4. Cross‑platform install paths / permissions
    macOS path change to Application Support is good. Verify Windows paths (e.g., %LOCALAPPDATA%\\Programs\\UnityMCP\\UnityMcpServer\\src) are consistently used across UI messaging and any auto-config writers. Mixed, dangling legacy paths can confuse users after upgrading from the Git-based installer. ([GitHub]3)

  5. Package bloat & import times
    Vendoring the Python server into the Unity package increases content under UPM. Keep an eye on import times and .meta noise. If this grows further, consider excluding non-essential dev artifacts from UPM distribution (e.g., large caches, test data). ([GitHub]2)
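
To make the handshake idea from item 2 concrete, here is a small Python sketch of a probe that only reports a match when the listener identifies itself. The `UNITY_MCP\n` greeting is the reviewer's example, not an existing protocol in this PR, and `is_unity_mcp` is a hypothetical helper. Both the connect and the read are bounded by a timeout, so a silent or foreign listener cannot stall the caller.

```python
import socket


def is_unity_mcp(host, port, timeout_s=0.1, expected=b"UNITY_MCP\n"):
    """Return True only if the listener sends the expected greeting in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            sock.settimeout(timeout_s)
            data = b""
            # Read until we have as many bytes as the greeting, or the peer closes.
            while len(data) < len(expected):
                chunk = sock.recv(len(expected) - len(data))
                if not chunk:
                    break
                data += chunk
            return data == expected
    except OSError:
        # Covers refused connections, resets, and timeouts.
        return False
```

Run from a worker thread, a probe like this addresses both concerns at once: it never blocks the main thread, and a random process holding the port is rejected instead of being mistaken for the bridge.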


Actionable suggestions (fast wins)

  1. Make the port probe non-blocking
    Replace ConnectAsync(...).Wait(100) with a true async/timeout approach off the main thread. Two options:

    • Schedule the probe via EditorApplication.delayCall and continue the binding flow only after it returns.
    • Use Task.WhenAny(connectTask, Task.Delay(100)) and avoid blocking waits entirely.
    This removes potential 100–1500 ms UI hitches during reloads. ([GitHub]1)

  2. Add a lightweight identification handshake
    Have the server expose a trivial “hello” (e.g., send a single line and expect UNITY_MCP\n, or an HTTP /healthz if you’re using HTTP). Then IsPortUsedByUnityMcp can attempt that handshake and positively identify MCP vs random listeners. Document the handshake contract in the repo for future tooling. ([GitHub]1)

  3. Constrain and log retry behavior on bind
    Where you “prefer stored port,” add:

    • A small, capped retry loop with exponential backoff (e.g., 3 attempts at 50/100/200 ms).
    • A clear final log that explains whether you stuck to the stored port or had to pick a new one.
    This will make “why did it change?” obvious in user reports. ([GitHub]6)

  4. Unify and gate logs
    You already have a “Show Debug Logs” toggle in the editor window. Route all bridge/server‑management logs through a single helper (prefix [UNITY‑MCP], level‑aware). This keeps normal use quiet and gives power users precise breadcrumbs when debugging. ([GitHub]3)

  5. Upgrade/migration note for legacy installs
    On first run after update, detect the legacy Git-based install location. If found, show a one-time banner:

    • “Embedded server is now used.”
    • Offer buttons: “Open Folder,” “Remove Legacy Copy,” “Dismiss.”
    It reduces confusion and orphaned data. ([GitHub]2)

  6. CI sanity checks (optional but useful)

    • A headless test that spins the Python server (via uv) and hits the handshake endpoint.
    • A portability test to validate FindUvPath() candidates on macOS/Linux/Windows naming variants.
    These catch regressions before users do. ([GitHub]3)
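
The capped retry loop with backoff suggested for the bind path could look roughly like this in Python. It is a sketch of the pattern only: the bridge itself is C#, `bind_with_backoff` is a hypothetical name, and the schedule here is one immediate attempt followed by one retry after each of the 50/100/200 ms delays.

```python
import socket
import time


def bind_with_backoff(port, host="127.0.0.1", delays_ms=(50, 100, 200), sleep=time.sleep):
    """Try to bind `port`, waiting a little longer before each retry.

    Returns (listening_socket, attempts) on success or (None, attempts) once
    the retries are exhausted, so the caller can log why it moved ports.
    """
    attempts = 0
    for delay_ms in (0, *delays_ms):
        if delay_ms:
            sleep(delay_ms / 1000.0)
        attempts += 1
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen(1)
            return sock, attempts
        except OSError:
            # Typically "address already in use"; close and try again.
            sock.close()
    return None, attempts
```

Returning the attempt count keeps the loop bounded and gives the caller what it needs for the final log line, for example "kept stored port after 2 attempts" versus "moved to a new port after 4 failed attempts".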

Targeted code nits

  • Port registry naming: You added hashing + legacy filename handling—good. Ensure path comparisons are case‑insensitive on Windows (looks like you covered this in follow-ups). Add a brief comment where you compute the hash stating the input (full project path, normalized). ([GitHub]1)
  • Thread.Sleep in editor code: WaitForPortRelease uses Thread.Sleep(step) in a loop. If there’s any path where this runs on the main thread, replace with async delay or editor updates to avoid UI stalls. ([GitHub]1)
  • Timeout values: Current 100 ms probe and ~1500 ms total wait are reasonable. Document the rationale and make them constants with names like kProbeTimeoutMs / kPortReleaseWaitMs to ease future tuning. ([GitHub]1)

Quick test plan (manual)

Run these in both macOS and Windows:

  1. Fresh install path

    • Remove existing installs; import the package.
    • Open Window → Unity MCP and confirm Installed (Embedded) status and no network fetching. ([GitHub]2)
  2. Recompile / domain reload stability

    • Start MCP Server, connect a client.
    • Trigger a script recompile (edit C# file) and toggle Play Mode on/off several times.
    • Confirm: port remains the same; reconnects are automatic; no “port already in use” flapping. ([GitHub]6)
  3. Port collision behavior

    • Manually bind a dummy listener to the stored port.
    • Start the bridge and verify: brief wait occurs; if non‑MCP process owns it, you don’t stall indefinitely and logs explain the fallback. ([GitHub]1)
  4. Repair flow

    • Corrupt .venv or pin an incompatible .python-version.
    • Click Repair; confirm uv sync runs and the server starts cleanly. ([GitHub]3)
  5. UI/wording

    • Verify “Bind to Clients” flows still update configs as expected and status is accurate. ([GitHub]4)

Overall take

The direction is right: stickiness across reloads, deterministic installs, and practical recovery tools. Before merge, I’d address the main‑thread waits, add a positive MCP handshake, and make the bind‑retry behavior explicit and well‑logged. With those tightened, this should noticeably improve day‑to‑day reliability.

If you want, I can draft inline GitHub review comments you can paste onto the PR (one per file/region) focusing on the specific lines in PortManager.cs and ServerInstaller.cs.

@msanatan
Contributor

msanatan commented Aug 9, 2025

Hey @dsarno, thanks for this! I haven't fully checked out everything; realistically I can be more thorough by tomorrow. That said, could you make a few adjustments?

  • Merge into the main branch instead of master, which is there for backward compatibility. The main branch has some changes, so be sure to merge it into your branch in case of conflicts!
  • Now that UnityMcpServer is part of the installed Unity package, can you rename it to UnityMcpServer~? It's not actual plugin code, and the ~ tells Unity to ignore those files during import. That's good because changes to those files won't require asset DB refreshes: https://docs.unity3d.com/Manual/SpecialFolders.html
    • Once the folder name is changed, can you remove the .meta files under UnityMcpServer~ as they're no longer needed?
  • I don't think we need the "old" UnityMcpServer folder at the root level, so can you delete it? If we do need it, why not just merge everything into the new UnityMcpServer folder that's within the plugin?
  • If the installation instructions have to change, update the README! But please merge main into your branch first to avoid the conflicts 😅 . If necessary update README-DEV as well

dsarno and others added 3 commits August 9, 2025 12:05
…ents

This merge combines upstream's organizational rebrand and updates with
our comprehensive bridge stability improvements:

**From Upstream:**
- CoplayDev organizational rebrand (README, LICENSE, documentation)
- Updated logo and deployment scripts
- Python version pinning (.python-version file)

**From Our Branch (Preserved):**
- Comprehensive bridge stability improvements (threading, heartbeat, retries)
- Enhanced debugging and diagnostic features
- Embedded server installation approach (more reliable than git-based)
- Broader Python compatibility (>=3.10 vs >=3.12)
- Advanced port management with per-project persistence
- Auto-setup and connection reliability features
- Robust error handling and recovery mechanisms

**Key Technical Decisions:**
- Used our comprehensive UnityMcpBridge.cs (625 lines vs 473) with all stability features
- Maintained embedded server approach over upstream's git-based installer
- Preserved broader Python compatibility (>=3.10) for better accessibility
- Used our optimized connection settings and retry logic
- Kept our user-centric server installation approach (on-demand vs automatic)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
…iles; delete old root UnityMcpServer; update editor lookup for tilde path; adjust deploy/restore scripts; remove orphan meta
…ty-mcp into feat/bridge-stability

* 'feat/bridge-stability' of https://github.com/dsarno/unity-mcp:
  Bridge logs: add bold blue UNITY-MCP prefix; gate PortManager logs behind Debug Logs toggle; improve Python and UV detection on Windows (flex versions, where.exe/Path scan); tidy installer messages
@dsarno
Collaborator Author

dsarno commented Aug 9, 2025

Adding the fixes from @msanatan, as well as a couple others, now. Please wait to review until fixes are in, I will ping.

dsarno added 2 commits August 9, 2025 15:09
…binds.

Editor: auto-rewrite MCP client config when package path changes.

Server: heartbeat-aware retries, structured {state: reloading, retry_after_ms}, single auto-retry across tools; guard empty calls.

Repo: remove global *~ ignore (was hiding UnityMcpServer~), track tilde server folder (Unity still excludes it from assemblies).
@dsarno dsarno changed the title from "Enhancement: Stabilize MCP bridge: per‑project sticky ports, robust auto‑restart, UI fix" to "Harden MCP Bridge: reliable reloads, sticky port, embedded server, cleaner Auto‑Setup UI" Aug 9, 2025
@dsarno
Collaborator Author

dsarno commented Aug 9, 2025

Per @msanatan 's requests

  • Merge upstream/main: CoplayDev rebrand with bridge stability improvements
  • Package Python server under UnityMcpServer~; remove redundant .meta files
  • Minor README updates: switched "Auto‑Configure" → "Auto‑Setup" and clarified package cache hash path + new image of MCPEditorWindow

Other screw-tightening on domain reloading:

  • Added stop‑before‑reload (OnBeforeAssemblyReload) + deferred init on editor idle → stable rebinds to the same per‑project port
  • Breadcrumb logs for stop/start/ping help trace reloads; ping fast‑path kept clients responsive

Note: Observed one transient read_console miss (server reported it was down) during a reload; it recovered immediately once the bridge restarted itself on the same port (6400). Can still make this slightly better UX-wise but shouldn't stall nearly as much as before.

Ready for review @msanatan @Scriptwonder .

Added new image of MCP Editor window to README
@msanatan msanatan changed the base branch from master to main August 10, 2025 00:46
@dsarno dsarno mentioned this pull request Aug 10, 2025
@tumml3r

tumml3r commented Aug 10, 2025

I'm successfully on this branch now after experiencing the issues discussed here: #195. Just noting here for posterity.

Contributor

@msanatan msanatan left a comment

I got a couple of questions, one is a minor edit suggestion and the other is just so I understand things better.

It was a lot to go through but overall looks good

{
Debug.Log($"Using stored port {storedPort}");
return storedPort;
if (IsDebugEnabled()) Debug.Log($"<b><color=#2EA3FF>UNITY-MCP</color></b>: Using stored port {storedConfig.unity_port} for current project");
Contributor

Note to self for later, set up a structured logger with levels and the like

}
}

private bool IsPythonDetected()
Contributor

Note to self for later, check if there's a more generic way to do this

What changed and why:
- Unity now writes a per-project port file named like
`~/.unity-mcp/unity-mcp-port-<hash>.json` to avoid projects overwriting
Contributor

Love this solution

# Use the get_unity_connection function to retrieve the active connection instance
# Changed "MANAGE_GAMEOBJECT" to "manage_gameobject" to potentially match Unity expectation
response = get_unity_connection().send_command("manage_gameobject", params)
if isinstance(response, dict) and not response.get("success", True) and response.get("state") == "reloading":
Contributor

Is there any non-connection reason why response['success'] can be False? Also, why does it default to True?

To be clear I'm not asking for it to be changed, just asking for my knowledge.

Non critical, future task could be to use a wrapper function around get_unity_connection().send_command

Collaborator Author

@dsarno dsarno Aug 11, 2025

@msanatan

Is there any non-connection reason why response['success'] can be False?

Yes, two cases:

(1) "Preflight" call finds Unity is reloading, not ready, so retry...

{"success": false, "state": "reloading", "retry_after_ms": 250}

(2) Tool ran but bad params or other type of fail

{"success": false, "error": "GameObject 'Foo' not found"}

(Preflight returns before the tool executes ... tool‑level errors only appear after preflight passes.)

why does it default to True?

Overcautious and no longer strictly necessary, but ... the live ("old") version of the endpoint has no success field. So I kept a default to true in case we hit an older version of endpoint as we transition. In which case the missing success isn't treated as a fail, more like "this is the old version so go ahead"

Since server is now embedded, mixing old/new is unlikely; we could drop the default and do:

if isinstance(response, dict) and response.get("state") == "reloading" and response.get("success") is False:
    return response

But since people have been testing it and it seems to be working, maybe we slate that for a cleanup later with some of your other suggestions above.
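
For reference, the response shapes discussed above can be told apart with a small classifier like the one below. This is a sketch, not the code in the PR: `classify_response` and its return labels are hypothetical, while the `success`, `state`, and `error` fields are the ones shown in the JSON examples.

```python
def classify_response(response):
    """Label a Unity response as 'reloading', 'error', 'ok', or 'unknown'.

    A dict without a `success` field is treated as 'ok', mirroring the
    "old endpoint has no success field" case described above.
    """
    if not isinstance(response, dict):
        return "unknown"
    if response.get("success") is False:
        # Preflight found Unity mid-reload: safe to retry after a short wait.
        if response.get("state") == "reloading":
            return "reloading"
        # The tool ran (or was rejected) and reported a real failure.
        return "error"
    return "ok"
```

Only the "reloading" label should trigger a retry; an "error" such as a missing GameObject would fail the same way on every attempt.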

Contributor

Thanks for the detailed explanation!

return {
"success": False,
"state": "reloading",
"retry_after_ms": int(250),
Contributor

Why not use the value from config.py?

Collaborator Author

@msanatan Good call -- While addressing this, I found another issue with individual tools not retrying when they hit a recompile/reload gap. Will work on it tomorrow and hopefully that can be the last bit for now. Other fixes can go in a follow-up.

Collaborator Author

@msanatan Ok re your above note, I replaced the magic 250 with a single-sourced value in config. Also realized individual tools weren't capable of retries after/during Unity reload, so added centralized reload-aware retries that every tool can use, so they retry instead of reporting fail and moving on. You can see this behavior if you ask e.g. Cursor to run all 8 tool calls in a row, starting with a manage_script to force a recompile.

Tools now:

  • detect reload, sleep reload_retry_ms (default 250ms), retry up to reload_max_retries (default 40 ≈ 10s)
  • preserve structured reloading failures if Unity takes longer
  • use shared helpers, avoiding per-tool duplication
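
A minimal sketch of what such a shared helper could look like, assuming the defaults quoted above (250 ms, 40 retries). The function name `send_with_reload_retry` and the injected `send` callable are hypothetical and are not taken from the PR's code.

```python
import time


def _is_reloading(response):
    # Structured preflight failure: {"success": false, "state": "reloading", ...}
    return (
        isinstance(response, dict)
        and response.get("success") is False
        and response.get("state") == "reloading"
    )


def send_with_reload_retry(send, command, params,
                           reload_retry_ms=250, reload_max_retries=40,
                           sleep=time.sleep):
    """Call send(command, params), retrying while Unity reports a reload.

    If Unity is still reloading after the last retry, the structured
    reloading response is returned unchanged so the caller can surface it.
    """
    response = send(command, params)
    retries = 0
    while _is_reloading(response) and retries < reload_max_retries:
        sleep(reload_retry_ms / 1000.0)
        retries += 1
        response = send(command, params)
    return response
```

With the defaults, the worst case is 40 sleeps of 250 ms, which is the roughly 10 s window mentioned above. A tool would call it as `send_with_reload_retry(conn.send_command, "manage_gameobject", params)`, where `conn` is whatever connection object the tool already holds.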

Follow-up possibilities (future PR)

  • Add env overrides for reload_retry_ms and reload_max_retries
  • Include a retry counter and total waited ms in timeout responses/logs
  • Optionally expand reload detection to cover socket drops/exception paths
  • Revisit defaults if we see reloads >10s in CI or large projects

Contributor

Awesome @dsarno, great catch and good remedy

…_ms via config; increase default retry window (40 x 250ms); preserve structured reloading failures
Contributor

@msanatan msanatan left a comment

Looks great!

@msanatan msanatan mentioned this pull request Aug 11, 2025
@dsarno dsarno merged commit bbbc26a into CoplayDev:main Aug 11, 2025