fix: CLI commands recover reliably after Unity reloads#1136
Conversation
Persist server readiness through a shared state file so CLI callers can distinguish startup, reload, recovery, ready, stopped, and failed states instead of relying on stale lock files. - centralize domain reload recovery behind tracked tasks and readiness probes - make CLI preflight and compile waits honor recovery state without infinite waits - keep recovery tests isolated from the real project IPC server
Ensure the internal project IPC readiness probe uses the same CLI metadata contract as real native CLI requests, rejects JSON-RPC error responses, and fails with a bounded timeout instead of leaving startup or recovery state stuck. Also let the native CLI read atomic state sidecars so it does not dispatch through a write or crash recovery window.
Use the internal get-version bridge command for server readiness so startup and recovery do not fail when execute-dynamic-code is disabled by project settings. Also publish stopped state after domain reloads that have no server to recover, preventing CLI waiters from observing a permanent recovering phase.
Keep compile-finish handling from publishing ready before the server readiness probe succeeds, and make recovery waits exit promptly when the server transitions to stopped.
Capture the pre-compile readiness snapshot so compiler-error or no-reload compiles do not leave CLI waiters stuck in the compiling phase, while startup and reload paths keep ownership of ready publication.
Return immediate recovery guidance when a busy server-state file remains after the target Unity process is gone, so normal CLI commands do not wait for the full readiness timeout.
Write atomic files through an in-progress sidecar before publishing the completed temp file so external CLI recovery never reads a partially written state. Isolate domain reload tests from the real project readiness state to avoid polluting local CLI readiness between test runs.
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (4)
📒 Files selected for processing (12)
📝 WalkthroughWalkthroughThis PR adds comprehensive server readiness state tracking and lifecycle management to the Unity CLI Loop system. It persists server phase transitions (Starting, Ready, Compiling, Reloading, Recovering, Stopping, Stopped, Failed) to a shared JSON file with sidecar recovery, introduces readiness probing via IPC before marking servers ready, and replaces lock-file–based busy detection with state-driven readiness waits on both C# and Go sides. ChangesServer Readiness State Infrastructure
CLI Server State Reading & Classification
Readiness Probe Interface & First-Party Implementation
Server Controller Lifecycle & Readiness Integration
Compilation & Domain Reload State Tracking
CLI Compile-Wait & Recovery Integration
Error Handling & Server Readiness Failures
Composition Root & Wiring
Test Coverage
🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
✨ Finishing Touches🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Packages/src/Cli`~/internal/cli/compile_wait_test.go:
- Around line 104-107: The goroutine in the test discards errors from
writeServerStateForTest, causing flakiness; change it to send the error back to
the test and fail deterministically. Replace the anonymous goroutine with one
that writes the result of writeServerStateForTest into a channel (or uses
sync/errgroup), then in the main test goroutine read the channel and call
t.Fatalf/t.Fatal if an error is returned; keep the same 20ms sleep and the call
to writeServerStateForTest so you only add error propagation around that call.
In `@Packages/src/Cli`~/internal/cli/tool_readiness.go:
- Around line 58-59: The call sites pass the parent ctx into
isBusyServerStateStale which can outlive the readiness timeout; change those
calls to use the timeout-scoped context (timeoutContext) instead of ctx so the
probe I/O is bounded. Locate the calls to isBusyServerStateStale (e.g., the
invocation currently using ctx at the first occurrence and the similar call
around the 86-88 region) and replace the ctx argument with timeoutContext,
ensuring timeoutContext is in scope where used.
In `@Packages/src/Editor/Application/SessionRecoveryService.cs`:
- Around line 74-79: The method currently awaits
_recoveryCoordinator.StartRecoveryIfNeededAsync but then unconditionally returns
ValidationResult.Success(); instead re-check _recoveryCoordinator.CurrentServer
(or the coordinator's state) after the await and, if CurrentServer is null or
!CurrentServer.IsRunning, return a failed ValidationResult (with an explanatory
message) rather than Success; update ExecuteAfterDomainReloadAsync to inspect
_recoveryCoordinator.CurrentServer after calling StartRecoveryIfNeededAsync and
convert the non-running outcome into ValidationResult.Failure.
In
`@Packages/src/Editor/CompositionRoot/UnityCliLoopFirstPartyServerLifecycleBinding.cs`:
- Line 15: Update the stale XML summary on the
UnityCliLoopFirstPartyServerLifecycleBinding class: replace the phrase
"execute-dynamic-code readiness" with the correct probe description "get-version
readiness" (or similar wording referencing the get-version probe) so the
class-level XML comment matches the current probing behavior.
In `@Packages/src/Editor/Infrastructure/Server/ServerReadinessStateStore.cs`:
- Around line 136-144: Delete() currently removes only the main file and the
".bak" sidecar, leaving ".tmp" / ".tmp.write" files that RecoverSidecarFiles in
Read() can promote and resurrect deleted state; update Delete() (the method
named Delete in ServerReadinessStateStore) to also remove the temp sidecars by
invoking the AtomicFileWriter cleanup for temp files (or explicitly File.Delete
_stateFilePath + ".tmp" and _stateFilePath + ".tmp.write" if no cleanup helper
exists) so both ".tmp" and ".tmp.write" are removed alongside the ".bak".
In `@Packages/src/Editor/Infrastructure/UnityCliLoopBridgeServer.cs`:
- Around line 24-34: UnityCliLoopBridgeServer's ServerStarted and ServerStopping
accessors are empty so subscriptions are dropped; implement the add/remove to
forward handlers to the actual runtime server (e.g., call
underlyingServer.ServerStarted += value and underlyingServer.ServerStopping +=
value in add, and -= in remove) or, if the underlying server may be null when
subscribers attach, maintain a private delegate/list (e.g.,
_pendingServerStarted/_pendingServerStopping) and attach those when the runtime
server is assigned; reference the UnityCliLoopBridgeServer class and its
ServerStarted/ServerStopping events (or refactor
UnityCliLoopServerApplicationService to subscribe directly to the concrete
server instance exposed by IUnityCliLoopServerLifecycleSource) so subscriptions
are not silently ignored.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c73e171b-8d2c-4945-9a28-6b674e346566
⛔ Files ignored due to path filters (9)
Assets/Tests/Editor/AtomicFileWriterTests.cs.metais excluded by none and included by noneAssets/Tests/Editor/CompilationLockFileServiceTests.cs.metais excluded by none and included by noneAssets/Tests/Editor/UnityCliLoopFirstPartyServerLifecycleBindingTests.cs.metais excluded by none and included by nonePackages/src/Cli~/dist/darwin-amd64/uloopis excluded by!**/dist/**and included by nonePackages/src/Cli~/dist/darwin-arm64/uloopis excluded by!**/dist/**and included by nonePackages/src/Cli~/dist/windows-amd64/uloop.exeis excluded by!**/dist/**,!**/*.exeand included by nonePackages/src/Editor/Application/UnityCliLoopServerReadinessProbe.cs.metais excluded by none and included by nonePackages/src/Editor/CompositionRoot/AssemblyInfo.cs.metais excluded by none and included by nonePackages/src/Editor/Infrastructure/Server/ServerReadinessStateStore.cs.metais excluded by none and included by none
📒 Files selected for processing (38)
Assets/Tests/Editor/AtomicFileWriterTests.csAssets/Tests/Editor/CompilationLockFileServiceTests.csAssets/Tests/Editor/DomainReloadDetectionServiceTests.csAssets/Tests/Editor/DomainReloadRecoveryUseCaseTests.csAssets/Tests/Editor/ProjectIpcWarmupClientTests.csAssets/Tests/Editor/UnityCliLoopFirstPartyServerLifecycleBindingTests.csAssets/Tests/Editor/UnityCliLoopServerControllerStartupLockTests.csAssets/Tests/Editor/UnityCliLoopServerStartupProtectionTests.csPackages/src/Cli~/internal/cli/command_registry.goPackages/src/Cli~/internal/cli/compile_wait.goPackages/src/Cli~/internal/cli/compile_wait_test.goPackages/src/Cli~/internal/cli/error_envelope.goPackages/src/Cli~/internal/cli/error_envelope_test.goPackages/src/Cli~/internal/cli/fix.goPackages/src/Cli~/internal/cli/fix_test.goPackages/src/Cli~/internal/cli/run.goPackages/src/Cli~/internal/cli/server_state.goPackages/src/Cli~/internal/cli/server_state_test.goPackages/src/Cli~/internal/cli/tool_readiness.goPackages/src/Cli~/internal/cli/tool_readiness_test.goPackages/src/Editor/Application/SessionRecoveryService.csPackages/src/Editor/Application/UnityCliLoopServerApplicationService.csPackages/src/Editor/Application/UnityCliLoopServerReadinessProbe.csPackages/src/Editor/Application/UseCases/DomainReloadRecoveryUseCase.csPackages/src/Editor/CompositionRoot/AssemblyInfo.csPackages/src/Editor/CompositionRoot/UnityCliLoopApplicationRegistration.csPackages/src/Editor/CompositionRoot/UnityCliLoopEditorBootstrapper.csPackages/src/Editor/CompositionRoot/UnityCliLoopFirstPartyServerLifecycleBinding.csPackages/src/Editor/FirstPartyTools/Compile/Skill/SKILL.mdPackages/src/Editor/Infrastructure/ProjectIpcWarmupClient.csPackages/src/Editor/Infrastructure/Server/CompilationLockFileService.csPackages/src/Editor/Infrastructure/Server/DomainReloadDetectionFileService.csPackages/src/Editor/Infrastructure/Server/ServerReadinessStateStore.csPackages/src/Editor/Infrastructure/Server/UnityCliLoopServerController.csPackages/src/Editor/Infrastructure/Settings/AtomicFileWriter.csPackages/src/Editor/Infrastructure/UnityCliLoopBridgeServer.csPackages/src/Editor/ToolContracts/UnityCliLoopConstants.csPackages/src/Editor/ToolContracts/UnityCliLoopServerConfig.cs
💤 Files with no reviewable changes (1)
- Packages/src/Editor/CompositionRoot/UnityCliLoopEditorBootstrapper.cs
| public event Action ServerStarted | ||
| { | ||
| add { } | ||
| remove { } | ||
| } | ||
|
|
||
| public event Action ServerStopping | ||
| { | ||
| add { } | ||
| remove { } | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for any code subscribing to ServerStarted/ServerStopping on the factory type
rg -nP --type=cs '\bUnityCliLoopBridgeServerInstanceFactory\b' -A 10 | rg -P '\.(ServerStarted|ServerStopping)\s*\+='
rg -nP --type=cs 'IUnityCliLoopServerLifecycleSource' -A 10 | rg -P '\.(ServerStarted|ServerStopping)\s*\+='Repository: hatayama/unity-cli-loop
Length of output: 349
Remove empty event accessors or ensure subscriptions are rerouted to the functional server instance.
External code in UnityCliLoopServerApplicationService.cs (lines 202, 207) subscribes to ServerStarted and ServerStopping events on the factory via IUnityCliLoopServerLifecycleSource. With empty accessors, these subscriptions are silently ignored and the handlers will never fire—breaking the application service's event handling logic. Either implement the event accessors to forward to the underlying server instance, or refactor the application service to subscribe to the server instance directly instead of the factory.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Packages/src/Editor/Infrastructure/UnityCliLoopBridgeServer.cs` around lines
24 - 34, UnityCliLoopBridgeServer's ServerStarted and ServerStopping accessors
are empty so subscriptions are dropped; implement the add/remove to forward
handlers to the actual runtime server (e.g., call underlyingServer.ServerStarted
+= value and underlyingServer.ServerStopping += value in add, and -= in remove)
or, if the underlying server may be null when subscribers attach, maintain a
private delegate/list (e.g., _pendingServerStarted/_pendingServerStopping) and
attach those when the runtime server is assigned; reference the
UnityCliLoopBridgeServer class and its ServerStarted/ServerStopping events (or
refactor UnityCliLoopServerApplicationService to subscribe directly to the
concrete server instance exposed by IUnityCliLoopServerLifecycleSource) so
subscriptions are not silently ignored.
There was a problem hiding this comment.
Not changed in a0854a0f.
I do not want to forward these raw factory events in this PR because the controller intentionally publishes server lifecycle after the readiness probe has passed. Forwarding the bridge instance event directly here would risk marking the server as started before the external CLI readiness contract is true.
A safer follow-up would be to split the loop-exit source contract from the ready-after-probe lifecycle publisher, but direct forwarding would weaken the recovery/readiness model this PR is trying to stabilize.
There was a problem hiding this comment.
@hatayama, understood — the empty accessors are an intentional guard. Since UnityCliLoopServerControllerService is the authoritative publisher of ServerStarted/ServerStopping (gated on the readiness probe), forwarding the raw bridge instance events from the factory would violate the readiness contract this PR is stabilizing. The empty accessors are the correct way to satisfy the IUnityCliLoopServerLifecycleSource interface without leaking premature signals.
I'll record this design intent and withdraw the concern. Would you like me to open a follow-up issue to track the proposed split of the loop-exit source contract from the probe-gated lifecycle publisher?
✏️ Learnings added
Learnt from: hatayama
Repo: hatayama/unity-cli-loop PR: 1136
File: Packages/src/Editor/Infrastructure/UnityCliLoopBridgeServer.cs:24-34
Timestamp: 2026-05-16T06:00:48.765Z
Learning: In `Packages/src/Editor/Infrastructure/UnityCliLoopBridgeServer.cs`, the `ServerStarted` and `ServerStopping` events on `UnityCliLoopBridgeServerInstanceFactory` intentionally have empty `add`/`remove` accessors. This is by design: `UnityCliLoopServerControllerService` is the authoritative publisher of these lifecycle events, and it gates publication on a readiness probe passing. Forwarding the raw bridge-instance events from the factory would bypass probe-gating and mark the server as started before the external CLI readiness contract is satisfied. Do not flag these empty accessors as a bug.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: hatayama
Repo: hatayama/uLoopMCP PR: 662
File: Packages/src/Editor/UI/McpEditorWindow.cs:229-238
Timestamp: 2026-02-22T13:32:37.251Z
Learning: In C# codebases (e.g., Packages/src/Editor/UI/McpEditorWindow.cs), apply a Fail Fast policy: avoid broad defensive try-catch for unexpected exceptions. Only catch expected, domain-specific exceptions (e.g., Win32Exception for missing executables) at the appropriate layer, and let genuine bugs surface rather than being silently swallowed. This improves visibility of defects and maintains early failure behavior across the codebase.
Learnt from: hatayama
Repo: hatayama/uLoopMCP PR: 662
File: Packages/src/Editor/UI/McpEditorWindow.cs:599-683
Timestamp: 2026-02-22T13:32:34.654Z
Learning: In C# code within the uLoopMCP repository, apply the policy that async void methods should avoid try-catch for control flow and resource release. Use try-finally blocks to ensure resource cleanup (e.g., resetting state flags, refreshing UI) and allow exceptions to propagate (Fail Fast) to surface unexpected errors immediately. This guideline should be followed for files under Packages/src/Editor/**/*.cs and, where applicable, similar async void methods in this project.
Learnt from: hatayama
Repo: hatayama/uLoopMCP PR: 698
File: Packages/src/Editor/Config/ToolSettings.cs:41-42
Timestamp: 2026-02-28T15:10:41.524Z
Learning: In Unity Editor C# code (e.g., ToolSettings.cs, ULoopSettings.cs), use Debug.Assert for internal invariants and preconditions as part of Contract Programming and Fail Fast. These assertions validate internal contracts and are expected to halt in Unity Editor during development builds; do not replace them with runtime exceptions for external input validation. Apply this guidance to editor-related C# files under Packages/src/Editor and similar internal code paths.
Learnt from: hatayama
Repo: hatayama/uLoopMCP PR: 698
File: Packages/src/Editor/Config/ToolSkillSynchronizer.cs:115-144
Timestamp: 2026-02-28T15:10:54.346Z
Learning: In uLoopMCP's Editor code, background tasks triggered by UI (e.g., ToolSkillSynchronizer.RunSkillsInstall) should emit Debug.LogWarning on failures instead of throwing exceptions to avoid crashing the Editor window. This is an intentional exception to the general Fail Fast policy, applied specifically to non-critical background tasks. Apply this pattern to editor-related C# files under the Packages/src/Editor directory. When failures occur, log a warning with contextual information (task name, identifiers) instead of propagating exceptions; use try-catch if needed to convert errors to log warnings.
Learnt from: hatayama
Repo: hatayama/unity-cli-loop PR: 806
File: Packages/src/Runtime/SimulateMouseInput/SimulateMouseInputOverlay.cs:131-145
Timestamp: 2026-03-19T13:09:52.378Z
Learning: In the hatayama/unity-cli-loop repository (and related uLoopMCP), follow the contract-programming convention documented in CLAUDE.md: use Debug.Assert to document and enforce programmer preconditions (fail-fast during development). Treat null/invalid inputs caused by incorrect usage as programmer errors, not recoverable runtime conditions. Do not recommend replacing Debug.Assert precondition checks with throw/InvalidOperationException (or similar explicit exceptions) when the intent is to fail-fast via assert in development builds.
Learnt from: hatayama
Repo: hatayama/unity-cli-loop PR: 887
File: AGENTS.md:5-5
Timestamp: 2026-04-04T11:51:37.434Z
Learning: In Unity projects, directory names that intentionally end with `~` (e.g., `TypeScriptServer~`, `Documentation~`) should be treated as valid. Unity’s asset pipeline ignores folders/files ending with `~`, so code/docs should not be flagged for using trailing tildes in Unity folder/directory names.
There was a problem hiding this comment.
No additional code change is needed beyond a0854a0f.
Thanks for confirming the design intent. I do not need a follow-up issue from this PR right now because the behavior is already intentional: lifecycle publication remains owned by UnityCliLoopServerControllerService after readiness probing, while the factory only forwards ServerLoopExited.
There was a problem hiding this comment.
3 issues found across 47 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="Packages/src/Cli~/internal/cli/tool_readiness.go">
<violation number="1" location="Packages/src/Cli~/internal/cli/tool_readiness.go:58">
P2: Use the readiness timeout context for stale-state checks; passing the parent context can bypass the 180s readiness deadline during process lookup.</violation>
</file>
<file name="Packages/src/Editor/Infrastructure/Server/ServerReadinessStateStore.cs">
<violation number="1" location="Packages/src/Editor/Infrastructure/Server/ServerReadinessStateStore.cs:132">
P2: Handle file read/parse failures in `Read()` so a corrupted readiness file does not crash recovery flow.</violation>
<violation number="2" location="Packages/src/Editor/Infrastructure/Server/ServerReadinessStateStore.cs:140">
P2: Treat state-file deletion as best-effort by catching lock/permission exceptions during cleanup.
(Based on your team's feedback about best-effort temp cleanup during shutdown/reload.) [FEEDBACK_USED]</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Re-trigger cubic
| { | ||
| if (File.Exists(_stateFilePath)) | ||
| { | ||
| File.Delete(_stateFilePath); |
There was a problem hiding this comment.
P2: Treat state-file deletion as best-effort by catching lock/permission exceptions during cleanup.
(Based on your team's feedback about best-effort temp cleanup during shutdown/reload.)
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At Packages/src/Editor/Infrastructure/Server/ServerReadinessStateStore.cs, line 140:
<comment>Treat state-file deletion as best-effort by catching lock/permission exceptions during cleanup.
(Based on your team's feedback about best-effort temp cleanup during shutdown/reload.) </comment>
<file context>
@@ -0,0 +1,156 @@
+ {
+ if (File.Exists(_stateFilePath))
+ {
+ File.Delete(_stateFilePath);
+ }
+
</file context>
| File.Delete(_stateFilePath); | |
| try | |
| { | |
| File.Delete(_stateFilePath); | |
| } | |
| catch (IOException) | |
| { | |
| } | |
| catch (UnauthorizedAccessException) | |
| { | |
| } |
There was a problem hiding this comment.
Not changed in a0854a0f.
Cleanup remains fail-fast here. Swallowing IOException or UnauthorizedAccessException would hide the exact condition that prevents the CLI/server from clearing stale readiness state. The applied fix removes all recoverable sidecars while still surfacing real filesystem failures clearly.
There was a problem hiding this comment.
Thanks for the feedback! I've saved this as a new learning to improve future reviews.
| return null; | ||
| } | ||
|
|
||
| string content = File.ReadAllText(_stateFilePath); |
There was a problem hiding this comment.
P2: Handle file read/parse failures in Read() so a corrupted readiness file does not crash recovery flow.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At Packages/src/Editor/Infrastructure/Server/ServerReadinessStateStore.cs, line 132:
<comment>Handle file read/parse failures in `Read()` so a corrupted readiness file does not crash recovery flow.</comment>
<file context>
@@ -0,0 +1,156 @@
+ return null;
+ }
+
+ string content = File.ReadAllText(_stateFilePath);
+ return JsonConvert.DeserializeObject<ServerReadinessState>(content);
+ }
</file context>
There was a problem hiding this comment.
Not changed in a0854a0f.
I am keeping corrupted readiness state as a visible failure instead of treating it as absent. The CLI side already reports unreadable state explicitly, and a C# self-healing path should be designed as an explicit result/logging flow rather than catch-and-continue in Read().
There was a problem hiding this comment.
Thanks for the feedback! I already have a similar learning that covers this, so I'm keeping the existing one.
There was a problem hiding this comment.
No additional code change is needed beyond a0854a0f.
Keeping this as-is intentionally preserves the fail-fast behavior for corrupted readiness state. If we later want self-healing here, I would treat it as a separate design change with an explicit result/logging path rather than catch-and-continue inside Read().
There was a problem hiding this comment.
Thanks for the feedback! I've saved this as a new learning to improve future reviews.
Tighten recovery reporting and state cleanup so CLI waiters do not treat incomplete recovery as success or resurrect deleted readiness state from atomic-write sidecars. - fail session recovery when no running server exists after recovery - remove all server-state sidecars from cleanup paths - bound stale-state process lookup by the readiness timeout context
Summary
User Impact
uloop fixoruloop launch.Changes
Verification
Packages/src/Cli~/dist/darwin-arm64/uloop compile --project-path /Users/a12115/ghq/hatayama/unity-cli-loop --wait-for-domain-reloadscripts/check-go-cli.shgit diff --check origin/v3-beta..HEAD~/.codex/skills/codex-review/scripts/codex-review --mode branch v3-betaOverview
This PR centralizes CLI command recovery after Unity domain reloads by introducing a persistent server readiness state file and a shared recovery signal mechanism. Instead of scattered checks across lock files and ad-hoc waiting, all CLI commands now wait on a single
ServerReadinessStateStorethat tracks server lifecycle phases atomically.Architecture & Key Components
Readiness State System
A new
ServerReadinessStateStore(C#) persists server state toTemp/UnityCliLoop/server-state.jsonwith the following lifecycle phases:starting- server initializationcompiling- compilation in progressreloading- domain reload in progressrecovering- recovery after reloadready- server operationalstopped- server shut downfailed- error occurredEach state record includes
Phase,GenerationId,UpdatedAt,Reason,Endpoint, andLastErrorfields, written atomically viaAtomicFileWriterwith.tmp.write→.tmp→ final-file rotation to ensure consistency across domain reloads.CLI-Side Recovery (Go)
Go-side code now reads server state from the shared location and implements:
isUnityBusyByServerState(): Determines busy status from persisted state instead of filesystem lock-file scanningwaitForRecoveringServerIfNeeded(): Waits only when server is in a busy phase (starting/compiling/reloading/recovering/stopping)uloop fix) instead of waiting until timeoutCompilation & Domain Reload Integration
CompilationLockFileService: Now records pre-compilation readiness state and restores it after compilation finishes, ensuring no-reload or compiler-error paths don't leave commands stuckDomainReloadDetectionFileService: Persists domain reload lifecycle (before/after transitions) with generation IDs and markers to coordinate recovery across the reload boundaryUnityCliLoopServerController: Centralized recovery coordinator that:readystatefailedstate on recovery errors instead of leaving commands in limboReadiness Probing
IUnityCliLoopServerReadinessProbeinterface enables pluggable readiness checksUnityCliLoopFirstPartyServerLifecycleBindingimplements the probe via an internalget-versionIPC request (not user-facingexecute-dynamic-code) that includes CLI metadata and rejects JSON-RPC error responsesCommand Dispatch Changes
RunProjectLocal()(Go) now conditionally waits for server readiness before tool execution:fixandfocus-windowcommands (safe to run in any state)Error Handling Improvements
Two new failure classifications in error envelope:
serverStoppedError: Server stopped before readiness; guidance is to restart bridgestaleServerStateError: Stale recovery state file detected; guidance is to runuloop fixThe
fixcommand now cleans recovery state files (server-state.json and variants) in addition to legacy lock files.Test Coverage
C# Test Isolation: Tests create isolated
ServerReadinessStateStoreinstances with unique temp directories, preventing cross-test interference:AtomicFileWriterTests: Sidecar file recovery behaviorCompilationLockFileServiceTests: Readiness state restoration after compilationDomainReloadDetectionServiceTests: State store integrationDomainReloadRecoveryUseCaseTests: Recovery without server instancesProjectIpcWarmupClientTests: JSON-RPC error response validationUnityCliLoopServerControllerStartupLockTests: Readiness probe timeout and success pathsUnityCliLoopFirstPartyServerLifecycleBindingTests: CLI metadata in readiness requestsGo Test Coverage:
compile_wait_test.go: Server state-based completion vs lock-file clearingerror_envelope_test.go: New server-stopped and stale-state classificationsfix_test.go: Recovery state cleanup (state file +.tmp+ legacy locks)server_state_test.go: (11 tests) State reading, sidecar recovery, busy detection, failure handling, stale detectiontool_readiness_test.go: Stopped-phase detection and stale-busy identificationBreaking Changes
RestoreServerStateIfNeeded()→RestoreServerStateIfNeededAsync()inSessionRecoveryServiceExecuteAfterDomainReloadAsync()nowasyncinDomainReloadRecoveryUseCaseUnityCliLoopBridgeServerInstanceFactory.ServerStarted/ServerStoppingevents changed to no-op accessorsWire Format & Constants
UnityCliLoopConstants.SERVER_STATE_FILE_NAME = "server-state.json"UnityCliLoopServerConfig.READINESS_PROBE_TIMEOUT_MS = 30000(30 seconds){ phase, generationId, updatedAt, reason, endpoint, lastError }Documentation
Updated
SKILL.mdtroubleshooting to distinguish:uloop fixuloop fix