Connection architecture refactoring: manager-owned lifecycle, setup integration, diagnostics#304
Merged
Merged
Conversation
Create ICredentialResolver interface and CredentialResolver implementation with canonical resolution: DeviceToken > SharedGatewayToken > BootstrapToken. - Add IDeviceIdentityReader/DeviceIdentityFileReader for testable identity reads - Add GatewayRecord/SshTunnelConfig data model - Add 11 new credential resolver tests covering both operator and node resolution - Create Services/Connection/ directory structure Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pure-logic state machine with operator and node sub-FSMs. GatewayConnectionSnapshot immutable record with DeriveOverall(). ConnectionTrigger, OverallConnectionState, RoleConnectionState enums. 50 exhaustive transition tests covering all valid/invalid paths. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Factory interface wrapping OpenClawGatewayClient construction. IGatewayClientLifecycle separates lifecycle (manager) from data (UI). GatewayClientFactory creates real WebSocket clients. GatewayClientLifecycleAdapter bridges the existing client to the interface. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Gateway catalog with JSON persistence (gateways.json). GatewayRecord immutable record with SshTunnelConfig. IFileSystem/RealFileSystem abstraction for testability. Thread-safe with lock-protected list, events fire outside lock. 15 tests: CRUD, persistence round-trip, SSH config, URL lookup. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixed-capacity ring buffer for timestamped diagnostic events. IClock/SystemClock time abstraction for testable timestamps. 12 tests: recording, overflow, GetRecent, clear, event firing. Phase 1 Foundation complete — all new code, no behavioral changes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add DeviceTokenReceived and HandshakeSucceeded events to both OpenClawGatewayClient and WindowsNodeClient. Events are additive — clients continue to write tokens internally for backward compat. Add DeviceTokenReceivedEventArgs, IDeviceIdentityStore interface. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Manager with interface, state machine, diagnostics, semaphore, generation counter, and client creation via factory. Handles connect/disconnect/reconnect/switch lifecycle. Generation-guarded event handlers for stale event protection. 11 manager tests covering lifecycle, state transitions, events. ConnectionErrorCategory, RetryPolicy, SetupCodeResult types. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add OperatorClientChangedEventArgs with OldClient/NewClient. Manager fires event on client creation and disposal. Enables App to re-wire data event handlers when client changes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Critical switchover: App.xaml.cs no longer creates OpenClawGatewayClient directly. InitializeGatewayClient() now bridges settings to a GatewayRecord and delegates to _connectionManager.ConnectAsync(). - OnOperatorClientChanged re-wires all 27 data event handlers - OnManagerStateChanged maps OverallConnectionState to legacy ConnectionStatus - Disconnect paths use _connectionManager.DisconnectAsync() - _gatewayClient field kept for backward compat, updated by OperatorClientChanged Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Create INodeConnector interface and NodeConnector implementation. NodeConnectionMode enum (Gateway/McpOnly/Disabled). NodeConnector owns WindowsNodeClient lifecycle, delegates capability setup to NodeService (which has WinUI deps and stays in App.xaml.cs). App's InitializeNodeService() remains for now — full migration in Phase 4. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add MigrateFromSettings() to GatewayRegistry — idempotent migration from SettingsData Token/BootstrapToken/SSH to GatewayRecord. Identity file is copied (not moved) for rollback safety. Mark SettingsData.Token/BootstrapToken as legacy in doc comments. 9 migration tests: creation, idempotency, SSH, identity copy, persistence. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implement ApplySetupCodeAsync in GatewayConnectionManager: decode setup code → validate URL → disconnect → create/update GatewayRecord → set active → save registry → connect. All setup code entry points can now call this single method. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Step 3.2: DeviceIdentityStore implements IDeviceIdentityStore for manager-owned token writes via DeviceIdentity. Step 3.3: ISshTunnelManager interface wrapping SshTunnelService. SshTunnelManager delegates to existing service. Phase 3 Unification complete. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Delete UnsubscribeGatewayEvents() (27 unsubscribe lines — now handled by OnOperatorClientChanged). Replace 3 remaining manual disconnect paths with _connectionManager.DisconnectAsync(). Shutdown uses _connectionManager.Dispose() for clean teardown. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Create IOperatorGatewayClient interface in OpenClaw.Shared with all 27 data events, query properties, and connection events. OpenClawGatewayClient now implements this interface. UI consumers can be typed to the interface to prevent lifecycle access. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
SettingsChangeClassifier.Classify() determines minimum reconnect action: NoOp, UiOnly, CapabilityReload, NodeReconnect, OperatorReconnect, FullReconnect. 8 tests covering all classification categories. All 4 phases of the northstar connection architecture are now implemented. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two-column layout adapted from openclaw-windows-node-2: - Left: state machine visual (operator/node sub-FSMs), gateway catalog, credential resolution display - Right: color-coded event timeline with copy/clear Wired to ConnectionDiagnostics and IGatewayConnectionManager. Accessible via Debug page button and 'connectionstatus' tray command. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adapted from node-2 ConnectionPage, wired to our architecture: - Status card: disconnect button + connection attempts counter - Setup code: live decode preview (shows URL + token before apply), wired through manager.ApplySetupCodeAsync() - Recent gateways: list from GatewayRegistry with connect/remove buttons - Live state: subscribes to IGatewayConnectionManager.StateChanged - Advanced settings collapsed by default - Removed connect toggle (replaced with explicit disconnect button) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove mDNS gateway discovery UI (scan button, gateway list, token prompt) and all related code-behind (AutoScanAsync, PopulateGatewayList, OnConnectToGateway, OnConnectWithToken, OnCancelTokenPrompt). Gateway switching is now handled via the Recent Gateways card from GatewayRegistry. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ConnectionStatusWindow now has a SETUP CODE section in the left column: - Paste setup code with live decode preview - Connect button applies code via manager.ApplySetupCodeAsync() - Disconnect button for teardown - Reconnect (empty code + Connect) for quick retry Auto-show diagnostics window on launch when no gateway is configured, enabling end-to-end setup and testing without the Hub UI. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rors Based on connection-analysis-report.md comparing against gateway server and Mac/TS reference implementation: Fix 1 (HIGH): Node client now uses SignConnectPayloadV3 instead of legacy SignPayload/BuildDebugPayload. Gateway verifies v3 on first try. Fix 2 (MEDIUM): Remove operator 4-mode signature cascade (V3Auth→V3Empty→V2Auth→V2Empty). Always use v3 with auth token, matching reference client. Signature rejection = real error. Fix 3 (MEDIUM): Add TryGetErrorDetailCode helper for structured error.details.code parsing (DEVICE_AUTH_SIGNATURE_INVALID, etc.). Fix 4 (LOW): Mark SignPayload and BuildDebugPayload as [Obsolete] on DeviceIdentity with guidance to use SignConnectPayloadV3. Updated 2 tests to match new immediate-failure behavior. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Bugs fixed during end-to-end gateway testing: 1. Bootstrap token misclassification — setup codes always store as BootstrapToken 2. Settings bridge overwrites registry — skip if record exists 3. Stale bootstrap auto-connect — skip startup connect with expired tokens 4. Identity file not synced to per-gateway dir — copy on every connect 5. v3 signature rejected by gateway — v3→v2 fallback (flag persists per session) 6. v2 retry on same socket wastes token — set flag, let reconnect handle it 7. Bootstrap can't request operator.admin — use bounded scopes for bootstrap 8. PairingRequired shown as Error — new PairingRequired event + state 9. WS disconnect during pairing → Error — stays in PairingRequired now 10. Terminal auth errors cause reconnect loops — detect structured error codes 11. SendConnectMessageAsync crashes silently — wrapped with error logging Also added: - ApplySetupCodeAsync clears stored device tokens for fresh pairing - Registry-first startup connect (don't rely on settings) - DiagnosticTeeLogger pipes client logs to Connection Status timeline - Comprehensive handshake logging (role, scopes, signed payload, auth, raw errors) - Direction arrows (→ GW / ← GW) and word-wrap in timeline - 9 setup code flow unit tests - v3→v2 signature fallback with per-gateway persistence Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comprehensive audit of northstar implementation status: what's done, what's wired, what's created-but-unused, what's missing, test failures, and testing instructions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…de connector, per-gateway identity - Fix 3 broken shared tests (GatewayClientTestHelper.Client property, v3→v2 fallback test) - Wire SettingsChangeImpact classifier into OnSettingsSaved (no reconnect for UI-only changes) - Wire DeviceIdentityStore into Manager for token persistence - Finish SshTunnelManager.StartAsync (was stub) - Wire NodeConnector into Manager: starts after operator handshake, routes events to state machine - Share v2 signature flag from operator to node (avoid wasted v3 roundtrip) - Remove legacy Token/BootstrapToken fields from SettingsData and SettingsManager - Clear bootstrap token from GatewayRecord after node pairing completes - Per-gateway identity paths: OpenClawGatewayClient accepts identityPath parameter - Remove root identity sync — all identity operations use per-gateway directories - Add detailed node handshake logging ([HANDSHAKE] tags) matching operator style - Node connector uses DiagnosticTeeLogger for Connection Status window timeline - Fix credentials display to read per-gateway identity file - Pass bootstrap tokens via bootstrapToken parameter (not token) to WindowsNodeClient Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- ConnectionStatusWindow: add ExtendsContentIntoTitleBar + icon for dark title bar - Auto-reconnect after pairing: manager polls every 5s to check if device was approved When still not approved, HandlePairingRequiredAsync fires again and re-schedules - Fix per-gateway identity comment (no longer root) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove auto-reconnect poll (reverted) - Operator: WebSocket disconnect/error suppressed when in PairingRequired state - Node: same — disconnect/error after pairing required preserves Pairing UI state - Gateway closes socket after PAIRING_REQUIRED; state machine now keeps showing Pairing Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Clear bootstrap token from registry after operator pairing (not node) - Add operator device token as fallback in ResolveNode for node role-upgrade pairing - Remove root→per-gateway identity copy from InitializeGatewayClient (per-gateway is source of truth) - On restart: operator uses stored device token, node uses operator token for role-upgrade Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…-upgrade - WindowsNodeClient.BuildConnectAuth: fallback to identity.DeviceToken (operator's) when no NodeDeviceToken/bootstrap exists — sends auth.deviceToken not auth.token - WindowsNodeClient.ResolveRequiredCredential: same fallback to avoid throw - Fixes 'gateway token mismatch' error when node uses operator token for pairing Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Revert: clear bootstrap after node gets device token (not operator) - Revert: remove operator device token fallback (causes 'device token mismatch') - Device tokens are role-specific — operator token cannot auth node role - Bootstrap token stays in registry until node completes its own pairing Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove ExtendsContentIntoTitleBar (caused overlap with window controls) - Allow PairingRequired transition from Error state (both operator and node) Fixes race: Error event fires before PairingRequired, blocking the transition Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add _nodePairingPending and _operatorPairingPending volatile flags - Set flag BEFORE acquiring semaphore in pairing handlers - Check flag BEFORE acquiring semaphore in status handlers - Prevents Disconnected event from winning semaphore race and overwriting PairingRequired - Flags cleared on successful Connected status Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Replace volatile flags with direct PairingStatus/IsPairingRequired checks - Both are set synchronously on the WS thread before async handlers run - Eliminates race: Disconnected handler checks connector state, not a flag that hasn't been set yet because both events fire in same synchronous dispatch Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix node state: add Connecting case to OnNodeStatusChanged (was missing, kept node in Idle) - Fix pairing race: preserve _isPendingApproval in OnDisconnected when _pairingBlocked - Fix EmitStateChanged: always fire (node sub-state changes need to reach UI) - Allow PairingRequired transition from Error state (both operator and node) - Suppress Disconnect/Error events when pairing is pending (check connector state directly) - Title bar: ExtendsContentIntoTitleBar with 40px top padding + title text - PairingStatusEventArgs: add RequestId field for future auto-approve support - Remove failed auto-approve attempt (requires operator.pairing scope) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…button - Replace text-only operator/node status with state machine pill rows - Read GatewayConnectionSnapshot directly (not old ConnectionStatus enum) - Show 'Awaiting Approval' for PairingRequired instead of 'Connecting...' - Pairing guidance card: approval command with Copy button + Connect/Disconnect - Fix connection counter: reset during pairing, don't increment - Read device ID from identity file when snapshot doesn't have it yet Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Diagnostics: add Direct Connect section (URL + Token fields, clears device tokens) - Diagnostics: pairing state feedback (Connect button changes to 'once approved') - Node auto-approve: re-enabled with scope guard (operator.admin/operator.pairing only) - Loop prevention: track lastAutoApprovedRequestId, one attempt per requestId - Shared gateway tokens now request operator.admin scope (removed local-only restriction) - Default URL: ws://localhost:18790 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Replace Advanced Connection Settings expander with Direct Connect card - Gateway URL + Token fields with Connect button (same flow as diagnostics) - Clears stored device tokens so shared token is used with admin scope - Saves SSH tunnel settings alongside - Default URL: ws://localhost:18790 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Store SshTunnelConfig on GatewayRecord when SSH is enabled - Start SSH tunnel via App.EnsureSshTunnelStarted() before connecting - Save SSH settings to SettingsManager for legacy compat - Show 'Starting SSH tunnel...' status during tunnel setup Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Manager: use ws://localhost:{localPort} when record has SshTunnelConfig (operator + node)
- Diagnostics: add SSH toggle + fields to Direct Connect section
- Diagnostics: OnDirectConnect stores SshTunnelConfig on GatewayRecord, starts tunnel
- Connection page: same flow (already wired)
- SSH uses OS ssh client with key-based auth (no password in app)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Phase A: WSL Local Setup - SettingsManagerLocalGatewaySetupSettings syncs Token/BootstrapToken to GatewayRegistry on Save - CreateLocalOnly accepts optional GatewayRegistry parameter - App passes _gatewayRegistry to the factory Phase B: Onboarding Flow - Onboarding ConnectionPage uses Direct Connect flow: creates GatewayRecord + manager.ConnectAsync - Polls manager.CurrentSnapshot instead of app.GatewayClient - LocalSetupProgressPage uses manager.ReconnectAsync instead of ReinitializeGatewayClient - Expose App.ConnectionManager as public property Other: - Shared gateway tokens now request operator.admin scope on all gateways (not just local) - Update test: FreshStandardRemoteDevice now expects admin scopes Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace 8 InitializeGatewayClient/InitializeNodeService call sites with _connectionManager.ReconnectAsync(): - Tray connect toggle - Hub connect action - OnSettingsSaved full reconnect - ReconnectGateway - ReconnectNodeServiceOnly - SSH tunnel restart - Onboarding completion - Startup: kept InitializeGatewayClient but removed manual node init -68 lines from App.xaml.cs (4735 → 4214 cumulative this session) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…age UX Phase 1: Delete dead methods (InitializeNodeService, ReinitializeGatewayClient, ReconnectGateway, ReconnectNodeServiceOnly) and inline callers. Phase 2: Make OnManagerStateChanged the sole writer of _currentStatus, removing redundant writes from event handlers and disconnect actions. Phase 3: Move SSH tunnel lifecycle into GatewayConnectionManager via ISshTunnelManager. Manager starts/stops tunnel in ConnectAsync/DisconnectAsync. Split ISshTunnelManager interface from SshTunnelManager implementation. Phase 4: Widen IOperatorGatewayClient with 35+ request methods (sessions, usage, config, cron, skills, pairing, channels, wizard). Change IGatewayConnectionManager.OperatorClient return type to interface. Phase 5: Eliminate _gatewayClient field from App.xaml.cs, replacing 38 refs with _connectionManager?.OperatorClient. Update HubWindow and OnboardingState to use IOperatorGatewayClient. Phase 6: Add 55 new tests — RetryPolicyTests (35), NodeConnectorTests (8), StaleEventGuardTests (5), PairingFlowTests (7). Connection page UX: Add diagnostics button (opens Connection Status window), add Pending Device Pairing card with scope-gated Approve/Reject buttons (requires operator.admin or operator.pairing scope). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…le completed state When replaceExistingConfigurationConfirmed=true, the engine loaded a completed setup-state.json from a previous run and had nothing to do, leaving the progress page stuck with all stages showing empty circles. Also set AllowExistingDistro=true when replacing config since the WSL distro already exists from the previous run. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace the setup engine's throwaway OpenClawGatewayOperatorConnector with ConnectionManagerOperatorConnector that delegates to the app's GatewayConnectionManager. All operator handshake/pairing events now appear in the Connection Status diagnostics window. Key changes: - Add OperatorPairingRequestId to GatewayConnectionSnapshot so the setup engine can pass requestId to WSL CLI for explicit device approval - ConnectionManagerOperatorConnector: implements IGatewayOperatorConnector via the manager (registry record setup, ConnectAsync, state change wait) - Suppress node auto-connect during setup via _suppressNodeDuringSetup flag (cleared when engine completes/fails/cancels) - Fix state machine: allow WebSocketConnected and HandshakeSucceeded from Error state to handle WebSocketClientBase's built-in auto-reconnect - Setup engine factory accepts optional operatorConnectorOverride parameter Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixes from 5-model adversarial code review: - SSH tunnel lifecycle: don't stop on disconnect, only on gateway switch or dispose. Add browser proxy forward to SshTunnelConfig. - Suppress flag safety: try/catch on engine construction, stale engine guard prevents wrong engine from clearing flag. - Dispose: unsubscribe node events before semaphore disposal. - Token redaction: mask auth payloads, sigToken previews, and signed payloads in diagnostics logs. - Connector: move StateChanged subscription after fast-path checks, disconnect manager on timeout. - State machine: wrap SetNodeEnabled in semaphore, clear stale OperatorPairingRequestId on transition. - UI: re-enable approve/reject buttons on false return, move auto-approve guard after confirmed success. - DisposeActiveClient: sync-wait for node disconnect (2s timeout). - Remove diagnostics window auto-open on startup. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
a159d80 to
789ac76
Compare
Remove 6 working/aspirational documents (moved to external copilotdocs): - connection-architecture-northstar.md - connection-implementation-audit.md - gateway-node-integration.md - wsl-owner-open-issues.md - wsl-owner-validation.md - WINDOWS_NODE_TESTING.md Replace with one concise document describing the current architecture as implemented: components, state machine, credential resolution, setup flow integration, and what remains in App.xaml.cs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
789ac76 to
2cd0367
Compare
There was a problem hiding this comment.
Pull request overview
Refactors the Windows Tray connection layer toward a manager-owned lifecycle centered on GatewayConnectionManager, introducing a gateway registry + credential resolution pipeline, expanded diagnostics/UX surfaces, and broad test coverage to support the new architecture.
Changes:
- Introduces
GatewayConnectionManager+ supporting connection state machine/diagnostics abstractions, and updates UI surfaces to consume the new manager/client interfaces. - Moves gateway credentials out of
SettingsData/settings.jsoninto a new persistedGatewayRegistrymodel and updates onboarding/setup flows accordingly. - Adds substantial unit/integration test coverage for retry policy, registry migration, pairing/state transitions, and stale-event generation guards.
Show a summary per file
| File | Description |
|---|---|
| tests/OpenClaw.Tray.Tests/StartupSetupStateTests.cs | Removes legacy setup gating tests tied to settings-stored tokens. |
| tests/OpenClaw.Tray.Tests/SetupWarningPageGuardPolicyTests.cs | Updates existing-config guard tests to use gateway URL instead of token. |
| tests/OpenClaw.Tray.Tests/SettingsRoundTripTests.cs | Updates settings serialization tests to reflect removal of token fields from SettingsData. |
| tests/OpenClaw.Tray.Tests/OpenClaw.Tray.Tests.csproj | Links new connection/registry/manager sources into the Tray test project. |
| tests/OpenClaw.Tray.Tests/OnboardingExistingConfigGuardTests.cs | Removes legacy tests that treated settings token/bootstrap as “existing config”. |
| tests/OpenClaw.Tray.Tests/LocalSetupProgressGuardTests.cs | Updates local-setup guards to treat non-default gateway URL as “existing config”. |
| tests/OpenClaw.Tray.Tests/LocalGatewaySetupTests.cs | Adjusts setup engine construction tests to match registry-based credential flow. |
| tests/OpenClaw.Tray.Tests/Connection/StaleEventGuardTests.cs | Adds tests for generation-based stale-event suppression in the manager. |
| tests/OpenClaw.Tray.Tests/Connection/SetupCodeFlowTests.cs | Adds integration tests for setup-code decode → registry write → connect. |
| tests/OpenClaw.Tray.Tests/Connection/SettingsChangeImpactTests.cs | Adds tests for reconnect impact classification based on settings diffs. |
| tests/OpenClaw.Tray.Tests/Connection/RetryPolicyTests.cs | Adds tests for retry/backoff behavior. |
| tests/OpenClaw.Tray.Tests/Connection/PairingFlowTests.cs | Adds tests for node pairing transitions driven by INodeConnector events. |
| tests/OpenClaw.Tray.Tests/Connection/NodeConnectorTests.cs | Adds basic lifecycle/state tests for NodeConnector. |
| tests/OpenClaw.Tray.Tests/Connection/GatewayRegistryTests.cs | Adds persistence + mutation tests for the new gateway registry. |
| tests/OpenClaw.Tray.Tests/Connection/GatewayRegistryMigrationTests.cs | Adds tests for migrating legacy settings/identity into the registry format. |
| tests/OpenClaw.Tray.Tests/Connection/GatewayConnectionManagerTests.cs | Adds baseline tests for connect/disconnect/switch/diagnostics surfaces. |
| tests/OpenClaw.Tray.Tests/Connection/CredentialResolverTests.cs | Adds tests for canonical credential resolution order (device/shared/bootstrap). |
| tests/OpenClaw.Tray.Tests/Connection/ConnectionStateMachineTests.cs | Adds tests for the operator/node state machine transitions. |
| tests/OpenClaw.Tray.Tests/Connection/ConnectionDiagnosticsTests.cs | Adds tests for the diagnostics ring buffer behavior. |
| tests/OpenClaw.Shared.Tests/OpenClawGatewayClientTests.cs | Updates shared client tests for new scopes/pairing/signature fallback behavior. |
| tests/OpenClaw.Shared.Tests/DeviceIdentityTests.cs | Adds pragma for legacy/obsolete identity method coverage. |
| src/OpenClaw.Tray.WinUI/Windows/SetupWizardWindow.cs | Stops reading/writing tokens from settings in legacy setup wizard paths. |
| src/OpenClaw.Tray.WinUI/Windows/HubWindow.xaml.cs | Updates Hub to use IOperatorGatewayClient and wires diagnostics/window actions. |
| src/OpenClaw.Tray.WinUI/Windows/ConnectionStatusWindow.xaml | Adds a new diagnostics/status window UI surface. |
| src/OpenClaw.Tray.WinUI/Services/StartupSetupState.cs | Changes setup gating logic (now largely MCP/node-token based). |
| src/OpenClaw.Tray.WinUI/Services/SshTunnelManager.cs | Adds an ISshTunnelManager wrapper around existing SshTunnelService. |
| src/OpenClaw.Tray.WinUI/Services/SettingsManager.cs | Removes token persistence and adds ToSettingsData() snapshot generation. |
| src/OpenClaw.Tray.WinUI/Services/LocalGatewaySetup/LocalGatewaySetup.cs | Integrates registry syncing + setup-state cleanup + manager connector overrides. |
| src/OpenClaw.Tray.WinUI/Services/DeepLinkHandler.cs | Adds a deep-link action hook for opening the connection status surface. |
| src/OpenClaw.Tray.WinUI/Services/ConnectionManagerOperatorConnector.cs | Routes setup-engine operator connect/reconnect through the manager + registry. |
| src/OpenClaw.Tray.WinUI/Services/Connection/SetupCodeResult.cs | Introduces a result model for applying setup codes. |
| src/OpenClaw.Tray.WinUI/Services/Connection/SettingsChangeImpact.cs | Adds reconnect-impact classification logic for settings changes. |
| src/OpenClaw.Tray.WinUI/Services/Connection/RoleConnectionState.cs | Adds per-role connection state enum. |
| src/OpenClaw.Tray.WinUI/Services/Connection/RetryPolicy.cs | Adds retry/backoff policy implementation. |
| src/OpenClaw.Tray.WinUI/Services/Connection/OverallConnectionState.cs | Adds derived overall connection state enum. |
| src/OpenClaw.Tray.WinUI/Services/Connection/OperatorClientChangedEventArgs.cs | Adds event args for operator client instance swaps. |
| src/OpenClaw.Tray.WinUI/Services/Connection/NodeConnector.cs | Adds a lightweight node connector owning WindowsNodeClient lifecycle. |
| src/OpenClaw.Tray.WinUI/Services/Connection/NodeConnectionMode.cs | Adds node connection mode enum. |
| src/OpenClaw.Tray.WinUI/Services/Connection/ISshTunnelManager.cs | Adds SSH tunnel lifecycle abstraction used by the manager. |
| src/OpenClaw.Tray.WinUI/Services/Connection/INodeConnector.cs | Adds a node-connection abstraction for manager-driven orchestration. |
| src/OpenClaw.Tray.WinUI/Services/Connection/IGatewayConnectionManager.cs | Adds the public manager interface for UI + setup integration. |
| src/OpenClaw.Tray.WinUI/Services/Connection/IGatewayClientFactory.cs | Adds factory + lifecycle abstraction for the underlying gateway client. |
| src/OpenClaw.Tray.WinUI/Services/Connection/ICredentialResolver.cs | Adds credential resolver interface + GatewayCredential model. |
| src/OpenClaw.Tray.WinUI/Services/Connection/GatewayRegistry.cs | Adds persisted registry for known gateways + migration support. |
| src/OpenClaw.Tray.WinUI/Services/Connection/GatewayRecord.cs | Adds persisted per-gateway record model incl. tokens + SSH config. |
| src/OpenClaw.Tray.WinUI/Services/Connection/GatewayConnectionSnapshot.cs | Adds immutable snapshot model for cross-thread connection state. |
| src/OpenClaw.Tray.WinUI/Services/Connection/GatewayClientFactory.cs | Adds concrete factory/adaptor for creating gateway client lifecycles. |
| src/OpenClaw.Tray.WinUI/Services/Connection/DeviceIdentityStore.cs | Adds identity store adapter for persisting received device tokens. |
| src/OpenClaw.Tray.WinUI/Services/Connection/CredentialResolver.cs | Implements canonical credential selection order for operator/node roles. |
| src/OpenClaw.Tray.WinUI/Services/Connection/ConnectionTrigger.cs | Adds trigger enum driving the state machine. |
| src/OpenClaw.Tray.WinUI/Services/Connection/ConnectionStateMachine.cs | Adds core operator/node state transition logic and snapshot rebuilding. |
| src/OpenClaw.Tray.WinUI/Services/Connection/ConnectionErrorCategory.cs | Adds error categorization for retry policy and diagnostics. |
| src/OpenClaw.Tray.WinUI/Services/Connection/ConnectionDiagnostics.cs | Adds ring-buffer diagnostics timeline. |
| src/OpenClaw.Tray.WinUI/Services/Connection/ConnectionDiagnosticEvent.cs | Adds diagnostics event model. |
| src/OpenClaw.Tray.WinUI/Pages/DebugPage.xaml.cs | Adds action wiring to open the new connection status surface. |
| src/OpenClaw.Tray.WinUI/Pages/DebugPage.xaml | Adds a “Connection Status Window” button. |
| src/OpenClaw.Tray.WinUI/Pages/ConnectionPage.xaml | Refactors Connection page UX around manager state + pairing guidance + diagnostics. |
| src/OpenClaw.Tray.WinUI/Pages/ChatPage.xaml.cs | Reads shared token from GatewayRegistry for chat URL generation. |
| src/OpenClaw.Tray.WinUI/Onboarding/Services/OnboardingState.cs | Switches onboarding state to use IOperatorGatewayClient. |
| src/OpenClaw.Tray.WinUI/Onboarding/Services/OnboardingExistingConfigGuard.cs | Stops treating settings token/bootstrap fields as config (tokens now in registry). |
| src/OpenClaw.Tray.WinUI/Onboarding/Pages/LocalSetupProgressPage.cs | Uses connection manager reconnect path during local setup completion. |
| src/OpenClaw.Tray.WinUI/Onboarding/Pages/ConnectionPage.cs | Routes onboarding “Test connection” through registry + manager connect flow. |
| src/OpenClaw.Tray.WinUI/Onboarding/OnboardingWindow.cs | Uses registry token for chat overlay URL generation. |
| src/OpenClaw.Shared/WindowsNodeClient.cs | Adds v2 signature fallback flag + richer handshake events + pairing requestId plumbing. |
| src/OpenClaw.Shared/SettingsData.cs | Removes token/bootstrap fields from shared settings model. |
| src/OpenClaw.Shared/OpenClawGatewayClient.cs | Implements IOperatorGatewayClient, adds handshake/pairing/token events, logs redacted auth details, and v2 fallback handling. |
| src/OpenClaw.Shared/Models.cs | Extends PairingStatusEventArgs to include a pairing request id. |
| src/OpenClaw.Shared/IOperatorGatewayClient.cs | Introduces operator-client facade interface with request methods/events. |
| src/OpenClaw.Shared/IFileSystem.cs | Introduces filesystem abstraction for testability. |
| src/OpenClaw.Shared/IDeviceIdentityStore.cs | Introduces device token persistence abstraction. |
| src/OpenClaw.Shared/IDeviceIdentityReader.cs | Introduces device token read abstraction for credential resolution. |
| src/OpenClaw.Shared/IClock.cs | Introduces clock abstraction for deterministic diagnostics tests. |
| src/OpenClaw.Shared/DeviceTokenReceivedEventArgs.cs | Adds event args for device token receipt. |
| src/OpenClaw.Shared/DeviceIdentityFileReader.cs | Adds default identity reader implementation. |
| src/OpenClaw.Shared/DeviceIdentity.cs | Marks legacy signing helpers obsolete and clarifies usage. |
| src/OpenClaw.Cli/Program.cs | Stops loading token from settings and relies on --token override. |
| docs/wsl-owner-open-issues.md | Removes legacy WSL-owner Q&A doc. |
| docs/WINDOWS_NODE_TESTING.md | Removes STT guidance sections. |
| docs/gateway-node-integration.md | Removes stt.transcribe references from docs and examples. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comments suppressed due to low confidence (1)
src/OpenClaw.Tray.WinUI/Services/StartupSetupState.cs:28
RequiresSetupcurrently returnstruefor the default case whereEnableNodeModeis false andEnableMcpServeris false (which is the default inSettingsManager). That would force onboarding on every startup for normal operator-only usage. Consider basing this on whether an operator credential exists (stored operator device token and/or an active gateway record with a usable token) rather than node token + MCP flag.
- Files reviewed: 84/84 changed files
- Comments generated: 11
Comment on lines
+106
to
+124
| public void Save() | ||
| { | ||
| RegistryData data; | ||
| lock (_lock) | ||
| { | ||
| data = new RegistryData { Gateways = _records.ToList(), ActiveId = _activeId }; | ||
| } | ||
| var json = JsonSerializer.Serialize(data, s_jsonOptions); | ||
|
|
||
| var dir = Path.GetDirectoryName(_filePath); | ||
| if (dir != null && !_fs.DirectoryExists(dir)) | ||
| _fs.CreateDirectory(dir); | ||
|
|
||
| // Atomic write: temp file then rename | ||
| var tempPath = _filePath + ".tmp"; | ||
| _fs.WriteAllText(tempPath, json); | ||
| // On Windows, File.Move with overwrite works as atomic rename | ||
| File.Move(tempPath, _filePath, overwrite: true); | ||
| } |
Comment on lines
+176
to
+191
| if (string.IsNullOrWhiteSpace(gatewayUrl)) | ||
| return false; | ||
|
|
||
| // Idempotent: don't duplicate if already migrated | ||
| if (FindByUrl(gatewayUrl) != null) | ||
| { | ||
| logger?.Info($"[Registry] Migration skipped — record already exists for {gatewayUrl}"); | ||
| return false; | ||
| } | ||
|
|
||
| var id = Guid.NewGuid().ToString(); | ||
| var record = new GatewayRecord | ||
| { | ||
| Id = id, | ||
| Url = gatewayUrl, | ||
| SharedGatewayToken = string.IsNullOrWhiteSpace(bootstrapToken) ? token : null, |
Comment on lines
+214
to
+222
| /// <summary> | ||
| /// Creates a snapshot of current settings as an immutable SettingsData record. | ||
| /// Used for settings change classification (no DPAPI protection applied). | ||
| /// </summary> | ||
| public SettingsData ToSettingsData() => new() | ||
| { | ||
| GatewayUrl = GatewayUrl, | ||
| // Token and BootstrapToken are no longer written — GatewayRegistry is the source of truth | ||
| UseSshTunnel = UseSshTunnel, |
Comment on lines
+1331
to
+1341
| // Sync credentials to GatewayRegistry (source of truth for connection architecture) | ||
| if (_registry != null && !string.IsNullOrWhiteSpace(GatewayUrl)) | ||
| { | ||
| var existing = _registry.FindByUrl(GatewayUrl); | ||
| var recordId = existing?.Id ?? System.Guid.NewGuid().ToString(); | ||
| var record = new OpenClawTray.Services.Connection.GatewayRecord | ||
| { | ||
| Id = recordId, | ||
| Url = GatewayUrl, | ||
| SharedGatewayToken = !string.IsNullOrWhiteSpace(Token) ? Token : existing?.SharedGatewayToken, | ||
| BootstrapToken = !string.IsNullOrWhiteSpace(BootstrapToken) ? BootstrapToken : existing?.BootstrapToken, |
Comment on lines
+131
to
+138
| public async Task<GatewayOperatorConnectionResult> ConnectWithStoredDeviceTokenAsync( | ||
| string gatewayUrl, CancellationToken cancellationToken = default) | ||
| { | ||
| _logger.Info("[SetupConnector] Reconnecting with stored device token via manager"); | ||
|
|
||
| // Reconnect — the credential resolver will pick up the stored device token | ||
| await _manager.ReconnectAsync(); | ||
|
|
Comment on lines
+172
to
174
| var token = options.TokenOverride ?? string.Empty; | ||
|
|
||
| return (gatewayUrl ?? string.Empty, token ?? string.Empty, loaded); |
Comment on lines
+114
to
117
| <Button Content="🔌 Connection Status Window" Click="OnOpenConnectionStatus" HorizontalAlignment="Stretch"/> | ||
| <Button x:Uid="DebugPage_Button_114" Content="📋 Copy Support Context" Click="OnCopySupportContext" | ||
| Style="{ThemeResource AccentButtonStyle}" HorizontalAlignment="Stretch"/> | ||
| <Button Content="🔄 Relaunch First-Run Setup" Click="OnRelaunchOnboarding" HorizontalAlignment="Stretch"/> |
Comment on lines
+65
to
+78
| <!-- Operator flow --> | ||
| <StackPanel Grid.Row="0" Margin="0,0,0,4"> | ||
| <StackPanel Orientation="Horizontal" Spacing="4" Margin="0,0,0,2"> | ||
| <TextBlock Text="OPERATOR" FontSize="9" FontWeight="Bold" Foreground="{ThemeResource TextFillColorSecondaryBrush}" CharacterSpacing="80" VerticalAlignment="Center" Width="70"/> | ||
| <Border x:Name="CpOpOff" CornerRadius="3" Padding="6,2"><TextBlock Text="Off" FontSize="10"/></Border> | ||
| <TextBlock Text="→" VerticalAlignment="Center" FontSize="9" Foreground="{ThemeResource TextFillColorSecondaryBrush}"/> | ||
| <Border x:Name="CpOpConnecting" CornerRadius="3" Padding="6,2"><TextBlock Text="Connecting" FontSize="10"/></Border> | ||
| <TextBlock Text="→" VerticalAlignment="Center" FontSize="9" Foreground="{ThemeResource TextFillColorSecondaryBrush}"/> | ||
| <Border x:Name="CpOpConnected" CornerRadius="3" Padding="6,2"><TextBlock Text="Connected" FontSize="10"/></Border> | ||
| </StackPanel> | ||
| <StackPanel Orientation="Horizontal" Spacing="4" Margin="74,0,0,0"> | ||
| <Border x:Name="CpOpPairing" CornerRadius="3" Padding="6,2"><TextBlock Text="Pairing" FontSize="10"/></Border> | ||
| <Border x:Name="CpOpError" CornerRadius="3" Padding="6,2"><TextBlock Text="Error" FontSize="10"/></Border> | ||
| </StackPanel> |
Comment on lines
14
to
+17
| return false; | ||
| } | ||
|
|
||
| return !string.IsNullOrWhiteSpace(settings.Token) || | ||
| !string.IsNullOrWhiteSpace(settings.BootstrapToken) || | ||
| HasStoredNodeDeviceToken(dataPath); | ||
| return HasStoredNodeDeviceToken(dataPath); |
Comment on lines
+33
to
+38
| /// <summary> | ||
| /// Identity directory name, deterministically derived from Id. | ||
| /// GUIDs are path-safe and guarantee uniqueness even if URLs change. | ||
| /// </summary> | ||
| public string IdentityDirName => Id; | ||
| } |
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
shanselman
added a commit
that referenced
this pull request
May 10, 2026
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Refactors the connection layer toward the northstar architecture where GatewayConnectionManager owns the complete connection lifecycle — operator client, node connector, credential resolution, state machine, SSH tunnel, and diagnostics.
Key Changes
Architecture (Phases 1-5)
Setup Engine Integration
Connection Page UX
State Machine Fixes
Security & Robustness (from 5-model adversarial review)
Tests: +55 new tests (878 tray, 1442 shared — all passing)
Documentation