fix: setup flow, connection page, and chat token bugs #2
Closed
ranjeshj wants to merge 86 commits into
- Add ExtendsContentIntoTitleBar + custom titlebar to OnboardingWindow, SetupWizardWindow, and WelcomeDialog to match HubWindow/CanvasWindow
- Standardize titlebar height (48px), padding, emoji (FontSize 20), and title text (FontSize 13, CaptionTextBlockStyle) across all windows
- CanvasWindow: update height 40->48px, emoji size 14->20, add FontSize 13
- CanvasWindow: move reload button inline next to title in separate grid column for proper click handling within titlebar drag region
- Fix OnboardingWindow chat overlay sizing to use contentGrid.SizeChanged instead of rootGrid to avoid double-subtracting titlebar height

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace constructor-injected sample data with empty/loading states across Usage, Sessions, Nodes, Channels, Skills, and Cron pages. Skills and Cron APIs were already wired; this removes stale warnings and misleading placeholder data.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Skip the handle-resolved-path containment check when GetFinalPathFromHandle returns an empty value on non-Windows, while preserving the earlier symlink-resolution guard.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add a WindowsFactAttribute for Windows-only tray tests and use it for the DPAPI settings-secret test so non-Windows runs skip the unsupported API cleanly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add AutomationId (CanvasTitlebarReloadButton) and accessible name (Reload Canvas) to the icon-only reload button in the Canvas window titlebar. This enables UI automation discovery and screen reader announcement for the button. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Expand TokenSanitizer coverage and simplify ExecApprovalPolicy.Save() to serialize the same defensive snapshot used by GetPolicyData().

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add coverage for HttpUrlRiskEvaluator boundary cases and BrowserProxy capability path/port/query behavior.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Focuses the chat WebView when the popup is shown and asks the loaded chat document to focus the first visible textbox-style input so users can type immediately. Adds a regression test covering both the show and navigation success paths. Fixes openclaw#279 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…#278)

* Initial plan
* fix: revert repo-assist setup action SHA from v0.71.3 to v0.68.3 for consistency

Agent-Logs-Url: https://github.com/openclaw/openclaw-windows-node/sessions/3b723d17-00bc-44a9-89d7-76bf9a9f9d1a
Co-authored-by: shanselman <2892+shanselman@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: shanselman <2892+shanselman@users.noreply.github.com>
Previously, clicking Canvas in the tray menu when the window was already open did nothing because Activate() was only called when creating a new window. Move Activate() outside the creation guard so it always runs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Size the tray menu against the target cursor monitor DPI instead of the hidden window's stale size, and invalidate cached flyout geometry when DPI or rasterization scale changes. This keeps the first tray menu open after a display-scale change from rendering with stale compressed measurements. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Overhaul the Agent Events page with persistent caching, event deduplication, resolved stream classification, expandable event cards, and clearer summaries/badges.

Includes a maintainer follow-up to ensure assistant events expand to full text and raw JSON remains hidden for assistant/error/lifecycle streams.

Co-authored-by: Christine Yan <christineyan@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… token handling (openclaw#287)

* fix: connection stability - stop node reconnect storms, fix bootstrap token handling

Critical fixes for connection management bugs introduced in PR openclaw#272:

1. Node reconnect storm during pairing (WindowsNodeClient)
   - Added ShouldAutoReconnect() override with _pairingBlocked flag
   - Flag survives OnDisconnected() (which clears _isPendingApproval)
   - Added rate-limit detection for terminal auth errors
   - Marked _pairingBlocked/_rateLimited as volatile for thread safety
   - Clear _rateLimited on successful hello-ok (transient, not permanent)
2. Backoff jitter (WebSocketClientBase)
   - Added 0-25% random jitter to prevent thundering herd when operator + node clients reconnect simultaneously
3. Client leak on reinitialize (App.xaml.cs)
   - Added _gatewayClient?.Dispose() before creating new client
   - Old clients were keeping reconnect loops alive as zombies
4. Bootstrap token not saved as Settings.Token
   - Setup code decoder no longer persists bootstrap to Settings.Token
   - Prevents reconnect storms on app restart with stale bootstrap token
   - TestConnection skips writing bootstrap value to Settings.Token
   - InitializeGatewayClient falls back to BootstrapToken for bootstrap flow
5. Token PasswordBox → TextBox
   - Users can see what they pasted (SetupWizardWindow + ConnectionPage)
6. Clear stale tray data on disconnect
   - Sessions/channels/nodes/models cleared when disconnected/error
   - Tray menu no longer shows old data alongside 'Disconnected'
7. Onboarding UX fixes
   - Removed disruptive auto-paste-on-focus from setup code field
   - Setup code state only updates on valid decode (prevents focus loss)
   - Added 'Relaunch First-Run Setup' button to Debug page

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: increase PowerShell echo test timeout to 30s for slow CI runners

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
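For readers unfamiliar with the backoff-jitter item above, here is a minimal, language-neutral sketch in Python of exponential backoff with 0-25% additive jitter. It is not the project's C# code: the function name, the base delay, the cap, and the doubling schedule are all assumptions chosen for illustration. Only the 0-25% jitter range comes from the commit message.

```python
import random


def backoff_delay(attempt, base_seconds=1.0, max_seconds=60.0,
                  jitter_fraction=0.25, rng=random.random):
    """Return a reconnect delay in seconds for the given attempt number.

    The delay doubles per attempt (assumed schedule), is capped at
    max_seconds (assumed cap), and is then stretched by a random factor
    in [0%, 25%) so that two clients that disconnected at the same
    moment do not reconnect at the same moment.
    """
    capped = min(max_seconds, base_seconds * (2 ** attempt))
    # rng() yields a float in [0.0, 1.0); injectable so tests are deterministic.
    return capped * (1.0 + jitter_fraction * rng())


if __name__ == "__main__":
    for attempt in range(5):
        print(attempt, round(backoff_delay(attempt), 3))
```

Because the jitter is additive on top of the capped delay, the effective upper bound in this sketch is 1.25 times the cap. Whether the real client applies jitter before or after its cap is not stated in the commit message.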
…law#290)

The six individual sb.Append() calls that set fixed SSH connection options (-o BatchMode=yes, ExitOnForwardFailure=yes, ServerAliveInterval, ServerAliveCountMax, TCPKeepAlive, -N) are replaced by a single const string BaseOptions passed to the StringBuilder constructor.

Benefits:
- The full set of SSH connection options is visible in one place, making it easy to review the connection policy or adjust an option without scanning the Append chain.
- The compiler folds all the string literals at compile time (no runtime allocation or concatenation for the static portion).
- BuildArguments is shorter and the dynamic parts (port forwards, user@host) stand out more clearly.

No functional change; all existing SshTunnelCommandLineTests pass.

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…openclaw#282) Update all three direct SDK.BuildTools references from 10.0.26100.4654 to 10.0.28000.1839 to align with the transitive requirement introduced by OpenClawTray.FunctionalUI's indirect dependency. Without this bundle update, Dependabot PR openclaw#268 (which only updated OpenClawTray.FunctionalUI.csproj) causes NU1605 downgrade errors on OpenClaw.Tray.WinUI and OpenClaw.Tray.UITests because TreatWarningsAsErrors is enabled in tests/Directory.Build.props. Supersedes Dependabot PR openclaw#268. Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds BridgeMessageReceived + PostBridgeMessage to CanvasWindow following the same pattern as WebChatWindow (c7630fa), closing the CanvasWindow item on the openclaw#191 checklist. Removes SendA2UIMessageAsync, ResetA2UIAsync, and their heuristic ExecuteScriptAsync helpers; both had no active callers and are replaced by the bridge. IsTrustedBridgeSource accepts only _trustedGatewayOrigin and openclaw-canvas.local. Source-scan test added in TrayMenuWindowMarkupTests. Co-authored-by: AlexAlves87 <alexalves87@github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…locked (openclaw#289)

Replace the per-character IsControl/IsWhiteSpace loop with:

1. span.IndexOfAnyInRange('\x00', '\x20'): a single vectorized (SIMD) pass that detects all ASCII control chars (0x01–0x1F) and space (0x20).
2. span.IndexOf('\x7F'): catches DEL, which lies outside the range above.
3. A short fallback loop restricted to chars > 0x7F: non-ASCII control/whitespace (rare; env var names are almost always ASCII).

The original three-case vectorized IndexOfAny(['=','\0','\r','\n']) is kept as-is; the new range scan replaces only the subsequent foreach loop.

IsBlocked is called on every environment variable supplied with a system.run command, so the hot path (ASCII-only names that clear all checks) now runs in O(n/SIMD_width) instead of O(n * 2 calls/char).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
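The three-step check described above can be sketched in Python to show the logic only. Python has no equivalent of the vectorized span APIs, so this says nothing about performance. The function name is hypothetical, the earlier '=' / NUL / CR / LF check is deliberately left out, and using `str.isspace()` plus Unicode category `Cc` as a stand-in for .NET's `char.IsWhiteSpace` / `char.IsControl` is an approximation.

```python
import re
import unicodedata

# Step 1 pattern: every code point from 0x00 through 0x20 (ASCII controls + space).
_ASCII_CONTROL_OR_SPACE = re.compile(r"[\x00-\x20]")


def has_control_or_whitespace(name: str) -> bool:
    """Return True if an env var name contains control or whitespace chars."""
    # Step 1: one range scan covering 0x00-0x20.
    if _ASCII_CONTROL_OR_SPACE.search(name):
        return True
    # Step 2: DEL (0x7F) sits outside that range, so check it separately.
    if "\x7f" in name:
        return True
    # Step 3: fallback for non-ASCII characters only (rare in practice).
    for ch in name:
        if ord(ch) > 0x7F and (ch.isspace() or unicodedata.category(ch) == "Cc"):
            return True
    return False


if __name__ == "__main__":
    for candidate in ["PATH", "MY VAR", "A\x7fB", "A\u00a0B"]:
        print(repr(candidate), has_control_or_whitespace(candidate))
```

The ordering mirrors the commit message: the two cheap ASCII checks run first, and the per-character loop only inspects characters above 0x7F.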
…nclaw#288) * Add Windows STT transcribe capability Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * stt: privacy hardening, localization, and test coverage Review-driven cleanup on top of the initial stt.transcribe capability. No behavior change for successful invocations. Privacy: * SttCapability no longer echoes the caller-supplied language tag in the "Invalid language tag" error, and no longer interpolates the underlying exception's Message into "Transcribe failed". Both could end up in the recent-activity stream and BuildSupportBundle output, which can be shared off-device. Full detail still goes to the local logger. * App.OnNodeInvokeCompleted now sanitizes failed-invoke details for privacy-sensitive commands (stt.transcribe, camera.snap/clip, screen.snapshot/record). Recent activity and support bundles record only "privacy-sensitive | <ms> | error" instead of the raw error string. Non-privacy-sensitive commands keep the error text since it is useful for diagnostics and does not carry mic/camera args. * Models.cs PermissionDiagnostics microphone detail now mentions stt.transcribe instead of "future voice features", so users hitting 0x800455A0 see microphone in their permissions checklist as relevant. Refactors for testability (no behavior change): * New Services/NodeInvokeActivityFormatter.cs owns GetPrivacyClass and BuildDetails. App.OnNodeInvokeCompleted delegates to it. * New Services/NodeCapabilityGating.cs owns the optional-capability predicates. NodeService.RegisterCapabilities calls into it instead of inlining "_settings?.NodeXxxEnabled" checks. Privacy-sensitive defaults stay off; everything else stays default-on. * Both helpers are linked into OpenClaw.Tray.Tests. Localization: * SettingsWindow.xaml gains x:Uid for every TTS and STT control. The literal Text/Header/PlaceholderText values are kept as dev-time fallbacks, matching the SettingsTokenTextBox and SettingsMcpDescription pattern already in the file. 
* en-us, fr-fr, nl-nl, zh-cn, and zh-tw .resw files gain matching entries for the 14 new TTS/STT keys. Brand names (ElevenLabs), command names (tts.speak, stt.transcribe, gateway.nodes.allowCommands, MSIX), BCP-47 tags, and the eleven_multilingual_v2 model identifier are kept verbatim across all locales. * SettingsMcpDescription.Text in all five locales now lists "microphone" and "speakers" alongside camera/screen/canvas so the local MCP-server description reflects the full Phase 1 + Phase 2 voice surface. Tests: * Two new privacy regression tests in CapabilityTests verify that an invalid language and a thrown handler exception never leak their text into the response error. * New NodeInvokeActivityFormatterTests pin the privacy-class table, the sanitized details for privacy-sensitive failures, and the full ActivityStreamService.BuildSupportBundle path. * New NodeCapabilityGatingTests pin that tts.speak and stt.transcribe default off (including for null settings) and that the two capabilities are independent consent surfaces. * New SettingsWindowLocalizationCoverageTests parses SettingsWindow.xaml and asserts every new TTS/STT x:Uid resolves to the expected .Header/.Text/.Content/.PlaceholderText keys in en-us. * ActivityStreamServiceTests and NodeInvokeActivityFormatterTests now share a non-parallel xUnit collection because ActivityStreamService is a static singleton; running both classes in parallel could otherwise cause flaky support-bundle assertions. * NodeCapabilityGatingTests cleans up its temp settings directories. Cleanup: * Drop "Phase 2" wording from SpeechToTextService.cs; the resw section comments referring to "Phase 1 TTS / Phase 2 STT" are likewise reworded to plain "TTS / STT settings". Phase numbering is a planning artifact and should not appear in the codebase. 
Validation: * .\build.ps1 * dotnet test tests/OpenClaw.Shared.Tests/OpenClaw.Shared.Tests.csproj --no-restore (1173 passed, 20 skipped) * dotnet test tests/OpenClaw.Tray.Tests/OpenClaw.Tray.Tests.csproj --no-restore (465 passed) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Remove fake/sample data from 6 UI pages Replace constructor-injected sample data with empty/loading states: - UsagePage: remove fabricated provider costs and daily data - SessionsPage: remove 3 fake AI conversation sessions - NodesPage: remove fake Desktop-PC/MacBook-Pro nodes - ChannelsPage: remove fake Telegram/WhatsApp channels - SkillsPage: remove fake skills and stale 'API not yet wired' warning - CronPage: remove fake cron jobs, stale warning, fix hardcoded defaults All pages now show proper empty states until real gateway data arrives. The Skills and Cron APIs were already fully wired; the warnings were simply outdated and misleading. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat: add voice/audio support with local Whisper STT Add full voice interaction capabilities to the Windows node: Core audio pipeline: - NAudio WASAPI microphone capture with MTA thread initialization - Energy-based voice activity detection with hysteresis - Whisper.net speech-to-text with multi-threaded inference - Pre-buffer to capture speech onset before VAD triggers - Auto-download of Whisper models from HuggingFace Voice overlay window: - Modern WinUI 3 floating window with Mica backdrop and custom title bar - Chat-style transcript bubbles with segment consolidation - Real-time audio level visualization - Start/Stop, Mute, and Settings controls STT node capability: - stt.listen and stt.status MCP commands for agent-initiated listening - Follows existing capability pattern (like TTS) Voice settings page: - Model size selection (tiny/base/small) with download management - Language selection (auto-detect + 9 languages) - Silence timeout slider - TTS voice picker 
with Windows neural voice enumeration - ElevenLabs provider configuration - Voice preview button Integration: - Tray menu Voice item - Ctrl+Alt+Shift+V global hotkey for push-to-talk - Deep links: openclaw://voice, openclaw://voice-stop - Gateway chat responses shown in voice overlay - TTS response playback with mic muting to prevent echo - Capabilities page STT toggle - Hub navigation Voice & Audio page Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Merge master into user/rbrid/stt-capability Master refactored 8 separate windows into a unified Hub app (#272), which removed src/OpenClaw.Tray.WinUI/Windows/SettingsWindow.xaml(.cs) and WebChatWindow.xaml.cs. Node-capability toggles now live in Pages/CapabilitiesPage as a code-built list (one icon + label per capability) instead of an XAML page with x:Uid-localized headers. Conflict resolution and re-integration: * Accepted master's deletion of SettingsWindow.xaml, SettingsWindow.xaml.cs, and WebChatWindow.xaml.cs. The TTS/STT controls and code-behind that this branch added to those files are obsolete with the new Hub UI. * Pages/CapabilitiesPage.xaml.cs gains a Speech-to-Text toggle alongside the existing Camera/Canvas/Screen/Location/TTS toggles, plus 'stt' in the active-capabilities summary string. This is the natural minimal alignment with the new pattern: one capability = one entry in the toggle list. * The TTS provider / ElevenLabs key/voice/model UI that this branch had added is dropped because master removed the corresponding settings surface entirely. The backend services (TextToSpeechService, ElevenLabsTextToSpeechClient) and the SettingsManager keys are intact; the values can be set via direct settings.json edit until a new UI surface lands. * Resolved 5 .resw conflicts (en-us, fr-fr, nl-nl, zh-cn, zh-tw) by taking master's content. All TTS/STT resource keys this branch had added are removed because the controls referencing them are gone. 
The earlier SettingsMcpDescription update (adding 'microphone' and 'speakers' to the capability list) is outside the conflict region and is preserved. * Deleted tests/OpenClaw.Tray.Tests/SettingsWindowLocalizationCoverageTests.cs. It pinned that 14 specific x:Uids on SettingsWindow.xaml had matching resw entries; the controls and the file no longer exist. Refactors from this branch survived the auto-merge cleanly: * App.xaml.cs OnNodeInvokeCompleted still delegates to NodeInvokeActivityFormatter for privacy-class scrubbing. * NodeService.RegisterCapabilities still calls NodeCapabilityGating predicates for every optional capability, including TTS and STT. Validation: * .\build.ps1 * dotnet test tests/OpenClaw.Shared.Tests/OpenClaw.Shared.Tests.csproj --no-restore (1183 passed, 20 skipped) * dotnet test tests/OpenClaw.Tray.Tests/OpenClaw.Tray.Tests.csproj --no-restore (418 passed; restore required first because master's Tray.Tests now links GatewayDiscoveryService.cs which needs Zeroconf) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * stt/tts: refill settings UI gaps after the unified Hub merge Master's Hub refactor (#272) removed the per-capability detail UI that previously lived on SettingsWindow. The capability backends are intact but have no in-app surface anymore: STT had no way to set the BCP-47 language tag, and TTS had no way to pick the provider, ElevenLabs API key, voice ID, or model without hand-editing settings.json. CapabilitiesPage.xaml gains two new detail cards beneath the capability toggle grid, mirroring the existing McpCard pattern (visible only when the capability is enabled): * SttCard: - Language TextBox bound to SttLanguage. - Commits on LostFocus or Enter. - Empty input restores the "en-US" default rather than persisting "". - Validates with SttCapability.NormalizeLanguageTag before saving so a typo in Settings cannot ship a broken default to the WinRT recognizer. 
- Status text never echoes the user-supplied tag back on the failure path; only the local UI affordance shows it (the activity stream / support bundle path was already privacy-scrubbed by an earlier commit on this branch). * TtsCard: - Provider ComboBox (Windows built-in / ElevenLabs). - ElevenLabs sub-panel becomes visible only when that provider is selected. Holds API key (PasswordBox), voice ID, and model. - API key handling: when a key is already saved we render a fixed mask sentinel ("••••••••") instead of any plaintext. Saving the form treats the sentinel as "keep current key" so the user can change voice ID / model without retyping the key, and rotation requires explicitly typing a new key. The on-disk DPAPI encryption done by SettingsManager is unchanged. - All ElevenLabs fields commit on LostFocus. SttCapability.NormalizeLanguageTag is promoted from private to public so the UI validates against exactly the rule the wire protocol applies. No behavior change for the capability itself. Validation: * .\build.ps1 * dotnet test tests/OpenClaw.Shared.Tests/OpenClaw.Shared.Tests.csproj --no-restore (1183 passed, 20 skipped) * dotnet test tests/OpenClaw.Tray.Tests/OpenClaw.Tray.Tests.csproj --no-restore (418 passed) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * audio: extend privacy class + tests for stt.listen and stt.status NodeInvokeActivityFormatter.GetPrivacyClass now classifies any stt.* command as privacy-sensitive, not just stt.transcribe. This catches stt.listen (microphone capture) and stt.status (engine internals) under the same scrubbing rules in the activity stream / support bundle, and keeps the rule simple ("anything in the stt namespace"). Tests added: * GetPrivacyClass: stt.listen, stt.status, stt.future-command rows. * PrivacySensitive_FailedInvoke_OmitsErrorTextFromDetails: theory rows for stt.listen and stt.status alongside the existing stt.transcribe / camera.* / screen.* coverage. 
* SttCapabilityTests: full coverage of the unified surface - Listen: timeoutMs clamps (below min, above max), default language "auto", invalid language rejected without echo, handler not wired, handler exception sanitized to "Listen failed", segments + engine metadata round-trip, cancellation. - Status: handler not wired, handler exception sanitized to "Status failed", per-engine readiness round-trip with download progress. - NormalizeLanguageTag: BCP-47 tags + "auto" sentinel (case-insensitive, normalized to lowercase) accepted; underscore / spaces / "automatic" rejected. * SettingsRoundTripTests: round-trips SttEngine, SttModelName, SttSilenceTimeout, VoiceTtsEnabled, VoiceAudioFeedback through SettingsData.ToJson / FromJson. Validation: * .\build.ps1 * dotnet test tests/OpenClaw.Shared.Tests --no-restore (1266 passed, 20 skipped) * dotnet test tests/OpenClaw.Tray.Tests --no-restore (425 passed) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * audio: STM, locale audit, and coverage tests for STT/TTS card * Added E:\OpenClawWindowsNode\Audio_STM.md — full STRIDE analysis of the merged audio surface (assets, trust boundaries, per-component threats, cross-references to code + tests, follow-up backlog). * Promoted every new STT/TTS card string in CapabilitiesPage.xaml to x:Uid + resw entries across all five locales (en-us, fr-fr, nl-nl, zh-cn, zh-tw): engine picker labels, language input + help, "More voice settings…" link, TTS provider picker, ElevenLabs sub-panel fields. Brand names (ElevenLabs), the "auto" BCP-47 sentinel, and the eleven_multilingual_v2 model identifier are kept verbatim and registered as InvariantOrDeferred in LocalizationValidationTests. * Added CapabilitiesPageLocalizationCoverageTests — pins every new STT/TTS x:Uid against expected resw key suffixes (.Text, .Header, .Content, .PlaceholderText) so a future hardcoded-string regression fails fast. 
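The NormalizeLanguageTag acceptance rules listed in the SttCapabilityTests item (BCP-47 tags and a case-insensitive "auto" sentinel accepted; underscores, spaces, and "automatic" rejected) can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the project's validator: the regular expression is a deliberately simplified BCP-47 shape (2-3 letter primary subtag followed by hyphen-separated alphanumeric subtags), and returning `None` for rejected input is an illustrative choice.

```python
import re

# Simplified BCP-47 shape (assumption): "en", "en-US", "zh-Hant-TW", ...
_TAG_PATTERN = re.compile(r"[A-Za-z]{2,3}(?:-[A-Za-z0-9]{2,8})*")


def normalize_language_tag(value):
    """Return a normalized tag, or None if the input is not acceptable."""
    if not isinstance(value, str):
        return None
    # The "auto" sentinel is matched case-insensitively and lowercased.
    if value.lower() == "auto":
        return "auto"
    # fullmatch rejects underscores, spaces, and anything with a
    # primary subtag longer than three letters (e.g. "automatic").
    if _TAG_PATTERN.fullmatch(value):
        return value
    return None


if __name__ == "__main__":
    for candidate in ["AUTO", "en-US", "en_US", "en US", "automatic"]:
        print(repr(candidate), "->", normalize_language_tag(candidate))
```

Note that the caller, not this function, decides what to report on failure. The commit messages are explicit that error responses must not echo the rejected tag.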
Validation: * .\build.ps1 * dotnet test tests/OpenClaw.Shared.Tests --no-restore (1266 passed, 20 skipped) * dotnet test tests/OpenClaw.Tray.Tests --no-restore (461 passed) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * audio: extract SttEngineSelector + tests for engine selection rules The engine-selection logic that NodeService.OnSttTranscribeAsync / OnSttListenAsync / OnSttStatusAsync inline-implemented is now a pure helper in Services/SttEngineSelector.cs and is consumed identically from all three handlers. No behavior change. Selector rules (pinned by SttEngineSelectorTests, 21 cases): * Whisper preference + Whisper ready → Whisper, no fallback. * Whisper preference + Whisper NOT ready + WinRT ready → WinRT, fallbackReason="whisper-model-not-ready". Happy degradation while the model downloads on first launch. * Whisper preference + neither ready → keep Whisper preference, fallbackReason="whisper-and-winrt-unavailable". Dispatch fails; the user's preference is reported unchanged so stt.status is honest about what they asked for. * WinRT preference + WinRT ready → WinRT, no fallback. * WinRT preference + WinRT ready + Whisper ALSO ready → still WinRT. Critical invariant: explicit user choice is never silently upgraded to Whisper when the model finishes downloading. * WinRT preference + WinRT NOT ready → keep WinRT, fallbackReason="winrt-unavailable". Same invariant: do not fall back to Whisper without explicit user opt-in. * null/empty/whitespace/unknown engine string → treat as Whisper preference. A typo in settings.json must not hard-fail STT. * Case- and whitespace-insensitive parsing of "whisper" / "winrt". Engine identifier constants are mirrored locally on SttEngineSelector.SharedConstants (free of cross-assembly deps); MirroredConstantsMatchSttCapability pins they stay in sync. 
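The selector rules above amount to a small decision table, sketched here in Python for readability. The engine identifiers and fallbackReason strings are taken from the commit message; the function and type names are hypothetical, and this is not the C# SttEngineSelector itself.

```python
from typing import NamedTuple, Optional

WHISPER = "whisper"
WINRT = "winrt"


class Selection(NamedTuple):
    engine: str
    fallback_reason: Optional[str]


def select_engine(preference: Optional[str],
                  whisper_ready: bool,
                  winrt_ready: bool) -> Selection:
    """Pick the STT engine for a request from the user's stated preference."""
    # Case- and whitespace-insensitive parsing; None/empty/unknown -> Whisper.
    parsed = (preference or "").strip().lower()

    if parsed == WINRT:
        # An explicit WinRT choice is never silently swapped for Whisper,
        # whether or not Whisper happens to be ready.
        if winrt_ready:
            return Selection(WINRT, None)
        return Selection(WINRT, "winrt-unavailable")

    # Whisper preference (including the default for unrecognized values).
    if whisper_ready:
        return Selection(WHISPER, None)
    if winrt_ready:
        # Degrade to WinRT while the Whisper model is still downloading.
        return Selection(WINRT, "whisper-model-not-ready")
    # Nothing usable: report the preference unchanged so status stays honest.
    return Selection(WHISPER, "whisper-and-winrt-unavailable")


if __name__ == "__main__":
    print(select_engine("Whisper", whisper_ready=False, winrt_ready=True))
    print(select_engine(" WINRT ", whisper_ready=True, winrt_ready=False))
```

The asymmetry is the point of the design: a Whisper preference may degrade to WinRT, but a WinRT preference never moves to Whisper without the user opting in.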
Validation: * .\build.ps1 * dotnet test tests/OpenClaw.Shared.Tests --no-restore (1266 passed, 20 skipped) * dotnet test tests/OpenClaw.Tray.Tests --no-restore (482 passed) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * audio: security review fixes from STM walkthrough Findings from the post-merge security review (full review recorded in the session at files/security-review.md and reflected in the STM follow-up backlog): CRITICAL (1 fixed, 1 deferred): * I-1 — UI now warns that selecting WinRT honors the Windows Online speech recognition toggle and may upload audio to Microsoft when that toggle is on. CapabilitiesPage SttEngineHint text updated to steer users to Whisper for fully local processing. * S-4 / T-1 — DEFERRED: SHA-256 verification of the Whisper model (download AND load time) requires embedding canonical hashes for tiny / base / small from HuggingFace. Tracked as a Critical pre-GA follow-up in Audio_STM.md section 6, not blocking this merge. (Existing TLS + system trust chain remains the only check.) HIGH (3 fixed): * S-3 / D-1 — NodeService.OnSttListenAsync now enforces a 1-second cooldown between successive stt.listen invocations. Imperceptible to a real user but throttles a hostile loop from a compromised gateway. Throws InvalidOperationException("Listen rate limit") which the SttCapability sanitization wraps as "Listen failed". * D-7 — AudioPipeline.CleanupCapture now wraps event-detach, capture.Dispose, and CTS dispose in independent try/catch blocks so a failure in one step doesn't leak the NAudio WasapiCapture COM object (which would hold the mic LED lit until process exit). Also added CleanupCapture() calls in StartAsync's two catch branches so the mic is released after a failed start. * I-2 — VoiceOverlayWindow audit confirmed no transcript text reaches ActivityStreamService. Status: PIN, no code change needed. 
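The S-3 / D-1 fix above combines two ideas: a 1-second cooldown between stt.listen calls, and a wrapper that hides the internal "Listen rate limit" message behind a fixed "Listen failed" error. A minimal Python sketch of that pattern follows. The class and function names are hypothetical, thread safety is omitted, and the choice that a rejected call does not restart the cooldown window is an assumption the commit message does not settle.

```python
import time


class ListenRateLimiter:
    """Reject calls that arrive sooner than cooldown_seconds after the last accepted one."""

    def __init__(self, cooldown_seconds=1.0, clock=time.monotonic):
        self._cooldown = cooldown_seconds
        self._clock = clock          # injectable so tests need no real sleeping
        self._last_accepted = None

    def check(self):
        now = self._clock()
        if self._last_accepted is not None and now - self._last_accepted < self._cooldown:
            raise RuntimeError("Listen rate limit")
        self._last_accepted = now    # only accepted calls move the window (assumption)


def handle_listen(limiter, do_listen):
    """Run a listen request; on any failure return a fixed, sanitized error."""
    try:
        limiter.check()
        return {"ok": True, "result": do_listen()}
    except Exception:
        # The detailed reason would go to a local log, never into the response.
        return {"ok": False, "error": "Listen failed"}


if __name__ == "__main__":
    limiter = ListenRateLimiter()
    print(handle_listen(limiter, lambda: "hello"))
    print(handle_listen(limiter, lambda: "hello again"))  # rejected: within 1 s
```

A monotonic clock is used rather than wall-clock time so that system clock adjustments cannot shorten or lengthen the cooldown.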
MEDIUM (1 fixed): * NEW-1 — TtsCapability previously returned \$"Speak failed: {ex.Message}", which can leak ElevenLabs key prefixes from 401 responses or device names from OS audio errors into the support bundle. Now returns a fixed "Speak failed" matching the SttCapability pattern. NodeInvokeActivityFormatter.GetPrivacyClass also now classifies tts.* as privacy-sensitive (was metadata) so failed- invoke details are uniformly scrubbed. PIN (no change needed, confirmed by review): * T-3 — SttModelName path-traversal: WhisperModelManager validates against the {tiny, base, small} allow-list before any Path.Combine. * I-4 — ElevenLabs key DPAPI-encrypted at rest. * I-5 — ElevenLabs key UI shows masked sentinel; plaintext never re-rendered after save. * I-8 / PI-5 — stt.status response carries no PII (only readiness strings, engine name, capability flags, numeric download progress). * PI-3 — Validation/handler errors don't echo caller input or exception text across stt.* and now tts.* as well. Test additions: * Speak_HandlerException_DoesNotLeakExceptionMessageIntoError — pins the new TTS privacy invariant with an "ElevenLabs 401: invalid key sk-secret-prefix" payload. * Speak_ReturnsError_WhenHandlerThrows updated to assert the exact sanitized "Speak failed" message instead of leaking ex.Message. * GetPrivacyClass theory rows now cover tts.speak and tts.future-command as privacy-sensitive (was metadata). Validation: * .\build.ps1 * dotnet test tests/OpenClaw.Shared.Tests --no-restore (1271 passed, 20 skipped) * dotnet test tests/OpenClaw.Tray.Tests --no-restore (483 passed) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * audio: drop WinRT SpeechRecognizer + SAPI fallback; Whisper-only Both legacy stacks are removed; SttCapability now dispatches every stt.* call to a single Whisper engine via VoiceService. 
When the Whisper model is not yet downloaded, handlers return a clear error pointing the caller at the Voice Settings page download button — there is no automatic fallback engine. Rationale (from the discussion with Ranjesh): * WinRT SpeechRecognizer is an old API that fails to activate in unpackaged tray builds (the long-standing 0x800455A0 issue) and, when the OS Online speech recognition toggle is on, may upload audio to Microsoft cloud — at odds with our local-first posture. * System.Speech (desktop SAPI) is even older and has no value over Whisper for any modern scenario. * Carrying two engines complicated the merge with no real upside now that Whisper.net runs reliably on every supported PC. Removed: * src/OpenClaw.Tray.WinUI/Services/SpeechToText/SpeechToTextService.cs (the WinRT + SAPI engine). * src/OpenClaw.Tray.WinUI/Services/SttEngineSelector.cs (no engines to select between). * tests/OpenClaw.Tray.Tests/SttEngineSelectorTests.cs. * System.Speech NuGet package reference (was duplicated; both copies removed). * SttEngine setting (SettingsData + SettingsManager round-trip). * SttCapability.EngineWinRt and DefaultEngine constants. * SttTranscribeResult.EngineFallbackReason and SttListenResult.EngineFallbackReason — no fallback to report. * CapabilitiesPage Engine ComboBox + the engine-related UI strings in all five locales. * The "Windows built-in may upload audio" caveat (no longer relevant). Simplified: * SttStatusResult: replaced PreferredEngine/EffectiveEngine plus per-engine readiness blocks with a single Engine + Readiness pair (engine is always "whisper" today; the field stays so a future engine doesn't break the wire). * NodeService.OnSttTranscribeAsync / OnSttListenAsync / OnSttStatusAsync: dropped selector logic + WinRT marshalling. When VoiceService.IsWhisperReady is false, throw clear "Whisper model not downloaded" — wrapped to "Transcribe failed" / "Listen failed" by SttCapability's privacy sanitizer. 
* CapabilitiesPage STT card hint surfaces model download state ("Whisper model is ready" / "downloading" / "not downloaded — open More voice settings…"). * McpToolBridge curated descriptions: drop engineFallbackReason field and the per-engine blocks from stt.status. Tests: * CapabilityTests.Status_ReturnsEngineReadiness rewritten for the flat shape; now also asserts no language/path strings appear in the JSON (tightens PI-5 enforcement). * SettingsRoundTripTests: dropped SttEngine field assertions. * CapabilitiesPageLocalizationCoverageTests: dropped engine ComboBox Uids from the contract list. * LocalizationValidationTests: removed the engine ComboBox keys from the InvariantOrDeferred allow-list (no longer needed; the invariants list now only protects "auto", "ElevenLabs", and "eleven_multilingual_v2"). Validation: * .\build.ps1 * dotnet test tests/OpenClaw.Shared.Tests --no-restore (1271 passed, 20 skipped) * dotnet test tests/OpenClaw.Tray.Tests --no-restore (456 passed) Audio_STM.md and Audio_FollowUps.md updated to reflect the engine removal (smaller test-seam refactor surface; I-1 "WinRT online speech caveat" follow-up is retired). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * audio: add Piper TTS provider via Sherpa-ONNX Adds a third TTS provider, "piper", that runs Piper voices fully locally on this PC through the official Sherpa-ONNX .NET binding (org.k2fsa.sherpa.onnx 1.13.0). No cloud egress; the voice model downloads once to %LOCALAPPDATA%\OpenClawTray\models\piper\<voice-id>\ and is reused across calls. Backend (OpenClaw.Shared/Audio/PiperVoiceManager.cs): * Curated catalog of 6 starter voices (en-US ×2, en-GB, fr-FR, de-DE, zh-CN) sourced from the sherpa-onnx tts-models GitHub release tarballs — these are repackaged Piper voices that include the language-specific espeak-ng-data, so the user only downloads one archive per voice instead of model + tokens + espeak separately. 
* Download with progress callback; extraction via OS-bundled tar.exe (Win10 1803+); atomic per-voice directory layout; cleanup of partial files on failure or cancellation. * IsVoiceDownloaded / GetVoiceSize / DeleteVoice for the (forthcoming) Voice Settings page UI. * TODO marker for SHA-256 verification (Audio_FollowUps.md §2). Tray service (OpenClawTray/Services/TextToSpeech/PiperTextToSpeechClient.cs): * Wraps SherpaOnnx.OfflineTts; loads one voice at a time and reuses the loaded model across calls (load is the expensive ~200-500 ms step). Single-flight gate prevents concurrent generates from racing the same TTS instance. * Inference runs on a background Task so cancellation can race the synthesis. * Converts Sherpa's 32-bit float PCM samples to a standard 16-bit PCM mono WAV blob the WinUI MediaPlayer can play with no further transcoding. Wiring (OpenClaw.Tray.WinUI/Services/TextToSpeech/TextToSpeechService.cs): * Third branch in SpeakAsync's provider dispatch. SpeakWithPiperAsync resolves the voice from args.VoiceId or settings.TtsPiperVoiceId, fails with a "voice not downloaded" error pointing the user at Voice Settings if the file isn't present, and otherwise reuses the cached PiperTextToSpeechClient (rebuilds it only when the voice id changes). * TextToSpeechService.PiperVoices exposed so the Voice Settings page can drive download / delete from the same instance. UI (OpenClaw.Tray.WinUI/Pages/CapabilitiesPage.xaml + .xaml.cs): * Added Piper as the first ComboBoxItem on the TTS provider picker ("Piper (local ML, recommended)"). Resw entries across all 5 locales (en-us, fr-fr, nl-nl, zh-cn, zh-tw). * UpdateTtsCard reads TtsProvider with a 3-way switch (piper / windows / elevenlabs); unknown / null defaults to Piper. Capability + settings: * TtsCapability.PiperProvider = "piper" wire constant. * SettingsData.TtsPiperVoiceId / SettingsManager.TtsPiperVoiceId, default "en_US-amy-low" (~50 MB, smallest English voice). Round-trip preserved through Save/Load. 
Tests: * SettingsRoundTripTests asserts TtsPiperVoiceId persists. * CapabilitiesPageLocalizationCoverageTests pins the new CapabilitiesPage_TtsProviderPiper x:Uid against en-us. * PiperVoiceManager + PiperTextToSpeechClient have no unit tests yet — same blocker as the rest of the audio engine layer (Audio_FollowUps.md §1: needs interface extraction first). Audio_FollowUps.md §3 updated with a "Status update — basic Piper plumbing landed" subsection enumerating exactly what shipped and what remains (Voice download UI, manager tests, SHA-256 verification, spike validation). Validation: * .\build.ps1 * dotnet test tests/OpenClaw.Shared.Tests --no-restore (1271 passed, 20 skipped) * dotnet test tests/OpenClaw.Tray.Tests --no-restore (462 passed) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * audio: voice download UI, Piper-as-default, first-listen polish User-visible * New Piper voice download panel on the Voice & Audio page (catalog of 6 voices, download with progress, delete, preview). * Piper is now the default TTS provider for fresh installs. * Read responses aloud toggle now drives every chat reply, not only voice-overlay sessions. * Voice Overlay's Settings button opens the Voice & Audio page (was a no-op stub). * First Whisper auto-download surfaces a status line in the Voice Overlay so the user knows the silent ~140 MB fetch is why nothing is being transcribed yet. * Speech Model card refreshes its 'Model ready / Download required' status whenever the page becomes visible, even if NodeService hasn't wired its VoiceService yet. * Stale 'Windows built-in' fallback text removed from the Speech-to-Text card description (5 locales). Whisper has been the only engine since ff11467. * Width bumps so labels no longer truncate (the Speech Model size combo, the Provider combo). * Dropped 'STT' jargon from the Language ComboBox header. * Fixed misleading '~50-80 MB each' Piper size copy (real range is ~25-150 MB depending on quality). 
Plumbing * New SettingsRequested event on VoiceOverlayWindow; App hooks it to ShowHub('voice'). * TtsCapability.ResolveProvider falls back to Piper. * App.OnNotificationReceived no longer gates TTS on VoiceMode != Inactive. * VoiceSettingsPage.UpdateModelStatus queries the file system via WhisperModelManager directly so it works before NodeService finishes lazy-init of VoiceService. * VoiceService.InitializeAsync fires DiagnosticMessage events around silent VAD/Whisper auto-downloads. Tests: Shared 1271 / Tray 462 (default-provider asserts updated). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * audio: address rubber-duck review (Highs #2-#5, Mediums #6-#8, Low #9) High #2: Reuse a singleton TextToSpeechService for chat replies * App.SpeakResponseAsync now goes through NodeService.TextToSpeech (a new public accessor on the existing _textToSpeechService field) instead of constructing a fresh service per call. Cached Piper client is reused across replies; the service-internal _playbackGate + _activePlayer now actually serialize back-to-back replies, and Interrupt=true takes effect. High #3: Per-provider VoiceId routing * New TtsWindowsVoiceId setting (round-tripped via SettingsManager + SettingsData; SettingsRoundTripTests assert it). * SpeakResponseAsync no longer passes _settings.TtsElevenLabsVoiceId as a generic VoiceId; the per-provider Speak* paths each look up their own setting (TtsPiperVoiceId / TtsWindowsVoiceId / TtsElevenLabsVoiceId). * SpeakWithWindowsAsync falls back to TtsWindowsVoiceId when args.VoiceId is blank. * VoiceSettingsPage.OnWindowsVoiceChanged writes TtsWindowsVoiceId (was overwriting TtsElevenLabsVoiceId, a real cross-provider bug). High #4: stt.listen returns a complete utterance, not the first segment * New AudioPipeline.UtteranceTranscribed event fires once per silence- bounded utterance with all Whisper segments aggregated and an immutable Segments snapshot. * VoiceService bubbles it as UtteranceCompleted. 
* ListenOnceAsync subscribes to UtteranceCompleted (drops the per-fragment accumulator) so multi-segment utterances no longer return truncated text. High #5: Voice Overlay submits one chat message per utterance * OnTranscriptionReceived keeps the per-fragment streaming bubble update; chat submission moved to a new OnUtteranceCompleted handler so the gateway sees one message per spoken utterance. Medium #6: Per-asset cancellation tokens in VoiceSettingsPage * Split _downloadCts into _whisperDownloadCts and _piperDownloadCts so starting a Piper download no longer cancels an in-flight Whisper download (and vice versa). Medium #7: Preflight tar.exe before Piper download * PiperVoiceManager.EnsureExtractorAvailable runs a fast `tar --version` check before any network I/O. Downlevel Windows users now get a clear actionable error instead of a wasted ~50-150 MB download that would later fail at extraction. Medium #8: Refresh stale MCP tool descriptions * stt.transcribe / stt.listen / stt.status now describe the single Whisper engine surface (no preferredEngine / effectiveEngine / engineFallbackReason); stt.listen description explicitly notes the result is the full silence-bounded utterance. * tts.speak description includes `piper` in the provider list and notes the fresh-install default. * Updated McpToolBridgeTests assertion for the new shape. Low #9: Per-asset single-flight in download managers * Both WhisperModelManager and PiperVoiceManager wrap their Download*Async in a static ConcurrentDictionary<string,Task> keyed on the canonical asset ID. Concurrent calls for the same asset await the same in-flight Task instead of racing on the same .tmp file. Failed downloads remove themselves from the table so a fresh retry isn't blocked. Tests: Shared 1271 / Tray 462. Build green. 
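The per-asset single-flight behavior described under Low #9 can be sketched roughly as follows. This is a Python illustration, not the project's C# code: `download_once`, `_in_flight`, and the `fetch` callback are hypothetical names, and the entry is removed on any completion (success or failure) for simplicity. The `asyncio.shield` call is an added assumption so that one caller cancelling its wait does not cancel the shared download.

```python
import asyncio

# Hypothetical registry: canonical asset ID -> in-flight download task.
_in_flight: dict[str, asyncio.Task] = {}


async def download_once(asset_id: str, fetch) -> str:
    """Share one in-flight download per asset ID among concurrent callers.

    `fetch` is a hypothetical coroutine function that performs the real
    download and returns the installed path.
    """
    task = _in_flight.get(asset_id)
    if task is None:
        task = asyncio.ensure_future(fetch(asset_id))
        _in_flight[asset_id] = task
        # Drop the entry once the task settles so a failed download
        # does not block a fresh retry.
        task.add_done_callback(lambda _t: _in_flight.pop(asset_id, None))
    # shield: cancelling this caller's wait leaves the shared task running.
    return await asyncio.shield(task)
```

Two concurrent callers asking for the same asset ID end up awaiting a single call to `fetch`; a later call after completion starts a new download.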
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * audio: SHA-256 verification of Whisper models and Piper voices Critical (rubber-duck #1) — fail-closed integrity check before install. * New `Sha256` field on `WhisperModelInfo` and `PiperVoiceInfo`. * All 9 catalog entries (3 Whisper models + 6 Piper voices) carry a pinned lowercase-hex SHA-256, captured against the live HuggingFace and sherpa-onnx GitHub releases on 2026-05-05. * Download core methods now: 1. Refuse outright if the catalog entry has no pinned hash (`InvalidOperationException`). 2. Compute SHA-256 of the temp file BEFORE the atomic rename (Whisper) or BEFORE the tar extraction (Piper). 3. On mismatch, throw `System.Security.SecurityException`, delete the temp file, and let the catch block tear down any half-installed directory. Sanitized message — does NOT echo the actual hash (no confirmation oracle). * New `AssetHashPinningTests` enforces that every catalog entry has a 64-hex-char SHA-256 and an https URL — future additions that forget the hash now break the build. Audio_FollowUps.md §2 updated: * Status block at the top documents what landed today. * Pre-public-release TODO list trimmed to: independent re-verification of the pinned hashes, on-load verification (not just on download), and a future signed-manifest format so updates don't require a tray rebuild. The original detailed design notes are preserved as the spec for that next iteration. Tests: Shared 1275 / Tray 462. Build green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * audio: Download Model button works without VoiceService OnDownloadClick previously routed through VoiceService.DownloadModelAsync, which silently no-op'd whenever _voiceService was null — and _voiceService is only constructed inside NodeService.RegisterCapabilities (which runs on Connect / StartLocalOnly, and only when NodeSttEnabled is true). 
A user who toggled STT on without reconnecting, or who hadn't enabled MCP-only mode, would tap Download and see nothing happen. Construct a WhisperModelManager directly from SettingsManager.SettingsDirectoryPath and download via that. Same on-disk result as the VoiceService auto-download path, but available regardless of NodeService lifecycle state. Same SHA-256 verification applies (the manager owns it). Tests: Tray 462 (no change in surface). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * ux: Companion rename, expanded NavView memory, right-click opens Hub Three coordinated tweaks based on the morning UX review. 1. Right-click on the tray icon now opens BOTH the popup quick-menu AND the companion app window. ShowHub gained an `activate` flag; for this code path we call ShowHub(activate:false) so the Hub surfaces via AppWindow.Show(activateWindow:false) and the popup (which is light-dismiss) stays the foreground window. Without this the Hub's Activate() would steal focus and dismiss the popup. 2. NavigationView pane mode is now expanded by default and remembered across sessions. PaneDisplayMode flipped from Auto to Left, and a new HubNavPaneOpen setting (default true) is round-tripped via SettingsManager / SettingsData. PaneOpening / PaneClosing handlers on HubWindow persist the user's last toggle. SettingsRoundTripTests covers the new field. 3. Renamed the mascot from 'Molty' to 'Companion' across the surface: User-facing strings: * VoiceOverlayWindow Title and header text → `Companion Voice`. * VoiceSettingsPage section header → `🔊 Companion Voice`. * Both Preview-button sample texts (Windows + Piper) now say `Hello! This is your Companion speaking.`. Code identifiers (HomePage): * MoltyRing → CompanionRing * MoltyProgressRing → CompanionProgressRing * UpdateMoltyRing → UpdateCompanionRing * Comment `<!-- Molty mascot -->` → `<!-- Companion mascot -->` `grep -i molty src/` returns zero hits. Tests: Shared 1275 / Tray 462. Build green. 
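For reference, the fail-closed hash check from the SHA-256 verification commit above can be sketched along these lines. It is a minimal Python illustration under stated assumptions, not the managers' actual code: `install_verified` and `HashMismatchError` are hypothetical names, and the real implementation throws `InvalidOperationException` and `System.Security.SecurityException` instead.

```python
import hashlib
import os


class HashMismatchError(Exception):
    """Hypothetical stand-in for the SecurityException thrown on mismatch."""


def install_verified(tmp_path: str, final_path: str, pinned_sha256: str) -> None:
    """Verify the temp file against its pinned hash, then rename atomically."""
    if not pinned_sha256:
        # Fail closed: refuse any catalog entry without a pinned hash.
        raise ValueError("catalog entry has no pinned SHA-256")

    digest = hashlib.sha256()
    with open(tmp_path, "rb") as f:
        # Hash in 1 MiB chunks so large models are not read into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)

    if digest.hexdigest() != pinned_sha256.lower():
        os.remove(tmp_path)
        # Sanitized message: the computed hash is deliberately not echoed.
        raise HashMismatchError("integrity check failed")

    # The rename happens only after verification succeeds.
    os.replace(tmp_path, final_path)
```

The ordering is the point of the sketch: the hash is computed on the temp file, and nothing is moved into the models directory unless it matches.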
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* ux: rubber-duck #2 - restore minimized Hub on right-click; pin pane default

  Two findings from the second rubber-duck pass.

  Medium: ShowHub(activate:false) was a no-op when the Hub was previously minimized. AppWindow.Show(activateWindow:false) does not restore minimized windows. Detect OverlappedPresenter.State == Minimized first and Restore(activateWindow:false) so the window actually surfaces behind the popup, then call Show.

  Low: regression test for HubNavPaneOpen migration. Settings files written before this field existed must deserialize to true (NavView expanded). Added an explicit FromJson("{}") assertion plus pinned the field's default in MissingFields_UseDefaults and BackwardCompatibility_OldSettings* so a future refactor can't silently flip new installs to a collapsed pane.

  Tests: Tray 463 (one new). Build green.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: deep-link parser strips trailing slash before query (#-)

  The Windows shell canonicalizes openclaw://send?args=... to openclaw://send/?args=... before handing it to us. The previous implementation called TrimEnd('/') on the WHOLE remainder before splitting off the query, so the trailing slash before the '?' was never trimmed and Path came out as 'send/' instead of 'send'.

  Trim the slash from the path SEGMENT after splitting off the query. Three new theory cases pin the regression for send / agent / activity deep links, categories that all carry query parameters in the launcher canonicalized form. Existing TrailingSlash test (no query) still passes with the new placement.

  Credit to the parallel Copilot session for catching this.

  Tests: Shared 1275 / Tray 466 (3 new). Build green.
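The ordering fix in the deep-link parser can be illustrated with a small sketch. This is Python rather than the project's C#, and `parse_deep_link` is a hypothetical name; the only point is that the query is split off first and the trailing slash is trimmed from the path segment alone.

```python
def parse_deep_link(uri: str, scheme: str = "openclaw://"):
    """Return (path, query) for a deep link such as openclaw://send/?args=hi."""
    if not uri.startswith(scheme):
        raise ValueError("unsupported scheme")
    remainder = uri[len(scheme):]

    # Split off the query first...
    path, _, query = remainder.partition("?")

    # ...then trim the slash from the path segment only. Trimming the whole
    # remainder would leave "send/" whenever a query follows the slash.
    return path.rstrip("/"), query
```

With this order, the shell-canonicalized form `openclaw://send/?args=hi` and the plain form `openclaw://send?args=hi` both yield the path `send`.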
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * audio: don't drop final utterance on stop or timeout; bound transcription queue; normalize BCP-47 Three coordinated STT pipeline fixes from the latest review. #1 (High) — Buffered speech was discarded on Stop/Timeout * AudioPipeline.StopAsync used to call _cts.Cancel() BEFORE flushing, and the flush passed the canceled token straight into Whisper.net (which honored cancel and dropped the final utterance). Reordered to: stop capture -> flush with a fresh CancellationToken.None -> cancel _cts -> cleanup. Adds an overrideToken parameter on TranscribeSamplesAsync so the flush can opt out of the pipeline cancel. * VoiceService.ListenOnceAsync used to throw TimeoutException as soon as the linkedCts fired, even when speech was actively buffered. It now waits on Task.WhenAny(utteranceTcs, timeoutSentinel), and on timeout it gives pipeline.StopAsync up to 2 s to flush — only then reports timeout. stt.transcribe inherits this fix. #3 (Medium) — Whisper.net language mismatch * SpeechToTextService.NormalizeForWhisper trims BCP-47 input down to the 2-letter ISO 639-1 primary subtag that Whisper.net's WithLanguage call expects. `en-US` -> `en`, `zh-Hans-CN` -> `zh`, garbage -> `auto`. Capability validator + MCP docs continue to advertise the wider BCP-47 shape (no breaking change for callers); this fixes the gap to Whisper. * Result.Language now echoes the normalized form so the caller sees what Whisper actually used. #4 (Medium) — Unbounded transcription queue * Each VAD-bounded segment fired `_ = Task.Run(TranscribeSamplesAsync)` with no in-flight cap. SpeechToTextService gates Whisper work but callbacks accumulate behind the gate, each holding a sample buffer. Now bounded with Interlocked counter + MaxConcurrentTranscriptions cap (2). 
Excess segments are dropped with a clear DiagnosticMessage rather than silently queued — better UX than getting stale utterances arriving minutes after the user stopped speaking. Tests: Shared 1291 / Tray 466 (16 new normalizer tests). Build green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * audio: SHA-256 verification of Silero VAD model Closes the inconsistency the buddy review flagged: Whisper and Piper download paths are hash-pinned and fail closed on mismatch, but the Silero VAD download path (VoiceService.DownloadVadModelAsync) was just HTTPS + system trust chain — no integrity verification before File.Move into the models directory. * New SileroVadModelManifest holds the URL, SHA-256, and approximate size as public constants in OpenClaw.Shared.Audio. Hash captured from the upstream raw URL on 2026-05-05; same pre-public-release re-verify TODO as the other manifests (Audio_FollowUps.md §2). * DownloadVadModelAsync now hashes the temp file with SHA-256 BEFORE the atomic rename. On mismatch it throws SecurityException and the catch block tears down the .tmp file. Sanitized error — does not echo the actual hash (no confirmation oracle). * AssetHashPinningTests gains a SileroVadModel_HasPinnedSha256 case so a future renaming/forgetting of the constant trips the build. Tests: Shared 1292 (1 new). Build green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: bring skill.md back in sync with capability registry The SkillMdDriftTests pinning test was failing — 14 commands present in McpToolBridge.KnownCommands had no matching ### heading in skill.md: * The 4 new entries this branch added: stt.transcribe, stt.listen, stt.status, tts.speak. * 10 pre-existing app.* entries (app.navigate, app.status, app.sessions, app.agents, app.nodes, app.config.get, app.settings.get, app.settings.set, app.menu, app.search) that already drifted before the audio work. 
Fixing them all in one pass so the test goes green and stays green. Each new section follows the existing format: H3 heading, brief description, JSON-shaped param block, return shape. Privacy + provider notes added for stt.* and tts.* so agent readers understand: stt.* is local Whisper only and requires NodeSttEnabled, tts.* defaults to Piper (local neural). Tests: SkillMdDriftTests now passes. Shared 1292 / Tray 466. Build green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * ux: throttle Whisper/Piper download progress UI; wire Re-download button Two manual-test follow-ups on the Voice Settings page. * Throttle progress UI updates to >=150 ms intervals on both the Whisper and Piper download paths. The streaming downloads emit a progress callback every ~80 KB chunk, so a 466 MB model produces ~5,800 dispatcher hops (Progress<T> + DispatcherQueue.TryEnqueue doubled the load). The dispatcher queue saturated and the app appeared frozen mid-download. Coalescing limits the rate to a few updates per second, with a forced final 100% report so the user never sees a stuck "99%" right before "Model ready". Also dropped the redundant inner DispatcherQueue.TryEnqueue (Progress<T> already marshals to the captured UI SyncContext). * Re-download button now actually re-downloads. WhisperModelManager short-circuits DownloadModelAsync when the file is already present, so OnDownloadClick now calls the existing DeleteModel(modelName) first when the file is on disk. Net effect: delete -> fresh fetch -> SHA-256 re-verify -> atomic rename. Same on-disk result. Tests: Shared 1292 / Tray 466 (no test surface change). Build green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * i18n: localize VoiceSettingsPage and VoiceOverlayWindow surfaces Closes the buddy review's last finding. 
The new voice UI was English-only hard-coded in both XAML and code-behind, while the rest of the tray (and the freshly redone CapabilitiesPage STT/TTS card) reads from .resw via x:Uid + LocalizationHelper.GetString. Coverage: * VoiceSettingsPage.xaml — every user-facing TextBlock / Header / ComboBoxItem / Button content / placeholder gets x:Uid (page title, card headers, STT toggle, model + language combos, voice chat controls, all 3 TTS provider items, Piper download/delete/preview, ElevenLabs slot, privacy note). * VoiceOverlayWindow.xaml — header text, status badge, empty state, status text, start/stop label, mute + settings tooltips. * VoiceSettingsPage.xaml.cs and VoiceOverlayWindow.xaml.cs — runtime status messages (download progress, model-ready, preview failures, pipeline state transitions, mute/listen state) now read from LocalizationHelper.GetString. Format strings use Lf(...) so {0}/{1} placeholders are honored under CurrentCulture. Translations pinned for en-us / fr-fr / nl-nl / zh-cn / zh-tw — ~95 new keys per locale (475 total resw entries). Translations are best-effort; native speakers should review pre-public-release. LocalizationValidationTests: * AllLocales_HaveExactlySameKeysAsEnUs ✅ * Resources_AreTranslatedAllOrNoneAcrossNonEnglishLocales ✅ (added VoiceSettingsPage_StatusError + ElevenLabs sample-ID placeholder keys to the InvariantOrDeferred list — they're intentionally identical across locales) Build green. Shared 1292 / Tray 466. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * audio: stt.transcribe is now a true fixed-duration capture Closes the buddy review's stt.transcribe finding. The handler used to adapt SttTranscribeArgs into SttListenArgs and call ListenOnceAsync, which inherited VAD-based silence shutdown — so a 5 000 ms request would return after 1 s if the user stopped speaking. The advertised contract (skill.md, McpToolBridge) promises bounded fixed-duration capture, not silence-bounded. 
Implementation: * AudioPipeline.CaptureFixedDurationAsync — new top-level method that starts WASAPI capture, accumulates every resampled+gain-applied 16 kHz mono sample into _fixedCaptureBuffer for exactly durationMs (or until cancellation), then returns the buffer. OnDataAvailable branches on a new _fixedCaptureMode flag and bypasses the VAD path entirely in this mode. * VoiceService.TranscribeFixedDurationAsync — wraps CaptureFixedDurationAsync + SpeechToTextService.TranscribeAsync and returns SttTranscribeResult directly. Empty buffer (cancelled immediately or no audio) returns transcribed=false rather than throwing. * NodeService.OnSttTranscribeAsync now calls TranscribeFixedDurationAsync instead of bouncing through ListenOnceAsync. stt.listen behavior is unchanged. Tests: Shared 1292 / Tray 466. Build green. (No new tests — exercising this path requires a real WASAPI device. The capture/transcribe boundary is tightly coupled to NAudio + Whisper.net, which were the test seams already deferred to Audio_FollowUps.md §1.) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * i18n: localize VoiceOverlayWindow root window title Adds x:Uid="VoiceOverlayWindow" on the WindowEx root, plus the VoiceOverlayWindow_winexWindowEx_2.Title key in all 5 locale resw files. Listed in InvariantOrDeferredResourceKeys so the parity test allows the title to read identical "Companion Voice" in every locale — matches the existing convention for ChatWindow / HubWindow / CanvasWindow / TrayMenuWindow. The visible header text and runtime status messages were already localized; this just closes the gap on the actual OS-level window title (alt-tab, taskbar). Build green. Shared 1292 / Tray 466. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * audio: gate stt.* on file presence, not in-memory load state The MCP / wire-side stt.transcribe and stt.listen entry points short-circuited with "Whisper model not downloaded" whenever _voiceService.IsWhisperReady was false. That property reads SpeechToTextService.IsModelLoaded — which is true only after the model has been LOADED INTO MEMORY by EnsureInitializedAsync. On a freshly-launched tray (or any state where the user hasn't opened the Voice Overlay yet), the .bin file is on disk but the model isn't loaded. The pre-flight check rejected the call before the inner TranscribeFixedDurationAsync / ListenOnceAsync could run EnsureInitializedAsync to load it lazily. Net result: every first MCP STT call after launch failed with a misleading "model not downloaded" error, even though the file was right there. Switch the pre-flight check to IsModelDownloaded (file on disk). The lazy load happens inside the inner call as it always did. Verified end-to-end via the local MCP HTTP server: tools/call stt.transcribe with maxDurationMs:5000 returned a real transcript ("Hello, how is everybody doing?") on first invocation after a fresh tray launch. Tests: Shared 1292 / Tray 466. Build green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * ux: voice UI testing round — localization, shutdown, and Capabilities cleanup Three buckets of fixes from this afternoon's manual testing pass. i18n: dot-suffix lookup bug in code-behind * LocalizationHelper.GetString(X.Text) returns the raw key when the resource name has a dot — XAML x:Uid resolution interprets the trailing .Text as a property suffix, but direct programmatic lookup doesn't, so the resource map can't find it. Six call sites were displaying literal keys like "VoiceOverlayWindow_StatusBadge.Text" in the running UI. 
  * Added six dot-free code-only keys (BadgeReady, StatusReadyMessage, ButtonStartListening, ButtonDownloadModel, PiperButtonDownloadVoice, PreviewVoiceButtonContent) translated across all 5 locales, and swapped the call sites in VoiceOverlayWindow.xaml.cs and VoiceSettingsPage.xaml.cs to use them.

  audio: Voice Overlay "Failed to encode audio features" on Stop
  * Mid-encode interruptions from Whisper.net don't surface as a clean OperationCanceledException; they bubble up as misleading errors like "Failed to encode audio features." Pressing Stop while a transcription Task.Run was in-flight produced exactly that toast.
  * AudioPipeline.StopAsync now drains in-flight transcriptions for up to 3 s before cancelling _cts, so the user's last utterance has a chance to actually complete.
  * TranscribeSamplesAsync's catch block suppresses errors when _isStopping or the cancel token is set; those are expected shutdown-induced interruptions, not user-visible failures. Also sanitized the diagnostic toast (no raw ex.Message).

  Capabilities page rework
  * Removed the redundant Language TextBox + label + help + status block. The Voice & Audio page already owns the language picker via a curated ComboBox (the textbox accepted any string and silently failed validation on garbage like "foobar", which was a paper cut).
  * "More voice settings…" hyperlink stays as the deep-link.
  * Speech-to-Text card hint now reads file presence directly via a fresh WhisperModelManager rooted at SettingsManager.SettingsDirectoryPath (instead of hub.VoiceServiceInstance?.IsWhisperReady, which is null on a freshly-launched tray and reads "loaded into memory" rather than "file on disk"). Same trick used by VoiceSettingsPage's UpdateModelStatus.
  * Updated the Capabilities help text in all 5 locales to say "Two-letter ISO 639-1 code (e.g. en, fr, ja)" instead of "BCP-47 tag (e.g. en-US, fr-FR, ja-JP)", which matches what NormalizeForWhisper actually accepts (region is stripped).
(Help text is now only consumed by the language picker on Voice & Audio, but the resw key was renamed/repurposed to match.) * Dropped the now-orphan SttLanguageLabel/TextBox/Help resw entries from all 5 locales, the CapabilitiesPageLocalizationCoverageTests catalog, and the LocalizationValidationTests invariant list. Tests: Shared 1292 / Tray 460 (6 fewer cases — the CapabilitiesPageLocalizationCoverageTests theory shrank by 3 keys × 2 non-en locales). Build green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * i18n: align VoiceOverlayWindow root x:Uid with WindowEx convention The Title key in resw was VoiceOverlayWindow_winexWindowEx_2.Title, but the root x:Uid was just "VoiceOverlayWindow" — so WinUI's auto-derived property-suffix lookup (Window-typed elements get the _winexWindowEx_2 suffix) couldn't find a match and the title fell back to the XAML default. Aligned the x:Uid to "VoiceOverlayWindow_winexWindowEx_2", matching the existing pattern used by ChatWindow / HubWindow / CanvasWindow / TrayMenuWindow. (Also: the buddy's parallel "trailing whitespace in resw" finding is already addressed by subsequent commits — XmlDocument.Save normalized the formatting; `Get-Content | -match '\s+\$'` returns 0 on every locale today.) Build green. Tray 460. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * privacy: stop leaking ex.Message into voice UI status text The voice settings handlers and the Voice Overlay's start/stop catch were formatting raw exception messages straight into user-facing UI status text (and from there potentially into screenshots, error toasts, support bundles, the activity stream). ex.Message can carry URLs, local paths, hash digests, HTTP body fragments, or other implementation detail that the user shouldn't see. 
Seven call sites updated: * VoiceSettingsPage.xaml.cs — Whisper download error, Piper download failure, Piper delete failure, Piper preview failure, Windows voice enumeration failure, Windows preview failure (6 sites). * VoiceOverlayWindow.xaml.cs — overlay start/stop catch (1 site). For each: full ex (message + type + stack) is logged via Logger.Error or _logger.Error; the UI shows a generic localized message that ends in "(see Debug log)" so users know where the detail lives. Resw side: * Six error-string keys in all 5 locales had their {0} format placeholders replaced with self-contained generic messages (translated, not just placeholder-stripped). * VoiceSettingsPage_StatusError dropped from LocalizationValidationTests.InvariantOrDeferredResourceKeys — it used to be flagged invariant because the placeholder made every locale identical; with real translations it now varies and shouldn't be exempt. Tests: Tray 460. Build green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * audio: include stt.listen + stt.status in DangerousCommands These two commands were already wired up in NodeService and advertised by SttCapability, but the gateway's Windows platform-default policy hides any command that isn't either platform-default (system.*, browser.proxy) or in the node's DangerousCommands opt-in list. Only stt.transcribe was in that list, so chat agents only saw stt.transcribe even when NodeSttEnabled was on. Adding stt.listen and stt.status lets them get the same explicit gateway opt-in treatment as stt.transcribe, so once the operator allows them in gateway.nodes.allowCommands they flow through to the agent's tools list. Verified end-to-end: after re-pair, chat reports the full 24-command list including stt.listen, stt.status, and tts.speak. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(audio): isolate shared download cancellation

  Keep Whisper model and Piper voice single-flight downloads alive when one caller cancels its wait, and cover retry/cancellation behavior with focused tests.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(tray): keep right-click to context menu only

  Restore tray right-click behavior so it opens only the menu instead of also showing the companion hub.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(voice): allow local overlay without node pairing

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Ranjesh Jaganathan <ranjeshj@microsoft.com>
Co-authored-by: Scott Hanselman <scott@hanselman.com>
…m & settings (openclaw#292)

* feat: add recording state tracking to NodeService

  Add IsScreenRecording/IsCameraRecording properties and RecordingStateChanged event to NodeService. Wrap OnScreenRecord and OnCameraClip handlers to set state and raise events before/after async recording calls. This enables downstream UI components (tray icon, toasts, activity log) to react to recording lifecycle changes.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: add toast notifications for screen and camera recording

  Show toasts on recording start, completion, and failure for both screen recording and camera clips. Extract reusable ShowToast helper and add localized strings for all 5 locales (en-us, fr-fr, zh-cn, zh-tw, nl-nl).

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: log recording events to activity stream

  Add recording start/complete events with emoji indicators (🔴/✅) to the activity stream. Render emoji in a separate TextBlock element to prevent color emoji clipping by the card's CornerRadius clip mask.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: add recording consent dialog before first recording

  Show a standalone WindowEx consent dialog the first time an agent requests screen or camera recording. Consent is tracked separately per recording type (ScreenRecordingConsentGiven, CameraRecordingConsentGiven) so users can allow screen recording without granting camera access. The dialog uses extend-into-titlebar styling, Mica backdrop, and SetForegroundWindow to ensure visibility.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat: add privacy settings UI and polish consent dialog - Add Privacy section to Settings with screen/camera recording toggles - Settings toggles auto-refresh when consent changes externally - Fix consent dialog z-order with HWND_TOPMOST technique - Fix button width (MinWidth instead of fixed Width) - Add SettingsManager.Saved event for cross-component reactivity - Allow button uses AccentButtonStyle for consistency - Remove misleading 'only asked once' from privacy text Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: serialize consent dialogs and settings saves to prevent races - Add SemaphoreSlim guard in EnsureRecordingConsentAsync so concurrent recording requests coalesce onto a single consent dialog per type - Add lock around SettingsManager.Save() to prevent concurrent file writes - Update privacy toggle text in all 5 locales to clarify that enabling skips future consent prompts (e.g. 'Allow screen recording without prompting') Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat: add 3-2-1 countdown overlay before recording starts Show a translucent topmost countdown window (3 → 2 → 1) before screen and camera recordings begin, similar to Windows Snipping Tool. Gives users clear visual indication that recording is about to start. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(video): harden recording consent persistence Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: settings dirty-state guard, consent dialog copy, and tests - Add dirty-state guard to SettingsPage: external consent saves no longer overwrite unsaved user edits on the Settings page - Update consent dialog description in all 5 locales to explicitly state that the choice persists until changed in Settings - Add 4 focused tests for settings save thread safety, Saved event, consent persistence, and consent revocation Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Christine Yan <christineyan@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Scott Hanselman <scott@hanselman.com>
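Reviewer note: the "coalesce onto a single consent dialog per type" guard from the commit above might look roughly like this. It is a Python sketch using an `asyncio.Lock` per recording type in place of the SemaphoreSlim the commit mentions; `ConsentGate`, `ensure`, and the `prompt` callback are hypothetical names, not the app's API.

```python
import asyncio


class ConsentGate:
    """Serialize consent prompts so each recording type shows at most one dialog at a time."""

    def __init__(self, prompt):
        self._prompt = prompt      # hypothetical async callable: kind -> bool
        self._granted = {}         # kind -> True once the user has allowed it
        self._locks = {}           # kind -> asyncio.Lock

    async def ensure(self, kind):
        if self._granted.get(kind):
            return True
        lock = self._locks.setdefault(kind, asyncio.Lock())
        async with lock:
            # Re-check after waiting: a concurrent request may already have been granted.
            if self._granted.get(kind):
                return True
            allowed = await self._prompt(kind)
            if allowed:
                self._granted[kind] = True
            return allowed


async def demo():
    shown = []

    async def prompt(kind):
        shown.append(kind)
        await asyncio.sleep(0.01)
        return True

    gate = ConsentGate(prompt)
    results = await asyncio.gather(
        gate.ensure("screen"), gate.ensure("screen"), gate.ensure("camera")
    )
    return shown, results
```

In this sketch two concurrent "screen" requests produce one prompt, while "camera" is tracked and prompted separately. A denied prompt is not cached, so the next waiter would be asked again; whether the real dialog behaves that way is not stated in the commit.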
Resolved conflict with current master, preserving navigation pane persistence while keeping the wider minimum size and dynamic pane width.
Validation: local ARM64 build passed; Shared tests 1296 passed / 20 skipped; Tray tests 466 passed; remote CI green.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sanitizes node capability error responses while preserving details in local logs and updates tests for generic responses.
Validation: local ARM64 build passed; Shared tests 1296 passed / 20 skipped; Tray tests 466 passed; remote CI green.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Precompiles reusable regexes in redaction/UI helpers and keeps QuickSend-specific overlap out of this PR.
Validation: local ARM64 build passed; Shared tests 1296 passed / 20 skipped; Tray tests 466 passed; remote CI green.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…aw#270)
Handles PowerShell EncodedCommand aliases and separator forms, including -e, so approval evaluation remains fail-closed.
Validation: local ARM64 build passed; Shared tests 1319 passed / 20 skipped; Tray tests 466 passed; remote CI green.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…law#285)
Keeps the Quick Send custom titlebar styling while preserving the Windows hotkey foreground/topmost retry path and avoiding close-on-deactivation data loss.
Validation: local ARM64 build passed; Shared tests 1319 passed / 20 skipped; Tray tests 466 passed; remote CI green.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Normalize command identity for exec approvals and fail closed on PowerShell EncodedCommand abbreviations, including -en.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
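Reviewer note: a fail-closed check for EncodedCommand abbreviations, as described in the two exec-approval commits above, could be sketched like this. It is a Python illustration, not the project's implementation. Assumptions: any case-insensitive prefix of "encodedcommand" (plus the short form "ec") counts as the flag, '-' and '/' are the accepted switch prefixes, and ':' or '=' may separate the flag from its value. The function names are hypothetical.

```python
def is_encoded_command_flag(arg):
    """Return True if arg looks like any spelling of PowerShell's -EncodedCommand."""
    if not arg or arg[0] not in "-/":
        return False
    name = arg.lstrip("-/")
    # Separator forms such as -enc:VALUE (assumption: ':' and '=' act as separators).
    for sep in (":", "="):
        name = name.split(sep, 1)[0]
    name = name.lower()
    if not name:
        return False
    # Prefix match covers -e, -en, -enc, ... up to the full -encodedcommand.
    return "encodedcommand".startswith(name) or name == "ec"


def contains_encoded_command(argv):
    """Fail closed: flag the whole command if any argument after argv[0] matches."""
    return any(is_encoded_command_flag(a) for a in argv[1:])
```

Matching by prefix rather than against a fixed alias list is what keeps the check fail-closed: a new abbreviation such as -en is rejected without needing its own entry, at the cost of occasionally flagging a script argument that merely looks like the flag.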
- Revert: clear bootstrap after node gets device token (not operator)
- Revert: remove operator device token fallback (causes 'device token mismatch')
- Device tokens are role-specific: an operator token cannot auth the node role
- Bootstrap token stays in registry until node completes its own pairing
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove ExtendsContentIntoTitleBar (caused overlap with window controls)
- Allow PairingRequired transition from Error state (both operator and node)
Fixes race: Error event fires before PairingRequired, blocking the transition
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add _nodePairingPending and _operatorPairingPending volatile flags
- Set flag BEFORE acquiring semaphore in pairing handlers
- Check flag BEFORE acquiring semaphore in status handlers
- Prevents Disconnected event from winning semaphore race and overwriting PairingRequired
- Flags cleared on successful Connected status
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Replace volatile flags with direct PairingStatus/IsPairingRequired checks
- Both are set synchronously on the WS thread before async handlers run
- Eliminates race: Disconnected handler checks connector state, not a flag that hasn't been set yet because both events fire in same synchronous dispatch
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix node state: add Connecting case to OnNodeStatusChanged (was missing, kept node in Idle)
- Fix pairing race: preserve _isPendingApproval in OnDisconnected when _pairingBlocked
- Fix EmitStateChanged: always fire (node sub-state changes need to reach UI)
- Allow PairingRequired transition from Error state (both operator and node)
- Suppress Disconnect/Error events when pairing is pending (check connector state directly)
- Title bar: ExtendsContentIntoTitleBar with 40px top padding + title text
- PairingStatusEventArgs: add RequestId field for future auto-approve support
- Remove failed auto-approve attempt (requires operator.pairing scope)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…button
- Replace text-only operator/node status with state machine pill rows
- Read GatewayConnectionSnapshot directly (not old ConnectionStatus enum)
- Show 'Awaiting Approval' for PairingRequired instead of 'Connecting...'
- Pairing guidance card: approval command with Copy button + Connect/Disconnect
- Fix connection counter: reset during pairing, don't increment
- Read device ID from identity file when snapshot doesn't have it yet
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Diagnostics: add Direct Connect section (URL + Token fields, clears device tokens)
- Diagnostics: pairing state feedback (Connect button changes to 'once approved')
- Node auto-approve: re-enabled with scope guard (operator.admin/operator.pairing only)
- Loop prevention: track lastAutoApprovedRequestId, one attempt per requestId
- Shared gateway tokens now request operator.admin scope (removed local-only restriction)
- Default URL: ws://localhost:18790
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
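Reviewer note: the scope guard and "one attempt per requestId" loop prevention from the commit above can be illustrated with a small Python sketch. `AutoApprover`, `try_auto_approve`, and the `approve` callback are hypothetical names; only the two scope strings and the last-request-id idea come from the commit message.

```python
class AutoApprover:
    """Attempt node auto-approval at most once per pairing request id."""

    ALLOWED_SCOPES = {"operator.admin", "operator.pairing"}

    def __init__(self, approve):
        self._approve = approve                  # hypothetical callable: request_id -> bool
        self._last_auto_approved_request_id = None

    def try_auto_approve(self, request_id, scopes):
        # Scope guard: without a pairing-capable scope the attempt is skipped.
        if not self.ALLOWED_SCOPES & set(scopes):
            return False
        # Loop prevention: the same request id is never retried automatically.
        if request_id == self._last_auto_approved_request_id:
            return False
        self._last_auto_approved_request_id = request_id
        return self._approve(request_id)
```

Recording the id before calling `approve` means a failing approval cannot trigger a retry loop when the same pairing event is delivered again.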
- Replace Advanced Connection Settings expander with Direct Connect card
- Gateway URL + Token fields with Connect button (same flow as diagnostics)
- Clears stored device tokens so shared token is used with admin scope
- Saves SSH tunnel settings alongside
- Default URL: ws://localhost:18790
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Store SshTunnelConfig on GatewayRecord when SSH is enabled
- Start SSH tunnel via App.EnsureSshTunnelStarted() before connecting
- Save SSH settings to SettingsManager for legacy compat
- Show 'Starting SSH tunnel...' status during tunnel setup
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Manager: use ws://localhost:{localPort} when record has SshTunnelConfig (operator + node)
- Diagnostics: add SSH toggle + fields to Direct Connect section
- Diagnostics: OnDirectConnect stores SshTunnelConfig on GatewayRecord, starts tunnel
- Connection page: same flow (already wired)
- SSH uses OS ssh client with key-based auth (no password in app)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Phase A: WSL Local Setup
- SettingsManagerLocalGatewaySetupSettings syncs Token/BootstrapToken to GatewayRegistry on Save
- CreateLocalOnly accepts optional GatewayRegistry parameter
- App passes _gatewayRegistry to the factory
Phase B: Onboarding Flow
- Onboarding ConnectionPage uses Direct Connect flow: creates GatewayRecord + manager.ConnectAsync
- Polls manager.CurrentSnapshot instead of app.GatewayClient
- LocalSetupProgressPage uses manager.ReconnectAsync instead of ReinitializeGatewayClient
- Expose App.ConnectionManager as public property
Other:
- Shared gateway tokens now request operator.admin scope on all gateways (not just local)
- Update test: FreshStandardRemoteDevice now expects admin scopes
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace 8 InitializeGatewayClient/InitializeNodeService call sites with _connectionManager.ReconnectAsync():
- Tray connect toggle
- Hub connect action
- OnSettingsSaved full reconnect
- ReconnectGateway
- ReconnectNodeServiceOnly
- SSH tunnel restart
- Onboarding completion
- Startup: kept InitializeGatewayClient but removed manual node init
-68 lines from App.xaml.cs (4735 → 4214 cumulative this session)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…age UX
Phase 1: Delete dead methods (InitializeNodeService, ReinitializeGatewayClient, ReconnectGateway, ReconnectNodeServiceOnly) and inline callers.
Phase 2: Make OnManagerStateChanged the sole writer of _currentStatus, removing redundant writes from event handlers and disconnect actions.
Phase 3: Move SSH tunnel lifecycle into GatewayConnectionManager via ISshTunnelManager. Manager starts/stops tunnel in ConnectAsync/DisconnectAsync. Split ISshTunnelManager interface from SshTunnelManager implementation.
Phase 4: Widen IOperatorGatewayClient with 35+ request methods (sessions, usage, config, cron, skills, pairing, channels, wizard). Change IGatewayConnectionManager.OperatorClient return type to interface.
Phase 5: Eliminate _gatewayClient field from App.xaml.cs, replacing 38 refs with _connectionManager?.OperatorClient. Update HubWindow and OnboardingState to use IOperatorGatewayClient.
Phase 6: Add 55 new tests: RetryPolicyTests (35), NodeConnectorTests (8), StaleEventGuardTests (5), PairingFlowTests (7).
Connection page UX: Add diagnostics button (opens Connection Status window), add Pending Device Pairing card with scope-gated Approve/Reject buttons (requires operator.admin or operator.pairing scope).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…le completed state
When replaceExistingConfigurationConfirmed=true, the engine loaded a completed setup-state.json from a previous run and had nothing to do, leaving the progress page stuck with all stages showing empty circles. Also set AllowExistingDistro=true when replacing config since the WSL distro already exists from the previous run.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace the setup engine's throwaway OpenClawGatewayOperatorConnector with ConnectionManagerOperatorConnector that delegates to the app's GatewayConnectionManager. All operator handshake/pairing events now appear in the Connection Status diagnostics window.
Key changes:
- Add OperatorPairingRequestId to GatewayConnectionSnapshot so the setup engine can pass requestId to WSL CLI for explicit device approval
- ConnectionManagerOperatorConnector: implements IGatewayOperatorConnector via the manager (registry record setup, ConnectAsync, state change wait)
- Suppress node auto-connect during setup via _suppressNodeDuringSetup flag (cleared when engine completes/fails/cancels)
- Fix state machine: allow WebSocketConnected and HandshakeSucceeded from Error state to handle WebSocketClientBase's built-in auto-reconnect
- Setup engine factory accepts optional operatorConnectorOverride parameter
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixes from 5-model adversarial code review:
- SSH tunnel lifecycle: don't stop on disconnect, only on gateway switch or dispose. Add browser proxy forward to SshTunnelConfig.
- Suppress flag safety: try/catch on engine construction, stale engine guard prevents wrong engine from clearing flag.
- Dispose: unsubscribe node events before semaphore disposal.
- Token redaction: mask auth payloads, sigToken previews, and signed payloads in diagnostics logs.
- Connector: move StateChanged subscription after fast-path checks, disconnect manager on timeout.
- State machine: wrap SetNodeEnabled in semaphore, clear stale OperatorPairingRequestId on transition.
- UI: re-enable approve/reject buttons on false return, move auto-approve guard after confirmed success.
- DisposeActiveClient: sync-wait for node disconnect (2s timeout).
- Remove diagnostics window auto-open on startup.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
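Reviewer note: the "Token redaction" item in the commit above is about keeping secrets out of diagnostics logs. A minimal Python sketch of that kind of masking is shown here. The JSON field names in the pattern ("token", "sigToken", "signature") and the `redact` helper are assumptions for illustration; the commit does not say which fields or formats the real redactor covers.

```python
import re

# Precompiled once and reused for every log line.
# Assumed field names; adjust to whatever the diagnostics payloads actually contain.
_SECRET_FIELD = re.compile(
    r'("(?:token|sigToken|signature)"\s*:\s*")([^"]*)(")',
    re.IGNORECASE,
)


def redact(line):
    """Replace the value of any secret-looking JSON string field with ***."""
    return _SECRET_FIELD.sub(lambda m: m.group(1) + "***" + m.group(3), line)
```

For example, `redact('{"token":"abc123","url":"ws://localhost:18790"}')` keeps the URL and masks only the token value.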
Remove 6 working/aspirational documents (moved to external copilotdocs):
- connection-architecture-northstar.md
- connection-implementation-audit.md
- gateway-node-integration.md
- wsl-owner-open-issues.md
- wsl-owner-validation.md
- WINDOWS_NODE_TESTING.md
Replace with one concise document describing the current architecture as implemented: components, state machine, credential resolution, setup flow integration, and what remains in App.xaml.cs.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Merges Repo Assist PR openclaw#303 after local Windows validation.
Validation passed:
- .\build.ps1
- dotnet test .\tests\OpenClaw.Shared.Tests\OpenClaw.Shared.Tests.csproj --no-restore
- dotnet test .\tests\OpenClaw.Tray.Tests\OpenClaw.Tray.Tests.csproj --no-restore
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Merges Repo Assist PR openclaw#300 after local Windows validation and green GitHub checks.
Validation passed:
- .\build.ps1
- dotnet test .\tests\OpenClaw.Shared.Tests\OpenClaw.Shared.Tests.csproj --no-restore
- dotnet test .\tests\OpenClaw.Tray.Tests\OpenClaw.Tray.Tests.csproj --no-restore
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Merges Repo Assist PR openclaw#294 after local Windows validation and green GitHub checks.
Validation passed:
- .\build.ps1
- dotnet test .\tests\OpenClaw.Shared.Tests\OpenClaw.Shared.Tests.csproj --no-restore
- dotnet test .\tests\OpenClaw.Tray.Tests\OpenClaw.Tray.Tests.csproj --no-restore
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix wizard RadioButtons selection not sticking (FunctionalUI)
- Validate SSH config before direct connect; revert registry on failure
- Log identity file errors instead of silent catch blocks
- Unsubscribe ConnectionPage from StateChanged on Unloaded
- Fix ReconnectAsync/SwitchGatewayAsync race condition (atomic semaphore)
- Fix sync-over-async deadlock risk in DisposeActiveClient
- Add timeout to tunnel stop in Dispose
- Persist Cancelled status on setup cancellation
- Classify permanent setup failures as non-retryable
- Handle corrupt setup-state.json gracefully
- Fix chat using DeviceToken instead of SharedGatewayToken for HTTP auth
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ranjeshj pushed a commit that referenced this pull request May 16, 2026
Four items flagged by @shanselman after re-reviewing 488c94f. All four addressed in this commit:
1. _engineStarted permanent-block bug (V2Bridge.EnsureEngineStarted)
The flag was set BEFORE the existing-config guard, so a guarded early return left _engineStarted=true. A later call after the user confirmed replacement would hit if (_engineStarted) return; and never construct the engine, permanently locking the user out of local setup.
Fix: move _engineStarted = true to AFTER all preflight guards pass. Also reset _engineStarted to false in the engineFactory catch path so transient construction failures are recoverable via Try-again. Mark the synthetic existing-config block as retryable so the user can confirm replace and retry without restarting onboarding.
2. Stale retry continuation race (V2Bridge.OnRetryRequested + EnsureEngineStarted)
Added monotonic _engineGeneration counter. Bumped when retry resets engine state. Captured in the RunLocalOnlyAsync().ContinueWith(...) before the new run starts. Continuation no-ops if generation has been bumped, preventing an old run's final state from auto-advancing the V2 flow (LocalSetupProgress → GatewayWelcome) after the user clicked "Try again".
3. Try-again rendered for terminal/blocked failures (V2State + Bridge + Page)
Added OnboardingV2State.LocalSetupCanRetry (default false). Bridge OnEngineStateChanged sets it true ONLY for FailedRetryable; terminal and blocked failures clear it. LocalSetupProgressPage.BuildErrorCard accepts a nullable Action? onTryAgain and omits the button when null; single-column grid layout when no retry button is shown. Bridge OnRetryRequested is gated on LocalSetupCanRetry as defense-in-depth so a stale UI event from before the page re-rendered cannot restart the engine on a terminal failure.
4. New source projects added to slnx
Added OpenClawTray.OnboardingV2 (no platform mapping) and OpenClaw.SetupPreview (with x64/ARM64 mapping like Tray.WinUI, since it's WindowsAppSDKSelfContained=true and AnyCPU would need a RID). Now visible in VS/Rider solution view.
Tests added (OpenClawTray.OnboardingV2.Tests):
- LocalSetupCanRetry_DefaultsToFalse
- LocalSetupCanRetry_SetTrue_FiresStateChanged
- LocalSetupCanRetry_SetSameValue_DoesNotFireStateChanged
Bridge-level integration tests for fixes #1 and #2 (the simulation Scott asked for) require constructing OnboardingV2Bridge with a mock engine factory, but the bridge depends on App.xaml.cs (gateway client reseeding), which can't be easily unit-tested without WinUI runtime. The fixes are covered with strong inline contracts/comments documenting the invariants.
Validation (worktree, OPENCLAW_REPO_ROOT set):
- ./build.ps1 ✅
- Shared.Tests: 1548 passed / 28 skipped / 0 failed ✅
- Tray.Tests: 1197 passed / 0 failed ✅ (was 1178; +19 from master merge)
- OpenClawTray.OnboardingV2.Tests: 7 passed / 0 failed ✅ (was 4; +3 CanRetry)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
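Reviewer note: fixes 1 and 2 in the commit above (set the started flag only after the preflight guards pass, and use a generation counter to drop stale continuations) can be sketched together. This is a Python illustration with hypothetical names (`Bridge`, `ensure_engine_started`, `on_run_finished`); the real bridge is C# and wires the continuation through ContinueWith, which this sketch replaces with an explicit callback.

```python
class Bridge:
    """Sketch of the started-flag ordering and stale-continuation guard."""

    def __init__(self, engine_factory, has_existing_config):
        self._engine_factory = engine_factory            # hypothetical: () -> engine
        self._has_existing_config = has_existing_config  # hypothetical: () -> bool
        self._engine_started = False
        self._engine_generation = 0
        self.engine = None
        self.advanced = False

    def ensure_engine_started(self, replace_confirmed=False):
        if self._engine_started:
            return False
        # Preflight guard runs BEFORE the flag is set, so an early return
        # leaves the bridge able to start later.
        if self._has_existing_config() and not replace_confirmed:
            return False
        self._engine_started = True
        try:
            self.engine = self._engine_factory()
        except Exception:
            # Construction failed: reset so "Try again" can reach the factory.
            self._engine_started = False
            return False
        return True

    def capture_generation(self):
        return self._engine_generation

    def on_retry_requested(self):
        # Bumping the generation invalidates continuations from earlier runs.
        self._engine_generation += 1
        self._engine_started = False
        self.engine = None

    def on_run_finished(self, generation):
        if generation != self._engine_generation:
            return False        # stale run: do not auto-advance the flow
        self.advanced = True
        return True
```

A caller would capture the generation before starting a run and pass it to `on_run_finished` when that run ends, so only the most recent run can advance the flow.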
shanselman pushed a commit that referenced this pull request May 17, 2026
…reNodeConnectedAsync + NodeService.AttachClient safety
Adversarial dual-model code review (Opus + Codex) on PR openclaw#413 surfaced 5 actionable findings. This commit fixes all 5:
#1 NodeService.AttachClient/DisconnectAsync: event subscription bookkeeping was neither idempotent (Opus) nor thread-safe (Codex). Added a dedicated _clientLock that AttachClient and DisconnectAsync both take while reading/writing _nodeClient and wiring/unwiring its handlers. Subscribe is now unconditional unsubscribe-then-subscribe so a re-attach of the same client reference after a DisconnectAsync (which nulled _nodeClient) doesn't double-subscribe.
#2 ConnectionManagerWindowsNodeConnector defensive create: was hardcoding IsLocal=true. Derived from URL via LocalGatewayUrlClassifier.IsLocalGatewayUrl so a future remote-gateway caller isn't silently misclassified.
#3 EnsureNodeConnectedAsync timeout: was applying CancelAfter(35s) even when the caller passed a longer-lived token, contradicting the docstring contract. Now only applies the default 35s when !cancellationToken.CanBeCanceled.
#4 EnsureNodeConnectedAsync entry guard: added cancellationToken.ThrowIfCancellationRequested() before any side effects.
#5 EnsureNodeConnectedAsync silent-hang on no-credential: Opus's symmetric-defect check flagged 3 silent-return paths in StartNodeConnectionAsync (null connector / missing gateway record / no node credential). EnsureNodeConnectedAsync now re-reads the snapshot after StartNodeConnectionAsync and throws InvalidOperationException immediately if NodeState is still Idle or Disabled, rather than waiting 35s for a misleading TimeoutException. The underlying diagnostic was already recorded via _diagnostics.Record('node', ...).
Deferred (LOW-consensus, would need deeper changes):
- Active gateway mismatch check in ConnectionManagerWindowsNodeConnector (Codex MEDIUM, Opus didn't flag). In practice the operator connector runs first in the same engine setup and switches the manager to the right record. Adding an explicit SwitchGatewayAsync here would race with the manager's own gateway-switching logic; defer until reproducer.
- Post-onboarding OperatorState==Connecting snapshot race (Opus LOW, self-resolved as no-fix-needed; auto-reconnect timer covers the benign race).
Tray.Tests 962/962 ✅, Connection.Tests 224/224 ✅, build green ✅
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
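Reviewer note: finding #1 above (unconditional unsubscribe-then-subscribe under a dedicated lock) can be illustrated with a short Python sketch. `Event`, `NodeClient`, and `NodeService` here are hypothetical stand-ins for the C# types, with a plain handler list in place of .NET events.

```python
import threading


class Event:
    """Tiny stand-in for a .NET event: a list of handlers."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        self._handlers.append(handler)

    def unsubscribe(self, handler):
        if handler in self._handlers:
            self._handlers.remove(handler)

    def fire(self, *args):
        for handler in list(self._handlers):
            handler(*args)


class NodeClient:
    def __init__(self):
        self.status_changed = Event()


class NodeService:
    def __init__(self):
        self._client_lock = threading.Lock()
        self._node_client = None
        self.seen = []

    def _on_status(self, status):
        self.seen.append(status)

    def attach_client(self, client):
        with self._client_lock:
            previous = self._node_client
            if previous is not None and previous is not client:
                previous.status_changed.unsubscribe(self._on_status)
            # Unsubscribe first, unconditionally, so re-attaching the same
            # client never leaves two copies of the handler registered.
            client.status_changed.unsubscribe(self._on_status)
            client.status_changed.subscribe(self._on_status)
            self._node_client = client

    def disconnect(self):
        with self._client_lock:
            if self._node_client is not None:
                self._node_client.status_changed.unsubscribe(self._on_status)
            self._node_client = None
```

Because the service no longer relies on its own `_node_client` field to decide whether a handler is already wired, attaching the same client twice still delivers each status change exactly once.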
Summary
Fixes 12 bugs across setup flow, connection page, connection manager, and chat token resolution.
Wizard RadioButtons (user-reported)
ConnectionPage — Direct Connect
GatewayConnectionManager
Setup Flow
Chat Token
Testing