
MSIX-only distribution hardening: manifest, permissions, .appinstaller updates, sunset Inno, recovery CLI, test plan #468

Draft
indierawk2k2 wants to merge 17 commits into openclaw:master from indierawk2k2:feat/msix-e2e-hardening

Conversation

@indierawk2k2
Contributor

Draft. Sharing for early validation with a collaborator. Not ready to merge until the manual test pass on a clean cloud devbox is complete and any follow-up findings land.

Summary

End-to-end hardening of the MSIX-only distribution channel for OpenClaw Companion. This PR turns the half-built MSIX path into a coherent shipping channel: manifest hardening, the right permission-check APIs for packaged apps, the documented non-Store auto-update mechanism via .appinstaller, deletion of the Inno + Updatum legacy plumbing, a CLI recovery flag for orphan WSL state, and an automated + manual test plan that already caught several bugs (one of which, #467, is fixed in this PR).

Locked-in scope decisions (from the planning convo before implementation):

What's in this PR

Track 1 — MSIX manifest hardening (commit 9ec8ee7)

  • Added windows.startupTask extension (TaskId=OpenClawCompanionStartup) so "Launch when Windows starts" survives MSIX-only.
  • Fixed CommandPalette placeholder publisher (CN=Microsoft Corporation, A Lone Developer) — CI now patches the cmdpal manifest in lockstep with the tray manifest.
  • 9 manifest-assertion tests pin the audited capability set, identity name, publisher prefix, 4-part version, and StartupTask TaskId.
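
The kind of invariant these manifest tests pin can be sketched roughly as follows. This is a Python illustration, not the project's C# test code: the inline manifest, the identity name, and the publisher value are made-up placeholders, and only the TaskId comes from this PR.

```python
import re
import xml.etree.ElementTree as ET

# XML namespaces used by MSIX manifests (foundation + uap5 for StartupTask).
NS = {
    "f": "http://schemas.microsoft.com/appx/manifest/foundation/windows10",
    "uap5": "http://schemas.microsoft.com/appx/manifest/uap/windows10/5",
}

# Hypothetical, trimmed-down manifest used only for this illustration.
SAMPLE_MANIFEST = """\
<Package xmlns="http://schemas.microsoft.com/appx/manifest/foundation/windows10"
         xmlns:uap5="http://schemas.microsoft.com/appx/manifest/uap/windows10/5">
  <Identity Name="Example.Companion" Publisher="CN=Example Publisher" Version="1.2.3.0" />
  <Applications>
    <Application Id="App">
      <Extensions>
        <uap5:Extension Category="windows.startupTask">
          <uap5:StartupTask TaskId="OpenClawCompanionStartup" Enabled="false" DisplayName="Example" />
        </uap5:Extension>
      </Extensions>
    </Application>
  </Applications>
</Package>
"""


def check_manifest(xml_text, expected_task_id, publisher_prefix):
    """Return a list of human-readable violations (an empty list means OK)."""
    root = ET.fromstring(xml_text)
    identity = root.find("f:Identity", NS)
    if identity is None:
        return ["missing <Identity>"]
    problems = []
    # MSIX versions must have exactly four numeric parts.
    if not re.fullmatch(r"\d+\.\d+\.\d+\.\d+", identity.get("Version", "")):
        problems.append("Version is not 4-part")
    # Only the prefix is checked, so a typo later in the subject would pass.
    if not identity.get("Publisher", "").startswith(publisher_prefix):
        problems.append("Publisher prefix mismatch")
    task_ids = [t.get("TaskId") for t in root.iter("{%s}StartupTask" % NS["uap5"])]
    if expected_task_id not in task_ids:
        problems.append("StartupTask TaskId missing")
    return problems
```

Calling `check_manifest(SAMPLE_MANIFEST, "OpenClawCompanionStartup", "CN=")` on the placeholder manifest returns an empty list; a 3-part version or a renamed TaskId shows up as a named violation.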

Track 2 — Permission API correctness for packaged apps (commit e6fd8fd)

  • PermissionChecker branches on PackageHelper.IsPackaged: packaged uses Windows.Security.Authorization.AppCapabilityAccess.AppCapability for webcam / microphone / location (the only API that reports per-package consent), unpackaged keeps DeviceAccessInformation / registry probes.
  • Subscribes to AppCapability.AccessChanged so the onboarding row strip updates live when the user toggles consent in Settings → Privacy.
  • 8 mapping tests pin the AppCapabilityAccessStatus → PermissionStatus arms.
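
The mapping those tests pin has the shape below. This is a Python sketch, not the C# helper itself. The four status names are the ones the commit message lists; the `PermissionStatus` member names and which member each status maps to are assumptions for illustration.

```python
from enum import Enum


class PermissionStatus(Enum):
    # Hypothetical member names; the real enum lives in the tray project.
    GRANTED = "granted"
    NOT_DETERMINED = "not_determined"
    DENIED = "denied"
    UNKNOWN = "unknown"


# Explicit arms for every AppCapabilityAccessStatus value the helper handles.
_ACCESS_STATUS_MAP = {
    "Allowed": PermissionStatus.GRANTED,
    "UserPromptRequired": PermissionStatus.NOT_DETERMINED,
    "DeniedByUser": PermissionStatus.DENIED,
    "DeniedBySystem": PermissionStatus.DENIED,
}


def map_access_status(status_name: str) -> PermissionStatus:
    """Map a status name to a PermissionStatus, defaulting to UNKNOWN.

    The default is deliberately not GRANTED, so a value added by a future
    SDK can never be treated as consent by accident.
    """
    return _ACCESS_STATUS_MAP.get(status_name, PermissionStatus.UNKNOWN)
```

The design point is the default arm: an unrecognized status degrades to "unknown" rather than to "allowed".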

Track 3 — Sunset Inno + Updatum, add --purge-wsl-orphans (commit 4a119f1)

Deleted:

  • installer.iss, scripts/Uninstall-LocalGateway.ps1, tests/PackagingTests/Test-InnoUninstallOrdering.ps1, tests/OpenClaw.Tray.Tests/InstallerIssAssertionTests.cs
  • Both Updatum-coupled dialogs (DownloadProgressDialog, UpdateDialog)
  • Updatum PackageReference
  • Inno-related CI jobs

Added:

  • OpenClaw.WinNode.Cli --purge-wsl-orphans — dry-run by default, --confirm-destructive to apply, --json-output for machine consumption. Detects orphan WSL distros (case-insensitive openclaw* match including the legacy PascalCase OpenClawGateway), %APPDATA%/%LOCALAPPDATA% folders, both lowercase + PascalCase openclaw:// URI scheme registry keys, and the legacy HKCU Run autostart entry. Exit codes 0 (clean / all removed) / 1 (dirty dry-run) / 2 (removal failed).
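
The dry-run default and the 0 / 1 / 2 exit-code policy can be sketched like this. It is a Python illustration, not the CLI's C# code; `OrphanItem.kind`, the `remove` callback, and the constant names are hypothetical.

```python
from dataclasses import dataclass

EXIT_CLEAN = 0           # nothing found, or everything removed
EXIT_DIRTY_DRY_RUN = 1   # orphans found, but no --confirm-destructive
EXIT_REMOVAL_FAILED = 2  # at least one removal failed


@dataclass(frozen=True)
class OrphanItem:
    kind: str  # hypothetical kind label, e.g. "wsl-distro"
    name: str


def run_purge(detected, remove, confirm_destructive=False):
    """Return the process exit code for a purge run.

    detected: list of OrphanItem found on the machine.
    remove:   callable(OrphanItem) -> bool, True when removal succeeded.
    """
    if not detected:
        return EXIT_CLEAN
    if not confirm_destructive:
        # Dry-run is the default: report only, touch nothing.
        return EXIT_DIRTY_DRY_RUN
    # Attempt every removal even if an earlier one fails.
    results = [remove(item) for item in detected]
    return EXIT_CLEAN if all(results) else EXIT_REMOVAL_FAILED
```

Because exit code 1 means "dirty but untouched", a script can run the dry-run first and only pass `--confirm-destructive` after inspecting the report.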

Track 4 — .appinstaller auto-update pipeline (commit f072d6f)

  • installer/openclaw-companion.appinstaller.template + scripts/render-appinstaller.ps1 with strict input validation.
  • CI Render AppInstaller step produces both per-tag and stable latest.appinstaller filenames.
  • New AppInstallerUpdateService wraps PackageManager.AddPackageByAppInstallerFileAsync for the in-app "Check for updates" path.
  • App.xaml.cs branches on packaging: packaged builds use AppInstaller; unpackaged (dev) no-op the startup check.
  • docs/RELEASING.md documents the four AppInstaller update triggers and operator caveats.
  • 9 template + URL-alignment tests.
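
The "strict input validation" idea behind the render step can be sketched as follows. This is Python rather than the actual PowerShell script; the `{{NAME}}` placeholder syntax, the trimmed single-URI template, and the identity and publisher values are assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse
from xml.sax.saxutils import escape

# Hypothetical, trimmed template; the real one carries two MSIX URIs and UpdateSettings.
TEMPLATE = """\
<AppInstaller xmlns="http://schemas.microsoft.com/appx/appinstaller/2018"
              Version="{{VERSION}}" Uri="{{APPINSTALLER_URI}}">
  <MainBundle Name="Example.Companion" Publisher="{{PUBLISHER}}"
              Version="{{VERSION}}" Uri="{{MSIX_URI}}" />
</AppInstaller>
"""


def render(template, values):
    """Validate inputs, substitute placeholders, and re-parse the result."""
    if not re.fullmatch(r"\d+\.\d+\.\d+\.\d+", values["VERSION"]):
        raise ValueError("VERSION must be 4-part, e.g. 1.2.3.0")
    for key in ("APPINSTALLER_URI", "MSIX_URI"):
        parsed = urlparse(values[key])
        if parsed.scheme != "https" or not parsed.netloc:
            raise ValueError(f"{key} must be an absolute https URI")
    rendered = template
    for key, value in values.items():
        # Escape so a publisher subject with quotes cannot break the attribute.
        rendered = rendered.replace("{{" + key + "}}", escape(value, {'"': "&quot;"}))
    leftover = re.search(r"\{\{\w+\}\}", rendered)
    if leftover:
        raise ValueError(f"unreplaced placeholder {leftover.group(0)}")
    ET.fromstring(rendered)  # raises ParseError if substitution broke the XML
    return rendered
```

Re-parsing the rendered output is what moves a bad substitution from deploy time to CI time: the render fails before any file is attached to a release.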

Track 5 — Worker-node CLI MSIX packaging investigation (commit c6a3cd9)

Doc-only: docs/WINNODE_CLI_MSIX_PACKAGING.md lays out three options, picks packaging the CLI inside the tray MSIX as a second <Application> with windows.appExecutionAlias=openclaw-winnode.exe, and pins acceptance criteria for the implementation PR.

Track 6 — Test automation + manual runbook (commit c1e79f4)

  • scripts/test-msix-install.ps1 — install/launch/named-pipe/uninstall smoke test.
  • scripts/test-appinstaller-update.ps1 — local HTTP server walks the vN → vN+1 upgrade end-to-end.
  • 9 source-text orphan-purger contract tests.
  • docs/MSIX_E2E_TEST_RUNBOOK.md — 10-scenario manual runbook (Win11 24H2 + ARM64) covering clean install, packaged consent prompts (with the "Settings → Privacy must list OpenClaw Companion, NOT 'Desktop apps'" assertion), permission revocation while running, StartupTask, gateway uninstall, clean uninstall + orphan check, dirty uninstall + recovery via --purge-wsl-orphans, .appinstaller updates across all 4 triggers, sideload trust on no-dev-mode box, ARM64 cross-check.

Bug fixes surfaced by the manual test pass

  • commit 6aea012: OrphanPurger detects PascalCase OpenClawGateway + both URI key cases. Manual-test audit on Mike's box found the historical local-gateway installer registers the distro as OpenClawGateway (PascalCase, no dash), which the original case-sensitive openclaw- prefix missed. Same audit found HKCU\Software\Classes\openclaw AND \OpenClaw keys coexist on real boxes.
  • commit 6e34c90 — Privacy-list blue icon background + missing Settings → Notifications entry:
    • Generated Square44x44Logo.targetsize-{16,20,44}_altform-unplated.png (Windows requests these sizes; we had only 24/32/48/256, so it fell back to the plated tile and rendered the manifest BackgroundColor as the system accent — blue).
    • Declared windows.toastNotificationActivation + windows.comServer pair (CLSID D4E7F816-9D6A-4A49-B1BC-C1CE71282B04) so the per-app entry appears in Settings → Notifications immediately on install instead of waiting for the first toast. Added -ToastActivator short-circuit in App.OnLaunched so Windows-spawned activator instances exit cleanly instead of fighting the singleton mutex.
  • commit 9fdff05: Closes #467 (Uninstall leaves behind empty directories in %LOCALAPPDATA%). LocalGatewayUninstall Step 5a removed the per-distro VHD directory but never pruned the empty wsl\ parent or the empty %LOCALAPPDATA%\OpenClawTray\ grandparent. Added Step 5b and 12a with empty-guards (won't touch a directory with files or a sibling distro). 4 regression tests including defensive "don't delete user files" + "don't wipe sibling distro" coverage.
  • commit de5e73e — Ghost Windows Terminal frame leak fix:
    • Cherry-picked the 4 WinAppSdkGhostWindowCleanup commits from your wsl-keepalive-lifecycle branch.
    • Added scripts/cleanup-ghost-windows.ps1 standalone manual-recovery tool — needed because the in-process cleanup only fires inside the testhost lifetime and can't catch ghosts created by msbuild MSIX packaging or by abnormally-killed testhost processes. Uses the proven close sequence (ShowWindow(SW_HIDE) → PostMessage(SYSCOMMAND,SC_CLOSE) → SendMessageTimeout(WM_CLOSE, SMTO_ABORTIFHUNG)).
    • build.ps1 invokes the cleanup at end-of-build so MSIX-packaging leaks get caught before the developer notices.
    • 8 contract tests pin that the filter constants are shared between the script and the in-process cleanup (so we never close real user Terminals by accident).
    • AGENTS.md documents the manual recovery path.

File map

Manifest / packaging
src/OpenClaw.Tray.WinUI/Package.appxmanifest (+ StartupTask, + ToastActivator, + uap5/com/desktop namespaces)
src/OpenClaw.CommandPalette/Package.appxmanifest (publisher fixed, device family trimmed)
src/OpenClaw.Tray.WinUI/Assets/Square44x44Logo.targetsize-{16,20,44}_altform-unplated.png (new)
src/OpenClaw.Tray.WinUI/Services/AutoStartManager.cs (branches on PackageHelper.IsPackaged → StartupTask)
src/OpenClaw.Tray.WinUI/Onboarding/Services/PermissionChecker.cs (packaged path uses AppCapability)
src/OpenClaw.Tray.WinUI/App.xaml.cs (-Updatum, -DownloadAndInstallUpdateAsync, + AppInstaller branches, + -ToastActivator exit)
src/OpenClaw.Tray.WinUI/Services/AppInstallerUpdateService.cs (new)
src/OpenClaw.Tray.WinUI/OpenClaw.Tray.WinUI.csproj (- Updatum)

Recovery / cleanup
src/OpenClaw.WinNode.Cli/OrphanPurger.cs (new — case-insensitive match, both URI key cases)
src/OpenClaw.WinNode.Cli/Program.cs (+ --purge-wsl-orphans dispatcher)
src/OpenClaw.Tray.WinUI/Services/LocalGatewaySetup/LocalGatewayUninstall.cs (+ Step 5b, + Step 12a)

Appinstaller pipeline
installer/openclaw-companion.appinstaller.template (new)
scripts/render-appinstaller.ps1 (new)
.github/workflows/ci.yml (+ Render AppInstaller, - Inno jobs, - portable ZIP)

Tests + scripts + docs
tests/OpenClaw.Tray.Tests/MsixManifestAssertionTests.cs (new)
tests/OpenClaw.Tray.Tests/PermissionCheckerPackagedMappingTests.cs (new)
tests/OpenClaw.Tray.Tests/AppInstallerTemplateAssertionTests.cs (new)
tests/OpenClaw.Tray.Tests/OrphanPurgerContractTests.cs (new)
tests/OpenClaw.Tray.Tests/GhostWindowCleanupScriptContractTests.cs (new)
tests/OpenClaw.Tray.Tests/LocalGatewayUninstallTests.cs (+ 4 issue #467 regression tests)
tests/OpenClaw.Tray.Tests/WinAppSdkGhostWindowCleanup.cs (cherry-picked from wsl-keepalive-lifecycle)
scripts/test-msix-install.ps1 (new)
scripts/test-appinstaller-update.ps1 (new)
scripts/cleanup-ghost-windows.ps1 (new)
docs/MSIX_E2E_TEST_RUNBOOK.md (new)
docs/WINNODE_CLI_MSIX_PACKAGING.md (new)
docs/RELEASING.md, docs/SETUP.md, docs/uninstall-msix.md, DEVELOPMENT.md, AGENTS.md (updated)

Deletions
installer.iss
scripts/Uninstall-LocalGateway.ps1
tests/OpenClaw.Tray.Tests/InstallerIssAssertionTests.cs
tests/PackagingTests/Test-InnoUninstallOrdering.ps1
src/OpenClaw.Tray.WinUI/Dialogs/DownloadProgressDialog.cs
src/OpenClaw.Tray.WinUI/Dialogs/UpdateDialog.cs
docs/uninstall-portable.md

Manual test evidence

Manual install + uninstall validated on a clean cloud devbox via a test-signed MSIX (see msix-e2e-test-build-2026-05-19 release on the personal fork). Findings during testing landed as commits in this PR (Privacy icon, Notifications entry, #467 fix, ghost-window safety net) — or as new issues for the team:

Validation

  • ./build.ps1 — green on every commit
  • dotnet test ./tests/OpenClaw.Shared.Tests/OpenClaw.Shared.Tests.csproj --no-restore: 1776 passed / 28 skipped
  • dotnet test ./tests/OpenClaw.Tray.Tests/OpenClaw.Tray.Tests.csproj --no-restore: 1139 passed (was 1100 on master; +39 across the new manifest / permission / appinstaller / orphan / ghost-script / issue-467 test suites)

Outstanding follow-ups (deliberately deferred)

These are documented in code comments / PR commits and tracked here so reviewers can see what's NOT in this PR on purpose:

  1. gh-pages publish of latest.appinstaller is currently a manual operator step. CI renders the file but doesn't push it to gh-pages. docs/RELEASING.md documents the workflow.
  2. Toast click handlers: the manifest declares the activator CLSID, but App.OnLaunched doesn't yet CoRegisterClassObject to consume callbacks. Toast clicks fall through to the standard tray activation (acceptable; documented in commit 6e34c90).
  3. Settings → "Reset & remove" UI promotion (the banner / single-button cleanup flow the plan called for): the LocalGatewayUninstall flow from PR #310 is already reachable today from the Settings → Local Gateway expander. The promote-to-top-level work is UX polish, not a blocker.
  4. Worker-node CLI packaging — decision committed in docs/WINNODE_CLI_MSIX_PACKAGING.md; implementation is a separate PR.
  5. Issues #462 (WinUI: trackpad two-finger scroll doesn't work in tray app on some devboxes) and #463 (Toggling 'Allow screen recording' in Settings briefly hides left-nav items), both filed during the manual test pass, should land independently.

How to review

The commits are intentionally small and ordered. Suggested review order:

  1. 9ec8ee7 (T1 manifest) — sets the foundation
  2. e6fd8fd (T2 permissions) — proves the packaged/unpackaged branching pattern
  3. c6a3cd9 (T5 doc) — context for why we made the packaging recommendation we did
  4. f072d6f (T4 appinstaller) — the new auto-update mechanism
  5. 4a119f1 (T3 sunset) — biggest commit, deletes the old plumbing
  6. c1e79f4 (T6 tests) — test harness + runbook
  7. Cherry-picked 5e6b5ea/05b430a/2fccd24/5cfe67c — the WinAppSdk ghost cleanup from your wsl-keepalive-lifecycle branch
  8. Bug-fix commits 6aea012, 6e34c90, 9fdff05, de5e73e — all surfaced or hardened by the manual test pass

Things to specifically validate

  • Identity / Publisher: do the values in Package.appxmanifest still match the Trusted Signing certificate subject you intend to ship under? (We assert the prefix in tests, but a typo in middle fields would still pass.)
  • CLSID D4E7F816-9D6A-4A49-B1BC-C1CE71282B04 for the toast activator — happy with this value, or want to rotate before the first release that uses it?
  • The 4 cherry-picked ghost-cleanup commits are unchanged from wsl-keepalive-lifecycle. If you'd rather wait for them to land via their original branch and rebase, say the word and I'll drop them here.
  • The .appinstaller URL hard-codes https://openclaw.github.io/openclaw-windows-node/latest.appinstaller. If you want a different stable URL (CDN, etc.), it needs changing in 3 places: the in-app service, the CI render step, and the manifest tests.

Closes #467.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Mike Harsh and others added 15 commits May 18, 2026 16:59
- Tray Package.appxmanifest: add uap5 namespace and windows.startupTask extension
  with TaskId=OpenClawCompanionStartup (default disabled, user opts in via Settings).
- AutoStartManager: branch on PackageHelper.IsPackaged. Packaged path uses
  Windows.ApplicationModel.StartupTask.RequestEnableAsync/Disable so the user sees
  the proper one-time Windows consent dialog. Unpackaged path keeps the legacy
  HKCU\...\Run entry for dev/debug builds only.
- CommandPalette Package.appxmanifest: drop the VS-template placeholders
  (CN=Microsoft Corporation, `A Lone Developer`, Windows.Universal device family),
  default-publish under the tray's publisher subject and at MaxVersionTested 26100.0.
- CI: new `Patch CommandPalette MSIX manifest metadata` step that re-asserts the
  CommandPalette manifest is in lockstep with the tray (identity name, publisher,
  publisher display name, version) before the build/sign chain runs.
- tests/OpenClaw.Tray.Tests/MsixManifestAssertionTests.cs: new assertion suite
  pinning the audited capability set, openclaw protocol, StartupTask TaskId, 4-part
  version, Publisher prefix, plus CommandPalette manifest sanity (no Microsoft
  placeholder, namespaced under tray, matching publisher, desktop-only).

Validation: ./build.ps1 OK, Shared.Tests 1776 passed / 28 skipped, Tray.Tests 1100 passed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- PermissionChecker.CheckCameraAsync / CheckMicrophoneAsync / CheckLocation now
  branch on PackageHelper.IsPackaged. The packaged path goes through
  Windows.Security.Authorization.AppCapabilityAccess.AppCapability.Create(<name>),
  which is the only API that reports the per-package consent state Windows
  surfaces under our package name in Settings > Privacy. The previous registry +
  DeviceAccessInformation path returned the wrong answer for MSIX users (it
  reads the global `Desktop apps` bucket, not our package-specific state).
- SubscribeToAccessChanges likewise branches on PackageHelper. Packaged path
  subscribes AppCapability.AccessChanged for webcam, microphone, and location
  so the onboarding row strip live-updates when the user toggles consent in
  Settings > Privacy. Defense-in-depth: if any AppCapability.Create throws on
  an older Windows build, we unwind the partial subscription and hand back a
  no-op disposer (callers must not crash).
- New MapAppCapabilityAccessStatus internal helper centralizes the AccessStatus
  -> PermissionStatus mapping with explicit arms for Allowed,
  UserPromptRequired, DeniedByUser, DeniedBySystem and a safe Unknown default.
  Unknown is deliberate so a future SDK enum value never silently bypasses
  capability consent.
- tests/OpenClaw.Tray.Tests/PermissionCheckerPackagedMappingTests.cs: pins the
  packaged-branch contract via source-text assertions (the test target is
  net10.0 so cannot resolve WinRT types; we follow the InstallerIssAssertionTests
  precedent of reading the source and asserting structural invariants).

Validation: ./build.ps1 OK, Shared.Tests 1776 passed / 28 skipped, Tray.Tests 1108 passed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- docs/WINNODE_CLI_MSIX_PACKAGING.md: investigation result with the committed
  recommendation to package the worker-node CLI inside the tray MSIX as a
  second <Application> publishing a windows.appExecutionAlias
  (`openclaw-winnode.exe`). Documents why the current Environment.SpecialFolder.ApplicationData
  contract between tray and CLI only works by coincidence under MSIX, lays
  out the three options considered, includes the proposed Package.appxmanifest
  snippet, sketches the required CLI code changes, and pins acceptance criteria
  for the follow-up implementation PR.
- docs/WINDOWS_NODE_ARCHITECTURE.md: link to the new design note so future
  contributors land on it from the Windows-platform umbrella doc.

Doc-only change; build/tests unaffected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- installer/openclaw-companion.appinstaller.template: AppInstaller XML template
  with placeholders for version, publisher, the two MSIX URIs, and the stable
  AppInstaller URL. UpdateSettings defaults: OnLaunch poll every 24h, ShowPrompt,
  non-blocking, ForceUpdateFromAnyVersion (rollback path), AutomaticBackgroundTask.
- scripts/render-appinstaller.ps1: renders the template per tag with strict
  input validation (4-part version, absolute https URIs) and post-render XML
  parsing so a bad substitution surfaces at CI time, not deploy time.
- .github/workflows/ci.yml: 'Render AppInstaller' step in the release job
  produces both a per-tag .appinstaller AND latest.appinstaller (stable filename
  for the gh-pages URL). Both are attached to the GitHub Release.
- src/OpenClaw.Tray.WinUI/Services/AppInstallerUpdateService.cs: new internal
  service that wraps PackageManager.AddPackageByAppInstallerFileAsync for the
  in-app 'Check for updates' path. Returns a typed UpdateResult so the caller
  can surface UpdateQueued / NoUpdateAvailable / Failed / NotPackaged.
- src/OpenClaw.Tray.WinUI/App.xaml.cs: branch CheckForUpdatesAsync and
  CheckForUpdatesUserInitiatedAsync on PackageHelper.IsPackaged. Packaged
  startup check no-ops (AppInstaller polls automatically); packaged manual check
  calls AppInstallerUpdateService directly. Unpackaged paths still go through
  Updatum until T3 deletes it.
- docs/RELEASING.md: new 'Non-Store auto-update via .appinstaller' section
  documenting the four AppInstaller update triggers and operator caveats.
- tests/OpenClaw.Tray.Tests/AppInstallerTemplateAssertionTests.cs: 9 new
  structural tests pinning template XML well-formedness, placeholder set,
  UpdateSettings values, MainBundle.Name matching production identity, and
  URL alignment between in-app service, CI, and docs.

Validation: ./build.ps1 OK, Tray.Tests 1117 passed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
We're MSIX-only now. This commit retires every code path that existed only for
the Inno .exe distribution + the Updatum auto-update channel, and lands the
documented recovery path for the case where a user removed the MSIX without
first running the in-app Reset & remove (PR openclaw#310).

Deleted (Inno + portable ZIP + Updatum sunset):
- installer.iss
- scripts/Uninstall-LocalGateway.ps1
- src/OpenClaw.Tray.WinUI/Dialogs/DownloadProgressDialog.cs
- src/OpenClaw.Tray.WinUI/Dialogs/UpdateDialog.cs
- tests/OpenClaw.Tray.Tests/InstallerIssAssertionTests.cs
- tests/PackagingTests/Test-InnoUninstallOrdering.ps1
- docs/uninstall-portable.md
- Updatum PackageReference from src/OpenClaw.Tray.WinUI/OpenClaw.Tray.WinUI.csproj
- using Updatum + AppUpdater static + DownloadAndInstallUpdateAsync from App.xaml.cs

CI (.github/workflows/ci.yml):
- Dropped 'Install Inno Setup', 'Build x64 Installer', 'Build arm64 Installer',
  'Sign Installer', 'Create Release ZIPs' steps from the release job.
- Release-notes 'Quick Start' now points users at latest.appinstaller, not the
  raw .msix (raw .msix installs don't wire up auto-update).
- Release attached files reduced to the 4 MSIX-only assets:
  latest.appinstaller, OpenClawCompanion-<v>.appinstaller, the two .msix.

Added (recovery CLI):
- src/OpenClaw.WinNode.Cli/OrphanPurger.cs: detects orphan WSL distros
  (openclaw-* prefix), orphan %APPDATA%/%LOCALAPPDATA% folders, legacy
  openclaw:// URI scheme registration, legacy HKCU Run autostart key. Dry-run
  by default; --confirm-destructive applies. --json-output for machine
  consumption. Exit codes 0 (clean / removed) / 1 (dirty dry-run) / 2 (some
  removals failed).
- OpenClaw.WinNode.Cli --purge-wsl-orphans dispatches to OrphanPurger BEFORE
  the --command required-flag check.
- docs/uninstall-msix.md: replaces the manual recovery PowerShell with the
  CLI flag (one command instead of five) and includes the equivalent
  PowerShell fallback for the case where the CLI itself was removed.

Doc cleanup:
- docs/SETUP.md: install instructions rewritten around the .appinstaller URL.
- DEVELOPMENT.md: Release Process section rewritten for the MSIX-only pipeline.
- scripts/validate-msix-storage-paths.ps1: recovery guidance updated to point
  at the new --purge-wsl-orphans CLI.
- src/OpenClaw.Tray.WinUI/App.xaml.cs: mutex-name comment no longer references
  the deleted installer.iss AppMutex contract.

Note: SettingsManager.SkippedUpdateTag is left in place — the field is
harmless and removing it would force a settings.json migration. It will be
naturally retired when SettingsData gets its next breaking change.

Validation: ./build.ps1 OK, Shared.Tests 1776 passed / 28 skipped,
Tray.Tests 1115 passed (-2 from deleted InstallerIssAssertionTests).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Automation (T6a):
- scripts/test-msix-install.ps1: local-runnable smoke test that walks
  Add-AppxPackage -> assert package presence/publisher/version ->
  Start-Process activation -> wait for OpenClawTray-DeepLink named pipe ->
  send openclaw://health -> Remove-AppxPackage -> orphan check. Used as
  the automated counterpart to runbook scenarios 1 and 6.
- scripts/test-appinstaller-update.ps1: spins up a local HttpListener that
  serves vN.msix, vN+1.msix and the rendered .appinstaller; walks
  Add-AppxPackage -AppInstallerFile -> re-render to vN+1 ->
  PackageManager.AddPackageByAppInstallerFileAsync -> assert Get-AppxPackage
  reports vN+1. Catches .appinstaller XML / template regressions before
  they reach a real GitHub release.
- tests/OpenClaw.Tray.Tests/OrphanPurgerContractTests.cs: 9 source-text
  assertions (same precedent as the historical InstallerIssAssertionTests)
  pinning OrphanWslDistroPrefix, the five orphan-kind names, the exit-code
  policy (0/1/2), and the dry-run-is-default invariant. The recovery CLI
  in OpenClaw.WinNode.Cli is internal so we cannot link the assembly into
  the net10.0 tray-tests target; the source-text approach keeps the
  contract pinned without forcing the CLI to expose internals.

Manual runbook (T6b):
- docs/MSIX_E2E_TEST_RUNBOOK.md: 10-scenario release runbook covering
  clean install, packaged permission consent prompts (with the "package
  name appears in Settings > Privacy" assertion that catches an
  accidental fallback to the unpackaged DeviceAccessInformation surface),
  permission revocation while running (proves AppCapability.AccessChanged
  is wired), StartupTask (proves we did NOT regress to HKCU\...\Run),
  local-gateway install + clean uninstall, dirty-uninstall + recovery
  via --purge-wsl-orphans (proves Q1 mitigation works), .appinstaller
  auto-update across all four trigger paths, sideload trust on a
  no-dev-mode box, and an ARM64 cross-check.

Note: the existing build-msix CI job already runs on every PR (gated only
on the test job, not on tags), so PR-time MSIX build coverage is already
present in CI; only the *signing* step is tag-gated, which is correct.

Validation: ./build.ps1 OK, Shared.Tests 1776 passed / 28 skipped,
Tray.Tests 1123 passed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…both URI key cases

While prepping for the manual test pass on Mike's box we found two real bugs
in OrphanPurger that would have caused --purge-wsl-orphans to falsely report
'no orphans' against a real OpenClaw install:

1. WSL distro detection used a case-sensitive 'openclaw-' prefix. The
   historical local-gateway installer registers the distro as
   'OpenClawGateway' (PascalCase, no dash), which that prefix missed. Pivoted to a
   case-insensitive substring match against an OrphanWslDistroPatterns
   array — currently a single 'openclaw' entry that catches both the
   PascalCase legacy form and the newer kebab-case 'openclaw-local' /
   'openclaw-staging' variants.

2. URI scheme key detection only enumerated
   HKCU\Software\Classes\openclaw. Mike's box has both 'openclaw' AND
   'OpenClaw' keys present simultaneously (the registry is case-insensitive
   for lookup but stores both literals). Switched to an OrphanUriSchemeKeys
   array so both are scrubbed; RemoveAsync now derives the subkey from the
   detected OrphanItem.Name instead of hard-coding the lowercase form.
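
The difference between the old prefix check and the new pattern match from item 1 can be shown in a few lines. This is a Python sketch of the matching rule only, not the C# in OrphanPurger; the function names are hypothetical.

```python
# Mirrors the single-entry pattern array described above.
ORPHAN_WSL_DISTRO_PATTERNS = ("openclaw",)


def old_prefix_match(name: str) -> bool:
    """The original rule: case-sensitive 'openclaw-' prefix."""
    return name.startswith("openclaw-")


def is_orphan_distro(name: str, patterns=ORPHAN_WSL_DISTRO_PATTERNS) -> bool:
    """The new rule: case-insensitive substring match against every pattern."""
    lowered = name.casefold()
    return any(pattern.casefold() in lowered for pattern in patterns)


for distro in ("OpenClawGateway", "openclaw-local", "openclaw-staging", "Ubuntu"):
    print(distro, old_prefix_match(distro), is_orphan_distro(distro))
```

The legacy `OpenClawGateway` name fails the old rule and passes the new one, while an unrelated distro such as `Ubuntu` fails both. A substring match is broader than a prefix match, so any user distro that merely contains "openclaw" would also be reported; the dry-run default is what keeps that safe.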

OrphanWslDistroPrefix const is retained for backward compatibility with
existing OrphanPurgerContractTests assertions and any external recipes
that grep for the symbol; the new code paths use OrphanWslDistroPatterns.

Tests: added two regressions in OrphanPurgerContractTests that pin the
case-insensitive WSL detection (including the 'OpenClawGateway' name in
the docstring) and the two-variant URI scheme coverage.

Validation: ./build.ps1 OK, Shared.Tests 1776 passed / 28 skipped,
Tray.Tests 1125 passed (+2 from the new orphan tests).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

# Conflicts:
#	tests/OpenClaw.Tray.Tests/WinAppSdkGhostWindowCleanup.cs
…ons-list missing entry

Two bugs surfaced during manual MSIX E2E testing on a clean cloud devbox.
Both are in scope for this MSIX-only-distribution branch; the other two
issues Mike found (trackpad scroll, screen-recording-toggle left-nav blip)
will be filed separately.

Bug 1 (Settings -> Privacy icon has blue background)
  Settings > Privacy > Camera/Microphone/Location renders the per-app icon
  at small sizes (16, 20 px). We shipped unplated variants for 24/32/48/256
  but were missing 16 and 20 (and the natural 44 unplated). When Windows
  cannot find a fitting unplated size it falls back to the plated tile with
  the manifest BackgroundColor, which renders as the system accent (blue)
  square behind the lobster.

  Fix: generate Square44x44Logo.targetsize-{16,20,44}_altform-unplated.png
  from the existing 256px master via high-quality bicubic downscale,
  preserving the corner alpha=0 transparency.

Bug 2 (no OpenClaw Companion entry in Settings -> Notifications)
  MSIX packaged apps that do NOT declare windows.toastNotificationActivation
  in their manifest only appear in Settings > Notifications AFTER the first
  toast is delivered under package identity (and even then it is often
  delayed by several minutes). Users who install but have not yet seen a
  toast cannot pre-configure notification preferences.

  Fix: declare the canonical activator pair in Package.appxmanifest:
    - <com:Extension Category="windows.comServer"> registering OpenClaw.Tray.WinUI.exe
      as a COM ExeServer with class id D4E7F816-9D6A-4A49-B1BC-C1CE71282B04
    - <desktop:Extension Category="windows.toastNotificationActivation">
      pointing at the same ToastActivatorCLSID

  App.OnLaunched gains a -ToastActivator short-circuit (Environment.Exit(0)
  before the singleton mutex check) so Windows-spawned activator instances
  do not fight the running tray. We do NOT currently consume toast click
  callbacks (no CoRegisterClassObject) — clicks fall through to the
  standard tray activation path, which is acceptable for now and tracked
  for a follow-up.

Tests (tests/OpenClaw.Tray.Tests/MsixManifestAssertionTests.cs):
  + Tray_DeclaresToastNotificationActivationExtension: pins both halves of
    the COM-server / toast-activation pair and asserts the two CLSIDs match,
    plus that App.xaml.cs has the -ToastActivator short-circuit.
  + Tray_PrivacyListIcon_HasAllRequiredUnplatedTargetSizes: pins the full
    set of unplated PNG variants Windows requests for the Privacy list
    (16, 20, 24, 32, 44, 48, 256). Regression here re-introduces the blue
    background.

Validation: ./build.ps1 OK, Shared.Tests 1776 passed / 28 skipped,
Tray.Tests 1127 passed (+2 from the new manifest tests).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…t-uninstall (openclaw#467)

LocalGatewayUninstall Step 5a deleted %LOCALAPPDATA%\OpenClawTray\wsl\<distro>\
but never touched the parent wsl\ or the grandparent OpenClawTray\. After a
clean full uninstall (gateway-remove + MSIX-uninstall) the user sees two phantom
empty folders left behind. Bug reported as openclaw#467 with full repro and root cause
diagnosis.

Fix:
- New Step 5b: 'Prune empty wsl\ parent directory' fires right after the VHD
  cleanup. Empty-guard ensures a sibling distro (e.g., openclaw-staging) under
  the same wsl\ parent is NEVER wiped.
- New Step 12a: 'Prune empty %LOCALAPPDATA%\OpenClawTray\' fires at the very
  end (after Preserve mcp-token.txt, before Compute postconditions). Same
  empty-guard logic: only removes the directory if all per-artifact steps left
  it empty. If non-OpenClaw files remain, the step records Skipped with the
  remaining entry names in Detail so operators can see what's blocking the
  prune (catches new artifact-writers we forgot to wire into explicit deletes).

Both steps are idempotent and safe to re-run.
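
The empty-guard both steps rely on can be sketched as below. This is a Python illustration, not the C# in LocalGatewayUninstall; "Skipped" is the status named above, while the "Removed" and "NotPresent" labels and the function name are assumptions.

```python
from pathlib import Path


def prune_if_empty(directory: Path):
    """Remove `directory` only when it is empty.

    Returns (status, detail), where detail lists whatever blocked the prune.
    """
    if not directory.is_dir():
        # Already gone: re-running the step is a no-op.
        return ("NotPresent", "")
    remaining = sorted(entry.name for entry in directory.iterdir())
    if remaining:
        # A sibling distro or a stray user file keeps the directory alive.
        return ("Skipped", ", ".join(remaining))
    directory.rmdir()  # rmdir itself refuses to delete a non-empty directory
    return ("Removed", "")
```

Using a non-recursive `rmdir` rather than a recursive delete gives a second line of defense: even if the emptiness check raced with a writer, the call fails instead of wiping content.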

%APPDATA%\OpenClawTray\ (roaming) is intentionally NOT pruned because
mcp-token.txt lives there and is preserved by design (Step 12).

Tests (LocalGatewayUninstallTests.cs +4 regression tests):
- FullUninstall_PrunesEmptyWslParent_Issue467
- FullUninstall_PrunesEmptyLocalAppDataRoot_Issue467
- FullUninstall_PreservesLocalAppDataRoot_WhenNonEmpty_Issue467 (defensive)
- FullUninstall_PreservesWslParent_WhenSiblingDistroPresent_Issue467 (defensive)

Validation: ./build.ps1 OK, Shared.Tests 1776 passed / 28 skipped,
Tray.Tests 1131 passed (+4).

Closes openclaw#467.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… build.ps1

Mike observed a new ghost Terminal frame appearing during the MSIX-E2E test
pass even though the cherry-picked WinAppSdkGhostWindowCleanup (4 commits from
wsl-keepalive-lifecycle) is in place. Root cause:

  The in-process cleanup only fires inside the testhost lifetime. It does NOT
  catch ghosts created by:
    - msbuild MSIX packaging (MakeAppx.exe, signtool.exe, WindowsAppSDK XAML
      markup compiler) — these run outside the testhost
    - testhost processes killed abnormally (SIGKILL, hung Ctrl+C, OOM) — the
      ProcessExit / AssemblyLoadContext.Unloading hooks never fire

This adds a three-layer safety net so we never strand windows on a developer
workstation OR on a future CI step that builds the MSIX without running tests:

1. scripts/cleanup-ghost-windows.ps1
   - Standalone PowerShell recovery tool. Mirrors the in-process C# cleanup
     filter EXACTLY (CASCADIA_HOSTING_WINDOW_CLASS + title == 'Terminal' +
     owner WindowsTerminal + size >= 1000x500). Uses the proven close
     sequence that survived the 2026-05-19 manual test: ShowWindow(SW_HIDE)
     -> PostMessage(WM_SYSCOMMAND, SC_CLOSE) -> SendMessageTimeout(WM_CLOSE,
     SMTO_ABORTIFHUNG, 1000ms). A plain SendMessage(WM_SYSCOMMAND) alone
     did NOT close the orphans during testing; Windows Terminal swallows it.
   - Up to 5 passes with a 500ms delay between (mirrors the C#
     CleanupBlankFramesRepeatedly logic).
   - Supports -WhatIf for safe enumeration and -Quiet for use in scripts.

2. build.ps1 hook
   - On a successful build the script runs automatically (-Quiet). MSIX
     packaging ghosts get cleaned before the developer notices.

3. AGENTS.md note
   - Documents the manual recovery path for any future agent / developer who
     sees Terminal windows piling up after an interrupted test run.
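
The filter and the multi-pass loop from item 1 can be sketched as below. This is a Python illustration only; the real implementations are the PowerShell script and the in-process C# cleanup. `WindowInfo`, `is_ghost_candidate`, and `cleanup_repeatedly` are hypothetical names, the Win32 enumeration and close calls are injected as plain callables, and the sketch assumes that "size >= 1000x500" means width >= 1000 and height >= 500 and that the loop stops early once a pass finds nothing.

```python
import time
from dataclasses import dataclass

# Filter constants taken from the commit message above.
GHOST_CLASS = "CASCADIA_HOSTING_WINDOW_CLASS"
GHOST_TITLE = "Terminal"
GHOST_OWNER = "WindowsTerminal"
MIN_WIDTH, MIN_HEIGHT = 1000, 500


@dataclass(frozen=True)
class WindowInfo:
    window_class: str
    title: str
    owner_process: str
    width: int
    height: int


def is_ghost_candidate(w):
    """True only when every guard matches; any mismatch protects the window."""
    return (
        w.window_class == GHOST_CLASS
        and w.title == GHOST_TITLE          # exact match, not a prefix
        and w.owner_process == GHOST_OWNER
        and w.width >= MIN_WIDTH
        and w.height >= MIN_HEIGHT
    )


def cleanup_repeatedly(enumerate_windows, close_window,
                       max_passes=5, delay_s=0.5, sleep=time.sleep):
    """Run up to `max_passes` passes, pausing `delay_s` between them.

    Returns the number of close attempts made across all passes.
    """
    attempts = 0
    for pass_index in range(max_passes):
        ghosts = [w for w in enumerate_windows() if is_ghost_candidate(w)]
        if not ghosts:
            break  # assumption: nothing left to do, so stop early
        for w in ghosts:
            close_window(w)
            attempts += 1
        if pass_index < max_passes - 1:
            sleep(delay_s)
    return attempts
```

Keeping the predicate as a pure function is one way to make the shared filter easy to pin in tests, which is the same goal the contract tests below pursue for the script and the C# cleanup.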

Why we did not just make the in-process cleanup more aggressive: the baseline
exclusion is deliberate to protect the developer's REAL Terminal windows
(whose title stays 'Terminal' until the developer types something). Closing
baseline frames would risk false positives. The standalone script is the
right place for the looser-but-still-safe behavior because it's manually
invoked or build-gated, not automatically firing in every test run.

Tests (tests/OpenClaw.Tray.Tests/GhostWindowCleanupScriptContractTests.cs):
  + FilterMatchesProductionCleanup theory: pins the 4 filter constants that
    the script and in-process cleanup MUST share (class, owner, min width, min height).
  + TitleFilter_IsExactlyTerminal_InBothImplementations: language-native
    quote-style match for the exact title comparison in each file.
  + CloseSequence_MatchesProvenMessageOrder: pins ShowWindow + PostMessage +
    SendMessageTimeout and the 4 message/flag constants by hex value.
  + Script_IsInvokedFromBuildPs1: pins the build.ps1 wiring so a refactor
    can't silently drop it.
  + Script_RejectsTinyWindowsToProtectUserTerminals: pins the 1000x500 min
    size as the safety guard against closing real user Terminals.

Validation: ./build.ps1 OK (auto-cleanup runs at end), Shared.Tests 1776
passed / 28 skipped, Tray.Tests 1139 passed (+8 new contract tests).

Manual verification on Mike's box: before fix, 1 leaked ghost; after running
scripts/cleanup-ghost-windows.ps1, 0 ghosts. Subsequent ./build.ps1 + test
runs left no leaks.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ask modes

During PR creation Mike saw two new ghost Terminal frames appear, even though
the cherry-picked WinAppSdkGhostWindowCleanup is in place AND build.ps1 wires
the cleanup script at end-of-build. Root cause was a coverage gap, not a logic
bug:

  On Win11 with Windows Terminal as the default terminal app
  (HKCU\Console\%%Startup\DelegationConsole = WT CLSID — the new Win11
  default), EVERY console-spawning child process allocates a Cascadia
  hosting frame. Most close cleanly. A small fraction leak under timing
  conditions. Our cleanup only fires from triggers we wired:
    1. testhost lifetime (in-process [ModuleInitializer] + xUnit attribute)
    2. build.ps1 end-of-build

  Cascadia frames from gh / git / dotnet / pwsh invoked OUTSIDE of build.ps1
  (e.g., 'gh pr create' creating the PR for this branch) are NOT caught by
  either trigger and leak indefinitely until reboot or manual cleanup.

Fix: give scripts/cleanup-ghost-windows.ps1 two new modes for high-shell-
activity sessions where the wired triggers aren't enough.

  -Daemon [-PollSeconds N]
    Foreground watcher; polls every N seconds (default 30, range 5..3600)
    and cleans any ghosts found. Use this when you're about to do a lot
    of shell work; Ctrl+C to stop. Useful for agent-driven branches like
    this one.

  -InstallScheduledTask
    Registers a Windows scheduled task (OpenClaw-Ghost-Terminal-Cleanup)
    that runs the script every 5 minutes under the current user, hidden
    window, 2-minute execution timeout, no admin needed. Idempotent
    (drops any prior registration first). Survives reboot, no shell
    session needed. Uninstall with -UninstallScheduledTask.

Validated install/uninstall round-trip on Mike's box during commit prep:
  PT5M repetition, State=Ready, clean uninstall.
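
The -Daemon polling behavior can be sketched as follows. This is a hedged Python illustration, not the PowerShell script itself: `validate_poll_seconds` and `run_daemon` are hypothetical names, `max_iterations` exists only so the loop can be exercised in a test, and rejecting out-of-range values (rather than clamping them) is an assumption about how the 5..3600 range is enforced.

```python
import time

DEFAULT_POLL_SECONDS = 30
MIN_POLL_SECONDS, MAX_POLL_SECONDS = 5, 3600


def validate_poll_seconds(value=None):
    """Return the polling interval, applying the default and the range check."""
    if value is None:
        return DEFAULT_POLL_SECONDS
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("poll seconds must be an integer")
    if not MIN_POLL_SECONDS <= value <= MAX_POLL_SECONDS:
        raise ValueError(
            f"poll seconds must be in {MIN_POLL_SECONDS}..{MAX_POLL_SECONDS}"
        )
    return value


def run_daemon(clean_once, poll_seconds=None, sleep=time.sleep,
               max_iterations=None):
    """Call `clean_once` every `poll_seconds` until interrupted.

    `clean_once` returns how many ghosts it closed; the total is returned
    when the loop ends (Ctrl+C, or `max_iterations` when provided).
    """
    interval = validate_poll_seconds(poll_seconds)
    total_closed = 0
    iterations = 0
    try:
        while max_iterations is None or iterations < max_iterations:
            total_closed += clean_once()
            iterations += 1
            sleep(interval)
    except KeyboardInterrupt:
        pass  # Ctrl+C is the expected way to stop a foreground watcher
    return total_closed
```

A foreground loop like this only helps while the shell session is open, which is why the scheduled-task mode exists for coverage that has to survive reboots.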

AGENTS.md documents the Win11-default-terminal-app trigger explicitly and
points future agents at the two escalation modes.

Tests (+5):
  + Script_ExposesEscalationModesForOutOfBandLeaks theory pins
    -Daemon, -PollSeconds, -InstallScheduledTask, -UninstallScheduledTask
    presence in the script source.
  + Script_ScheduledTaskName_IsStable pins the task name so the
    AGENTS.md / support-recipe references can't silently drift between
    installer and uninstaller.

This commit does NOT change the filter logic or close sequence — the
1000x500 + class + title + owner guards still protect real user Terminal
windows. Only adds new invocation modes.

Validation: ./build.ps1 OK, Shared.Tests 1776 passed / 28 skipped,
Tray.Tests 1144 passed (+5).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@indierawk2k2
Contributor Author

Coverage gap found during PR creation flow — addressed in commit 9fdf999

Two new ghost Terminal frames appeared during the gh pr create invocation that opened this PR. Investigation:

Default terminal app on Win11 (the default since 22H2) is Windows Terminal:

    HKCU\Console\%%Startup\DelegationConsole = {2EACA947-7F5F-4CFA-BA87-8F7FBEEFBE69}  # Windows Terminal CLSID

That means every console-spawning child process — gh, git, dotnet, fresh pwsh sessions, agent-driven tool invocations — allocates a Cascadia hosting frame. Most close cleanly when the child exits; a small fraction leak under timing conditions.

The cleanup we already shipped only fires from triggers we wired:

  1. testhost lifetime (WinAppSdkGhostWindowCleanup [ModuleInitializer] + xUnit before/after attribute)
  2. build.ps1 end-of-build hook

gh pr create matches neither. So those two ghosts were structurally outside our coverage — not a logic bug, a coverage gap.

Fix (commit 9fdf999)

Added two new modes to scripts/cleanup-ghost-windows.ps1 for sessions where the wired triggers aren't enough:

  • -Daemon [-PollSeconds N] — foreground watcher (default 30s, range 5..3600). Use when you're about to do a lot of shell work; Ctrl+C to stop.
  • -InstallScheduledTask — registers a hidden 5-minute task under the current user (no admin needed, idempotent). Uninstall with -UninstallScheduledTask.

Validated install/uninstall round-trip on the test box. Filter logic, close-message sequence, and the 1000×500 safety guard are unchanged — only invocation modes are new.

AGENTS.md updated to explicitly call out the Win11-default-terminal-app trigger and the new escalation paths. GhostWindowCleanupScriptContractTests gains 5 new assertions pinning all four switch names and the scheduled-task name.

Where this leaves the PR

The PR can't fix the underlying Windows Terminal / ConPTY leak (that's a Microsoft-owned bug). What it can do — and now does — is ship a thorough recovery story across the full leak surface:

| Source of ghost | Caught by |
| --- | --- |
| Tray test process lifetime | WinAppSdkGhostWindowCleanup (in-process) |
| msbuild MSIX packaging | build.ps1 end-of-build hook |
| gh / git / ad-hoc shell work | -Daemon or -InstallScheduledTask (developer opt-in) |
| Killed-with-Ctrl+C testhost | One-shot scripts/cleanup-ghost-windows.ps1 (manual) |
| CI runner | CI VMs are ephemeral; build.ps1 hook keeps the runner clean for the job's lifetime |

Validation after the fix: ./build.ps1 OK, Shared.Tests 1776 passed / 28 skipped, Tray.Tests 1144 passed (+5). PR is now at 15 commits.

Recommendation for reviewers and any collaborator about to do heavy shell work on this branch: run ./scripts/cleanup-ghost-windows.ps1 -InstallScheduledTask once, then forget about it (or use -Daemon for the duration of a single session).

Mike Harsh and others added 2 commits May 19, 2026 20:55
Use architecture-specific AppInstaller metadata, embed update settings in the MSIX package, avoid force-shutdown updates by default, and add release-hosting validation coverage.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Seed packaged notification settings with a suppressed toast, make tile plating use a stable non-accent color, initialize tray chrome before optional startup work, and add startup breadcrumbs for post-install launch diagnostics.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Uninstall leaves behind empty directories in %LOCALAPPDATA%
