
fix: improve error diagnostics, log exposure, and session error display (issues #420, #424, #425) #427

Merged
PureWeen merged 7 commits into main from PP----Logged-Issues-orchestrator-f282
Mar 25, 2026

@PureWeen
Owner

Summary

Fixes three issues filed by @roji against PolyPilot v1.0.11 (installed via Homebrew).

Fixes #420, Fixes #424, Fixes #425


Issue #420 — Crash on macOS 15.7.4 (FoundationModels missing)

Root cause: Microsoft.Maui.Essentials.AI v10.0.50-ci.main.26126.2 bundles a native framework with a hard (LC_LOAD_DYLIB) dependency on FoundationModels.framework, which only exists on macOS 26+. On macOS 15.x, dyld aborts at launch before any managed code runs.

Fix: Updated Microsoft.Maui.Essentials.AI to v10.0.50-ci.main.26157.2, which links FoundationModels as LC_LOAD_WEAK_DYLIB — the app loads on older macOS and gracefully degrades (PolyPilot doesn't call any EssentialsAI APIs directly).


Issue #424 — "Failed to start server" with no useful error info

Changes:

  • ServerManager.StartServerAsync now captures stderr into a LastError property instead of discarding it silently
  • IServerManager exposes string? LastError { get; }
  • All 8 FallbackNotice error assignments in CopilotService now use a shared BuildServerFallbackNotice(error, logPath, reason) helper that consistently surfaces: the error reason, the actual error text (when available), the path to ~/.polypilot/event-diagnostics.log, and Settings guidance
  • Settings page StartServer() button now shows the actual error with a 10s dismiss (e.g. "Failed to start server: Port 4321 already in use") instead of the generic message
  • Added "View Logs" button in both the Dashboard fallback banner and the Settings Persistent Server section — opens ~/.polypilot/ in Finder/Explorer
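
The stderr capture described in the list above could be sketched roughly as follows. This is not the actual ServerManager code: the class name `StderrCaptureSketch`, the method `RunAndCaptureAsync`, and its parameters are assumptions made for illustration. Only the `LastError` property name, the 500 ms grace window, and the dispose-after-drain pattern are taken from this PR's discussion.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical stand-in for ServerManager; names other than LastError are assumptions.
public sealed class StderrCaptureSketch
{
    // Null when the process wrote nothing to stderr.
    public string? LastError { get; private set; }

    // Returns the exit code, or null if the process is still running after the timeout.
    public async Task<int?> RunAndCaptureAsync(string fileName, string arguments, TimeSpan timeout)
    {
        var stderrLines = new ConcurrentQueue<string>();
        var psi = new ProcessStartInfo(fileName, arguments)
        {
            RedirectStandardError = true,
            RedirectStandardOutput = true,
            UseShellExecute = false,
        };

        var process = Process.Start(psi)
            ?? throw new InvalidOperationException("Process failed to start.");

        // Drain both streams so the child never blocks on a full pipe.
        Task drainOut = process.StandardOutput.ReadToEndAsync();
        Task drainErr = Task.Run(async () =>
        {
            string? line;
            while ((line = await process.StandardError.ReadLineAsync()) != null)
                stderrLines.Enqueue(line);
        });

        Task exited = process.WaitForExitAsync();
        bool finished = await Task.WhenAny(exited, Task.Delay(timeout)) == exited;

        // Short grace window so late-arriving stderr lines are less likely to be missed.
        await Task.WhenAny(drainErr, Task.Delay(500));

        LastError = stderrLines.IsEmpty ? null : string.Join("\n", stderrLines);
        int? exitCode = finished ? process.ExitCode : (int?)null;

        // Release the OS handle once both streams close; a still-running child is left alone.
        _ = Task.WhenAll(drainOut, drainErr).ContinueWith(t => process.Dispose());
        return exitCode;
    }
}
```

A caller would check the returned exit code and, on failure, surface `LastError` in the UI instead of a generic message.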

Issue #425 — Session creation error hidden on first attempt

Root cause: CreateSessionForm.TriggerCreate() reset isExpanded = false immediately after OnCreate.InvokeAsync(). Blazor parameter updates (CreateError) arrive in the next render cycle, so the form collapsed before the error could be shown. On the second click, the persisted error state became visible.

Fix:

  • Removed auto-collapse from TriggerCreate()
  • Added public void CollapseOnSuccess() method to CreateSessionForm
  • SessionSidebar.HandleCreateSession and HandleCreateGroup now call CollapseOnSuccess() only after verifying success — error leaves the form open and visible immediately
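
The control flow of this fix can be illustrated without Blazor. The sketch below is a simplified simulation, not the real components: `CreateSessionFormSketch`, `SessionSidebarSketch`, and the `createSession` delegate are hypothetical, and rendering is left out entirely. It only shows that the form stays expanded unless the parent explicitly reports success.

```csharp
using System;
using System.Threading.Tasks;

// Simplified stand-in for the form component; no rendering involved.
public sealed class CreateSessionFormSketch
{
    public bool IsExpanded { get; private set; } = true;
    public Func<Task>? OnCreate { get; set; }

    // No auto-collapse here: the parent decides once the outcome is known.
    public async Task TriggerCreate()
    {
        if (OnCreate != null) await OnCreate();
    }

    public void CollapseOnSuccess() => IsExpanded = false;
}

// Simplified stand-in for the parent sidebar.
public sealed class SessionSidebarSketch
{
    private readonly CreateSessionFormSketch _form;
    private readonly Func<Task> _createSession; // hypothetical creation delegate

    public string? CreateError { get; private set; }

    public SessionSidebarSketch(CreateSessionFormSketch form, Func<Task> createSession)
    {
        _form = form;
        _createSession = createSession;
        _form.OnCreate = HandleCreateSession;
    }

    public async Task HandleCreateSession()
    {
        try
        {
            await _createSession();
            CreateError = null;
            _form.CollapseOnSuccess(); // success path only
        }
        catch (Exception ex)
        {
            CreateError = ex.Message; // form stays expanded so the error is visible
        }
    }
}
```

With a failing delegate, `IsExpanded` remains true and `CreateError` holds the message after the first `TriggerCreate()` call, which is the behavior the fix is after.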

Tests

  • 4 new unit tests for BuildServerFallbackNotice: verifies error text, log path, empty-error handling, and custom reason string
  • Fixed pre-existing flaky test race (UrgencySortTests missing [Collection("BaseDir")] attribute)
  • All 2927 tests pass

@PureWeen
Owner Author

🔍 PR Review Squad — Round 1

PR Summary

Three-fix PR addressing reported issues #420, #424, and #425.

CI Status

⚠️ No checks configured on branch

Build & Tests

  • ✅ Build: 0 errors
  • BuildServerFallbackNotice unit tests pass (4 new)
  • UrgencySortTests flaky fix ([Collection("BaseDir")]) verified

Findings

| # | Severity | Location | Description | Consensus |
|---|---|---|---|---|
| N1 | 🟡 MODERATE | CopilotService.cs:905,1202,1227,1299,1320 | BuildServerFallbackNotice() template always says "fell back to Embedded mode. Your sessions won't persist across restarts.", but 5 of 7 call sites do NOT switch to Embedded mode. Lines 905/1202/1227/1299/1320 set FallbackNotice while remaining in Persistent mode with IsInitialized = false. Only lines 845 and 914 actually do CurrentMode = ConnectionMode.Embedded. This misleads users into thinking their sessions are running in a degraded-but-alive Embedded fallback when the server is actually down. | 4/5 |
| N2 | 🟢 MINOR | Dashboard.razor:OpenLogsFolder | No else branch for Linux/GTK; silently does nothing on non-Mac/non-Windows. Settings.OpenLogsFolder already has an else { ShowStatus("Logs at: ...", ...) } fallback. Copy that pattern to Dashboard. | 4/5 |
| N3 | 🟢 MINOR | ServerManager.cs:130 | string.Join("\n", stderrLines) is called immediately after the 15s timeout without awaiting the stderr drain task (t2). The ConcurrentQueue is thread-safe (no crash), but late-arriving stderr lines may be missing from LastError. Acceptable for best-effort diagnostics: the most common error ("port in use") is typically written immediately at process start. | 4/5 |

Cleared Concerns

  • Form field reset regression: ✅ NOT a bug — Collapse() (line 243) explicitly resets branchInput, prInput, sessionDirectory, initialPrompt, selectedWorktreeId, and nameManuallyEdited. All 5 models confirmed.
  • CollapseOnSuccess() null-safety: ✅ createSessionFormRef?.CollapseOnSuccess() null-conditional is correct.
  • Process disposal: ✅ Task.WhenAll(t1, t2).ContinueWith(_ => process.Dispose()) correctly disposes the OS handle when both streams close, while the detached server process keeps running.

Suggested Fix for N1

The simplest fix is to split the template — the reason parameter already differentiates the cases, so the "fell back to Embedded mode" clause can be conditional:

internal static string BuildServerFallbackNotice(string? serverError, string logPath, string reason = "couldn't start", bool embeddedFallback = true)
{
    var detail = string.IsNullOrEmpty(serverError) ? "" : $"\n\nError: {serverError}";
    var fallbackClause = embeddedFallback
        ? " — fell back to Embedded mode. Your sessions won't persist across restarts."
        : " — reconnection failed.";
    return $"Persistent server {reason}{fallbackClause}{detail}\n\nLogs: {logPath}\n\nGo to Settings → Save & Reconnect to fix.";
}

Then pass embeddedFallback: false at the 5 non-Embedded call sites.

Test Coverage

4 new unit tests cover BuildServerFallbackNotice. No test currently covers the "Embedded vs non-Embedded" message distinction — worth adding one per the fix above.

Verdict: ⚠️ Request changes

N1 is a genuine semantic bug in the error message for 5 of 7 call sites. N2 and N3 are non-blocking minor nits.

@PureWeen
Owner Author

PR #427 -- Round 2 Re-Review (Aggregated)

New commit since Round 1: 0574dc30 -- "fix: address PR review feedback"

Round 1 Findings Status

| # | Finding | Status |
|---|---|---|
| M1 🟡 | BuildServerFallbackNotice misleading "fell back to Embedded mode" at 5/7 non-Embedded call sites | FIXED -- new embeddedFallback param; 5 non-Embedded sites pass false, 2 actual fallback sites use true |
| M2 🟡 | Dashboard.OpenLogsFolder missing Linux/GTK branch | FIXED -- added else branch showing log path inline for 8s |
| N1 🟢 | stderr snapshot race in ServerManager.StartServerAsync | FIXED -- added await Task.WhenAny(t2, Task.Delay(500)) grace window |

Build & Tests

  • Mac Catalyst: ✅ 25 warnings, 0 errors
  • Tests: ✅ All pass (6 new BuildServerFallbackNotice tests including EmbeddedFallback True/False variants)

New Findings: None

All Round 1 issues addressed correctly. No new issues found.

Verdict: ✅ Approve

Review by PR Review Squad (5 workers, multi-model consensus)

PureWeen and others added 5 commits March 25, 2026 08:49
…re, session create error

Issue #420 (crash on macOS 15 — FoundationModels missing):
- Upgrade Microsoft.Maui.Essentials.AI from 26126.2 to 26157.2
- The 26126.2 binary hard-links FoundationModels.framework (no 'weak' flag)
  so dyld aborts on macOS 15 before any managed code runs. The 26157.2 build
  (released March 7) already ships with FoundationModels as a weak-linked
  dependency, allowing the app to load on older macOS without Apple Intelligence.

Issue #424 ("Failed to start server" with no useful error info):
- ServerManager.StartServerAsync now captures process stderr into LastError
- IServerManager exposes LastError property
- FallbackNotice for persistent-server failures now includes the actual error
  message and the path to the event-diagnostics.log file so users can
  self-diagnose without filing a bug
- Dashboard shows a "View Logs" button on the fallback-notice card that opens
  the ~/.polypilot/ log directory in Finder/Explorer

Issue #425 ("Service not initialized" error hidden on first Create attempt):
- Root cause: CreateSessionForm.TriggerCreate() always set isExpanded=false
  after OnCreate.InvokeAsync(), collapsing the form before Blazor could render
  the CreateError that the parent just set. Error only appeared on second click.
- Fix: Remove auto-collapse from TriggerCreate; add public CollapseOnSuccess()
  method that the parent calls explicitly only after a successful session creation.
- HandleCreateSession and HandleCreateGroup now call createSessionFormRef
  .CollapseOnSuccess() on the success path.

Also:
- Fix flaky UrgencySortTests test class: add [Collection("BaseDir")] to prevent
  race with other BaseDir-mutating test classes running in parallel

All 2923 tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…der; extract BuildServerFallbackNotice helper

- Settings.razor StartServer() now shows the actual error from ServerManager.LastError
  with a 10s dismiss and log path fallback (instead of plain 'Failed to start server')
- Add 'View Logs' button in Settings Persistent Server section → opens ~/.polypilot/ in Finder
- Extract CopilotService.BuildServerFallbackNotice() static helper used by all three
  FallbackNotice-setting sites (InitializeAsync, server recovery, server restart)
- Add 4 unit tests for BuildServerFallbackNotice covering error content, log path,
  empty error, and custom reason string

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ths to BuildServerFallbackNotice

- Remove duplicate 'Go to Settings' appended after BuildServerFallbackNotice (the
  helper already includes Settings guidance), fixing double-text user-visible bug
- Migrate all remaining hardcoded FallbackNotice error assignments to use the helper:
  * Version mismatch restart failed (line ~905) -- adds log path
  * Version mismatch fallback to Embedded (line ~914) -- adds log path
  * Recovery catch block (line ~1228) -- now includes ex.Message as error detail
  * Restart connection failure (line ~1322) -- now includes ex.Message as error detail
- Every error-path FallbackNotice now consistently shows: reason, optional error
  detail, log file path, and Settings guidance

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ard Linux fallback, stderr grace period

N1 (MODERATE): BuildServerFallbackNotice now takes embeddedFallback parameter (default true).
Five non-Embedded call sites now pass embeddedFallback: false so they no longer
incorrectly say 'fell back to Embedded mode. Your sessions won't persist across
restarts.' — those paths keep the server down but don't switch to Embedded mode.
Two Embedded sites (lines 845, 914) retain embeddedFallback: true.
Added 2 new tests covering embeddedFallback=true and embeddedFallback=false behavior.

N2 (MINOR): Dashboard.OpenLogsFolder now shows the log directory path inline for
8 seconds on Linux/GTK platforms (matching Settings.razor's ShowStatus fallback).
Added _openLogsStatus field; button markup shows the status string when set.

N3 (MINOR): ServerManager.StartServerAsync now awaits Task.WhenAny(t2, Task.Delay(500))
before reading stderrLines after the 15s timeout, giving the stderr drain task a
500ms grace window to capture any final output from the failing server process.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
crash.log had no size limit — it appends on every unhandled exception.
Now rotates at 5 MB: old file moved to crash.log.old (one backup kept),
matching the pattern used by event-diagnostics.log (10 MB) and
plugin.log (5 MB).

All other polypilot logs already have size management:
- event-diagnostics.log: deletes at 10 MB
- plugin.log: rotates at 5 MB with .old backup
- console.log: truncated by nohup redirect on each relaunch
- audit_logs/: daily rotation + 30-day purge on startup

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen force-pushed the PP----Logged-Issues-orchestrator-f282 branch from 57a17af to 2ce12dd March 25, 2026 13:49
@PureWeen
Owner Author

PR #427 — Round 3 Review

New commit since Round 2 (✅ Approve): 2ce12dd6 — "fix: add 5 MB rotation to crash.log to prevent unbounded growth"

New Commit Analysis

The new commit adds 8 lines to LogException() in MauiProgram.cs to rotate crash.log at 5 MB (move to .old, one backup kept). This matches the existing pattern used by plugin.log (5 MB rotation) and event-diagnostics.log (10 MB delete).

Potential edge cases checked:

  • Concurrent rotation: LogException is called from UnhandledException/UnobservedTaskException handlers (background threads). If two fire simultaneously, one File.Move succeeds and one fails silently — AppendAllText then creates a new file. No crash, no data loss.
  • Failed File.Delete on backup: If the .old file is locked, File.Move also fails silently, and the log continues growing past 5 MB. Acceptable "best effort" behavior — identical to the plugin.log pattern.
  • new FileInfo(CrashLogPath) overhead: Negligible since crash logging is rare.
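
For readers who have not seen the diff, the rotation pattern discussed above could look roughly like this. It is a sketch under assumptions, not the code in MauiProgram.cs: the class `CrashLogSketch`, the `Append` signature, the `>=` comparison, and the interpretation of 5 MB as 5 * 1024 * 1024 bytes are all illustrative choices.

```csharp
using System;
using System.IO;

// Hypothetical sketch of size-based rotation with a single backup file.
public static class CrashLogSketch
{
    public const long MaxBytes = 5 * 1024 * 1024; // assumed meaning of "5 MB"

    public static void Append(string logPath, string message, long maxBytes = MaxBytes)
    {
        try
        {
            var info = new FileInfo(logPath);
            if (info.Exists && info.Length >= maxBytes)
            {
                var backup = logPath + ".old";
                File.Delete(backup);        // no-op when the backup does not exist
                File.Move(logPath, backup); // keep exactly one backup
            }
        }
        catch (Exception)
        {
            // Best effort: if rotation fails (locked file, lost race), keep appending.
        }

        // Creates the file when it is missing, for example right after a rotation.
        File.AppendAllText(logPath, message + Environment.NewLine);
    }
}
```

Swallowing the rotation exception matches the "best effort" behavior described in the edge cases: a failed move means the log grows past the threshold rather than the logger throwing from an exception handler.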

Round 1 + Round 2 Findings: All ✅ FIXED (from prior reviews)

| # | Finding | Status |
|---|---|---|
| M1 🟡 | BuildServerFallbackNotice misleading "fell back to Embedded mode" at 5/7 non-Embedded sites | ✅ FIXED |
| M2 🟡 | Dashboard.OpenLogsFolder missing Linux/GTK branch | ✅ FIXED |
| N1 🟢 | stderr snapshot race in ServerManager.StartServerAsync | ✅ FIXED |

New Findings: None

CI: No checks configured on branch.

Verdict: ✅ Approve

New commit is clean and consistent with existing log management patterns.

Review by PR Review Squad — Worker 5

PureWeen and others added 2 commits March 25, 2026 09:10
- Removed View Logs button from Dashboard FallbackNotice banner
- Added new 'Diagnostics' section at bottom of Settings page
- Renamed button to 'Open Log Folder' with folder icon
- Added guidance text explaining each log file:
  - event-diagnostics.log: session & server diagnostics
  - crash.log: unhandled exceptions
  - console.log: general app output
- Searchable via Settings search (logs, diagnostics, troubleshoot, etc.)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen
Owner Author

PR #427 — Round 4 Re-Review

New commits since Round 3 (✅ Approved):

  • 6aa617e8 — move View Logs from Dashboard FallbackNotice banner → new Settings Diagnostics section
  • 4d9dde84 — add Diagnostics nav item to Settings sidebar

Tests: ✅ 2929 passing, 0 failing
CI: No checks configured on branch


Prior Findings Status

| # | Finding | Status |
|---|---|---|
| N1 🟡 | BuildServerFallbackNotice misleading "Embedded mode" at 5/7 non-Embedded sites | ✅ FIXED |
| N2 🟡 | Dashboard.OpenLogsFolder missing Linux/GTK branch | ✅ FIXED (removed from Dashboard; Settings version has else branch) |
| N3 🟢 | stderr snapshot race in ServerManager.StartServerAsync | ✅ FIXED |

New Findings

🟢 MINOR — console.log not created on Windows (5/5 models)

Settings.razor — Diagnostics log file guide

The guide lists console.log as "General app output — verbose debug logging from the running app." This file is created by relaunch.sh which redirects nohup ... > ~/.polypilot/console.log 2>&1. relaunch.sh is a macOS-only bash script — Windows users launch differently. The Diagnostics section is already PlatformHelper.IsDesktop-gated, but Windows is desktop. A user on Windows following this guide won't find console.log.

Suggested fix: Add "(macOS only — written by relaunch.sh)" qualifier, or conditionally render that list item only on macOS.


🟢 MINOR — GoToSettings() indentation regression (4/5 models)

Dashboard.razor:714

The method declaration has 8-space leading indent while its body and all surrounding methods use 4-space. Introduced as a diff artifact of the OpenLogsFolder removal. Cosmetic only — compiles fine.


🟢 MINOR (not consensus) — Scroll-spy ids missing settings-diagnostics

Settings.razor:812

var ids = ['settings-connection','settings-cli','settings-ui','settings-developer'];

settings-diagnostics is not observed by the IntersectionObserver, so the sidebar nav won't auto-highlight "Diagnostics" when the user scrolls there manually. However, this is pre-existing behavior: settings-plugins is also absent from this list on main. Clicking the nav item works correctly (explicit activeCategory = "diagnostics" assignment). Not new debt.


Not Flagged / Confirmed Clean

  • UX: View Logs removed from banner — 3/5 models accepted this. The FallbackNotice text already includes the log path inline (Logs: ~/.polypilot/event-diagnostics.log), and the "Settings" button provides navigation. Acceptable tradeoff. ✅
  • Form field reset regression (fix for #425, Can't create session: "Service not initialized. Call InitializeAsync first."): Collapse() resets all fields; CollapseOnSuccess() correctly delegates. ✅
  • crash.log rotation (5 MB) — consistent with plugin.log pattern. ✅
  • BuildServerFallbackNotice helper — correct embeddedFallback usage at all 7 call sites. ✅
  • Microsoft.Maui.Essentials.AI bump — weak link fix confirmed. ✅

Verdict: ✅ Approve

Two new commits are clean. The console.log documentation inaccuracy on Windows is the only consensus finding and it's minor — the file just won't be there, which is harmless. All substantive prior findings remain fixed. Good to merge.

Review by PR Review Squad — Round 4 (5 workers: 2× claude-opus-4.6, claude-sonnet-4.6, gemini-3-pro-preview, gpt-5.3-codex)

@PureWeen PureWeen merged commit a6bffb4 into main Mar 25, 2026
@PureWeen PureWeen deleted the PP----Logged-Issues-orchestrator-f282 branch March 25, 2026 14:42
PureWeen added a commit that referenced this pull request Mar 25, 2026
…tics in Release (#433)

## Summary

Fixes the "Failed to start server: ErrorStartingProcess, copilot, No
such file or directory" error reported by @roji when installing
PolyPilot via Homebrew. Also fixes `event-diagnostics.log` never being
created in Release builds (oversight from PR #427).

---

### Bug 1 — Bundled copilot CLI not found in AOT builds

**Root cause:** In Release/AOT Mac Catalyst builds, `Assembly.Location`
for `GitHub.Copilot.SDK` either returns empty (when the `.dll` is
stripped entirely) or points to a `.xamarin/maccatalyst-{arch}/`
subdirectory. The copilot binary lives in `MonoBundle/` root.
`ResolveBundledCliPath()` only checked paths relative to
`Assembly.Location`, so it missed the binary.

Both `ServerManager.FindCopilotBinary()` and `CreateClient()` then fell
back to bare `"copilot"` → ENOENT → server fails → embedded fallback
fails → `IsInitialized = false` → every session creation throws "Service
not initialized."

**Fix:** Added `AppContext.BaseDirectory` as a third fallback in
`ResolveBundledCliPath()`. It always points to `MonoBundle/` regardless
of build config or AOT stripping.
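
A rough sketch of the lookup order described above, under stated assumptions: `BundledCliSketch.Resolve` is a hypothetical name, and the real `ResolveBundledCliPath()` checks more than one location relative to `Assembly.Location`, which this sketch collapses into a single candidate. The base directory is passed in as a parameter so the fallback can be exercised in isolation.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch; not the actual ResolveBundledCliPath() implementation.
public static class BundledCliSketch
{
    // Returns the first existing candidate, or null so the caller can fall back.
    public static string? Resolve(string binaryName, string? assemblyLocation, string baseDirectory)
    {
        var candidates = new List<string>();

        // Assembly.Location can be empty in AOT builds, so this candidate is optional.
        if (!string.IsNullOrEmpty(assemblyLocation))
        {
            var dir = Path.GetDirectoryName(assemblyLocation);
            if (!string.IsNullOrEmpty(dir))
                candidates.Add(Path.Combine(dir, binaryName));
        }

        // Final fallback that does not depend on Assembly.Location.
        candidates.Add(Path.Combine(baseDirectory, binaryName));

        foreach (var path in candidates)
        {
            if (File.Exists(path)) return path;
        }
        return null;
    }
}
```

In the app, the call would presumably pass `AppContext.BaseDirectory` as the last argument, for example `BundledCliSketch.Resolve("copilot", assembly.Location, AppContext.BaseDirectory)`.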

### Bug 2 — `event-diagnostics.log` missing in Release builds

**Root cause:** The diagnostic log writer in `Debug()` was wrapped in
`#if DEBUG` (added Feb 18 as a dev-only feature). PR #427 then added 8
`BuildServerFallbackNotice` references telling users to check this log
file — but it never exists in Release builds.

**Fix:** Removed the `#if DEBUG` guard. The log already has a 10MB
rotation guard. Also expanded the filter to capture `[ERROR]`,
`[ABORT]`, `[BRIDGE]`, and `"Failed to"` messages so initialization
failures are logged.

### Diagnostic logging

Added `Console.WriteLine` logging in `ServerManager.FindCopilotBinary()`
showing which path was selected and, when the bundled binary is not
found, logging `Assembly.Location` and `AppContext.BaseDirectory` for
diagnosis.

---

### Tests

- 3 new tests for `AppContext.BaseDirectory` fallback behavior
- All 2938 tests pass (1 pre-existing flaky timing test)

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>