Skip to content

fix: surface swallowed errors across crates and emit turn failures to activity log#628

Merged
wpfleger96 merged 2 commits into
mainfrom
wpfleger/fix-jim-agent-error-logging
May 20, 2026
Merged

fix: surface swallowed errors across crates and emit turn failures to activity log#628
wpfleger96 merged 2 commits into
mainfrom
wpfleger/fix-jim-agent-error-logging

Conversation

@wpfleger96
Copy link
Copy Markdown
Collaborator

@wpfleger96 wpfleger96 commented May 20, 2026

Fix a systemic pattern of swallowed or under-logged errors across sprout-acp, sprout-relay, and sprout-media, and introduce turn_error/agent_panic observer events so agent failures appear in the Activity Log UI.

The original trigger was Jim agent silently dead-lettering every batch dispatch because PromptOutcome::Error was logged at debug without the error value — completely invisible at the default sprout_acp=info log level. An audit revealed the same pattern across multiple crates: errors discarded with let _ =, logged at debug in critical paths, or logged without the actual error content.

Error logging fixes:

  • sprout-acp: upgrade AgentExited/Timeout and relay observer publish failures from debug!warn!; add error = %e to error logs that omitted it
  • sprout-acp: upgrade reaction sign/build failures from debug!warn! (code bugs, not transient network errors)
  • sprout-relay: upgrade channel reconciliation failures from debug!warn!; replace silent let _ = on DB and git config operations with logged errors; fix .unwrap_or_default() that hid DB query failures; capture discarded enforcement error in auth handler
  • sprout-relay: upgrade NIP-OA auth tag validation failures from debug!info! for security auditability
  • sprout-media: add missing error = %e to upload metadata warn!

Activity log integration:

  • Emit turn_error observer event when a prompt fails (PromptOutcome::Error, AgentExited, Timeout) so failures appear as red lifecycle entries in the Agent Activity Log panel
  • Emit agent_panic observer event when an agent task panics, with the JoinError message
  • Frontend handler in agentSessionTranscript.ts renders both as lifecycle items with error-flagged titles, which the existing LifecycleItem component styles in text-destructive

Application-class errors (IdleTimeout, HardTimeout, Json, etc.) were
logged at debug level without the error value, making them invisible at
the default sprout_acp=info log level. Transport errors had the right
level but also dropped the error content.
@wpfleger96 wpfleger96 marked this pull request as ready for review May 20, 2026 16:07
@wpfleger96 wpfleger96 requested a review from wesbillman as a code owner May 20, 2026 16:07
Errors across sprout-acp, sprout-relay, and sprout-media were logged at
debug level or silently discarded, making failures invisible at the
default info log level. Upgrades critical error paths to warn/info and
adds error values where they were missing.

Introduces turn_error and agent_panic observer events so prompt failures
and agent crashes appear as red lifecycle entries in the Agent Activity
Log panel, closing the gap where failed turns simply vanished from the
UI timeline.
@wpfleger96 wpfleger96 changed the title fix(sprout-acp): surface agent prompt errors in warn logs fix: surface swallowed errors across crates and emit turn failures to activity log May 20, 2026
@wpfleger96 wpfleger96 merged commit c8f99bb into main May 20, 2026
15 checks passed
@wpfleger96 wpfleger96 deleted the wpfleger/fix-jim-agent-error-logging branch May 20, 2026 18:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants