Skip to content

feat: capture and display ACP error responses in Discord#170

Merged
thepagent merged 5 commits intoopenabdev:mainfrom
the3mi:feat/error-handling
Apr 10, 2026
Merged

feat: capture and display ACP error responses in Discord#170
thepagent merged 5 commits intoopenabdev:mainfrom
the3mi:feat/error-handling

Conversation

@the3mi
Copy link
Copy Markdown
Contributor

@the3mi the3mi commented Apr 9, 2026

Summary

Closes #50 — Display meaningful error messages in Discord instead of _(no response)_.

This PR unifies error handling for both startup/connection errors and ACP response errors into a single display layer.

Changes

Two-tier error formatting

Error type Source Formatter
Startup / connection errors pool.rs, connection.rs format_user_error()
ACP response errors (coded) ACP notification (JSON-RPC/HTTP) format_coded_error()

format_user_error(message) — handles startup/connection errors

No error code available (anyhow errors), uses keyword matching on the message:

Keyword Output
timeout waiting for <method> Request Timeout — Timeout waiting for {method}, please try again.
connection closed / channel closed Connection Lost — The connection to the agent was lost, please try again.
failed to spawn / no such file Agent Not Found — Could not start the agent — please check your configuration.
pool exhausted Service Busy — All agent sessions are in use, please try again shortly.
invalid api key / unauthorized Unauthorized — Please check your API key configuration.

format_coded_error(code, message) — handles ACP response errors

Protocol-level codes only (JSON-RPC -326xx, HTTP 4xx/5xx). No provider-specific strings.
Provider-agnostic: message text passed through verbatim from upstream agent.

Code Output
400 Bad Request
401 Unauthorized
429 Rate Limited
500 Internal Server Error
503 Service Unavailable
-32602 Invalid Params
-32000 Connection Error
... ...

Startup error path — pool.get_or_create()

pool.rs errors (pool exhausted, spawn failures, timeouts) now pass through format_user_error() before displaying in Discord.

Design decisions

  • No string matching for coded errors — relies on JSON-RPC/HTTP status codes only.
  • No hardcoded provider strings — error messages are provider-agnostic.
  • Reusable formatterformat_user_error() can be called from any error site (startup, response, timeout).
  • Linked to improvement: show meaningful status messages instead of '...' #50 — aligns with the issue to have a unified error display strategy.

Related

Related: #50

@the3mi the3mi requested a review from thepagent as a code owner April 9, 2026 22:24
Copy link
Copy Markdown
Contributor

@JARVIS-coding-Agent JARVIS-coding-Agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this — showing meaningful errors instead of _(no response)_ is definitely the right direction.

However, this PR has a few design concerns that I think we should address before merging. The main issues are:

  1. No linked issue — This is closely related to #50 ("show meaningful status messages instead of '...'"), which covers both startup failures and response errors. We should align the approach there before merging partial solutions.
  2. Provider coupling — The error formatting layer hardcodes Anthropic-specific strings, but openab is provider-agnostic.
  3. Fragile detection logic — String-based contains() checks are brittle and will silently break if upstream error formats change.

See inline comments for specifics. I'd suggest discussing the overall error display strategy in #50 first, then coming back with a more unified approach.

Comment thread src/discord.rs Outdated
fn format_error(message: &str, code: i64) -> String {
match code {
401 | -32602 => {
if message.contains("API_KEY") || message.contains("api_key")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This string matching is quite fragile. message.contains("401") will match any message that happens to contain "401" as a substring (e.g. a message referencing error count "1401 errors occurred"). Same concern for the other contains() checks — if the upstream provider changes their error message wording, this silently falls through to the generic Invalid Request branch.

Since you already have the error code available, you could just rely on the code for categorization and skip the string matching entirely:

401 => "**API Key / Auth Error**\nPlease check your API key configuration.".to_string(),
-32602 => format!("**Invalid Request**\n{}", message),

This is less clever but much more robust.

Comment thread src/discord.rs Outdated
format!("**Invalid Request**\n{}", message)
}
429 => "**Rate Limit**\nToo many requests, please try again later.".to_string(),
529 => "**Service Overloaded**\nClaude API is under high load, please try again later.".to_string(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoding Claude API here couples the Discord display layer to a specific LLM provider. openab is an ACP broker that should work with any ACP-compatible agent — the error messages shouldn't assume Anthropic.

Consider using a generic description like:

529 => "**Service Overloaded**\nThe upstream API is under high load, please try again later.".to_string(),
503 => "**Service Unavailable**\nThe upstream API cannot handle requests right now, please try again later.".to_string(),

Comment thread src/discord.rs Outdated
|| message.contains("unauthorized") || message.contains("Invalid API")
{
return "**API Key Error**\nPlease ensure `ANTHROPIC_API_KEY` is set and valid.".to_string();
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the return message also hardcodes ANTHROPIC_API_KEY. If someone is running openab with a different provider (e.g. OpenAI, local model), this message would be confusing.

Consider a generic message like:

"**API Key Error**\nPlease ensure your API key is set and valid."

Comment thread src/discord.rs Outdated
out
}

/// Format error from ACP agent for display in Discord.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function only covers ACP response errors, but #50 also identifies agent startup failures as a major UX gap (e.g. timeout waiting for session/new, connection closed, binary not found). If we merge this now, we'll end up with two separate error display paths that need to be reconciled later.

Worth considering: should this be a more general format_user_error() that can be reused for both startup and response errors?

When the ACP agent returns an error, display a user-friendly message
instead of '_(no response)_'.

- Capture response_error from ACP notification before loop exits
- Show error on empty response or append to existing content
- format_error() uses protocol-level codes (JSON-RPC / HTTP) - no provider-specific strings
- Provider-agnostic: message text passed through verbatim from upstream agent
@the3mi the3mi force-pushed the feat/error-handling branch from 030d400 to eb60981 Compare April 9, 2026 23:31
Two error formatters:
- format_user_error(message: &str): handles startup/connection errors from pool.rs
  - timeout waiting for session/new → Request Timeout
  - connection/channel closed → Connection Lost
  - failed to spawn/no such file → Agent Not Found
  - pool exhausted → Service Busy
  - invalid api key/unauthorized → Unauthorized

- format_coded_error(code, message): handles ACP response errors (JSON-RPC/HTTP codes)

Both paths now use format_user_error() for startup errors and
format_coded_error() for response errors. Link to openabdev#50.
Copy link
Copy Markdown
Contributor

@JARVIS-coding-Agent JARVIS-coding-Agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice revision — the main concerns from the previous review are all addressed:

  • ✅ Linked to #50
  • ✅ Provider-specific strings removed from format_coded_error (no more ANTHROPIC_API_KEY, Claude API, or code 529)
  • format_coded_error now uses code-only matching — no fragile contains() on upstream messages
  • ✅ Unified two-tier design: format_user_error() for startup, format_coded_error() for response
  • ✅ Startup path (pool.get_or_create) now uses format_user_error() instead of hardcoded string

One minor nit left (see inline comment), but not a blocker. LGTM overall.

Comment thread src/discord.rs Outdated
.unwrap_or("the agent");
return format!("**Agent Not Found**\nCould not start {} — please check your configuration.", cmd);
}
if msg_lower.contains("pool exhausted") {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: there's still a w.contains("claude") here. This means if someone runs a non-Claude agent and the spawn fails, the error message won't extract the command name correctly (it'll fall back to "the agent", which is fine, but the claude check is dead weight in that case).

Consider just removing the claude check and keeping only w.contains("agent"):

.find(|w| w.contains("agent"))

Or even simpler — just always say "the agent" since the binary name is an implementation detail the Discord user doesn't care about.

Non-blocking, can be a follow-up.

Copy link
Copy Markdown
Collaborator

@chaodu-agent chaodu-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on the two-tier error handling design. A few suggestions:

Should fix before merge

  1. Remove "claude" hardcode in format_user_error — The .contains("claude") check in the "failed to spawn" branch contradicts the provider-agnostic goal. Suggest removing it and keeping only "agent" or using a generic fallback.

Suggestions (non-blocking)

  1. Clarify error + partial content behavior — When ACP returns an error but also has partial text, both are displayed together. This could confuse users. Worth confirming this case actually happens in practice, and if not, adding a comment explaining why.

  2. Make format_coded_error publicformat_user_error is pub but format_coded_error is private. For consistency and future reuse (e.g. Slack adapter), consider making it pub as well.

  3. Add unit tests — Both format_user_error and format_coded_error are pure functions, ideal candidates for unit tests.

  4. Cache regex in strip_mention (nice-to-have) — The regex is recompiled on every call. Could use LazyLock or lazy_static! to cache it.

Overall the design is clean and the code quality is solid. Just the "claude" string in item 1 should be addressed before merging. 👍

Address reviewer feedback:
- Remove .contains("claude") from failed_to_spawn branch - was contradicting provider-agnostic goal
- Make format_coded_error public for reuse by other adapters
- Add comment explaining error + partial content display behavior
- Add 17 unit tests for format_user_error and format_coded_error
Copy link
Copy Markdown
Collaborator

@chaodu-agent chaodu-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall clean PR 👍 — one bug to fix (timeout method extraction case sensitivity), rest are suggestions.

Comment thread src/discord.rs Outdated

// Startup / connection errors (code == 0 from anyhow)
if msg_lower.contains("timeout waiting for") {
let method = message
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Bug: message.split("timeout waiting for ") uses the original casing, but the match above uses msg_lower. If the original message has different casing (e.g. "Timeout Waiting For session/new"), this split won't match and will fall back to "request".

Suggestion: split on msg_lower instead:

let method = msg_lower
    .split("timeout waiting for ")
    .nth(1)
    .unwrap_or("request");

Or use a case-insensitive approach to extract the method name from the original message.

Comment thread src/discord.rs Outdated
format!("**Error**\n{}", message)
}
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Suggestion: Since both format_user_error and format_coded_error are marked pub and the doc comment mentions reuse by other adapters (e.g. Slack), consider extracting these into a separate module like src/error_display.rs. That way other adapters don't need to depend on the discord module.

Comment thread src/discord.rs Outdated
-32602 => "**Invalid Params**",
-32603 => "**Internal Error**",
-32000 => "**Connection Error**",
_ => "**Error**",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Nice-to-have: JSON-RPC spec reserves -32000 to -32099 for server errors. Could add a range match as a catch-all:

-32000..=-32099 => "**Server Error**",

so any server-defined error in that range gets a reasonable label instead of the generic **Error**.

Comment thread src/discord.rs
let final_content = compose_display(&tool_lines, &text_buf);
// If ACP returned both an error and partial text, show both.
// This can happen when the agent started producing content before hitting an error
// (e.g. context length limit, rate limit mid-stream). Showing both gives users
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Good comment explaining why both error and partial text are shown together. This is a thoughtful UX decision.

OpenClaw Bot added 2 commits April 10, 2026 21:06
Address reviewer feedback from openabdev#170:
- Fix case sensitivity bug in timeout method extraction (use msg_lower.find)
- Extract format_user_error + format_coded_error to separate module
- JSON-RPC -32099..=-32000 range catch-all for server errors
- Add mixed-case timeout test to verify fix
- Tests moved to error_display module (19 total, all pass)

The formatters are now reusable by other adapters (Slack, etc.)
without depending on the discord module.
Regex is now compiled once at startup via std::sync::LazyLock,
instead of on every call. Covers the nice-to-have reviewer item.
Copy link
Copy Markdown
Collaborator

@chaodu-agent chaodu-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 All review feedback addressed cleanly — case sensitivity fix, module extraction, range match, and LazyLock. One tiny typo: "case-insistent" → "case-insensitive" in error_display.rs L14, but not blocking. Ship it! 🚀

@thepagent thepagent merged commit 3be10b0 into openabdev:main Apr 10, 2026
pahud pushed a commit that referenced this pull request Apr 11, 2026
* feat: capture and display ACP error responses in Discord

When the ACP agent returns an error, display a user-friendly message
instead of '_(no response)_'.

- Capture response_error from ACP notification before loop exits
- Show error on empty response or append to existing content
- format_error() uses protocol-level codes (JSON-RPC / HTTP) - no provider-specific strings
- Provider-agnostic: message text passed through verbatim from upstream agent

* feat: add format_user_error() for unified error display

Two error formatters:
- format_user_error(message: &str): handles startup/connection errors from pool.rs
  - timeout waiting for session/new → Request Timeout
  - connection/channel closed → Connection Lost
  - failed to spawn/no such file → Agent Not Found
  - pool exhausted → Service Busy
  - invalid api key/unauthorized → Unauthorized

- format_coded_error(code, message): handles ACP response errors (JSON-RPC/HTTP codes)

Both paths now use format_user_error() for startup errors and
format_coded_error() for response errors. Link to #50.

* fix: remove hardcoded 'claude' from format_user_error

Address reviewer feedback:
- Remove .contains("claude") from failed_to_spawn branch - was contradicting provider-agnostic goal
- Make format_coded_error public for reuse by other adapters
- Add comment explaining error + partial content display behavior
- Add 17 unit tests for format_user_error and format_coded_error

* refactor: extract error formatters to src/error_display.rs module

Address reviewer feedback from #170:
- Fix case sensitivity bug in timeout method extraction (use msg_lower.find)
- Extract format_user_error + format_coded_error to separate module
- JSON-RPC -32099..=-32000 range catch-all for server errors
- Add mixed-case timeout test to verify fix
- Tests moved to error_display module (19 total, all pass)

The formatters are now reusable by other adapters (Slack, etc.)
without depending on the discord module.

* perf: cache regex in strip_mention using LazyLock

Regex is now compiled once at startup via std::sync::LazyLock,
instead of on every call. Covers the nice-to-have reviewer item.

---------

Co-authored-by: OpenClaw Bot <bot@openclaw.dev>
Reese-max pushed a commit to Reese-max/openab that referenced this pull request Apr 12, 2026
* feat: capture and display ACP error responses in Discord

When the ACP agent returns an error, display a user-friendly message
instead of '_(no response)_'.

- Capture response_error from ACP notification before loop exits
- Show error on empty response or append to existing content
- format_error() uses protocol-level codes (JSON-RPC / HTTP) - no provider-specific strings
- Provider-agnostic: message text passed through verbatim from upstream agent

* feat: add format_user_error() for unified error display

Two error formatters:
- format_user_error(message: &str): handles startup/connection errors from pool.rs
  - timeout waiting for session/new → Request Timeout
  - connection/channel closed → Connection Lost
  - failed to spawn/no such file → Agent Not Found
  - pool exhausted → Service Busy
  - invalid api key/unauthorized → Unauthorized

- format_coded_error(code, message): handles ACP response errors (JSON-RPC/HTTP codes)

Both paths now use format_user_error() for startup errors and
format_coded_error() for response errors. Link to openabdev#50.

* fix: remove hardcoded 'claude' from format_user_error

Address reviewer feedback:
- Remove .contains("claude") from failed_to_spawn branch - was contradicting provider-agnostic goal
- Make format_coded_error public for reuse by other adapters
- Add comment explaining error + partial content display behavior
- Add 17 unit tests for format_user_error and format_coded_error

* refactor: extract error formatters to src/error_display.rs module

Address reviewer feedback from openabdev#170:
- Fix case sensitivity bug in timeout method extraction (use msg_lower.find)
- Extract format_user_error + format_coded_error to separate module
- JSON-RPC -32099..=-32000 range catch-all for server errors
- Add mixed-case timeout test to verify fix
- Tests moved to error_display module (19 total, all pass)

The formatters are now reusable by other adapters (Slack, etc.)
without depending on the discord module.

* perf: cache regex in strip_mention using LazyLock

Regex is now compiled once at startup via std::sync::LazyLock,
instead of on every call. Covers the nice-to-have reviewer item.

---------

Co-authored-by: OpenClaw Bot <bot@openclaw.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

improvement: show meaningful status messages instead of '...'

4 participants