feat(errors): clean error results, recovery hints, quieter logs by tony · Pull Request #64 · tmux-python/libtmux-mcp

tony · 2026-06-06T15:43:46Z

Summary

Bump the fastmcp floor to >=3.4.0, unlocking ToolError(log_level=...) (3.3) and ToolResult(is_error=True) (3.4)
Fix error-message mangling: failed tool calls passed through fastmcp's stock -32603 transform, so agents received Internal error: Pane not found: %5 — messages now arrive exactly as raised
Add structured _meta on error results: error_type (originating exception class), expected (agent-correctable vs operator fault), suggestion (recovery hint when present)
Add recovery hints on not-found errors pointing at the discovery tools (list_sessions / list_windows / list_panes) that resolve stale or guessed ids
Add synthesized hints for unrecognized arguments: the error names exactly which argument(s) to remove or fix, and flags wait_for_previous — a scheduling flag Gemini CLI leaks into batched tool calls — with the connected client's identity from the MCP initialize handshake
Demote expected, agent-correctable failures — unknown ids, invalid arguments (including schema-level rejections raised before tool code runs), tier denials, transient tmux errors — to WARNING in the log stream; a missing tmux binary and unexpected exceptions stay at ERROR

Changes by area

Error classification

src/libtmux_mcp/_utils.py: ExpectedToolError(ToolError) defaults log_level=WARNING and carries an optional suggestion. _map_exception_to_tool_error returns it for the expected libtmux branches and attaches discovery-tool suggestions on the not-found branches. Direct validation raises across the tool modules and the SafetyMiddleware tier denial use it.
src/libtmux_mcp/middleware.py: _is_schema_validation_error classifies fastmcp's pre-dispatch pydantic argument validation as expected/WARNING — it fires before the decorators can classify it, and output-shape failures can't be confused with it (fastmcp's tool layer converts those to error results itself).

Error results

src/libtmux_mcp/middleware.py: ToolErrorResultMiddleware(ErrorHandlingMiddleware) converts tool-call exceptions into ToolResult(is_error=True) with the clean message plus the _meta payload, and logs at the error's own log_level. The inherited on_message keeps the MCP -32002 resource-not-found transform for non-tool messages.
src/libtmux_mcp/middleware.py: _unexpected_kwargs + _client_label synthesize the unrecognized-argument suggestion; the client label comes from fastmcp_context.session.client_params.clientInfo, is None-safe at every hop, and only words the message — it never gates behavior.
src/libtmux_mcp/middleware.py: TailPreservingResponseLimitingMiddleware preserves is_error through truncation — the stock path rebuilds the result without it, turning an oversized error into an apparent success that the MCP SDK client then rejects against the tool's output schema.
src/libtmux_mcp/server.py: the middleware stack swaps the stock error middleware for the subclass; the ordering comment documents the invariant that error→result conversion must stay outside the audit/retry/safety trio.

Docs

docs/topics/logging.md: the fastmcp.errors channel and the WARNING-expected / ERROR-operator-fault level contract.
docs/topics/architecture.md: full middleware request flow and the three-boundary error design (tool classification, schema classification, conversion).
docs/topics/gotchas.md: Gemini CLI's wait_for_previous injection — symptom, self-recovery flow, and why rejection beats stripping.
Error-prose mentions across tool pages link to the autodoc'd ExpectedToolError entry via {exc} roles.

Design decisions

Convert at the boundary, not in the tools: errors keep propagating as exceptions through audit/retry/safety and only become results in the outermost error layer. Returning error results from inside tools would disable readonly retries (fastmcp's RetryMiddleware matches LibTmuxException via __cause__) and make the audit log record failures as outcome=ok.
WARNING/ERROR partition: WARNING means "the agent can correct course"; ERROR means "an operator must act" (missing tmux binary) or "possible server bug" (unexpected exception). A WARNING-heavy, ERROR-quiet stream is healthy.
No structured_content on error results: tools declare output schemas for their success payloads and clients validate structuredContent against them — an error-shaped payload there would fail validation on strict clients. Details ride in _meta instead.
Suggestion policy: raise-site hints only on not-found errors (discovery tools are the canonical fix for the most common agent mistake); synthesized hints on unrecognized-argument rejections (names the exact keys, plus the client-flag note for wait_for_previous). Unknown arguments are rejected, never silently stripped — MemPalace patched the same Gemini leak by popping the key (fix: Ignore wait_for_previous argument to support Gemini MCP clients MemPalace/mempalace#322) and later whitelisting against the schema, but on a server with mutating/destructive tools silent dropping would let a typo'd flag (enter on send_keys) run with defaults.

Before / After

Agent-visible result for a tool denied by the safety tier:

Before:

Internal error: Tool 'send_keys' is not available at the current safety level. Set LIBTMUX_SAFETY=destructive to enable destructive tools.

After:

Tool 'send_keys' is not available at the current safety level. Set LIBTMUX_SAFETY=destructive to enable destructive tools.

with _meta: {"error_type": "ExpectedToolError", "expected": true}.

Verification

Every direct raise in the tool modules is tier-classified — stock ToolError survives only as the ERROR-tier return inside the exception mapper:

rg -n "raise ToolError\(" src/

Test plan

Note: the floor stays at >=3.4.0 rather than >=3.4.1 (starlette CVE-2026-48710 floor) because 3.4.1 is inside the 3-day uv dependency cooldown until June 8 — follow-up commit planned.

why: Unlock fastmcp 3.3/3.4 server features adopted in follow-up commits: ToolError(log_level=...) for expected-failure log demotion (fastmcp#4036) and ToolResult(is_error=True) error results (fastmcp#4217). what: - pyproject: fastmcp floor >=3.2.4 -> >=3.4.0 (lock already at 3.4.0) - CHANGES: Dependencies entry covering passive wins of the bump (template resource title preservation, no exc_info on expected tool failures, MCP-compliant OTEL span attributes) note: floor stays at 3.4.0 for now — 3.4.1 (starlette CVE-2026-48710 floor) is inside the 3-day uv dependency cooldown until 2026-06-08; a follow-up will raise the floor once it clears.

…ToolError why: Every agent-correctable failure (unknown pane id, invalid arguments, tier denial) logged at ERROR with a traceback, drowning operator logs in noise that isn't actionable. fastmcp 3.3 added ToolError(log_level=...) support (fastmcp#4036). what: - Add ExpectedToolError(ToolError) defaulting log_level=WARNING in _utils.py - _map_exception_to_tool_error: expected libtmux failure branches return ExpectedToolError; TmuxCommandNotFound (broken operator environment) and the unexpected catch-all stay at ERROR - Sweep all direct validation/state raises in tool modules and SafetyMiddleware tier denial to ExpectedToolError - Doctests: traceback class names follow the raise sites - tests: parametrized level partition for the mapper; in-memory Client + caplog proof that fastmcp's server layer honors the WARNING level (fastmcp logger needs propagate=True under caplog)

…ultMiddleware why: fastmcp's stock ErrorHandlingMiddleware(transform_errors=True) funnels every tool failure through a -32603 catch-all, so agents received 'Internal error: Pane not found: %5' — the transform mangled every expected failure message on the wire. fastmcp 3.4 added ToolResult(is_error=True) (fastmcp#4217), letting the server shape error results directly. what: - ToolErrorResultMiddleware(ErrorHandlingMiddleware) intercepts tool-call exceptions and returns ToolResult(is_error=True) carrying the message exactly as raised plus structured meta: error_type (originating exception class via __cause__), expected (agent-correctable vs operator fault/bug), suggestion (recovery hint when present) - _log_error override honors FastMCPError.log_level — the stock implementation hardcoded logger.error, re-shouting failures the ExpectedToolError demotion had moved to WARNING - Inherited on_message keeps the MCP -32002 resource-not-found transform for non-tool messages - Ordering invariant documented in the server.py stack comment: conversion must stay outside Audit/Retry/Safety, which depend on exception semantics (audit catches to record outcome=error, retry matches LibTmuxException via __cause__) - ExpectedToolError grows an optional suggestion kwarg; the not-found mapper branches point agents at the discovery tools (list_sessions / list_windows / list_panes) - docs(topics/logging): document the fastmcp.errors channel and the WARNING-expected / ERROR-operator-fault level contract - tests: probe-server coverage for clean messages, meta payload, suggestion plumbing, and per-level logging; production stack-order assertions updated to the subclass

codecov-commenter · 2026-06-06T15:44:52Z

Codecov Report

❌ Patch coverage is 78.46154% with 28 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.59%. Comparing base (310132f) to head (5e3520d).

Files with missing lines	Patch %	Lines
src/libtmux_mcp/middleware.py	83.09%	9 Missing and 3 partials ⚠️
src/libtmux_mcp/tools/pane_tools/capture_since.py	33.33%	4 Missing ⚠️
src/libtmux_mcp/tools/wait_for_tools.py	40.00%	3 Missing ⚠️
src/libtmux_mcp/tools/buffer_tools.py	75.00%	2 Missing ⚠️
src/libtmux_mcp/tools/pane_tools/io.py	0.00%	2 Missing ⚠️
src/libtmux_mcp/tools/pane_tools/lifecycle.py	33.33%	2 Missing ⚠️
src/libtmux_mcp/tools/session_tools.py	60.00%	2 Missing ⚠️
src/libtmux_mcp/tools/window_tools.py	50.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #64      +/-   ##
==========================================
+ Coverage   84.31%   84.59%   +0.27%     
==========================================
  Files          42       42              
  Lines        2684     2758      +74     
  Branches      359      370      +11     
==========================================
+ Hits         2263     2333      +70     
- Misses        317      321       +4     
  Partials      104      104

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

why: Tighten the unreleased entries to product level now that the work has a pull request to link. what: - Trim upstream fastmcp PR numbers, OTEL attribute names, and fastmcp API mechanics from the Dependencies and What's new entries — that depth lives in the linked PR - Add the PR reference to both What's new entries - Lead What's new with the clean-error-results headline; point discovery-tool mentions through tooliconl roles

…_log_error why: AGENTS.md logging standard prescribes lazy %-formatting over f-strings in log calls — deferred interpolation when the level is filtered, and aggregator message-template grouping. The new _log_error pre-formatted its message with an f-string; flagged in the PR code review. what: - Pass the template and args to logger.log directly; the stock include_traceback if/else collapses into exc_info=<bool> - tests: convert the log-level test to the NamedTuple + test_id parametrize pattern and assert via record.getMessage(), which both interpolates lazily-formatted records and would fail on a literal %s leaking into the message

why: The module docstring listed ToolErrorResultMiddleware after ReadonlyRetryMiddleware, but the class is defined directly after SafetyMiddleware — a reader scanning the file by the docstring's order lands in the wrong place. Flagged in the PR code review. what: - Reorder the bullet list to match file definition order and say so in the lead-in (Safety, ToolErrorResult, Audit, ReadonlyRetry, TailPreserving)

why: The handle_tool_errors docstrings still said failures re-raise as plain ToolError; the mapper now classifies expected failures as ExpectedToolError at WARNING — the docstrings were silent on the level partition and on why the re-raise chains from e. Flagged in the PR code review. what: - Document the ExpectedToolError(WARNING) / ToolError(ERROR) partition on both decorators - State the single-level __cause__ contract at the raise site: ReadonlyRetryMiddleware inspects exactly one cause hop, so re-wrapping the mapped error would silently disable readonly retries

…-layer errors why: The request-flow diagram showed only SafetyMiddleware and the error-handling section described a single decorator-to-ToolError step; the branch's error redesign (classification in the decorator, conversion to is_error results in ToolErrorResultMiddleware) made both descriptions materially incomplete. what: - Request flow: list all six middleware outermost-first with a one-phrase role each; point at the server.py stack comment for the ordering rationale - Error handling: describe the classification/conversion split, the from-e cause chain retries depend on, and the exceptions-stay- exceptions invariant through the audit/retry/safety trio; link topics/logging for the level policy - Source layout: middleware.py one-liner covers all middleware, not just the safety tier

…se sites why: Seven docstrings still named ToolError where every raise site now raises ExpectedToolError — true only by subclassing, and the rendered API reference loses the expected/unexpected distinction the error redesign introduced. what: - Raises sections: _coerce_dict_arg and _apply_filters (_utils), _validate_channel_name and wait_for_channel (wait_for_tools), find_pane_by_position (pane_tools/lifecycle) - Prose: _raise_if_pane_lifecycle_changed one-liner (pane_tools/state), wait_for_content_change baseline-loss note (pane_tools/wait)

…nvariant why: The ordering-invariant comment opened with 'All three depend on exception semantics' but explained only audit and retry and concluded 'silently disable both' — an internally inconsistent count in the one comment future maintainers rely on to not reorder the stack. Flagged in the PR code review. what: - Name safety's dependency explicitly (tier denials must propagate as exceptions for audit to record them) and close with 'break all three' at both sites: the ToolErrorResultMiddleware docstring and the server.py stack comment

…on claim why: The error-handling section said the middleware catches the exception 'at the outer edge of the stack', but Timing and the response limiter sit outside it — imprecise in the one page whose purpose is stack-position accuracy. Flagged in the PR code review. what: - Describe the catch point as 'once it has cleared the audit/retry/safety trio', matching the closing sentence's 'outermost error layer' framing

why: The class documents its two keyword parameters only via prose and doctests; AGENTS.md mandates NumPy docstring style, and the codebase's parameterised classes (SafetyMiddleware, AuditMiddleware) all carry a Parameters section. Flagged in the PR code review. what: - Document *args, log_level, and suggestion in a NumPy Parameters block between the extended description and Examples

… expected why: fastmcp validates tool arguments against the input schema before tool code runs, raising a bare pydantic.ValidationError that carries no log_level — the handle_tool_errors decorators never get a chance to classify it. The middleware logged these at ERROR and marked them expected=false, contradicting the documented policy that invalid arguments are agent-correctable WARNINGs (fastmcp's own server layer already logs the same failure as a WARNING). Reported in PR review. what: - _is_schema_validation_error: recognize pydantic.ValidationError on the error or its __cause__; output-shape failures cannot be confused with this case — fastmcp's tool layer converts those to error results before the middleware sees them (verified) - _log_error: exceptions without log_level classify via the helper — WARNING for argument validation, ERROR otherwise - _error_tool_result: expected=True for schema validation failures - tests: takes_int probe tool; ErrorLogLevelFixture grows an arguments field with a schema_validation_logs_warning case; meta classification test asserting expected=True + ValidationError

…tion why: The limiter sits outside ToolErrorResultMiddleware, so oversized error results (e.g. a validation message echoing a huge argument on a response-limited tool) hit the truncation path, which rebuilds the result without the error flag. The truncated error became an apparent success, and the MCP SDK client rejected it against the tool's output schema — the agent received a transport-level RuntimeError instead of the tool error. Reported in PR review; reproduced against the reference client. what: - TailPreservingResponseLimitingMiddleware.on_call_tool captures the inner result and restores is_error=True when the base class truncation replaced an error result; meta already survives via the base class, and tail-preservation keeps the suggestion line at the end of the text - Truncated successes never gain the flag - tests: LimiterErrorFixture parametrize (oversized error keeps flag + suggestion tail, small error untouched, oversized success stays non-error); the failing probe returns a Pydantic model so the MCP SDK client's output-schema validation guards the regression note: truncated SUCCESS results from schema'd tools still trip the same client-side validation — a pre-existing gap in fastmcp's stock limiter (its meta workaround only bypasses server-side validation), out of scope here

…allback why: The previous commit's verification chain piped mypy through tail, masking a real arg-type error — the capture callback didn't satisfy fastmcp's CallNext protocol, which matches the parameter name ('context'), not just the callable shape. what: - Annotate _capture with MiddlewareContext[CallToolRequestParams] -> ToolResult and name the parameter 'context' to structurally match the protocol; cast the captured Any back to ToolResult

why: The branch now preserves error results through the outer response limiter, so stale ordering prose made the documented middleware contract misleading. what: - Describe response limiting as return-path truncation after error conversion - Split tool exception classification from FastMCP schema validation handling - Document that truncated oversized errors preserve is_error and _meta

why: The full suite can capture snapshot output before a burst of queued shell commands finishes, making the truncation assertions depend on tmux timing. what: - Reuse the wait-for helper for snapshot output setup - Wait for shell command completion before calling snapshot_pane

why: The branch distinguishes agent-correctable failures from operator faults, so stale ToolError wording obscured the intended client contract. what: - Name ExpectedToolError in internal docstrings and middleware comments - Use expected tool error language in client-facing tool docs

…idation errors why: Gemini CLI's scheduler leaks its wait_for_previous sequencing flag into MCP tool arguments when batching calls in one turn (near-constant in gemini -p runs). Strict schemas reject it and Gemini self-recovers from the generic pydantic message, but the recovery depends on the model inferring what to do. MemPalace fixed the same leak by stripping the key (MemPalace/mempalace#322) and later whitelisting arguments (#647) — silent dropping would mask argument typos on mutating/destructive tools, so the rejection stays; the error just explains itself now. what: - _unexpected_kwargs: names flagged unexpected_keyword_argument in a pydantic validation failure (error or its __cause__) - _client_label: client 'name version' from the MCP initialize handshake (fastmcp_context.session.client_params.clientInfo); None-safe at every hop, wording-only — never gates behavior - _error_tool_result(error, context=None): synthesizes a suggestion for unexpected-argument rejections — name the argument(s) to drop or fix, plus a client-scheduling-flag note for wait_for_previous that names the connected client when the handshake exposed it - docs(topics/gotchas): the leak, batching-vs-mode findings, the self-recovery flow, and why rejection beats stripping here - tests: parametrized suggestion coverage (client-labeled flag leak, default handshake, plain typo, missing-arg silence) plus a direct no-handshake unit case via pydantic.validate_call

why: Prose said 'raises an expected tool error' in plain words while the API reference renders ExpectedToolError with a stable autodoc anchor — the docs convention is to cite autodoc'd APIs via roles, never plain backticks, and readers get a one-click path to the exception's full contract (log level, suggestion field). what: - Convert seven prose mentions to {exc}`~libtmux_mcp._utils.ExpectedToolError`: wait-for-channel (crashed signaller), show-hook (unknown hook names), delete-buffer (paste after delete), safety (kill-tool pane_id requirement), wait-for-content-change (pane lifecycle), capture-since (cursor/pane mismatch and pane lifecycle) - architecture: switch the existing {class} role for ExpectedToolError to {exc}, matching the LibTmuxException precedent in topics/logging - Bare fastmcp ToolError literals stay as code text: fastmcp publishes no objects.inv, so there is no link target

why: The unrecognized-argument suggestion (with client detection) is user-visible product surface the unreleased notes didn't cover — the behavior a Gemini CLI user upgrading will actually notice. what: - What's new entry covering the do-this-next hint for rejected unknown arguments and the wait_for_previous client-flag note

why: The branch now synthesizes recovery suggestions for schema-validation failures, including unknown arguments and client scheduling flags, so the architecture page's discovery-only wording was too narrow. The Gemini gotcha page also carried brittle current-state claims about client behavior and external MCP server fixes. what: - Broaden error-result suggestion prose to cover recovery hints generally - Keep ExpectedToolError references linked with {exc} - Soften stale-prone Gemini and external-server claims

why: Middleware tests still documented safety denials and decorated libtmux failures with the older ToolError-only mental model, while production now uses ExpectedToolError for agent-correctable failures. what: - Update stack-order and audit-denial docstrings for ExpectedToolError - Make the audit-denial simulation match current SafetyMiddleware behavior - Clarify retry/error-probe comments around mapped expected tool errors

tony added 3 commits June 6, 2026 10:14

tony added 20 commits June 6, 2026 10:49

tony merged commit 00a5f66 into main Jun 6, 2026
9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(errors): clean error results, recovery hints, quieter logs#64

feat(errors): clean error results, recovery hints, quieter logs#64
tony merged 23 commits into
mainfrom
fastmcp-v3.4.x

tony commented Jun 6, 2026 •

edited

Loading

Uh oh!

codecov-commenter commented Jun 6, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

tony commented Jun 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes by area

Error classification

Error results

Docs

Design decisions

Before / After

Verification

Test plan

Uh oh!

codecov-commenter commented Jun 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

tony commented Jun 6, 2026 •

edited

Loading

codecov-commenter commented Jun 6, 2026 •

edited

Loading