feat(mcp): add structured error categories for agentic error handling #3753

Open
sneharathod7 wants to merge 1 commit into knative:main from sneharathod7:feat-mcp-error-categories

@sneharathod7

# Changes

- :gift: Add structured `ErrorCategory` support for MCP tool failures
- :broom: Implement intelligent CLI error classification with remediation hints
- :broom: Resolve relative executable path issues using `os.Executable()`
- :broom: Update MCP `build` and `deploy` handlers to return structured error responses

/kind enhancement

Fixes #3750
Relates to #3752

---

## Summary

This PR introduces structured error handling for MCP tools to improve agent decision-making and remediation workflows.

Currently, MCP handlers primarily surface failures as raw CLI strings, making it difficult for agents to reliably determine whether a failure originated from:
- registry authentication
- Kubernetes connectivity
- build/runtime failures
- invalid local configuration

This change adds machine-readable error categories along with actionable remediation hints so agents can respond more intelligently instead of relying on fragile string parsing.
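
To illustrate the agent side, here is a minimal sketch of how a client might branch on the category instead of parsing CLI text. The `toolError` type, the `nextAction` function, and the action names are hypothetical and not part of this PR; only the JSON field names and category values come from the description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolError mirrors the JSON fields described in this PR.
// The Go type itself is an illustrative assumption.
type toolError struct {
	Category string `json:"errorCategory"`
	Message  string `json:"message"`
	Hint     string `json:"hint"`
}

// nextAction is a hypothetical agent policy: it maps a tool failure
// payload to a follow-up step, falling back to the raw output when the
// payload is not a structured error.
func nextAction(payload []byte) string {
	var te toolError
	if err := json.Unmarshal(payload, &te); err != nil || te.Category == "" {
		return "report-raw-output"
	}
	switch te.Category {
	case "REGISTRY_ERROR", "AUTH_ERROR":
		return "ask-user-for-credentials"
	case "CLUSTER_ERROR":
		return "check-cluster-connectivity"
	case "VALIDATION_ERROR":
		return "fix-local-config"
	case "BUILD_ERROR":
		return "inspect-build-logs"
	default:
		return "report-raw-output"
	}
}

func main() {
	payload := []byte(`{"errorCategory":"REGISTRY_ERROR","message":"unauthorized: authentication required"}`)
	fmt.Println(nextAction(payload))
}
```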

This work also complements the `check_prerequisites` MCP tool by ensuring that operational failures return high-confidence failure signals during runtime workflows.

---

## What This PR Adds

### Structured Error Responses

MCP tool failures now return structured JSON responses:

```json
{
  "errorCategory": "REGISTRY_ERROR",
  "message": "unauthorized: authentication required",
  "hint": "Check your registry credentials. Try running 'docker login <registry>' first."
}
```

### Error Categories Implemented

- `REGISTRY_ERROR`
- `CLUSTER_ERROR`
- `BUILD_ERROR`
- `VALIDATION_ERROR`
- `AUTH_ERROR`
- `UNKNOWN_ERROR`

### Intelligent Error Classification

Introduced heuristic-based classification for common CLI failure patterns across:

- registry operations
- Kubernetes deployment flows
- build failures
- authentication issues

This allows MCP agents to provide targeted remediation guidance programmatically.
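
A heuristic classifier of this kind can be sketched as an ordered list of substring rules where the first match wins. Everything below is an assumption for illustration: the `categorize` function, the `rule` type, the patterns, and the hints (other than the registry hint quoted earlier) are hypothetical and may differ from what the PR implements.

```go
package main

import (
	"fmt"
	"strings"
)

type ErrorCategory string

const (
	RegistryError   ErrorCategory = "REGISTRY_ERROR"
	ClusterError    ErrorCategory = "CLUSTER_ERROR"
	BuildError      ErrorCategory = "BUILD_ERROR"
	ValidationError ErrorCategory = "VALIDATION_ERROR"
	AuthError       ErrorCategory = "AUTH_ERROR"
	UnknownError    ErrorCategory = "UNKNOWN_ERROR"
)

// rule pairs lowercase substrings with a category and a remediation hint.
type rule struct {
	category ErrorCategory
	patterns []string
	hint     string
}

// rules are checked in order, so more specific patterns come first.
// All patterns here are hypothetical examples.
var rules = []rule{
	{RegistryError, []string{"unauthorized: authentication required", "denied: requested access", "manifest unknown"},
		"Check your registry credentials. Try running 'docker login <registry>' first."},
	{ClusterError, []string{"connection refused", "no such host", "unable to connect to the server"},
		"Verify that the cluster is reachable and the kubeconfig context is correct."},
	{AuthError, []string{"forbidden", "unauthorized"},
		"Check that the current credentials have the required permissions."},
	{BuildError, []string{"build failed", "buildpack"},
		"Inspect the build output for the failing step."},
	{ValidationError, []string{"invalid", "is required"},
		"Review the function configuration and command arguments."},
}

// categorize returns the first matching category and its hint, or
// UNKNOWN_ERROR with an empty hint when nothing matches.
func categorize(output string) (ErrorCategory, string) {
	lower := strings.ToLower(output)
	for _, r := range rules {
		for _, p := range r.patterns {
			if strings.Contains(lower, p) {
				return r.category, r.hint
			}
		}
	}
	return UnknownError, ""
}

func main() {
	category, hint := categorize("unauthorized: authentication required")
	fmt.Println(category, "-", hint)
}
```

Rule order matters in this sketch: the generic `unauthorized` pattern would otherwise shadow the more specific registry message.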


### Absolute Path Execution

Updated MCP command execution to resolve the running binary path using `os.Executable()`.

This fixes errors such as `exec: "func": cannot run executable found relative to current directory`, which can occur in newer Go versions (since Go 1.19, `os/exec` refuses to run a binary resolved relative to the current directory) and in certain shell environments.


## Internal Changes

- Added `ErrorCategory` type and structured error response model
- Updated `build` and `deploy` handlers to return categorized failures
- Added remediation hint generation for common operational issues
- Updated command execution flow to use absolute binary paths

## Tests Added

- `TestCategorizeError`
- `TestStructuredError_Error`
- Integration tests for build/deploy error handling paths

The test suite validates:

- classification of multiple failure patterns
- structured JSON marshaling
- categorized MCP error responses

All existing tests continue to pass.


## Release Note

MCP tool failures now include machine-readable error categories and actionable remediation hints. MCP command execution now resolves absolute binary paths for more reliable execution across environments.

knative-prow Bot added the kind/enhancement label (Feature additions or improvements to existing) on May 15, 2026
knative-prow Bot requested review from dsimansk and jrangelramos on May 15, 2026 19:46

knative-prow Bot commented May 15, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sneharathod7
Once this PR has been reviewed and has the lgtm label, please assign dsimansk for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

knative-prow Bot added the size/L label (PR changes 100-499 lines, ignoring generated files) on May 15, 2026

knative-prow Bot commented May 15, 2026

Hi @sneharathod7. Thanks for your PR.

I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

knative-prow Bot added the needs-ok-to-test label (Needs an org member to approve testing) on May 15, 2026