Python: skip hosted tools test on transient upstream MCP errors #5296
moonbox3 merged 1 commit into microsoft:main
The local MCP server can't be used for hosted tools tests because Anthropic's backend needs to reach the MCP URL from their infrastructure (not localhost on the CI runner). Revert to learn.microsoft.com/api/mcp but catch BadRequestError, InternalServerError, APIConnectionError, and APITimeoutError and pytest.skip so upstream outages don't block the merge queue.
Pull request overview
Updates Anthropic Python integration testing to reduce CI failures caused by transient upstream MCP/hosted-tools errors when exercising the full hosted-tools path.
Changes:
- Switch the hosted-tools integration test from a LOCAL_MCP_URL-based MCP server to https://learn.microsoft.com/api/mcp.
- Wrap the hosted-tools request in exception handling that calls pytest.skip on certain Anthropic SDK errors.
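The catch-and-skip pattern described in the second bullet could be sketched roughly as follows. This is only an illustration, not the code from the PR: the four exception classes are local stand-ins that borrow the names of the Anthropic SDK errors so the block runs without the SDK installed, `skip_on_transient_upstream_error` is a hypothetical helper name, and `unittest.SkipTest` stands in for `pytest.skip`.

```python
import unittest
from contextlib import contextmanager


# Stand-ins named after the Anthropic SDK exceptions listed in the PR.
# They are defined locally (an assumption of this sketch) so it runs anywhere.
class BadRequestError(Exception):
    pass


class InternalServerError(Exception):
    pass


class APIConnectionError(Exception):
    pass


class APITimeoutError(Exception):
    pass


SKIPPABLE_ERRORS = (
    BadRequestError,
    InternalServerError,
    APIConnectionError,
    APITimeoutError,
)


@contextmanager
def skip_on_transient_upstream_error():
    """Convert the listed upstream errors into a skip; re-raise anything else."""
    try:
        yield
    except SKIPPABLE_ERRORS as exc:
        # The real test would call pytest.skip(...) here; SkipTest is a
        # stdlib stand-in used only to keep this sketch dependency-free.
        raise unittest.SkipTest(f"Upstream MCP server unavailable: {exc}") from exc
```

In an actual test, the body of the `with` block would be the hosted-tools request, and any exception outside the tuple (an assertion failure, for example) would still fail the test as usual.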
    except (
        anthropic.BadRequestError,
Catching anthropic.BadRequestError and skipping can mask real regressions (a 400 typically indicates an invalid request/tool schema rather than a transient upstream outage). Consider limiting the skip behavior to clearly-transient failures (e.g., connection/timeout/5xx) or inspecting the error details and only skipping when the 400 is explicitly caused by the hosted MCP call failing; otherwise re-raise so genuine test failures still fail CI.
Suggested change:
-except (
-    anthropic.BadRequestError,
+except anthropic.BadRequestError as e:
+    error_message = str(e)
+    error_body = getattr(e, "body", None)
+    if isinstance(error_body, dict):
+        error_details = error_body.get("error")
+        if isinstance(error_details, dict):
+            detail_message = error_details.get("message")
+            if isinstance(detail_message, str) and detail_message:
+                error_message = f"{error_message} {detail_message}".strip()
+    normalized_error_message = error_message.lower()
+    is_hosted_mcp_failure = "mcp" in normalized_error_message and any(
+        marker in normalized_error_message
+        for marker in ("unavailable", "failed", "failure", "timeout", "timed out")
+    )
+    if is_hosted_mcp_failure:
+        pytest.skip(f"Upstream MCP server unavailable: {e}")
+    raise
+except (
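The message inspection in the suggestion can be pulled out into a small pure function, which makes the heuristic easy to exercise without calling the API. The sketch below mirrors the suggested logic; the function name `is_hosted_mcp_failure`, its signature, and the sample error bodies are assumptions made for illustration, not part of the PR or of the Anthropic SDK.

```python
from typing import Any

# Substrings that, together with "mcp", are treated as signs of an upstream outage.
FAILURE_MARKERS = ("unavailable", "failed", "failure", "timeout", "timed out")


def is_hosted_mcp_failure(message: str, body: Any = None) -> bool:
    """Return True when a 400 error looks like a hosted MCP call failing upstream.

    `message` is the stringified exception; `body` is the optional parsed
    error payload (expected shape: {"error": {"message": "..."}}).
    """
    if isinstance(body, dict):
        details = body.get("error")
        if isinstance(details, dict):
            detail = details.get("message")
            if isinstance(detail, str) and detail:
                # Fold the nested detail into the text that gets searched.
                message = f"{message} {detail}".strip()
    normalized = message.lower()
    return "mcp" in normalized and any(m in normalized for m in FAILURE_MARKERS)


# Hypothetical payloads, invented for this example.
print(is_hosted_mcp_failure("Error code: 400", {"error": {"message": "MCP server connection failed"}}))
print(is_hosted_mcp_failure("Error code: 400", {"error": {"message": "tools.0: invalid input schema"}}))
```

Because this is plain substring matching, a genuine request bug whose message happens to contain both "mcp" and "failed" would also be skipped, so the marker list is worth keeping narrow.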
    import anthropic
import anthropic is added inside the test function, but this module is already required at import-time in this file (e.g., from anthropic.types.beta import ... near the top). Consider moving this import to the module import section (or importing only the needed exception types) for consistent import ordering and to avoid a redundant local import.
Summary
The test_anthropic_client_integration_hosted_tools test calls Anthropic's API with a hosted MCP tool pointed at learn.microsoft.com/api/mcp. Anthropic's backend connects to that URL server-side, so when MS Learn rate-limits or returns 5xx, the test fails with errors we can't retry around, and it blocks the merge queue.
This catches BadRequestError, InternalServerError, APIConnectionError, and APITimeoutError from the Anthropic SDK and calls pytest.skip so upstream outages don't break CI. The test still runs and validates the full hosted-tools path when the upstream MCP server is healthy.
Also broadens the image integration test assertion to use word-boundary matching with common building synonyms, since the model legitimately describes the same image with varied vocabulary.
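The word-boundary matching mentioned for the image assertion might look something like the sketch below. The synonym list and the helper name `mentions_building` are assumptions for illustration; the PR text does not enumerate the words it actually accepts.

```python
import re

# Hypothetical synonym list; the real test may accept a different set of words.
BUILDING_SYNONYMS = ("building", "house", "home", "structure", "cabin")

# \b anchors keep "home" from matching inside "homeless" and "building" from
# matching inside "rebuilding"; the optional "s" admits simple plurals.
_BUILDING_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in BUILDING_SYNONYMS) + r")s?\b",
    re.IGNORECASE,
)


def mentions_building(text: str) -> bool:
    """Return True if the text contains any accepted synonym as a whole word."""
    return _BUILDING_PATTERN.search(text) is not None
```

A test assertion could then read `assert mentions_building(response_text)`, where `response_text` is whatever string the model returned, so a description that says "house" passes just as one that says "building" does.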
Test plan