fix(llma): stop capturing 4xx MCP API errors as exceptions#59231
Merged
rafaeelaudibert merged 2 commits intoMay 20, 2026
Conversation
handleToolError captured every non-permission tool error via captureException with a fingerprint of the tool name. Any LLM-induced 4xx (e.g. a placeholder UUID that returns 404) created a brand-new error tracking issue per tool, paging the team for noise rather than real bugs. Introduce a typed PostHogApiError that carries the HTTP status, and short-circuit 4xx + validation errors in handleToolError so the typed message is returned to the LLM (which can self-correct) without capturing an exception. 5xx and unexpected non-HTTP errors still capture as before. Generated-By: PostHog Code Task-Id: 22c240c9-b71a-4acf-b137-1187eb679ba8
Contributor
Prompt To Fix All With AIFix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
services/mcp/src/api/fetcher.ts:97-108
Variable shadowing: the inner `const body` declaration reuses the same name as the outer `body` (which holds the request body for POST/PUT/PATCH). TypeScript/eslint `no-shadow` rules will catch this, and it makes the two semantically different values harder to distinguish at a glance. Renaming the error-body variable avoids the ambiguity.
```suggestion
if (!response.ok) {
const errorResponse = await response.json()
const errorBody = JSON.stringify(errorResponse)
throw new PostHogApiError({
status: response.status,
statusText: response.statusText,
body: errorBody,
url: input.url.toString(),
method: input.method.toUpperCase(),
message: `Failed request: [${response.status}] ${errorBody}`,
})
}
```
### Issue 2 of 2
services/mcp/tests/unit/permission-errors.test.ts:409-462
**Prefer parameterised tests for status-code boundary cases**
The new test suite has four individual `it()` blocks that could be collapsed into two `it.each()` tables: one for the "short-circuit / no capture" family (400, 404, 422, 499) and one for the "capture as exception" family (500, 502, 503). The exact boundary between 4xx and 5xx is the core invariant being tested, so covering it with multiple representative values matters. For example, an off-by-one in the `status < 500` guard would only be caught if 499 and 500 are both in the table. The same applies to the `fetcher.test.ts` rewrites.
Reviews (1): Last reviewed commit: "fix(llma): stop capturing 4xx MCP API er..." | Re-trigger Greptile |
…ze boundary tests - fetcher.ts: rename `body` (inner JSON-string of error response) to `errorBody` so it no longer shadows the outer request `body` - permission-errors.test.ts: collapse the four `it()` blocks for 4xx short-circuit / 5xx capture into two `it.each()` tables, and add 499 + 500 explicitly so an off-by-one in the `status < 500` guard would be caught Generated-By: PostHog Code Task-Id: 22c240c9-b71a-4acf-b137-1187eb679ba8
rafaeelaudibert
approved these changes
May 20, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
The PostHog team is being paged with new error tracking issues every time an LLM agent passes a bad parameter to an MCP tool — noise rather than real bugs.
Concretely, two new error tracking issues showed up within ~20 seconds of each other (
error-tracking-symbol-sets-retrieveanderror-tracking-symbol-sets-download-retrieve), both reportingMCPToolError: Request failed... 404 (Not Found)for…/symbol_sets/00000000-0000-0000-0000-000000000000/[download/]. Per the$exceptionevent stream, each fingerprint had exactly 1 occurrence over the last 30 days — but every additional MCP tool that gets handed an all-zeros placeholder UUID would create yet another distinct issue.The PostHog API was correctly returning 404 via
self.get_object()inproducts/error_tracking/backend/api/symbol_sets.py. The real issue lived in the MCP service:handleToolErrorinservices/mcp/src/lib/errors.tscalledgetPostHogClient().captureException(...)on every non-permission error, fingerprinted by$exception_fingerprint: mcpError.tool. So any agent-induced 4xx response got logged as a brand-new PostHog exception keyed by tool name.Changes
PostHogApiErrorinservices/mcp/src/lib/errors.tscarryingstatus,statusText,body,url,method.services/mcp/src/api/client.ts(text-response and JSON-response paths) andservices/mcp/src/api/fetcher.tsnow throwPostHogApiErrorinstead of genericErrors, preserving the existing message shapes for any string-matching consumers.handleToolErrorwalks theError.causechain forPostHogApiErrorandPostHogValidationError; if the underlying response is 4xx, it returns the typed message to the LLM without callingcaptureException. 5xx and unexpected non-HTTP errors still capture as before — this mirrors the existing recoverable-error short-circuit forMissingProjectContextError.No backend change is needed; the 404 behavior in
symbol_sets.pyis correct — the right move is to treat 4xx as agent input error (recoverable, expected) at the MCP layer.How did you test this code?
I am an agent (PostHog Code). I did not perform manual testing. I ran the following automated tests:
pnpm test run tests/unit/permission-errors.test.ts tests/unit/fetcher.test.tsfromservices/mcp/— 45 tests pass (26 + 19), including 5 new tests covering: 404 short-circuit, validation-error short-circuit, cause-chain unwrapping, 5xx still captures, unexpected non-HTTP errors still capture.pnpm exec tsc --noEmit— no new type errors in the changed files (pre-existing@posthog/quill/reactUI-app errors remain, unrelated to this change).pnpm exec oxlinton the changed files — clean (one pre-existing warning ontests/unit/fetcher.test.ts:199unrelated to this change).Publish to changelog?
no
Docs update
No docs change needed.
🤖 Agent context
services/mcp/src/lib/errors.ts:275as the captureException site (added in commit042d86b, made reachable for symbol-set tools ina70f94d).PostHogApiError(so future call sites get the same classification for free) and short-circuit 4xx inhandleToolError, mirroring the existingMissingProjectContextErrorshort-circuit pattern.Failed request: [404] {...}) is preserved via thePostHogApiErrormessageoverride so any string-matching consumers keep working, while the client.ts paths use the defaultRequest failed:\nURL: ...format that was already in place.Generated-By: PostHog Code
Task-Id: 22c240c9-b71a-4acf-b137-1187eb679ba8