A2: add sanitized error classification fields (additive)#390
Closed
MrTravisB wants to merge 2 commits intotravis/pilo-task-idfrom
Closed
A2: add sanitized error classification fields (additive)#390MrTravisB wants to merge 2 commits intotravis/pilo-task-idfrom
MrTravisB wants to merge 2 commits intotravis/pilo-task-idfrom
Conversation
6 tasks
lmorchard
approved these changes
Apr 21, 2026
5 tasks
MrTravisB
added a commit
that referenced
this pull request
Apr 27, 2026
**Replaces #390** (auto-closed when its base branch was deleted by the A1 squash-merge). ## Summary Second PR in Stack A. Stacks on the merged A1 (#389). Adds structured classification fields to the error response so dashboards and programmatic retry logic can reason about failures without parsing `message` strings. Additive only - no existing field is removed. ### What's added - `error.class` - error constructor name (e.g. `"TypeError"`, `"SyntaxError"`) - `error.reason` - coarse-grained enum, safe for metric labels: `INVALID_REQUEST`, `PROVIDER_UNAUTHORIZED`, `NAVIGATION_TIMEOUT`, `BROWSER_DISCONNECTED`, `MAX_ITERATIONS`, `MAX_ERRORS`, `TIMEOUT`, `INTERNAL_ERROR` - `error.recoverable` - boolean derived from `error instanceof RecoverableError` - `error.phase` - which pipeline phase produced the error: `"setup"` or `"execution"` ### What stays (unchanged for consumers) - `error.message` - human-readable string, still present - `error.timestamp` - ISO timestamp, still present - `error.code` - fine-grained semantic code, still present - `error.taskId` - from A1, still present ### The data-sensitivity invariant `error.message` must never contain raw `error.message` of a thrown value (agent errors can quote task text, URL selectors derived from page content, or extracted values). The new code enforces this two ways: 1. **Validation / setup errors** use hardcoded string literals at the callsite ("Task is required", "AI provider is not configured.", etc.). Server-controlled by construction. 2. **Task execution errors** go through `errorResponseFromError`, which sets `message` from a fixed `REASON_HINTS` map keyed by the `reason` enum. No code path reads `error.message` of the thrown value into the response. A canary test throws `new Error("DO NOT LOG THIS SENTINEL")` and asserts the sentinel never appears in the serialized response. ### Minor security improvements (bundled) Dropped two WS error paths that previously echoed client-provided strings back in the message: - `UNKNOWN_EVENT` no longer quotes `msg.event` - `UNKNOWN_REQUEST_ID` no longer quotes the requestId - `INVALID_SEARCH_PROVIDER` no longer echoes the user-provided value ### Changes - `packages/server/src/taskRunner.ts` - new `ErrorReason` / `ErrorPhase` types, extended `ErrorResponse`, new `createErrorResponse` (object-param signature) and `errorResponseFromError` helpers, `classifyError` mapper, `REASON_HINTS` map. `errorToString` removed (no callers left). - `packages/server/src/routes/pilo.ts` - validation errors use new shape via `validateTaskRequest`; setup-error path uses `createErrorResponse` with explicit `reason: INVALID_REQUEST`; task-execution path uses `errorResponseFromError`. - `packages/server/src/routes/piloWs.ts` - all `createErrorResponse` callsites migrated; task-execution error uses `errorResponseFromError`; dropped client-provided values from error messages. - Tests: `taskRunner.test.ts` / `pilo.test.ts` / `piloWs.test.ts` updated to assert new fields alongside existing `message`/`timestamp`/`code`. Added canary tests that assert no sentinel string ever appears in the serialized response regardless of what the thrown value's `.message` is. ## Test plan - [x] `pnpm --filter pilo-server run test` (78 passing) - [x] `pnpm --filter pilo-core run test` (658 passing) - [x] `pnpm --filter pilo-cli run test` (221 passing) - [x] `pnpm run typecheck` green - [x] `pnpm run format:check` green
MrTravisB
added a commit
that referenced
this pull request
Apr 27, 2026
…pans) (#391) ## Summary Third PR in Stack A. Stacked on #390 (A2). Adds a `recordSanitizedException(span, error, opts?)` helper to `packages/core/src/telemetry/tracing.ts` and migrates all 14 existing `span.recordException` callsites. The helper emits the OTel-standard `exception` event with **only** `exception.type` (the error's class name), deliberately omitting `exception.message` and `exception.stacktrace` - both of which routinely embed user input or page content in this codebase (agent errors quote the task, Playwright errors include selectors derived from the DOM, AI SDK errors can echo prompt fragments). Also bundles the related leak from `span.setStatus({ code: ERROR, message: error.message })`: at each migrated callsite, the status message is now `error.constructor.name` instead of `error.message`. Same leak class, same fix. ### New helper ```ts recordSanitizedException(span, error, { code?: string }) ``` - Emits `addEvent("exception", { "exception.type": errorClass })` (OTel semconv) - Sets `pilo.error.class` span attribute (fallback for tools that don't use OTel exception events) - Sets `pilo.error.code` span attribute when `opts.code` is provided - Never sets `exception.message`, `exception.stacktrace`, or the raw error object - Handles non-Error throws (class = `"Unknown"`) - Zero-overhead no-op when `@opentelemetry/api` is not installed ### Callsites migrated (14 total) - `packages/core/src/webAgent.ts` (6) - `packages/core/src/browser/playwrightBrowser.ts` (5) - `packages/core/src/tools/webActionTools.ts` (1) - `packages/core/src/tools/searchTools.ts` (1) - `packages/core/src/utils/retry.ts` (2) ### Tests - 8 new tests in `packages/core/test/telemetry/tracing.test.ts`: - Sets `pilo.error.class` from constructor name - Emits `exception` event with only `exception.type` - Never emits `exception.message` or `exception.stacktrace` - Never calls `span.recordException` (avoids OTel default message+stack emission) - Handles non-Error throws (class = `"Unknown"`) - Sets `pilo.error.code` when opts.code provided - Omits `pilo.error.code` when opts.code not provided - Works with no-op span when OTel is not installed - Canary: `new Error("DO NOT LOG THIS SENTINEL")` - sentinel must not appear in any setAttribute or addEvent call ### Span interface extension Added `addEvent(name, attributes?)` to the thin `Span` interface and the no-op span factory. No runtime behavior change when OTel is not installed. ## Test plan - [ ] `pnpm --filter pilo-core run test` (666 passing: 658 baseline + 8 new) - [ ] `pnpm --filter pilo-server run test` (78 passing) - [ ] `pnpm --filter pilo-cli run test` (221 passing) - [ ] `pnpm run typecheck` green - [ ] `pnpm run format:check` green - [ ] Manual: run a task with `OTEL_EXPORTER_OTLP_ENDPOINT` pointed at a local collector. Force an error (e.g. bad URL). Inspect the span: should have `pilo.error.class` attribute and an `exception` event with only `exception.type`. No `exception.message` or `exception.stacktrace`. ## Notes - Base: `travis/pilo-sanitized-error-shape` (stacks on #390) - A4 next: CI sentinel-leak test (spans the full request path: logs + Sentry + OTel trace export) - Not in this PR: setStatus message for NON-migrated code paths (none remain in core - all 14 callsites are now using the sanitized helper)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Second PR in Stack A. Stacked on #389 (A1).
Additive change - no existing field is removed, no caller breaks. Adds
structured classification fields to the error response so dashboards and
programmatic retry logic can reason about failures without parsing
messagestrings.What's added
error.class- error constructor name (e.g."TypeError","SyntaxError")error.reason- coarse-grained enum, safe for metric labels:INVALID_REQUEST,PROVIDER_UNAUTHORIZED,NAVIGATION_TIMEOUT,BROWSER_DISCONNECTED,MAX_ITERATIONS,MAX_ERRORS,TIMEOUT,INTERNAL_ERRORerror.recoverable- boolean derived fromerror instanceof RecoverableErrorerror.phase- which pipeline phase produced the error:"setup"or"execution"What stays (unchanged for consumers)
error.message- human-readable string, still presenterror.timestamp- ISO timestamp, still presenterror.code- fine-grained semantic code, still presenterror.taskId- from A1, still presentThe data-sensitivity invariant
error.messagemust never contain rawerror.messageof a thrown value(agent errors can quote task text, URL selectors derived from page content,
or extracted values). The new code enforces this two ways:
("Task is required", "AI provider is not configured.", etc.). Server-controlled
by construction.
errorResponseFromError, which setsmessagefrom a fixedREASON_HINTSmap keyed by thereasonenum. No codepath reads
error.messageof the thrown value into the response.A canary test throws
new Error("DO NOT LOG THIS SENTINEL")and asserts thesentinel never appears in the serialized response.
Before / after
Before (old code, still passes through tabs-api to customers):
{ "success": false, "error": { "message": "Task is required", "code": "MISSING_TASK", "timestamp": "2026-04-21T12:00:00.000Z" } }After (additive, still readable by old consumers):
{ "success": false, "error": { "message": "Task is required", "code": "MISSING_TASK", "timestamp": "2026-04-21T12:00:00.000Z", "class": "Error", "reason": "INVALID_REQUEST", "recoverable": false, "phase": "setup", "taskId": "..." } }Minor security improvements (bundled)
Dropped two WS error paths that previously echoed client-provided strings
back in the message:
UNKNOWN_EVENTno longer quotesmsg.eventUNKNOWN_REQUEST_IDno longer quotes the requestIdINVALID_SEARCH_PROVIDERno longer echoes the user-provided valueChanges
packages/server/src/taskRunner.ts- newErrorReason/ErrorPhasetypes, extended
ErrorResponse, newcreateErrorResponse(object-paramsignature) and
errorResponseFromErrorhelpers,classifyErrormapper,REASON_HINTSmap.errorToStringremoved (no callers left).packages/server/src/routes/pilo.ts- validation errors use new shape viavalidateTaskRequest; setup-error path usescreateErrorResponsewithexplicit
reason: INVALID_REQUEST; task-execution path useserrorResponseFromError.packages/server/src/routes/piloWs.ts- allcreateErrorResponsecallsitesmigrated; task-execution error uses
errorResponseFromError; droppedclient-provided values from error messages.
taskRunner.test.ts/pilo.test.ts/piloWs.test.tsupdated toassert new fields alongside existing
message/timestamp/code. Addedcanary tests that assert no sentinel string ever appears in the serialized
response regardless of what the thrown value's
.messageis.Test plan
pnpm --filter pilo-server run test(78 passing)pnpm --filter pilo-core run test(658 passing)pnpm --filter pilo-cli run test(221 passing)pnpm run typecheckgreenpnpm run format:checkgreenError("DO NOT LOG THIS SENTINEL")never appears in response JSONcurl -X POST /pilo/run -d '{}'returns 400 with both the oldmessage: "Task is required"AND the newreason: "INVALID_REQUEST",phase: "setup",class: "Error",recoverable: falseNotes
travis/pilo-task-id(stacks on A1: propagate taskId through pilo server and core #389)message/timestampoutright - that removed visibility for consumers with no safety benefit
(both fields were already server-controlled). This revision keeps both,
enforces server-control at the helper layer, and adds classification
fields alongside.
recordSanitizedExceptionwrapper so spanpayloads also stop leaking raw error messages.