Skip to content

A2: add sanitized error classification fields (additive)#390

Closed
MrTravisB wants to merge 2 commits intotravis/pilo-task-idfrom
travis/pilo-sanitized-error-shape
Closed

A2: add sanitized error classification fields (additive)#390
MrTravisB wants to merge 2 commits intotravis/pilo-task-idfrom
travis/pilo-sanitized-error-shape

Conversation

@MrTravisB
Copy link
Copy Markdown
Collaborator

Summary

Second PR in Stack A. Stacked on #389 (A1).

Additive change - no existing field is removed, no caller breaks. Adds
structured classification fields to the error response so dashboards and
programmatic retry logic can reason about failures without parsing
message strings.

What's added

  • error.class - error constructor name (e.g. "TypeError", "SyntaxError")
  • error.reason - coarse-grained enum, safe for metric labels: INVALID_REQUEST,
    PROVIDER_UNAUTHORIZED, NAVIGATION_TIMEOUT, BROWSER_DISCONNECTED,
    MAX_ITERATIONS, MAX_ERRORS, TIMEOUT, INTERNAL_ERROR
  • error.recoverable - boolean derived from error instanceof RecoverableError
  • error.phase - which pipeline phase produced the error: "setup" or "execution"

What stays (unchanged for consumers)

  • error.message - human-readable string, still present
  • error.timestamp - ISO timestamp, still present
  • error.code - fine-grained semantic code, still present
  • error.taskId - from A1, still present

The data-sensitivity invariant

error.message must never contain raw error.message of a thrown value
(agent errors can quote task text, URL selectors derived from page content,
or extracted values). The new code enforces this two ways:

  1. Validation / setup errors use hardcoded string literals at the callsite
    ("Task is required", "AI provider is not configured.", etc.). Server-controlled
    by construction.
  2. Task execution errors go through errorResponseFromError, which sets
    message from a fixed REASON_HINTS map keyed by the reason enum. No code
    path reads error.message of the thrown value into the response.

A canary test throws new Error("DO NOT LOG THIS SENTINEL") and asserts the
sentinel never appears in the serialized response.

Before / after

Before (old code, still passes through tabs-api to customers):

{
  "success": false,
  "error": {
    "message": "Task is required",
    "code": "MISSING_TASK",
    "timestamp": "2026-04-21T12:00:00.000Z"
  }
}

After (additive, still readable by old consumers):

{
  "success": false,
  "error": {
    "message": "Task is required",
    "code": "MISSING_TASK",
    "timestamp": "2026-04-21T12:00:00.000Z",
    "class": "Error",
    "reason": "INVALID_REQUEST",
    "recoverable": false,
    "phase": "setup",
    "taskId": "..."
  }
}

Minor security improvements (bundled)

Dropped two WS error paths that previously echoed client-provided strings
back in the message:

  • UNKNOWN_EVENT no longer quotes msg.event
  • UNKNOWN_REQUEST_ID no longer quotes the requestId
  • INVALID_SEARCH_PROVIDER no longer echoes the user-provided value

Changes

  • packages/server/src/taskRunner.ts - new ErrorReason / ErrorPhase
    types, extended ErrorResponse, new createErrorResponse (object-param
    signature) and errorResponseFromError helpers, classifyError mapper,
    REASON_HINTS map. errorToString removed (no callers left).
  • packages/server/src/routes/pilo.ts - validation errors use new shape via
    validateTaskRequest; setup-error path uses createErrorResponse with
    explicit reason: INVALID_REQUEST; task-execution path uses
    errorResponseFromError.
  • packages/server/src/routes/piloWs.ts - all createErrorResponse callsites
    migrated; task-execution error uses errorResponseFromError; dropped
    client-provided values from error messages.
  • Tests: taskRunner.test.ts / pilo.test.ts / piloWs.test.ts updated to
    assert new fields alongside existing message/timestamp/code. Added
    canary tests that assert no sentinel string ever appears in the serialized
    response regardless of what the thrown value's .message is.

Test plan

  • pnpm --filter pilo-server run test (78 passing)
  • pnpm --filter pilo-core run test (658 passing)
  • pnpm --filter pilo-cli run test (221 passing)
  • pnpm run typecheck green
  • pnpm run format:check green
  • Canary: thrown Error("DO NOT LOG THIS SENTINEL") never appears in response JSON
  • Manual: curl -X POST /pilo/run -d '{}' returns 400 with both the old
    message: "Task is required" AND the new reason: "INVALID_REQUEST",
    phase: "setup", class: "Error", recoverable: false

Notes

  • Base: travis/pilo-task-id (stacks on A1: propagate taskId through pilo server and core #389)
  • Revised from an earlier internal design that dropped message/timestamp
    outright - that removed visibility for consumers with no safety benefit
    (both fields were already server-controlled). This revision keeps both,
    enforces server-control at the helper layer, and adds classification
    fields alongside.
  • A3 next: redacting OTel recordSanitizedException wrapper so span
    payloads also stop leaking raw error messages.

@MrTravisB MrTravisB marked this pull request as ready for review April 21, 2026 17:20
@MrTravisB MrTravisB deleted the branch travis/pilo-task-id April 27, 2026 16:13
@MrTravisB MrTravisB closed this Apr 27, 2026
MrTravisB added a commit that referenced this pull request Apr 27, 2026
**Replaces #390** (auto-closed when its base branch was deleted by the
A1 squash-merge).

## Summary
Second PR in Stack A. Stacks on the merged A1 (#389).

Adds structured classification fields to the error response so
dashboards and
programmatic retry logic can reason about failures without parsing
`message` strings. Additive only - no existing field is removed.

### What's added
- `error.class` - error constructor name (e.g. `"TypeError"`,
`"SyntaxError"`)
- `error.reason` - coarse-grained enum, safe for metric labels:
`INVALID_REQUEST`,
  `PROVIDER_UNAUTHORIZED`, `NAVIGATION_TIMEOUT`, `BROWSER_DISCONNECTED`,
  `MAX_ITERATIONS`, `MAX_ERRORS`, `TIMEOUT`, `INTERNAL_ERROR`
- `error.recoverable` - boolean derived from `error instanceof
RecoverableError`
- `error.phase` - which pipeline phase produced the error: `"setup"` or
`"execution"`

### What stays (unchanged for consumers)
- `error.message` - human-readable string, still present
- `error.timestamp` - ISO timestamp, still present
- `error.code` - fine-grained semantic code, still present
- `error.taskId` - from A1, still present

### The data-sensitivity invariant
`error.message` must never contain raw `error.message` of a thrown value
(agent errors can quote task text, URL selectors derived from page
content,
or extracted values). The new code enforces this two ways:

1. **Validation / setup errors** use hardcoded string literals at the
callsite
("Task is required", "AI provider is not configured.", etc.).
Server-controlled
   by construction.
2. **Task execution errors** go through `errorResponseFromError`, which
sets
`message` from a fixed `REASON_HINTS` map keyed by the `reason` enum. No
code
   path reads `error.message` of the thrown value into the response.

A canary test throws `new Error("DO NOT LOG THIS SENTINEL")` and asserts
the
sentinel never appears in the serialized response.

### Minor security improvements (bundled)
Dropped two WS error paths that previously echoed client-provided
strings
back in the message:
- `UNKNOWN_EVENT` no longer quotes `msg.event`
- `UNKNOWN_REQUEST_ID` no longer quotes the requestId
- `INVALID_SEARCH_PROVIDER` no longer echoes the user-provided value

### Changes
- `packages/server/src/taskRunner.ts` - new `ErrorReason` / `ErrorPhase`
types, extended `ErrorResponse`, new `createErrorResponse` (object-param
signature) and `errorResponseFromError` helpers, `classifyError` mapper,
  `REASON_HINTS` map. `errorToString` removed (no callers left).
- `packages/server/src/routes/pilo.ts` - validation errors use new shape
via
`validateTaskRequest`; setup-error path uses `createErrorResponse` with
  explicit `reason: INVALID_REQUEST`; task-execution path uses
  `errorResponseFromError`.
- `packages/server/src/routes/piloWs.ts` - all `createErrorResponse`
callsites
  migrated; task-execution error uses `errorResponseFromError`; dropped
  client-provided values from error messages.
- Tests: `taskRunner.test.ts` / `pilo.test.ts` / `piloWs.test.ts`
updated to
assert new fields alongside existing `message`/`timestamp`/`code`. Added
canary tests that assert no sentinel string ever appears in the
serialized
  response regardless of what the thrown value's `.message` is.

## Test plan
- [x] `pnpm --filter pilo-server run test` (78 passing)
- [x] `pnpm --filter pilo-core run test` (658 passing)
- [x] `pnpm --filter pilo-cli run test` (221 passing)
- [x] `pnpm run typecheck` green
- [x] `pnpm run format:check` green
MrTravisB added a commit that referenced this pull request Apr 27, 2026
…pans) (#391)

## Summary
Third PR in Stack A. Stacked on #390 (A2).

Adds a `recordSanitizedException(span, error, opts?)` helper to
`packages/core/src/telemetry/tracing.ts` and migrates all 14 existing
`span.recordException` callsites. The helper emits the OTel-standard
`exception` event with **only** `exception.type` (the error's class
name),
deliberately omitting `exception.message` and `exception.stacktrace` -
both
of which routinely embed user input or page content in this codebase
(agent errors quote the task, Playwright errors include selectors
derived
from the DOM, AI SDK errors can echo prompt fragments).

Also bundles the related leak from `span.setStatus({ code: ERROR,
message: error.message })`:
at each migrated callsite, the status message is now
`error.constructor.name`
instead of `error.message`. Same leak class, same fix.

### New helper
```ts
recordSanitizedException(span, error, { code?: string })
```
- Emits `addEvent("exception", { "exception.type": errorClass })` (OTel
semconv)
- Sets `pilo.error.class` span attribute (fallback for tools that don't
use
  OTel exception events)
- Sets `pilo.error.code` span attribute when `opts.code` is provided
- Never sets `exception.message`, `exception.stacktrace`, or the raw
error object
- Handles non-Error throws (class = `"Unknown"`)
- Zero-overhead no-op when `@opentelemetry/api` is not installed

### Callsites migrated (14 total)
- `packages/core/src/webAgent.ts` (6)
- `packages/core/src/browser/playwrightBrowser.ts` (5)
- `packages/core/src/tools/webActionTools.ts` (1)
- `packages/core/src/tools/searchTools.ts` (1)
- `packages/core/src/utils/retry.ts` (2)

### Tests
- 8 new tests in `packages/core/test/telemetry/tracing.test.ts`:
  - Sets `pilo.error.class` from constructor name
  - Emits `exception` event with only `exception.type`
  - Never emits `exception.message` or `exception.stacktrace`
- Never calls `span.recordException` (avoids OTel default message+stack
emission)
  - Handles non-Error throws (class = `"Unknown"`)
  - Sets `pilo.error.code` when opts.code provided
  - Omits `pilo.error.code` when opts.code not provided
  - Works with no-op span when OTel is not installed
- Canary: `new Error("DO NOT LOG THIS SENTINEL")` - sentinel must not
appear
  in any setAttribute or addEvent call

### Span interface extension
Added `addEvent(name, attributes?)` to the thin `Span` interface and the
no-op span factory. No runtime behavior change when OTel is not
installed.

## Test plan
- [ ] `pnpm --filter pilo-core run test` (666 passing: 658 baseline + 8
new)
- [ ] `pnpm --filter pilo-server run test` (78 passing)
- [ ] `pnpm --filter pilo-cli run test` (221 passing)
- [ ] `pnpm run typecheck` green
- [ ] `pnpm run format:check` green
- [ ] Manual: run a task with `OTEL_EXPORTER_OTLP_ENDPOINT` pointed at a
local
collector. Force an error (e.g. bad URL). Inspect the span: should have
  `pilo.error.class` attribute and an `exception` event with only
  `exception.type`. No `exception.message` or `exception.stacktrace`.

## Notes
- Base: `travis/pilo-sanitized-error-shape` (stacks on #390)
- A4 next: CI sentinel-leak test (spans the full request path: logs +
  Sentry + OTel trace export)
- Not in this PR: setStatus message for NON-migrated code paths (none
  remain in core - all 14 callsites are now using the sanitized helper)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants