Skip to content

fix(eval): surface provider failure cause and harden trajectory redaction#834

Merged
Yiminnn merged 1 commit into
mainfrom
fix/830-proxy-failure-detail
Jun 27, 2026
Merged

fix(eval): surface provider failure cause and harden trajectory redaction#834
Yiminnn merged 1 commit into
mainfrom
fix/830-proxy-failure-detail

Conversation

@Yiminnn

@Yiminnn Yiminnn commented Jun 27, 2026

Copy link
Copy Markdown
Collaborator

Summary

Closes the remaining open half of #830 (fix #2fix #1 landed in #831): durably surface the upstream provider error cause for triage, and keep it — plus every newly-surfaced failure body — free of leaked secrets.

Provider failure cause (Option A)

The pi/litellm failure callback set error.message = str(response_obj), but on a deterministic provider reject (context-window overflow, etc.) litellm fires the hook with response_obj=None, so the message became the literal "None" and the real cause was lost on teardown. It now falls back to kwargs['exception'], so error.message carries the real upstream reason. This rides the existing redacted llm_trajectory.jsonl pipeline; result.json stays status-code-only, so the #546/#564 secret posture is untouched.

Redaction hardening

Option A now routes the full provider error body into error.message for all failure types, so redact_trajectory_text had to be hardened. Two adversarial verification passes drove these out (each empirically reproduced, then locked with a test):

  • Escaped-JSON bypass (was leaking): carriers now fire on json.dumps-escaped JSON (\"k\": \"v\") at any escape depth — the exact form Trajectory.to_jsonl produces — not just raw/dict-repr.
  • Catastrophic O(n²) ReDoS (pre-existing, exposed here): a base64 image field stalled the redactor ~45 s → now <0.5 s. Variable name prefixes are length-capped and the env/JSON carriers are left-anchored.
  • New coverage: URL userinfo for credential-bearing schemes (incl. un-encoded @ in passwords); sk-benchflow/svcacct/admin/or-v1, gsk_/xai-/r8_/hf_/fw_, and JWT key families; AWS_SECRET_ACCESS_KEY / *_ACCOUNT_KEY; GCP PEM private-key blocks; master_key/private_key carriers; AWS SigV4 / Azure SAS query params.
  • Over-redaction guards: kebab slugs (sk-proj-refactor-...), primary_key/foreign_key, ?key=name, short ?sig=<version>, vscode:// deep-links, eyJ-prefixed method chains.

Reproduction → Verification

# Before (issue #830): cause lost
error.message = "None"   (response_obj=None; real reason only in the deleted proxy stderr.log)

# After: cause preserved (redacted), result.json unchanged
error.message = "...Requested token count exceeds the model's maximum context length of 16384 tokens..."

Test plan

  • pytest tests/trajectories/test_redaction.py tests/test_litellm_logging.py — 108 passed (incl. escaped-JSON, double-escape, ReDoS timing guards, AWS/GCP key vectors, over-redaction guards)
  • Full suite: 4528 passed (4 pre-existing unrelated failures, confirmed on the clean base)
  • ruff check + ruff format --check + ty check clean
  • Two adversarial verification passes (leak / over-redaction / ReDoS), all findings fixed + re-verified

Related

Refs #830 (intentionally left open — a few LOW residuals noted below are deferred). fix #1: #831.

Consciously deferred (LOW / noted)

  • A secret value containing a literal & truncates at the & (query-sibling protection prioritized; real keys don't contain &).
  • SigV4 Signature= inside an Authorization header (over-redaction risk on the common word "signature").
  • Non-allowlisted credential URL schemes (e.g. clickhouse://, kafka://).

🤖 Generated with Claude Code

…tion

Issue #830 fix#2: durably surface the upstream provider error cause for
triage, and keep it (plus all newly-surfaced failure bodies) free of
leaked secrets.

Provider failure cause (Option A):
- The pi/litellm failure callback now falls back to kwargs['exception']
  when response_obj is None, so error.message carries the real upstream
  reason (e.g. a context-window overflow) instead of the literal 'None'.
  It rides the existing redacted llm_trajectory.jsonl pipeline; result.json
  stays status-code-only, so the #546/#564 posture is preserved.

Redaction hardening (Option A now surfaces full provider bodies for ALL
failure types, so redact_trajectory_text must cover them safely):
- Carriers fire on json.dumps-escaped JSON (\"k\": \"v\") at any escape
  depth -- the form Trajectory.to_jsonl produces -- not just raw/dict-repr.
- Fix catastrophic O(n^2) ReDoS on long alnum runs (a base64 image field
  stalled the redactor ~45s): length-cap the variable name prefixes and
  left-anchor the env/JSON carriers.
- New coverage: URL userinfo for credential-bearing schemes (incl.
  un-encoded '@' in passwords), sk-benchflow/svcacct/admin/or-v1 and
  gsk_/xai-/r8_/hf_/fw_ and JWT key families, AWS_SECRET_ACCESS_KEY /
  *_ACCOUNT_KEY, GCP PEM private-key blocks, master_key/private_key
  carriers, AWS SigV4 / Azure SAS query params.
- Avoid over-redaction: kebab slugs (sk-proj-refactor-...), primary_key /
  foreign_key, ?key=name, short ?sig=<version>, vscode:// deep-links, and
  eyJ-prefixed method chains.

Refs #830 (fix#1 landed in #831).

Verified: 108 redaction/logging tests incl. ReDoS timing guards; full
suite 4528 passed; two adversarial verification passes; ruff + ty clean.
@Yiminnn Yiminnn temporarily deployed to pypi-internal-preview June 27, 2026 02:55 — with GitHub Actions Inactive
@greptile-apps

greptile-apps Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR surfaces upstream provider failure reasons (e.g. context-window overflow) that were previously swallowed as the literal string \"None\" when litellm fires the failure callback with response_obj=None. It also substantially hardens the trajectory redaction layer against the new surface area introduced by routing full provider error bodies into error.message.

  • Provider failure fix: _failure_detail() falls back to kwargs['exception'] when response_obj is None, so error.type/message carry the real upstream reason. The existing response field is unchanged (stays null), and result.json remains status-code-only.
  • Redaction hardening: New and updated patterns cover escaped-JSON bypass (via _ESCQ), ReDoS on base64 image fields (via length-capped _NAME), URL userinfo credentials, curated query-param secrets, JWT tokens, PEM key blocks, and a dozen additional provider key families (Groq, xAI, Replicate, HuggingFace, Fireworks, AWS, Azure). Extensive parametrized tests—including ReDoS timing guards and over-redaction guards—are added for each.

Confidence Score: 4/5

Safe to merge; the functional change is a one-liner fallback and the redaction hardening is extensively tested with both leak and over-redaction guards.

The provider-failure fix is minimal and correct — _failure_detail uses is not None (not truthiness), so falsy-but-valid response objects are handled safely. The redaction expansion is complex but every new pattern is paired with a parametrized test that verifies both the positive (secret redacted) and negative (non-secret preserved) case, plus ReDoS timing guards. The two comments are style-level: a comment that overstates the escape-depth bound, and a scenario where traceback.format_exc() could diverge from error.message if litellm clears the exception context before invoking the async callback.

src/benchflow/trajectories/types.py — the redaction pattern list is now long and intricate; future additions should continue to follow the length-capped, left-anchored pattern established here to avoid re-introducing ReDoS.

Important Files Changed

Filename Overview
src/benchflow/providers/litellm_logging.py Adds _failure_detail() inside the embedded callback module source to fall back to kwargs['exception'] when response_obj is None; async_log_failure_event is updated to use it for error.type/message while keeping the response field unchanged.
src/benchflow/trajectories/types.py Adds _ESCQ / _SECVAL / _NAME helper atoms and substantially expands _REDACTION_PATTERNS with PEM key, JWT, URL userinfo, query-param, and provider key family patterns. ReDoS is addressed with length caps and left-anchors throughout.
tests/test_litellm_logging.py Adds _callback_namespace() helper and four new tests covering _failure_detail() unit behavior and an end-to-end failure-event write path; async test is correctly handled by the project's asyncio_mode=auto config.
tests/trajectories/test_redaction.py Adds comprehensive parametrized tests for all new redaction vectors (25+ leak cases, 20+ over-redaction guards, double-escape path, and ReDoS timing guards). Coverage appears exhaustive for the documented threat model.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant LiteLLM
    participant CB as async_log_failure_event
    participant FD as _failure_detail
    participant RD as redact_trajectory_text
    participant JL as llm_trajectory.jsonl

    LiteLLM->>CB: "response_obj=None, kwargs[exception]=ContextWindowError"
    CB->>FD: "response_obj=None, exception=ContextWindowError"
    FD-->>CB: "detail = ContextWindowError (fix: was None before)"
    CB->>JL: "write {error.type=ContextWindowError, error.message=...tokens...}"
    Note over CB,JL: "response field stays null"

    LiteLLM->>CB: "response_obj=ErrorResponse"
    CB->>FD: "response_obj=ErrorResponse"
    FD-->>CB: "detail = ErrorResponse (existing path unchanged)"
    CB->>JL: "write {error.type=ErrorResponse, ...}"

    JL->>RD: "json.dumps-escaped line with potential secrets"
    Note over RD: "_ESCQ handles backslash-escaped quotes"
    Note over RD: "_NAME cap prevents ReDoS"
    RD-->>JL: "line with ***REDACTED*** substitutions"
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant LiteLLM
    participant CB as async_log_failure_event
    participant FD as _failure_detail
    participant RD as redact_trajectory_text
    participant JL as llm_trajectory.jsonl

    LiteLLM->>CB: "response_obj=None, kwargs[exception]=ContextWindowError"
    CB->>FD: "response_obj=None, exception=ContextWindowError"
    FD-->>CB: "detail = ContextWindowError (fix: was None before)"
    CB->>JL: "write {error.type=ContextWindowError, error.message=...tokens...}"
    Note over CB,JL: "response field stays null"

    LiteLLM->>CB: "response_obj=ErrorResponse"
    CB->>FD: "response_obj=ErrorResponse"
    FD-->>CB: "detail = ErrorResponse (existing path unchanged)"
    CB->>JL: "write {error.type=ErrorResponse, ...}"

    JL->>RD: "json.dumps-escaped line with potential secrets"
    Note over RD: "_ESCQ handles backslash-escaped quotes"
    Note over RD: "_NAME cap prevents ReDoS"
    RD-->>JL: "line with ***REDACTED*** substitutions"
Loading

Comments Outside Diff (1)

  1. src/benchflow/providers/litellm_logging.py, line 225-239 (link)

    P2 traceback field may not align with the newly-surfaced error.message

    traceback.format_exc() captures the exception that is currently active at the moment the callback fires. For the primary case this PR fixes — litellm calling the hook from within a try/except with response_obj=None — the active exception is the same object stored in kwargs['exception'], so traceback and error.message will agree. However, if litellm's plumbing clears the exception context before invoking the callback (or routes through an async boundary that resets it), format_exc() returns "NoneType: None\n" while error.message now carries the real cause. A reader of the log then sees a meaningful error.message but a useless empty traceback. Adding traceback.format_exception(detail) as a fallback when format_exc() returns the None sentinel would make the two fields consistent.

Reviews (1): Last reviewed commit: "fix(eval): surface provider failure caus..." | Re-trigger Greptile

# form is how a provider error body embedded in ``error.message`` reaches the
# redactor after ``Trajectory.to_jsonl`` runs ``json.dumps`` over the record
# before redacting it (#830). The ``{0,8}`` cap keeps it ReDoS-bounded.
_ESCQ = r'\\{0,8}["\']?'

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 _ESCQ comment overstates escape-depth coverage

The comment says "at any escape depth" but \\{0,8} tops out at 8 backslashes, which reliably covers ~3 levels of json.dumps nesting (level 1 → 1 backslash per quote, level 2 → 3, level 3 → 7). A fourth json.dumps would produce 15 backslashes and would not be matched. In practice the deepest realistic nesting is a single to_jsonl pass over a record that already contains an escaped-JSON string (level 2), so the bound is perfectly adequate — the comment should say "up to ~3 nesting levels" rather than "any nesting depth" to avoid misleading future maintainers who widen the patterns.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@Yiminnn Yiminnn merged commit 35ebe39 into main Jun 27, 2026
8 checks passed
Yiminnn added a commit that referenced this pull request Jun 27, 2026
…age (#835)

Follow-up to #834 (Greptile P2). Add _failure_traceback(detail): when format_exc() is the 'NoneType: None' sentinel (litellm cleared the exception context) but detail is the cause recovered from kwargs['exception'], format that exception directly so the traceback isn't blank under a meaningful error.message.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant