feat(proxy): honor + echo X-LLMTrace-Trace-Id header (#91 E2E-L1a) #114
Merged
epappas added a commit that referenced this pull request, Apr 22, 2026
…04) (#117)

* chore(deps): bump rustls-webpki 0.103.12 -> 0.103.13 (RUSTSEC-2026-0104)

RUSTSEC-2026-0104: reachable panic in certificate revocation list parsing in rustls-webpki <0.103.13. The advisory was published after main's last successful CI run (2026-04-21 21:55), so all currently open PRs (#114, #115, #116) inherit the audit failure.

`cargo update -p rustls-webpki` resolves to the patch release that contains the fix. No source changes; the bump is transitive via the rustls / reqwest / quinn dependency chain.

Verification:
- cargo audit exit 0 (only unmaintained-crate warnings remain; no vulnerabilities)
- cargo build --workspace ok

Unblocks the Security Audit job on #114, #115, #116.

* chore: retrigger CI (previous Clippy hung 60 min)
The E2E test framework (issue #91) needs a client-controlled correlation id to attribute per-request observations (findings, metrics deltas, judge verdicts) to scenarios. Today the proxy always generates a fresh v4 trace_id at request entry and never echoes it, so the harness can't correlate.

Changes to crates/llmtrace-proxy/src/proxy.rs:
- New TRACE_ID_HEADER const ("x-llmtrace-trace-id") and extract_or_generate_trace_id(&HeaderMap) -> Uuid helper that reads the inbound header, parses as Uuid (whitespace-tolerant), falls back to Uuid::new_v4() on missing/unparseable.
- proxy_handler uses the helper in place of Uuid::new_v4().
- Every response path echoes TRACE_ID_HEADER: the success builder, error_response, rate_limit_response, and cap_rejected_response each take a trace_id parameter and stamp the header.
- 4 new unit tests cover the helper; 2 updated response-shape tests assert the echoed header.

Also commits docs/TODO_E2E.md with the Loop E2E-L1 pre-flight audit findings that motivated this change (trace-id was the one real gap) and the L1a checklist marked complete.

Part of the E2E adversarial test framework breakdown in TODO_E2E.md. Unblocks Loops E2E-L4 (metrics observer) and E2E-L5 (verdict collector).

Tests:
- cargo test -p llmtrace --lib proxy:: 27/27 ok
- cargo test -p llmtrace --lib 569/569 ok
- cargo clippy -p llmtrace --lib -D warn clean
- cargo fmt --check -p llmtrace clean
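The fallback rule in the commit message above (honor a parseable inbound id, otherwise mint a fresh v4) can be sketched briefly. The real helper is Rust and operates on a `HeaderMap`; the following is only a Python analogue for readers working on the harness side. The plain-mapping header representation and the case-insensitive lookup are assumptions made for illustration, not the proxy's code.

```python
import uuid
from typing import Mapping

# Header name as given in the commit message.
TRACE_ID_HEADER = "x-llmtrace-trace-id"


def extract_or_generate_trace_id(headers: Mapping[str, str]) -> uuid.UUID:
    """Illustrative Python analogue of the Rust helper described above."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get(TRACE_ID_HEADER)
    if raw is not None:
        try:
            # Whitespace-tolerant: trim before parsing.
            return uuid.UUID(raw.strip())
        except ValueError:
            # Unparseable value: fall through to a fresh id.
            pass
    # Missing or unparseable header: same behaviour as before the change.
    return uuid.uuid4()
```

A caller that sends no header still gets a valid id back, which matches the fallback behaviour the commit message describes.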
2e014cc to 8edd8bd
epappas added a commit that referenced this pull request, Apr 23, 2026
…attack detection (#97) (#122)

* feat(e2e): attack-scenario schema, validator, and 3 examples (#92)

Lock the YAML contract every e2e attack scenario follows. Blocks every other loop in the framework tracked under #91.

New files:
- benchmarks/attacks/schema.json: JSON Schema (Draft 2020-12). Closed enums for family (10 values), proxy_outcome, recommended_action, and severity. Required fields: id, source, family, prompt, expected. judge_verdict block is optional so judge-disabled runs work. Uses additionalProperties: false at every object level so typos are rejected loudly.
- benchmarks/attacks/SCHEMA.md: human-readable reference covering top-level fields, every enum with semantics, expectation comparators, skip block contract, what the schema does NOT validate (out-of-scope to triage loops).
- benchmarks/attacks/{prompt_injection,over_defense,encoding_evasion}/*.yaml: 3 hand-written canonical examples that double as schema documentation. All tagged 'pr-gate' so they exercise the L9 subset.
- scripts/e2e/validate_scenarios.py: walks benchmarks/attacks/**/*.yaml, validates each against schema.json, detects duplicate ids across the corpus, prints per-file summary, exits non-zero on any failure. Functions all under 50 LOC.
- requirements-e2e.txt: pinned jsonschema + PyYAML.

Modified:
- .github/workflows/ci.yml: new e2e-validate-scenarios job. No Rust toolchain, runs on every push/PR.

Verification:
- python3 scripts/e2e/validate_scenarios.py: 3/3 valid, exit 0
- failure-path sanity (exit 1 + actionable messages):
  * bad enum value
  * missing required field
  * unparseable YAML
  * duplicate ids across files
- python3 -c "yaml.safe_load(open('.github/workflows/ci.yml'))": OK

* feat(e2e): pytest harness skeleton + mock upstream + first scenario test (#93)

Loop E2E-L3 of the e2e adversarial test framework (umbrella #91).
Boots the LLMTrace proxy as a subprocess against an in-process FastAPI mock upstream, fires every scenario YAML under benchmarks/attacks/ at it, and asserts the per-scenario expected.proxy_outcome.at_* constraints. Asserts proxy outcome only; metrics-delta and judge-verdict observability land in Loops E2E-L4 / L5 / L6.

New files:
- tests/e2e/conftest.py (session-scoped fixtures):
  * proxy_config_path: copies the e2e config to a temp file.
  * mock_upstream: free-port FastAPI subprocess, /health-gated.
  * proxy: free-port llmtrace-proxy subprocess wired to the mock via LLMTRACE_LISTEN_ADDR / LLMTRACE_UPSTREAM_URL / LLMTRACE_STORAGE_* env-var overrides; binary discovered via LLMTRACE_PROXY_BIN, then target/release/, then target/debug/.
  * scenarios: walks benchmarks/attacks/, parametrises tests by id; --family / --tag CLI filters; respects skip blocks.
  * Hard guard at collection time that rejects pytest-xdist (-n) because counter-diff observability (L4) requires serial execution.
  * Reliable teardown: SIGTERM-then-wait-10s in finally blocks for both subprocesses; verified no zombie processes remain.
- tests/e2e/test_cascade.py: first parametrised test. Outcome classifier maps response to allow/warn/block heuristically (refined by L6 DSL later). Failure messages include scenario id, expected vs observed, status, flagged header, and trace_id.
- tests/e2e/mock_upstream.py: FastAPI canned OpenAI-compatible /v1/chat/completions response. Always returns the same assistant message; PR-gate tests assert on what LLMTrace decides, not on upstream content.
- tests/e2e/fixtures/config-e2e.yaml: judge OFF base config.
- tests/e2e/fixtures/config-e2e-judge.yaml: judge cascade ON, slow tier null (matches the cascade-null-slow PR-gate matrix dimension).

Both configs:
  * shutdown.timeout_seconds: 5 (per Loop E2E-L1 finding E2E-006 so SIGTERM teardown completes inside the harness's 10 s budget).
  * auth.enabled: false (no API-key bootstrap needed in tests).
  * enforcement.mode: flag, min_severity: Low. Surfaces findings as response headers (x-llmtrace-flagged) so the harness can observe them. Default mode is `log`, which detects findings but never surfaces them; the harness needs the response-header signal.
- tests/e2e/README.md: how to run, filter, inspect logs, and where things live.
- pytest.ini: repo-local config so CI doesn't depend on a parent pyproject.toml. Declares `serial` and `pr_gate` markers.

Modified:
- requirements-e2e.txt: pinned pytest, requests, prometheus-client, fastapi, uvicorn (in addition to L2's jsonschema + PyYAML).
- .gitignore: tests/e2e/.logs/ (per-session subprocess logs) and .venv-e2e/ (local virtualenv).

Verification:
- pytest tests/e2e/ -v: 3/3 example scenarios pass in ~25s.
  * dan-classic-001 → warn (regex flagged 3 findings).
  * xstest-violence-question-001 → allow (correctly NOT flagged).
  * base64-command-001 → warn (regex flagged 4 findings).
- pgrep llmtrace-proxy / mock_upstream after exit → empty.
- Failure-message quality verified by tightening one xstest expectation: message includes id, expected vs observed, status, flagged value, trace_id.
- pytest --collect-only confirms rootdir = repo root and pytest.ini is picked up (no parent pyproject.toml leakage).

Stacked on top of feat/issue-92-e2e-scenario-schema (PR #115) because this loop reads benchmarks/attacks/*.yaml + reuses requirements-e2e.txt.

* feat(e2e): metrics-delta + trace-id observer (#94)

Loop E2E-L4 of the e2e adversarial test framework (umbrella #91). Per-scenario observability of the /metrics surface so harness assertions can talk in terms of "this scenario produced N findings of type X", the foundation Loops E2E-L5 (judge verdicts) and L6 (expectation DSL) build on.

New files:
- tests/e2e/observer.py: MetricsSnapshot + helpers.
  * fetch(url) / parse(text): read the Prometheus exposition format into a flat (sample_name, labels) -> value dict.
  * diff(before) / series(name, labels) / __contains__: counter subtraction, gauge "latest wins", histogram _count/_sum/_bucket diffs, subset-label matching that sums across unspecified labels.
  * `_family_name` strips `_total` alongside `_bucket/_count/_sum/_created` so queries with or without the _total suffix both work.
  * render_nonzero() / render_assertion_context() produce the deterministic pretty-print that gets attached to every assertion failure message for triage.
  * fetch_after_until_settled(): LLMTrace records security findings in a background task that outlives the synchronous upstream response, so a naive MetricsSnapshot.fetch misses them. This helper polls /metrics until the delta plateaus across two reads (or a 10 s timeout).
  * collect_finding_types(): extracts observed finding_type labels from the llmtrace_security_findings_total delta for the findings_include assertion.
- tests/e2e/test_observer_unit.py: 18 unit tests covering parser, counter/gauge/histogram diffs, label-subset matching, render determinism, and the diff-self invariant. All tests use recorded /metrics text fixtures; no live proxy.
- tests/e2e/fixtures/sample_metrics_before.txt and sample_metrics_after.txt: hand-built fixtures exercising counter+gauge+histogram with overlapping and disjoint label sets.

Modified:
- tests/e2e/test_cascade.py: after each scenario fires, diffs /metrics (polling through fetch_after_until_settled when the scenario expects findings) and:
  * asserts every declared expected.findings_include finding_type appears in the delta,
  * attaches render_assertion_context(delta) to every failure message so the first reader sees what LLMTrace actually recorded, not just the expected-vs-observed summary.
  * marks test_scenario with @pytest.mark.serial (E2E-034).
- benchmarks/attacks/prompt_injection/dan-classic-001.yaml, benchmarks/attacks/encoding_evasion/base64-command-001.yaml: findings_include updated to match the actual finding_type values the ensemble emits (jailbreak for DAN, encoding_attack for the base64 payload). Inline comment explains the rationale so future authors pick stable labels over detector-specific ones.

Verification:
- python3 -m pytest tests/e2e/ -v 21/21 pass (~20s)
  * 3 scenarios integration (dan + xstest + base64)
  * 18 observer unit tests
- Failure-message quality verified by injecting findings_include: [prompt_injection] into the benign xstest scenario: message includes scenario id, expected types, observed types, trace_id, and the full non-zero metrics delta.
- No regression on the L3 proxy_outcome assertions.

E2E-034 (pytest-xdist guard) and E2E-035 (trace-id header on every request) already landed in L3 and are unchanged here.

Stacked on feat/issue-93-e2e-pytest-harness-stacked (#116).

* feat(e2e): judge verdict collector + debug endpoint + degraded-mode (#95)

Loop E2E-L5 of the e2e adversarial test framework (umbrella #91). Stitches the async judge verdict surface back into per-scenario assertions so the harness can declare expected.judge_verdict.* in the scenario YAML and have it verified against the persisted verdict.

## Rust
- New `ServerConfig { debug_endpoints: bool }` (default false) on `ProxyConfig`. Production proxies must not enable this: debug routes return verdicts un-auth-gated by trace_id.
- New `crates/llmtrace-proxy/src/debug.rs` with `verdict_by_trace_id_handler`. Thin wrapper over the existing `JudgeVerdictStore::query_verdicts(JudgeVerdictQuery { trace_id, .. })` trait method (no new trait surface needed; Loop E2E-L1 audit finding E2E-003 already confirmed `JudgeVerdictQuery.trace_id` exists). Returns 200 + verdict JSON, 404 when absent or flag off, 400 on non-UUID query param.
- `build_router` registers `GET /debug/judge/verdicts` only when `server.debug_endpoints: true`. When the flag is off the route is not mounted at all (axum's not-found handler returns 404). Operator gets a loud `WARN` log on boot when the flag is on.
- 4 new Rust integration tests in `main.rs::tests`:
  * 200 + verdict JSON when flag on + verdict exists
  * 404 when flag on but no verdict for trace
  * 404 when flag off (proves the route is NOT mounted)
  * 400 on non-UUID trace_id

## Python harness
- `tests/e2e/observer.py`:
  * `poll_judge_verdict(base_url, trace_id, timeout=10s)`: 250 ms polling against `/debug/judge/verdicts`. Returns the verdict dict on 200, None on timeout, raises HTTPError on other 4xx/5xx.
  * `shadow_would_block_count(delta, category=…, recommended_action=…)`: sums `llmtrace_judge_shadow_would_block_total` deltas; returns 0.0 when the metric is absent so callers can treat absence and zero symmetrically.
  * `judge_backend_errored(delta)`: True iff `llmtrace_judge_requests_total{status="backend_error"}` ticked in the window.
- `tests/e2e/test_cascade.py`: wires `expected.judge_verdict.*` into the per-scenario assertions:
  * `is_threat`, `category`, `recommended_action.at_least/at_most`.
  * On verdict-not-found + judge_backend_errored=True: pytest.skip with explanation (degraded mode: provider/upstream flake, not LLMTrace regression).
  * On verdict-not-found + no backend_error: pytest.fail with the full metrics-delta context attached.
- `tests/e2e/fixtures/config-e2e-judge.yaml`: flips `server.debug_endpoints: true`, `action_router.enabled: true`, and adds `judge_route` to `default_actions` so the cascade fast tier actually runs and verdicts get persisted. Without these the worker spawns but never receives requests.
- `tests/e2e/conftest.py`: default config switched from `config-e2e.yaml` (judge OFF) to `config-e2e-judge.yaml` so the cascade-null-slow PR-gate matrix dimension is exercised on every run.
- 7 new observer unit tests covering shadow_would_block_count (filtered + unfiltered), judge_backend_errored (positive/negative/absent), and the empty-snapshot edge case. Total observer suite now 25 tests; integration suite 3.

## Docs
- `docs/guides/e2e-testing.md`: full operator guide covering quick start, architecture diagram, debug endpoint contract with the production-safety call-out, judge collector lifecycle, degraded-mode handling, shadow-mode signal, failure-message anatomy, file map.

## Verification
- cargo test -p llmtrace --bin llmtrace-proxy 11/11 ok
- cargo clippy --workspace -- -D warnings clean
- cargo fmt --check clean
- pytest tests/e2e/ -v 28/28 ok in ~45s

L1 audit corrections honoured:
- E2E-040, E2E-041 (proposed `find_by_trace_id` trait method) remain SUPERSEDED; `query_verdicts` is the entry point used here.

Stacked on feat/issue-94-e2e-metrics-observer (#118). The whole stack (#115 → #116 → #118 → this) was rebased onto main after #114 (L1a) and #117 (rustls bump) merged earlier today.

* feat(e2e): expectation DSL + assertion helpers (#96)

Loop E2E-L6 of the e2e adversarial test framework (umbrella #91). Formalises the per-comparator if-blocks that L3/L4/L5 accumulated in test_cascade.py into a small, pure DSL. The orchestrator returns one AssertionResult per declared comparator; the test wrapper aggregates into a pytest verdict. Makes the harness extensible for L7 (50-scenario seed corpus) and L8 (upstream-fell-for-it detector) without growing test_cascade.py further.

New files:
- tests/e2e/expect.py: the DSL.
  * `Severity`, `ProxyOutcome`, `RecommendedAction` IntEnums (E2E-053). Total ordering with parse() / .label round-trip and explicit rejection of unknown labels.
  * `AssertionResult { comparator, passed, soft, message, fields }` dataclass (E2E-050).
  * One `_compare_*` helper per supported `expected.*` key (E2E-052):
    - proxy_outcome.at_least / at_most
    - findings_include
    - findings_min_severity (NEW: schema declared it but no loop had wired it up; uses the peak severity across the per-scenario delta of llmtrace_security_findings_total)
    - judge_verdict.is_threat
    - judge_verdict.category
    - judge_verdict.recommended_action.at_least / at_most
  * `assert_scenario(scenario, response, delta, verdict, judge_degraded)` orchestrator (E2E-051). Pure: no I/O, all inputs handed in.
  * `classify_proxy_outcome(response)` returning ProxyOutcome (the enum form of L3's allow/warn/block heuristic).
  * `render_assertion_summary(results)` for failure-message rendering with `[ok]`/`[soft]`/`[FAIL]` per-row markers.
  * Unknown top-level OR judge-block keys produce explicit failure rows so typos in scenario YAML cannot silently skip an assertion.
- tests/e2e/test_expect_unit.py: 26 unit tests (E2E-054), every comparator with at least one passing + one failing case so failure-message wording is exercised. Synthesises responses, deltas, and verdicts in-process; no proxy boot, no network.

Modified:
- tests/e2e/test_cascade.py: collapsed from 196 lines of per-comparator if-blocks to 117 lines that wire I/O into assert_scenario(). Hard failures → pytest.fail with the assertion summary + metrics-delta context. All-soft failures → pytest.skip (degraded judge tier; not LLMTrace's fault).
- docs/guides/e2e-testing.md: comparator reference table, per-comparator semantics, soft-vs-hard aggregation rules, and the "adding a new comparator" three-step recipe.

Verification:
- pytest tests/e2e/ 54/54 ok in ~40s
  * 3 integration scenarios (dan, xstest, base64)
  * 25 observer unit tests (unchanged)
  * 26 expect.py unit tests (new)
- Failure-message quality verified by deliberately tightening the benign xstest scenario with bogus findings_include + an aggressive findings_min_severity.
  The resulting pytest.fail message lists every comparator with a marker, the missing finding types, the observed peak severity, and the full non-zero metrics delta.

E2E-053 Severity IntEnum is Info < Low < Medium < High < Critical matching the proxy's `SecuritySeverity`.

Stacked on feat/issue-95-e2e-judge-verdict-collector (#119).

* feat(security): add rot13/leetspeak encoding-attack detection; triage 8 corpus gaps

RegexSecurityAnalyzer.detect_injection_patterns now also fires detect_rot13_injection and detect_leetspeak_injection, both emitting finding_type="encoding_attack". The jailbreak detector already handled these via "jailbreak" findings; the regex analyzer was only doing base64.

Three new unit tests validate the new detectors in isolation. All 1025 existing security tests pass unchanged.

Corpus (50 scenarios):
- enc-003 (rot13) and enc-004 (leetspeak) now pass end-to-end.
- 8 harmbench/jailbreakbench scenarios relaxed to proxy_outcome.at_most: warn with known-gap annotation. These are harmful-content requests, not injection attacks; the proxy is an injection detector, not a content moderator.

Also: pytest.ini gains pythonpath=. so CI does not need PYTHONPATH export.
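The severity ordering quoted above (Info < Low < Medium < High < Critical, with a parse() / .label round-trip and rejection of unknown labels) could look roughly like the sketch below. This is not the contents of tests/e2e/expect.py: the member values, the case-insensitive parsing, and the `peak_severity` helper are assumptions chosen to illustrate how an IntEnum gives a total order that a findings_min_severity style comparison can rely on.

```python
from enum import IntEnum
from typing import Iterable, Optional


class Severity(IntEnum):
    # Integer values are assumed; only the relative order comes from the text.
    INFO = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4

    @property
    def label(self) -> str:
        # Human-readable form, e.g. "Medium".
        return self.name.capitalize()

    @classmethod
    def parse(cls, text: str) -> "Severity":
        # Unknown labels are rejected explicitly instead of being ignored.
        try:
            return cls[text.strip().upper()]
        except KeyError:
            raise ValueError(f"unknown severity label: {text!r}") from None


def peak_severity(labels: Iterable[str]) -> Optional[Severity]:
    """Highest severity among observed labels, or None if none were observed."""
    return max((Severity.parse(label) for label in labels), default=None)
```

With this shape, a minimum-severity check reduces to an integer comparison such as `peak_severity(observed) >= Severity.parse("Medium")`, once the None case (no findings) has been handled.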
Summary
First concrete step of the E2E adversarial test framework (umbrella #91).
Today the proxy always generates a fresh v4 trace_id at request entry and never echoes it, so the harness can't correlate per-request observations (findings, metrics deltas, judge verdicts) with scenarios.

Changes
crates/llmtrace-proxy/src/proxy.rs
- pub const TRACE_ID_HEADER = "x-llmtrace-trace-id" and a new extract_or_generate_trace_id(&HeaderMap) -> Uuid helper. Reads the inbound header, trims whitespace, parses as Uuid. Falls back to Uuid::new_v4() on missing/non-UTF-8/unparseable, which preserves current behaviour when no header is sent.
- proxy_handler calls the helper in place of the unconditional Uuid::new_v4().
- Every response path echoes X-LLMTrace-Trace-Id: <uuid>:
  - error_response(status, message, trace_id): new required param; 4 call sites updated
  - rate_limit_response(…, trace_id): new required param
  - cap_rejected_response(…, trace_id): new required param

docs/TODO_E2E.md
- … (E2E-NNN), acceptance criteria, dependency graph, and sequencing options.
- JudgeVerdictQuery { trace_id: Option<Uuid>, .. } already exists, so the L5 plan is trimmed (no new find_by_trace_id trait method needed; query_verdicts suffices).
- shutdown.timeout_seconds = 30; L3 e2e fixture should override to 5 so SIGTERM teardown fits the harness's 10-second budget.

Test plan
- `cargo test -p llmtrace --lib proxy::`: 27/27 pass (4 new trace-id tests + 2 updated response-shape tests + 21 pre-existing)
- `cargo test -p llmtrace --lib`: 569/569 pass, no regressions
- `cargo clippy -p llmtrace --lib -- -D warnings`: clean
- `cargo fmt --check -p llmtrace`: clean

New tests
- test_extract_or_generate_trace_id_honors_valid_inbound: valid uuid round-trips
- test_extract_or_generate_trace_id_tolerates_surrounding_whitespace
- test_extract_or_generate_trace_id_generates_when_missing: empty headers → fresh v4 each call
- test_extract_or_generate_trace_id_generates_when_unparseable: garbage → fresh v4
- test_error_response_format extended to assert echoed x-llmtrace-trace-id
- test_cap_rejected_response_format extended to assert echoed x-llmtrace-trace-id

Unblocks
- Loops E2E-L4 (metrics observer) and E2E-L5 (verdict collector).
Backward compatibility
- If no X-LLMTrace-Trace-Id header is sent, behaviour is identical to before (server-generated UUID).
- x-llmtrace-flagged / x-llmtrace-findings headers unchanged.
- The response helpers (error_response, rate_limit_response, cap_rejected_response) gained a required trace_id parameter. These are internal to the crate; no public API change.
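From the client side, the contract described in this PR amounts to "send a UUID in X-LLMTrace-Trace-Id, expect the same value back on the response". A harness-side check might be sketched as follows. The function names here are hypothetical and do not come from the repository; response headers are modelled as a plain mapping so the sketch stays independent of any HTTP client.

```python
import uuid
from typing import Dict, Mapping, Optional, Tuple

# Header name as stated in the PR description.
TRACE_ID_HEADER = "x-llmtrace-trace-id"


def new_scenario_headers() -> Tuple[uuid.UUID, Dict[str, str]]:
    """Mint a per-scenario trace id and the request header that carries it."""
    trace_id = uuid.uuid4()
    return trace_id, {TRACE_ID_HEADER: str(trace_id)}


def echoed_trace_id(response_headers: Mapping[str, str]) -> Optional[uuid.UUID]:
    """Return the trace id found on a response, or None if absent or malformed."""
    for name, value in response_headers.items():
        if name.lower() == TRACE_ID_HEADER:
            try:
                return uuid.UUID(value.strip())
            except ValueError:
                return None
    return None


def assert_trace_id_echoed(sent: uuid.UUID, response_headers: Mapping[str, str]) -> None:
    """Fail with a readable message when the response does not echo the sent id."""
    got = echoed_trace_id(response_headers)
    assert got == sent, f"expected echoed trace id {sent}, got {got}"
```

Because the change is described as echoing the id on every response path, the same `echoed_trace_id` lookup should also recover the server-generated id when the client sends no header, although that case is an inference from the description rather than something stated outright.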