Sanitize HTTP error responses — no exception class/message leakage#26
Sanitize HTTP error responses — no exception class/message leakage#26
Conversation
Three handlers interpolated `f"{type(e).__name__}: {e}"` into the JSON
error body, exposing internal field names, file paths, and user values
to any client.
- api/sessions.py: GET /api/sessions/<p>/<id> and .../stats now return
a stable generic message; full traceback goes to the server log via
current_app.logger.exception.
- api/projects.py: per-session error card drops the `error_detail`
field; `error: True` already drives the dashboard's error state.
- tests/test_error_propagation.py: 6 regression tests, including a
source-level guard against re-introducing the offending pattern.
Closes #25
📝 WalkthroughWalkthroughThis PR sanitizes three HTTP error handlers to stop returning exception class names or messages to clients, switching to server-side logging via current_app.logger.exception and generic error responses. It also removes an exposed ChangesSecurity: Error Response Sanitization
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~30 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (3)
tests/test_error_propagation.py (2)
205-214: ⚡ Quick winMake the static guard pattern-based to catch common variants
The current literal checks can miss forms like
type(exc).__name__or{err}. A regex-based guard would better enforce the anti-leak policy.Suggested patch
+import re ... - offending_patterns = [ - "type(e).__name__", # the class-name expose - "{e}\"", # bare {e} ending an f-string - "{e},", # bare {e} in a dict-value f-string - ] - for pat in offending_patterns: - assert pat not in src, ( - f"{py_file.name} contains forbidden pattern {pat!r} " - f"— see issue `#25`" - ) + forbidden_regexes = [ + r"type\(\w+\)\.__name__", + r"\{\s*\w+\s*\}", # catches bare {e}/{err}/{exc} interpolation + ] + for rx in forbidden_regexes: + assert not re.search(rx, src), ( + f"{py_file.name} matches forbidden pattern {rx!r} " + f"— see issue `#25`" + )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_error_propagation.py` around lines 205 - 214, The current literal checks in offending_patterns miss variants; replace them with regex-based guards: import re and build a list of compiled patterns (e.g., r"type\s*\(\s*\w+\s*\)\.__name__", r"\{\s*\w+\s*\}", and r"\{\s*\w+\s*,", etc.) then iterate those regexes (instead of the string literals) and assert that re.search(pattern, src) is None for each, keeping the failure message using py_file.name and the reference to issue `#25`; update variable offending_patterns to hold compiled regexes or pattern strings and use re.search to detect common variants like type(exc).__name__ and {err}.
176-181: ⚡ Quick winTighten the endpoint contract assertion in this regression test
Line 177-Line 179 currently allows nearly any response shape/status, so a broken endpoint could still pass this test as long as it doesn’t leak tokens.
Suggested patch
resp = client.get("/api/projects/myproj/sessions") - # Response shape varies — accept 200 with a list, 200 with a dict, - # or any error code as long as it doesn't leak class names. + # Contract: endpoint should return a 200 with a list payload. + assert resp.status_code == 200 body = resp.get_json() + assert isinstance(body, list) body_text = json.dumps(body) if body is not None else "" _assert_no_class_name_leak(body_text)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_error_propagation.py` around lines 176 - 181, The test currently accepts any status or body shape which hides regressions; update the assertion after calling client.get("/api/projects/myproj/sessions") to require resp.status_code == 200 and assert that body is either a list or a dict (e.g. isinstance(body, (list, dict))) before running _assert_no_class_name_leak on body_text; ensure you reference resp, body, body_text and _assert_no_class_name_leak so the test fails for unexpected statuses or shapes rather than only on token/class-name leaks.api/projects.py (1)
80-86: ⚡ Quick winHarden the error path against secondary KeyError
Line 85 accesses
s["id"]inside anexceptblock. If a malformed session entry lacksid, this can raise again and break the whole endpoint instead of returning a per-row error card.Suggested patch
except Exception: + session_id = s.get("id", "<unknown>") # Full detail (class, message, traceback) to the server log via # logger.exception. The per-session card carries only `error: True` # — the class-name+message string was a leak (issue `#25`). The # operator looks at the server log for triage. - current_app.logger.exception("Failed to parse session %s", s["id"]) + current_app.logger.exception("Failed to parse session %s", session_id) result.append({**s, "title": "Error parsing session", "error": True})🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@api/projects.py` around lines 80 - 86, The exception handler logs current_app.logger.exception("Failed to parse session %s", s["id"]) which can raise a secondary KeyError if s lacks "id"; change it to use a safe lookup (e.g., session_id = s.get("id", "<unknown>") or use s.get("id") inline) and log that variable instead, ensuring the handler never raises and still appends the per-session error card via result.append({**s, "title": "Error parsing session", "error": True}) in the same except block (reference the variable s and the current_app.logger.exception call to locate and update the code).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@api/projects.py`:
- Around line 80-86: The exception handler logs
current_app.logger.exception("Failed to parse session %s", s["id"]) which can
raise a secondary KeyError if s lacks "id"; change it to use a safe lookup
(e.g., session_id = s.get("id", "<unknown>") or use s.get("id") inline) and log
that variable instead, ensuring the handler never raises and still appends the
per-session error card via result.append({**s, "title": "Error parsing session",
"error": True}) in the same except block (reference the variable s and the
current_app.logger.exception call to locate and update the code).
In `@tests/test_error_propagation.py`:
- Around line 205-214: The current literal checks in offending_patterns miss
variants; replace them with regex-based guards: import re and build a list of
compiled patterns (e.g., r"type\s*\(\s*\w+\s*\)\.__name__", r"\{\s*\w+\s*\}",
and r"\{\s*\w+\s*,", etc.) then iterate those regexes (instead of the string
literals) and assert that re.search(pattern, src) is None for each, keeping the
failure message using py_file.name and the reference to issue `#25`; update
variable offending_patterns to hold compiled regexes or pattern strings and use
re.search to detect common variants like type(exc).__name__ and {err}.
- Around line 176-181: The test currently accepts any status or body shape which
hides regressions; update the assertion after calling
client.get("/api/projects/myproj/sessions") to require resp.status_code == 200
and assert that body is either a list or a dict (e.g. isinstance(body, (list,
dict))) before running _assert_no_class_name_leak on body_text; ensure you
reference resp, body, body_text and _assert_no_class_name_leak so the test fails
for unexpected statuses or shapes rather than only on token/class-name leaks.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 35228ff0-8c37-4292-8fa6-384df4304cb1
📒 Files selected for processing (3)
api/projects.pyapi/sessions.pytests/test_error_propagation.py
|
Should fix Nice to have tests/test_error_propagation.py:170-187 (test_per_session_error_card_omits_error_detail) — The assertion is conditional (if isinstance(body, list): for row in body: if row.get("error"): ...). If the response shape ever silently changes (e.g. {"sessions": [...]} wrapper) the test will pass without checking anything. Adding assert resp.status_code == 200 and assert isinstance(body, list) up front would make the test fail loudly on shape drift instead of going green by accident. |
- Docstring listed GET /api/projects but the leak was on GET /api/projects/<project>/sessions; corrected so the next reader doesn't grep the wrong route. - test_per_session_error_card_omits_error_detail was passing without exercising anything: parse_session is tolerant of malformed lines, so the malformed-jsonl input never triggered the except branch and the conditional `if/for` walked an empty list. Switched to monkeypatching parse_session to raise (mirrors the session-level tests above) and pinned the response shape — assert status == 200 and body is a list, assert at least one error card exists, and verify the leaked field name doesn't appear in the body.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/test_error_propagation.py (1)
124-139: ⚡ Quick winUse raw response text for leak scanning to avoid false negatives.
These two tests can silently pass if
resp.get_json()isNone(e.g., non-JSON 404/400), becausejson.dumps(None)becomes"null". Scanresp.get_data(as_text=True)and assert JSON shape when expected.Suggested hardening
def test_404_on_missing_file_keeps_session_id_safe(self, tmp_path, client): @@ resp = client.get("/api/sessions/proj/nope-doesnt-exist") assert resp.status_code == 404 - body = resp.get_json() - _assert_no_class_name_leak(json.dumps(body)) + raw_body = resp.get_data(as_text=True) + body = resp.get_json(silent=True) + assert isinstance(body, dict) + _assert_no_class_name_leak(raw_body) def test_400_on_path_traversal_attempt(self, client): @@ resp = client.get("/api/sessions/..%2Fevil/abc") assert resp.status_code in (400, 404) - body = resp.get_json() - _assert_no_class_name_leak(json.dumps(body)) + raw_body = resp.get_data(as_text=True) + _assert_no_class_name_leak(raw_body) + if resp.is_json: + assert isinstance(resp.get_json(silent=True), dict)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_error_propagation.py` around lines 124 - 139, In tests test_404_on_missing_file_keeps_session_id_safe and test_400_on_path_traversal_attempt, stop calling resp.get_json() for leak-scanning and instead use the raw response text (resp.get_data(as_text=True)) to avoid false negatives when the response isn't JSON; also, when you expect JSON structure assert the parsed JSON shape (e.g., via json.loads on resp.get_data(as_text=True) and explicit key checks) before calling _assert_no_class_name_leak so the leak scan sees the real response payload.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/test_error_propagation.py`:
- Around line 124-139: In tests test_404_on_missing_file_keeps_session_id_safe
and test_400_on_path_traversal_attempt, stop calling resp.get_json() for
leak-scanning and instead use the raw response text
(resp.get_data(as_text=True)) to avoid false negatives when the response isn't
JSON; also, when you expect JSON structure assert the parsed JSON shape (e.g.,
via json.loads on resp.get_data(as_text=True) and explicit key checks) before
calling _assert_no_class_name_leak so the leak scan sees the real response
payload.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2a14ba4f-488c-4117-8694-0643e0f36179
📒 Files selected for processing (1)
tests/test_error_propagation.py
|
Problem
Three handlers in
api/interpolated raw exception details into JSON error bodies:GET /api/sessions/<p>/<id>—f"Failed to parse session: {type(e).__name__}: {e}"GET /api/sessions/<p>/<id>/stats— same shapeGET /api/projects/.../sessions— per-session card carriederror_detail: f"{type(e).__name__}: {e}"That exposes internal field names (e.g.
KeyError: 'id'), filesystem paths fromOSError, and user values fromValueError-style messages to any client and to anything that captures error responses.Change
api/sessions.py: both handlers nowcurrent_app.logger.exception(...)and return a generic stable message.api/projects.py: drop theerror_detailfield —error: Truealready drives the dashboard's error state, and the operator gets the full detail from the server log.tests/test_error_propagation.py: 6 regression tests usingmonkeypatchto force the exception path; defensive blocklist of class-name tokens; source-level guard that fails any future PR re-introducingtype(e).__name__/ bare{e}patterns inapi/.Test plan
pytest— 81/81 pass (75 existing + 6 new)Closes #25
Summary by CodeRabbit