Skip to content

fix(eod): defensive JSON extraction for LLM rationale synthesis (L1248/L2669)#198

Merged
cipher813 merged 3 commits into
mainfrom
feat/eod-rationale-defensive-json-parse
May 20, 2026
Merged

fix(eod): defensive JSON extraction for LLM rationale synthesis (L1248/L2669)#198
cipher813 merged 3 commits into
mainfrom
feat/eod-rationale-defensive-json-parse

Conversation

@cipher813
Copy link
Copy Markdown
Owner

Summary

`_synthesize_rationales` parsed Haiku's response via bare `json.loads(response.content[0].text)` which silently fell back to the template path whenever Haiku returned anything but a clean top-level JSON object. EOD logs have shown the recurrence (`Expecting value: line 1 column 1 (char 0) — using template fallback`) across multiple runs over the last month — promoted P3 → P2 in the 2026-05-20 curation pass per the entry's own "if recurrence confirmed" trigger.

What changed

New `_extract_json_object` defensively handles three observed LLM anomalies before raising:

  1. Markdown code fences (````json\n{...}\n```` or ````\n{...}\n````)
  2. Conversational preamble (`"Sure, here's the JSON:\n{...}"`)
  3. Trailing text after the closing brace

Clean JSON still goes through the fast path. On total parse failure, log a bounded sample (`raw[:400]`) so the next recurrence surfaces a concrete failure mode instead of the opaque "char 0" message, and re-raise into the existing template-fallback handler so the EOD email still ships.

Tests

9 new tests in `tests/test_eod_reconcile_logic.py::TestExtractJsonObject`:

  • clean JSON / clean narratives shape
  • markdown fences with + without language tag
  • preamble alone / trailing text alone / preamble + fence + trailing combined
  • garbage input raises / empty string raises

Suite: 905 → 913, all green.

Closes

L1248 + L2669 P3 → P2 promotions (canonical entry + cross-ref from the Executor side) — same fix lands both checkboxes.

Test plan

  • Targeted: pytest tests/test_eod_reconcile_logic.py → 36 passed (was 27 before this PR's 9 new tests)
  • Full ae suite: 913 passed
  • Next EOD reconcile run on ae-trading exercises the new path; on a Haiku-returns-fenced-JSON day, expect "LLM rationale synthesis failed" log lines to disappear and per-position narratives to land in the EOD email body again.

🤖 Generated with Claude Code

cipher813 and others added 2 commits May 20, 2026 10:42
…8/L2669)

`_synthesize_rationales` parsed Haiku's response via bare
`json.loads(response.content[0].text)` which silently fell back to
the template path whenever Haiku returned anything but a clean
top-level JSON object. EOD logs have shown the recurrence
("Expecting value: line 1 column 1 (char 0) — using template
fallback") across multiple runs over the last month — promoted P3
→ P2 in the 2026-05-20 curation pass per the entry's own "if
recurrence confirmed" trigger.

New `_extract_json_object` defensively handles three observed LLM
anomalies before raising:
  1. Markdown code fences (```json\n{...}\n``` or ```\n{...}\n```)
  2. Conversational preamble ("Sure, here's the JSON:\n{...}")
  3. Trailing text after the closing brace

Clean JSON still goes through the fast path. On total parse
failure, log a bounded sample (`raw[:400]`) so the next recurrence
surfaces a concrete failure mode instead of the opaque "char 0"
message, and re-raise into the existing template-fallback handler
so the EOD email still ships.

9 new tests cover the clean / fenced (with + without language tag)
/ preamble / trailing / combined / garbage / empty paths. Suite
905 → 913, all green.

Closes both L1248 + L2669 P3 → P2 promotions (canonical entry +
cross-ref from Executor side) — same fix lands both checkboxes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…TA, no shortcuts)

Replaces the prior `json.loads` + defensive-string-extraction
helper with Anthropic's tool-use + Pydantic validation — the
institutional approach. Parse failures from text-formatting
anomalies (markdown fences, conversational preamble, trailing
text) are now structurally impossible: Haiku is forced to emit a
typed `tool_use` block whose `input` is SDK-validated against the
declared schema BEFORE it lands here; Pydantic re-validates at the
boundary for type safety + strict field enforcement.

Pattern mirrors alpha-engine-research's `with_structured_output`
discipline (e.g. `agents/macro_agent.py:380`) without adding the
langchain dependency the Executor doesn't otherwise carry — uses
the raw Anthropic SDK's `tools=[...]` + `tool_choice` primitive
directly. Pydantic + Anthropic SDK 0.92 already in
`requirements.txt`.

What changed:
- New `_Narrative` + `_RationalesResponse` Pydantic models define
  the contract. The JSON Schema is derived via
  `model_json_schema()` and registered as `_RATIONALES_TOOL`.
- `_synthesize_rationales` calls Haiku with
  `tools=[_RATIONALES_TOOL]` + `tool_choice={"type": "tool",
  "name": "emit_rationales"}` — forced tool emission.
- The tool_use block is picked explicitly (Anthropic may emit a
  text block alongside it); its `input` is Pydantic-validated
  before being converted to `{ticker: narrative}`.
- Any failure (missing tool_use block, validation error, SDK
  error) falls through to the existing template fallback path —
  unchanged contract; only the LLM happy path is upgraded.
- Removed the `_extract_json_object` heuristic helper entirely;
  it's no longer needed.

Tests rewritten:
- `TestExtractJsonObject` (9 tests of the heuristic) deleted.
- `TestRationalesResponsePydantic` (6): valid payload / empty
  list / missing field / wrong type — Pydantic boundary
  correctness.
- `TestSynthesizeRationalesToolUse` (5): happy path with
  assertions on the SDK call kwargs (`tools`, `tool_choice`) /
  text-block-before-tool-block / missing tool_use → template
  fallback / malformed input → template fallback / empty contexts
  short-circuit.
- All existing `TestSynthesizeRationales` template-fallback
  tests still pass (the fallback contract is unchanged).

Suite: 913 → 915 (net +2 after removing the 9 string-extraction
tests and adding 11 contract tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cipher813
Copy link
Copy Markdown
Owner Author

Pushed SOTA upgrade as follow-up commit 32f3597 per the no-shortcuts directive.

What changed from the initial heuristic:

  • Removed _extract_json_object string-fence-stripping helper entirely.
  • New _Narrative + _RationalesResponse Pydantic models define the contract; JSON Schema derived via model_json_schema() and registered as _RATIONALES_TOOL.
  • _synthesize_rationales calls Haiku with tools=[_RATIONALES_TOOL] + tool_choice={"type": "tool", "name": "emit_rationales"} — forces a typed tool_use response. SDK validates the input shape against the schema before it lands here; Pydantic re-validates at the boundary.
  • Parse failures from text anomalies (fences / preamble / trailing text) are now structurally impossible, not handled-after-the-fact.
  • Pattern matches alpha-engine-research's with_structured_output discipline (agents/macro_agent.py:380) without adding langchain — uses the raw Anthropic SDK primitive directly.

Tests rewritten: deleted 9 string-extraction tests, added 11 contract tests (Pydantic boundary correctness + tool-use happy/sad paths with SDK-kwargs assertions). Suite 913 → 915. Both commits remain in the PR for review-trail clarity.

Anyone reviewing: feel free to squash on merge — the heuristic commit is obsolete.

@cipher813 cipher813 merged commit dd102a9 into main May 20, 2026
1 check passed
@cipher813 cipher813 deleted the feat/eod-rationale-defensive-json-parse branch May 20, 2026 18:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant