Add an LLM repair pass for malformed review outputs that miss the strict schema #175

@wwind123

Summary

When an agent produces a review or follow-up response that is substantively clear but fails the strict grammar/schema check, the orchestrator could invoke a smaller analyzer LLM to normalize the result into the expected structured shape.

The analyzer would read a small input: the raw agent output plus the exact schema contract and the current unresolved-item context. It would then return a strict structured object indicating:

  • approved or blocking
  • addressed items
  • remaining items
  • any new items introduced
  • optionally a short normalized summary or quoted evidence
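As a rough illustration of the shape listed above, here is a minimal Python sketch of the analyzer's return object and a strict validator for it. The field names (`disposition`, `addressed_items`, `remaining_items`, `new_items`, `summary`) and the helper `parse_repair_result` are assumptions made for this example, not the project's actual schema contract.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical field names; the real schema contract may differ.
ALLOWED_DISPOSITIONS = {"approved", "blocking"}
LIST_FIELDS = ("addressed_items", "remaining_items", "new_items")


@dataclass(frozen=True)
class RepairResult:
    disposition: str              # "approved" or "blocking"
    addressed_items: list[str]
    remaining_items: list[str]
    new_items: list[str]
    summary: Optional[str] = None  # optional normalized summary or quoted evidence


def parse_repair_result(raw: str) -> RepairResult:
    """Strictly validate analyzer output; raise ValueError on any deviation."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    required = {"disposition", *LIST_FIELDS}
    missing = required - data.keys()
    unknown = data.keys() - (required | {"summary"})
    if missing or unknown:
        raise ValueError(f"missing={sorted(missing)} unknown={sorted(unknown)}")

    disposition = data["disposition"]
    if not isinstance(disposition, str) or disposition not in ALLOWED_DISPOSITIONS:
        raise ValueError(f"invalid disposition: {disposition!r}")

    for key in LIST_FIELDS:
        value = data[key]
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise ValueError(f"{key} must be a list of strings")

    summary = data.get("summary")
    if summary is not None and not isinstance(summary, str):
        raise ValueError("summary must be a string when present")

    return RepairResult(
        disposition=disposition,
        addressed_items=data["addressed_items"],
        remaining_items=data["remaining_items"],
        new_items=data["new_items"],
        summary=summary,
    )
```

Rejecting unknown keys and unexpected types keeps the analyzer's own output under the same strictness as the primary path, so the repair layer cannot itself become a source of loosely shaped results.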

Motivation

Recent failures suggest that the main problem is often not the review content itself but the model drifting away from the required wrapper format after a long turn. A focused repair pass may recover these near-miss outputs more reliably than retrying the full review prompt.

Guardrails

  • The primary contract should remain structured output from the original agent.
  • The analyzer is only a fallback when strict parsing fails.
  • If the analyzer cannot confidently justify a repair, the orchestrator should treat the result as blocking rather than inventing approval.
  • The analyzer should not be able to turn a clearly blocking review into an approval unless the raw output itself contains evidence for that approval.
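The guardrails above could be wired together roughly as follows. This is only a sketch under stated assumptions: `strict_parse` and `analyzer` are hypothetical callables supplied by the orchestrator, the analyzer is assumed to return a dict with `disposition` and a list of verbatim `evidence` quotes, and "evidence from the raw output" is approximated by a simple substring check.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Verdict:
    disposition: str  # "approved" or "blocking"
    source: str       # "agent", "analyzer", or "fallback"


def resolve_verdict(
    raw_output: str,
    strict_parse: Callable[[str], str],  # hypothetical: returns disposition or raises ValueError
    analyzer: Callable[[str], Any],      # hypothetical: returns {"disposition": ..., "evidence": [...]}
) -> Verdict:
    # Primary contract: structured output from the original agent.
    try:
        return Verdict(strict_parse(raw_output), "agent")
    except ValueError:
        pass  # strict parsing failed, so fall through to the repair pass

    # Fallback only: ask the analyzer to normalize the raw output.
    try:
        repaired = analyzer(raw_output)
    except Exception:
        return Verdict("blocking", "fallback")
    if not isinstance(repaired, dict):
        return Verdict("blocking", "fallback")

    disposition = repaired.get("disposition")
    evidence = repaired.get("evidence")

    if disposition == "blocking":
        return Verdict("blocking", "analyzer")

    # An approval is accepted only if every quoted snippet appears verbatim
    # in the raw agent output; otherwise nothing justifies the repair.
    if (
        disposition == "approved"
        and isinstance(evidence, list)
        and evidence
        and all(isinstance(q, str) and q and q in raw_output for q in evidence)
    ):
        return Verdict("approved", "analyzer")

    # Anything unjustified or unrecognized is treated as blocking.
    return Verdict("blocking", "fallback")
```

The asymmetry is deliberate in this sketch: a repaired "blocking" is accepted as is, while a repaired "approved" has to carry quotes that can be checked against the raw text.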

Relationship to #140

This is a follow-up idea to the structured-response migration in #140, not a replacement for it. Structured output should remain the primary path; the analyzer is only a recovery layer for malformed near-miss responses.

Example

On recent llm-dialectic runs, some reviews clearly contained the needed disposition information, but the model wrapped the JSON in fences or mixed in prose, causing the strict parser to fail. A repair pass could salvage those cases and reduce unnecessary reruns.
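To make the failure mode concrete, the following Python sketch shows how a strict JSON parse rejects the two near-miss shapes described above. The sample payloads and the helper `passes_strict_parse` are invented for illustration and are not taken from actual llm-dialectic runs; the real parser presumably checks more than JSON well-formedness.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built indirectly to keep this example readable


def passes_strict_parse(raw: str) -> bool:
    """Return True only if the whole response is a single JSON object."""
    try:
        return isinstance(json.loads(raw), dict)
    except json.JSONDecodeError:
        return False


# Hypothetical agent responses carrying the same disposition payload.
clean = '{"disposition": "approved", "remaining_items": []}'
samples = {
    "clean": clean,
    "fenced": f"{FENCE}json\n{clean}\n{FENCE}",
    "with_prose": f"Here is my review:\n{clean}",
}

# Responses that would be routed to the repair pass instead of a full rerun.
needs_repair = [name for name, raw in samples.items() if not passes_strict_parse(raw)]
print(needs_repair)
```

Both rejected samples contain exactly the same disposition information as the clean one, which is the kind of case a repair pass is meant to salvage.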

-- OpenAI Codex
