fix: add retry-with-feedback loop to oo learn for invalid TOML #76

@randomm

Task Description

oo learn frequently generates invalid TOML. Research shows the most reliable cross-provider fix is a validation + retry loop: if the LLM returns invalid TOML, send the parse error back to the LLM and ask it to fix only the broken part.

Root cause: LLMs occasionally generate almost-valid TOML but fail at specific lines (trailing commentary, malformed sections). This is a known problem with all LLM-based structured output generation.

Research findings (2026-03-05):

  • Anthropic structured outputs (JSON schema constraint) are 100% reliable but JSON-only, so they would require a JSON→TOML conversion step
  • Retry with error feedback works across ALL providers (Anthropic, OpenAI, Cerebras)
  • Temperature 0 reduces variability
  • Stop sequences prevent trailing commentary

Proposed Solution

In src/learn.rs, in run_learn(), after calling the LLM:

  1. Validate the TOML response
  2. If invalid, send a follow-up message: "Your previous TOML was invalid: {error}. Here is what you returned: {toml}. Output ONLY the corrected TOML, nothing else."
  3. Retry up to 2 times
  4. If all retries fail, write to learn-status.log and return Err

Also add temperature: 0 to API calls for consistency.
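
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of src/learn.rs: generate_with_retry, feedback_prompt, LearnError and MAX_RETRIES are hypothetical names, the LLM call and the TOML validator are passed in as closures (the validator could, for example, wrap a real TOML parser), and the corrective message is sent as a fresh prompt rather than appended to a conversation. Writing to learn-status.log is left to the caller that receives the Err.

```rust
/// Hypothetical error type for this sketch (not taken from the codebase).
#[derive(Debug, PartialEq)]
pub enum LearnError {
    /// The LLM call itself failed.
    Llm(String),
    /// Every attempt returned invalid TOML; keeps the last parse error.
    InvalidToml { attempts: usize, last_error: String },
}

/// Number of corrective retries after the initial attempt.
const MAX_RETRIES: usize = 2;

/// Builds the follow-up message proposed in the issue.
fn feedback_prompt(error: &str, toml: &str) -> String {
    format!(
        "Your previous TOML was invalid: {error}. Here is what you returned: {toml}. \
         Output ONLY the corrected TOML, nothing else."
    )
}

/// Calls the LLM, validates the response, and retries with error feedback.
/// `call_llm` and `validate` are assumptions standing in for the real
/// provider call and TOML parser.
pub fn generate_with_retry<L, V>(
    initial_prompt: &str,
    mut call_llm: L,
    validate: V,
) -> Result<String, LearnError>
where
    L: FnMut(&str) -> Result<String, String>,
    V: Fn(&str) -> Result<(), String>,
{
    let mut prompt = initial_prompt.to_string();
    let mut last_error = String::new();

    // One initial attempt plus up to MAX_RETRIES corrective attempts.
    for _ in 0..=MAX_RETRIES {
        let response = call_llm(&prompt).map_err(LearnError::Llm)?;
        match validate(&response) {
            Ok(()) => return Ok(response),
            Err(e) => {
                // Feed the parse error and the broken output back to the LLM.
                prompt = feedback_prompt(&e, &response);
                last_error = e;
            }
        }
    }

    Err(LearnError::InvalidToml {
        attempts: MAX_RETRIES + 1,
        last_error,
    })
}
```

Keeping the LLM call and the validator as parameters means the loop itself stays provider-agnostic and can be exercised in tests without network access.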

Quality Gates (Non-Negotiable)

  • TDD: Write tests before implementation
  • Coverage: 80%+ test coverage for new code
  • Linting: All code passes project linting rules
  • Local Verification: running oo learn cargo test produces a valid pattern file

Acceptance Criteria

  • run_learn retries up to 2 times on invalid TOML with error feedback
  • Temperature set to 0 on all LLM API calls
  • oo learn cargo test reliably produces a valid cargo-test.toml
  • Tests cover the retry logic (with mock LLM responses)
  • No changes to provider routing — works for Anthropic, OpenAI, Cerebras
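
For the "mock LLM responses" criterion, one possible shape for a test double is sketched below. ScriptedLlm and its methods are hypothetical names for illustration only; the real tests may hook into the provider layer differently. The double replays canned responses in order and records every prompt it receives, so a test can assert both the number of calls and that the corrective prompt contained the parse error.

```rust
use std::collections::VecDeque;

/// Hypothetical test double: replays canned responses and records prompts.
pub struct ScriptedLlm {
    responses: VecDeque<String>,
    pub prompts: Vec<String>,
}

impl ScriptedLlm {
    /// Creates a double that will return `responses` in the given order.
    pub fn new(responses: &[&str]) -> Self {
        Self {
            responses: responses.iter().map(|r| r.to_string()).collect(),
            prompts: Vec::new(),
        }
    }

    /// Records the prompt and returns the next scripted response,
    /// or an error once the script has run out.
    pub fn complete(&mut self, prompt: &str) -> Result<String, String> {
        self.prompts.push(prompt.to_string());
        self.responses
            .pop_front()
            .ok_or_else(|| "scripted responses exhausted".to_string())
    }

    /// Number of calls made so far, including calls that returned an error.
    pub fn call_count(&self) -> usize {
        self.prompts.len()
    }
}
```

A retry test would script one invalid response followed by a valid one, run the retry logic against the double, and then check that call_count() is 2 and that the second recorded prompt mentions the parse error.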
