Brain: confidence-gated escalation ladder (schema-repair + tier bump) to replace zero-retry parse→fallback (#2419 family) #2432

Description

@rysweet

Parent / context: #2431 (research findings from BGML / TPAMI 2024). Directly addresses the #2419 family.

Idea (from BGML's progress-aware module, §IV-C)

BGML spends extra compute only where coarse confidence is low: a fast coarse judgment is escalated to finer processing only for the weakest cases, instead of failing or defaulting. Map this onto Simard's brain decision path.

Problem in Simard (grounded)

All three brains parse LLM output via parse_decision_from_response(), which takes the span from the first { to the last } and hands it to serde_json. There is no retry: any parse error maps immediately to a deterministic fallback.

  • src/ooda_brain/rustyclawd.rs:85 (decide_engineer_lifecycle / act)
  • src/ooda_brain/decide.rs:179 (judge_decision)
  • src/ooda_brain/orient.rs:215 (judge_orientation)
  • fallbacks: src/ooda_brain/fallback.rs:14
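
For reference, the "first { / last }" extraction described above can be sketched as follows. This is a minimal, std-only illustration, not the actual body of parse_decision_from_response(); the name extract_json_span is hypothetical, and the serde_json step is omitted so the snippet has no dependencies.

```rust
/// Hypothetical helper (not from the Simard source): returns the slice that
/// runs from the first '{' to the last '}' of an LLM response, inclusive.
/// In the real code path this slice would then be handed to serde_json.
fn extract_json_span(response: &str) -> Option<&str> {
    let start = response.find('{')?;
    let end = response.rfind('}')?;
    // A closing brace that precedes the first opening brace is not a payload.
    if end < start {
        return None;
    }
    // '}' is a single byte in UTF-8, so an inclusive range ends on a char boundary.
    Some(&response[start..=end])
}
```

Any prose the model wraps around the object is tolerated, but a truncated or otherwise malformed object still fails at the JSON step, and with zero retries that failure goes straight to the fallback.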

Only OrientJudgment has a native confidence: f64 + validate() (src/ooda_brain/orient.rs:70,85). DecideJudgment (src/ooda_brain/decide.rs:57) and EngineerLifecycleDecision (src/ooda_brain/mod.rs:86) expose only rationale, with no confidence field and no validation. This is the mechanism behind #2419 (decide_engineer_lifecycle defaulting to ContinueSkipping ~99.6% of the time).
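
One way the two confidence-less types could be brought in line with OrientJudgment is a small shared trait. The sketch below is an assumption for discussion only: SelfReportedConfidence and DecideJudgmentSketch are hypothetical names, and the range check shown is a guess at what a validate() would enforce, not a copy of the one in orient.rs.

```rust
/// Hypothetical trait (not part of Simard today): anything that carries a
/// self-reported confidence and can check that it is usable for gating.
trait SelfReportedConfidence {
    fn confidence(&self) -> f64;

    /// Assumed rule: confidence must be a finite value in [0.0, 1.0].
    fn validate(&self) -> Result<(), String> {
        let c = self.confidence();
        if c.is_finite() && (0.0..=1.0).contains(&c) {
            Ok(())
        } else {
            Err(format!("confidence out of range: {}", c))
        }
    }
}

/// Hypothetical stand-in for DecideJudgment with an added confidence field.
#[allow(dead_code)]
struct DecideJudgmentSketch {
    rationale: String,
    confidence: f64,
}

impl SelfReportedConfidence for DecideJudgmentSketch {
    fn confidence(&self) -> f64 {
        self.confidence
    }
}
```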

Proposal (bounded)

Replace the binary parse-or-fallback with a confidence-gated escalation ladder:

  1. Cheap parse (as today).
  2. On parse-failure or low self-reported confidence: escalate, bounded to 1–2 attempts:
    • schema-repair re-prompt — feed the malformed payload back asking for valid JSON against the known schema; and/or
    • a higher-effort model/prompt tier.
  3. Only after escalation is exhausted, fall to the deterministic default.
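
The ladder above can be sketched as a bounded loop. This is a minimal illustration under stated assumptions, not a proposed API: Decision, Outcome, and decide_with_escalation are hypothetical names, and the attempt closure stands in for "call the model at this tier and parse the result" (tier 0 is the cheap parse; tiers 1 and up are a schema-repair re-prompt or a higher-effort tier).

```rust
/// Hypothetical decision payload with a self-reported confidence.
#[derive(Debug, Clone, PartialEq)]
struct Decision {
    action: String,
    confidence: f64,
}

/// Records whether a tier produced an acceptable decision or the
/// deterministic default had to be used.
#[derive(Debug, PartialEq)]
enum Outcome {
    Accepted { decision: Decision, tier: usize },
    Fallback(Decision),
}

/// Runs tier 0 (cheap parse), then at most `max_escalations` further tiers.
/// A tier is accepted only if it parses AND meets `min_confidence`;
/// otherwise the ladder moves on, ending at the fallback.
fn decide_with_escalation<F>(
    mut attempt: F,
    max_escalations: usize,
    min_confidence: f64,
    fallback: Decision,
) -> Outcome
where
    F: FnMut(usize) -> Result<Decision, String>,
{
    for tier in 0..=max_escalations {
        match attempt(tier) {
            Ok(decision) if decision.confidence >= min_confidence => {
                return Outcome::Accepted { decision, tier };
            }
            // Parse failure or low confidence: escalate to the next tier.
            _ => continue,
        }
    }
    Outcome::Fallback(fallback)
}
```

Returning the tier alongside the decision keeps the escalation rate observable, which would make it possible to check whether the fallback share for #2419 actually drops.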

Supporting changes:

Acceptance

Effort / Impact

Effort: M (a few PRs). Impact: H, because it restores the core reasoning function the brain is supposed to provide. Related: #2419, #2421, #2429, #2430, #2046, #1748.

Metadata

Labels: bug, enhancement