Parent / context: #2431 (research findings from BGML / TPAMI 2024). Directly addresses the #2419 family.
Idea (from BGML's progress-aware module, §IV-C)
BGML spends extra compute only where coarse confidence is low: a fast coarse judgment is escalated to finer processing only for the weakest cases, instead of failing or defaulting. Map this onto Simard's brain decision path.
Problem in Simard (grounded)
All three brains parse LLM output via parse_decision_from_response() — find first { / last }, then serde_json — with zero retry; any error maps immediately to a deterministic fallback:
- src/ooda_brain/rustyclawd.rs:85 (decide_engineer_lifecycle / act)
- src/ooda_brain/decide.rs:179 (judge_decision)
- src/ooda_brain/orient.rs:215 (judge_orientation)
- fallbacks: src/ooda_brain/fallback.rs:14
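To make the failure mode concrete, here is a minimal, std-only sketch of the "first { / last }" extraction step followed by an immediate default. It is not the code in parse_decision_from_response(); the names extract_json_span and decide_or_default are hypothetical, and the serde_json deserialization that follows extraction in the real path is left out.

```rust
/// Hypothetical helper: returns the slice from the first '{' to the last '}'
/// (inclusive), or None when no such span exists.
fn extract_json_span(response: &str) -> Option<&str> {
    let start = response.find('{')?;
    let end = response.rfind('}')?;
    if end < start {
        // A '}' appears only before the first '{', so there is no usable span.
        return None;
    }
    // '{' and '}' are single-byte ASCII, so this inclusive slice is valid UTF-8.
    Some(&response[start..=end])
}

/// Stand-in for the current behaviour: any extraction failure maps straight
/// to the deterministic default, with no retry in between.
fn decide_or_default<'a>(response: &'a str, default: &'a str) -> &'a str {
    extract_json_span(response).unwrap_or(default)
}
```

Under this sketch a truncated reply such as {"action": "sk has no closing brace, so the span is None and the caller lands on the default on the first try.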
Only OrientJudgment has a native confidence: f64 + validate() (src/ooda_brain/orient.rs:70,85). DecideJudgment (src/ooda_brain/decide.rs:57) and EngineerLifecycleDecision (src/ooda_brain/mod.rs:86) expose only rationale — no confidence, no validation. This is the mechanism behind #2419 (decide_engineer_lifecycle defaulting to ContinueSkipping ~99.6%).
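For readers unfamiliar with the pattern, a confidence field paired with a validate() check can be very small. The sketch below is hypothetical and not copied from orient.rs: the struct name Judgment, the [0, 1] confidence range, and the non-empty rationale rule are all assumptions made for illustration.

```rust
/// Hypothetical judgment type carrying a rationale and a confidence score.
#[derive(Debug, Clone, PartialEq)]
struct Judgment {
    rationale: String,
    confidence: f64,
}

impl Judgment {
    /// Rejects non-finite or out-of-range confidence and a blank rationale.
    /// The [0, 1] range is an assumption, not taken from the source.
    fn validate(&self) -> Result<(), String> {
        if !self.confidence.is_finite() || !(0.0..=1.0).contains(&self.confidence) {
            return Err(format!("confidence out of range: {}", self.confidence));
        }
        if self.rationale.trim().is_empty() {
            return Err("rationale is empty".to_string());
        }
        Ok(())
    }
}
```

A type without such a check has no way to tell a well-formed but meaningless payload from a usable one, which is why parse success alone is a weak signal.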
Proposal (bounded)
Replace the binary parse-or-fallback with a confidence-gated escalation ladder:
- Cheap parse (as today).
- On parse failure or low self-reported confidence: escalate, bounded to 1–2 attempts:
  - schema-repair re-prompt: feed the malformed payload back, asking for valid JSON against the known schema; and/or
  - a higher-effort model/prompt tier.
- Only after escalation is exhausted, fall to the deterministic default.
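The ladder above can be sketched as a bounded loop. This is an illustration under stated assumptions, not a proposed implementation: Decision, Outcome, decide_with_escalation, and the min_confidence threshold are hypothetical names, step 1 is assumed to be the schema-repair re-prompt and any later step the higher tier, and the Defaulted label collapses the separate default outcomes into one.

```rust
/// Simplified outcome labels for the ladder (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Parsed,
    Repaired,
    Escalated,
    Defaulted,
}

/// A parsed decision plus its self-reported confidence (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct Decision {
    action: String,
    confidence: f64,
}

/// Runs the ladder: cheap parse, then at most `max_escalations` further
/// attempts, then the deterministic default.
///
/// `attempt(0)` is the cheap parse; `attempt(n)` for n >= 1 is an escalation
/// step. It returns None on parse failure. A result below `min_confidence`
/// is treated like a failure and triggers the next step.
fn decide_with_escalation<F>(
    mut attempt: F,
    max_escalations: usize,
    min_confidence: f64,
    default: Decision,
) -> (Decision, Outcome)
where
    F: FnMut(usize) -> Option<Decision>,
{
    for step in 0..=max_escalations {
        if let Some(decision) = attempt(step) {
            if decision.confidence >= min_confidence {
                let outcome = match step {
                    0 => Outcome::Parsed,
                    1 => Outcome::Repaired,
                    _ => Outcome::Escalated,
                };
                return (decision, outcome);
            }
        }
    }
    // Every rung failed or stayed below the threshold: use the default.
    (default, Outcome::Defaulted)
}
```

Passing the attempts as a closure keeps the bound explicit: with max_escalations = 2 the model is called at most three times per decision, so the extra compute is spent only on the cases where the cheap parse fails or reports low confidence.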
Supporting changes:
- Add a confidence field + a validate() to DecideJudgment and EngineerLifecycleDecision (parity with OrientJudgment).
- Record brain_lifecycle_decision via self_metrics::record_metric with outcome ∈ {parsed, default_empty, default_malformed, repaired, escalated, error_*}.
- BrainJudgmentRecord already carries confidence + fallback (src/ooda_brain/judgment_record.rs:41).
Acceptance
- The default rate of decide_engineer_lifecycle (and the Decide/Orient phases) drops substantially from the ~99.6% baseline in #2419.
- A brain_decision_parse_failure_rate metric exists and is gated in CI/self-metrics.
Effort / Impact
Effort M (few PRs). Impact H: restores the core reasoning function the brain is supposed to provide. Related: #2419, #2421, #2429, #2430, #2046, #1748.