Source loops: guardrails-learning-loop + reflexion-debug-loop (loops.elorm.xyz)
What the loops do
- guardrails-learning-loop: when a check fails twice the same way, append a "guardrail sign" to .ralph/guardrails.md; every later iteration reads it first and must not repeat that approach.
- reflexion-debug-loop: after each failed repro, append a short reflection to disk, then retry with that memory so the next attempt doesn't re-walk the dead end.
Both implement intra-run failure memory: persist why the last attempt failed between iterations of the same task.
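A minimal sketch of that shared pattern, in Python for illustration only. The function names, the note format, and the prompt wording are assumptions, not taken from either loop's actual implementation:

```python
from pathlib import Path


def append_reflection(memory_file: Path, attempt: int, reflection: str) -> None:
    """Persist why an attempt failed so later iterations can see it."""
    memory_file.parent.mkdir(parents=True, exist_ok=True)
    with memory_file.open("a", encoding="utf-8") as fh:
        fh.write(f"- attempt {attempt}: {reflection.strip()}\n")


def build_prompt(memory_file: Path, task: str) -> str:
    """Put the stored failure notes ahead of the task text (hypothetical wording)."""
    notes = memory_file.read_text(encoding="utf-8") if memory_file.exists() else ""
    if not notes:
        return task
    return f"Do not repeat these failed approaches:\n{notes}\n{task}"
```

Each iteration would call `build_prompt` before acting and `append_reflection` after a failure, so the memory lives on disk rather than in the model's context window.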
Why valuable for MAP
MAP's learning loop (map-learn → .claude/rules/learned/) is cross-session and post-hoc, so it does nothing to stop the Actor from retrying the same broken approach within a single subtask run. The "fail-twice-same-way → write a hard sign → consume at top of next iteration" trigger is tighter and immediate: anti-thrash memory that prevents wasted iterations and token burn inside one run.
What MAP has today
- map-learn: post-hoc learned rules (cross-session).
- ralph-iteration-logger + ralph-context-pruner hooks (log/prune, but no "repeat-failure → binding constraint fed back next iteration").
Proposed scope
- Within an Actor/retry loop, on a repeated identical failure signature, write a durable in-run note (e.g. .map/run/<branch>/anti-repeat.md) and inject it at the top of the next iteration's context as a hard constraint.
- Failure-signature dedup (normalize error text) to detect "same way twice".
- Bridge: at run close, promote durable in-run signs into map-learn candidates (feeds existing cross-session loop).
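One way the failure-signature dedup and the "same way twice" trigger could look, as a rough Python sketch. The normalization rules, the `failure_signature` and `RepeatTracker` names, and the default threshold of 2 are all assumptions for discussion, not existing MAP code:

```python
import hashlib
import re
from collections import Counter

# Volatile fragments replaced before hashing. Order matters: addresses,
# timestamps, and paths are masked before bare digits.
_VOLATILE = [
    (re.compile(r"0x[0-9a-fA-F]+"), "<addr>"),
    (re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(\.\d+)?"), "<ts>"),
    (re.compile(r"(/[\w.\-]+)+"), "<path>"),
    (re.compile(r"\d+"), "<n>"),
]


def failure_signature(error_text: str) -> str:
    """Reduce error text to a short hash that ignores run-specific noise."""
    text = error_text.strip()
    for pattern, placeholder in _VOLATILE:
        text = pattern.sub(placeholder, text)
    text = re.sub(r"\s+", " ", text)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]


class RepeatTracker:
    """Counts failures per signature within a single run."""

    def __init__(self, threshold: int = 2) -> None:
        self.threshold = threshold
        self.counts = Counter()

    def record(self, error_text: str) -> bool:
        """Return True exactly once, when a signature reaches the threshold."""
        sig = failure_signature(error_text)
        self.counts[sig] += 1
        return self.counts[sig] == self.threshold
```

When `record` returns True, the loop would write the anti-repeat note and inject it into the next iteration. Returning True only at the threshold keeps the same sign from being appended on every later repeat.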
Notes
Complements, does not duplicate, ralph-iteration-logger. Keep the in-run store branch-scoped and gitignored.
Part of #251