Weave 5 Whys into harness failure response paths #81

Merged
tkellogg merged 1 commit into main from strix/five-whys-integration
Apr 11, 2026

@strix-tkellogg
Collaborator

Summary

  • System prompt: Routes introspection → 5 Whys when patterns emerge, and routes prediction misses to structured reflection
  • Post-turn failure context: When a turn ends with error or circuit breaker, the next turn's prompt includes reflection guidance pointing to 5 Whys
  • Prediction-review → 5 Whys bridge: SKILL.md now guides agents to decompose surprising misses instead of just logging prediction_true: false
  • Chat history scan: Prediction-review scheduled job now also scans recent chat for corrections, error reactions, and repeated attempts — catches failures that predictions miss

PR #80 already covers cycle detection (integration point #2). This PR covers the remaining four.

The core problem: the 5 Whys skill existed as a callable tool, but nothing in the harness actually routed agents toward it when failure happened, like a fire extinguisher in a locked cabinet with no sign. Now four paths converge on it.
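
For reviewers who want a concrete picture of the chat history scan listed in the summary, here is a rough sketch of the kind of signals it looks for. It is illustrative only: the real scan is a scheduled job, and the `Message` type, the regex patterns, and `scan_chat` are all hypothetical names that do not exist in this repo.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns, chosen only to illustrate the three signal types.
CORRECTION = re.compile(r"\b(no,|actually|that's wrong|not what i asked)", re.I)
ERROR_REACTION = re.compile(r"\b(error|traceback|failed|broken)\b", re.I)


@dataclass
class Message:
    role: str  # "user" or "agent" (assumed roles)
    text: str


def scan_chat(messages, repeat_threshold=2):
    """Return (signal, text) pairs for messages that look like failure signals."""
    findings = []
    seen = {}  # normalized agent text -> how many times it was sent
    for msg in messages:
        if msg.role == "user":
            if CORRECTION.search(msg.text):
                findings.append(("correction", msg.text))
            elif ERROR_REACTION.search(msg.text):
                findings.append(("error_reaction", msg.text))
        else:
            key = msg.text.strip().lower()
            seen[key] = seen.get(key, 0) + 1
            # Flag once, at the moment the repeat threshold is reached.
            if seen[key] == repeat_threshold:
                findings.append(("repeated_attempt", msg.text))
    return findings
```

Each finding would then be a candidate for a 5 Whys pass, which is how this path catches failures that no prediction was ever written for.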

Files changed

  • prompts.py — Two new skill routing bullets in system prompt (introspection→5Whys, prediction miss→5Whys)
  • app.py — _last_turn_failure field captures error/circuit-breaker context, injected into next turn via render_turn_prompt
  • config.py — Default scheduler job expanded: prediction misses trigger 5 Whys, plus chat history failure scan
  • prediction-review/SKILL.md — New "When a Prediction Misses" section with 5 Whys routing criteria
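
To make the prediction-miss routing concrete, the sketch below shows one way "surprising miss" could be decided. This is an assumption, not the criteria written in SKILL.md: the `confidence` field, the `0.7` threshold, and the `next_step` helper are hypothetical. Only `prediction_true` comes from this PR.

```python
from dataclasses import dataclass


@dataclass
class PredictionReview:
    prediction: str
    prediction_true: bool
    confidence: float  # hypothetical: how sure the agent was, 0.0 to 1.0


def next_step(review, surprise_threshold=0.7):
    """Return "five_whys" for a surprising miss, otherwise "log"."""
    if review.prediction_true:
        return "log"
    # Assumption: a miss counts as surprising when confidence was high.
    if review.confidence >= surprise_threshold:
        return "five_whys"
    return "log"
```

The point is that a miss is no longer a dead end: a low-confidence miss is still just logged, while a confident miss gets decomposed.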

Test plan

  • 197 tests pass, 1 skipped
  • render_turn_prompt accepts optional last_turn_failure param (backward compatible, defaults to None)
  • Builtin skills tests pass (8/8)
  • Manual verification: trigger a circuit breaker loop and confirm next turn gets failure context
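
The backward-compatibility claim above can be pictured with a simplified stand-in. The real `render_turn_prompt` takes more arguments than shown here; the `Harness` class, the guidance wording, and the choice to clear the failure after one turn are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical wording; the actual guidance text lives in the harness.
REFLECTION_GUIDANCE = "Before retrying, consider running the 5 Whys skill on this failure."


def render_turn_prompt(user_message: str, last_turn_failure: Optional[str] = None) -> str:
    """Build the turn prompt, appending failure context only when one is given."""
    sections = [user_message]
    if last_turn_failure is not None:
        sections.append(
            f"Previous turn failed: {last_turn_failure}\n{REFLECTION_GUIDANCE}"
        )
    return "\n\n".join(sections)


class Harness:
    """Minimal stand-in for the app object that owns _last_turn_failure."""

    def __init__(self) -> None:
        self._last_turn_failure: Optional[str] = None

    def record_failure(self, reason: str) -> None:
        # Called when a turn ends with an error or a circuit breaker.
        self._last_turn_failure = reason

    def build_prompt(self, user_message: str) -> str:
        # Assumption: the failure is consumed once so it does not repeat forever.
        failure, self._last_turn_failure = self._last_turn_failure, None
        return render_turn_prompt(user_message, last_turn_failure=failure)
```

Existing callers that pass no failure argument get the same prompt as before, which is what keeps the change backward compatible.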

🤖 Generated with Claude Code

Four integration points (cycle detection was PR #80):

1. System prompt: route introspection → 5 Whys when patterns emerge,
   and route prediction misses to structured reflection
2. Post-turn failure context: when a turn ends with an error or circuit
   breaker, inject reflection guidance into the next turn's prompt
3. Prediction-review → 5 Whys bridge: SKILL.md now guides agents to
   decompose surprising misses via 5 Whys instead of just logging false
4. Chat history scan: prediction-review scheduled job now also scans
   recent chat for corrections, error reactions, and repeated attempts

The 5 Whys skill existed but nothing routed agents toward it when
failure happened. Now four paths converge on it: introspection findings,
prediction misses, circuit breaker events, and chat history patterns.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@tkellogg tkellogg merged commit efc6a63 into main Apr 11, 2026
@tkellogg tkellogg deleted the strix/five-whys-integration branch April 11, 2026 01:10
jptreen pushed a commit to jptreen/open-strix that referenced this pull request Apr 11, 2026
Merged upstream changes:
- PR tkellogg#80: Cycle detection prompts reflection instead of just stopping
- PR tkellogg#81: Five-whys integration into harness failure paths
- PR tkellogg#82: Five-whys chainlink docs
- PR tkellogg#83: Phone book → JSONL migration + alias enrichment

Conflict resolution:
- app.py: Kept both _withhold_final_text (ours) and _last_turn_failure (upstream)
- prompts.py: Preserved auto-send model while adding aliases/failure sections
- tools.py: Added missing file tools to tools list (read_file, glob_files, edit_file, write_file)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@strix-tkellogg strix-tkellogg mentioned this pull request Apr 25, 2026