fix: P0 critical safety fixes — retry budgets, context diffing, parallel capture#48
fix: P0 critical safety fixes — retry budgets, context diffing, parallel capture#48clintecker merged 24 commits intomainfrom
Conversation
…effect capture - Add DiffFrom() to PipelineContext for detecting side-effect writes - Parallel handler now captures pctx.Set() calls, not just Outcome.ContextUpdates - Goal-gate retry checks budget before looping; routes to fallback when exhausted - Emit EventStageRetrying/EventStageFailed events for goal-gate retries - Add comprehensive tests for all three fixes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (8)
WalkthroughThis PR implements P0 critical safety fixes for the pipeline engine, including: adding context diffing to capture and merge side-effect writes from parallel branches, enforcing per-goal-gate retry budgets with fallback routing and checkpoint-persisted fallback tracking, adding corresponding tests, and refining interview handler logic. Changes
Sequence Diagram(s)sequenceDiagram
participant Engine as Engine
participant Checkpoint as Checkpoint
participant Handler as HandleExitNode
participant EventBus as Event System
loop Until retry budget exhausted or retry succeeds
Engine->>Engine: Evaluate goal-gate node outcome
Note over Engine: Check if retries exhausted<br/>cp.RetryCount(goalGateNodeID) >= maxRetries
alt Retries remain
Engine->>Checkpoint: Check FallbackTaken[goalGateNodeID]
Checkpoint-->>Engine: false (no prior fallback)
Engine->>Engine: Select retry target<br/>(fallback_target/retry_target/...)
Engine->>Handler: Route to retry target
Handler->>Checkpoint: Increment RetryCount(goalGateNodeID)
Handler->>EventBus: Emit EventStageRetrying<br/>(with attempt/limit from retry count)
Handler->>Checkpoint: SaveCheckpoint()
Handler-->>Engine: Continue with new target
else Retries exhausted
Engine->>Checkpoint: Check FallbackTaken[goalGateNodeID]
Checkpoint-->>Engine: false (available)
Engine->>Engine: Attempt one-shot fallback<br/>(node/graph fallback_target)
alt Fallback configured & valid
Engine->>Checkpoint: Set FallbackTaken[goalGateNodeID] = true
Engine->>Handler: Route to fallback target
Handler->>EventBus: Emit EventStageFailed<br/>(routing to fallback)
Handler->>Checkpoint: SaveCheckpoint()
Handler-->>Engine: Continue with fallback
else No fallback or already taken
Engine->>Handler: Terminal failure
Handler->>EventBus: Emit EventStageFailed<br/>(unsatisfied)
Handler-->>Engine: Return OutcomeFail
end
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e7edb0c598
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
pipeline/engine_checkpoint.go
Outdated
| continue | ||
| } | ||
| if _, ok := e.graph.Nodes[fb]; ok { | ||
| return fb, nodeID, true, true |
There was a problem hiding this comment.
Stop re-entering fallback loop after goal-gate retries exhaust
When retries are exhausted, this branch still returns shouldRetry=true for a fallback target, so handleExitNode treats escalation as another retry (increments retry count and clears only the fallback subtree). That leaves the failed goal-gate node completed and unsatisfied, so reaching exit again selects the same fallback repeatedly. In graphs where fallback does not re-run/satisfy the gate (for example fallback goes straight to done), the pipeline can loop indefinitely instead of terminating. This path needs a one-time fallback guard (or a non-retry return state) so subsequent exits fail once escalation has already been attempted.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (7)
docs/superpowers/plans/2026-03-30-p0-critical-safety-fixes.md (2)
1393-1404: Add language identifier to the dependency graph code block.The ASCII art dependency graph would benefit from a language identifier (e.g.,
textorplaintext) to satisfy linting and improve rendering consistency.-``` +```text Task 1 (commit existing work)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/superpowers/plans/2026-03-30-p0-critical-safety-fixes.md` around lines 1393 - 1404, The ASCII-art dependency graph code block lacks a language identifier; update the opening triple-backtick that begins the graph (the block starting with "Task 1 (commit existing work)") to include a plain language tag such as ```text or ```plaintext so the block becomes a fenced code block with a language identifier and satisfies linters and consistent rendering.
14-15: Minor markdown structure issue.The document jumps from h1 (
#) directly to h3 (###). Consider using h2 (##) for the task sections, or adding an h2 section header before the tasks. This is a minor documentation consistency issue flagged by markdownlint.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/superpowers/plans/2026-03-30-p0-critical-safety-fixes.md` around lines 14 - 15, The markdown jumps from an h1 to h3 (e.g., the existing "### Task 1: Commit existing parallel context fix (`#20`) and goal-gate retry fix (`#15`)"); change those task headings to h2 (replace "###" with "##") or add an intermediate h2 section header before the task list so the structure goes H1 -> H2 -> H3 as appropriate (update the "### Task 1..." and any subsequent "### Task n..." headings to "##" or insert a descriptive h2 preceding them to satisfy markdownlint).pipeline/engine_goal_gate_test.go (3)
83-92: Same issue: check error fromengine.Run.engine := NewEngine(g, reg) - result, _ := engine.Run(context.Background()) + result, err := engine.Run(context.Background()) + if err != nil { + t.Fatalf("unexpected engine.Run error: %v", err) + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pipeline/engine_goal_gate_test.go` around lines 83 - 92, The test currently ignores the error return from engine.Run; change the call to capture both result and err (e.g., result, err := engine.Run(...)) and assert err is nil before inspecting result.Status and workAttempts, failing the test with t.Fatalf including err if non-nil; reference NewEngine and Engine.Run to locate the call to update.
41-50: Check error fromengine.Runto avoid masking failures.The test ignores the error returned by
engine.Run. While the test focuses on retry termination, an unexpected error would be silently ignored, potentially masking bugs.engine := NewEngine(g, reg) - result, _ := engine.Run(context.Background()) + result, err := engine.Run(context.Background()) + if err != nil { + t.Fatalf("unexpected engine.Run error: %v", err) + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pipeline/engine_goal_gate_test.go` around lines 41 - 50, The test currently ignores the error returned by engine.Run; call engine.Run(ctx) and check the returned error (the error from engine.Run) before asserting on result.Status and workAttempts, failing the test (t.Fatalf) if err != nil to avoid masking failures; update the block around NewEngine, engine.Run, result and workAttempts so any non-nil err aborts the test with a clear message referencing engine.Run.
131-142: Test appears incomplete — result and error are ignored.The test assigns
resultanderrbut then explicitly ignores them with_ = errand_ = result. The comment on lines 134-136 indicates uncertainty about expected behavior ("it depends on whether clearDownstream clears work").Consider completing the test by:
- Determining expected final status (likely
OutcomeFailsince the goal-gate remains unsatisfied)- Adding assertions for
result.StatusThis aligns with the implementation plan Task 2, Step 6 which specifies updating this test to assert
result.Status == OutcomeFail.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pipeline/engine_goal_gate_test.go` around lines 131 - 142, The test currently ignores result and err after calling engine := NewEngine(g, reg) and result, err := engine.Run(...); update the test to assert the final pipeline outcome: check err is nil (or as expected) and assert result.Status == OutcomeFail (import/reference OutcomeFail) because the goal-gate remains unsatisfied; keep the existing escalateVisited check but replace the lines "_ = err" and "_ = result" with concrete assertions on err and result.Status to validate the expected failure outcome from Run.docs/superpowers/specs/2026-03-30-p0-critical-safety-fixes-design.md (2)
168-169: Document the risk ofallow_unsafe_interpolation: all.The blanket override
allow_unsafe_interpolation: allcould be misused to bypass all tainted interpolation protections. While this is acceptable given the stated threat model (line 24: "pipeline author is the trust boundary"), consider documenting in CLAUDE.md that:
- Per-variable allowlisting is strongly preferred:
allow_unsafe_interpolation: last_response- The
allkeyword should only be used when the pipeline author has manually audited the tool_command for injection safety- The DIP130 lint warning still fires to ensure conscious opt-in
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/superpowers/specs/2026-03-30-p0-critical-safety-fixes-design.md` around lines 168 - 169, Add documentation to CLAUDE.md describing the risks of using the blanket allow_unsafe_interpolation: all override and recommended practices: prefer per-variable allowlisting (e.g., allow_unsafe_interpolation: last_response), only use the all keyword after manually auditing the tool_command for injection safety, and note that the DIP130 lint warning will still trigger to enforce conscious opt-in; update the section that references allow_unsafe_interpolation and the threat model wording so readers see these three points adjacent to the existing pipeline-author trust-boundary guidance.
143-177: Two expansion systems add complexity but fix is comprehensive.The doc correctly identifies that having two independent variable expansion paths (
ExpandGraphVariablesinprepareExecNodeandExpandVariablesinToolHandler.Execute) creates a larger attack surface. Both must be gated to prevent bypasses.The proposed solution addresses both systems:
ExpandVariablesgets tainted-key checks for${ctx.*}patternsExpandGraphVariablesexcludes tainted graph vars for tool_commandLine 174 correctly identifies the cross-contamination risk: an LLM node could write
pctx.Set("graph.foo", malicious), so both layers need protection.Consider adding a follow-up task to consolidate to a single expansion system with context-aware safety checks, reducing the dual-layer complexity long-term.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/superpowers/specs/2026-03-30-p0-critical-safety-fixes-design.md` around lines 143 - 177, The review points out unsafe variable expansion for tool_command: add tainted-key checks and an allowlist to both expansion systems; in ExpandVariables (used by ToolHandler.Execute) add an opts/wrapper (e.g., ExpandToolCommand) that rejects ${ctx.<tainted_key>} using the taintedContextKeys map and returns the prescribed error message unless the node's allow_unsafe_interpolation attribute (comma-separated or "all") explicitly allows the key; in prepareExecNode/ExpandGraphVariables (when building GraphVarMap) detect graph.<tainted_key> sources, build a taintedGraphVars set and exclude those vars when expanding the "tool_command" attribute (leave other attributes unchanged); reference functions/types: ExpandVariables, ExpandToolCommand (new), ToolHandler.Execute, prepareExecNode, ExpandGraphVariables, GraphVarMap, and the taintedContextKeys allowlist.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/superpowers/specs/2026-03-30-p0-critical-safety-fixes-design.md`:
- Around line 190-196: Add a language specifier to the fenced code block titled
"Safe pattern for LLM output in tool commands" so markdownlint MD040 is
satisfied: change the opening triple backticks to include the shell language
(e.g., "bash") for the block that starts with "### Safe pattern for LLM output
in tool commands" and ensure the block remains properly closed with triple
backticks; this affects the code block surrounding the lines that start with "Do
NOT interpolate ${ctx.last_response}..." and the numbered steps.
- Around line 87-95: Add a FallbackTaken map[string]bool field (with
`json:"fallbackTaken"` tag) to the Checkpoint struct in pipeline/checkpoint.go,
ensure it's initialized during checkpoint creation/restore (so code that
constructs or unmarshals Checkpoint sets cp.FallbackTaken = map[string]bool{}
when nil), and update any checkpoint constructors/restore paths to preserve this
map across saves so the escalation-loop fix survives restarts; then implement
the TestGoalGateEscalationSurvivesCheckpointRestore test to create a Checkpoint,
mark a gate in cp.FallbackTaken, serialize/deserialize (or round-trip via the
existing checkpoint save/restore helpers) and assert the map value is preserved.
In `@pipeline/engine_checkpoint.go`:
- Around line 162-181: Add a FallbackTaken guard to the Checkpoint state: extend
the Checkpoint struct with a FallbackTaken map[string]bool; in the
fallback-return logic inside the retry-exhaustion branch (the loop that iterates
node.Attrs["fallback_target"], node.Attrs["fallback_retry_target"], etc.) check
cp.FallbackTaken[nodeID] before returning a fallback (skip that
fallback/escalation if true) and when you decide to return a fallback mark
cp.FallbackTaken[nodeID] = true so the same fallback won’t be re-taken
repeatedly; keep the existing behavior for returning the escalation/unsatisfied
path unchanged.
---
Nitpick comments:
In `@docs/superpowers/plans/2026-03-30-p0-critical-safety-fixes.md`:
- Around line 1393-1404: The ASCII-art dependency graph code block lacks a
language identifier; update the opening triple-backtick that begins the graph
(the block starting with "Task 1 (commit existing work)") to include a plain
language tag such as ```text or ```plaintext so the block becomes a fenced code
block with a language identifier and satisfies linters and consistent rendering.
- Around line 14-15: The markdown jumps from an h1 to h3 (e.g., the existing
"### Task 1: Commit existing parallel context fix (`#20`) and goal-gate retry fix
(`#15`)"); change those task headings to h2 (replace "###" with "##") or add an
intermediate h2 section header before the task list so the structure goes H1 ->
H2 -> H3 as appropriate (update the "### Task 1..." and any subsequent "### Task
n..." headings to "##" or insert a descriptive h2 preceding them to satisfy
markdownlint).
In `@docs/superpowers/specs/2026-03-30-p0-critical-safety-fixes-design.md`:
- Around line 168-169: Add documentation to CLAUDE.md describing the risks of
using the blanket allow_unsafe_interpolation: all override and recommended
practices: prefer per-variable allowlisting (e.g., allow_unsafe_interpolation:
last_response), only use the all keyword after manually auditing the
tool_command for injection safety, and note that the DIP130 lint warning will
still trigger to enforce conscious opt-in; update the section that references
allow_unsafe_interpolation and the threat model wording so readers see these
three points adjacent to the existing pipeline-author trust-boundary guidance.
- Around line 143-177: The review points out unsafe variable expansion for
tool_command: add tainted-key checks and an allowlist to both expansion systems;
in ExpandVariables (used by ToolHandler.Execute) add an opts/wrapper (e.g.,
ExpandToolCommand) that rejects ${ctx.<tainted_key>} using the
taintedContextKeys map and returns the prescribed error message unless the
node's allow_unsafe_interpolation attribute (comma-separated or "all")
explicitly allows the key; in prepareExecNode/ExpandGraphVariables (when
building GraphVarMap) detect graph.<tainted_key> sources, build a
taintedGraphVars set and exclude those vars when expanding the "tool_command"
attribute (leave other attributes unchanged); reference functions/types:
ExpandVariables, ExpandToolCommand (new), ToolHandler.Execute, prepareExecNode,
ExpandGraphVariables, GraphVarMap, and the taintedContextKeys allowlist.
In `@pipeline/engine_goal_gate_test.go`:
- Around line 83-92: The test currently ignores the error return from
engine.Run; change the call to capture both result and err (e.g., result, err :=
engine.Run(...)) and assert err is nil before inspecting result.Status and
workAttempts, failing the test with t.Fatalf including err if non-nil; reference
NewEngine and Engine.Run to locate the call to update.
- Around line 41-50: The test currently ignores the error returned by
engine.Run; call engine.Run(ctx) and check the returned error (the error from
engine.Run) before asserting on result.Status and workAttempts, failing the test
(t.Fatalf) if err != nil to avoid masking failures; update the block around
NewEngine, engine.Run, result and workAttempts so any non-nil err aborts the
test with a clear message referencing engine.Run.
- Around line 131-142: The test currently ignores result and err after calling
engine := NewEngine(g, reg) and result, err := engine.Run(...); update the test
to assert the final pipeline outcome: check err is nil (or as expected) and
assert result.Status == OutcomeFail (import/reference OutcomeFail) because the
goal-gate remains unsatisfied; keep the existing escalateVisited check but
replace the lines "_ = err" and "_ = result" with concrete assertions on err and
result.Status to validate the expected failure outcome from Run.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4402dbcb-fcf7-4d6d-b95e-934e377dd785
📒 Files selected for processing (9)
docs/superpowers/plans/2026-03-30-p0-critical-safety-fixes.mddocs/superpowers/specs/2026-03-30-p0-critical-safety-fixes-design.mdpipeline/context.gopipeline/context_test.gopipeline/engine_checkpoint.gopipeline/engine_goal_gate_test.gopipeline/engine_run.gopipeline/handlers/parallel.gopipeline/handlers/parallel_test.go
| ### Bug to Fix: Escalation Loop | ||
|
|
||
| **Problem:** When retries exhaust and the fallback path is taken, `goalGateRetryTarget` returns `retry=true`. This causes `handleExitNode` to call `IncrementRetry` and `clearDownstream` for the escalation target. If the escalation node also fails (or doesn't satisfy the goal gate), the pipeline bounces between exit and escalation forever — the retry counter goes past max but the `>=` check routes to fallback every time. | ||
|
|
||
| **Fix:** Track whether we've already taken the fallback path for a given gate. Add a `FallbackTaken map[string]bool` field to `Checkpoint`. When `goalGateRetryTarget` finds retries exhausted: | ||
| - If `cp.FallbackTaken[gateNodeID]` is true → return `(_, _, false, true)` (no retry, unsatisfied → pipeline fails) | ||
| - If false → set `cp.FallbackTaken[gateNodeID] = true`, return the fallback target | ||
|
|
||
| This guarantees at most one escalation attempt per gate. Serialized in checkpoint JSON for crash safety. |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Verify FallbackTaken field exists in Checkpoint and is tested
# Check if FallbackTaken field is defined in Checkpoint struct
echo "=== Checking Checkpoint struct for FallbackTaken field ==="
ast-grep --pattern 'type Checkpoint struct {
$$$
}'
# Check for checkpoint serialization tests
echo -e "\n=== Checking for checkpoint round-trip tests ==="
rg -n "FallbackTaken" --type=go
# Verify test mentioned in line 122 exists
echo -e "\n=== Verifying TestGoalGateEscalationSurvivesCheckpointRestore ==="
rg -n "TestGoalGateEscalationSurvivesCheckpointRestore" --type=goRepository: 2389-research/tracker
Length of output: 1584
Add FallbackTaken field to Checkpoint struct — escalation loop fix will not survive restarts without it.
The design document claims at line 95 that FallbackTaken is "serialized in checkpoint JSON for crash safety," but verification reveals the field is missing from the Checkpoint struct in pipeline/checkpoint.go. Without persisting this map across checkpoints, the escalation loop prevention resets on restart and infinite bouncing can resume.
Required actions:
- Add
FallbackTaken map[string]boolfield to the Checkpoint struct with JSON serialization tag - Implement the test
TestGoalGateEscalationSurvivesCheckpointRestore(referenced at design line 122) to verify the map persists through checkpoint round-trips
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/superpowers/specs/2026-03-30-p0-critical-safety-fixes-design.md` around
lines 87 - 95, Add a FallbackTaken map[string]bool field (with
`json:"fallbackTaken"` tag) to the Checkpoint struct in pipeline/checkpoint.go,
ensure it's initialized during checkpoint creation/restore (so code that
constructs or unmarshals Checkpoint sets cp.FallbackTaken = map[string]bool{}
when nil), and update any checkpoint constructors/restore paths to preserve this
map across saves so the escalation-loop fix survives restarts; then implement
the TestGoalGateEscalationSurvivesCheckpointRestore test to create a Checkpoint,
mark a gate in cp.FallbackTaken, serialize/deserialize (or round-trip via the
existing checkpoint save/restore helpers) and assert the map value is preserved.
docs/superpowers/specs/2026-03-30-p0-critical-safety-fixes-design.md
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
This PR targets P0 safety issues in the pipeline engine by preventing silent loss of parallel branch context writes and adding a retry budget + event visibility for goal-gate retry loops.
Changes:
- Add
PipelineContext.DiffFrom()and use it in the parallel handler to capture side-effectpctx.Set()writes, merging them with explicitOutcome.ContextUpdates. - Introduce goal-gate retry budgeting / fallback routing in
goalGateRetryTarget, and emitEventStageRetrying/EventStageFailedevents for goal-gate retry behavior. - Add new/updated tests covering context diffing, parallel side-effect capture, and goal-gate retry budget behavior; add design/spec docs.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| pipeline/handlers/parallel.go | Captures side-effect context writes via DiffFrom(snapshot) and merges with explicit ContextUpdates. |
| pipeline/handlers/parallel_test.go | Adds tests ensuring side-effect-only writes are captured and explicit updates override side effects. |
| pipeline/context.go | Adds PipelineContext.DiffFrom() for snapshot diffing of context values. |
| pipeline/context_test.go | Adds unit tests validating DiffFrom() behavior for new/changed/unchanged keys and empty cases. |
| pipeline/engine_checkpoint.go | Updates goal-gate retry target selection to enforce retry budgets and attempt fallback routing. |
| pipeline/engine_run.go | Wires new goal-gate retry tuple into exit handling; increments retry count and emits retry/failure events. |
| pipeline/engine_goal_gate_test.go | Introduces tests for goal-gate retry termination, node-level max retries, and fallback behavior. |
| docs/superpowers/specs/2026-03-30-p0-critical-safety-fixes-design.md | Adds design spec describing the issues and intended fixes. |
| docs/superpowers/plans/2026-03-30-p0-critical-safety-fixes.md | Adds an implementation plan for remaining safety tasks and tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
pipeline/engine_checkpoint.go
Outdated
| return fb, nodeID, true, true | ||
| } | ||
| } | ||
| // No fallback — signal unsatisfied without retry. |
There was a problem hiding this comment.
In the retries-exhausted path, returning the fallback target with shouldRetry=true causes the engine to treat fallback routing as another retry. Because the goal gate remains unsatisfied, this can create an infinite loop (exit → fallback → exit → fallback …). Consider tracking whether fallback has already been taken (e.g., a FallbackTaken flag in the checkpoint) or returning a distinct "fallback" action so the engine only escalates once and then fails if still unsatisfied.
| return fb, nodeID, true, true | |
| } | |
| } | |
| // No fallback — signal unsatisfied without retry. | |
| // Use checkpoint context to ensure we only take fallback once per node, | |
| // and do not treat fallback routing as another generic retry. | |
| if cp.Context == nil { | |
| cp.Context = map[string]string{} | |
| } | |
| fbKey := "fallback_taken:" + nodeID | |
| if cp.Context[fbKey] == "true" { | |
| // Fallback already used; avoid infinite exit→fallback loops. | |
| break | |
| } | |
| cp.Context[fbKey] = "true" | |
| // Fallback is a one-time escalation, not a retry. | |
| return fb, nodeID, false, true | |
| } | |
| } | |
| // No fallback — or fallback already used — signal unsatisfied without retry. |
| s.cp.IncrementRetry(gateNodeID) | ||
| gateNode := e.nodeOrDefault(gateNodeID) | ||
| e.emit(PipelineEvent{ | ||
| Type: EventStageRetrying, | ||
| Timestamp: time.Now(), | ||
| RunID: s.runID, | ||
| NodeID: gateNodeID, | ||
| Message: fmt.Sprintf("goal-gate retry for %q → %q (attempt %d/%d)", | ||
| gateNodeID, target, | ||
| s.cp.RetryCount(gateNodeID), e.maxRetries(gateNode)), | ||
| }) |
There was a problem hiding this comment.
handleExitNode increments retry counts and emits EventStageRetrying whenever goalGateRetryTarget returns retry=true. If retry=true is also used for fallback/escalation routing, this will over-count retries and mislabel fallback as a retry attempt (and contributes to exit↔fallback looping). Suggest returning a separate signal for fallback routing (or only increment/emit when within the retry budget).
| s.cp.IncrementRetry(gateNodeID) | |
| gateNode := e.nodeOrDefault(gateNodeID) | |
| e.emit(PipelineEvent{ | |
| Type: EventStageRetrying, | |
| Timestamp: time.Now(), | |
| RunID: s.runID, | |
| NodeID: gateNodeID, | |
| Message: fmt.Sprintf("goal-gate retry for %q → %q (attempt %d/%d)", | |
| gateNodeID, target, | |
| s.cp.RetryCount(gateNodeID), e.maxRetries(gateNode)), | |
| }) | |
| gateNode := e.nodeOrDefault(gateNodeID) | |
| currentRetries := s.cp.RetryCount(gateNodeID) | |
| maxRetries := e.maxRetries(gateNode) | |
| // Only count and emit a retry event while within the retry budget. | |
| if currentRetries < maxRetries { | |
| s.cp.IncrementRetry(gateNodeID) | |
| e.emit(PipelineEvent{ | |
| Type: EventStageRetrying, | |
| Timestamp: time.Now(), | |
| RunID: s.runID, | |
| NodeID: gateNodeID, | |
| Message: fmt.Sprintf("goal-gate retry for %q → %q (attempt %d/%d)", | |
| gateNodeID, target, | |
| s.cp.RetryCount(gateNodeID), maxRetries), | |
| }) | |
| } |
pipeline/engine_goal_gate_test.go
Outdated
| // The escalate node routes to done, so pipeline should succeed eventually. | ||
| // But done is the exit and work is still unsatisfied — so it depends on | ||
| // whether clearDownstream clears work. Let's check the result. | ||
| _ = err | ||
| _ = result |
There was a problem hiding this comment.
This test currently discards err/result (via _ = err / _ = result) and doesn't assert the expected engine status. That makes it easy for an infinite goal-gate fallback loop to slip through unnoticed, and it won't validate that fallback routing terminates correctly. Please assert err/result.Status and consider running the engine with a context timeout in this test to prevent hangs.
|
|
||
| **Architecture:** Four independent fixes plus one cross-cutting circuit breaker. Each fix is self-contained — commit after each task. The tool_command fix gates both variable expansion systems (`ExpandVariables` and `ExpandGraphVariables`) with a hardcoded tainted-key set rather than runtime taint tracking. | ||
|
|
||
| **Tech Stack:** Go 1.24, standard library only. No new dependencies. |
There was a problem hiding this comment.
The implementation plan states "Go 1.24", but the repository's go.mod currently specifies go 1.25.5. Please update the plan to match the actual toolchain version (or vice versa) so readers don't run the wrong Go version.
| **Tech Stack:** Go 1.24, standard library only. No new dependencies. | |
| **Tech Stack:** Go 1.25.5, standard library only. No new dependencies. |
…w mode Implements ParseQuestions(markdown string) []Question with support for numbered items (. and ) style), bullet questions, and imperative verb prompts. Extracts inline options from trailing parentheticals and detects yes/no questions by option values or interrogative text patterns. Fenced code blocks are skipped. 20 tests covering all specified cases pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements InterviewAnswer and InterviewResult structs with round-trip JSON serialization and a BuildMarkdownSummary helper for downstream agents. All 9 tests pass covering round-trip, invalid input, skipped answers, canceled flag, and markdown format. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add QuestionsKey, AnswersKey, and Prompt extraction to extractHumanAttrs so interview-mode human nodes can carry their context key config through the IR-to-Graph adapter layer. Also upgrades dippin-lang to v0.15.0-beta.1 which defines these fields on ir.HumanConfig. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rview mode Adds the InterviewInterviewer interface, routes mode=interview nodes to executeInterview(), and implements the full handler: question parsing, last_response fallback, zero-questions freeform degradation, retry pre-fill, custom context keys, and JSON+markdown answer storage. Includes 7 TDD tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements the InterviewInterviewer interface on AutoApproveFreeformInterviewer so --auto-approve mode and tests can drive interview-mode human gates. Picks first option for select questions, "yes" for yes/no, and "auto-approved" for open-ended. Adds compile-time interface assertion and unit test coverage. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… forms Implements the TUI component shown when an interview-mode human gate fires. Supports radio select, yes/no toggle, and textarea fields with pagination, pre-fill from previous results, and JSON serialization via replyCh. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add MsgGateInterview message type to tui/messages.go, implement AskInterview on BubbleteaInterviewer (Mode 1 inline program + Mode 2 send-func delegation), and handle MsgGateInterview in app.go handleModalMsg to show the InterviewContent modal. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…implementations Implements InterviewInterviewer on AutopilotInterviewer (native LLM), ClaudeCodeAutopilotInterviewer (claude CLI), and AutopilotTUIInterviewer (TUI wrapper). Adds buildInterviewPrompt and parseInterviewResponse helpers with markdown fence stripping, JSON extraction, and a one-retry-on-parse-fail policy (provider errors hard-fail per CLAUDE.md). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements InterviewInterviewer on ConsoleInterviewer so interview-mode human gates work in non-TUI (piped/testing) contexts. Adds compile-time assertion and three unit tests covering name match, numeric index, and blank-to-skip behavior. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Exercises the full pipeline: context → ParseQuestions → AskInterview → context updates, including zero-questions freeform fallback and retry pre-fill pass-through. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Change "N questions answered" to "N of M questions answered" so skipped questions don't inflate the count. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
12 fixes from four-angle code review: Must-fix: 1. Set Incomplete=true in collectAnswers() when any answer is empty 2. Show "Tab → Add details (optional)" hint on focused select/confirm fields 3. ConsoleInterviewer pre-fill uses ID-based lookup instead of position index 4. Esc in fields navigates back instead of hard-canceling (cancel only at Q1) 5. Cap parsed questions at 100 (maxQuestions) to prevent unbounded allocation 6. Strip markdown emphasis (*bold*, _italic_) from question text labels Should-fix: 7. InterviewContent.Cancel() now closes channel (consistent with all other modals) 8. interviewRunner.Update discards cmd on quit (consistent with other runners) 9. Replace flaky time.Sleep test with blocking channel read 10. Promote default context keys to constants (ContextKeyInterviewQuestions/Answers) 11. Document ClaudeCodeAutopilot silent fallback asymmetry vs native hard-fail 12. Document autopilot prompt injection risk in buildInterviewPrompt Also: replace custom contains/searchString helpers with strings.Contains. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…eview workflow Interview mode UX: - One question at a time with progress bar, answered summary, selection feedback - Enter confirms and advances for select and yes/no fields - Text wraps to terminal width (questions, context, options, summaries) - Untouched select fields report empty answer (not first option) - "Other" variants auto-filtered from options (UI always provides its own) - Cancel returns OutcomeFail so pipelines can route on cancellation Structured output: - ParseStructuredQuestions validates JSON questions from agent output - extractJSON uses bracket-depth counting (not greedy first/last brace) - response_format plumbing: .dip → IR → adapter → codergen → session → LLM API - All three providers supported (Anthropic, OpenAI, Gemini) - dippin-lang v0.16.0 integration (ResponseFormat, ResponseSchema, Params) Reliability: - Empty API response detection with capped retry (2 attempts) and diagnostics - Anthropic adapter logs raw response on 0-token replies - Codergen empty-response guard fails node instead of silent success - AskFreeform now hard-fails on provider errors (was silently swallowing) - ClaudeCode autopilot retry-then-hardfail on parse failure - Goroutine leak fix in flashDecision with done channel - Context leak fix (explicit cancel() in retry loop) - Mode 1 tea.Cmd propagation in all three runners New content: - deep_review built-in workflow (3 interview gates, parallel analysis) - interview-loop.dip reusable subgraph - dippin-lang structured output request doc (resolved) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CI caught formatting issues that the pre-commit hook missed because the hook was not installed (symlink to .git/hooks/pre-commit was missing). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- parseInterviewResponse fills missing answers with empty entries and sets Incomplete when LLM omits questions (Codex P1) - Console "Other" number (len+1) now prompts for freeform text instead of storing the literal number string (Codex P2) - Document unused prev param in ClaudeCodeAutopilotInterviewer.AskInterview - Detect real terminal size in Mode 1 interview via golang.org/x/term - Add language identifier to markdown code fence in plan doc Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ug logging - Pre-fill sets selected=true so retry answers survive collectAnswers - fieldAnswered checks otherInput is non-empty for Other selections - Cancel sends partial answers (Canceled=true + current state) before closing - parseInterviewResponse builds results in question order, validates options via matchChoice, normalizes yes/no, sets Incomplete for missing answers - Raw response body logging gated behind TRACKER_DEBUG env var Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Config validation:
- ResponseFormat must be json_object or json_schema
- json_schema requires non-empty valid JSON ResponseSchema
Session reliability:
- Exhausted empty response retries now hard-fail instead of stoppedNaturally
Workflow fixes (deep_review.dip):
- SaveGoal tool node preserves original developer goal before interviews
- Analysis agents read goal from .ai/review/goal.txt
- Empty priority questions allowed when no actionable findings
- Revise loop includes ReviewPlan feedback via ${ctx.human_response}
Interview UX:
- IsYesNo takes precedence over Options in all code paths
- Answered summary capped to viewport height
- Blank retry input preserves previous answer (console)
- Freeform fallback sets answersKey in ContextUpdates
- Autopilot includes previous answers in retry prompt
Other:
- HTML-escape Mermaid arrows in workflows.html
- interview-loop.dip prompts include ${ctx.interview_answers}
- Added ResponseFormat validation and blank-preserve tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…re-commit CI failed because examples/ had stale copies of the 3 existing workflows and deep_review.dip was missing from examples/ entirely. - Copied embedded workflows to examples/ (source of truth for CI) - Added deep_review.dip to Makefile sync-workflows and check-workflows targets - Added workflow sync check to .pre-commit hook (gate #8) - Added deep_review.dip to dippin lint list in .pre-commit Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…kflow Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a fallback-taken guard in goalGateRetryTarget using checkpoint context keys to ensure fallback/escalation routing is one-shot. Separate fallback routing from retry logic in handleExitNode so retries aren't over-counted and events are correctly labeled. Add timeout and assertions to TestGoalGateRetryFallsBackToFallbackTarget. Fix Go version in plan doc (1.24→1.25) and add language specifier to fenced code block in design spec. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 57 out of 58 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Check retry budget before allowing another loop. | ||
| maxR := e.maxRetries(node) | ||
| if cp.RetryCount(nodeID) >= maxR { | ||
| // Retries exhausted — look for a fallback/escalation target. | ||
| // Guard: only take the fallback once per gate to prevent infinite loops. | ||
| fallbackKey := "_fallback_taken:" + nodeID | ||
| if cp.Context != nil && cp.Context[fallbackKey] == "true" { | ||
| // Fallback already taken — signal unsatisfied without retry. | ||
| return "", nodeID, false, true | ||
| } | ||
| for _, fb := range []string{ | ||
| node.Attrs["fallback_target"], | ||
| node.Attrs["fallback_retry_target"], | ||
| e.graph.Attrs["fallback_target"], | ||
| e.graph.Attrs["fallback_retry_target"], | ||
| } { | ||
| if fb == "" { | ||
| continue | ||
| } | ||
| if _, ok := e.graph.Nodes[fb]; ok { | ||
| // Mark fallback as taken so it won't loop. | ||
| if cp.Context == nil { | ||
| cp.Context = map[string]string{} | ||
| } | ||
| cp.Context[fallbackKey] = "true" | ||
| return fb, nodeID, false, true |
There was a problem hiding this comment.
The fallback-taken guard is stored in cp.Context (fallbackKey), but saveCheckpoint() overwrites cp.Context with pctx.Snapshot() on every save. That means the marker will be lost the next time a checkpoint is saved (including immediately after routing), so the engine can still loop between exit and fallback forever. It also pollutes the user-visible context on resume. Consider persisting this state in a dedicated Checkpoint field (e.g., FallbackTaken map[string]bool) or another checkpoint-owned structure that isn’t replaced by snapshotting the pipeline context.
pipeline/handlers/human.go
Outdated
| fmt.Fprintf(c.Writer, "\nQ%d: %s\n", q.Index, q.Text) | ||
|
|
||
| if len(q.Options) > 0 { | ||
| // Print numbered options | ||
| for j, opt := range q.Options { | ||
| fmt.Fprintf(c.Writer, " %d) %s\n", j+1, opt) | ||
| } | ||
| fmt.Fprintf(c.Writer, " %d) Other\n", len(q.Options)+1) | ||
|
|
||
| // Pre-fill hint | ||
| if prevAns.Answer != "" { | ||
| fmt.Fprintf(c.Writer, "Previous: %s\n", prevAns.Answer) | ||
| } | ||
| fmt.Fprintf(c.Writer, "Enter choice (name or number, blank to skip): ") | ||
|
|
||
| line, err := c.readLine() | ||
| if err != nil { | ||
| canceled = true | ||
| answers[i] = ans | ||
| // Fill remaining questions with empty answers. | ||
| for j := i + 1; j < len(questions); j++ { | ||
| answers[j] = InterviewAnswer{ | ||
| ID: fmt.Sprintf("q%d", questions[j].Index), | ||
| Text: questions[j].Text, | ||
| } | ||
| } | ||
| break | ||
| } | ||
| input := strings.TrimSpace(line) | ||
| if input != "" { | ||
| // Match by name (case-insensitive) or number | ||
| matched := false | ||
| for _, opt := range q.Options { | ||
| if strings.EqualFold(input, opt) { | ||
| ans.Answer = opt | ||
| matched = true | ||
| break | ||
| } | ||
| } | ||
| if !matched { | ||
| var idx int | ||
| if _, err := fmt.Sscanf(input, "%d", &idx); err == nil && idx >= 1 && idx <= len(q.Options) { | ||
| ans.Answer = q.Options[idx-1] | ||
| } else if _, err := fmt.Sscanf(input, "%d", &idx); err == nil && idx == len(q.Options)+1 { | ||
| // User selected "Other" by number — prompt for freeform text. | ||
| fmt.Fprintf(c.Writer, "Enter your answer: ") | ||
| otherLine, otherErr := c.readLine() | ||
| if otherErr == nil { | ||
| ans.Answer = strings.TrimSpace(otherLine) | ||
| } | ||
| } else { | ||
| // Treat as "Other" freeform | ||
| ans.Answer = input | ||
| } | ||
| } | ||
| } else if prevAns.Answer != "" { | ||
| // Blank input preserves the previous answer on retry. | ||
| ans.Answer = prevAns.Answer | ||
| } | ||
| } else if q.IsYesNo { | ||
| if prevAns.Answer != "" { |
There was a problem hiding this comment.
ConsoleInterviewer.AskInterview checks len(q.Options) before q.IsYesNo, so a yes/no question that has explicit options (e.g. parsed from “(yes, no)”) will be presented as an option list and will also add an “Other” choice. This is inconsistent with the TUI behavior (which prioritizes IsYesNo) and can change semantics. Consider handling q.IsYesNo before options (or special-casing yes/no options) to keep the UX consistent.
cmd/tracker/embed_test.go
Outdated
| func TestListBuiltinWorkflowsReturnsThree(t *testing.T) { | ||
| workflows := listBuiltinWorkflows() | ||
| if len(workflows) != 3 { | ||
| t.Errorf("listBuiltinWorkflows returned %d workflows, want 3", len(workflows)) | ||
| if len(workflows) != 4 { | ||
| t.Errorf("listBuiltinWorkflows returned %d workflows, want 4", len(workflows)) | ||
| } |
There was a problem hiding this comment.
Test name no longer matches behavior: this test now expects 4 workflows but is still named TestListBuiltinWorkflowsReturnsThree. Renaming it will avoid confusion when diagnosing failures.
…priority, rename test - Move fallback-taken state from cp.Context (overwritten by pctx.Snapshot) to dedicated Checkpoint.FallbackTaken map that survives save/restore. - Reorder ConsoleInterviewer.AskInterview to check IsYesNo before len(Options), matching TUI behavior for yes/no questions with explicit option lists. - Rename TestListBuiltinWorkflowsReturnsThree to ReturnsFour to match the current assertion. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 14
🧹 Nitpick comments (5)
.pre-commit (1)
94-98: Consider preservingcheck-workflowsoutput on failureLine 94 suppresses mismatch details, so developers only get a generic error. Letting output through improves fix speed.
🔧 Minimal tweak
-if make check-workflows > /dev/null 2>&1; then +if make check-workflows; then pass "embedded workflows in sync" else fail "examples/*.dip out of sync with cmd/tracker/workflows/ — run 'make sync-workflows'" fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.pre-commit around lines 94 - 98, The current pre-commit hook hides the output of the make check-workflows target by redirecting stdout/stderr to /dev/null, so on failure developers only see a generic message; update the conditional that runs check-workflows (the make check-workflows invocation in the if/else block) to not discard its output so that mismatch details are printed on failure—i.e., remove the ">/dev/null 2>&1" redirection or change the logic to capture output only on success and echo the captured output when the command fails.docs/requests/dippin-lang-structured-output.md (1)
136-141: Add language specifier to fenced code block.The fenced code block should have a language identifier for consistent formatting and linting compliance. Since these are validation rule identifiers,
textorplaintextwould be appropriate.📝 Proposed fix
-``` +```text DIP_NEW_001: response_format must be "json_object" or "json_schema" DIP_NEW_002: response_schema requires response_format: json_schema DIP_NEW_003: response_schema must be valid JSON DIP_LINT_NEW: warning if json_schema without response_schema +```🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/requests/dippin-lang-structured-output.md` around lines 136 - 141, The fenced code block containing the validation rules (DIP_NEW_001, DIP_NEW_002, DIP_NEW_003, DIP_LINT_NEW) lacks a language specifier; update the opening triple-backtick to include a language such as "text" or "plaintext" (e.g., change ``` to ```text) so the block is consistently formatted and linting-compliant.pipeline/handlers/human_test.go (1)
659-747: Good console interviewer test coverage.Tests verify:
- Multi-question flow with different question types
- Selection by numeric index
- Blank input skipping behavior
- Previous answer preservation on blank input
Consider adding a test for "Other" selection by number (entering the
len(options)+1index) to complete coverage of that code path.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pipeline/handlers/human_test.go` around lines 659 - 747, Add a test that covers selecting "Other" by numeric index for ConsoleInterviewer.AskInterview: create a new test (e.g., TestConsoleInterviewer_AskInterview_OtherByNumber) that supplies input selecting the options length+1 (e.g., "3\nCustom value\n"), constructs a ConsoleInterviewer with that Reader/Writer, calls AskInterview with a Question that has two options, and asserts the returned InterviewResult contains the freeform "Custom value" as the answer; this ensures the code path handling the numeric "Other" choice (len(options)+1) is exercised and validated.pipeline/handlers/human.go (1)
285-299: Duplicatefmt.Sscanfparsing can be consolidated.The code parses the input twice with
fmt.Sscanf(lines 287 and 289). This works but is inefficient and slightly confusing.♻️ Suggested consolidation
if !matched { var idx int - if _, err := fmt.Sscanf(input, "%d", &idx); err == nil && idx >= 1 && idx <= len(q.Options) { - ans.Answer = q.Options[idx-1] - } else if _, err := fmt.Sscanf(input, "%d", &idx); err == nil && idx == len(q.Options)+1 { - // User selected "Other" by number — prompt for freeform text. - fmt.Fprintf(c.Writer, "Enter your answer: ") - otherLine, otherErr := c.readLine() - if otherErr == nil { - ans.Answer = strings.TrimSpace(otherLine) + if _, err := fmt.Sscanf(input, "%d", &idx); err == nil { + if idx >= 1 && idx <= len(q.Options) { + ans.Answer = q.Options[idx-1] + } else if idx == len(q.Options)+1 { + // User selected "Other" by number — prompt for freeform text. + fmt.Fprintf(c.Writer, "Enter your answer: ") + otherLine, otherErr := c.readLine() + if otherErr == nil { + ans.Answer = strings.TrimSpace(otherLine) + } + } else { + // Invalid index — treat as freeform "Other" + ans.Answer = input } } else { // Treat as "Other" freeform ans.Answer = input } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pipeline/handlers/human.go` around lines 285 - 299, Consolidate the duplicate fmt.Sscanf calls by parsing input into idx exactly once and reusing the result: in the branch where !matched, call fmt.Sscanf(input, "%d", &idx) a single time, check err==nil and then handle three cases (1) idx between 1 and len(q.Options) -> set ans.Answer = q.Options[idx-1], (2) idx == len(q.Options)+1 -> prompt via fmt.Fprintf(c.Writer, "Enter your answer: "), call otherLine, otherErr := c.readLine() and set ans.Answer = strings.TrimSpace(otherLine) on no error, and (3) otherwise -> fall back to treating input as freeform ans.Answer = input; keep variable names matched, input, q.Options, ans.Answer and c.readLine as in the diff.tui/interview_content.go (1)
635-636: Potential mismatch between question Index and total count display.The label uses
f.question.Index(which comes from parsed question data) alongsidelen(ic.fields)for the total. If question indices aren't sequential starting from 1, or if some questions were filtered, this could display confusing progress like "Q5 of 3". Consider usingic.cursor + 1instead off.question.Indexfor consistent "N of M" display.💡 Suggested fix for consistent progress display
- qLabel := fmt.Sprintf("Q%d of %d: %s", f.question.Index, len(ic.fields), f.question.Text) + qLabel := fmt.Sprintf("Q%d of %d: %s", ic.cursor+1, len(ic.fields), f.question.Text)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tui/interview_content.go` around lines 635 - 636, The Q label currently uses f.question.Index with len(ic.fields), which can mismatch if indices aren’t sequential or some questions are filtered; change the label construction to use the current cursor position (ic.cursor) for the displayed ordinal (use ic.cursor + 1) so the progress reads "Q{current} of {len(ic.fields)}" consistently—update the fmt.Sprintf invocation that builds qLabel (and any related uses of f.question.Index for progress display) to reference ic.cursor + 1 instead of f.question.Index.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@agent/config.go`:
- Around line 90-102: The current validation only checks ResponseSchema when
ResponseFormat == "json_schema", allowing a non-empty ResponseSchema to be
accepted when ResponseFormat is empty or "json_object"; add a guard to reject
orphaned ResponseSchema by returning an error if c.ResponseSchema != "" and
c.ResponseFormat != "json_schema". Update the validation block (around the
existing checks for c.ResponseFormat, c.ResponseSchema and json.Valid) to first
check that if c.ResponseSchema is non-empty then c.ResponseFormat must equal
"json_schema", otherwise return a clear error message referencing both
ResponseSchema and ResponseFormat.
In `@cmd/tracker/embed_test.go`:
- Around line 41-45: Rename the test function
TestListBuiltinWorkflowsReturnsThree to TestListBuiltinWorkflowsReturnsFour to
match the assertion expecting 4 workflows; update the function identifier in the
test file so the test name and its expectation (len(workflows) != 4) are
consistent and the test runner reflects the correct behavior.
In `@cmd/tracker/workflows/deep_review.dip`:
- Around line 365-368: Add an inline comment near the `ReviewPlan ->
WriteRemediationPlan` edge (the `revise` edge with `restart: true`) explaining
that the restart will be limited by the workflow's `max_restarts: 20` default to
prevent infinite revision loops; mention that `ReviewPlan` is `mode: freeform`
with exhaustive labeled edges (approve, revise, reject) so no extra failure
handling is required.
- Around line 25-30: The SaveGoal tool's command concatenates the expanded
${ctx.human_response} and runs it via sh -c without escaping, enabling shell
injection; update the expansion/execution path (where ExpandVariables() is used
for tool commands) to shell-escape or quote expanded values before invoking sh
-c (or avoid sh -c and pass arguments directly), and/or apply input validation
on human_response at the source; specifically, modify the code handling tool
command execution (the code that calls ExpandVariables() and then executes the
command via sh -c) to use a safe escaping routine or to build argv arrays so
metacharacters in human_response cannot be interpreted, mirroring the existing
working_dir validation approach in pipeline/handlers/tool.go.
In `@docs/superpowers/plans/2026-03-30-interview-mode.md`:
- Around line 478-483: The test uses a runtime type assertion on a concrete
*AutoApproveFreeformInterviewer (ii, ok := ai.(InterviewInterviewer)), which is
invalid in Go because assertions only work on interface values; replace that
assertion with a compile-time type assignment so the compiler will check that
AutoApproveFreeformInterviewer implements InterviewInterviewer (assign ai to a
variable of type InterviewInterviewer), then call ii.AskInterview(...) once
AskInterview is implemented.
In `@docs/superpowers/plans/2026-03-30-p0-critical-safety-fixes.md`:
- Line 15: The heading "### Task 1: Commit existing parallel context fix (`#20`)
and goal-gate retry fix (`#15`)" jumps from the document-level `#` to `###`
(MD001); change this to a `##` heading or insert an intermediate `##` section
above it so headings progress properly. Update the "Task 1: Commit existing
parallel context fix (`#20`) and goal-gate retry fix (`#15`)" line to use `##` (or
add a new `##` container heading before the `###`) to resolve the MD001 rule.
- Around line 1393-1404: The fenced dependency-graph block that starts with ```
(the tree listing of Task 1 → Task 10) needs an explicit language to satisfy
markdownlint MD040; update the opening fence from ``` to ```text so the block is
fenced as a text code block (leave the block contents unchanged and keep the
closing ```).
In `@examples/subgraphs/interview-loop.dip`:
- Around line 37-38: The current Interviewer output of {"questions": []} still
routes to the Answer node and triggers freeform input; update the graph so that
when Interviewer emits an empty question list it does not continue to Answer:
either add a conditional branch that checks for an empty "questions" array and
routes to a completion/exit node (short-circuiting the normal Answer path), or
change Interviewer to emit an explicit signal (e.g., {"complete": true} plus no
"questions") and alter downstream edges (the transitions from Interviewer to
Answer) to inspect that flag and bypass Answer when present; update all
equivalent places where Interviewer returns empty questions (the other two
branches mentioned) to use the same branching or explicit-complete signal.
In `@llm/anthropic/adapter.go`:
- Around line 118-120: The code currently logs the full raw provider response
via log.Printf when TRACKER_DEBUG is set, which can leak sensitive user/tool
data; update the logging in the anthropic response handling (where respBody is
used and log.Printf is called) to avoid printing cleartext: compute a stable
fingerprint (e.g., SHA-256) of respBody and log the hex prefix plus the byte
length, and if you must include content for diagnostics log only a small
redacted/truncated snippet (e.g., first 100–200 bytes) with sensitive characters
masked or replaced, ensuring the full respBody is never emitted; keep the
TRACKER_DEBUG conditional but replace the direct string interpolation of
respBody with these hashed/truncated diagnostics.
In `@pipeline/engine_checkpoint.go`:
- Around line 195-210: Update the function comment that currently says
fallback_target is only used when retries are exhausted to state that
fallback_target (both node-level and graph-level: node.Attrs["fallback_target"]
and e.graph.Attrs["fallback_target"]) is now included in the retry candidate
list and therefore can act as a retry destination when retries remain as well as
an escalation destination when retries are exhausted; mention that this behavior
is exercised by TestGoalGateFallbackTargetAttributeRecognized and that pipelines
using both retry_target and fallback_target together remain unaffected.
In `@pipeline/engine_goal_gate_test.go`:
- Around line 153-198: The test uses "fallback_target" but should use
"retry_target" on the "work" node to represent the retry hop; update the Attrs
map in TestGoalGateFallbackTargetAttributeRecognized (node ID "work") to set
"retry_target":"repair" instead of "fallback_target", and tighten the assertion
on workAttempts (the workAttempts variable) to require the expected retry count
(e.g., assert workAttempts == 2) so the test fails if retries didn't occur
exactly as intended.
- Around line 146-150: The test currently allows a green exit by checking
result.Status against OutcomeFail OR OutcomeSuccess; change it to require only
OutcomeFail so an engine that escalates but incorrectly exits success fails the
test. Locate the assertion using result.Status and OutcomeSuccess/OutcomeFail in
the failing test (around the terminal-check after escalation), replace the
boolean OR condition with a single check for OutcomeFail, and update the
t.Fatalf message to state "want fail" (or similar) so the intent is clear.
In `@pipeline/handlers/interview_result.go`:
- Around line 74-90: The code treats whitespace-only answers as non-empty;
update both places that check q.Answer to trim whitespace before testing: use
strings.TrimSpace(q.Answer) when deciding to render "(skipped)" in the first
loop (the block that writes Answer/Elaboration via b.WriteString) and likewise
use strings.TrimSpace(q.Answer) when counting answered in the loop that computes
answered/len(r.Questions); ensure strings is imported if not already.
In `@tui/interviewer.go`:
- Around line 227-233: In askMode2Interview, when receiving from the ReplyCh
created for MsgGateInterview, treat a closed channel (ok==false) as a TUI
failure instead of a user cancel: replace the current return of
&handlers.InterviewResult{Canceled: true} with returning a non-nil error that
indicates the TUI closed before replying (for example errors.New or fmt.Errorf),
so callers can distinguish a crashed/closed TUI from an intentional cancel;
update callers of askMode2Interview to handle this error path if necessary.
---
Nitpick comments:
In @.pre-commit:
- Around line 94-98: The current pre-commit hook hides the output of the make
check-workflows target by redirecting stdout/stderr to /dev/null, so on failure
developers only see a generic message; update the conditional that runs
check-workflows (the make check-workflows invocation in the if/else block) to
not discard its output so that mismatch details are printed on failure—i.e.,
remove the ">/dev/null 2>&1" redirection or change the logic to capture output
only on success and echo the captured output when the command fails.
In `@docs/requests/dippin-lang-structured-output.md`:
- Around line 136-141: The fenced code block containing the validation rules
(DIP_NEW_001, DIP_NEW_002, DIP_NEW_003, DIP_LINT_NEW) lacks a language
specifier; update the opening triple-backtick to include a language such as
"text" or "plaintext" (e.g., change ``` to ```text) so the block is consistently
formatted and linting-compliant.
In `@pipeline/handlers/human_test.go`:
- Around line 659-747: Add a test that covers selecting "Other" by numeric index
for ConsoleInterviewer.AskInterview: create a new test (e.g.,
TestConsoleInterviewer_AskInterview_OtherByNumber) that supplies input selecting
the options length+1 (e.g., "3\nCustom value\n"), constructs a
ConsoleInterviewer with that Reader/Writer, calls AskInterview with a Question
that has two options, and asserts the returned InterviewResult contains the
freeform "Custom value" as the answer; this ensures the code path handling the
numeric "Other" choice (len(options)+1) is exercised and validated.
In `@pipeline/handlers/human.go`:
- Around line 285-299: Consolidate the duplicate fmt.Sscanf calls by parsing
input into idx exactly once and reusing the result: in the branch where
!matched, call fmt.Sscanf(input, "%d", &idx) a single time, check err==nil and
then handle three cases (1) idx between 1 and len(q.Options) -> set ans.Answer =
q.Options[idx-1], (2) idx == len(q.Options)+1 -> prompt via
fmt.Fprintf(c.Writer, "Enter your answer: "), call otherLine, otherErr :=
c.readLine() and set ans.Answer = strings.TrimSpace(otherLine) on no error, and
(3) otherwise -> fall back to treating input as freeform ans.Answer = input;
keep variable names matched, input, q.Options, ans.Answer and c.readLine as in
the diff.
In `@tui/interview_content.go`:
- Around line 635-636: The Q label currently uses f.question.Index with
len(ic.fields), which can mismatch if indices aren’t sequential or some
questions are filtered; change the label construction to use the current cursor
position (ic.cursor) for the displayed ordinal (use ic.cursor + 1) so the
progress reads "Q{current} of {len(ic.fields)}" consistently—update the
fmt.Sprintf invocation that builds qLabel (and any related uses of
f.question.Index for progress display) to reference ic.cursor + 1 instead of
f.question.Index.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4090bf60-dd6c-47fa-b9b4-3a9912fcc729
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (55)
.pre-commitCHANGELOG.mdCLAUDE.mdMakefileREADME.mdagent/config.goagent/config_test.goagent/session.goagent/session_run.gocmd/tracker/embed_test.gocmd/tracker/workflows/deep_review.dipdocs/requests/dippin-lang-structured-output.mddocs/site/changelog.htmldocs/site/workflows.htmldocs/superpowers/plans/2026-03-30-interview-mode.mddocs/superpowers/plans/2026-03-30-p0-critical-safety-fixes.mddocs/superpowers/specs/2026-03-30-p0-critical-safety-fixes-design.mdexamples/ask_and_execute.dipexamples/build_product.dipexamples/build_product_with_superspec.dipexamples/deep_review.dipexamples/subgraphs/interview-loop.dipgo.modllm/anthropic/adapter.gopipeline/context.gopipeline/dippin_adapter.gopipeline/dippin_adapter_test.gopipeline/engine_checkpoint.gopipeline/engine_goal_gate_test.gopipeline/engine_run.gopipeline/events.gopipeline/events_test.gopipeline/handlers/autopilot.gopipeline/handlers/autopilot_claudecode.gopipeline/handlers/autopilot_claudecode_test.gopipeline/handlers/autopilot_test.gopipeline/handlers/codergen.gopipeline/handlers/human.gopipeline/handlers/human_test.gopipeline/handlers/interview_integration_test.gopipeline/handlers/interview_parse.gopipeline/handlers/interview_parse_test.gopipeline/handlers/interview_result.gopipeline/handlers/interview_result_test.gopipeline/handlers/parallel.gopipeline/handlers/tool_test.gotui/app.gotui/autopilot_interviewer.gotui/autopilot_interviewer_test.gotui/interview_content.gotui/interview_content_test.gotui/interviewer.gotui/interviewer_test.gotui/messages.gotui/modal.go
✅ Files skipped from review due to trivial changes (8)
- go.mod
- examples/ask_and_execute.dip
- pipeline/events_test.go
- pipeline/events.go
- examples/build_product_with_superspec.dip
- examples/build_product.dip
- agent/config_test.go
- CLAUDE.md
🚧 Files skipped from review as they are similar to previous changes (2)
- pipeline/context.go
- pipeline/handlers/parallel.go
| if c.ResponseFormat != "" { | ||
| if c.ResponseFormat != "json_object" && c.ResponseFormat != "json_schema" { | ||
| return fmt.Errorf("ResponseFormat must be \"json_object\" or \"json_schema\", got %q", c.ResponseFormat) | ||
| } | ||
| if c.ResponseFormat == "json_schema" { | ||
| if c.ResponseSchema == "" { | ||
| return fmt.Errorf("ResponseSchema must be non-empty when ResponseFormat is \"json_schema\"") | ||
| } | ||
| if !json.Valid([]byte(c.ResponseSchema)) { | ||
| return fmt.Errorf("ResponseSchema must be valid JSON, got %q", c.ResponseSchema) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Reject orphaned ResponseSchema to prevent silent misconfiguration
Line 90 onward validates schema only for json_schema, but a non-empty ResponseSchema is currently accepted when ResponseFormat is unset or json_object. That silently drops intent.
🔧 Suggested validation guard
+ if c.ResponseSchema != "" && c.ResponseFormat != "json_schema" {
+ return fmt.Errorf("ResponseSchema requires ResponseFormat to be %q", "json_schema")
+ }
if c.ResponseFormat != "" {
if c.ResponseFormat != "json_object" && c.ResponseFormat != "json_schema" {
return fmt.Errorf("ResponseFormat must be \"json_object\" or \"json_schema\", got %q", c.ResponseFormat)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if c.ResponseFormat != "" { | |
| if c.ResponseFormat != "json_object" && c.ResponseFormat != "json_schema" { | |
| return fmt.Errorf("ResponseFormat must be \"json_object\" or \"json_schema\", got %q", c.ResponseFormat) | |
| } | |
| if c.ResponseFormat == "json_schema" { | |
| if c.ResponseSchema == "" { | |
| return fmt.Errorf("ResponseSchema must be non-empty when ResponseFormat is \"json_schema\"") | |
| } | |
| if !json.Valid([]byte(c.ResponseSchema)) { | |
| return fmt.Errorf("ResponseSchema must be valid JSON, got %q", c.ResponseSchema) | |
| } | |
| } | |
| } | |
| if c.ResponseSchema != "" && c.ResponseFormat != "json_schema" { | |
| return fmt.Errorf("ResponseSchema requires ResponseFormat to be %q", "json_schema") | |
| } | |
| if c.ResponseFormat != "" { | |
| if c.ResponseFormat != "json_object" && c.ResponseFormat != "json_schema" { | |
| return fmt.Errorf("ResponseFormat must be \"json_object\" or \"json_schema\", got %q", c.ResponseFormat) | |
| } | |
| if c.ResponseFormat == "json_schema" { | |
| if c.ResponseSchema == "" { | |
| return fmt.Errorf("ResponseSchema must be non-empty when ResponseFormat is \"json_schema\"") | |
| } | |
| if !json.Valid([]byte(c.ResponseSchema)) { | |
| return fmt.Errorf("ResponseSchema must be valid JSON, got %q", c.ResponseSchema) | |
| } | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@agent/config.go` around lines 90 - 102, The current validation only checks
ResponseSchema when ResponseFormat == "json_schema", allowing a non-empty
ResponseSchema to be accepted when ResponseFormat is empty or "json_object"; add
a guard to reject orphaned ResponseSchema by returning an error if
c.ResponseSchema != "" and c.ResponseFormat != "json_schema". Update the
validation block (around the existing checks for c.ResponseFormat,
c.ResponseSchema and json.Valid) to first check that if c.ResponseSchema is
non-empty then c.ResponseFormat must equal "json_schema", otherwise return a
clear error message referencing both ResponseSchema and ResponseFormat.
| tool SaveGoal | ||
| label: "Snapshot the developer's goal" | ||
| command: | ||
| mkdir -p .ai/review | ||
| printf '%s\n' "${ctx.human_response}" > .ai/review/goal.txt | ||
| echo "Goal saved to .ai/review/goal.txt" |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check how tool command interpolation handles escaping
rg -n -B 2 -A 10 'ExpandVariables|interpolat' pipeline/*.go | head -60Repository: 2389-research/tracker
Length of output: 3057
🏁 Script executed:
# Get the full ExpandVariables implementation
cat -n pipeline/expand.go | head -100Repository: 2389-research/tracker
Length of output: 3574
🏁 Script executed:
# Check how tool commands are executed and if escaping is applied
rg -n -B 5 -A 15 'command:' cmd/tracker/workflows/deep_review.dipRepository: 2389-research/tracker
Length of output: 879
🏁 Script executed:
# Search for shell command execution logic in the pipeline
rg -n -B 3 -A 10 'exec|sh|bash|cmd' pipeline/*.go | grep -i 'command\|exec\|shell' | head -40Repository: 2389-research/tracker
Length of output: 4340
🏁 Script executed:
# Look for any shell-specific escaping or quote handling functions
rg -n 'ShellQuote|EscapeShell|shlex|Quote' pipeline/*.goRepository: 2389-research/tracker
Length of output: 621
🏁 Script executed:
# Search for tool handler implementation
rg -n -B 5 -A 20 'type.*Tool.*Handler|func.*Tool.*Execute' pipeline/*.goRepository: 2389-research/tracker
Length of output: 47
🏁 Script executed:
# Search for where shell commands are executed
rg -n -B 3 -A 10 'exec.Command|os.Sh|sh\s+-c' pipeline/*.goRepository: 2389-research/tracker
Length of output: 47
🏁 Script executed:
# Look for any recent changes or protections added to tool commands
git log --oneline --all | grep -i 'tool\|escap\|protect' | head -20Repository: 2389-research/tracker
Length of output: 1573
🏁 Script executed:
# Show the changes in the key commits
git show 830dafc --statRepository: 2389-research/tracker
Length of output: 1409
🏁 Script executed:
git show 1f1c98f --statRepository: 2389-research/tracker
Length of output: 795
🏁 Script executed:
# Look for tool handler implementation
find . -name '*.go' -type f | xargs grep -l 'tool' | grep handlerRepository: 2389-research/tracker
Length of output: 482
🏁 Script executed:
# Search for tool execution more broadly
rg -n 'type.*Handler' pipeline/*.go | grep -i toolRepository: 2389-research/tracker
Length of output: 47
🏁 Script executed:
# Read the tool handler implementation
cat -n pipeline/handlers/tool.goRepository: 2389-research/tracker
Length of output: 5237
Shell command executes expanded variables without escaping — shell injection risk.
The SaveGoal tool expands ${ctx.human_response} via ExpandVariables() and passes the result directly to sh -c. The variable expansion layer applies no shell escaping (see pipeline/expand.go), so shell metacharacters in human_response (e.g., $(rm -rf /), backticks, quotes) will be interpreted by the shell. While the working_dir attribute has metacharacter validation (pipeline/handlers/tool.go lines 64-72), the expanded tool command itself has no such protection. Ensure all expanded variables in tool commands are properly shell-escaped before execution, or constrain input values at the source.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cmd/tracker/workflows/deep_review.dip` around lines 25 - 30, The SaveGoal
tool's command concatenates the expanded ${ctx.human_response} and runs it via
sh -c without escaping, enabling shell injection; update the expansion/execution
path (where ExpandVariables() is used for tool commands) to shell-escape or
quote expanded values before invoking sh -c (or avoid sh -c and pass arguments
directly), and/or apply input validation on human_response at the source;
specifically, modify the code handling tool command execution (the code that
calls ExpandVariables() and then executes the command via sh -c) to use a safe
escaping routine or to build argv arrays so metacharacters in human_response
cannot be interpreted, mirroring the existing working_dir validation approach in
pipeline/handlers/tool.go.
| WriteRemediationPlan -> ReviewPlan | ||
| ReviewPlan -> ApplyFormat label: "approve" | ||
| ReviewPlan -> WriteRemediationPlan label: "revise" restart: true | ||
| ReviewPlan -> Done label: "reject" |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if there's default failure handling in the pipeline engine for nodes without fail edges
rg -n -A 5 'OutcomeFail|outcome.*fail' pipeline/engine*.go | head -50Repository: 2389-research/tracker
Length of output: 2838
🏁 Script executed:
cat -n cmd/tracker/workflows/deep_review.dip | head -400 | tail -100Repository: 2389-research/tracker
Length of output: 4359
🏁 Script executed:
# Check freeform mode handling and OutcomeFail scenarios
rg -n "mode.*freeform|freeform.*outcome" pipeline/ cmd/tracker/Repository: 2389-research/tracker
Length of output: 1540
🏁 Script executed:
# Look for max_restarts defaults
rg -n "max_restarts|defaults" cmd/tracker/workflows/deep_review.dipRepository: 2389-research/tracker
Length of output: 100
🏁 Script executed:
sed -n '380,440p' pipeline/handlers/human.goRepository: 2389-research/tracker
Length of output: 2341
🏁 Script executed:
sed -n '440,480p' pipeline/handlers/human.goRepository: 2389-research/tracker
Length of output: 1376
Add comment documenting the restart limit for the revision loop.
The revise edge loops back to WriteRemediationPlan with restart: true. The max_restarts: 20 default provides a safety limit to prevent infinite revision loops — consider adding a comment to document this behavior explicitly.
Note: ReviewPlan is a mode: freeform node with labeled edges (approve, revise, reject) that exhaustively route all responses, so no additional failure handling is needed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cmd/tracker/workflows/deep_review.dip` around lines 365 - 368, Add an inline
comment near the `ReviewPlan -> WriteRemediationPlan` edge (the `revise` edge
with `restart: true`) explaining that the restart will be limited by the
workflow's `max_restarts: 20` default to prevent infinite revision loops;
mention that `ReviewPlan` is `mode: freeform` with exhaustive labeled edges
(approve, revise, reject) so no extra failure handling is required.
| func TestAutoApproveInterviewInterviewer(t *testing.T) { | ||
| ai := &AutoApproveFreeformInterviewer{} | ||
| // Type-assert to InterviewInterviewer should work | ||
| ii, ok := ai.(InterviewInterviewer) // Will fail until implemented | ||
| result, err := ii.AskInterview(questions, nil) | ||
| // All answers empty, incomplete=false, canceled=false |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n docs/superpowers/plans/2026-03-30-interview-mode.md | sed -n '475,490p'Repository: 2389-research/tracker
Length of output: 652
🌐 Web query:
Go type assertion concrete type vs interface type rules
💡 Result:
In Go, type assertions x.(T) on an interface value x have different rules depending on whether the target type T is a concrete (non-interface) type or an interface type. For concrete T (not an interface): - Asserts that the dynamic type of x is identical to T. - T must implement x's static interface type (compile-time check); otherwise invalid. - If true, returns the stored value with static type T; else panics (unless comma-ok form used). For interface T: - Asserts that the dynamic type of x implements the interface T. - Succeeds if x's dynamic type satisfies T's type set. - Returns the value wrapped in interface T. Examples: Concrete: var x any = 7; i := x.(int) // succeeds, i is int Interface: type Fooer interface{Foo}; type Thing int; func(t Thing)Foo{}; var y any = Thing(3); f := y.(Fooer) // succeeds Always prefer comma-ok form v, ok := x.(T) for safety to avoid panics. Sources: Go spec (6,13,25), Boldly Go analyses (12,20), GeeksforGeeks/Rednafi (2,3).
Citations:
- 1: https://go.dev/ref/spec
- 2: https://boldlygo.tech/archive/2023-12-11-type-assertion-values
- 3: https://go.dev/doc/go1.17_spec
- 4: https://boldlygo.tech/archive/2023-12-06-type-assertions-of-non-interface-types/
- 5: https://www.geeksforgeeks.org/go-type-assertion-vs-type-switches/
- 6: https://rednafi.com/go/type-assertion-vs-type-switches/
- 7: https://yourbasic.org/golang/type-assertion-switch
This interface-check example is invalid Go.
Line 481 uses a type assertion on a concrete *AutoApproveFreeformInterviewer type, but Go type assertions only work on interface values. This code will never compile, even after AskInterview is implemented. Replace with an assignment to perform the compile-time interface check idiomatically:
var ii InterviewInterviewer = ai // compile-time check once implemented🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/superpowers/plans/2026-03-30-interview-mode.md` around lines 478 - 483,
The test uses a runtime type assertion on a concrete
*AutoApproveFreeformInterviewer (ii, ok := ai.(InterviewInterviewer)), which is
invalid in Go because assertions only work on interface values; replace that
assertion with a compile-time type assignment so the compiler will check that
AutoApproveFreeformInterviewer implements InterviewInterviewer (assign ai to a
variable of type InterviewInterviewer), then call ii.AskInterview(...) once
AskInterview is implemented.
| func TestGoalGateFallbackTargetAttributeRecognized(t *testing.T) { | ||
| // Verify that "fallback_target" (not just "fallback_retry_target") is recognized | ||
| // as a retry target when retries remain. | ||
| g := NewGraph("goal-gate-attr") | ||
|
|
||
| g.AddNode(&Node{ID: "start", Shape: "Mdiamond"}) | ||
| g.AddNode(&Node{ID: "work", Shape: "box", Attrs: map[string]string{ | ||
| "goal_gate": "true", | ||
| "fallback_target": "repair", | ||
| }}) | ||
| g.AddNode(&Node{ID: "repair", Shape: "box"}) | ||
| g.AddNode(&Node{ID: "done", Shape: "Msquare"}) | ||
|
|
||
| g.AddEdge(&Edge{From: "start", To: "work"}) | ||
| g.AddEdge(&Edge{From: "work", To: "done", Condition: "ctx.outcome = success"}) | ||
| g.AddEdge(&Edge{From: "work", To: "done", Condition: "ctx.outcome = fail"}) | ||
| g.AddEdge(&Edge{From: "repair", To: "work"}) | ||
|
|
||
| reg := newTestRegistry() | ||
| workAttempts := 0 | ||
| reg.Register(&testHandler{ | ||
| name: "codergen", | ||
| executeFn: func(ctx context.Context, node *Node, pctx *PipelineContext) (Outcome, error) { | ||
| if node.ID == "work" { | ||
| workAttempts++ | ||
| if workAttempts == 1 { | ||
| return Outcome{Status: OutcomeFail}, nil | ||
| } | ||
| return Outcome{Status: OutcomeSuccess}, nil | ||
| } | ||
| return Outcome{Status: OutcomeSuccess}, nil | ||
| }, | ||
| }) | ||
|
|
||
| engine := NewEngine(g, reg) | ||
| result, err := engine.Run(context.Background()) | ||
| if err != nil { | ||
| t.Fatalf("engine.Run failed: %v", err) | ||
| } | ||
| if result.Status != OutcomeSuccess { | ||
| t.Fatalf("status = %q, want %q", result.Status, OutcomeSuccess) | ||
| } | ||
| if workAttempts < 2 { | ||
| t.Fatalf("work ran %d times, expected at least 2", workAttempts) | ||
| } | ||
| } |
There was a problem hiding this comment.
Use retry_target for the repair hop in this test.
This test currently treats fallback_target as the normal retry path, but the PR contract describes fallback as the route taken after retries are exhausted. The workAttempts < 2 assertion on Line 195 is also too weak; extra loops still pass.
💡 Suggested fix
-func TestGoalGateFallbackTargetAttributeRecognized(t *testing.T) {
- // Verify that "fallback_target" (not just "fallback_retry_target") is recognized
- // as a retry target when retries remain.
+func TestGoalGateRetryTargetAttributeRecognized(t *testing.T) {
+ // Verify that "retry_target" is used while retries remain.
g := NewGraph("goal-gate-attr")
@@
g.AddNode(&Node{ID: "work", Shape: "box", Attrs: map[string]string{
- "goal_gate": "true",
- "fallback_target": "repair",
+ "goal_gate": "true",
+ "retry_target": "repair",
}})
@@
- if workAttempts < 2 {
- t.Fatalf("work ran %d times, expected at least 2", workAttempts)
+ if workAttempts != 2 {
+ t.Fatalf("work ran %d times, want 2", workAttempts)
}
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pipeline/engine_goal_gate_test.go` around lines 153 - 198, The test uses
"fallback_target" but should use "retry_target" on the "work" node to represent
the retry hop; update the Attrs map in
TestGoalGateFallbackTargetAttributeRecognized (node ID "work") to set
"retry_target":"repair" instead of "fallback_target", and tighten the assertion
on workAttempts (the workAttempts variable) to require the expected retry count
(e.g., assert workAttempts == 2) so the test fails if retries didn't occur
exactly as intended.
| func (b *BubbleteaInterviewer) askMode2Interview(questions []handlers.Question, prev *handlers.InterviewResult) (*handlers.InterviewResult, error) { | ||
| ch := make(chan string, 1) | ||
| b.send(MsgGateInterview{Questions: questions, Previous: prev, ReplyCh: ch}) | ||
| reply, ok := <-ch | ||
| if !ok { | ||
| return &handlers.InterviewResult{Canceled: true}, nil | ||
| } |
There was a problem hiding this comment.
Treat a closed interview reply channel as a TUI failure, not a cancel.
The existing choice/freeform gates return an error when the TUI program closes before replying. Returning Canceled: true here makes a crashed TUI or broken modal wiring indistinguishable from an intentional user cancel.
💡 Suggested fix
func (b *BubbleteaInterviewer) askMode2Interview(questions []handlers.Question, prev *handlers.InterviewResult) (*handlers.InterviewResult, error) {
ch := make(chan string, 1)
b.send(MsgGateInterview{Questions: questions, Previous: prev, ReplyCh: ch})
reply, ok := <-ch
if !ok {
- return &handlers.InterviewResult{Canceled: true}, nil
+ return nil, fmt.Errorf("TUI program closed before responding to interview gate")
}
result, err := handlers.DeserializeInterviewResult(reply)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func (b *BubbleteaInterviewer) askMode2Interview(questions []handlers.Question, prev *handlers.InterviewResult) (*handlers.InterviewResult, error) { | |
| ch := make(chan string, 1) | |
| b.send(MsgGateInterview{Questions: questions, Previous: prev, ReplyCh: ch}) | |
| reply, ok := <-ch | |
| if !ok { | |
| return &handlers.InterviewResult{Canceled: true}, nil | |
| } | |
| func (b *BubbleteaInterviewer) askMode2Interview(questions []handlers.Question, prev *handlers.InterviewResult) (*handlers.InterviewResult, error) { | |
| ch := make(chan string, 1) | |
| b.send(MsgGateInterview{Questions: questions, Previous: prev, ReplyCh: ch}) | |
| reply, ok := <-ch | |
| if !ok { | |
| return nil, fmt.Errorf("TUI program closed before responding to interview gate") | |
| } | |
| result, err := handlers.DeserializeInterviewResult(reply) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tui/interviewer.go` around lines 227 - 233, In askMode2Interview, when
receiving from the ReplyCh created for MsgGateInterview, treat a closed channel
(ok==false) as a TUI failure instead of a user cancel: replace the current
return of &handlers.InterviewResult{Canceled: true} with returning a non-nil
error that indicates the TUI closed before replying (for example errors.New or
fmt.Errorf), so callers can distinguish a crashed/closed TUI from an intentional
cancel; update callers of askMode2Interview to handle this error path if
necessary.
- Resolve merge conflict in human.go (keep IsYesNo-first ordering) - Move FallbackTaken guard to dedicated Checkpoint field (persists through saveCheckpoint which overwrites cp.Context with pctx.Snapshot) - Update goalGateRetryTarget doc to describe fallback_target dual role - Tighten fallback test assertion to require OutcomeFail only - Truncate debug response logging to 200 bytes (avoid leaking payloads) - Treat whitespace-only interview answers as skipped - Fix markdown heading level jump (MD001) and code fence language (MD040) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
goalGateRetryTargetnow checksRetryCountagainstmaxRetriesbefore allowing another loop. When exhausted, routes tofallback_targetor signals failure withEventStageFailed.PipelineContext.DiffFrom()method detects side-effect writes (pctx.Set()) that handlers make directly, not just values returned viaOutcome.ContextUpdates.DiffFrom(snapshot)with explicitContextUpdates, preventing silent data loss from branch handlers that write to context as a side effect.EventStageRetryingandEventStageFailedevents are now emitted for goal-gate retry loops, giving the TUI and trace visibility into retry behavior.Test plan
TestContextDiffFrom*— covers new keys, changed keys, unchanged exclusion, empty baseline/contextTestParallelHandlerCapturesSideEffectWrites— verifies side-effect-only writes are capturedTestParallelHandlerExplicitOverridesSideEffects— verifies explicit ContextUpdates win over side-effectspipeline/engine_goal_gate_test.go— new file covering goal-gate retry budget and fallback routinggo test ./... -short— all packages pass🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes
Documentation