fix(pi): surface OpenRouter provider errors#104
Conversation
📝 WalkthroughWalkthroughBuilds the pi child-process environment from request context and provider env (injecting Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Executor
participant ServiceMgr as Service Manager
participant PiProcess as pi (child)
participant StreamParser
Client->>Executor: Start task (provider=openrouter)
Executor->>ServiceMgr: lookupServiceManagerEnv("OPENROUTER_API_KEY")
ServiceMgr-->>Executor: OPENROUTER_API_KEY=value
Executor->>PiProcess: spawn pi with env (incl. OPENROUTER_API_KEY)
PiProcess-->>StreamParser: NDJSON event stream
StreamParser->>StreamParser: parse events (incl. agent_end)
alt agent_end with error (stopReason="error")
StreamParser-->>Executor: return formatted error (includes OPENROUTER_API_KEY hint)
Executor-->>Client: report TaskError
else normal completion
StreamParser-->>Executor: emit result events
Executor-->>Client: report TaskComplete
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Review rate limit: 1/10 review remaining, refill in 53 minutes and 7 seconds. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/pi/executor.go`:
- Around line 353-371: The code currently treats any OPENROUTER_API_KEY= entry
as present (envHasKey), preventing fallback even when the value is blank; update
the check so providerEnv only treats the key as present when it has a non-empty
value. Concretely, change envHasKey (or add a new helper) to verify that for
entries with prefix key+"=", the substring after '=' when trimmed is not empty,
and have providerEnv use that stronger check (still using providerEnv and
envHasKey names to locate the logic).
- Around line 193-197: The providerEnv and serviceManagerEnv functions can block
because they call exec.Command without a context; change their signatures to
accept a context.Context (e.g., providerEnv(ctx context.Context, env []string,
provider string) and serviceManagerEnv(ctx context.Context, provider string))
and inside serviceManagerEnv use exec.CommandContext with a short timeout
context (e.g., context.WithTimeout) for any launchctl/systemctl calls to avoid
hanging; update all call sites (for example where executor.go constructs the
command environment: the call chain browserEnv(providerEnv(...), ...) should
pass the current ctx from piRPCCommand invocation) so the executor's ctx is
threaded through to providerEnv and serviceManagerEnv. Ensure timeouts are
cancellable and errors are propagated into the environment-building logic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 5afbe82d-0421-4e70-a47d-3e943481d083
📒 Files selected for processing (3)
internal/pi/executor.gointernal/pi/executor_test.gointernal/session/actor_test.go
3f0e151 to
7b5fd95
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/pi/executor.go`:
- Around line 554-557: The agent_end branch can leave the Pi subprocess running
because markBroken() only flips a flag; modify the error path so the process is
terminated synchronously: either update markBroken() to also send a termination
signal to the worker subprocess (use the same mechanism as the context
cancellation handler, e.g., syscall.Kill(-pid, syscall.SIGTERM) or
os.Process.Signal on the executor's subprocess field) or, immediately after
agentEndError(...) returns non-nil in streamPiEvents, call the worker teardown
routine to kill the process and wait for exit before returning the error;
reference streamPiEvents, agentEndError, markBroken and the executor's
subprocess/PID field when making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 4c131f58-66ce-4dfc-842b-2dcc70aa952e
📒 Files selected for processing (4)
internal/pi/executor.gointernal/pi/executor_test.gointernal/pi/worker.gointernal/session/actor_test.go
✅ Files skipped from review due to trivial changes (1)
- internal/pi/worker.go
7b5fd95 to
930c8c5
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
internal/pi/worker_test.go (1)
113-145: Consider checking the error fromdefer worker.Stop().The static analysis tool flagged the unchecked error return at line 127. While this is a cleanup defer in test code, it would be more robust to handle potential stop failures explicitly.
🔧 Optional fix to check the error
- defer worker.Stop(context.Background()) + t.Cleanup(func() { + if err := worker.Stop(context.Background()); err != nil { + t.Logf("worker.Stop: %v", err) + } + })Alternatively, if the error is truly ignorable in cleanup context, you can explicitly discard it:
- defer worker.Stop(context.Background()) + defer func() { _ = worker.Stop(context.Background()) }()The rest of the test is well-structured and properly validates both the error message content and the worker state after failure.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/pi/worker_test.go` around lines 113 - 145, The defer calling worker.Stop in TestWorkerStopsProcessAfterAgentEndError currently ignores its returned error; update the defer to handle the error from worker.Stop(ctx) — either check the returned error and call t.Fatalf/t.Logf if non-nil or explicitly discard it (e.g. assign to _ ) to silence the static analyzer; ensure you reference the same worker.Stop call used in the test to keep cleanup behavior unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@internal/pi/worker_test.go`:
- Around line 113-145: The defer calling worker.Stop in
TestWorkerStopsProcessAfterAgentEndError currently ignores its returned error;
update the defer to handle the error from worker.Stop(ctx) — either check the
returned error and call t.Fatalf/t.Logf if non-nil or explicitly discard it
(e.g. assign to _ ) to silence the static analyzer; ensure you reference the
same worker.Stop call used in the test to keep cleanup behavior unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 018d6149-0149-4ead-9722-66ad844913d4
📒 Files selected for processing (5)
internal/pi/executor.gointernal/pi/executor_test.gointernal/pi/worker.gointernal/pi/worker_test.gointernal/session/actor_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/pi/executor.go
What
OPENROUTER_API_KEY.stopReason: errormessages as task failures instead of synthesizing a success result.Why
OpenRouter provider failures can otherwise leave the web task in a working state while Pi records the error only in its session transcript.
Verification
go test ./internal/pi ./internal/sessiongo test ./...go build -o /tmp/gsd-cloud-openrouter-error-fix .Post-merge
Summary by CodeRabbit
Bug Fixes
Tests