fix(pi): surface OpenRouter provider errors#104

Merged
glittercowboy merged 1 commit into main from codex/openrouter-pi-error-propagation on Apr 29, 2026

Conversation

@glittercowboy
Contributor

@glittercowboy glittercowboy commented Apr 29, 2026

What

  • Pass OpenRouter API keys from the user service manager environment into Pi when the daemon process does not already have OPENROUTER_API_KEY.
  • Treat terminal Pi assistant stopReason: error messages as task failures instead of synthesizing a success result.
  • Add executor and actor coverage for OpenRouter auth failures.
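
The key fallback in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `withProviderKey`, `hasKey`, and the `lookupFunc` type are hypothetical names, and the lookup function stands in for whatever query the daemon makes against the service manager.

```go
package main

import (
	"fmt"
	"strings"
)

// lookupFunc is a hypothetical stand-in for the service-manager query.
// It is an assumption of this sketch, not the PR's API.
type lookupFunc func(key string) (string, bool)

// hasKey reports whether env already contains an entry for key.
// An empty value still counts as present here; a review comment later
// in this thread suggests a stricter, non-empty check.
func hasKey(env []string, key string) bool {
	prefix := key + "="
	for _, kv := range env {
		if strings.HasPrefix(kv, prefix) {
			return true
		}
	}
	return false
}

// withProviderKey returns env unchanged unless provider is "openrouter"
// and OPENROUTER_API_KEY is missing, in which case it appends the value
// returned by lookup, if any.
func withProviderKey(env []string, provider string, lookup lookupFunc) []string {
	const key = "OPENROUTER_API_KEY"
	if provider != "openrouter" || hasKey(env, key) {
		return env
	}
	if v, ok := lookup(key); ok && v != "" {
		// Copy before appending so the caller's slice is never mutated.
		out := append([]string{}, env...)
		return append(out, key+"="+v)
	}
	return env
}

func main() {
	lookup := func(string) (string, bool) { return "sk-example", true }
	env := withProviderKey([]string{"PATH=/usr/bin"}, "openrouter", lookup)
	fmt.Println(env)
}
```

Passing the lookup as a function keeps the sketch testable without a real service manager, which is similar in spirit to the tests described in this PR that force the service-manager lookup path.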

Why

OpenRouter provider failures can otherwise leave the web task in a working state while Pi records the error only in its session transcript.

Verification

  • go test ./internal/pi ./internal/session
  • go test ./...
  • go build -o /tmp/gsd-cloud-openrouter-error-fix .

Post-merge

  • Tag and publish the next daemon release so installed machines receive the fix.

Summary by CodeRabbit

  • Bug Fixes

    • OpenRouter provider now sources API key from the system daemon environment if not configured locally.
    • Stream handling treats OpenRouter authentication failures as real errors and provides actionable OPENROUTER_API_KEY guidance instead of emitting a synthetic success.
  • Tests

    • Added tests for daemon-based OpenRouter API key resolution.
    • Added tests ensuring auth failures propagate as errors and suppress success results.
    • Added fake PI harness mode to exercise auth-failure handling.

@coderabbitai

coderabbitai Bot commented Apr 29, 2026

Walkthrough

Builds the pi child-process environment from the request context and the provider env (injecting OPENROUTER_API_KEY from the service manager for openrouter when it is missing). Enhances stream parsing to extract agent_end errors (e.g., an OpenRouter auth failure) and return provider-aware errors instead of synthesizing a result.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Executor core (internal/pi/executor.go) | Reworks child process env construction to accept ctx, applies providerEnv before browserEnv, and injects OPENROUTER_API_KEY from the service manager for openrouter. Updates streamPiEvents to parse agent_end messages, extract provider/model-specific errors (e.g., OpenRouter auth), and return formatted errors (including OPENROUTER_API_KEY hints) instead of always emitting synthesized result. |
| Worker runtime (internal/pi/worker.go) | startLocked now calls processEnv with context.Background(). Prompt failure paths no longer immediately mark worker broken; they use w.stopAfterFailure(...) and wrap stop errors when applicable. |
| Tests, executor & worker (internal/pi/executor_test.go, internal/pi/worker_test.go) | Adds tests forcing service-manager env lookup for OpenRouter key and asserting OPENROUTER_API_KEY is written into generated env; adds tests simulating OpenRouter auth failure to ensure streamPiEvents and Prompt return errors containing OPENROUTER_API_KEY hint and that worker process stops. |
| Tests, actor / fake PI (internal/session/actor_test.go) | Extends fake-PI harness with FAKE_PI_AGENT_ERROR mode emitting OpenRouter-shaped auth failure sequence; adds test verifying actor produces a protocol.TaskError containing OPENROUTER_API_KEY and does not emit protocol.TaskComplete. |

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Executor
    participant ServiceMgr as Service Manager
    participant PiProcess as pi (child)
    participant StreamParser

    Client->>Executor: Start task (provider=openrouter)
    Executor->>ServiceMgr: lookupServiceManagerEnv("OPENROUTER_API_KEY")
    ServiceMgr-->>Executor: OPENROUTER_API_KEY=value
    Executor->>PiProcess: spawn pi with env (incl. OPENROUTER_API_KEY)
    PiProcess-->>StreamParser: NDJSON event stream
    StreamParser->>StreamParser: parse events (incl. agent_end)
    alt agent_end with error (stopReason="error")
        StreamParser-->>Executor: return formatted error (includes OPENROUTER_API_KEY hint)
        Executor-->>Client: report TaskError
    else normal completion
        StreamParser-->>Executor: emit result events
        Executor-->>Client: report TaskComplete
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through logs where daemon secrets hide,
Pulled out a key the system tried to bide,
When OpenRouter grumbled and could not comply,
I whispered: "OPENROUTER_API_KEY" — look nearby! 🥕✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 5.88% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(pi): surface OpenRouter provider errors' directly and specifically summarizes the main change: surfacing OpenRouter provider errors in the Pi executor. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/pi/executor.go`:
- Around line 353-371: The code currently treats any OPENROUTER_API_KEY= entry
as present (envHasKey), preventing fallback even when the value is blank; update
the check so providerEnv only treats the key as present when it has a non-empty
value. Concretely, change envHasKey (or add a new helper) to verify that for
entries with prefix key+"=", the substring after '=' when trimmed is not empty,
and have providerEnv use that stronger check (still using providerEnv and
envHasKey names to locate the logic).
- Around line 193-197: The providerEnv and serviceManagerEnv functions can block
because they call exec.Command without a context; change their signatures to
accept a context.Context (e.g., providerEnv(ctx context.Context, env []string,
provider string) and serviceManagerEnv(ctx context.Context, provider string))
and inside serviceManagerEnv use exec.CommandContext with a short timeout
context (e.g., context.WithTimeout) for any launchctl/systemctl calls to avoid
hanging; update all call sites (for example where executor.go constructs the
command environment: the call chain browserEnv(providerEnv(...), ...) should
pass the current ctx from piRPCCommand invocation) so the executor's ctx is
threaded through to providerEnv and serviceManagerEnv. Ensure timeouts are
cancellable and errors are propagated into the environment-building logic.
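
The two suggestions above can be sketched together. This is an illustration under stated assumptions rather than the fix that landed: `envHasValue` and `runWithTimeout` are hypothetical helper names, and `echo` is only a placeholder command standing in for the real launchctl/systemctl query, whose arguments are not shown in this thread.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// envHasValue reports whether env has key set to a non-blank value,
// so "KEY=" or "KEY=   " is treated as missing.
func envHasValue(env []string, key string) bool {
	prefix := key + "="
	for _, kv := range env {
		if strings.HasPrefix(kv, prefix) && strings.TrimSpace(kv[len(prefix):]) != "" {
			return true
		}
	}
	return false
}

// runWithTimeout runs a command bound to ctx plus a short timeout, so a
// hung service-manager query cannot block environment construction.
func runWithTimeout(ctx context.Context, timeout time.Duration, name string, args ...string) (string, error) {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	out, err := exec.CommandContext(ctx, name, args...).Output()
	if err != nil {
		return "", fmt.Errorf("%s: %w", name, err)
	}
	return strings.TrimSpace(string(out)), nil
}

func main() {
	fmt.Println(envHasValue([]string{"OPENROUTER_API_KEY="}, "OPENROUTER_API_KEY"))   // false
	fmt.Println(envHasValue([]string{"OPENROUTER_API_KEY=sk"}, "OPENROUTER_API_KEY")) // true

	// Placeholder command; the real lookup would query the service manager.
	out, err := runWithTimeout(context.Background(), 2*time.Second, "echo", "ok")
	fmt.Println(out, err)
}
```

`exec.CommandContext` kills the child process when the context is done, so cancelling the executor's context or hitting the timeout both end the lookup.
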

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 5afbe82d-0421-4e70-a47d-3e943481d083

📥 Commits

Reviewing files that changed from the base of the PR and between 521248d and 7100814.

📒 Files selected for processing (3)
  • internal/pi/executor.go
  • internal/pi/executor_test.go
  • internal/session/actor_test.go

Comment thread internal/pi/executor.go Outdated
Comment thread internal/pi/executor.go Outdated
@glittercowboy glittercowboy force-pushed the codex/openrouter-pi-error-propagation branch from 3f0e151 to 7b5fd95 on April 29, 2026 22:16

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/pi/executor.go`:
- Around line 554-557: The agent_end branch can leave the Pi subprocess running
because markBroken() only flips a flag; modify the error path so the process is
terminated synchronously: either update markBroken() to also send a termination
signal to the worker subprocess (use the same mechanism as the context
cancellation handler, e.g., syscall.Kill(-pid, syscall.SIGTERM) or
os.Process.Signal on the executor's subprocess field) or, immediately after
agentEndError(...) returns non-nil in streamPiEvents, call the worker teardown
routine to kill the process and wait for exit before returning the error;
reference streamPiEvents, agentEndError, markBroken and the executor's
subprocess/PID field when making the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 4c131f58-66ce-4dfc-842b-2dcc70aa952e

📥 Commits

Reviewing files that changed from the base of the PR and between 3f0e151 and 7b5fd95.

📒 Files selected for processing (4)
  • internal/pi/executor.go
  • internal/pi/executor_test.go
  • internal/pi/worker.go
  • internal/session/actor_test.go
✅ Files skipped from review due to trivial changes (1)
  • internal/pi/worker.go

Comment thread internal/pi/executor.go
@glittercowboy glittercowboy force-pushed the codex/openrouter-pi-error-propagation branch from 7b5fd95 to 930c8c5 on April 29, 2026 22:22

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
internal/pi/worker_test.go (1)

113-145: Consider checking the error from defer worker.Stop().

The static analysis tool flagged the unchecked error return at line 127. While this is a cleanup defer in test code, it would be more robust to handle potential stop failures explicitly.

🔧 Optional fix to check the error
-	defer worker.Stop(context.Background())
+	t.Cleanup(func() {
+		if err := worker.Stop(context.Background()); err != nil {
+			t.Logf("worker.Stop: %v", err)
+		}
+	})

Alternatively, if the error is truly ignorable in cleanup context, you can explicitly discard it:

-	defer worker.Stop(context.Background())
+	defer func() { _ = worker.Stop(context.Background()) }()

The rest of the test is well-structured and properly validates both the error message content and the worker state after failure.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/pi/worker_test.go` around lines 113 - 145, The defer calling
worker.Stop in TestWorkerStopsProcessAfterAgentEndError currently ignores its
returned error; update the defer to handle the error from worker.Stop(ctx) —
either check the returned error and call t.Fatalf/t.Logf if non-nil or
explicitly discard it (e.g. assign to _ ) to silence the static analyzer; ensure
you reference the same worker.Stop call used in the test to keep cleanup
behavior unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 018d6149-0149-4ead-9722-66ad844913d4

📥 Commits

Reviewing files that changed from the base of the PR and between 7b5fd95 and 930c8c5.

📒 Files selected for processing (5)
  • internal/pi/executor.go
  • internal/pi/executor_test.go
  • internal/pi/worker.go
  • internal/pi/worker_test.go
  • internal/session/actor_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/pi/executor.go

@glittercowboy glittercowboy merged commit bef2858 into main Apr 29, 2026
2 checks passed
@glittercowboy glittercowboy deleted the codex/openrouter-pi-error-propagation branch April 29, 2026 22:25