Skip to content

feat(runner,manifests): alpha migration PR 6+7 — runners and Kustomize overlays#1379

Merged
mergify[bot] merged 6 commits intomainfrom
chore/alpha-migration-pr6-runners
Apr 21, 2026
Merged

feat(runner,manifests): alpha migration PR 6+7 — runners and Kustomize overlays#1379
mergify[bot] merged 6 commits intomainfrom
chore/alpha-migration-pr6-runners

Conversation

@markturansky
Copy link
Copy Markdown
Contributor

@markturansky markturansky commented Apr 21, 2026

Summary

PR 6+7 of the alpha-to-main migration (plan).
Combined runners (PR 6) and manifests (PR 7) since all component dependencies have landed.

Runners (44 files, ~4.9K insertions)

  • gRPC transport: Session message streaming via gRPC, delta buffer
  • gRPC client: Control-plane token endpoint client
  • Inbox/session messages APIs: Full CRUD with delta buffer
  • Credential system: fetch/populate/clear, gh CLI wrapper, sanitize_user_context
  • SSE enhancements: Flush-per-chunk, unbounded tap queue
  • CP OIDC token: RSA keypair auth for backend credential fetches
  • New deps: cryptography>=42.0.0, grpcio>=1.60.0, protobuf>=4.25.0
  • New tests: grpc_client, grpc_transport, grpc_writer, events_endpoint, app_initial_prompt

Manifests (33 files, ~980 insertions)

  • mpp-openshift overlay: NetworkPolicy, gRPC Route, CP token Service, RBAC, MCP sidecar
  • production overlay: route and kustomization updates
  • openshift-dev overlay: new overlay with args/env patches
  • Removed: deprecated cluster-reader overlay
  • Base RBAC: control-plane ClusterRole, ClusterRoleBinding, ServiceAccount

Migration plan updated

  • PRs 1-5 marked as merged (with PR numbers)
  • PR 6+7 combined into single PR

Verification

  • ruff check .
  • ruff format --check .
  • python -m pytest tests/ ✅ (707 passed, 11 skipped)
  • kustomize build ✅ (base, kind-local, mpp-openshift, openshift-dev, production)
  • 1 pre-existing failure in test_observability_mlflow_integration.py (missing optional mlflow dep, unchanged file)

Test plan

  • Verify runner tests pass in CI
  • Verify ruff lint/format passes
  • Verify kustomize build for all overlays
  • Verify validate-manifests CI job passes

After this PR

Only PR 7.1 (cleanup) remains — delete the migration plan file.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added gRPC-based session message streaming for improved platform communication
    • Implemented enhanced token-based authentication with automatic refresh capabilities
    • Added support for Kubernetes service account credentials
    • Introduced Ambient Control Plane deployment with RBAC and network policies
  • Bug Fixes

    • Removed unsupported "thinking" event type from Gemini CLI adapter
    • Improved error handling for credential population and session message delivery
  • Infrastructure

    • Added Kubernetes manifests for multiple deployment environments (OpenShift, local dev, MPP)
    • Enhanced credential management with kubeconfig and multi-provider support
    • Added gRPC dependencies (cryptography, grpcio, protobuf)

@netlify
Copy link
Copy Markdown

netlify Bot commented Apr 21, 2026

Deploy Preview for cheerful-kitten-f556a0 canceled.

Name Link
🔨 Latest commit f240cd9
🔍 Latest deploy log https://app.netlify.com/projects/cheerful-kitten-f556a0/deploys/69e7e0e1a67d600008d72702

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 21, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR introduces a bidirectional gRPC transport layer enabling persistent session-based message delivery between a Kubernetes-deployed ambient runner and a control-plane API server. It adds gRPC infrastructure (client, session/inbox message APIs), integrates gRPC into the Claude bridge via a listener and message writer, updates endpoints and middleware for event streaming, enhances credential handling with kubeconfig and CP token support, and provisions OpenShift deployment manifests.

Changes

Cohort / File(s) Summary
gRPC Client & Core Infrastructure
ambient_runner/_grpc_client.py, ambient_runner/_session_messages_api.py, ambient_runner/_inbox_messages_api.py
Added new gRPC client factory, RSA-OAEP session-ID encryption, CP token-fetch with retries/exponential backoff, TLS/bearer-token channel construction, and stateful session/inbox message streaming wrappers using hand-built protobuf wire encoding.
Claude Bridge gRPC Transport
ambient_runner/bridges/claude/grpc_transport.py, ambient_runner/bridges/claude/bridge.py
Introduced GRPCSessionListener for pod-lifetime watch of WatchSessionMessages, GRPCMessageWriter for durable push of terminal events, and async message fan-out to SSE queue taps; added bridge lifecycle hooks (start_grpc_listener, shutdown).
Event Streaming & Middleware
ambient_runner/endpoints/events.py, ambient_runner/endpoints/run.py, ambient_runner/middleware/grpc_push.py, ambient_runner/middleware/__init__.py
Replaced /events with /events/{thread_id} queue-based SSE; added /events/{thread_id}/wait polling endpoint; introduced grpc_push_middleware to fire-and-forget push events to gRPC; integrated middleware into run endpoint.
Authentication & Token Management
ambient_runner/platform/auth.py, ambient_runner/platform/utils.py
Extended credential population with kubeconfig support, unified token field handling (Google/Jira), added CP token caching and refresh via refresh_bot_token(), implemented get_sa_token() for mounted Kubernetes SA tokens.
Initial Prompt & App Lifecycle
ambient_runner/app.py
Added gated gRPC listener startup in lifespan (awaiting readiness), split HTTP/gRPC initial-prompt dispatch paths, pre-registers SSE queue for race-condition mitigation.
Claude Bridge Config & Credentials
ambient_runner/bridges/claude/mcp.py, ambient_runner/bridges/claude/prompts.py
Added environment-driven MCP ambient-server injection with SSE transport; prepended DEFAULT_AGENT_PREAMBLE to workspace context.
Session & Tool Management
ambient_runner/bridges/claude/session.py, ambient_runner/bridges/claude/tools.py, ambient_runner/platform/prompts.py
Reformatted session reader handling (no functional change); removed MCP post-refresh diagnostics check; added DEFAULT_AGENT_PREAMBLE constant and removed GIT_SAFETY_INSTRUCTIONS.
Gemini & LangGraph Bridges
ag_ui_gemini_cli/types.py, ambient_runner/bridges/gemini_cli/bridge.py, ambient_runner/bridges/gemini_cli/session.py, ambient_runner/bridges/langgraph/bridge.py
Removed ThinkingEvent from Gemini event parsing; reformatted bridge signatures (no logic changes); removed subprocess limit parameter.
Claude SDK Adapter & Handlers
ag_ui_claude_sdk/adapter.py, ag_ui_claude_sdk/handlers.py, ag_ui_claude_sdk/hooks.py, ag_ui_claude_sdk/reasoning_events.py
Reformatted event emission and message dispatch (no functional changes); adjusted hook initialization whitespace; updated event constructor formatting.
gRPC Configuration & Discovery
components/runners/ambient-runner/.mcp.json
Added OpenShift MCP server entry using kubernetes-mcp-server with Ambient kubeconfig disabled.
OpenShift & MPP Manifests
components/manifests/overlays/mpp-openshift/*, components/manifests/overlays/openshift-dev/*
Introduced full MPP/OpenShift overlay stack: API server with TLS/gRPC, PostgreSQL database, control-plane deployment, tenant RBAC, network policies, and bootstrap documentation.
Base & Production Manifests
components/manifests/base/*, components/manifests/overlays/production/*, components/manifests/overlays/local-dev/*
Added control-plane service account/RBAC to base; extended image overrides; added HAProxy timeout annotations; removed deprecated cluster-reader overlay.
Dependencies & Project Config
pyproject.toml
Added cryptography>=42.0.0, grpcio>=1.60.0, protobuf>=4.25.0.
Architecture & Migration Docs
architecture.md, docs/internal/proposals/alpha-to-main-migration.md
Added comprehensive architecture documentation covering HTTP/gRPC delivery modes, lifespan control flow, gRPC listener lifecycle, SSE queue contracts, and bridge abstractions; updated migration checklist.
New Tests
tests/test_app_initial_prompt.py, tests/test_events_endpoint.py, tests/test_grpc_client.py, tests/test_grpc_transport.py, tests/test_grpc_writer.py, tests/test_bridge_claude.py (expanded)
Comprehensive test coverage: gRPC client (encryption, CP token-fetch, channel construction), session listener (watch filtering, fan-out, SSE queue), message writer (snapshot accumulation, terminal-event pushes), initial-prompt dispatch routing, events endpoint (queue lifecycle, cleanup semantics), and bridge listener lifecycle.
Test Refactoring & Cleanup
tests/test_gemini_cli_adapter.py, tests/test_gemini_session.py, tests/test_auto_push.py, tests/test_shared_session_credentials.py (expanded), tests/test_e2e_api.py, tests/test_gemini_auth.py, tests/test_google_drive_e2e.py
Removed Gemini thinking-event tests; deleted subprocess limit tests; removed Git-safety prompt tests; expanded credential lifecycle with kubeconfig/CP-token mocking; refactored dynamic module attribute fetching; minor whitespace cleanup.
Minor Formatting
ambient_runner/tools/backend_api.py, ambient_runner/platform/context.py, ambient_runner/endpoints/tasks.py, ambient_runner/middleware/secret_redaction.py
Reformatted conditional/dict literals and function signatures (no logic changes).

Sequence Diagram(s)

sequenceDiagram
    participant Pod as Ambient Runner Pod
    participant Bridge as ClaudeBridge
    participant CPServer as Control Plane
    participant SSEClient as SSE Client
    
    Note over Pod,CPServer: gRPC Initialization (Lifespan)
    Pod->>Bridge: start_grpc_listener(grpc_url)
    Bridge->>CPServer: Create AmbientGRPCClient
    Bridge->>CPServer: Start WatchSessionMessages stream
    Note over Bridge: Ready event set
    Pod->>Pod: Pre-register SSE queue
    Pod->>CPServer: Push INITIAL_PROMPT via gRPC
    
    Note over Pod,CPServer: Session Message Delivery
    CPServer->>Bridge: Stream SessionMessage (event_type=user)
    Bridge->>Bridge: Parse RunnerInput from payload
    Bridge->>Bridge: Invoke bridge.run() for turn
    
    Note over Bridge,SSEClient: Event Fan-out
    Bridge->>Bridge: Emit events from run()
    Bridge->>Bridge: Enqueue to active_streams[thread_id]
    Bridge->>CPServer: Push to session_messages (assistant)
    SSEClient->>Pod: GET /events/{thread_id}
    Pod->>SSEClient: Stream events (TEXT, TOOL_CALL, etc)
    
    Note over Bridge,CPServer: Terminal Event & Cleanup
    Bridge->>Bridge: Emit RUN_FINISHED
    Bridge->>CPServer: Push status=completed
    Bridge->>Bridge: Remove active_streams[thread_id]
    SSEClient->>SSEClient: Stream ends
Loading
sequenceDiagram
    participant Client as Client
    participant App as Ambient Runner App
    participant Auth as Auth/Crypto
    participant CPServer as CP Token Server
    
    Note over Client,CPServer: CP Token Fetch (On Startup)
    App->>Auth: refresh_bot_token()
    alt CP Config Present
        Auth->>Auth: Read AMBIENT_CP_TOKEN_PUBLIC_KEY
        Auth->>Auth: RSA-OAEP encrypt SESSION_ID
        Auth->>CPServer: POST /token with Bearer header
        CPServer->>CPServer: Validate request
        CPServer->>Auth: Return {token: ...}
        Auth->>Auth: Cache token (set_bot_token)
        Auth->>App: Return refreshed token
    else No CP Config
        Auth->>Auth: Fall back to get_bot_token()
        Auth->>App: Return cached/mounted token
    end
    
    Note over App,Client: Subsequent API Calls
    App->>Auth: get_bot_token()
    Auth->>Auth: Return cached CP token first
    App->>CPServer: Request with bearer token
Loading
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch chore/alpha-migration-pr6-runners
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch chore/alpha-migration-pr6-runners

@markturansky markturansky force-pushed the chore/alpha-migration-pr6-runners branch from fe35a64 to 7f7227d Compare April 21, 2026 16:32
@markturansky markturansky changed the title feat(runner): alpha migration PR 6 — runners feat(runner,manifests): alpha migration PR 6+7 — runners and Kustomize overlays Apr 21, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
components/runners/ambient-runner/ag_ui_gemini_cli/types.py (1)

96-96: ⚠️ Potential issue | 🟠 Major

Avoid logging raw event payload content on parse failures.

line[:120] can include user prompts or tokens from malformed JSON and leak sensitive data into logs. Log metadata only (for example, size/source/hash), not payload text.

As per coding guidelines components/runners/ambient-runner/**/*.py: Check subprocess handling, timeout management, and that secrets are not logged.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ag_ui_gemini_cli/types.py` at line 96, The
warning currently logs raw payload via logger.warning("Failed to parse JSONL
line: %s", line[:120]) which may leak sensitive user data; change this to log
only metadata about the failing input (e.g., its byte/char length, source
identifier if available, and a non-reversible hash like SHA256) and remove any
substring of the original payload. Locate the logger.warning call that
references the variable line in types.py and replace the message to include
computed metadata (length, source/name, sha256 or similar) while ensuring no
payload text is included in the log.
components/runners/ambient-runner/ambient_runner/app.py (1)

305-308: ⚠️ Potential issue | 🟠 Major

Duplicate router registration.

The events_router is already registered at line 263. This second registration (lines 306-308) adds duplicate routes.

🐛 Remove duplicate registration
-    # Between-run event stream (always registered)
-    from ambient_runner.endpoints.events import router as events_router
-
-    app.include_router(events_router)

     caps = bridge.capabilities()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/app.py` around lines 305 -
308, Remove the duplicate router registration by deleting the second
app.include_router(events_router) call; ensure only the first registration of
events_router (the existing import from ambient_runner.endpoints.events and the
initial app.include_router(events_router) at the earlier location) remains so
routes are not registered twice.
components/runners/ambient-runner/ambient_runner/platform/auth.py (1)

401-409: ⚠️ Potential issue | 🟠 Major

Jira credential field mismatch: env vars not set when backend returns apiToken.

fetch_jira_credentials() (line 304) and mcp.py (line 264) both check for apiToken, but populate_runtime_credentials() here checks for token (line 405). If the backend returns {"apiToken": "..."}, the env vars JIRA_URL, JIRA_API_TOKEN, and JIRA_EMAIL are silently skipped.

Use consistent field name: update line 405 to check apiToken, and line 407 to read from apiToken.

Proposed fix
-    elif jira_creds.get("token"):
+    elif jira_creds.get("apiToken"):
         os.environ["JIRA_URL"] = jira_creds.get("url", "")
-        os.environ["JIRA_API_TOKEN"] = jira_creds.get("token", "")
+        os.environ["JIRA_API_TOKEN"] = jira_creds.get("apiToken", "")
         os.environ["JIRA_EMAIL"] = jira_creds.get("email", "")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/platform/auth.py` around
lines 401 - 409, populate_runtime_credentials() is checking for
jira_creds.get("token") but fetch_jira_credentials() and mcp.py use "apiToken",
so credential data from the backend is being skipped; change the check in
populate_runtime_credentials() to look for "apiToken" and read
jira_creds.get("apiToken") when setting JIRA_API_TOKEN (keep JIRA_URL and
JIRA_EMAIL assignments as-is) so the environment vars are populated consistently
with fetch_jira_credentials() and mcp.py.
🧹 Nitpick comments (1)
components/runners/ambient-runner/ambient_runner/endpoints/tasks.py (1)

58-62: Align synthetic task:completed payload shape with adapter-emitted payloads.
stop_task currently emits a reduced value object, while adapter notifications include usage and output_file fields (components/runners/ambient-runner/ag_ui_claude_sdk/adapter.py, Line 679-694). Keeping one schema avoids client-side branching.

Proposed schema-alignment diff
     completed_event = CustomEvent(
         type=EventType.CUSTOM,
         name="task:completed",
         value={
             "task_id": task_id,
             "status": "stopped",
             "summary": "Task stopped by user",
+            "usage": None,
+            "output_file": None,
         },
     )
As per coding guidelines, "Documentation and refactoring suggestions are welcome when substantive."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/endpoints/tasks.py` around
lines 58 - 62, The synthetic payload emitted in stop_task (the value dict for
the "task:completed" notification) is missing fields present in adapter-emitted
payloads—specifically usage and output_file—leading to divergent schemas; update
the stop_task emission in ambient_runner/endpoints/tasks.py so its value
includes the same keys as adapter notifications (task_id, status, summary,
usage, output_file), populating usage and output_file with sensible defaults
(e.g., zeroed usage object and null/empty output_file) or actual data if
available, and ensure the event name "task:completed" and the value shape
exactly match the adapter's payload contract used in ag_ui_claude_sdk/adapter.py
(lines where usage/output_file are set).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/runners/ambient-runner/ambient_runner/_session_messages_api.py`:
- Around line 103-117: The UNAUTHENTICATED retry path calls
self._grpc_client.reconnect() and then mutates self._push_rpc/self._metadata
without guarding against failures; wrap the reconnect +
new_api/_push_rpc/_metadata assignment in a try/except Exception block, catch
any exception as e, log it (e.g., logger.exception or logger.error with context
including session_id) and do not let it propagate—exit the retry branch (do not
continue the loop) so the failure is handled gracefully and the outer flow
remains fire-and-forget. Keep the rest of the UNAUTHENTICATED handling (the
warning log and reconnect attempt) as-is but protect the reconnect and new_api
assignments with this local try/except.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 215-220: This call sequence (clear_runtime_credentials(),
populate_runtime_credentials(self._context),
populate_mcp_server_credentials(self._context)) mutates process-global env and
shared token files and must be serialized across all concurrent bridge runs;
introduce and use a bridge-wide async lock (e.g. self._creds_lock) to guard the
entire refresh path and also use the same lock around the
clear_runtime_credentials() call in run()'s finally block so no other thread can
clear or overwrite credentials mid-turn; update methods/blocks invoking these
helpers (the location that sets self._last_creds_refresh and the run() finally
block) to acquire the bridge-wide lock before calling
clear_runtime_credentials/populate_* and release after.
- Around line 482-503: start_grpc_listener currently starts the
GRPCSessionListener and returns immediately; change it to await the listener's
readiness with a bounded timeout: after calling self._grpc_listener.start(), use
asyncio.wait_for(self._grpc_listener.ready.wait(), timeout=GRPC_STARTUP_TIMEOUT)
(or fetch a configured timeout) and raise a clear RuntimeError or log and clean
up if the wait times out; reference the start_grpc_listener method, the
GRPCSessionListener instantiation, the self._grpc_listener.start() call, and the
self._grpc_listener.ready event to implement this bounded readiness wait.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/mcp.py`:
- Line 72: The log call prints the raw ambient_mcp_url which may include
sensitive userinfo or query tokens; change the logger.info usage around the
ambient_mcp_url variable (the logger.info("Added ambient MCP sidecar server
(SSE): %s", ambient_mcp_url) call) to redact secrets before logging — e.g.,
parse ambient_mcp_url and remove userinfo and query params (or log only
scheme+host+path or a fixed label), then log the redacted value or omit the URL
entirely.

In `@components/runners/ambient-runner/ambient_runner/endpoints/run.py`:
- Around line 85-97: session_id currently falls back to the empty string which
disables gRPC persistence; change it to use the same fallback ClaudeBridge.run()
uses by deriving session_id from run_agent_input.thread_id or the bridge context
(e.g., session_id = run_agent_input.thread_id or (getattr(bridge, "context",
None) and bridge.context.session_id)), and pass that value to
grpc_push_middleware instead of "" so gRPC persistence is preserved; update the
session_id variable near its declaration and ensure grpc_push_middleware is
given None or the actual session id rather than an empty string.

In `@components/runners/ambient-runner/ambient_runner/middleware/grpc_push.py`:
- Around line 101-109: The loop yields events from event_stream then calls the
blocking function _push_event which synchronously invokes
grpc_client.session_messages.push(), blocking the async loop; change this to
fire-and-forget by offloading the blocking gRPC call: either call
asyncio.get_running_loop().run_in_executor(None, _push_event, grpc_client, sid,
event) or create an asyncio.create_task that runs an async wrapper (e.g.
_push_event_async) which itself uses run_in_executor to invoke
grpc_client.session_messages.push(), ensure you do not await it so streaming
stays non-blocking and handle/log exceptions inside the wrapper; keep the
existing grpc_client.close() in the finally block.

In `@components/runners/ambient-runner/tests/test_app_initial_prompt.py`:
- Line 119: Replace the deprecated call to asyncio.get_event_loop() assigned to
real_loop with asyncio.get_running_loop() so the async test uses the running
loop; locate the assignment to real_loop and change the function call from
asyncio.get_event_loop() to asyncio.get_running_loop() to match the existing use
of get_running_loop() elsewhere in the test.

In `@components/runners/ambient-runner/tests/test_grpc_writer.py`:
- Around line 27-34: The test helper make_writer currently converts an explicit
None grpc_client into a MagicMock, so the GRPCMessageWriter._write_message
branch where self._grpc_client is None is never exercised; update make_writer to
accept and pass through a sentinel (or allow grpc_client=None to be passed
through) so you can construct GRPCMessageWriter with grpc_client=None in the
test that targets the early-return path, and add/adjust the test at Line ~211 to
call make_writer(grpc_client=None) (or directly instantiate GRPCMessageWriter
with grpc_client=None) to verify the early return in
GRPCMessageWriter._write_message.

---

Outside diff comments:
In `@components/runners/ambient-runner/ag_ui_gemini_cli/types.py`:
- Line 96: The warning currently logs raw payload via logger.warning("Failed to
parse JSONL line: %s", line[:120]) which may leak sensitive user data; change
this to log only metadata about the failing input (e.g., its byte/char length,
source identifier if available, and a non-reversible hash like SHA256) and
remove any substring of the original payload. Locate the logger.warning call
that references the variable line in types.py and replace the message to include
computed metadata (length, source/name, sha256 or similar) while ensuring no
payload text is included in the log.

In `@components/runners/ambient-runner/ambient_runner/app.py`:
- Around line 305-308: Remove the duplicate router registration by deleting the
second app.include_router(events_router) call; ensure only the first
registration of events_router (the existing import from
ambient_runner.endpoints.events and the initial
app.include_router(events_router) at the earlier location) remains so routes are
not registered twice.

In `@components/runners/ambient-runner/ambient_runner/platform/auth.py`:
- Around line 401-409: populate_runtime_credentials() is checking for
jira_creds.get("token") but fetch_jira_credentials() and mcp.py use "apiToken",
so credential data from the backend is being skipped; change the check in
populate_runtime_credentials() to look for "apiToken" and read
jira_creds.get("apiToken") when setting JIRA_API_TOKEN (keep JIRA_URL and
JIRA_EMAIL assignments as-is) so the environment vars are populated consistently
with fetch_jira_credentials() and mcp.py.

---

Nitpick comments:
In `@components/runners/ambient-runner/ambient_runner/endpoints/tasks.py`:
- Around line 58-62: The synthetic payload emitted in stop_task (the value dict
for the "task:completed" notification) is missing fields present in
adapter-emitted payloads—specifically usage and output_file—leading to divergent
schemas; update the stop_task emission in ambient_runner/endpoints/tasks.py so
its value includes the same keys as adapter notifications (task_id, status,
summary, usage, output_file), populating usage and output_file with sensible
defaults (e.g., zeroed usage object and null/empty output_file) or actual data
if available, and ensure the event name "task:completed" and the value shape
exactly match the adapter's payload contract used in ag_ui_claude_sdk/adapter.py
(lines where usage/output_file are set).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: cb6929ac-dbdc-4c8b-af01-5602bfdb974b

📥 Commits

Reviewing files that changed from the base of the PR and between aa3b433 and fe35a64.

📒 Files selected for processing (47)
  • components/runners/ambient-runner/.mcp.json
  • components/runners/ambient-runner/ag_ui_claude_sdk/adapter.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/handlers.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/hooks.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/reasoning_events.py
  • components/runners/ambient-runner/ag_ui_gemini_cli/adapter.py
  • components/runners/ambient-runner/ag_ui_gemini_cli/types.py
  • components/runners/ambient-runner/ambient_runner/_grpc_client.py
  • components/runners/ambient-runner/ambient_runner/_inbox_messages_api.py
  • components/runners/ambient-runner/ambient_runner/_session_messages_api.py
  • components/runners/ambient-runner/ambient_runner/app.py
  • components/runners/ambient-runner/ambient_runner/bridge.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/grpc_transport.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/mcp.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/prompts.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/session.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/tools.py
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.py
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/session.py
  • components/runners/ambient-runner/ambient_runner/bridges/langgraph/bridge.py
  • components/runners/ambient-runner/ambient_runner/endpoints/events.py
  • components/runners/ambient-runner/ambient_runner/endpoints/run.py
  • components/runners/ambient-runner/ambient_runner/endpoints/tasks.py
  • components/runners/ambient-runner/ambient_runner/middleware/__init__.py
  • components/runners/ambient-runner/ambient_runner/middleware/grpc_push.py
  • components/runners/ambient-runner/ambient_runner/middleware/secret_redaction.py
  • components/runners/ambient-runner/ambient_runner/platform/auth.py
  • components/runners/ambient-runner/ambient_runner/platform/context.py
  • components/runners/ambient-runner/ambient_runner/platform/prompts.py
  • components/runners/ambient-runner/ambient_runner/platform/utils.py
  • components/runners/ambient-runner/ambient_runner/tools/backend_api.py
  • components/runners/ambient-runner/architecture.md
  • components/runners/ambient-runner/pyproject.toml
  • components/runners/ambient-runner/tests/test_app_initial_prompt.py
  • components/runners/ambient-runner/tests/test_auto_push.py
  • components/runners/ambient-runner/tests/test_bridge_claude.py
  • components/runners/ambient-runner/tests/test_e2e_api.py
  • components/runners/ambient-runner/tests/test_events_endpoint.py
  • components/runners/ambient-runner/tests/test_gemini_auth.py
  • components/runners/ambient-runner/tests/test_gemini_cli_adapter.py
  • components/runners/ambient-runner/tests/test_gemini_session.py
  • components/runners/ambient-runner/tests/test_google_drive_e2e.py
  • components/runners/ambient-runner/tests/test_grpc_client.py
  • components/runners/ambient-runner/tests/test_grpc_transport.py
  • components/runners/ambient-runner/tests/test_grpc_writer.py
  • components/runners/ambient-runner/tests/test_shared_session_credentials.py
💤 Files with no reviewable changes (7)
  • components/runners/ambient-runner/tests/test_gemini_auth.py
  • components/runners/ambient-runner/tests/test_e2e_api.py
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/session.py
  • components/runners/ambient-runner/tests/test_gemini_cli_adapter.py
  • components/runners/ambient-runner/tests/test_gemini_session.py
  • components/runners/ambient-runner/tests/test_auto_push.py
  • components/runners/ambient-runner/ag_ui_gemini_cli/adapter.py

Comment on lines +103 to +117
except grpc.RpcError as exc:
if (
attempt == 0
and exc.code() == grpc.StatusCode.UNAUTHENTICATED
and self._grpc_client is not None
):
logger.warning(
"[GRPC PUSH→] UNAUTHENTICATED — reconnecting with fresh token (session=%s)",
session_id,
)
self._grpc_client.reconnect()
new_api = self._grpc_client.session_messages
self._push_rpc = new_api._push_rpc
self._metadata = new_api._metadata
continue
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Unhandled exception if reconnect() fails during UNAUTHENTICATED retry.

If grpc_client.reconnect() raises (e.g., CP token endpoint unreachable), the exception escapes the except grpc.RpcError block and propagates up — potentially crashing the caller. The outer except Exception at line 125 won't catch it since it's not within the try block at that point.

Wrap the reconnect path in its own try/except to maintain fire-and-forget semantics.

Proposed fix
                 if (
                     attempt == 0
                     and exc.code() == grpc.StatusCode.UNAUTHENTICATED
                     and self._grpc_client is not None
                 ):
                     logger.warning(
                         "[GRPC PUSH→] UNAUTHENTICATED — reconnecting with fresh token (session=%s)",
                         session_id,
                     )
-                    self._grpc_client.reconnect()
-                    new_api = self._grpc_client.session_messages
-                    self._push_rpc = new_api._push_rpc
-                    self._metadata = new_api._metadata
-                    continue
+                    try:
+                        self._grpc_client.reconnect()
+                        new_api = self._grpc_client.session_messages
+                        self._push_rpc = new_api._push_rpc
+                        self._metadata = new_api._metadata
+                        continue
+                    except Exception as reconnect_exc:
+                        logger.warning(
+                            "[GRPC PUSH→] reconnect failed (session=%s): %s",
+                            session_id,
+                            reconnect_exc,
+                        )
+                        return None
                 logger.warning(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/_session_messages_api.py`
around lines 103 - 117, The UNAUTHENTICATED retry path calls
self._grpc_client.reconnect() and then mutates self._push_rpc/self._metadata
without guarding against failures; wrap the reconnect +
new_api/_push_rpc/_metadata assignment in a try/except Exception block, catch
any exception as e, log it (e.g., logger.exception or logger.error with context
including session_id) and do not let it propagate—exit the retry branch (do not
continue the loop) so the failure is handled gracefully and the outer flow
remains fire-and-forget. Keep the rest of the UNAUTHENTICATED handling (the
warning log and reconnect attempt) as-is but protect the reconnect and new_api
assignments with this local try/except.

Comment on lines +215 to 220
# Fresh credentials for this user on every run
clear_runtime_credentials()
await populate_runtime_credentials(self._context)
await populate_mcp_server_credentials(self._context)
self._last_creds_refresh = time.monotonic()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Serialize credential refresh/cleanup across concurrent runs.

These helpers mutate process-global env vars and shared token files. Doing that on every run without a bridge-wide lock means one thread can clear or overwrite another thread’s credentials mid-turn; the per-thread session lock below does not protect this path.

Possible fix
 class ClaudeBridge(PlatformBridge):
     def __init__(self) -> None:
         super().__init__()
+        self._credentials_lock = asyncio.Lock()
         self._adapter: ClaudeAgentAdapter | None = None
         ...
 
-        clear_runtime_credentials()
-        await populate_runtime_credentials(self._context)
-        await populate_mcp_server_credentials(self._context)
-        self._last_creds_refresh = time.monotonic()
+        async with self._credentials_lock:
+            clear_runtime_credentials()
+            await populate_runtime_credentials(self._context)
+            await populate_mcp_server_credentials(self._context)
+            self._last_creds_refresh = time.monotonic()

The same lock also needs to guard the clear_runtime_credentials() call in run()'s finally block, otherwise another active turn can still get its credentials cleared underneath it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`
around lines 215 - 220, This call sequence (clear_runtime_credentials(),
populate_runtime_credentials(self._context),
populate_mcp_server_credentials(self._context)) mutates process-global env and
shared token files and must be serialized across all concurrent bridge runs;
introduce and use a bridge-wide async lock (e.g. self._creds_lock) to guard the
entire refresh path and also use the same lock around the
clear_runtime_credentials() call in run()'s finally block so no other thread can
clear or overwrite credentials mid-turn; update methods/blocks invoking these
helpers (the location that sets self._last_creds_refresh and the run() finally
block) to acquire the bridge-wide lock before calling
clear_runtime_credentials/populate_* and release after.

Comment on lines +482 to +503
async def start_grpc_listener(self, grpc_url: str) -> None:
"""Start the gRPC session listener for this bridge.

Separated from _setup_platform so it can be called after platform
setup completes, with a bounded timeout for readiness. Only valid
when AMBIENT_GRPC_ENABLED=true and AMBIENT_GRPC_URL are both set.
"""
if self._context is None:
raise RuntimeError("Cannot start gRPC listener: context not set")
if self._grpc_listener is not None:
logger.warning("gRPC listener already started — skipping duplicate start")
return

from ambient_runner.bridges.claude.grpc_transport import GRPCSessionListener

session_id = self._context.session_id
self._grpc_listener = GRPCSessionListener(
bridge=self,
session_id=session_id,
grpc_url=grpc_url,
)
self._grpc_listener.start()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n -C4 'class GRPCSessionListener|def start\(|ready|start_grpc_listener\(' \
  components/runners/ambient-runner/ambient_runner \
  --type=py

Repository: ambient-code/platform

Length of output: 50377


🏁 Script executed:

rg -n 'start_grpc_listener' components/runners/ambient-runner --type=py

Repository: ambient-code/platform

Length of output: 2673


Implement bounded readiness timeout inside start_grpc_listener.

The docstring claims this method provides "a bounded timeout for readiness," but the implementation returns immediately after start() without awaiting the ready event. The caller in app.py compensates by waiting explicitly (line 137), but this shifts responsibility outside the method and violates the stated API contract.

Suggested fix
         self._grpc_listener = GRPCSessionListener(
             bridge=self,
             session_id=session_id,
             grpc_url=grpc_url,
         )
         self._grpc_listener.start()
+        try:
+            await asyncio.wait_for(self._grpc_listener.ready.wait(), timeout=5)
+        except Exception:
+            await self._grpc_listener.stop()
+            self._grpc_listener = None
+            raise
         logger.info(
             "gRPC listener started: session=%s url=%s",
             session_id,
             grpc_url,
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`
around lines 482 - 503, start_grpc_listener currently starts the
GRPCSessionListener and returns immediately; change it to await the listener's
readiness with a bounded timeout: after calling self._grpc_listener.start(), use
asyncio.wait_for(self._grpc_listener.ready.wait(), timeout=GRPC_STARTUP_TIMEOUT)
(or fetch a configured timeout) and raise a clear RuntimeError or log and clean
up if the wait times out; reference the start_grpc_listener method, the
GRPCSessionListener instantiation, the self._grpc_listener.start() call, and the
self._grpc_listener.ready event to implement this bounded readiness wait.

"type": "sse",
"url": f"{ambient_mcp_url.rstrip('/')}/sse",
}
logger.info("Added ambient MCP sidecar server (SSE): %s", ambient_mcp_url)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid logging raw AMBIENT_MCP_URL values

Line 72 logs the full URL from environment; if it contains userinfo/query tokens, this leaks secrets into logs.

Suggested fix
-        logger.info("Added ambient MCP sidecar server (SSE): %s", ambient_mcp_url)
+        logger.info("Added ambient MCP sidecar server (SSE)")
As per coding guidelines, `components/runners/ambient-runner/**/*.py`: "Check subprocess handling, timeout management, and that secrets are not logged."
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
logger.info("Added ambient MCP sidecar server (SSE): %s", ambient_mcp_url)
logger.info("Added ambient MCP sidecar server (SSE)")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/mcp.py` at
line 72, The log call prints the raw ambient_mcp_url which may include sensitive
userinfo or query tokens; change the logger.info usage around the
ambient_mcp_url variable (the logger.info("Added ambient MCP sidecar server
(SSE): %s", ambient_mcp_url) call) to redact secrets before logging — e.g.,
parse ambient_mcp_url and remove userinfo and query params (or log only
scheme+host+path or a fixed label), then log the redacted value or omit the URL
entirely.

Comment on lines +85 to 97
session_id = run_agent_input.thread_id or ""

async def event_stream():
try:
async for event in bridge.run(
run_agent_input,
current_user_id=current_user_id,
current_user_name=current_user_name,
caller_token=caller_token,
async for event in grpc_push_middleware(
bridge.run(
run_agent_input,
current_user_id=current_user_id,
current_user_name=current_user_name,
caller_token=caller_token,
),
session_id=session_id,
):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use the same session fallback as bridge.run().

thread_id is optional here, but ClaudeBridge.run() already falls back to bridge.context.session_id when it is missing. Passing "" into grpc_push_middleware silently disables gRPC persistence for those runs even though the run itself still succeeds.

Possible fix
-    session_id = run_agent_input.thread_id or ""
+    session_id = (
+        run_agent_input.thread_id
+        or getattr(getattr(bridge, "context", None), "session_id", "")
+        or ""
+    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/endpoints/run.py` around
lines 85 - 97, session_id currently falls back to the empty string which
disables gRPC persistence; change it to use the same fallback ClaudeBridge.run()
uses by deriving session_id from run_agent_input.thread_id or the bridge context
(e.g., session_id = run_agent_input.thread_id or (getattr(bridge, "context",
None) and bridge.context.session_id)), and pass that value to
grpc_push_middleware instead of "" so gRPC persistence is preserved; update the
session_id variable near its declaration and ensure grpc_push_middleware is
given None or the actual session id rather than an empty string.

Comment on lines +101 to +109
try:
async for event in event_stream:
yield event
_push_event(grpc_client, sid, event)
finally:
try:
grpc_client.close()
except Exception:
pass
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Blocking gRPC call on async event loop.

_push_event() calls grpc_client.session_messages.push() synchronously (blocking I/O). This blocks the event loop while waiting for the gRPC response, which can stall event streaming.

The docstring says "fire-and-forget" but the current implementation is synchronous. Either offload to an executor or use asyncio.create_task with a wrapper.

🔧 Proposed fix using run_in_executor
     try:
         async for event in event_stream:
             yield event
-            _push_event(grpc_client, sid, event)
+            # Fire-and-forget: offload blocking gRPC call to thread pool
+            loop = asyncio.get_running_loop()
+            loop.run_in_executor(None, _push_event, grpc_client, sid, event)
     finally:

Or wrap in a task if you want error visibility:

asyncio.create_task(_push_event_async(grpc_client, sid, event))
🧰 Tools
🪛 Ruff (0.15.10)

[error] 108-109: try-except-pass detected, consider logging the exception

(S110)


[warning] 108-108: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/middleware/grpc_push.py`
around lines 101 - 109, The loop yields events from event_stream then calls the
blocking function _push_event which synchronously invokes
grpc_client.session_messages.push(), blocking the async loop; change this to
fire-and-forget by offloading the blocking gRPC call: either call
asyncio.get_running_loop().run_in_executor(None, _push_event, grpc_client, sid,
event) or create an asyncio.create_task that runs an async wrapper (e.g.
_push_event_async) which itself uses run_in_executor to invoke
grpc_client.session_messages.push(), ensure you do not await it so streaming
stays non-blocking and handle/log exceptions inside the wrapper; keep the
existing grpc_client.close() in the finally block.

mock_cls.from_env.return_value = mock_client

executor_calls = []
real_loop = asyncio.get_event_loop()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Deprecated asyncio.get_event_loop() — use get_running_loop() instead.

In async context, get_event_loop() is deprecated since Python 3.10. Since the test is already async, use asyncio.get_running_loop() which you're already calling on line 121.

Proposed fix
-        real_loop = asyncio.get_event_loop()
+        real_loop = asyncio.get_running_loop()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
real_loop = asyncio.get_event_loop()
real_loop = asyncio.get_running_loop()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/tests/test_app_initial_prompt.py` at line
119, Replace the deprecated call to asyncio.get_event_loop() assigned to
real_loop with asyncio.get_running_loop() so the async test uses the running
loop; locate the assignment to real_loop and change the function call from
asyncio.get_event_loop() to asyncio.get_running_loop() to match the existing use
of get_running_loop() elsewhere in the test.

Comment on lines +27 to +34
def make_writer(grpc_client=None):
if grpc_client is None:
grpc_client = MagicMock()
return GRPCMessageWriter(
session_id="sess-1",
run_id="run-1",
grpc_client=grpc_client,
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

grpc_client=None never hits the no-client branch.

make_writer() converts an explicit None into a MagicMock, so Line 211 still exercises the normal push path. The GRPCMessageWriter._write_message() early return for self._grpc_client is None remains untested.

Possible fix
+_DEFAULT = object()
+
-def make_writer(grpc_client=None):
-    if grpc_client is None:
+def make_writer(grpc_client=_DEFAULT):
+    if grpc_client is _DEFAULT:
         grpc_client = MagicMock()
     return GRPCMessageWriter(
         session_id="sess-1",
         run_id="run-1",
         grpc_client=grpc_client,
     )

Also applies to: 208-213

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/tests/test_grpc_writer.py` around lines 27
- 34, The test helper make_writer currently converts an explicit None
grpc_client into a MagicMock, so the GRPCMessageWriter._write_message branch
where self._grpc_client is None is never exercised; update make_writer to accept
and pass through a sentinel (or allow grpc_client=None to be passed through) so
you can construct GRPCMessageWriter with grpc_client=None in the test that
targets the early-return path, and add/adjust the test at Line ~211 to call
make_writer(grpc_client=None) (or directly instantiate GRPCMessageWriter with
grpc_client=None) to verify the early return in
GRPCMessageWriter._write_message.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
components/runners/ambient-runner/ambient_runner/platform/auth.py (2)

167-195: ⚠️ Potential issue | 🟠 Major

Expired caller token + expired BOT_TOKEN never reaches the refresh path.

If the caller-token request gets 401/403 and the BOT_TOKEN fallback also gets 401/403, this branch raises PermissionError immediately. That skips _retry_with_fresh_bot_token(...), so the new CP refresh logic never runs in the exact case it is needed most.

Suggested fix
                 try:
                     with _urllib_request.urlopen(fallback_req, timeout=10) as resp:
                         return resp.read().decode("utf-8", errors="replace")
+                except _urllib_request.HTTPError as fallback_err:
+                    if fallback_err.code in (401, 403):
+                        return _retry_with_fresh_bot_token(fallback_err.code)
+                    logger.warning(
+                        f"{credential_type} BOT_TOKEN fallback failed: {fallback_err}"
+                    )
+                    raise PermissionError(
+                        f"{credential_type} authentication failed after caller-token fallback"
+                    ) from fallback_err
                 except Exception as fallback_err:
                     logger.warning(
                         f"{credential_type} BOT_TOKEN fallback also failed: {fallback_err}"
                     )
                     raise PermissionError(

Also applies to: 202-235

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/platform/auth.py` around
lines 167 - 195, The current fallback except block raises PermissionError
immediately when the BOT_TOKEN fallback fails, which prevents the
`_retry_with_fresh_bot_token(...)` refresh path from running; change the except
handler inside the caller-token-expired branch to detect if `fallback_err` is an
HTTP error with code 401 or 403 and, in that case, call and return
`_retry_with_fresh_bot_token(fallback_err.code)` instead of raising; otherwise
log and raise the existing PermissionError. Apply the same adjustment to the
parallel block handling the other occurrence so that both places reference
`_retry_with_fresh_bot_token`, `credential_type`, and the fallback request flow
(`get_bot_token`, `fallback_req`, `context.current_user_id`) consistently.

125-140: ⚠️ Potential issue | 🔴 Critical

Reject non-HTTP(S) backend URLs before urlopen().

The current guard only checks parsed.hostname, so values like file:///... skip the allowlist entirely and still reach urlopen() with the Authorization header attached, defeating the exfiltration protection. Schemes like ftp:// with a truthy hostname are caught, but file:// with no hostname bypass the check.

Suggested fix
     url = (
         f"{base}/api/ambient/v1/projects/{project_id}/credentials/{credential_id}/token"
     )

     # Reject non-cluster URLs to prevent token exfiltration via user-overridden env vars
     parsed = urlparse(base)
+    if parsed.scheme not in {"http", "https"} or not parsed.hostname:
+        logger.error(f"Refusing to send credentials to invalid backend URL: {base}")
+        return {}
     if parsed.hostname and not (
         parsed.hostname.endswith(".svc.cluster.local")
         or parsed.hostname.endswith(".svc")
         or parsed.hostname == "localhost"
         or parsed.hostname == "127.0.0.1"
     ):
         logger.error(
             f"Refusing to send credentials to external host: {parsed.hostname}"
         )
         return {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/platform/auth.py` around
lines 125 - 140, The guard currently only checks parsed.hostname and misses
non-HTTP(S) schemes like file:// which have no hostname; update the validation
around url/parsed (constructed from base, project_id, credential_id) to first
ensure parsed.scheme is either "http" or "https" and reject any other scheme,
then perform the existing hostname allowlist checks
(endswith(".svc.cluster.local"), ".svc", "localhost", "127.0.0.1"); if the
scheme is invalid or the hostname fails the allowlist, log the refusal (use the
same logger message) and return {} before calling urlopen().
♻️ Duplicate comments (4)
components/runners/ambient-runner/ambient_runner/bridges/claude/mcp.py (1)

72-72: ⚠️ Potential issue | 🟠 Major

Do not log raw AMBIENT_MCP_URL

Line 72 logs the full URL value; this can leak secrets if userinfo or query tokens are present.

Suggested fix
-        logger.info("Added ambient MCP sidecar server (SSE): %s", ambient_mcp_url)
+        logger.info("Added ambient MCP sidecar server (SSE)")

As per coding guidelines, components/runners/ambient-runner/**/*.py: "Check subprocess handling, timeout management, and that secrets are not logged."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/mcp.py` at
line 72, The log on logger.info currently prints the raw ambient_mcp_url which
can leak credentials; update the code around the logger.info call in mcp.py to
sanitize ambient_mcp_url before logging by parsing the URL (e.g., via
urllib.parse), removing any userinfo (username:password) and
stripping/obfuscating sensitive query parameters (e.g., token, api_key, auth) or
replacing them with "[REDACTED]", then log the sanitized URL or only host/path
instead of the raw ambient_mcp_url; ensure you reference ambient_mcp_url and the
logger.info call when making the change.
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py (2)

482-508: ⚠️ Potential issue | 🟠 Major

Wait for gRPC listener readiness before reporting startup success.

The docstring promises a bounded readiness timeout, but start_grpc_listener() returns right after start() and logs success before the listener is actually ready. A bad grpc_url can therefore slip past startup and fail later.

Possible fix
         self._grpc_listener = GRPCSessionListener(
             bridge=self,
             session_id=session_id,
             grpc_url=grpc_url,
         )
         self._grpc_listener.start()
+        try:
+            await asyncio.wait_for(self._grpc_listener.ready.wait(), timeout=5)
+        except Exception:
+            await self._grpc_listener.stop()
+            self._grpc_listener = None
+            raise
         logger.info(
             "gRPC listener started: session=%s url=%s",
             session_id,
             grpc_url,
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`
around lines 482 - 508, The start_grpc_listener function currently logs success
immediately after calling self._grpc_listener.start(); change it to wait for the
GRPCSessionListener to report readiness (or fail) before logging/returning:
after creating GRPCSessionListener (bridge=self, session_id=session_id,
grpc_url=grpc_url) call its readiness API (e.g., await
self._grpc_listener.wait_ready(timeout=...) or poll
self._grpc_listener.is_ready() with a bounded timeout), raise or log an error
and clear self._grpc_listener on failure, and only log "gRPC listener started"
and return after the listener is confirmed ready; ensure you propagate or
surface the underlying connection error when readiness times out or fails.

215-219: ⚠️ Potential issue | 🔴 Critical

Serialize credential refresh and cleanup across runs.

Line 216 and Line 335 still mutate process-global env vars and shared token files without a bridge-wide lock. Two overlapping turns can clear or overwrite each other’s credentials mid-run.

Possible fix
 class ClaudeBridge(PlatformBridge):
     def __init__(self) -> None:
         super().__init__()
+        self._credentials_lock = asyncio.Lock()
         self._adapter: ClaudeAgentAdapter | None = None
         ...

-        clear_runtime_credentials()
-        await populate_runtime_credentials(self._context)
-        await populate_mcp_server_credentials(self._context)
-        self._last_creds_refresh = time.monotonic()
+        async with self._credentials_lock:
+            clear_runtime_credentials()
+            await populate_runtime_credentials(self._context)
+            await populate_mcp_server_credentials(self._context)
+            self._last_creds_refresh = time.monotonic()
 ...
-                    clear_runtime_credentials()
+                    async with self._credentials_lock:
+                        clear_runtime_credentials()

Also applies to: 330-335

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`
around lines 215 - 219, The credential refresh/cleanup calls
(clear_runtime_credentials(), populate_runtime_credentials(self._context),
populate_mcp_server_credentials(self._context)) must be serialized with a
bridge-wide async lock to prevent concurrent runs from clearing/overwriting each
other’s env vars and token files; add a bridge-scoped asyncio.Lock (e.g.,
self._creds_lock) or module-level lock, acquire it before calling
clear_runtime_credentials()/populate_runtime_credentials()/populate_mcp_server_credentials
and release it after, and apply the same locking around the other
credential-mutation site (the block referenced around lines 330-335) so all
credential mutations use the same lock. Ensure the lock is created once
(constructor or module init) and used with async with self._creds_lock to avoid
deadlocks.
components/runners/ambient-runner/ambient_runner/_session_messages_api.py (1)

103-117: ⚠️ Potential issue | 🟠 Major

Keep reconnect failures inside the fire-and-forget contract.

push() says it logs and returns None on transport errors, but this retry branch still lets reconnect() / session_messages setup exceptions escape to the caller. Wrap the reconnect/update block locally and fall back to None if it fails.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/_session_messages_api.py`
around lines 103 - 117, The retry branch inside push() can raise from
self._grpc_client.reconnect() or accessing new_api =
self._grpc_client.session_messages, so wrap the reconnect and metadata/_push_rpc
update (the block that calls self._grpc_client.reconnect(), reads
self._grpc_client.session_messages, assigns self._push_rpc and self._metadata)
in a local try/except; on any exception log a warning/error and break/return
None to preserve the fire-and-forget contract instead of letting exceptions
escape to the caller. Ensure you reference push(),
self._grpc_client.reconnect(), session_messages, _push_rpc and _metadata when
locating and protecting the code.
🧹 Nitpick comments (2)
components/runners/ambient-runner/tests/test_grpc_client.py (1)

95-97: Assert specific exception type for invalid public key.

pytest.raises(Exception) is too broad. The cryptography library raises ValueError or TypeError for invalid PEM data — asserting on the specific type improves test reliability.

♻️ Proposed fix
     def test_invalid_public_key_raises(self):
-        with pytest.raises(Exception):
+        with pytest.raises((ValueError, TypeError)):
             _encrypt_session_id("not a pem key", "session-id")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/tests/test_grpc_client.py` around lines 95
- 97, The test test_invalid_public_key_raises is using a too-broad
pytest.raises(Exception); update it to assert the specific exceptions raised by
the cryptography PEM parsing (e.g., use pytest.raises((ValueError, TypeError)))
when calling _encrypt_session_id("not a pem key", "session-id") so the test
verifies the exact error types from the PEM parsing path in _encrypt_session_id.
components/manifests/overlays/mpp-openshift/ambient-api-server.yaml (1)

78-83: Enable readOnlyRootFilesystem: true on the main container.

The main api-server container (lines 78–83) can safely use readOnlyRootFilesystem: true by mounting writable paths (logs, temp) as emptyDir volumes. The initContainer (lines 137–142) likely needs write access for database migration and should remain false.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/overlays/mpp-openshift/ambient-api-server.yaml` around
lines 78 - 83, Set readOnlyRootFilesystem: true on the main api-server
container's securityContext (replace the current false) and ensure any paths
that need to be writable by that container (logs, tmp, etc.) are mounted as
emptyDir volumes; keep the initContainer's readOnlyRootFilesystem as false
because it requires write access for DB migrations. Update the container spec
where securityContext is defined (securityContext -> readOnlyRootFilesystem) and
add corresponding emptyDir volume mounts (and volume entries) for the writable
paths so the main container remains functional while being read-only.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/manifests/base/ambient-control-plane-service.yml`:
- Around line 4-18: The Deployment lacks a namespace so the pod cannot resolve
serviceAccountName ambient-control-plane (and its secret
ambient-control-plane-token) which live in ambient-code; add metadata.namespace:
ambient-code to the Deployment top-level metadata block (the metadata section
for the resource that contains spec.replicas and selector) and likewise add
namespace: ambient-code to any other resource metadata blocks that reference
serviceAccountName ambient-control-plane (the template/spec area referenced
around lines 33–37) so the manifest explicitly targets the ambient-code
namespace.

In `@components/manifests/base/rbac/control-plane-clusterrole.yaml`:
- Around line 22-24: The ClusterRole currently grants broad create/update/delete
on cluster-scoped "secrets" and "services" via the resources: ["secrets",
"serviceaccounts", "services", "pods"] and verbs:
["get","list","watch","create","update","patch","delete","deletecollection"];
change this to follow least-privilege by removing mutation verbs for
cluster-scoped secrets/services (keep only read verbs if cluster-wide read is
needed) and/or move secrets/services mutations into a namespaced
Role/RoleBinding targeted at the specific namespaces that need it; update the
ClusterRole resource list (the resources array and verbs array) to only include
safe verbs for cluster scope and create a separate Role (with
create/update/patch/delete verbs) for "secrets" and "services" limited to the
required namespace(s), then bind that Role where necessary instead of using the
ClusterRole binding.

In `@components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml`:
- Around line 47-97: The PostgreSQL container (name: postgresql) is missing a
resources stanza; add a resources block to the container spec for postgresql
with appropriate requests and limits (CPU and memory) to ensure scheduler
placement and stable QoS (e.g., requests for CPU/memory and corresponding
limits), and apply the same policy to any similar DB containers; update the
container spec near readinessProbe/livenessProbe (within the same container
definition that mounts ambient-api-server-db-data) to include resources so the
pod gets Guaranteed/Burstable QoS as required by the manifests guidelines.
- Around line 95-97: The volume named ambient-api-server-db-data is configured
as emptyDir which makes /var/lib/pgsql/data ephemeral; replace this with a
PersistentVolumeClaim-backed volume and update the workload to use that PVC (or
convert the Deployment to a StatefulSet) so ambient-api-server retains
PostgreSQL data across restarts; create a matching PersistentVolumeClaim
resource (or reference an existing storageClass) and mount it where
ambient-api-server uses /var/lib/pgsql/data, ensuring the PVC name replaces
ambient-api-server-db-data in the pod spec and the claim has appropriate
accessMode and storage size for the database.

In `@components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml`:
- Around line 83-84: The CP_TOKEN_URL environment variable points to the wrong
namespace; update the value of CP_TOKEN_URL in the Deployment manifest from
"http://ambient-control-plane.ambient-code--ambient-s0.svc:8080/token" to use
the runtime-int namespace so the service can be reached (change to
"http://ambient-control-plane.ambient-code--runtime-int.svc:8080/token"); locate
the CP_TOKEN_URL entry in the ambient-control-plane Deployment/overlay and
replace the namespace portion of the URL accordingly so it matches the service's
namespace used by the ambient-control-plane service.

In
`@components/runners/ambient-runner/ambient_runner/bridges/claude/grpc_transport.py`:
- Around line 359-360: The listener is incorrectly removing shared SSE queues
from bridge._active_streams (the run_queue/active_streams.pop(thread_id, None)
call), which can drop other /events subscribers; remove that unconditional pop
and instead either (A) don’t modify bridge._active_streams from this listener at
all and let the endpoint-owned cleanup path handle removal, or (B) if this
listener must remove queues it created, track ownership when creating the queue
(e.g., a self._owned_streams set keyed by thread_id) and only pop
bridge._active_streams[thread_id] when thread_id is in that ownership set;
update the code around run_queue, active_streams, thread_id and
bridge._active_streams accordingly.
- Around line 420-427: The current assignment to assistant_text selects the
first (oldest) assistant message from self._accumulated_messages, causing stale
content to be resent; change the selection to pick the most recent assistant
message by iterating the messages in reverse (e.g.,
reversed(self._accumulated_messages) or indexing from the end) and return the
first item whose role == "assistant" (update the expression that computes
assistant_text to use the reversed sequence so PushSessionMessage uses the
latest assistant reply and not the oldest snapshot).

In `@components/runners/ambient-runner/ambient_runner/endpoints/events.py`:
- Around line 70-73: The code creates an unbounded asyncio.Queue (queue =
asyncio.Queue()) contrary to the function docstring and tests that expect a
bounded queue; update the queue creation in the active_streams initialization
branch to use a bounded queue (e.g., asyncio.Queue(maxsize=100)) so the variable
queue and the active_streams[thread_id] entry are created with a max size,
keeping the rest of the logic in this function (references: active_streams,
thread_id, queue) unchanged.

In `@docs/internal/proposals/alpha-to-main-migration.md`:
- Around line 75-78: The doc states that manifests are combined into "PR 6 —
runners + manifests: Auth, Credentials, gRPC, SSE, and Kustomize Overlays" but
later sections still model a separate PR 7; update the ordering graph and all
narrative references so PR7 is removed and its manifests path is merged into
PR6. Locate the heading/title "PR 6 — runners + manifests: Auth, Credentials,
gRPC, SSE, and Kustomize Overlays" and the later paragraphs/graph nodes that
mention "PR 7" or a separate "manifests" path (the ordering graph and the
narrative around PR sequencing) and replace them to reflect a single
consolidated PR6 path, ensuring dependencies/“Depends on” lines and any
sequencing arrows reference PR6 (and PR2/PR4 as needed) instead of PR7 so
there’s no conflicting execution guidance.

---

Outside diff comments:
In `@components/runners/ambient-runner/ambient_runner/platform/auth.py`:
- Around line 167-195: The current fallback except block raises PermissionError
immediately when the BOT_TOKEN fallback fails, which prevents the
`_retry_with_fresh_bot_token(...)` refresh path from running; change the except
handler inside the caller-token-expired branch to detect if `fallback_err` is an
HTTP error with code 401 or 403 and, in that case, call and return
`_retry_with_fresh_bot_token(fallback_err.code)` instead of raising; otherwise
log and raise the existing PermissionError. Apply the same adjustment to the
parallel block handling the other occurrence so that both places reference
`_retry_with_fresh_bot_token`, `credential_type`, and the fallback request flow
(`get_bot_token`, `fallback_req`, `context.current_user_id`) consistently.
- Around line 125-140: The guard currently only checks parsed.hostname and
misses non-HTTP(S) schemes like file:// which have no hostname; update the
validation around url/parsed (constructed from base, project_id, credential_id)
to first ensure parsed.scheme is either "http" or "https" and reject any other
scheme, then perform the existing hostname allowlist checks
(endswith(".svc.cluster.local"), ".svc", "localhost", "127.0.0.1"); if the
scheme is invalid or the hostname fails the allowlist, log the refusal (use the
same logger message) and return {} before calling urlopen().

---

Duplicate comments:
In `@components/runners/ambient-runner/ambient_runner/_session_messages_api.py`:
- Around line 103-117: The retry branch inside push() can raise from
self._grpc_client.reconnect() or accessing new_api =
self._grpc_client.session_messages, so wrap the reconnect and metadata/_push_rpc
update (the block that calls self._grpc_client.reconnect(), reads
self._grpc_client.session_messages, assigns self._push_rpc and self._metadata)
in a local try/except; on any exception log a warning/error and break/return
None to preserve the fire-and-forget contract instead of letting exceptions
escape to the caller. Ensure you reference push(),
self._grpc_client.reconnect(), session_messages, _push_rpc and _metadata when
locating and protecting the code.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 482-508: The start_grpc_listener function currently logs success
immediately after calling self._grpc_listener.start(); change it to wait for the
GRPCSessionListener to report readiness (or fail) before logging/returning:
after creating GRPCSessionListener (bridge=self, session_id=session_id,
grpc_url=grpc_url) call its readiness API (e.g., await
self._grpc_listener.wait_ready(timeout=...) or poll
self._grpc_listener.is_ready() with a bounded timeout), raise or log an error
and clear self._grpc_listener on failure, and only log "gRPC listener started"
and return after the listener is confirmed ready; ensure you propagate or
surface the underlying connection error when readiness times out or fails.
- Around line 215-219: The credential refresh/cleanup calls
(clear_runtime_credentials(), populate_runtime_credentials(self._context),
populate_mcp_server_credentials(self._context)) must be serialized with a
bridge-wide async lock to prevent concurrent runs from clearing/overwriting each
other’s env vars and token files; add a bridge-scoped asyncio.Lock (e.g.,
self._creds_lock) or module-level lock, acquire it before calling
clear_runtime_credentials()/populate_runtime_credentials()/populate_mcp_server_credentials
and release it after, and apply the same locking around the other
credential-mutation site (the block referenced around lines 330-335) so all
credential mutations use the same lock. Ensure the lock is created once
(constructor or module init) and used with async with self._creds_lock to avoid
deadlocks.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/mcp.py`:
- Line 72: The log on logger.info currently prints the raw ambient_mcp_url which
can leak credentials; update the code around the logger.info call in mcp.py to
sanitize ambient_mcp_url before logging by parsing the URL (e.g., via
urllib.parse), removing any userinfo (username:password) and
stripping/obfuscating sensitive query parameters (e.g., token, api_key, auth) or
replacing them with "[REDACTED]", then log the sanitized URL or only host/path
instead of the raw ambient_mcp_url; ensure you reference ambient_mcp_url and the
logger.info call when making the change.

---

Nitpick comments:
In `@components/manifests/overlays/mpp-openshift/ambient-api-server.yaml`:
- Around line 78-83: Set readOnlyRootFilesystem: true on the main api-server
container's securityContext (replace the current false) and ensure any paths
that need to be writable by that container (logs, tmp, etc.) are mounted as
emptyDir volumes; keep the initContainer's readOnlyRootFilesystem as false
because it requires write access for DB migrations. Update the container spec
where securityContext is defined (securityContext -> readOnlyRootFilesystem) and
add corresponding emptyDir volume mounts (and volume entries) for the writable
paths so the main container remains functional while being read-only.

In `@components/runners/ambient-runner/tests/test_grpc_client.py`:
- Around line 95-97: The test test_invalid_public_key_raises is using a
too-broad pytest.raises(Exception); update it to assert the specific exceptions
raised by the cryptography PEM parsing (e.g., use pytest.raises((ValueError,
TypeError))) when calling _encrypt_session_id("not a pem key", "session-id") so
the test verifies the exact error types from the PEM parsing path in
_encrypt_session_id.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 79983dab-f08b-4ce6-8720-198bc1e49513

📥 Commits

Reviewing files that changed from the base of the PR and between fe35a64 and 7f7227d.

📒 Files selected for processing (81)
  • components/manifests/base/ambient-control-plane-service.yml
  • components/manifests/base/kustomization.yaml
  • components/manifests/base/rbac/control-plane-clusterrole.yaml
  • components/manifests/base/rbac/control-plane-clusterrolebinding.yaml
  • components/manifests/base/rbac/control-plane-sa.yaml
  • components/manifests/base/rbac/kustomization.yaml
  • components/manifests/overlays/cluster-reader/README.md
  • components/manifests/overlays/cluster-reader/kustomization.yaml
  • components/manifests/overlays/cluster-reader/service-account.yaml
  • components/manifests/overlays/kind-local/operator-env-patch.yaml
  • components/manifests/overlays/local-dev/ambient-api-server-route.yaml
  • components/manifests/overlays/mpp-openshift/README.md
  • components/manifests/overlays/mpp-openshift/ambient-api-server-args-patch.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server-route.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server-service-ca-patch.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server.yaml
  • components/manifests/overlays/mpp-openshift/ambient-control-plane-sa.yaml
  • components/manifests/overlays/mpp-openshift/ambient-control-plane-svc.yaml
  • components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml
  • components/manifests/overlays/mpp-openshift/ambient-cp-tenant-sa.yaml
  • components/manifests/overlays/mpp-openshift/ambient-cp-token-netpol.yaml
  • components/manifests/overlays/mpp-openshift/ambient-tenant-ingress-netpol.yaml
  • components/manifests/overlays/mpp-openshift/kustomization.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac-runtime-int.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac-s0.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/kustomization.yaml
  • components/manifests/overlays/openshift-dev/ambient-api-server-args-patch.yaml
  • components/manifests/overlays/openshift-dev/ambient-api-server-env-patch.yaml
  • components/manifests/overlays/openshift-dev/kustomization.yaml
  • components/manifests/overlays/production/ambient-api-server-route.yaml
  • components/manifests/overlays/production/kustomization.yaml
  • components/runners/ambient-runner/.mcp.json
  • components/runners/ambient-runner/ag_ui_claude_sdk/adapter.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/handlers.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/hooks.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/reasoning_events.py
  • components/runners/ambient-runner/ag_ui_gemini_cli/adapter.py
  • components/runners/ambient-runner/ag_ui_gemini_cli/types.py
  • components/runners/ambient-runner/ambient_runner/_grpc_client.py
  • components/runners/ambient-runner/ambient_runner/_inbox_messages_api.py
  • components/runners/ambient-runner/ambient_runner/_session_messages_api.py
  • components/runners/ambient-runner/ambient_runner/app.py
  • components/runners/ambient-runner/ambient_runner/bridge.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/grpc_transport.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/mcp.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/prompts.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/session.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/tools.py
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.py
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/session.py
  • components/runners/ambient-runner/ambient_runner/bridges/langgraph/bridge.py
  • components/runners/ambient-runner/ambient_runner/endpoints/events.py
  • components/runners/ambient-runner/ambient_runner/endpoints/run.py
  • components/runners/ambient-runner/ambient_runner/endpoints/tasks.py
  • components/runners/ambient-runner/ambient_runner/middleware/__init__.py
  • components/runners/ambient-runner/ambient_runner/middleware/grpc_push.py
  • components/runners/ambient-runner/ambient_runner/middleware/secret_redaction.py
  • components/runners/ambient-runner/ambient_runner/platform/auth.py
  • components/runners/ambient-runner/ambient_runner/platform/context.py
  • components/runners/ambient-runner/ambient_runner/platform/prompts.py
  • components/runners/ambient-runner/ambient_runner/platform/utils.py
  • components/runners/ambient-runner/ambient_runner/tools/backend_api.py
  • components/runners/ambient-runner/architecture.md
  • components/runners/ambient-runner/pyproject.toml
  • components/runners/ambient-runner/tests/test_app_initial_prompt.py
  • components/runners/ambient-runner/tests/test_auto_push.py
  • components/runners/ambient-runner/tests/test_bridge_claude.py
  • components/runners/ambient-runner/tests/test_e2e_api.py
  • components/runners/ambient-runner/tests/test_events_endpoint.py
  • components/runners/ambient-runner/tests/test_gemini_auth.py
  • components/runners/ambient-runner/tests/test_gemini_cli_adapter.py
  • components/runners/ambient-runner/tests/test_gemini_session.py
  • components/runners/ambient-runner/tests/test_google_drive_e2e.py
  • components/runners/ambient-runner/tests/test_grpc_client.py
  • components/runners/ambient-runner/tests/test_grpc_transport.py
  • components/runners/ambient-runner/tests/test_grpc_writer.py
  • components/runners/ambient-runner/tests/test_shared_session_credentials.py
  • docs/internal/proposals/alpha-to-main-migration.md
💤 Files with no reviewable changes (10)
  • components/runners/ambient-runner/tests/test_gemini_auth.py
  • components/runners/ambient-runner/tests/test_e2e_api.py
  • components/manifests/overlays/cluster-reader/service-account.yaml
  • components/manifests/overlays/cluster-reader/kustomization.yaml
  • components/runners/ambient-runner/tests/test_auto_push.py
  • components/manifests/overlays/cluster-reader/README.md
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/session.py
  • components/runners/ambient-runner/tests/test_gemini_cli_adapter.py
  • components/runners/ambient-runner/ag_ui_gemini_cli/adapter.py
  • components/runners/ambient-runner/tests/test_gemini_session.py
✅ Files skipped from review due to trivial changes (36)
  • components/runners/ambient-runner/ag_ui_claude_sdk/reasoning_events.py
  • components/manifests/overlays/local-dev/ambient-api-server-route.yaml
  • components/runners/ambient-runner/ambient_runner/tools/backend_api.py
  • components/manifests/base/rbac/kustomization.yaml
  • components/runners/ambient-runner/pyproject.toml
  • components/manifests/overlays/production/kustomization.yaml
  • components/runners/ambient-runner/ag_ui_claude_sdk/handlers.py
  • components/runners/ambient-runner/ambient_runner/middleware/secret_redaction.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/prompts.py
  • components/runners/ambient-runner/ambient_runner/bridges/langgraph/bridge.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/hooks.py
  • components/runners/ambient-runner/tests/test_google_drive_e2e.py
  • components/manifests/overlays/mpp-openshift/ambient-cp-token-netpol.yaml
  • components/runners/ambient-runner/ambient_runner/bridges/claude/session.py
  • components/manifests/overlays/mpp-openshift/ambient-control-plane-svc.yaml
  • components/manifests/overlays/mpp-openshift/ambient-control-plane-sa.yaml
  • components/runners/ambient-runner/ambient_runner/endpoints/tasks.py
  • components/manifests/overlays/mpp-openshift/ambient-api-server-service-ca-patch.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac-runtime-int.yaml
  • components/manifests/overlays/mpp-openshift/ambient-tenant-ingress-netpol.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac.yaml
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.py
  • components/manifests/overlays/production/ambient-api-server-route.yaml
  • components/manifests/overlays/mpp-openshift/ambient-cp-tenant-sa.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac-s0.yaml
  • components/runners/ambient-runner/ambient_runner/platform/context.py
  • components/manifests/overlays/mpp-openshift/tenant-rbac/kustomization.yaml
  • components/runners/ambient-runner/ag_ui_claude_sdk/adapter.py
  • components/manifests/overlays/mpp-openshift/kustomization.yaml
  • components/manifests/base/kustomization.yaml
  • components/manifests/overlays/openshift-dev/kustomization.yaml
  • components/manifests/overlays/mpp-openshift/README.md
  • components/manifests/base/rbac/control-plane-sa.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server-route.yaml
  • components/runners/ambient-runner/tests/test_app_initial_prompt.py
  • components/runners/ambient-runner/architecture.md
🚧 Files skipped from review as they are similar to previous changes (8)
  • components/runners/ambient-runner/.mcp.json
  • components/runners/ambient-runner/ambient_runner/bridge.py
  • components/runners/ambient-runner/ambient_runner/endpoints/run.py
  • components/runners/ambient-runner/ambient_runner/platform/prompts.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/tools.py
  • components/runners/ambient-runner/ag_ui_gemini_cli/types.py
  • components/runners/ambient-runner/tests/test_grpc_writer.py
  • components/runners/ambient-runner/ambient_runner/app.py

Comment on lines +4 to +18
metadata:
name: ambient-control-plane
labels:
app: ambient-control-plane
spec:
replicas: 1
selector:
matchLabels:
app: ambient-control-plane
template:
metadata:
labels:
app: ambient-control-plane
spec:
serviceAccountName: ambient-control-plane
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Deployment namespace + SA/secret refs
rg -n -C2 'kind:\s*Deployment|name:\s*ambient-control-plane|namespace:|serviceAccountName|ambient-control-plane-token' \
  components/manifests/base/ambient-control-plane-service.yml

# SA/Secret namespaces
rg -n -C2 'kind:\s*ServiceAccount|kind:\s*Secret|name:\s*ambient-control-plane|ambient-control-plane-token|namespace:' \
  components/manifests/base/rbac/control-plane-sa.yaml

# Check whether base kustomization enforces a namespace
rg -n -C2 '^namespace:|resources:' components/manifests/base/kustomization.yaml

Repository: ambient-code/platform

Length of output: 1164


Add metadata.namespace: ambient-code to Deployment

This Deployment references serviceAccountName: ambient-control-plane and secret ambient-control-plane-token (lines 18, 36), both of which exist in namespace ambient-code. Without an explicit namespace in the Deployment metadata and no namespace transformer in kustomization, pod startup will fail with service account resolution errors. Add namespace: ambient-code to lines 4–7 metadata block.

Also applies to: 33-37

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/base/ambient-control-plane-service.yml` around lines 4 -
18, The Deployment lacks a namespace so the pod cannot resolve
serviceAccountName ambient-control-plane (and its secret
ambient-control-plane-token) which live in ambient-code; add metadata.namespace:
ambient-code to the Deployment top-level metadata block (the metadata section
for the resource that contains spec.replicas and selector) and likewise add
namespace: ambient-code to any other resource metadata blocks that reference
serviceAccountName ambient-control-plane (the template/spec area referenced
around lines 33–37) so the manifest explicitly targets the ambient-code
namespace.

Comment on lines +22 to +24
- apiGroups: [""]
resources: ["secrets", "serviceaccounts", "services", "pods"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete", "deletecollection"]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Cluster-wide secret/service mutation is over-privileged

Lines 22–24 grant create/update/delete (plus read/list/watch) on secrets and services at ClusterRole scope. Through the binding, this gives broad credential and service-plane control across namespaces, which is a high-risk blast radius.

As per coding guidelines, components/manifests/**/*.yaml: RBAC must follow least-privilege.

🧰 Tools
🪛 Trivy (0.69.3)

[error] 22-24: Manage secrets

ClusterRole 'ambient-control-plane' shouldn't have access to manage resource 'secrets'

Rule: KSV-0041

Learn more

(IaC/Kubernetes)


[error] 22-24: Manage Kubernetes networking

ClusterRole 'ambient-control-plane' should not have access to resources ["services", "endpoints", "endpointslices", "networkpolicies", "ingresses"] for verbs ["create", "update", "patch", "delete", "deletecollection", "impersonate", "*"]

Rule: KSV-0056

Learn more

(IaC/Kubernetes)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/base/rbac/control-plane-clusterrole.yaml` around lines
22 - 24, The ClusterRole currently grants broad create/update/delete on
cluster-scoped "secrets" and "services" via the resources: ["secrets",
"serviceaccounts", "services", "pods"] and verbs:
["get","list","watch","create","update","patch","delete","deletecollection"];
change this to follow least-privilege by removing mutation verbs for
cluster-scoped secrets/services (keep only read verbs if cluster-wide read is
needed) and/or move secrets/services mutations into a namespaced
Role/RoleBinding targeted at the specific namespaces that need it; update the
ClusterRole resource list (the resources array and verbs array) to only include
safe verbs for cluster scope and create a separate Role (with
create/update/patch/delete verbs) for "secrets" and "services" limited to the
required namespace(s), then bind that Role where necessary instead of using the
ClusterRole binding.

Comment on lines +47 to +97
- name: postgresql
image: registry.redhat.io/rhel9/postgresql-16:latest
ports:
- containerPort: 5432
name: postgresql
env:
- name: POSTGRESQL_USER
valueFrom:
secretKeyRef:
key: db.user
name: ambient-api-server-db
- name: POSTGRESQL_PASSWORD
valueFrom:
secretKeyRef:
key: db.password
name: ambient-api-server-db
- name: POSTGRESQL_DATABASE
valueFrom:
secretKeyRef:
key: db.name
name: ambient-api-server-db
volumeMounts:
- name: ambient-api-server-db-data
mountPath: /var/lib/pgsql/data
subPath: pgdata
readinessProbe:
exec:
command:
- /bin/sh
- -c
- pg_isready -U "$POSTGRESQL_USER"
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 5
livenessProbe:
exec:
command:
- /bin/sh
- -c
- pg_isready -U "$POSTGRESQL_USER"
initialDelaySeconds: 30
periodSeconds: 30
timeoutSeconds: 5
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
volumes:
- name: ambient-api-server-db-data
emptyDir: {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add resource requests/limits for the new PostgreSQL container.

This workload ships without a resources stanza, so scheduler placement and QoS are undefined for the DB pod.

Possible fix
       containers:
         - name: postgresql
           image: registry.redhat.io/rhel9/postgresql-16:latest
+          resources:
+            requests:
+              cpu: 250m
+              memory: 512Mi
+            limits:
+              cpu: "1"
+              memory: 1Gi
           ports:
             - containerPort: 5432
               name: postgresql

As per coding guidelines, "components/manifests/**/*.yaml: - RBAC must follow least-privilege. Resource limits/requests required on containers."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- name: postgresql
image: registry.redhat.io/rhel9/postgresql-16:latest
ports:
- containerPort: 5432
name: postgresql
env:
- name: POSTGRESQL_USER
valueFrom:
secretKeyRef:
key: db.user
name: ambient-api-server-db
- name: POSTGRESQL_PASSWORD
valueFrom:
secretKeyRef:
key: db.password
name: ambient-api-server-db
- name: POSTGRESQL_DATABASE
valueFrom:
secretKeyRef:
key: db.name
name: ambient-api-server-db
volumeMounts:
- name: ambient-api-server-db-data
mountPath: /var/lib/pgsql/data
subPath: pgdata
readinessProbe:
exec:
command:
- /bin/sh
- -c
- pg_isready -U "$POSTGRESQL_USER"
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 5
livenessProbe:
exec:
command:
- /bin/sh
- -c
- pg_isready -U "$POSTGRESQL_USER"
initialDelaySeconds: 30
periodSeconds: 30
timeoutSeconds: 5
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
volumes:
- name: ambient-api-server-db-data
emptyDir: {}
- name: postgresql
image: registry.redhat.io/rhel9/postgresql-16:latest
resources:
requests:
cpu: 250m
memory: 512Mi
limits:
cpu: "1"
memory: 1Gi
ports:
- containerPort: 5432
name: postgresql
env:
- name: POSTGRESQL_USER
valueFrom:
secretKeyRef:
key: db.user
name: ambient-api-server-db
- name: POSTGRESQL_PASSWORD
valueFrom:
secretKeyRef:
key: db.password
name: ambient-api-server-db
- name: POSTGRESQL_DATABASE
valueFrom:
secretKeyRef:
key: db.name
name: ambient-api-server-db
volumeMounts:
- name: ambient-api-server-db-data
mountPath: /var/lib/pgsql/data
subPath: pgdata
readinessProbe:
exec:
command:
- /bin/sh
- -c
- pg_isready -U "$POSTGRESQL_USER"
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 5
livenessProbe:
exec:
command:
- /bin/sh
- -c
- pg_isready -U "$POSTGRESQL_USER"
initialDelaySeconds: 30
periodSeconds: 30
timeoutSeconds: 5
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
volumes:
- name: ambient-api-server-db-data
emptyDir: {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml`
around lines 47 - 97, The PostgreSQL container (name: postgresql) is missing a
resources stanza; add a resources block to the container spec for postgresql
with appropriate requests and limits (CPU and memory) to ensure scheduler
placement and stable QoS (e.g., requests for CPU/memory and corresponding
limits), and apply the same policy to any similar DB containers; update the
container spec near readinessProbe/livenessProbe (within the same container
definition that mounts ambient-api-server-db-data) to include resources so the
pod gets Guaranteed/Burstable QoS as required by the manifests guidelines.

Comment on lines +95 to +97
volumes:
- name: ambient-api-server-db-data
emptyDir: {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

emptyDir makes this database state ephemeral.

Any pod restart, eviction, or rollout will wipe /var/lib/pgsql/data, which means ambient-api-server loses its backing store. This needs a PVC-backed volume (ideally behind a StatefulSet) unless the overlay is intentionally disposable.

Possible direction
-      volumes:
-        - name: ambient-api-server-db-data
-          emptyDir: {}
+      volumes:
+        - name: ambient-api-server-db-data
+          persistentVolumeClaim:
+            claimName: ambient-api-server-db-data
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml`
around lines 95 - 97, The volume named ambient-api-server-db-data is configured
as emptyDir which makes /var/lib/pgsql/data ephemeral; replace this with a
PersistentVolumeClaim-backed volume and update the workload to use that PVC (or
convert the Deployment to a StatefulSet) so ambient-api-server retains
PostgreSQL data across restarts; create a matching PersistentVolumeClaim
resource (or reference an existing storageClass) and mount it where
ambient-api-server uses /var/lib/pgsql/data, ensuring the PVC name replaces
ambient-api-server-db-data in the pod spec and the claim has appropriate
accessMode and storage size for the database.

Comment on lines +83 to +84
- name: CP_TOKEN_URL
value: "http://ambient-control-plane.ambient-code--ambient-s0.svc:8080/token"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the configured CP token URL in this Deployment
rg -n -C2 'CP_TOKEN_URL|ambient-code--ambient-s0|ambient-control-plane' \
  components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml

# Inspect service identity/namespace in the same overlay
rg -n -C2 'kind:\s*Service|name:\s*ambient-control-plane|namespace:' \
  components/manifests/overlays/mpp-openshift/ambient-control-plane-svc.yaml

# Confirm both manifests are part of this overlay build
rg -n -C2 'resources:|ambient-control-plane.yaml|ambient-control-plane-svc.yaml' \
  components/manifests/overlays/mpp-openshift/kustomization.yaml

Repository: ambient-code/platform

Length of output: 1815


Fix namespace mismatch in CP_TOKEN_URL

Line 84 references ambient-code--ambient-s0 but the service is deployed in ambient-code--runtime-int (line 5 of the service manifest). The Deployment will fail to reach the token endpoint. Update the URL to http://ambient-control-plane.ambient-code--runtime-int.svc:8080/token.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml`
around lines 83 - 84, The CP_TOKEN_URL environment variable points to the wrong
namespace; update the value of CP_TOKEN_URL in the Deployment manifest from
"http://ambient-control-plane.ambient-code--ambient-s0.svc:8080/token" to use
the runtime-int namespace so the service can be reached (change to
"http://ambient-control-plane.ambient-code--runtime-int.svc:8080/token"); locate
the CP_TOKEN_URL entry in the ambient-control-plane Deployment/overlay and
replace the namespace portion of the URL accordingly so it matches the service's
namespace used by the ambient-control-plane service.

Comment on lines +420 to +427
assistant_text = next(
(
m.get("content") or ""
for m in self._accumulated_messages
if m.get("role") == "assistant"
),
"",
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Select the latest assistant reply, not the first one in the snapshot.

MESSAGES_SNAPSHOT is the whole conversation, so this picks the oldest assistant message. After the first turn, PushSessionMessage will resend stale text instead of the reply that just finished.

Suggested fix
         assistant_text = next(
             (
                 m.get("content") or ""
-                for m in self._accumulated_messages
+                for m in reversed(self._accumulated_messages)
                 if m.get("role") == "assistant"
             ),
             "",
         )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
assistant_text = next(
(
m.get("content") or ""
for m in self._accumulated_messages
if m.get("role") == "assistant"
),
"",
)
assistant_text = next(
(
m.get("content") or ""
for m in reversed(self._accumulated_messages)
if m.get("role") == "assistant"
),
"",
)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@components/runners/ambient-runner/ambient_runner/bridges/claude/grpc_transport.py`
around lines 420 - 427, The current assignment to assistant_text selects the
first (oldest) assistant message from self._accumulated_messages, causing stale
content to be resent; change the selection to pick the most recent assistant
message by iterating the messages in reverse (e.g.,
reversed(self._accumulated_messages) or indexing from the end) and return the
first item whose role == "assistant" (update the expression that computes
assistant_text to use the reversed sequence so PushSessionMessage uses the
latest assistant reply and not the oldest snapshot).

Comment on lines +70 to +73
queue: asyncio.Queue = existing
else:
queue = asyncio.Queue()
active_streams[thread_id] = queue
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Unbounded queue contradicts docstring claim.

Line 49 docstring states "Creates a bounded asyncio.Queue" but line 72 creates asyncio.Queue() without maxsize. If the consumer is slow (network lag), this could grow unboundedly. Consider adding maxsize to match the bounded queue semantics documented and used in tests (maxsize=100).

🔧 Proposed fix
     else:
-        queue = asyncio.Queue()
+        queue = asyncio.Queue(maxsize=100)
         active_streams[thread_id] = queue
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
queue: asyncio.Queue = existing
else:
queue = asyncio.Queue()
active_streams[thread_id] = queue
queue: asyncio.Queue = existing
else:
queue = asyncio.Queue(maxsize=100)
active_streams[thread_id] = queue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/endpoints/events.py` around
lines 70 - 73, The code creates an unbounded asyncio.Queue (queue =
asyncio.Queue()) contrary to the function docstring and tests that expect a
bounded queue; update the queue creation in the active_streams initialization
branch to use a bounded queue (e.g., asyncio.Queue(maxsize=100)) so the variable
queue and the active_streams[thread_id] entry are created with a max size,
keeping the rest of the logic in this function (references: active_streams,
thread_id, queue) unchanged.

Comment thread docs/internal/proposals/alpha-to-main-migration.md
@jeremyeder jeremyeder enabled auto-merge (squash) April 21, 2026 18:04
@markturansky
Copy link
Copy Markdown
Contributor Author

Already merged to main as 7f7227d via squash merge. Closing.

auto-merge was automatically disabled April 21, 2026 18:06

Pull request was closed

@markturansky markturansky reopened this Apr 21, 2026
@markturansky markturansky force-pushed the chore/alpha-migration-pr6-runners branch from 7f7227d to f23044a Compare April 21, 2026 18:20
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
components/runners/ambient-runner/tests/test_shared_session_credentials.py (1)

855-864: ⚠️ Potential issue | 🟠 Major

Cleanup fails when _GH_WRAPPER_PATH is empty.

Pipeline failure: wrapper.unlink(missing_ok=True) raises IsADirectoryError because _GH_WRAPPER_PATH is "" at import time (set dynamically by install_gh_wrapper). Path("") resolves to . (current directory).

Proposed fix
     def _cleanup(self):
         """Remove wrapper artifacts created during tests."""
         import ambient_runner.platform.auth as _auth_mod

         _auth_mod._gh_wrapper_installed = False
-        wrapper = Path(_GH_WRAPPER_PATH)
-        wrapper.unlink(missing_ok=True)
+        if _GH_WRAPPER_PATH:
+            wrapper = Path(_GH_WRAPPER_PATH)
+            if wrapper.exists() and wrapper.is_file():
+                wrapper.unlink(missing_ok=True)
         wrapper_dir = Path(_GH_WRAPPER_DIR)
-        if wrapper_dir.exists() and not any(wrapper_dir.iterdir()):
-            wrapper_dir.rmdir()
+        if _GH_WRAPPER_DIR and wrapper_dir.exists():
+            if wrapper_dir.is_dir() and not any(wrapper_dir.iterdir()):
+                wrapper_dir.rmdir()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/tests/test_shared_session_credentials.py`
around lines 855 - 864, The test _cleanup function currently calls
Path(_GH_WRAPPER_PATH).unlink which fails when _GH_WRAPPER_PATH is an empty
string; modify _cleanup to first check that _GH_WRAPPER_PATH is a non-empty
string and that Path(_GH_WRAPPER_PATH).exists() and is_file() before calling
unlink, and only then attempt to remove the wrapper file; leave the existing
logic that resets ambient_runner.platform.auth._gh_wrapper_installed and that
removes _GH_WRAPPER_DIR only if it exists and is empty (use Path(...).iterdir()
to check emptiness) so we never attempt to unlink the current directory when
install_gh_wrapper hasn't set a path.
♻️ Duplicate comments (3)
components/manifests/base/rbac/control-plane-clusterrole.yaml (1)

22-24: ⚠️ Potential issue | 🟠 Major

Over-privileged cluster-wide mutation on secrets/services

Line 22–24 gives cluster-wide mutate/delete on secrets and services, which is excessive for a ClusterRole and materially increases blast radius. Keep only read verbs here (if needed), and move mutation of these resources to namespace-scoped Role/RoleBinding in only the required namespaces.

Suggested RBAC tightening (example)
 - apiGroups: [""]
-  resources: ["secrets", "serviceaccounts", "services", "pods"]
-  verbs: ["get", "list", "watch", "create", "update", "patch", "delete", "deletecollection"]
+  resources: ["serviceaccounts", "pods"]
+  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
+
+- apiGroups: [""]
+  resources: ["secrets", "services"]
+  verbs: ["get", "list", "watch"]

As per coding guidelines, components/manifests/**/*.yaml: RBAC must follow least-privilege.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/base/rbac/control-plane-clusterrole.yaml` around lines
22 - 24, The ClusterRole currently grants cluster-wide mutation and deletion on
"secrets" and "services" via the verbs list in the ClusterRole resource; remove
mutation verbs ("create","update","patch","delete","deletecollection") for the
"secrets" and "services" resource entries in this ClusterRole and restrict them
to only read verbs ("get","list","watch") if any cluster-wide read is required,
then create namespace-scoped Role(s) and RoleBinding(s) that grant the necessary
mutating verbs for "secrets" and "services" only in the specific namespaces that
need them; update the ClusterRole's resources/verbs block (the ClusterRole
resource definition containing resources:
["secrets","serviceaccounts","services","pods"] and verbs: [...]) and add
corresponding Role/RoleBinding manifests targeting the required namespaces.
components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml (1)

83-84: ⚠️ Potential issue | 🔴 Critical

Fix broken in-cluster token endpoint namespace

Line 84 points CP_TOKEN_URL to ambient-code--ambient-s0, while this Deployment runs in ambient-code--runtime-int; token fetches will fail.

Suggested fix
-            - name: CP_TOKEN_URL
-              value: "http://ambient-control-plane.ambient-code--ambient-s0.svc:8080/token"
+            - name: CP_TOKEN_URL
+              value: "http://ambient-control-plane.$(CP_RUNTIME_NAMESPACE).svc:8080/token"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml`
around lines 83 - 84, CP_TOKEN_URL in the manifest is hardcoded to the wrong
namespace; update the CP_TOKEN_URL environment variable value (the "value" for
name: CP_TOKEN_URL) to point to the in-cluster token service in the current
Deployment namespace (ambient-code--runtime-int) instead of
ambient-code--ambient-s0 so token fetches succeed; locate the CP_TOKEN_URL env
var in the ambient-control-plane Deployment and change the URL host segment to
ambient-code--runtime-int (or derive it from the Pod namespace at runtime if
preferred).
components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml (1)

47-97: ⚠️ Potential issue | 🟠 Major

PostgreSQL workload is under-specified and currently non-durable.

Line 47 defines a DB container without resources, and Line 97 stores database data on emptyDir. That combination gives unpredictable scheduling/QoS and data loss on restart/eviction/rollout. Add CPU/memory requests+limits and move /var/lib/pgsql/data to a PVC-backed volume (or explicitly document this overlay as disposable).

Suggested patch direction
       containers:
         - name: postgresql
           image: registry.redhat.io/rhel9/postgresql-16:latest
+          resources:
+            requests:
+              cpu: 250m
+              memory: 512Mi
+            limits:
+              cpu: "1"
+              memory: 1Gi
@@
       volumes:
         - name: ambient-api-server-db-data
-          emptyDir: {}
+          persistentVolumeClaim:
+            claimName: ambient-api-server-db-data

As per coding guidelines, "components/manifests/**/*.yaml: - RBAC must follow least-privilege. Resource limits/requests required on containers."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml`
around lines 47 - 97, The PostgreSQL container "postgresql" is missing resource
requests/limits and is using an ephemeral volume "ambient-api-server-db-data"
backed by emptyDir; add a resources block (cpu/memory requests and limits) to
the container spec for QoS and predictable scheduling, replace the emptyDir
volume with a PVC reference (volumes: name: ambient-api-server-db-data ->
persistentVolumeClaim: claimName: e.g. ambient-api-server-db-pvc) so
/var/lib/pgsql/data (volumeMount on that volume, subPath: pgdata) is durable,
and ensure a matching PVC manifest is created or documented; keep existing
readinessProbe/livenessProbe and securityContext unchanged.
🧹 Nitpick comments (1)
components/runners/ambient-runner/tests/test_grpc_client.py (1)

95-97: Catch the specific exception type from cryptography.

pytest.raises(Exception) is too broad. The cryptography library raises ValueError for invalid PEM data.

Proposed fix
     def test_invalid_public_key_raises(self):
-        with pytest.raises(Exception):
+        with pytest.raises(ValueError):
             _encrypt_session_id("not a pem key", "session-id")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/tests/test_grpc_client.py` around lines 95
- 97, The test test_invalid_public_key_raises currently uses
pytest.raises(Exception) which is too broad; change it to expect the specific
exception type raised by the cryptography library (ValueError) when
_encrypt_session_id is given invalid PEM input — update the
pytest.raises(Exception) call in test_invalid_public_key_raises to
pytest.raises(ValueError) so the test asserts the correct error from
_encrypt_session_id.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml`:
- Line 48: Replace the non-deterministic image tag
"registry.redhat.io/rhel9/postgresql-16:latest" with a pinned immutable
reference (either a specific patch tag like postgresql-16.Y.Z or the image
digest sha256:...) in the container spec, and add a resources block under that
same container (resources.requests and resources.limits) to specify CPU and
memory for the DB workload (e.g., set reasonable DB-specific cpu/memory request
and higher limits) so the manifest is reproducible and meets resource
guidelines.

In `@components/manifests/overlays/mpp-openshift/ambient-api-server.yaml`:
- Around line 56-57: The manifest currently uses the floating tag
quay.io/ambient_code/vteam_api_server:latest for both the init and main
containers (seen on the image lines around where the init container and main
container are defined); update both occurrences to use immutable image digests
(quay.io/ambient_code/vteam_api_server@sha256:...) or a fixed release tag
instead of :latest so the deployed image is deterministic and cannot change
silently at runtime—replace the two image fields accordingly.

In `@components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml`:
- Line 25: The manifest uses mutable image tags (e.g., the image entry
"quay.io/ambient_code/vteam_control_plane:latest") which should be pinned;
update each image reference (the control-plane image at the "image:
quay.io/ambient_code/vteam_control_plane:latest" entry and the other two image
entries referenced at the same pattern on lines ~54 and ~86) to use either a
specific version tag or an immutable digest (sha256) instead of :latest, e.g.,
replace ":latest" with a concrete semver tag or "@sha256:..." for each image
reference to ensure deterministic, auditable releases.

In `@components/runners/ambient-runner/ambient_runner/bridge.py`:
- Around line 230-252: The docstring for inject_message is incorrect: it says
the default is a no-op but the method actually raises NotImplementedError;
update the inject_message docstring (the method in bridge.py) to state that the
default implementation raises NotImplementedError and that subclasses must
override inject_message to handle inbound messages (include mention of the
raised exception and the parameters session_id, event_type, payload for
clarity).

---

Outside diff comments:
In `@components/runners/ambient-runner/tests/test_shared_session_credentials.py`:
- Around line 855-864: The test _cleanup function currently calls
Path(_GH_WRAPPER_PATH).unlink which fails when _GH_WRAPPER_PATH is an empty
string; modify _cleanup to first check that _GH_WRAPPER_PATH is a non-empty
string and that Path(_GH_WRAPPER_PATH).exists() and is_file() before calling
unlink, and only then attempt to remove the wrapper file; leave the existing
logic that resets ambient_runner.platform.auth._gh_wrapper_installed and that
removes _GH_WRAPPER_DIR only if it exists and is empty (use Path(...).iterdir()
to check emptiness) so we never attempt to unlink the current directory when
install_gh_wrapper hasn't set a path.

---

Duplicate comments:
In `@components/manifests/base/rbac/control-plane-clusterrole.yaml`:
- Around line 22-24: The ClusterRole currently grants cluster-wide mutation and
deletion on "secrets" and "services" via the verbs list in the ClusterRole
resource; remove mutation verbs
("create","update","patch","delete","deletecollection") for the "secrets" and
"services" resource entries in this ClusterRole and restrict them to only read
verbs ("get","list","watch") if any cluster-wide read is required, then create
namespace-scoped Role(s) and RoleBinding(s) that grant the necessary mutating
verbs for "secrets" and "services" only in the specific namespaces that need
them; update the ClusterRole's resources/verbs block (the ClusterRole resource
definition containing resources: ["secrets","serviceaccounts","services","pods"]
and verbs: [...]) and add corresponding Role/RoleBinding manifests targeting the
required namespaces.

In `@components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml`:
- Around line 47-97: The PostgreSQL container "postgresql" is missing resource
requests/limits and is using an ephemeral volume "ambient-api-server-db-data"
backed by emptyDir; add a resources block (cpu/memory requests and limits) to
the container spec for QoS and predictable scheduling, replace the emptyDir
volume with a PVC reference (volumes: name: ambient-api-server-db-data ->
persistentVolumeClaim: claimName: e.g. ambient-api-server-db-pvc) so
/var/lib/pgsql/data (volumeMount on that volume, subPath: pgdata) is durable,
and ensure a matching PVC manifest is created or documented; keep existing
readinessProbe/livenessProbe and securityContext unchanged.

In `@components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml`:
- Around line 83-84: CP_TOKEN_URL in the manifest is hardcoded to the wrong
namespace; update the CP_TOKEN_URL environment variable value (the "value" for
name: CP_TOKEN_URL) to point to the in-cluster token service in the current
Deployment namespace (ambient-code--runtime-int) instead of
ambient-code--ambient-s0 so token fetches succeed; locate the CP_TOKEN_URL env
var in the ambient-control-plane Deployment and change the URL host segment to
ambient-code--runtime-int (or derive it from the Pod namespace at runtime if
preferred).

---

Nitpick comments:
In `@components/runners/ambient-runner/tests/test_grpc_client.py`:
- Around line 95-97: The test test_invalid_public_key_raises currently uses
pytest.raises(Exception) which is too broad; change it to expect the specific
exception type raised by the cryptography library (ValueError) when
_encrypt_session_id is given invalid PEM input — update the
pytest.raises(Exception) call in test_invalid_public_key_raises to
pytest.raises(ValueError) so the test asserts the correct error from
_encrypt_session_id.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 4ba40ae2-323e-432e-bb47-2eae8c345c3f

📥 Commits

Reviewing files that changed from the base of the PR and between 7f7227d and 1568ef4.

📒 Files selected for processing (80)
  • components/manifests/base/ambient-control-plane-service.yml
  • components/manifests/base/kustomization.yaml
  • components/manifests/base/rbac/control-plane-clusterrole.yaml
  • components/manifests/base/rbac/control-plane-clusterrolebinding.yaml
  • components/manifests/base/rbac/control-plane-sa.yaml
  • components/manifests/base/rbac/kustomization.yaml
  • components/manifests/overlays/cluster-reader/README.md
  • components/manifests/overlays/cluster-reader/kustomization.yaml
  • components/manifests/overlays/cluster-reader/service-account.yaml
  • components/manifests/overlays/kind-local/operator-env-patch.yaml
  • components/manifests/overlays/local-dev/ambient-api-server-route.yaml
  • components/manifests/overlays/mpp-openshift/README.md
  • components/manifests/overlays/mpp-openshift/ambient-api-server-args-patch.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server-route.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server-service-ca-patch.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server.yaml
  • components/manifests/overlays/mpp-openshift/ambient-control-plane-sa.yaml
  • components/manifests/overlays/mpp-openshift/ambient-control-plane-svc.yaml
  • components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml
  • components/manifests/overlays/mpp-openshift/ambient-cp-tenant-sa.yaml
  • components/manifests/overlays/mpp-openshift/ambient-cp-token-netpol.yaml
  • components/manifests/overlays/mpp-openshift/ambient-tenant-ingress-netpol.yaml
  • components/manifests/overlays/mpp-openshift/kustomization.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac-runtime-int.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac-s0.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/kustomization.yaml
  • components/manifests/overlays/openshift-dev/ambient-api-server-args-patch.yaml
  • components/manifests/overlays/openshift-dev/ambient-api-server-env-patch.yaml
  • components/manifests/overlays/openshift-dev/kustomization.yaml
  • components/manifests/overlays/production/ambient-api-server-route.yaml
  • components/manifests/overlays/production/kustomization.yaml
  • components/runners/ambient-runner/.mcp.json
  • components/runners/ambient-runner/ag_ui_claude_sdk/adapter.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/handlers.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/hooks.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/reasoning_events.py
  • components/runners/ambient-runner/ag_ui_gemini_cli/types.py
  • components/runners/ambient-runner/ambient_runner/_grpc_client.py
  • components/runners/ambient-runner/ambient_runner/_inbox_messages_api.py
  • components/runners/ambient-runner/ambient_runner/_session_messages_api.py
  • components/runners/ambient-runner/ambient_runner/app.py
  • components/runners/ambient-runner/ambient_runner/bridge.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/grpc_transport.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/mcp.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/prompts.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/session.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/tools.py
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.py
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/session.py
  • components/runners/ambient-runner/ambient_runner/bridges/langgraph/bridge.py
  • components/runners/ambient-runner/ambient_runner/endpoints/events.py
  • components/runners/ambient-runner/ambient_runner/endpoints/run.py
  • components/runners/ambient-runner/ambient_runner/endpoints/tasks.py
  • components/runners/ambient-runner/ambient_runner/middleware/__init__.py
  • components/runners/ambient-runner/ambient_runner/middleware/grpc_push.py
  • components/runners/ambient-runner/ambient_runner/middleware/secret_redaction.py
  • components/runners/ambient-runner/ambient_runner/platform/auth.py
  • components/runners/ambient-runner/ambient_runner/platform/context.py
  • components/runners/ambient-runner/ambient_runner/platform/prompts.py
  • components/runners/ambient-runner/ambient_runner/platform/utils.py
  • components/runners/ambient-runner/ambient_runner/tools/backend_api.py
  • components/runners/ambient-runner/architecture.md
  • components/runners/ambient-runner/pyproject.toml
  • components/runners/ambient-runner/tests/test_app_initial_prompt.py
  • components/runners/ambient-runner/tests/test_auto_push.py
  • components/runners/ambient-runner/tests/test_bridge_claude.py
  • components/runners/ambient-runner/tests/test_e2e_api.py
  • components/runners/ambient-runner/tests/test_events_endpoint.py
  • components/runners/ambient-runner/tests/test_gemini_auth.py
  • components/runners/ambient-runner/tests/test_gemini_cli_adapter.py
  • components/runners/ambient-runner/tests/test_gemini_session.py
  • components/runners/ambient-runner/tests/test_google_drive_e2e.py
  • components/runners/ambient-runner/tests/test_grpc_client.py
  • components/runners/ambient-runner/tests/test_grpc_transport.py
  • components/runners/ambient-runner/tests/test_grpc_writer.py
  • components/runners/ambient-runner/tests/test_shared_session_credentials.py
  • docs/internal/proposals/alpha-to-main-migration.md
💤 Files with no reviewable changes (9)
  • components/runners/ambient-runner/tests/test_gemini_auth.py
  • components/manifests/overlays/cluster-reader/README.md
  • components/runners/ambient-runner/tests/test_e2e_api.py
  • components/runners/ambient-runner/tests/test_gemini_cli_adapter.py
  • components/runners/ambient-runner/tests/test_gemini_session.py
  • components/manifests/overlays/cluster-reader/kustomization.yaml
  • components/runners/ambient-runner/tests/test_auto_push.py
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/session.py
  • components/manifests/overlays/cluster-reader/service-account.yaml
✅ Files skipped from review due to trivial changes (37)
  • components/runners/ambient-runner/ag_ui_claude_sdk/reasoning_events.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/handlers.py
  • components/runners/ambient-runner/pyproject.toml
  • components/manifests/overlays/local-dev/ambient-api-server-route.yaml
  • components/runners/ambient-runner/tests/test_google_drive_e2e.py
  • components/manifests/base/rbac/kustomization.yaml
  • components/runners/ambient-runner/ag_ui_claude_sdk/hooks.py
  • components/manifests/overlays/mpp-openshift/ambient-cp-token-netpol.yaml
  • components/manifests/base/rbac/control-plane-sa.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac-runtime-int.yaml
  • components/runners/ambient-runner/ambient_runner/bridges/claude/prompts.py
  • components/manifests/overlays/mpp-openshift/ambient-control-plane-sa.yaml
  • components/manifests/overlays/production/kustomization.yaml
  • components/manifests/base/kustomization.yaml
  • components/manifests/overlays/mpp-openshift/ambient-control-plane-svc.yaml
  • components/runners/ambient-runner/ambient_runner/endpoints/tasks.py
  • components/runners/ambient-runner/.mcp.json
  • components/runners/ambient-runner/ambient_runner/platform/context.py
  • components/runners/ambient-runner/ambient_runner/middleware/secret_redaction.py
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac-s0.yaml
  • components/manifests/overlays/production/ambient-api-server-route.yaml
  • components/manifests/overlays/mpp-openshift/ambient-cp-tenant-sa.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server-service-ca-patch.yaml
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.py
  • components/manifests/overlays/mpp-openshift/ambient-tenant-ingress-netpol.yaml
  • components/runners/ambient-runner/ambient_runner/tools/backend_api.py
  • components/runners/ambient-runner/ambient_runner/bridges/langgraph/bridge.py
  • components/manifests/overlays/mpp-openshift/tenant-rbac/kustomization.yaml
  • components/runners/ambient-runner/tests/test_grpc_writer.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/adapter.py
  • components/manifests/overlays/mpp-openshift/README.md
  • components/manifests/base/rbac/control-plane-clusterrolebinding.yaml
  • components/manifests/base/ambient-control-plane-service.yml
  • components/manifests/overlays/mpp-openshift/ambient-api-server-route.yaml
  • components/runners/ambient-runner/architecture.md
  • components/runners/ambient-runner/ambient_runner/bridges/claude/session.py
🚧 Files skipped from review as they are similar to previous changes (8)
  • components/runners/ambient-runner/ambient_runner/bridges/claude/mcp.py
  • docs/internal/proposals/alpha-to-main-migration.md
  • components/manifests/overlays/mpp-openshift/kustomization.yaml
  • components/runners/ambient-runner/tests/test_bridge_claude.py
  • components/runners/ambient-runner/ambient_runner/platform/utils.py
  • components/manifests/overlays/openshift-dev/kustomization.yaml
  • components/runners/ambient-runner/ambient_runner/platform/prompts.py
  • components/runners/ambient-runner/ambient_runner/app.py

type: RuntimeDefault
containers:
- name: postgresql
image: registry.redhat.io/rhel9/postgresql-16:latest
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find mutable image tags across manifest overlays
rg -n --iglob 'components/manifests/**/*.yaml' '^\s*image:\s+\S+:latest\s*$'

Repository: ambient-code/platform

Length of output: 2704


🏁 Script executed:

cat -n components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml | head -60

Repository: ambient-code/platform

Length of output: 1835


🏁 Script executed:

cat -n components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml | sed -n '48,80p'

Repository: ambient-code/platform

Length of output: 1378


🏁 Script executed:

cat -n components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml | sed -n '80,95p'

Repository: ambient-code/platform

Length of output: 620


Add resource limits and pin DB image to immutable digest.

Line 48 uses :latest tag, weakening reproducibility and rollback safety. Additionally, the container lacks required resource limits and requests per manifest guidelines. Pin to an explicit patch version or digest, and set resource constraints (CPU/memory limits for database workload).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml` at
line 48, Replace the non-deterministic image tag
"registry.redhat.io/rhel9/postgresql-16:latest" with a pinned immutable
reference (either a specific patch tag like postgresql-16.Y.Z or the image
digest sha256:...) in the container spec, and add a resources block under that
same container (resources.requests and resources.limits) to specify CPU and
memory for the DB workload (e.g., set reasonable DB-specific cpu/memory request
and higher limits) so the manifest is reproducible and meets resource
guidelines.

Comment on lines +56 to +57
image: quay.io/ambient_code/vteam_api_server:latest
imagePullPolicy: Always
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, check if the file exists and get its content
if [ -f "components/manifests/overlays/mpp-openshift/ambient-api-server.yaml" ]; then
  echo "=== File exists, showing content with line numbers ==="
  cat -n "components/manifests/overlays/mpp-openshift/ambient-api-server.yaml"
else
  echo "File not found at specified path"
  # Try to search for similar files
  fd -t f "ambient-api-server.yaml"
fi

Repository: ambient-code/platform

Length of output: 7736


Pin vteam_api_server images to immutable digests.

Lines 56 and 86 both use :latest, which can silently change at runtime. Use immutable digests or fixed release tags for both init and main containers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/overlays/mpp-openshift/ambient-api-server.yaml` around
lines 56 - 57, The manifest currently uses the floating tag
quay.io/ambient_code/vteam_api_server:latest for both the init and main
containers (seen on the image lines around where the init container and main
container are defined); update both occurrences to use immutable image digests
(quay.io/ambient_code/vteam_api_server@sha256:...) or a fixed release tag
instead of :latest so the deployed image is deterministic and cannot change
silently at runtime—replace the two image fields accordingly.

type: RuntimeDefault
containers:
- name: ambient-control-plane
image: quay.io/ambient_code/vteam_control_plane:latest
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Pin container references to immutable versions

Using :latest for control-plane, runner, and MCP images makes releases non-deterministic and increases supply-chain risk. Pin tags or digests.

Suggested fix pattern
-          image: quay.io/ambient_code/vteam_control_plane:latest
+          image: quay.io/ambient_code/vteam_control_plane@sha256:<control-plane-digest>
...
-            - name: RUNNER_IMAGE
-              value: "quay.io/ambient_code/vteam_claude_runner:latest"
+            - name: RUNNER_IMAGE
+              value: "quay.io/ambient_code/vteam_claude_runner@sha256:<runner-digest>"
...
-            - name: MCP_IMAGE
-              value: "quay.io/ambient_code/vteam_mcp:latest"
+            - name: MCP_IMAGE
+              value: "quay.io/ambient_code/vteam_mcp@sha256:<mcp-digest>"

Also applies to: 54-54, 86-86

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml` at
line 25, The manifest uses mutable image tags (e.g., the image entry
"quay.io/ambient_code/vteam_control_plane:latest") which should be pinned;
update each image reference (the control-plane image at the "image:
quay.io/ambient_code/vteam_control_plane:latest" entry and the other two image
entries referenced at the same pattern on lines ~54 and ~86) to use either a
specific version tag or an immutable digest (sha256) instead of :latest, e.g.,
replace ":latest" with a concrete semver tag or "@sha256:..." for each image
reference to ensure deterministic, auditable releases.

Comment on lines +230 to +252
async def inject_message(
self, session_id: str, event_type: str, payload: str
) -> None:
"""Inject an inbound session message into the active run.

Called by the run endpoint for each ``SessionMessage`` received via
``WatchSessionMessages`` gRPC stream while a run is in progress.

Override in bridge subclasses to handle inbound messages — e.g. to
interrupt the current Claude turn and inject a new user message.

Default: no-op (inbound messages are silently dropped).

Args:
session_id: The session this message belongs to.
event_type: The message event type string.
payload: The raw JSON payload string.
"""
raise NotImplementedError(
f"{type(self).__name__} does not support inject_message "
f"(session_id={session_id!r}, event_type={event_type!r}). "
"Override inject_message() in your bridge subclass to handle inbound messages."
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Docstring contradicts implementation.

Line 241 says "Default: no-op (inbound messages are silently dropped)" but the implementation raises NotImplementedError. Update the docstring to reflect the actual behavior.

Proposed fix
-        Default: no-op (inbound messages are silently dropped).
+        Default: raises NotImplementedError — override in subclasses.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async def inject_message(
self, session_id: str, event_type: str, payload: str
) -> None:
"""Inject an inbound session message into the active run.
Called by the run endpoint for each ``SessionMessage`` received via
``WatchSessionMessages`` gRPC stream while a run is in progress.
Override in bridge subclasses to handle inbound messagese.g. to
interrupt the current Claude turn and inject a new user message.
Default: no-op (inbound messages are silently dropped).
Args:
session_id: The session this message belongs to.
event_type: The message event type string.
payload: The raw JSON payload string.
"""
raise NotImplementedError(
f"{type(self).__name__} does not support inject_message "
f"(session_id={session_id!r}, event_type={event_type!r}). "
"Override inject_message() in your bridge subclass to handle inbound messages."
)
async def inject_message(
self, session_id: str, event_type: str, payload: str
) -> None:
"""Inject an inbound session message into the active run.
Called by the run endpoint for each ``SessionMessage`` received via
``WatchSessionMessages`` gRPC stream while a run is in progress.
Override in bridge subclasses to handle inbound messagese.g. to
interrupt the current Claude turn and inject a new user message.
Default: raises NotImplementedErroroverride in subclasses.
Args:
session_id: The session this message belongs to.
event_type: The message event type string.
payload: The raw JSON payload string.
"""
raise NotImplementedError(
f"{type(self).__name__} does not support inject_message "
f"(session_id={session_id!r}, event_type={event_type!r}). "
"Override inject_message() in your bridge subclass to handle inbound messages."
)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/bridge.py` around lines 230
- 252, The docstring for inject_message is incorrect: it says the default is a
no-op but the method actually raises NotImplementedError; update the
inject_message docstring (the method in bridge.py) to state that the default
implementation raises NotImplementedError and that subclasses must override
inject_message to handle inbound messages (include mention of the raised
exception and the parameters session_id, event_type, payload for clarity).

user and others added 2 commits April 21, 2026 15:59
…hancements, and Kustomize overlays from alpha

PR 6 of alpha-to-main migration (combined with PR 7 — manifests).

Runners:
- gRPC transport for session message streaming
- gRPC client for control-plane token endpoint
- Inbox and session messages APIs with delta buffer
- Credential system: fetch/populate/clear, gh CLI wrapper
- SSE flush-per-chunk, unbounded tap queue
- CP OIDC token for backend credential fetches (RSA keypair auth)
- New deps: cryptography, grpcio, protobuf
- Tests: grpc_client, grpc_transport, grpc_writer, events_endpoint,
  app_initial_prompt, expanded bridge_claude and shared_session_credentials

Manifests:
- mpp-openshift overlay: NetworkPolicy, gRPC Route, CP token Service,
  RBAC, MCP sidecar, RoleBinding namespace fixes
- production overlay updates
- openshift-dev overlay
- Removed deprecated cluster-reader overlay
- All overlays pass kustomize build

Migration plan updated: PRs 1-5 marked merged, PR 6+7 combined.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
TestGhWrapper._cleanup() used stale top-level imports of _GH_WRAPPER_PATH
(always ""), causing Path(".").unlink() → IsADirectoryError. Now reads from
the module object with empty-string guards. Also fixes e2e workflow Docker
build context from components/runners to components/runners/ambient-runner
so pyproject.toml is found during image build.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@markturansky markturansky force-pushed the chore/alpha-migration-pr6-runners branch from cd8d939 to c586206 Compare April 21, 2026 20:01
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
components/runners/ambient-runner/tests/test_shared_session_credentials.py (1)

452-453: Avoid hard-coded request count in the test server loop

Line 452 pins the handler loop to 6 requests. This is brittle and can leave the thread blocked when provider calls change.

Suggested change
-        thread = Thread(
-            target=lambda: [server.handle_request() for _ in range(6)], daemon=True
-        )
+        thread = Thread(target=server.serve_forever, daemon=True)
         thread.start()
@@
         finally:
+            server.shutdown()
             server.server_close()
             thread.join(timeout=2)

Also applies to: 485-486

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/tests/test_shared_session_credentials.py`
around lines 452 - 453, The test currently hard-codes a fixed number of calls to
server.handle_request() (target=lambda: [server.handle_request() for _ in
range(6)]) which is brittle; change the thread target to run the server loop
until explicitly stopped (e.g., use server.serve_forever() in the thread or
replace the fixed-list lambda with a while loop controlled by a threading.Event
that calls server.handle_request() repeatedly), and ensure the test calls
server.shutdown() or sets the stop event to terminate the thread; update both
occurrences that use server.handle_request() so the test no longer depends on a
hard-coded request count.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@components/runners/ambient-runner/tests/test_shared_session_credentials.py`:
- Around line 452-453: The test currently hard-codes a fixed number of calls to
server.handle_request() (target=lambda: [server.handle_request() for _ in
range(6)]) which is brittle; change the thread target to run the server loop
until explicitly stopped (e.g., use server.serve_forever() in the thread or
replace the fixed-list lambda with a while loop controlled by a threading.Event
that calls server.handle_request() repeatedly), and ensure the test calls
server.shutdown() or sets the stop event to terminate the thread; update both
occurrences that use server.handle_request() so the test no longer depends on a
hard-coded request count.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 25d9e292-aeaf-424c-b317-588183a93a09

📥 Commits

Reviewing files that changed from the base of the PR and between 1568ef4 and cd8d939.

📒 Files selected for processing (2)
  • .github/workflows/e2e.yml
  • components/runners/ambient-runner/tests/test_shared_session_credentials.py

mergify Bot and others added 2 commits April 21, 2026 20:03
clear_runtime_credentials was deleting the Google Workspace credentials
file between turns, causing workspace-mcp to fall back to an inaccessible
localhost OAuth flow. The file is now intentionally preserved.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (6)
components/manifests/base/rbac/control-plane-clusterrole.yaml (1)

17-27: ⚠️ Potential issue | 🟠 Major

Cluster-wide mutation permissions violate least-privilege.

Lines 18-20 grant cluster-wide RoleBinding CRUD; lines 22-27 grant cluster-wide secrets/services/pods/jobs CRUD. This allows control-plane to mutate sensitive resources and escalate privileges across all namespaces. Scope these to namespace-specific Roles/RoleBindings in project namespaces instead.

As per coding guidelines, RBAC must follow least-privilege.

🔒 Restrict to namespace-scoped Roles

Remove cluster-wide mutation verbs from ClusterRole:

-# RoleBindings (reconcile group access from ProjectSettings)
-- apiGroups: ["rbac.authorization.k8s.io"]
-  resources: ["rolebindings"]
-  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
-# Session runner resources (provision/deprovision per-session workloads in project namespaces)
-- apiGroups: [""]
-  resources: ["secrets", "serviceaccounts", "services", "pods"]
-  verbs: ["get", "list", "watch", "create", "update", "patch", "delete", "deletecollection"]
-- apiGroups: ["batch"]
-  resources: ["jobs"]
-  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
+# RoleBindings (read-only cluster-wide; mutations via namespace Roles)
++ apiGroups: ["rbac.authorization.k8s.io"]
+  resources: ["rolebindings"]
+  verbs: ["get", "list", "watch"]
+# Session runner resources (read-only cluster-wide; mutations via namespace Roles)
++ apiGroups: [""]
+  resources: ["secrets", "serviceaccounts", "services", "pods"]
+  verbs: ["get", "list", "watch"]
++ apiGroups: ["batch"]
+  resources: ["jobs"]
+  verbs: ["get", "list", "watch"]

Create namespace-scoped Role (e.g., base/rbac/control-plane-project-role.yaml):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ambient-control-plane-project
  # Applied per project namespace via RoleBinding
rules:
- apiGroups: ["rbac.authorization.k8s.io"]
  resources: ["rolebindings"]
  verbs: ["create", "update", "patch", "delete"]
- apiGroups: [""]
  resources: ["secrets", "serviceaccounts", "services", "pods"]
  verbs: ["create", "update", "patch", "delete", "deletecollection"]
- apiGroups: ["batch"]
  resources: ["jobs"]
  verbs: ["create", "update", "patch", "delete"]

Bind this Role in each project namespace where control-plane provisions resources.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/base/rbac/control-plane-clusterrole.yaml` around lines
17 - 27, The ClusterRole currently grants cluster-wide mutation permissions for
rolebindings, secrets, serviceaccounts, services, pods and jobs; remove those
mutation rules from the ClusterRole in control-plane-clusterrole.yaml and
instead create a namespace-scoped Role (e.g., name it
ambient-control-plane-project) that contains rules for rbacs rolebindings and
core/batch resources with only the necessary verbs (create, update, patch,
delete, deletecollection as appropriate) and then bind that Role into each
project namespace via a RoleBinding; update any references that expect
cluster-wide permissions to reference the new RoleBinding so control-plane
provisioning is limited to project namespaces only.
components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml (2)

83-84: ⚠️ Potential issue | 🔴 Critical

Fix CP token endpoint namespace mismatch.

Line 84 points to ambient-code--ambient-s0, but this deployment is in ambient-code--runtime-int (Line 5). This will break token fetches from control-plane.

Suggested fix
-            - name: CP_TOKEN_URL
-              value: "http://ambient-control-plane.ambient-code--ambient-s0.svc:8080/token"
+            - name: CP_TOKEN_URL
+              value: "http://ambient-control-plane.ambient-code--runtime-int.svc:8080/token"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml`
around lines 83 - 84, The CP_TOKEN_URL environment variable is pointing to the
wrong namespace (ambient-code--ambient-s0) causing token fetches to fail; update
the CP_TOKEN_URL value to use the deployment's actual namespace
(ambient-code--runtime-int) so the URL becomes
http://ambient-control-plane.ambient-code--runtime-int.svc:8080/token and ensure
CP_TOKEN_URL in the manifest for the ambient-control-plane container is the
corrected value.

25-25: ⚠️ Potential issue | 🟠 Major

Replace mutable :latest image tags with immutable versions.

Line 25, Line 54, and Line 86 use mutable tags, which makes rollouts non-deterministic and weakens supply-chain control.

Suggested fix pattern
-          image: quay.io/ambient_code/vteam_control_plane:latest
+          image: quay.io/ambient_code/vteam_control_plane@sha256:<control-plane-digest>
...
-            - name: RUNNER_IMAGE
-              value: "quay.io/ambient_code/vteam_claude_runner:latest"
+            - name: RUNNER_IMAGE
+              value: "quay.io/ambient_code/vteam_claude_runner@sha256:<runner-digest>"
...
-            - name: MCP_IMAGE
-              value: "quay.io/ambient_code/vteam_mcp:latest"
+            - name: MCP_IMAGE
+              value: "quay.io/ambient_code/vteam_mcp@sha256:<mcp-digest>"

Also applies to: 54-54, 86-86

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml` at
line 25, Replace all occurrences of the mutable image reference
"quay.io/ambient_code/vteam_control_plane:latest" with an immutable identifier
(either a fixed semver tag or an image digest like
quay.io/ambient_code/vteam_control_plane@sha256:<digest>) so rollouts are
deterministic; update every instance of that image string in the manifest (the
three occurrences currently using :latest) and ensure any deployment/Pod specs
that reference the image use the same pinned value.
components/runners/ambient-runner/ambient_runner/middleware/grpc_push.py (1)

101-105: ⚠️ Potential issue | 🟠 Major

Avoid blocking the async stream with synchronous gRPC pushes

Line 104 executes a synchronous network call in the async loop, which can stall streaming on slow/unavailable gRPC.

Suggested fix
+import asyncio
...
     try:
         async for event in event_stream:
             yield event
-            _push_event(grpc_client, sid, event)
+            # fire-and-forget without blocking the event loop
+            loop = asyncio.get_running_loop()
+            loop.run_in_executor(None, _push_event, grpc_client, sid, event)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/middleware/grpc_push.py`
around lines 101 - 105, The async generator in grpc_push.py yields events while
calling the synchronous _push_event (in the loop that iterates over
event_stream), which can block the event stream; change the call to run
_push_event off the event loop (e.g., use asyncio.to_thread or
loop.run_in_executor) or convert _push_event into an async non-blocking function
and await it, so the async for event in event_stream loop no longer performs
blocking network I/O and the generator remains responsive.
components/runners/ambient-runner/tests/test_grpc_writer.py (1)

27-34: ⚠️ Potential issue | 🟠 Major

None client path is still untested

Line 27 converts explicit grpc_client=None into MagicMock, so Line 211 does not exercise the no-client early return.

Suggested fix
+_DEFAULT = object()
 
-def make_writer(grpc_client=None):
-    if grpc_client is None:
+def make_writer(grpc_client=_DEFAULT):
+    if grpc_client is _DEFAULT:
         grpc_client = MagicMock()
     return GRPCMessageWriter(
         session_id="sess-1",
         run_id="run-1",
         grpc_client=grpc_client,
     )

Also applies to: 208-213

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/tests/test_grpc_writer.py` around lines 27
- 34, The helper make_writer currently replaces a passed grpc_client=None with a
MagicMock so tests never exercise the no-client early-return in
GRPCMessageWriter; update tests to explicitly construct
GRPCMessageWriter(session_id="sess-1", run_id="run-1", grpc_client=None) or
modify make_writer to accept a sentinel (e.g., keep_none=True) that preserves
None so you can assert the early-return behavior when grpc_client is None
(targeting make_writer and GRPCMessageWriter constructor/write path).
components/runners/ambient-runner/ambient_runner/endpoints/run.py (1)

85-97: ⚠️ Potential issue | 🟠 Major

Preserve gRPC persistence when thread_id is absent

Line 85 falls back to "", so gRPC push can be disabled even though bridge.run() still succeeds via context/session fallback. Derive session_id from thread_id or bridge context.

Suggested fix
-    session_id = run_agent_input.thread_id or ""
+    session_id = (
+        run_agent_input.thread_id
+        or getattr(getattr(bridge, "context", None), "session_id", None)
+    )
...
-                session_id=session_id,
+                session_id=session_id,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/endpoints/run.py` around
lines 85 - 97, The code currently sets session_id = run_agent_input.thread_id or
"" which disables grpc_push_middleware when thread_id is absent; change
session_id derivation to use run_agent_input.thread_id if present else obtain
the session id from the bridge's context (e.g., bridge.context.session_id or
bridge.context.get("session_id")) so grpc_push_middleware receives a persistent
session id even when thread_id is missing; update the session_id assignment near
the grpc_push_middleware call (symbols: session_id, run_agent_input.thread_id,
bridge.run, grpc_push_middleware) to prefer thread_id then bridge context before
falling back to empty.
🧹 Nitpick comments (1)
components/runners/ambient-runner/tests/test_grpc_transport.py (1)

579-594: Minor: unused client variable.

Line 581 unpacks client but only uses writer. Consider writer, _client = self._writer() or just writer, _ = self._writer().

Otherwise, test coverage is comprehensive.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/tests/test_grpc_transport.py` around lines
579 - 594, The test test_push_offloaded_to_executor_not_inline unpacks an unused
second value from self._writer(); change the unpack to ignore the unused client
(e.g., writer, _ = self._writer() or writer, _client = self._writer()) so the
unused-variable warning is resolved while keeping the call to self._writer()
intact; update the unpack at the start of
test_push_offloaded_to_executor_not_inline to use an underscore-prefixed
variable name.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml`:
- Around line 55-64: The overlay references two missing
Secrets—ambient-api-server (keys clientId, clientSecret) used by the
ambient-control-plane container env vars and ambient-api-server-db (keys
db.user, db.password, db.name) used by the ambient-api-server-db
Deployment—causing CreateContainerConfigError; add Kubernetes Secret manifests
named ambient-api-server and ambient-api-server-db to this overlay with the
referenced keys (clientId, clientSecret and db.user, db.password, db.name
respectively) or update the env/valueFrom to point to existing Secrets; ensure
the secret keys match exactly and are included in the overlay manifests so the
pods can mount them at deploy time.

In
`@components/manifests/overlays/openshift-dev/ambient-api-server-args-patch.yaml`:
- Around line 20-21: The patch currently disables authentication/authorization
via the flags --enable-jwt=false and --enable-authz=false while also allowing
wildcard CORS origins (origins: "*"), which is unsafe; update the manifest so
you do not ship both JWT and authz disabled together — re-enable at least one of
them (e.g., remove or set --enable-authz=true or --enable-jwt=true in the
ambient API server args) and replace the wildcard CORS entry with a restricted
dev-only origin list or make the CORS origin configurable via an
environment/overlay value (reference the flags --enable-jwt, --enable-authz and
the CORS origins entry) so dev defaults are safer and can be tightened in
CI/prod.
- Around line 10-12: Add an explicit securityContext and resource limits to the
api-server container: in the containers -> - name: api-server block, add a
container-level securityContext with runAsNonRoot: true, runAsUser (non-zero
UID), allowPrivilegeEscalation: false, privileged: false, capabilities: { drop:
["ALL"] }, readOnlyRootFilesystem: true and seccompProfile: { type:
"RuntimeDefault" } (or Pod-level equivalents if used); also add
resources.requests and resources.limits (cpu/memory) to satisfy resource policy;
ensure no conflicting fields remain and that these settings appear alongside the
existing command and args for the api-server container.

In `@components/runners/ambient-runner/ambient_runner/app.py`:
- Around line 272-280: The events_router is being registered twice via
app.include_router(events_router), causing duplicate route registration; locate
both occurrences of app.include_router(events_router) (one with the core
endpoints block and one in the between-run event stream block) and remove the
redundant include so events_router is included only once in the intended
location; ensure any related comments/documentation reflect the single
registration and keep app.include_router(run_router),
app.include_router(interrupt_router), and app.include_router(health_router)
as-is.

---

Duplicate comments:
In `@components/manifests/base/rbac/control-plane-clusterrole.yaml`:
- Around line 17-27: The ClusterRole currently grants cluster-wide mutation
permissions for rolebindings, secrets, serviceaccounts, services, pods and jobs;
remove those mutation rules from the ClusterRole in
control-plane-clusterrole.yaml and instead create a namespace-scoped Role (e.g.,
name it ambient-control-plane-project) that contains rules for rbacs
rolebindings and core/batch resources with only the necessary verbs (create,
update, patch, delete, deletecollection as appropriate) and then bind that Role
into each project namespace via a RoleBinding; update any references that expect
cluster-wide permissions to reference the new RoleBinding so control-plane
provisioning is limited to project namespaces only.

In `@components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml`:
- Around line 83-84: The CP_TOKEN_URL environment variable is pointing to the
wrong namespace (ambient-code--ambient-s0) causing token fetches to fail; update
the CP_TOKEN_URL value to use the deployment's actual namespace
(ambient-code--runtime-int) so the URL becomes
http://ambient-control-plane.ambient-code--runtime-int.svc:8080/token and ensure
CP_TOKEN_URL in the manifest for the ambient-control-plane container is the
corrected value.
- Line 25: Replace all occurrences of the mutable image reference
"quay.io/ambient_code/vteam_control_plane:latest" with an immutable identifier
(either a fixed semver tag or an image digest like
quay.io/ambient_code/vteam_control_plane@sha256:<digest>) so rollouts are
deterministic; update every instance of that image string in the manifest (the
three occurrences currently using :latest) and ensure any deployment/Pod specs
that reference the image use the same pinned value.

In `@components/runners/ambient-runner/ambient_runner/endpoints/run.py`:
- Around line 85-97: The code currently sets session_id =
run_agent_input.thread_id or "" which disables grpc_push_middleware when
thread_id is absent; change session_id derivation to use
run_agent_input.thread_id if present else obtain the session id from the
bridge's context (e.g., bridge.context.session_id or
bridge.context.get("session_id")) so grpc_push_middleware receives a persistent
session id even when thread_id is missing; update the session_id assignment near
the grpc_push_middleware call (symbols: session_id, run_agent_input.thread_id,
bridge.run, grpc_push_middleware) to prefer thread_id then bridge context before
falling back to empty.

In `@components/runners/ambient-runner/ambient_runner/middleware/grpc_push.py`:
- Around line 101-105: The async generator in grpc_push.py yields events while
calling the synchronous _push_event (in the loop that iterates over
event_stream), which can block the event stream; change the call to run
_push_event off the event loop (e.g., use asyncio.to_thread or
loop.run_in_executor) or convert _push_event into an async non-blocking function
and await it, so the async for event in event_stream loop no longer performs
blocking network I/O and the generator remains responsive.

In `@components/runners/ambient-runner/tests/test_grpc_writer.py`:
- Around line 27-34: The helper make_writer currently replaces a passed
grpc_client=None with a MagicMock so tests never exercise the no-client
early-return in GRPCMessageWriter; update tests to explicitly construct
GRPCMessageWriter(session_id="sess-1", run_id="run-1", grpc_client=None) or
modify make_writer to accept a sentinel (e.g., keep_none=True) that preserves
None so you can assert the early-return behavior when grpc_client is None
(targeting make_writer and GRPCMessageWriter constructor/write path).

---

Nitpick comments:
In `@components/runners/ambient-runner/tests/test_grpc_transport.py`:
- Around line 579-594: The test test_push_offloaded_to_executor_not_inline
unpacks an unused second value from self._writer(); change the unpack to ignore
the unused client (e.g., writer, _ = self._writer() or writer, _client =
self._writer()) so the unused-variable warning is resolved while keeping the
call to self._writer() intact; update the unpack at the start of
test_push_offloaded_to_executor_not_inline to use an underscore-prefixed
variable name.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 5de9ffc6-c173-42d3-a50d-6651218ebf70

📥 Commits

Reviewing files that changed from the base of the PR and between cd8d939 and fdf839c.

📒 Files selected for processing (80)
  • components/manifests/base/ambient-control-plane-service.yml
  • components/manifests/base/kustomization.yaml
  • components/manifests/base/rbac/control-plane-clusterrole.yaml
  • components/manifests/base/rbac/control-plane-clusterrolebinding.yaml
  • components/manifests/base/rbac/control-plane-sa.yaml
  • components/manifests/base/rbac/kustomization.yaml
  • components/manifests/overlays/cluster-reader/README.md
  • components/manifests/overlays/cluster-reader/kustomization.yaml
  • components/manifests/overlays/cluster-reader/service-account.yaml
  • components/manifests/overlays/kind-local/operator-env-patch.yaml
  • components/manifests/overlays/local-dev/ambient-api-server-route.yaml
  • components/manifests/overlays/mpp-openshift/README.md
  • components/manifests/overlays/mpp-openshift/ambient-api-server-args-patch.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server-route.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server-service-ca-patch.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server.yaml
  • components/manifests/overlays/mpp-openshift/ambient-control-plane-sa.yaml
  • components/manifests/overlays/mpp-openshift/ambient-control-plane-svc.yaml
  • components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml
  • components/manifests/overlays/mpp-openshift/ambient-cp-tenant-sa.yaml
  • components/manifests/overlays/mpp-openshift/ambient-cp-token-netpol.yaml
  • components/manifests/overlays/mpp-openshift/ambient-tenant-ingress-netpol.yaml
  • components/manifests/overlays/mpp-openshift/kustomization.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac-runtime-int.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac-s0.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/kustomization.yaml
  • components/manifests/overlays/openshift-dev/ambient-api-server-args-patch.yaml
  • components/manifests/overlays/openshift-dev/ambient-api-server-env-patch.yaml
  • components/manifests/overlays/openshift-dev/kustomization.yaml
  • components/manifests/overlays/production/ambient-api-server-route.yaml
  • components/manifests/overlays/production/kustomization.yaml
  • components/runners/ambient-runner/.mcp.json
  • components/runners/ambient-runner/ag_ui_claude_sdk/adapter.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/handlers.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/hooks.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/reasoning_events.py
  • components/runners/ambient-runner/ag_ui_gemini_cli/types.py
  • components/runners/ambient-runner/ambient_runner/_grpc_client.py
  • components/runners/ambient-runner/ambient_runner/_inbox_messages_api.py
  • components/runners/ambient-runner/ambient_runner/_session_messages_api.py
  • components/runners/ambient-runner/ambient_runner/app.py
  • components/runners/ambient-runner/ambient_runner/bridge.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/grpc_transport.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/mcp.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/prompts.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/session.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/tools.py
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.py
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/session.py
  • components/runners/ambient-runner/ambient_runner/bridges/langgraph/bridge.py
  • components/runners/ambient-runner/ambient_runner/endpoints/events.py
  • components/runners/ambient-runner/ambient_runner/endpoints/run.py
  • components/runners/ambient-runner/ambient_runner/endpoints/tasks.py
  • components/runners/ambient-runner/ambient_runner/middleware/__init__.py
  • components/runners/ambient-runner/ambient_runner/middleware/grpc_push.py
  • components/runners/ambient-runner/ambient_runner/middleware/secret_redaction.py
  • components/runners/ambient-runner/ambient_runner/platform/auth.py
  • components/runners/ambient-runner/ambient_runner/platform/context.py
  • components/runners/ambient-runner/ambient_runner/platform/prompts.py
  • components/runners/ambient-runner/ambient_runner/platform/utils.py
  • components/runners/ambient-runner/ambient_runner/tools/backend_api.py
  • components/runners/ambient-runner/architecture.md
  • components/runners/ambient-runner/pyproject.toml
  • components/runners/ambient-runner/tests/test_app_initial_prompt.py
  • components/runners/ambient-runner/tests/test_auto_push.py
  • components/runners/ambient-runner/tests/test_bridge_claude.py
  • components/runners/ambient-runner/tests/test_e2e_api.py
  • components/runners/ambient-runner/tests/test_events_endpoint.py
  • components/runners/ambient-runner/tests/test_gemini_auth.py
  • components/runners/ambient-runner/tests/test_gemini_cli_adapter.py
  • components/runners/ambient-runner/tests/test_gemini_session.py
  • components/runners/ambient-runner/tests/test_google_drive_e2e.py
  • components/runners/ambient-runner/tests/test_grpc_client.py
  • components/runners/ambient-runner/tests/test_grpc_transport.py
  • components/runners/ambient-runner/tests/test_grpc_writer.py
  • components/runners/ambient-runner/tests/test_shared_session_credentials.py
  • docs/internal/proposals/alpha-to-main-migration.md
💤 Files with no reviewable changes (9)
  • components/runners/ambient-runner/tests/test_e2e_api.py
  • components/manifests/overlays/cluster-reader/kustomization.yaml
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/session.py
  • components/runners/ambient-runner/tests/test_gemini_session.py
  • components/runners/ambient-runner/tests/test_auto_push.py
  • components/runners/ambient-runner/tests/test_gemini_cli_adapter.py
  • components/runners/ambient-runner/tests/test_gemini_auth.py
  • components/manifests/overlays/cluster-reader/service-account.yaml
  • components/manifests/overlays/cluster-reader/README.md
✅ Files skipped from review due to trivial changes (38)
  • components/runners/ambient-runner/ambient_runner/middleware/secret_redaction.py
  • components/runners/ambient-runner/tests/test_google_drive_e2e.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/reasoning_events.py
  • components/manifests/overlays/local-dev/ambient-api-server-route.yaml
  • components/runners/ambient-runner/ambient_runner/tools/backend_api.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/handlers.py
  • components/runners/ambient-runner/ambient_runner/bridges/langgraph/bridge.py
  • components/runners/ambient-runner/ag_ui_claude_sdk/hooks.py
  • components/manifests/overlays/mpp-openshift/ambient-cp-token-netpol.yaml
  • components/manifests/base/rbac/control-plane-sa.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac-runtime-int.yaml
  • components/runners/ambient-runner/.mcp.json
  • components/manifests/overlays/mpp-openshift/ambient-control-plane-sa.yaml
  • components/manifests/base/kustomization.yaml
  • components/manifests/overlays/mpp-openshift/ambient-tenant-ingress-netpol.yaml
  • components/manifests/overlays/mpp-openshift/ambient-cp-tenant-sa.yaml
  • components/manifests/base/rbac/kustomization.yaml
  • components/runners/ambient-runner/ambient_runner/endpoints/tasks.py
  • components/manifests/overlays/production/ambient-api-server-route.yaml
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.py
  • components/manifests/overlays/production/kustomization.yaml
  • components/manifests/overlays/mpp-openshift/tenant-rbac/kustomization.yaml
  • components/runners/ambient-runner/ambient_runner/platform/context.py
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac.yaml
  • components/runners/ambient-runner/ag_ui_claude_sdk/adapter.py
  • components/manifests/overlays/openshift-dev/kustomization.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server-service-ca-patch.yaml
  • components/manifests/base/ambient-control-plane-service.yml
  • components/runners/ambient-runner/ambient_runner/bridges/claude/session.py
  • components/manifests/overlays/mpp-openshift/README.md
  • components/manifests/overlays/mpp-openshift/tenant-rbac/ambient-control-plane-rbac-s0.yaml
  • components/manifests/base/rbac/control-plane-clusterrolebinding.yaml
  • components/manifests/overlays/mpp-openshift/ambient-api-server-route.yaml
  • components/runners/ambient-runner/tests/test_events_endpoint.py
  • components/runners/ambient-runner/tests/test_app_initial_prompt.py
  • components/runners/ambient-runner/architecture.md
  • components/manifests/overlays/mpp-openshift/kustomization.yaml
  • components/manifests/overlays/mpp-openshift/ambient-control-plane-svc.yaml
🚧 Files skipped from review as they are similar to previous changes (5)
  • components/runners/ambient-runner/ambient_runner/bridges/claude/mcp.py
  • components/runners/ambient-runner/pyproject.toml
  • docs/internal/proposals/alpha-to-main-migration.md
  • components/runners/ambient-runner/ambient_runner/bridges/claude/prompts.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py

Comment on lines +55 to +64
- name: OIDC_CLIENT_ID
valueFrom:
secretKeyRef:
name: ambient-api-server
key: clientId
- name: OIDC_CLIENT_SECRET
valueFrom:
secretKeyRef:
name: ambient-api-server
key: clientSecret
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Find Secret/ExternalSecret/SealedSecret definitions for ambient-api-server.
rg -n -C3 'kind:\s*(Secret|ExternalSecret|SealedSecret)|name:\s*ambient-api-server|namespace:\s*ambient-code--runtime-int' components/manifests

# 2) Check kustomizations that include secret resources or generators.
rg -n -C4 'secretGenerator|resources:|ambient-api-server' components/manifests

# Expected:
# - At least one manifest path that provisions ambient-api-server for ambient-code--runtime-int,
#   or a clearly referenced external secret mechanism included by the overlay.

Repository: ambient-code/platform

Length of output: 50378


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check what's actually in the ambient-api-server.yaml file for mpp-openshift
echo "=== Content of mpp-openshift/ambient-api-server.yaml ==="
head -160 components/manifests/overlays/mpp-openshift/ambient-api-server.yaml | tail -30

# Look specifically for Secret definitions for ambient-api-server in mpp-openshift
echo -e "\n=== Searching for Secret kind in mpp-openshift overlay ==="
rg -n "^kind: Secret" components/manifests/overlays/mpp-openshift/

# Check mpp-openshift kustomization for secretGenerator or secret resources
echo -e "\n=== Searching mpp-openshift kustomization.yaml for secrets ==="
rg -n -A2 -B2 "secretGenerator|resources:" components/manifests/overlays/mpp-openshift/kustomization.yaml | head -60

# Check if ambient-api-server Secret exists explicitly
echo -e "\n=== Checking for Secret ambient-api-server in entire mpp-openshift ==="
rg -n "kind: Secret" components/manifests/overlays/mpp-openshift/ -A2 | grep -A2 "ambient-api-server" || echo "No ambient-api-server Secret found"

Repository: ambient-code/platform

Length of output: 1349


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Final check: is there any ExternalSecret or SealedSecret for ambient-api-server in mpp-openshift?
echo "=== Checking for ExternalSecret/SealedSecret definitions ==="
rg -n "kind: (ExternalSecret|SealedSecret)" components/manifests/overlays/mpp-openshift/ || echo "None found"

# Check if ambient-api-server-db is defined (to see pattern for secrets in this overlay)
echo -e "\n=== Checking ambient-api-server-db.yaml to understand secret provisioning pattern ==="
rg -n "kind: Secret" components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml -A5 | head -20

Repository: ambient-code/platform

Length of output: 223


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if ambient-api-server-db.yaml exists and what it contains
echo "=== Checking ambient-api-server-db.yaml ==="
if [ -f components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml ]; then
  cat components/manifests/overlays/mpp-openshift/ambient-api-server-db.yaml
else
  echo "File does not exist"
fi

# List all files in mpp-openshift to see what's actually there
echo -e "\n=== Files in mpp-openshift overlay ==="
ls -la components/manifests/overlays/mpp-openshift/

Repository: ambient-code/platform

Length of output: 3813


Add Secret ambient-api-server to the mpp-openshift overlay.

Lines 58–64 reference keys clientId and clientSecret from Secret ambient-api-server, which is not defined in this overlay. The pod will fail with CreateContainerConfigError on deployment.

Additionally, the ambient-api-server-db Deployment also references a missing ambient-api-server-db Secret (with keys db.user, db.password, db.name). Both Secrets must be provisioned before the overlay can run.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/manifests/overlays/mpp-openshift/ambient-control-plane.yaml`
around lines 55 - 64, The overlay references two missing
Secrets—ambient-api-server (keys clientId, clientSecret) used by the
ambient-control-plane container env vars and ambient-api-server-db (keys
db.user, db.password, db.name) used by the ambient-api-server-db
Deployment—causing CreateContainerConfigError; add Kubernetes Secret manifests
named ambient-api-server and ambient-api-server-db to this overlay with the
referenced keys (clientId, clientSecret and db.user, db.password, db.name
respectively) or update the env/valueFrom to point to existing Secrets; ensure
the secret keys match exactly and are included in the overlay manifests so the
pods can mount them at deploy time.

Comment on lines +10 to +12
containers:
- name: api-server
command:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add explicit pod/container securityContext hardening

The container currently relies on default security context; set explicit non-root and no-privilege-escalation controls to avoid root-capable defaults (Line 10–12, Line 48–68).

Suggested hardening patch
 spec:
   template:
     spec:
+      securityContext:
+        seccompProfile:
+          type: RuntimeDefault
       containers:
         - name: api-server
+          securityContext:
+            runAsNonRoot: true
+            allowPrivilegeEscalation: false
+            readOnlyRootFilesystem: true
+            capabilities:
+              drop:
+                - ALL
           command:
             - /usr/local/bin/ambient-api-server
As per coding guidelines, "`**/*`: Flag security vulnerabilities" and "`components/manifests/**/*.yaml`: Resource limits/requests required on containers."

Also applies to: 48-68

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@components/manifests/overlays/openshift-dev/ambient-api-server-args-patch.yaml`
around lines 10 - 12, Add an explicit securityContext and resource limits to the
api-server container: in the containers -> - name: api-server block, add a
container-level securityContext with runAsNonRoot: true, runAsUser (non-zero
UID), allowPrivilegeEscalation: false, privileged: false, capabilities: { drop:
["ALL"] }, readOnlyRootFilesystem: true and seccompProfile: { type:
"RuntimeDefault" } (or Pod-level equivalents if used); also add
resources.requests and resources.limits (cpu/memory) to satisfy resource policy;
ensure no conflicting fields remain and that these settings appear alongside the
existing command and args for the api-server container.

Comment on lines +20 to +21
- --enable-jwt=false
- --enable-authz=false
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid shipping fully disabled auth with wildcard CORS

Line 20–21 disables both JWT and authz, and Line 39 allows * origins. Even for dev, this combination is high risk if route/network boundaries are misconfigured.

Safer dev defaults
-            - --enable-jwt=false
-            - --enable-authz=false
+            - --enable-jwt=true
+            - --enable-authz=true
@@
-            - --cors-allowed-origins=*
+            - --cors-allowed-origins=https://ambient-dev.example.com
As per coding guidelines, "`**/*`: Flag bugs, security vulnerabilities ... and meaningful refactoring opportunities."

Also applies to: 39-40

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@components/manifests/overlays/openshift-dev/ambient-api-server-args-patch.yaml`
around lines 20 - 21, The patch currently disables authentication/authorization
via the flags --enable-jwt=false and --enable-authz=false while also allowing
wildcard CORS origins (origins: "*"), which is unsafe; update the manifest so
you do not ship both JWT and authz disabled together — re-enable at least one of
them (e.g., remove or set --enable-authz=true or --enable-jwt=true in the
ambient API server args) and replace the wildcard CORS entry with a restricted
dev-only origin list or make the CORS origin configurable via an
environment/overlay value (reference the flags --enable-jwt, --enable-authz and
the CORS origins entry) so dev defaults are safer and can be tightened in
CI/prod.

Comment thread components/runners/ambient-runner/ambient_runner/app.py
user and others added 2 commits April 21, 2026 16:21
Tests were refactored to remove CREDENTIAL_IDS but _fetch_credential
still requires it to look up the credential_id and build the API URL.
Without it, the function returns {} immediately without hitting the
test HTTP server. Also restores credential_id-based response matching
in the lifecycle test.

713 passed, 11 skipped locally.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
…hot tests

The session-list screenshot test failed because setTheme clicked the
toggle button before it was visible in the DOM. Add explicit visibility
wait with 10s timeout to setTheme, matching the pattern already used
in the waitForThemeToggle setup step.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@mergify mergify Bot added the queued label Apr 21, 2026
@mergify
Copy link
Copy Markdown
Contributor

mergify Bot commented Apr 21, 2026

Merge Queue Status

  • Entered queue2026-04-21 21:05 UTC · Rule: default
  • Checks skipped · PR is already up-to-date
  • Merged2026-04-21 21:05 UTC · at f240cd91df5d881888913980ddd329bce1a88a6e · squash

This pull request spent 11 seconds in the queue, including 1 second running CI.

Required conditions to merge

@mergify mergify Bot merged commit 1aa8b42 into main Apr 21, 2026
60 of 61 checks passed
@mergify mergify Bot deleted the chore/alpha-migration-pr6-runners branch April 21, 2026 21:05
@mergify mergify Bot removed the queued label Apr 21, 2026
@github-actions
Copy link
Copy Markdown
Contributor

⚠️ SDD Preflight — Managed Paths Modified

This PR modifies files in SDD-managed component(s). These components are migrating to Spec-Driven Development.

File Component Mode
components/runners/ambient-runner/.mcp.json runner warn
components/runners/ambient-runner/architecture.md runner warn
components/runners/ambient-runner/pyproject.toml runner warn

No action required — these components are in warn mode. Consider using the component's agent workflow for future changes.

📖 Specs: Runner Spec · Runner Constitution

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant