feat(specs): add custom runner image specification by jbpratt · Pull Request #1563 · ambient-code/platform

jbpratt · 2026-05-12T13:18:33Z

Summary

- Adds `specs/agents/runner-image.spec.md`, defining the stable runner contract and a workspace-level custom image override.
- Custom images are built via Dockerfile `FROM` on a published base image, with no init hooks.
- New `runner_image` and `runner_image_pull_secret` fields on ProjectSettings let workspace admins configure a custom runner per project.
- Defines stable interfaces: AG-UI HTTP endpoints, filesystem layout, entrypoint contract, environment variables, security constraints, and Python runtime requirements.
- Includes image selection precedence (ProjectSettings > agent registry > operator default), registry allowlist validation, RBAC, and failure mode scenarios.

Details

The spec establishes the boundary between "what the platform guarantees" and "what custom images can change." Key design decisions:

- Dockerfile `FROM` only: init hooks were rejected due to non-reproducibility, startup latency, network dependency, and OpenShift SCC conflicts.
- ProjectSettings, not Session: image trust is an admin concern; all sessions in a project use the same vetted image.
- Agent registry is orthogonal: a custom image overrides the container image but preserves `RUNNER_TYPE`, resources, and sandbox config from the registry.
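
The selection order and the orthogonality of the agent registry could be sketched as follows. This is a minimal illustration, not the control plane's implementation: `RegistryEntry`, `PodPlan`, `plan_pod`, and the `cpu_limit` field are hypothetical names, and only the precedence (ProjectSettings, then agent registry, then operator default) comes from the PR description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegistryEntry:
    """Hypothetical agent registry entry for one runner type."""
    runner_type: str
    image: Optional[str]
    cpu_limit: str


@dataclass(frozen=True)
class PodPlan:
    """Hypothetical result of image selection for a new session pod."""
    image: str
    runner_type: str
    cpu_limit: str


def plan_pod(project_image: Optional[str],
             entry: RegistryEntry,
             operator_default: Optional[str]) -> PodPlan:
    # The first non-empty candidate wins:
    # ProjectSettings > agent registry > operator default.
    candidates = (project_image, entry.image, operator_default)
    image = next((c for c in candidates if c), None)
    if image is None:
        raise ValueError("no runner image configured at any level")
    # Only the image is overridden; runner type and resources
    # always come from the registry entry.
    return PodPlan(image=image,
                   runner_type=entry.runner_type,
                   cpu_limit=entry.cpu_limit)
```

With this shape, setting a project-level image changes `PodPlan.image` only, which mirrors the "orthogonal" decision above.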

Test plan

- Review spec for completeness against `runner.spec.md` and `control-plane.spec.md`
- Verify GIVEN/WHEN/THEN scenarios are testable
- Confirm implementation touchpoints table is accurate

🤖 Generated with Claude Code

Summary by CodeRabbit

Documentation
- Added a comprehensive spec for custom runner images: required AG-UI HTTP endpoints/port, Python 3.12+ with preserved minor version, mandatory filesystem paths, and lifecycle/startup expectations.
- Clarified runtime/security constraints: prefer no CMD/ENTRYPOINT overrides, non-root runtime (build-time root allowed), OpenShift arbitrary-UID guidance, forbidden overrides of CP-injected env vars, and graceful shutdown handling.
- ProjectSettings support: per-project runner image/pull-secret, precedence rules, pull-policy logic, RBAC/update requirement, and note that changes affect new sessions only.
- Added base image publishing requirements and a conformance test suite for endpoint, filesystem, env, and runtime checks.

netlify · 2026-05-12T13:18:48Z

✅ Deploy Preview for cheerful-kitten-f556a0 canceled.

Name	Link
🔨 Latest commit	`6e3e36f`
🔍 Latest deploy log	https://app.netlify.com/projects/cheerful-kitten-f556a0/deploys/6a05f818b91bb10008c3b0ef

coderabbitai · 2026-05-12T13:18:50Z

Walkthrough

Defines a stable custom runner image contract: required AG‑UI HTTP endpoints and response formats, Python/runtime and filesystem invariants, entrypoint/signal/non‑root constraints, ProjectSettings runner_image override (validation, secrets, pull policy, precedence), failure modes, security expectations, and base-image OCI contract labeling.

Changes

Runner image contract & ProjectSettings override

| Layer / File(s) | Summary |
| --- | --- |
| AG‑UI HTTP contract and required endpoints `specs/agents/runner-image.spec.md` (lines 1–173) | Specifies `AGUI_PORT` (default 8001) and required endpoints: `/`, `/interrupt`, `/health`, `/capabilities`, `/events/{thread_id}`. Endpoint URLs and response formats are stable and must not be removed or changed; additional AG‑UI endpoints may be provided by `ambient_runner`. |
| Python, packaging, and filesystem invariants `specs/agents/runner-image.spec.md` (lines 61–167) | Requires Python ≥3.12; runner must use the base image's Python major.minor and include the `ambient_runner` package. Mandates preserved paths: `/workspace`, `/app`, `/app/ambient-runner`, `/app/vertex`, `/tmp`. |
| Entrypoint, startup, and signal handling `specs/agents/runner-image.spec.md` (lines 61–167) | Describes CMD/ENTRYPOINT expectations (prefer not overriding; wrappers must `exec` the runner), runner must listen on `AGUI_PORT`, handle SIGTERM for graceful shutdown (runner as PID 1 or direct child), and start within pod startup timeout. |
| Control‑plane environment variables and runtime UID/security rules `specs/agents/runner-image.spec.md` (lines 61–167) | Enumerates CP-injected env vars that custom images must not override (e.g., `SESSION_ID`, `PROJECT_NAME`, `WORKSPACE_PATH`, `AGUI_PORT`, backend/grpc/token endpoints, `INITIAL_PROMPT`, `IS_RESUME`, `CREDENTIAL_IDS`, `RUNNER_TYPE`). Requires non‑root runtime (UID 1001), `allowPrivilegeEscalation: false`, and dropped capabilities; root allowed only during build. |
| ProjectSettings fields, precedence, and scope `specs/agents/runner-image.spec.md` (lines 178–318) | Adds `runner_image` and `runner_image_pull_secret` to ProjectSettings. Selection precedence: ProjectSettings > agent registry image > operator `RUNNER_IMAGE`. Project-level overrides affect only newly created sessions and do not change agent-type-specific settings. |
| Image reference validation, allowlist, and pull credentials `specs/agents/runner-image.spec.md` (lines 178–318) | Defines image reference syntax/host validation, optional registry allowlist via `RUNNER_IMAGE_ALLOWED_REGISTRIES`, and `imagePullSecrets` usage limited to the project namespace with `kubernetes.io/dockerconfigjson` Secret type. ImagePullPolicy: `IfNotPresent` for digests and `localhost/` refs; `Always` for other tags. |
| RBAC and operational constraints `specs/agents/runner-image.spec.md` (lines 178–318) | Requires `project_settings:update` permission to modify `runner_image` or `runner_image_pull_secret`. Changes apply only to subsequently created sessions. |
| Failure modes and session state transitions `specs/agents/runner-image.spec.md` (lines 321–360) | Documents outcomes for failures: AG‑UI health/readiness timeouts → session `Failed`; startup crashes or missing bridge for declared `RUNNER_TYPE` → session error/`Failed`; image pull failures → pull error/backoff and `Failed` as applicable. |
| Security boundary and isolation expectations `specs/agents/runner-image.spec.md` (lines 362–417) | Specifies platform responsibilities: enforce non‑root PodSecurityContext, per‑session ServiceAccount isolation, NetworkPolicy inheritance, and that credentials/tokens must be fetched/cleared by `ambient_runner`; custom images must not bake credentials. |
| Base image publishing, OCI contract label, and conformance tests `specs/agents/runner-image.spec.md` (lines 428–476) | Requires base images to publish a stable versioned base and include OCI label `io.ambient-code.runner-contract-version="1"`. Label mismatches surface a pod-creation warning. Requires a conformance test suite verifying AG‑UI endpoints, filesystem and permission expectations, non‑root runtime, env var non‑overrides, and bridge availability. |
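
The pull-policy rule summarized in the table (`IfNotPresent` for digests and `localhost/` refs, `Always` for other tags) might be expressed as below. This is a sketch under stated assumptions: `pull_policy` is a hypothetical helper name, and treating any `@` in the reference as a digest marker is a simplification rather than the spec's validation logic.

```python
def pull_policy(image_ref: str) -> str:
    """Return a Kubernetes imagePullPolicy for a runner image reference."""
    # Digest-pinned references (name@sha256:...) are immutable, so a
    # cached copy is always the right content.
    if "@" in image_ref:
        return "IfNotPresent"
    # localhost/ references cannot be pulled from a remote registry.
    if image_ref.startswith("localhost/"):
        return "IfNotPresent"
    # Mutable tags may have moved since the node last pulled them.
    return "Always"
```

The example image names used to exercise this helper are placeholders, not images published by the project.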

🚥 Pre-merge checks | ✅ 8

✅ Passed checks (8 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | Title follows Conventional Commits format and clearly describes the main change: a new specification for custom runner images. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Performance And Algorithmic Complexity | ✅ Passed | PR adds only documentation (specs/agents/runner-image.spec.md, 481 lines Markdown). No code, algorithms, or performance-critical patterns present. Check not applicable to specification-only changes. |
| Security And Secret Handling | ✅ Passed | Spec-only PR. Properly defines RBAC, credential isolation (no bundled secrets, per-turn clear), non-root execution, namespace-scoped Secrets. No hardcoded secrets or auth bypasses detected. |
| Kubernetes Resource Safety | ✅ Passed | Documentation spec only. Properly addresses pod security context, RBAC, and namespace-scoped secrets. Check not applicable to specification files. |

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

specs/agents/runner-image.spec.md (2)
93-93: 💤 Low value

Clarify path description to avoid confusion.

The phrase "MUST contain installed ambient_runner package" could be misread to mean the pip package must be installed at /app/ambient-runner, when it actually means this directory contains the application code (main.py) that imports the package installed elsewhere in site-packages.
📝 Clearer phrasing
-| `/app/ambient-runner` | Runner package source and working directory | MUST contain installed `ambient_runner` package |
+| `/app/ambient-runner` | Runner application root and working directory | MUST contain main.py and application code; requires `ambient_runner` package installed via pip |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@specs/agents/runner-image.spec.md` at line 93, The spec line for
`/app/ambient-runner` is ambiguous about where the pip-installed ambient_runner
resides; update the wording so it clearly states that `/app/ambient-runner`
contains the application source (e.g., main.py) which imports the
`ambient_runner` package installed in site-packages, not that the pip package
itself is installed at that path; reference the `/app/ambient-runner` directory,
the application entrypoint `main.py`, and the `ambient_runner` package in the
revised sentence to make this distinction explicit.
461-461: ⚡ Quick win

Consider blocking contract version mismatches by default.

The spec makes version checking advisory-only (CP logs warning but creates pod anyway). However, if a custom image uses contract v2 with breaking changes and the CP expects v1, the session will fail unpredictably at runtime rather than being rejected upfront.
💡 Alternative design

Make blocking the default with operator opt-in for mismatches:
-The CP MAY read this label at pod creation time and log a warning if the contract version does not match the expected version. This is advisory — the CP SHALL NOT block pod creation based on contract version mismatch.
+The CP SHALL read this label at pod creation time. If the contract version does not match the expected version, the CP SHALL transition the session to `Failed` with a condition describing the mismatch UNLESS the operator has set `ALLOW_CONTRACT_VERSION_MISMATCH=true`.
This preserves flexibility for operators who explicitly opt in while preventing accidental incompatibilities.
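
As a rough illustration of the blocking-by-default alternative, the check could look like the sketch below. The label key is the one named in the walkthrough and `ALLOW_CONTRACT_VERSION_MISMATCH` is the flag proposed in this comment; `check_contract_version`, its return shape, and the status strings are hypothetical.

```python
from typing import Mapping, Optional, Tuple

CONTRACT_LABEL = "io.ambient-code.runner-contract-version"


def check_contract_version(labels: Mapping[str, str],
                           expected: str = "1",
                           allow_mismatch: bool = False
                           ) -> Tuple[str, Optional[str]]:
    """Return (status, message): status is "ok", "warn", or "failed"."""
    actual = labels.get(CONTRACT_LABEL)
    if actual == expected:
        return ("ok", None)
    message = (f"runner contract version mismatch: "
               f"image declares {actual!r}, control plane expects {expected!r}")
    # Operator opt-in keeps the advisory behavior: warn, still create the pod.
    if allow_mismatch:
        return ("warn", message)
    # Default: reject up front instead of failing unpredictably at runtime.
    return ("failed", message)
```

Note that in this sketch a missing label is treated the same as a mismatch; whether unlabeled images should be rejected is a separate decision the spec would need to make.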
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@specs/agents/runner-image.spec.md` at line 461, Update the sentence about
contract-version handling so the Control Plane (CP) SHALL by default reject pod
creation on a contract version mismatch instead of merely warning; add a clear
operator-configurable override (e.g., an "allowContractMismatch" opt-in flag)
that, when enabled, permits the previous advisory behavior and logs a warning;
ensure the wording references the "contract version" label and the CP's behavior
at "pod creation" so readers can locate and implement the change.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@specs/agents/runner-image.spec.md`:
- Around line 274-276: Document that ProjectSettings.runner_image can override
image but not agent-type config (RUNNER_TYPE, resource limits, state dir) and
add a Failure Modes entry describing the cryptic Python import error when a
custom image lacks the required bridge implementation (e.g., ClaudeBridge,
GeminiCLIBridge, LangGraphBridge) for the session's runner type; update the
recommendations to advise building custom images FROM the standard base to
inherit all bridges and add a runtime validation step in the session creation
flow (where ProjectSettings.runner_image is applied) that inspects the image or
performs a quick probe to confirm the presence of the required bridge for the
requested RUNNER_TYPE and surface a clear, actionable error if missing.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 34cd54f5-c174-486c-a499-0113c9af9cf5

📥 Commits

Reviewing files that changed from the base of the PR and between 28874a9 and 0add287.

📒 Files selected for processing (1)

specs/agents/runner-image.spec.md

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@specs/agents/runner-image.spec.md`:
- Around line 154-164: The spec enforces a contradictory UID requirement: it
mandates a fixed UID 1001 via Dockerfile `USER 1001` while also recommending
OpenShift arbitrary-UID compatibility (e.g. `chmod -R g=u`), which conflicts
under restrictive SCCs; change the normative contract to require non-root
runtime behavior (`runAsNonRoot: true`, `allowPrivilegeEscalation: false`,
`drop: ["ALL"]` and no root at runtime) and demote `UID 1001`/`Dockerfile USER
1001` to a base-image default or recommendation, keeping the OpenShift
compatibility guidance (`chmod -R g=u` on writable paths) as a SHOULD rather
than a SHALL so implementations can satisfy `runAsNonRoot` without a fixed UID.

📥 Commits

Reviewing files that changed from the base of the PR and between 0add287 and e576f72.

📒 Files selected for processing (1)

specs/agents/runner-image.spec.md

jeremyeder · 2026-05-13T15:43:58Z

+- GIVEN a project with `runner_image` unset
+- AND a session with a specific runner type
+- WHEN the CP provisions the pod
+- THEN the pod uses the image from the agent registry entry for that runner type


This means there can be only one base image for each runner type.

jeremyeder · 2026-05-13T15:44:31Z

+
+The CP SHALL support an operator-level allowlist of permitted registries via `RUNNER_IMAGE_ALLOWED_REGISTRIES` (comma-separated hostnames). When set, images from unlisted registries SHALL be rejected and the session SHALL transition to `Failed` with a descriptive condition.
+
+When the allowlist is unset, the CP SHALL accept any registry. Operators SHOULD configure the allowlist in production deployments.


Should we set a default?
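
For context on that question, the quoted behavior (unset allowlist accepts any registry) could be sketched as follows. This is an assumption-laden illustration: `registry_host` and `is_registry_allowed` are hypothetical helpers, and the host extraction follows the common container-reference convention (a first path component containing `.` or `:`, or equal to `localhost`, is a registry host; otherwise the reference is treated as Docker Hub), which the spec may or may not adopt.

```python
from typing import Optional


def registry_host(image_ref: str) -> str:
    """Extract the registry host from an image reference."""
    first, sep, _ = image_ref.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    # No explicit registry component: assume the Docker Hub default.
    return "docker.io"


def is_registry_allowed(image_ref: str, allowlist: Optional[str]) -> bool:
    """Check an image against a comma-separated allowlist of hostnames."""
    allowed = {h.strip() for h in (allowlist or "").split(",") if h.strip()}
    # Unset or empty allowlist: accept any registry, as the quoted text says.
    if not allowed:
        return True
    return registry_host(image_ref) in allowed
```

A non-empty default would amount to replacing the `return True` branch with a check against an operator-supplied default set.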

jeremyeder · 2026-05-13T15:48:18Z

+| Runner token authentication (AG-UI, CP, gRPC) | `ambient_runner` package | Preserve the package |
+| NetworkPolicy (ingress and egress) | Cluster operator | None — pod inherits namespace policies |
+
+Custom images MUST NOT bundle credentials, tokens, or secrets in the image layers. All credentials SHALL be fetched at runtime via cluster-internal API endpoints as defined in `../security/security.spec.md`.


Security scans like this should be included in the test suite. Prodsec also has skills.

jeremyeder · 2026-05-13T15:49:48Z

A few comments added.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@specs/agents/runner-image.spec.md`:
- Around line 334-347: Add a conformance test that verifies the runner bridge
for each supported RUNNER_TYPE is present and loadable in custom images: for
each RUNNER_TYPE, start a session using a custom image lacking that bridge and
assert the pod logs contain an error naming the missing bridge module, the pod
exits non-zero, and the control plane marks the session Failed; update the
conformance checklist/spec sections referencing RUNNER_TYPE and the "Bridge
Mismatch" requirement so the suite explicitly covers bridge
availability/loadability and log assertion for missing modules.
- Line 86: Update the requirement that "/app MUST exist; writeable by UID 1001"
to be compatible with arbitrary non-root UIDs: replace the hardcoded "UID 1001"
reference with wording that /app must be writeable by the container's effective
non-root UID (for example "writeable by the container's effective non-root UID
or a group writable location when running under an arbitrary UID such as
OpenShift SCC"), and apply the same change to the related statements referenced
around lines 164-167 so the spec consistently allows arbitrary non-root UIDs
instead of a fixed UID.

📥 Commits

Reviewing files that changed from the base of the PR and between 903e3e4 and 0a117ad.

📒 Files selected for processing (1)

specs/agents/runner-image.spec.md

markturansky · 2026-05-13T17:22:10Z

Amber Code Review — PR #1563

PR: Custom Runner Image Specification (specs/agents/runner-image.spec.md)
Scope: Pure spec addition — no production code changed.

Summary

This PR adds a well-researched specification for custom runner image support. The stable contract sections (HTTP endpoints, filesystem layout, environment variables, SecurityContext) are thorough and security-aware. The spec is ready for structural review — the findings below are about gaps that need to be addressed before implementation begins, not errors in the spec's logic.

Findings

Blocker

None.

Critical

1. No feature flag gating mentioned — specs/agents/runner-image.spec.md

CLAUDE.md mandates: "Feature flags strongly recommended: Gate new features behind Unleash flags." This is a new, admin-facing platform capability. The spec defines behavior for ProjectSettings, pod provisioning, and RBAC — but says nothing about gating this behind a flag. Without an explicit gate, implementation PRs may ship this directly to production without a rollout control mechanism.

Standard violated: CLAUDE.md § Feature flags strongly recommended.
Suggested fix: Add a requirement section stating that runner_image support shall be gated behind an Unleash flag (e.g., custom-runner-image), with behavior when flag is disabled (fall through to existing image selection, ignore the field).

2. Missing full-stack consumer enumeration — specs/agents/runner-image.spec.md

CLAUDE.md states: "Auth/credential/API changes must update ALL consumers (backend, CLI, SDK, runner, sidecar) in the same PR." The spec adds runner_image and runner_image_pull_secret to ProjectSettings — but never enumerates which components need to change: backend handler, CLI (acpctl), ambient-sdk, and the frontend UI. Without this, implementation PRs will miss consumers.

Standard violated: CLAUDE.md § Full-stack awareness.
Suggested fix: Add an "Implementation Impact" section that explicitly lists every component that must be updated: backend ProjectSettings handler, operator pod provisioning logic, CLI acpctl project-settings command, Go + Python SDK types, and the frontend Settings page — referencing which files are the entry points.

Major

3. Broken relative path for security spec — line 8

  - `../security/security.spec.md` — Per-session SA isolation, credential boundaries

The actual file is at specs/standards/security/security.spec.md. Using ../security/security.spec.md resolves to specs/agents/../security/security.spec.md = specs/security/security.spec.md, which does not exist.

Suggested fix:

  - `../standards/security/security.spec.md` — Per-session SA isolation, credential boundaries
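
The path arithmetic behind this finding can be checked with a few lines of Python. The directory layout is taken from the finding itself; the variable names are only for illustration.

```python
import posixpath

# Directory containing runner-image.spec.md, relative to the repo root.
spec_dir = "specs/agents"

# The link as written in the spec, and the suggested replacement.
broken_link = "../security/security.spec.md"
fixed_link = "../standards/security/security.spec.md"

# Resolve each relative link against the spec's directory.
broken_target = posixpath.normpath(posixpath.join(spec_dir, broken_link))
fixed_target = posixpath.normpath(posixpath.join(spec_dir, fixed_link))

print(broken_target)  # specs/security/security.spec.md
print(fixed_target)   # specs/standards/security/security.spec.md
```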

4. No API surface defined for the new ProjectSettings fields

The RBAC requirement states users need project_settings:update to modify runner_image, but the spec never defines whether this goes through an existing REST endpoint (PATCH /api/projects/{project}/settings) or a new one. The frontend and SDK cannot be implemented without knowing the API shape.

Suggested fix: Add a section (or a cross-reference to the ambient-model.spec.md) that explicitly shows the API payload shape for the new fields and which existing endpoint handles them.

5. Conformance test suite ownership is unspecified — "Base Image Publishing" section

The spec says "The platform SHALL publish a conformance test suite" but doesn't define: who owns it, where it lives in the repo, when it ships relative to the feature, or what CI workflow executes it. This will be silently dropped during implementation.

Suggested fix: Add: The conformance test suite SHALL live at scripts/runner-conformance/ and SHALL be invoked in CI via the existing component-benchmarks workflow or a dedicated workflow. The test suite SHALL ship in the same milestone as the base image.

Minor

6. Image pull policy table doesn't address cost/predictability tradeoff

The table sets Always for tag-based images (the common case). This means every pod start triggers a registry round-trip. The spec doesn't acknowledge this or recommend digest-pinning for production. Given the existing guidance in the Security Boundary section about building FROM digest, this inconsistency may cause surprise.

Suggested fix: Add a note after the table: "Workspace admins SHOULD pin production images by digest to guarantee reproducibility and avoid unnecessary registry round-trips on pod start."

7. runner_image_pull_secret validation doesn't cover wrong-namespace case

The spec covers: Secret doesn't exist → Failed. But it doesn't address: Secret exists in a different namespace. In multi-tenant clusters, a user could reference a secret name that resolves in a different project's namespace.

Suggested fix: Add: "The CP SHALL only look up runner_image_pull_secret in the project's own namespace. References that resolve to a Secret in a different namespace SHALL be rejected."

Positive Highlights

- The stable runner contract is an excellent design artifact: explicit tables for HTTP endpoints, filesystem paths, and env vars give implementors and custom image authors a clear, unambiguous target.
- Security boundary section cleanly separates what the image is responsible for vs. what the platform enforces externally. This prevents reimplementation anti-patterns.
- Failure mode scenarios are detailed and cover the real failure paths (bridge mismatch, ImagePullBackOff, crash on start) that matter operationally.
- Init hook rejection rationale is well-argued (non-reproducibility, startup latency, NetworkPolicy conflicts, OpenShift SCC).
- The image selection precedence (ProjectSettings → agent registry → operator env var → hardcoded) is clear and extensible.

Recommendations (Prioritized)

1. Add feature flag requirement: block implementation PRs from shipping uncontrolled.
2. Add implementation impact section: backend, operator, CLI, SDK, frontend, with file-level pointers.
3. Fix the broken security spec path: trivial, do now.
4. Define the API shape: PATCH endpoint + payload schema for the new fields.
5. Nail down conformance test suite ownership: file location, CI integration, milestone.
6. Address the pull-secret namespace scoping gap.
7. Add digest-pinning production recommendation to the pull policy table.

Review by Amber — Ambient Code automated review agent

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@specs/agents/runner-image.spec.md`:
- Around line 178-323: The spec adds ProjectSettings.runner_image and
runner_image_pull_secret without gating; add a new Requirement that these fields
are gated by the Unleash flag custom-runner-image: when disabled, PATCH/PUT
attempts to set runner_image or runner_image_pull_secret must be rejected with a
descriptive error (403/validation), and session provisioning logic (where image
selection occurs) must ignore ProjectSettings.runner_image and fall back to
normal precedence; reference ProjectSettings.runner_image,
ProjectSettings.runner_image_pull_secret, the session creation/path that selects
runner image, and the flag name custom-runner-image in the new paragraph.
- Around line 13-21: Add an "Implementation Impact" section to the spec that
enumerates all components that must change for ProjectSettings runner image
support and point to relevant files: list Backend (ProjectSettings handler
including validation, RBAC, registry allowlist check — reference
internal/api/project_settings.go), Operator (image selection precedence and pod
spec generation — reference pkg/operator/session_controller.go), CLI (acpctl
commands to set/get runner_image and runner_image_pull_secret), SDKs (client
library updates for new ProjectSettings fields), and Frontend (UI for workspace
admins to configure custom images); for each bullet include one-line notes on
what to update (e.g., validation rules, RBAC checks, precedence logic, new CLI
flags/commands, SDK model fields, UI form and permissions).
- Around line 454-467: Under the "Requirement: Conformance Test Suite" section
add explicit metadata: declare a repository location (e.g.,
scripts/runner-conformance/ or tests/conformance/runner/) and the exact
command/Make target to run the suite (e.g., make test-conformance
IMAGE=<custom-image>), name the owning team (e.g., platform-agents) responsible
for maintenance and updates, describe the CI integration (e.g., a GitHub Actions
workflow that runs on base-image builds and pull requests touching runner
images), and specify the shipping milestone (e.g., "must ship with the first
base image public release"); place this text directly beneath the "Requirement:
Conformance Test Suite" heading so the spec explicitly covers location,
ownership, CI workflow, and delivery timing.
- Around line 180-184: The spec adds ProjectSettings fields `runner_image` and
`runner_image_pull_secret` but omits the API surface; update the document by
either adding a cross-reference to the ProjectSettings API in
control-plane.spec.md or ambient-model.spec.md or by embedding an explicit API
contract: declare the endpoint (e.g., PATCH /api/projects/{project}/settings),
the JSON payload keys (`runner_image`, `runner_image_pull_secret`), validation
rules (400 for invalid image/ref syntax, 403 for disallowed registry, 404 for
missing secret), and expected responses (200 on success, 400/403/404 as above),
and ensure the ProjectSettings resource definition includes these two string
fields so implementers can locate `runner_image` and `runner_image_pull_secret`.
- Line 9: Update the incorrect relative link string in runner-image.spec.md:
replace the reference `../security/security.spec.md` with the correct path
`../standards/security/security.spec.md` so the spec in specs/agents points to
specs/standards/security/security.spec.md.

📥 Commits

Reviewing files that changed from the base of the PR and between 0a117ad and 22ec809.

📒 Files selected for processing (1)

specs/agents/runner-image.spec.md

coderabbitai · 2026-05-13T17:41:43Z

+  - `runner.spec.md` — Runner runtime, AG-UI protocol, bridge layer
+  - `../control-plane/control-plane.spec.md` — Pod provisioning, image selection, env var injection
+  - `../api/ambient-model.spec.md` — ProjectSettings, Session data model
+  - `../security/security.spec.md` — Per-session SA isolation, credential boundaries


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix the relative path to the security spec.

The path ../security/security.spec.md is incorrect. From specs/agents/, the correct relative path to specs/standards/security/security.spec.md is ../standards/security/security.spec.md.

📝 Proposed fix

-  - `../security/security.spec.md` — Per-session SA isolation, credential boundaries
+  - `../standards/security/security.spec.md` — Per-session SA isolation, credential boundaries

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@specs/agents/runner-image.spec.md` at line 9, Update the incorrect relative link string in runner-image.spec.md: replace the reference `../security/security.spec.md` with the correct path `../standards/security/security.spec.md` so the spec in specs/agents points to specs/standards/security/security.spec.md.

coderabbitai · 2026-05-13T17:41:43Z

+## Purpose
+
+The Ambient Runner ships a single image containing Python, git, Node.js, Go, and several CLI tools. Workspace admins who need additional tools — Terraform, kubectl, language-specific SDKs, internal CLIs — have no supported extension path short of forking the image.
+
+This spec defines a **stable runner contract** (the set of filesystem paths, HTTP endpoints, environment variables, and security constraints that custom images must preserve), a **Dockerfile FROM extension model** (users layer tools onto a published base image), and a **ProjectSettings-driven image override** (workspace admins declare a custom image per project).
+
+The extension model is Dockerfile FROM only. Init hooks (scripts run at pod startup) were rejected: they are non-reproducible across pods, add startup latency, require runtime network egress that conflicts with NetworkPolicy isolation, and create OpenShift SCC conflicts when installing system packages.
+
+This spec covers only the **image boundary** — what must be true about a container image for the platform to run it as a runner. Runner internals (bridge layer, gRPC transport, credential management) are defined in `runner.spec.md`. Pod provisioning mechanics are defined in `control-plane.spec.md`.


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add an Implementation Impact section listing all affected components.

The spec adds ProjectSettings fields but does not enumerate which components must be modified. Per reviewer feedback, this blocks implementation planning.

Add a section (e.g., after Purpose or before Stable Runner Contract) listing:

Backend: ProjectSettings handler (validation, RBAC, registry allowlist check)

Operator: Image selection precedence logic, pod spec generation

CLI (acpctl): Commands to set/get runner_image and runner_image_pull_secret

SDKs: Client library updates for new ProjectSettings fields

Frontend: UI for workspace admins to configure custom images

Include file-level pointers where applicable (e.g., pkg/operator/session_controller.go, internal/api/project_settings.go).
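To make the operator and backend items more concrete, here is a rough sketch of the selection logic, assuming the precedence stated in the PR summary (ProjectSettings > agent registry > operator default). The function names, the naive registry parsing, and the choice to raise on a disallowed registry are all assumptions for illustration; the spec may place validation elsewhere (for example, only in the backend handler).

```python
from typing import Iterable, Optional


def registry_of(image: str) -> str:
    """Naive parse: treat everything before the first '/' as the registry host.

    Assumption for illustration only; real image references (e.g. those
    without an explicit host) need a proper parser.
    """
    return image.split("/", 1)[0]


def is_allowed(image: str, allowed_registries: Iterable[str]) -> bool:
    """Allowlist check of the kind the backend item above describes."""
    return registry_of(image) in set(allowed_registries)


def select_runner_image(
    project_settings_image: Optional[str],
    agent_registry_image: Optional[str],
    operator_default: str,
    allowed_registries: Iterable[str],
) -> str:
    """Pick the runner image: ProjectSettings > agent registry > operator default."""
    if project_settings_image:
        if not is_allowed(project_settings_image, allowed_registries):
            # Rejecting here is an assumption; the spec defines the actual failure mode.
            raise ValueError(f"registry not allowed: {project_settings_image}")
        return project_settings_image
    if agent_registry_image:
        return agent_registry_image
    return operator_default


# Hypothetical image names, used only to exercise the function.
chosen = select_runner_image(
    "quay.io/acme/runner-terraform:1",
    "quay.io/example/runner:base",
    "quay.io/example/runner:default",
    allowed_registries=["quay.io"],
)
print(chosen)
```

Keeping the allowlist check separate from the precedence function mirrors the split in the list above, where validation is a backend concern and precedence is an operator concern.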

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@specs/agents/runner-image.spec.md` around lines 13 - 21, Add an "Implementation Impact" section to the spec that enumerates all components that must change for ProjectSettings runner image support and point to relevant files: list Backend (ProjectSettings handler including validation, RBAC, registry allowlist check — reference internal/api/project_settings.go), Operator (image selection precedence and pod spec generation — reference pkg/operator/session_controller.go), CLI (acpctl commands to set/get runner_image and runner_image_pull_secret), SDKs (client library updates for new ProjectSettings fields), and Frontend (UI for workspace admins to configure custom images); for each bullet include one-line notes on what to update (e.g., validation rules, RBAC checks, precedence logic, new CLI flags/commands, SDK model fields, UI form and permissions).

coderabbitai · 2026-05-13T17:41:44Z

+### Requirement: Conformance Test Suite
+
+The platform SHALL publish a conformance test suite that validates a custom runner image against the stable contract. The test suite SHALL verify:
+
+- AG-UI endpoints respond correctly (`/health`, `/capabilities`, `/`)
+- Required filesystem paths exist and are writeable
+- The runner process starts within the expected timeout
+- The runner runs as a non-root user
+- CP-injected environment variables are not overridden by the image
+- Required bridge implementations are present and loadable for each supported `RUNNER_TYPE`
+
+The test suite SHALL produce a pass/fail result suitable for CI/CD integration.
+
+The test suite SHOULD include security checks: non-root user verification, no SUID binaries, and base image provenance validation. Operators MAY extend the suite with additional security scanning (vulnerability scanning, SBOM generation) using their existing tooling.
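A few of the checks listed in this requirement can be sketched as small functions with a pass/fail exit code. This is only an illustration under stated assumptions: the function names, the protected variable name in the usage, and the directory checked are hypothetical, and the endpoint and startup-timeout checks are omitted because the port and timeout are not given in this excerpt.

```python
import os
import tempfile
from typing import Callable, Dict, Iterable, List, Mapping


def path_is_writable(path: str) -> bool:
    """True if `path` is an existing directory in which a file can be created."""
    if not os.path.isdir(path):
        return False
    try:
        with tempfile.TemporaryFile(dir=path):
            return True
    except OSError:
        return False


def runs_as_non_root(uid: int) -> bool:
    """True if the given effective UID is not root (UID 0)."""
    return uid != 0


def overridden_vars(image_env: Mapping[str, str], protected: Iterable[str]) -> List[str]:
    """Names of protected (CP-injected) variables that the image itself sets."""
    return sorted(name for name in protected if name in image_env)


def run_suite(checks: Dict[str, Callable[[], bool]]) -> int:
    """Run every check, print PASS/FAIL per check, return a CI-style exit code."""
    failed = False
    for name, check in checks.items():
        ok = check()
        print(f"{'PASS' if ok else 'FAIL'} {name}")
        failed = failed or not ok
    return 1 if failed else 0


# Hypothetical usage: a temp directory stands in for a required path, and
# EXAMPLE_CP_VAR stands in for a CP-injected variable name.
workdir = tempfile.mkdtemp()
exit_code = run_suite({
    "required path writable": lambda: path_is_writable(workdir),
    "non-root user": lambda: runs_as_non_root(1001),
    "no CP env overrides": lambda: overridden_vars({"PATH": "/usr/bin"}, ["EXAMPLE_CP_VAR"]) == [],
})
```

Returning 0 or 1 from `run_suite` is one simple way to satisfy the "pass/fail result suitable for CI/CD integration" wording; a real suite would pass `os.geteuid()` from inside the container instead of a fixed UID.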


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Specify conformance test suite location, ownership, and CI integration.

The spec requires a conformance test suite but does not define where it lives, who owns it, how it's invoked, or when it ships.

Add details for:

Location: Repository path (e.g., scripts/runner-conformance/ or tests/conformance/runner/)

Ownership: Team responsible for maintenance

CI workflow: How the suite is run (GitHub Actions workflow, manual script)

Shipping milestone: When the suite must be available (e.g., "ships with first base image release")

📋 Example addition

The conformance test suite SHALL be published at `scripts/runner-conformance/` and maintained by the platform-agents team. The suite SHALL be runnable via `make test-conformance IMAGE=<custom-image>` and integrated into CI for base image builds. The suite MUST ship before the first public release of the base image.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@specs/agents/runner-image.spec.md` around lines 454 - 467, Under the "Requirement: Conformance Test Suite" section add explicit metadata: declare a repository location (e.g., scripts/runner-conformance/ or tests/conformance/runner/) and the exact command/Make target to run the suite (e.g., make test-conformance IMAGE=<custom-image>), name the owning team (e.g., platform-agents) responsible for maintenance and updates, describe the CI integration (e.g., a GitHub Actions workflow that runs on base-image builds and pull requests touching runner images), and specify the shipping milestone (e.g., "must ship with the first base image public release"); place this text directly beneath the "Requirement: Conformance Test Suite" heading so the spec explicitly covers location, ownership, CI workflow, and delivery timing.

This is implementation-specific detail, not really something for the spec.

coderabbitai · 2026-05-13T17:41:46Z

CodeRabbit chat interactions are restricted to organization members for this repository. Ask an organization member to interact with CodeRabbit, or set chat.allow_non_org_members: true in your configuration.

Define the stable runner contract and a ProjectSettings-driven image override so workspace admins can layer tools onto the base runner via Dockerfile FROM. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Address PR feedback:
- Add venv guidance to Python Runtime Contract
- Add agent registry inline explanation
- Clarify runner type / custom image relationship
- Add network hardening forward-looking note
- Add Conformance Test Suite requirement with security checks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot reviewed May 12, 2026

Comment thread specs/agents/runner-image.spec.md Outdated

jbpratt force-pushed the spec/custom-runner-image branch from b49e8eb to 2308ab4 Compare May 12, 2026 13:50

coderabbitai Bot reviewed May 12, 2026

Comment thread specs/agents/runner-image.spec.md Outdated

jbpratt force-pushed the spec/custom-runner-image branch 2 times, most recently from 903e3e4 to ac61fed Compare May 13, 2026 08:02