feat: add OpenShift deployment templates #1642

Merged: mergify[bot] merged 1 commit into main from feat/add-openshift-templates on Jun 3, 2026

@maknop (Collaborator) commented Jun 3, 2026

Add OpenShift Template resources for deploying the Ambient Code Platform via AppSRE tooling. Includes operator and services templates with parameterized configuration for image tags and environment settings.

Templates include:

  • template-operator.yaml: CRDs, operator deployment, RBAC, and runner configuration
  • template-services.yaml: Backend, frontend, OAuth proxy, and ingress configuration
  • validate.sh: Template validation script using oc process

Summary by CodeRabbit

Release Notes

  • New Features

    • Operator template now includes AgenticSession and ProjectSettings custom resource types with complete configuration schemas for managing sessions and project settings.
    • Services template provides full Ambient Code Platform deployment including API server, backend, frontend, and public API gateway with proper networking and security policies.
  • Chores

    • Added validation script for platform templates.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

netlify Bot commented Jun 3, 2026

Deploy Preview for cheerful-kitten-f556a0 canceled.

Name Link
🔨 Latest commit 9764ed0
🔍 Latest deploy log https://app.netlify.com/projects/cheerful-kitten-f556a0/deploys/6a2078fb2b263f00083d319c

coderabbitai Bot (Contributor) commented Jun 3, 2026

Walkthrough

Two new OpenShift Templates provision the Ambient Code Platform: ambient-code-operator defines operator infrastructure (CRDs for AgenticSession and ProjectSettings, RBAC, ConfigMaps, NetworkPolicy, and agentic-operator Deployment); ambient-code-services deploys the service stack (Namespace, API server, backend, frontend, public API with routes and TLS). A validation script confirms both templates process correctly.

Changes

Operator and Services Platform Templates

| Layer / File(s) | Summary |
|---|---|
| Operator CRD contracts: components/manifests/templates/template-operator.yaml (lines 1–344) | AgenticSession and ProjectSettings CRDs with full OpenAPI v3 schemas defining spec/status fields, validation rules (including singleton enforcement), and lifecycle semantics for session and project configuration. |
| Operator RBAC and configuration: components/manifests/templates/template-operator.yaml (lines 344–1287) | Service accounts, ClusterRoles (operator, admin, project, backend, MLflow auth), ClusterRoleBindings, and ConfigMaps for model registry, feature flags, auth ACLs, and operator environment settings. |
| Operator deployment and networking: components/manifests/templates/template-operator.yaml (lines 1288–1456) | Agentic-operator Deployment with image wiring, environment configuration, probes, and resource limits. Namespace-scoped NetworkPolicy restricting backend pod ingress from runner-labeled pods. |
| Services base infrastructure: components/manifests/templates/template-services.yaml (lines 1–195) | Template parameters (image tags, namespace, upstream timeout), target Namespace, API server Secret, Service definitions for API/backend/frontend/public, LimitRange, and persistent volume claims for database and state. |
| Ambient API server deployment: components/manifests/templates/template-services.yaml (lines 196–339) | Deployment with command-line args, init container for DB migrations, JWT secret mounting, HTTP/GRPC port exposure, security configuration, and health probes. |
| Backend API deployment: components/manifests/templates/template-services.yaml (lines 340–617) | Deployment with extensive environment configuration (OAuth, GitHub, Google, LDAP, Unleash, operator settings), PVC-mounted state, optional read-only config volumes (models, flags, registry), and health checks. |
| Frontend and public API deployments: components/manifests/templates/template-services.yaml (lines 618–824) | Frontend Deployment with oauth-proxy sidecar for OpenShift TLS/auth, explicit proxy args, and health probes. Public-api Deployment with hardened security context, rate limiting, and PodDisruptionBudget. |
| OpenShift Routes: components/manifests/templates/template-services.yaml (lines 825–914) | Routes for ambient API (REST and GRPC), backend, frontend, and public API with TLS termination policies and HAProxy routing annotations. |
| Template validation script: components/manifests/templates/validate.sh | Bash script validating both templates via oc process with fixed parameters, reporting per-template status and failing fast on errors. |

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Security And Secret Handling | ❌ Error | Secrets/PVCs missing ownerReferences; hardcoded jsell-ambient-sso-poc in ClusterRoleBindings (lines 1197, 1212, 1239) create privilege leaks. | Add ownerReferences to Secret and 3 PVCs. Remove hardcoded jsell-ambient-sso-poc subjects from ClusterRoleBindings. |
| Kubernetes Resource Safety | ⚠️ Warning | Missing OwnerReferences on Secrets/PVCs; backend-api ClusterRole overly permissive; Deployments lack pod securityContext; hardcoded jsell-ambient-sso-poc subjects. | Add ownerReferences to Secrets/PVCs; scope backend-api RBAC to namespace Roles only; add pod/container securityContext with runAsNonRoot and drop ALL; remove jsell-ambient-sso-poc subjects. |

✅ Passed checks (6 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | Title follows Conventional Commits format (feat: scope) and accurately describes the main change: adding OpenShift deployment templates. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Performance And Algorithmic Complexity | ✅ Passed | Kubernetes manifests only. No O(n²) algorithms, N+1 patterns, unbounded caches, missing pagination, or expensive loops detected. Resource limits and connection pooling properly configured. |

coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 7

🧹 Nitpick comments (1)
components/manifests/templates/validate.sh (1)

1-16: 💤 Low value

LGTM! Script correctly validates both templates with proper error handling.

Optional improvement: Consider adding command -v oc >/dev/null 2>&1 || { echo "oc command not found"; exit 1; } after line 4 for explicit feedback if OpenShift CLI is missing, though the default error is clear enough.
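The optional guard suggested above could be sketched roughly as follows. The actual contents of validate.sh are not shown in this review, and the helper name `require_cmd` is hypothetical, so this only illustrates the check itself rather than the script as committed.

```shell
#!/bin/bash
set -e

# Hypothetical helper: fail early with a clear message when a CLI is missing.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "$1 command not found" >&2
    return 1
  fi
}

# Intended use near the top of a validation script, before any oc process call:
#   require_cmd oc || exit 1
```

Wrapping the check in a function keeps the message format in one place if other tools ever need the same guard.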

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/manifests/templates/validate.sh` around lines 1 - 16, Add an
explicit check for the OpenShift CLI in validate.sh by testing for the presence
of the oc binary immediately after the set -e line; if command -v oc fails,
print a clear "oc command not found" message and exit 1 so the later oc process
calls (which validate template-operator.yaml and template-services.yaml) never
run without the CLI available.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@components/manifests/templates/template-operator.yaml`:
- Around line 973-1162: The ClusterRole named "backend-api" grants broad
cluster-wide permissions; split it into a minimal ClusterRole containing only
truly cluster-scoped verbs/resources (e.g., clusterroles binding, tokenreviews,
subjectaccessreviews, project-level reads) and move namespace-scoped permissions
(namespaces, secrets, serviceaccounts, roles/rolebindings, jobs, pods, services,
cronjobs, configmaps, pvcs, mlflow CRDs) into a separate Role (e.g.,
"backend-api-namespace") that is applied per-target-namespace and bound via
RoleBinding to the backend service account; update the ClusterRoleBinding that
currently binds the "backend-api" ClusterRole to only reference the new minimal
ClusterRole and create RoleBindings for each namespace the backend manages to
reference the namespaced Role so the service account gets least-privilege access
only inside those namespaces.
- Around line 1320-1446: The Deployment is missing pod and container
SecurityContext hardening for the agentic-operator container; add a pod-level
securityContext (e.g., runAsNonRoot: true, runAsUser: 1000) and a
container-level securityContext on the agentic-operator container with
runAsNonRoot: true, readOnlyRootFilesystem: true, allowPrivilegeEscalation:
false and capabilities.drop: ["ALL"] (and optionally set runAsUser/uid) to
enforce the repo policy that all containers run non-root, drop all capabilities,
and have a read-only root filesystem; place these settings under the spec ->
securityContext for the pod and under the container entry for the container
named agentic-operator.
- Around line 1184-1239: The ClusterRoleBinding entries agentic-operator,
ambient-frontend-auth, and backend-api include hardcoded subjects referencing
the jsell-ambient-sso-poc namespace (ambient-users-can-list-projects binds only
the system:authenticated group and needs no change); remove those hardcoded
ServiceAccount subject entries (the "- kind: ServiceAccount / name: ... /
namespace: jsell-ambient-sso-poc" entries) so the bindings no longer grant
privileges to that unrelated namespace, leaving only the intended subjects (or
replace with a parameterized/templated namespace variable if a namespace must be
bound dynamically).

In `@components/manifests/templates/template-services.yaml`:
- Around line 52-55: The Route is configured with termination: reencrypt but the
ambient-api-server Deployment runs with --enable-https=false and does not mount
the ambient-api-server-tls secret, so the router will attempt TLS to an HTTP
port; fix by either (A) enable HTTPS in the ambient-api-server Deployment (set
--enable-https=true, change --api-server-bindaddress to :8443 or desired HTTPS
port, mount the ambient-api-server-tls secret into the pod and update Service
targetPort to that HTTPS port) or (B) change the Route annotations/termination
for ambient-api-server and ambient-api-server-grpc from reencrypt to
passthrough/edge so no backend TLS is expected; update the Service/Route
targetPort and any health checks accordingly to match the chosen approach
(referencing ambient-api-server, ambient-api-server-tls, --enable-https, and
--api-server-bindaddress).
- Around line 219-324: Harden the pod and containers by adding a pod-level
securityContext and per-container securityContexts: for the api-server container
(name: api-server) set runAsNonRoot: true and make readOnlyRootFilesystem: true
(currently false) while keeping allowPrivilegeEscalation: false and
capabilities.drop: [ALL]; for the migration initContainer (name: migration) add
runAsNonRoot: true, set readOnlyRootFilesystem: true, and add resources.requests
and resources.limits (match sensible values used by api-server or cluster
policy); also add equivalent securityContext entries (runAsNonRoot: true,
readOnlyRootFilesystem: true, allowPrivilegeEscalation: false,
capabilities.drop: [ALL]) for other containers referenced in this template
(backend-api, frontend, oauth-proxy) and add a pod-level securityContext to
enforce non-root UID defaults across the pod.
- Around line 23-26: The template declares parameters NAMESPACE and
UPSTREAM_TIMEOUT but resources still hardcode "ambient-code" and the oauth-proxy
flag uses "--upstream-timeout=5m"; update the manifest references to use the
template parameter placeholders ${NAMESPACE} and ${UPSTREAM_TIMEOUT} instead of
the hardcoded values (e.g., replace occurrences of ambient-code with
${NAMESPACE} and replace the oauth-proxy arg --upstream-timeout=5m with
--upstream-timeout=${UPSTREAM_TIMEOUT}) so parameter overrides will take effect
when the template is processed.
- Around line 39-48: The Secret "ambient-api-server" and the PVCs
"ambient-api-server-db-data", "backend-state-pvc", and "postgresql-data" in
template-services.yaml are missing metadata.ownerReferences; add a
controller-style ownerReference for each emitted child resource so Kubernetes
can garbage-collect them with their parent. Update the resources'
metadata.ownerReferences to include the parent's apiVersion, kind, name and uid
and set controller: true and blockOwnerDeletion as appropriate (use the same
parent resource used by the template that owns these objects), ensuring the
ownerReference fields are populated for manifest generation in
template-services.yaml.
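The NAMESPACE and UPSTREAM_TIMEOUT finding in the list above (parameters declared but never referenced) is the kind of issue a small text check can catch. The sketch below is an assumption-laden illustration: the function name `unused_params` is hypothetical, and it assumes parameters are declared as `- name: UPPER_CASE` entries under a top-level `parameters:` key.

```shell
#!/bin/bash
# Hypothetical helper: list template parameters that are declared under the
# top-level "parameters:" key but never referenced as ${NAME} in the same file.
unused_params() {
  local file="$1" name pattern
  awk '
    /^parameters:/        { in_params = 1; next }   # enter the parameters block
    /^[^[:space:]#-]/     { in_params = 0 }         # any other top-level key ends it
    in_params && /^[[:space:]]*- name:/ { print $3 }
  ' "$file" | sort -u | while read -r name; do
    pattern="\${${name}}"                            # literal text such as ${NAMESPACE}
    grep -qF -- "$pattern" "$file" || echo "$name"
  done
}
```

Running `unused_params components/manifests/templates/template-services.yaml` would then print one parameter name per line for anything that is declared but unused. Environment variables named under `env:` are ignored because only the `parameters:` block is scanned.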

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f3e510bb-e06f-4743-aac6-c6238d1df2c8

📥 Commits

Reviewing files that changed from the base of the PR and between 754ab13 and 9764ed0.

📒 Files selected for processing (3)
  • components/manifests/templates/template-operator.yaml
  • components/manifests/templates/template-services.yaml
  • components/manifests/templates/validate.sh

Comment on lines +973 to +1162
  kind: ClusterRole
  metadata:
    name: backend-api
  rules:
  - apiGroups:
    - vteam.ambient-code
    resources:
    - agenticsessions
    verbs:
    - get
    - list
    - watch
    - create
    - update
    - patch
    - delete
  - apiGroups:
    - vteam.ambient-code
    resources:
    - agenticsessions/status
    verbs:
    - get
    - update
    - patch
  - apiGroups:
    - ''
    resources:
    - serviceaccounts
    verbs:
    - get
    - list
    - create
    - update
    - patch
  - apiGroups:
    - ''
    resources:
    - serviceaccounts/token
    verbs:
    - create
  - apiGroups:
    - authentication.k8s.io
    resources:
    - tokenreviews
    verbs:
    - create
  - apiGroups:
    - rbac.authorization.k8s.io
    resources:
    - roles
    - rolebindings
    verbs:
    - get
    - list
    - create
    - update
    - patch
    - delete
  - apiGroups:
    - rbac.authorization.k8s.io
    resourceNames:
    - ambient-project-admin
    - ambient-project-edit
    - ambient-project-view
    resources:
    - clusterroles
    verbs:
    - bind
  - apiGroups:
    - ''
    resources:
    - secrets
    verbs:
    - get
    - list
    - create
    - update
    - patch
    - delete
  - apiGroups:
    - ''
    resources:
    - configmaps
    verbs:
    - get
    - create
    - update
    - patch
  - apiGroups:
    - ''
    resources:
    - namespaces
    verbs:
    - get
    - list
    - create
    - update
    - patch
    - delete
  - apiGroups:
    - project.openshift.io
    resources:
    - projects
    verbs:
    - get
    - list
    - watch
    - update
    - patch
  - apiGroups:
    - batch
    resources:
    - jobs
    verbs:
    - get
    - list
    - watch
    - create
    - delete
  - apiGroups:
    - batch
    resources:
    - cronjobs
    verbs:
    - get
    - list
    - watch
    - create
    - update
    - patch
    - delete
  - apiGroups:
    - ''
    resources:
    - pods
    verbs:
    - get
    - list
    - watch
    - create
    - delete
    - deletecollection
  - apiGroups:
    - ''
    resources:
    - pods/log
    verbs:
    - get
  - apiGroups:
    - ''
    resources:
    - persistentvolumeclaims
    verbs:
    - get
    - list
    - watch
  - apiGroups:
    - ''
    resources:
    - services
    verbs:
    - get
    - list
    - create
    - delete
  - apiGroups:
    - authorization.k8s.io
    resources:
    - subjectaccessreviews
    - selfsubjectaccessreviews
    verbs:
    - create
  # MLflow experiment tracking permissions (namespace-scoped CRDs)
  # Required to grant MLflow access to session runner service accounts
  # Sessions need these permissions to log experiments/runs to MLflow tracking server
  # Uses Kubeflow MLflow Operator CRDs for Kubernetes-native RBAC authentication
  - apiGroups:
    - mlflow.kubeflow.org
    resources:
    - experiments
    - runs
    verbs:
    - get
    - list
    - watch
    - create
    - update
    - patch
    - delete
- apiVersion: rbac.authorization.k8s.io/v1

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Scope the backend-api RBAC down before binding it cluster-wide.

This ClusterRole can create/update/delete namespaces, secrets, serviceaccounts, roles, rolebindings, jobs, pods, services, and cronjobs, and the ClusterRoleBinding at Lines 1226-1239 applies that across the cluster. If the backend service account is compromised, this is enough to mutate workloads and credentials well outside Ambient-managed projects.

A safer split is: keep only the minimal cluster-scoped bootstrap verbs in a cluster role, and move namespace-local resource management into roles that are bound only inside the namespaces the backend owns.

As per coding guidelines, "RBAC must follow least-privilege."

Also applies to: 1226-1239

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/manifests/templates/template-operator.yaml` around lines 973 -
1162, The ClusterRole named "backend-api" grants broad cluster-wide permissions;
split it into a minimal ClusterRole containing only truly cluster-scoped
verbs/resources (e.g., clusterroles binding, tokenreviews, subjectaccessreviews,
project-level reads) and move namespace-scoped permissions (namespaces, secrets,
serviceaccounts, roles/rolebindings, jobs, pods, services, cronjobs, configmaps,
pvcs, mlflow CRDs) into a separate Role (e.g., "backend-api-namespace") that is
applied per-target-namespace and bound via RoleBinding to the backend service
account; update the ClusterRoleBinding that currently binds the "backend-api"
ClusterRole to only reference the new minimal ClusterRole and create
RoleBindings for each namespace the backend manages to reference the namespaced
Role so the service account gets least-privilege access only inside those
namespaces.
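One way to picture the suggested split is a namespaced Role and RoleBinding rendered once per managed namespace. The sketch below is illustrative only: the Role name `backend-api-namespace` comes from the prompt above, while the `render_backend_role` helper, the chosen rule subset, and the default service account namespace are assumptions rather than the project's final RBAC.

```shell
#!/bin/bash
# Hypothetical helper: print a namespaced Role and RoleBinding for one target
# namespace. The rules are an illustrative subset, not the full permission set.
render_backend_role() {
  local target_ns="$1" sa_ns="${2:-ambient-code}"
  cat <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: backend-api-namespace
  namespace: ${target_ns}
rules:
- apiGroups: [""]
  resources: ["secrets", "serviceaccounts", "pods", "services"]
  verbs: ["get", "list", "create", "delete"]
- apiGroups: ["batch"]
  resources: ["jobs", "cronjobs"]
  verbs: ["get", "list", "watch", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: backend-api-namespace
  namespace: ${target_ns}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: backend-api-namespace
subjects:
- kind: ServiceAccount
  name: backend-api
  namespace: ${sa_ns}
EOF
}
```

The output could be piped to `oc apply -f -` for each namespace the backend manages, so the service account holds these verbs only inside those namespaces.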

Comment on lines +1184 to +1239
  kind: ClusterRoleBinding
  metadata:
    name: agentic-operator
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: agentic-operator
  subjects:
  - kind: ServiceAccount
    name: agentic-operator
    namespace: ambient-code
  - kind: ServiceAccount
    name: agentic-operator
    namespace: jsell-ambient-sso-poc
- apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    name: ambient-frontend-auth
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: ambient-frontend-auth
  subjects:
  - kind: ServiceAccount
    name: frontend
    namespace: ambient-code
  - kind: ServiceAccount
    name: frontend
    namespace: jsell-ambient-sso-poc
- apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    name: ambient-users-can-list-projects
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: ambient-namespace-viewer
  subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: system:authenticated
- apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    name: backend-api
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: backend-api
  subjects:
  - kind: ServiceAccount
    name: backend-api
    namespace: ambient-code
  - kind: ServiceAccount
    name: backend-api
    namespace: jsell-ambient-sso-poc

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove the hardcoded jsell-ambient-sso-poc subjects from these ClusterRoleBindings.

Lines 1197, 1212, and 1239 grant operator/frontend/backend privileges to service accounts in a specific unrelated namespace. Applying this template anywhere else will still pre-authorize that namespace if it exists later, which is a real privilege leak from a deployment artifact.

Suggested cleanup
   subjects:
   - kind: ServiceAccount
     name: agentic-operator
     namespace: ambient-code
-  - kind: ServiceAccount
-    name: agentic-operator
-    namespace: jsell-ambient-sso-poc
@@
   subjects:
   - kind: ServiceAccount
     name: frontend
     namespace: ambient-code
-  - kind: ServiceAccount
-    name: frontend
-    namespace: jsell-ambient-sso-poc
@@
   subjects:
   - kind: ServiceAccount
     name: backend-api
     namespace: ambient-code
-  - kind: ServiceAccount
-    name: backend-api
-    namespace: jsell-ambient-sso-poc
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/manifests/templates/template-operator.yaml` around lines 1184 -
1239, The ClusterRoleBinding entries agentic-operator, ambient-frontend-auth,
and backend-api include hardcoded subjects referencing the
jsell-ambient-sso-poc namespace (ambient-users-can-list-projects binds only the
system:authenticated group and needs no change); remove those hardcoded
ServiceAccount subject entries (the "- kind: ServiceAccount / name: ... /
namespace: jsell-ambient-sso-poc" entries) so the bindings no longer grant
privileges to that unrelated namespace, leaving only the intended subjects (or
replace with a parameterized/templated namespace variable if a namespace must be
bound dynamically).
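A leftover namespace like the one flagged above can be caught with a simple text scan over the rendered manifest. The following is a minimal sketch under stated assumptions: the function name `unexpected_namespaces` is hypothetical, and because it is purely text-based it reports every `namespace:` field that differs from the expected value, not only ClusterRoleBinding subjects.

```shell
#!/bin/bash
# Hypothetical check: read a manifest on stdin and print each distinct
# "namespace:" value that differs from the expected namespace.
unexpected_namespaces() {
  local expected="$1"
  awk -v expected="$expected" '
    $1 == "namespace:" && $2 != expected { print $2 }
  ' | sort -u
}
```

Something like `oc process --local -f template-operator.yaml -o yaml | unexpected_namespaces ambient-code` could feed it, with any required template parameters supplied through `-p`. Empty output would mean no foreign namespace was found.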

Comment on lines +1320 to +1446
      spec:
        containers:
        - args:
          - --max-concurrent-reconciles=10
          - --health-probe-bind-address=:8081
          - --leader-elect=false
          env:
          - name: AMBIENT_CODE_RUNNER_IMAGE
            value: ${IMAGE_AMBIENT_RUNNER}:${IMAGE_TAG}
          - name: MAX_CONCURRENT_RECONCILES
            value: '10'
          - name: NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          - name: BACKEND_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          - name: BACKEND_API_URL
            value: http://backend-service:8080/api
          - name: IMAGE_PULL_POLICY
            value: IfNotPresent
          - name: USE_VERTEX
            valueFrom:
              configMapKeyRef:
                key: USE_VERTEX
                name: operator-config
                optional: true
          - name: CLOUD_ML_REGION
            valueFrom:
              configMapKeyRef:
                key: CLOUD_ML_REGION
                name: operator-config
                optional: true
          - name: ANTHROPIC_VERTEX_PROJECT_ID
            valueFrom:
              configMapKeyRef:
                key: ANTHROPIC_VERTEX_PROJECT_ID
                name: operator-config
                optional: true
          - name: GOOGLE_APPLICATION_CREDENTIALS
            valueFrom:
              configMapKeyRef:
                key: GOOGLE_APPLICATION_CREDENTIALS
                name: operator-config
                optional: true
          - name: LANGFUSE_ENABLED
            valueFrom:
              secretKeyRef:
                key: LANGFUSE_ENABLED
                name: ambient-admin-langfuse-secret
                optional: true
          - name: LANGFUSE_HOST
            valueFrom:
              secretKeyRef:
                key: LANGFUSE_HOST
                name: ambient-admin-langfuse-secret
                optional: true
          - name: LANGFUSE_PUBLIC_KEY
            valueFrom:
              secretKeyRef:
                key: LANGFUSE_PUBLIC_KEY
                name: ambient-admin-langfuse-secret
                optional: true
          - name: LANGFUSE_SECRET_KEY
            valueFrom:
              secretKeyRef:
                key: LANGFUSE_SECRET_KEY
                name: ambient-admin-langfuse-secret
                optional: true
          - name: GOOGLE_OAUTH_CLIENT_ID
            valueFrom:
              secretKeyRef:
                key: GOOGLE_OAUTH_CLIENT_ID
                name: google-workflow-app-secret
                optional: true
          - name: GOOGLE_OAUTH_CLIENT_SECRET
            valueFrom:
              secretKeyRef:
                key: GOOGLE_OAUTH_CLIENT_SECRET
                name: google-workflow-app-secret
                optional: true
          - name: STATE_SYNC_IMAGE
            value: quay.io/ambient_code/vteam_state_sync:latest
          - name: S3_ENDPOINT
            value: http://minio.ambient-code.svc:9000
          - name: S3_BUCKET
            value: ambient-sessions
          - name: DEPLOYMENT_ENV
            value: production
          - name: VERSION
            value: latest
          image: ${IMAGE_OPERATOR}:${IMAGE_TAG}
          imagePullPolicy: Always
          livenessProbe:
            httpGet:
              path: /healthz
              port: health
            initialDelaySeconds: 15
            periodSeconds: 20
          name: agentic-operator
          ports:
          - containerPort: 8081
            name: health
            protocol: TCP
          readinessProbe:
            httpGet:
              path: /readyz
              port: health
            initialDelaySeconds: 5
            periodSeconds: 10
          resources:
            limits:
              cpu: 2
              memory: 4Gi
            requests:
              cpu: 100m
              memory: 512Mi
          volumeMounts:
          - mountPath: /config/models
            name: model-manifest
            readOnly: true
          - mountPath: /config/registry
            name: agent-registry
            readOnly: true
        restartPolicy: Always

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add restricted pod/container security contexts to the operator Deployment.

The operator pod has limits/requests, but no pod securityContext and no container hardening. On OpenShift that still leaves capabilities/read-only FS unconstrained by this manifest, which violates the repo policy and weakens the default security posture.

Suggested hardening
       spec:
+        securityContext:
+          runAsNonRoot: true
+          seccompProfile:
+            type: RuntimeDefault
         containers:
         - args:
           - --max-concurrent-reconciles=10
@@
           image: ${IMAGE_OPERATOR}:${IMAGE_TAG}
           imagePullPolicy: Always
+          securityContext:
+            allowPrivilegeEscalation: false
+            capabilities:
+              drop:
+              - ALL
+            readOnlyRootFilesystem: true

As per coding guidelines, "All containers must have restricted SecurityContext: runAsNonRoot, drop ALL capabilities, readOnlyRootFilesystem".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/manifests/templates/template-operator.yaml` around lines 1320 -
1446, The Deployment is missing pod and container SecurityContext hardening for
the agentic-operator container; add a pod-level securityContext (e.g.,
runAsNonRoot: true, runAsUser: 1000) and a container-level securityContext on
the agentic-operator container with runAsNonRoot: true, readOnlyRootFilesystem:
true, allowPrivilegeEscalation: false and capabilities.drop: ["ALL"] (and
optionally set runAsUser/uid) to enforce the repo policy that all containers run
non-root, drop all capabilities, and have a read-only root filesystem; place
these settings under the spec -> securityContext for the pod and under the
container entry for the container named agentic-operator.
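A rough way to confirm the hardening landed is to scan the rendered manifest for the settings named in the review. The sketch below is a hypothetical lint, not part of the PR: the function name `missing_hardening` is an assumption, and the check is text-based, so it only confirms that each setting appears somewhere in the input, not that every container carries it. It deliberately does not look for `runAsUser`, since a fixed UID may conflict with the UID range that OpenShift's restricted security context constraints assign to a namespace.

```shell
#!/bin/bash
# Hypothetical lint: read a manifest on stdin and print each hardening setting
# from the review that does not appear anywhere in it.
missing_hardening() {
  local manifest setting
  manifest=$(cat)
  for setting in \
    'runAsNonRoot: true' \
    'allowPrivilegeEscalation: false' \
    'readOnlyRootFilesystem: true'; do
    printf '%s\n' "$manifest" | grep -qF -- "$setting" || echo "$setting"
  done
  # capabilities.drop may be written in block style ("- ALL") or flow style ("[ALL]").
  printf '%s\n' "$manifest" \
    | grep -qE '^[[:space:]]*- ALL[[:space:]]*$|drop:[[:space:]]*\[.*ALL.*\]' \
    || echo 'capabilities.drop: ALL'
}
```

Empty output means all four settings were found at least once. A stricter check would parse the YAML and inspect each container individually.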

Comment on lines +23 to +26
- name: NAMESPACE
  value: ambient-code
- name: UPSTREAM_TIMEOUT
  value: 5m

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="components/manifests/templates/template-services.yaml"

echo "== File exists? =="; ls -l "$FILE"

echo "== Template parameter declarations (NAMESPACE, UPSTREAM_TIMEOUT) =="
rg -n "name:\s*(NAMESPACE|UPSTREAM_TIMEOUT)|parameters:|kind:\s*Template" "$FILE" || true

echo "== Hardcoded ambient-code occurrences ==" 
rg -n "ambient-code" "$FILE" || true

echo "== Parameter usage occurrences for ${NAMESPACE} and ${UPSTREAM_TIMEOUT} ==" 
rg -n "\$\{NAMESPACE\}|\$\{UPSTREAM_TIMEOUT\}" "$FILE" || true

echo "== oauth-proxy upstream-timeout hardcode occurrences ==" 
rg -n "--upstream-timeout" "$FILE" || true

echo "== Snippet around NAMESPACE/UPSTREAM_TIMEOUT parameter wiring area ==" 
# Print a limited region near where NAMESPACE and UPSTREAM_TIMEOUT are referenced in params list or elsewhere
rg -n "name:\s*UPSTREAM_TIMEOUT|name:\s*NAMESPACE" -n "$FILE" | cut -d: -f1 | head -n 5 > /tmp/lines.txt
cat /tmp/lines.txt
while read -r L; do
  echo "--- context around line $L ---"
  sed -n "$((L-10)),$((L+25))p" "$FILE"
done < /tmp/lines.txt

echo "== Snippet around oauth-proxy args ==" 
# find lines with oauth-proxy and show around
rg -n "oauth-proxy" "$FILE" | head -n 10 | cut -d: -f1 | while read -r L; do
  echo "--- context around oauth-proxy line $L ---"
  sed -n "$((L-20)),$((L+60))p" "$FILE" | sed -n '1,120p'
done

Repository: ambient-code/platform

Length of output: 2055


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="components/manifests/templates/template-services.yaml"

echo "== File exists? =="; ls -l "$FILE"

echo "== Template parameter declarations (NAMESPACE, UPSTREAM_TIMEOUT) =="
rg -n "name:\s*(NAMESPACE|UPSTREAM_TIMEOUT)|parameters:|kind:\s*Template" "$FILE" || true

echo "== Hardcoded ambient-code occurrences =="
rg -n "ambient-code" "$FILE" || true

echo "== Parameter usage occurrences for ${NAMESPACE} and ${UPSTREAM_TIMEOUT} =="
rg -n "\\$\\{NAMESPACE\\}|\\$\\{UPSTREAM_TIMEOUT\\}" "$FILE" || true

echo "== oauth-proxy upstream-timeout hardcode occurrences =="
rg -n "--upstream-timeout" "$FILE" || true

echo "== Snippet around NAMESPACE/UPSTREAM_TIMEOUT parameter declarations =="
rg -n "name:\s*UPSTREAM_TIMEOUT|name:\s*NAMESPACE" "$FILE" | head -n 20 | cut -d: -f1 | while read -r L; do
  echo "--- context around line $L ---"
  sed -n "$((L-12)),$((L+28))p" "$FILE"
done

echo "== Snippet around oauth-proxy args ==" 
rg -n "oauth-proxy" "$FILE" | head -n 20 | cut -d: -f1 | while read -r L; do
  echo "--- context around oauth-proxy line $L ---"
  sed -n "$((L-25)),$((L+70))p" "$FILE" | head -n 140
done

Repository: ambient-code/platform

Length of output: 2055


🏁 Script executed:

# (re-run with a smaller output to avoid truncation)
FILE="components/manifests/templates/template-services.yaml"

echo "== NAMESPACE / UPSTREAM_TIMEOUT param declarations =="
rg -n "name:\s*(NAMESPACE|UPSTREAM_TIMEOUT)|value:\s*(ambient-code|5m)" "$FILE" || true

echo "== hardcoded ambient-code =="
rg -n "ambient-code" "$FILE" || true

echo "== ${NAMESPACE} / ${UPSTREAM_TIMEOUT} usage =="
rg -n "\\$\\{NAMESPACE\\}|\\$\\{UPSTREAM_TIMEOUT\\}" "$FILE" || true

echo "== upstream-timeout usage =="
rg -n "--upstream-timeout" "$FILE" || true

Repository: ambient-code/platform

Length of output: 2417


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="components/manifests/templates/template-services.yaml"

echo "== param declarations =="
sed -n '1,120p' "$FILE" | cat

echo "== find relevant wiring lines =="
echo "[ambient-code occurrences]"
rg -n "ambient-code" "$FILE" || true
echo "[upstream-timeout occurrences]"
rg -n "upstream-timeout" "$FILE" || true
echo "[${NAMESPACE} occurrences]"
rg -n "\\$\\{NAMESPACE\\}" "$FILE" || true
echo "[${UPSTREAM_TIMEOUT} occurrences]"
rg -n "\\$\\{UPSTREAM_TIMEOUT\\}" "$FILE" || true

Repository: ambient-code/platform

Length of output: 4840


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="components/manifests/templates/template-services.yaml"

echo "== head(200) =="
sed -n '1,220p' "$FILE"

echo "== occurrences summary =="
echo "ambient-code:"; rg -n "ambient-code" "$FILE" || true
echo "upstream-timeout:"; rg -n "upstream-timeout" "$FILE" || true
echo "\\${NAMESPACE}:"; rg -n "\\$\\{NAMESPACE\\}" "$FILE" || true
echo "\\${UPSTREAM_TIMEOUT}:"; rg -n "\\$\\{UPSTREAM_TIMEOUT\\}" "$FILE" || true

echo "== oauth-proxy args snippet =="
# Print only the block with '--upstream-timeout'
line=$(rg -n "--upstream-timeout" "$FILE" | head -n 1 | cut -d: -f1 || true)
if [[ -n "${line:-}" ]]; then
  sed -n "$((line-40)),$((line+40))p" "$FILE"
fi

Repository: ambient-code/platform

Length of output: 6847


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="components/manifests/templates/template-services.yaml"

echo "PARAM_DECLARATIONS:"
rg -n "name:\s*(NAMESPACE|UPSTREAM_TIMEOUT)|value:\s*(ambient-code|5m)" "$FILE" || true

echo "HARDCODED_NAMESPACE_STRING:"
rg -n "ambient-code" "$FILE" || true

echo "TEMPLATE_PARAM_REFERENCES:"
rg -n "\\$\\{NAMESPACE\\}|\\$\\{UPSTREAM_TIMEOUT\\}" "$FILE" || true

echo "UPSTREAM_TIMEOUT_ARG_LINES:"
rg -n "--upstream-timeout" "$FILE" || true

Repository: ambient-code/platform

Length of output: 2392


🏁 Script executed:

ls -R | head

Repository: ambient-code/platform

Length of output: 164


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="components/manifests/templates/template-services.yaml"

echo "== Literal template var usage: \${NAMESPACE} / \${UPSTREAM_TIMEOUT} =="
rg -n '\$\{NAMESPACE\}|\$\{UPSTREAM_TIMEOUT\}' "$FILE" || true

echo "== oauth-proxy upstream-timeout occurrences =="
rg -n -- '--upstream-timeout' "$FILE" || true

echo "== Snippet around --upstream-timeout line(s) =="
while read -r line; do
  lnum="$(echo "$line" | cut -d: -f1)"
  sed -n "$((lnum-20)),$((lnum+20))p" "$FILE"
done < <(rg -n -- '--upstream-timeout' "$FILE" | cut -d: -f1 | sort -u)

Repository: ambient-code/platform

Length of output: 343


Wire the declared template parameters (NAMESPACE, UPSTREAM_TIMEOUT).

NAMESPACE and UPSTREAM_TIMEOUT are declared in the OpenShift Template, but the template never references ${NAMESPACE} or ${UPSTREAM_TIMEOUT}: resources hardcode ambient-code, and oauth-proxy still uses --upstream-timeout=5m (line 693). Parameter overrides therefore have no effect on the rendered manifests.

Representative fix
-    name: ambient-code
+    name: ${NAMESPACE}
@@
-    namespace: ambient-code
+    namespace: ${NAMESPACE}
@@
-          - --upstream-timeout=5m
+          - --upstream-timeout=${UPSTREAM_TIMEOUT}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/manifests/templates/template-services.yaml` around lines 23 - 26,
The template declares parameters NAMESPACE and UPSTREAM_TIMEOUT but resources
still hardcode "ambient-code" and the oauth-proxy flag uses
"--upstream-timeout=5m"; update the manifest references to use the template
parameter placeholders ${NAMESPACE} and ${UPSTREAM_TIMEOUT} instead of the
hardcoded values (e.g., replace occurrences of ambient-code with ${NAMESPACE}
and replace the oauth-proxy arg --upstream-timeout=5m with
--upstream-timeout=${UPSTREAM_TIMEOUT}) so parameter overrides will take effect
when the template is processed.
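
The finding above can be reproduced with a small check. This is a minimal sketch, not part of the PR: the helper name unused_params and the sample file are hypothetical, and the sample only imitates the shape of template-services.yaml. It prints every listed parameter that is never referenced as ${NAME} in the file.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: print each parameter name from the argument list that
# is never referenced as ${NAME} in the given template file.
unused_params() {
  local file=$1; shift
  local name
  for name in "$@"; do
    # -F matches the literal string ${NAME}; -q suppresses output.
    if ! grep -qF "\${${name}}" "$file"; then
      echo "$name"
    fi
  done
}

# Hypothetical sample standing in for template-services.yaml.
sample=$(mktemp)
cat > "$sample" <<'EOF'
parameters:
- name: NAMESPACE
  value: ambient-code
- name: UPSTREAM_TIMEOUT
  value: 5m
- name: IMAGE_TAG
  value: latest
objects:
- kind: Deployment
  metadata:
    namespace: ambient-code
  spec:
    image: example:${IMAGE_TAG}
    args:
    - --upstream-timeout=5m
EOF

# Lists NAMESPACE and UPSTREAM_TIMEOUT, which are declared but never used.
unused_params "$sample" NAMESPACE UPSTREAM_TIMEOUT IMAGE_TAG
```

Once the hardcoded values are replaced with the placeholders, the same call should print nothing for those two parameters.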

Comment on lines +39 to +48
kind: Secret
metadata:
labels:
app: ambient-api-server
component: api
name: ambient-api-server
namespace: ambient-code
stringData:
sentry.key: ''
type: Opaque

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Show the exact lines around the reported snippet
FILE="components/manifests/templates/template-services.yaml"
echo "== Lines $FILE:39-70 =="
if [ -f "$FILE" ]; then
  nl -ba "$FILE" | sed -n '39,70p'
else
  echo "Missing file: $FILE"
fi

# 2) Search within that file for ownerReferences and other Secret/PVC blocks
echo
echo "== ownerReferences in $FILE =="
rg -n "ownerReferences" "$FILE" || true

echo
echo "== kind: Secret / kind: PersistentVolumeClaim in $FILE =="
rg -n "kind:\s*(Secret|PersistentVolumeClaim)\b" "$FILE" || true

# 3) Repo-wide: check whether ownerReferences are used anywhere in manifests templates
echo
echo "== ownerReferences in components/manifests =="
rg -n "ownerReferences" components/manifests || true

# 4) Repo-wide: locate Secrets in templates/manifests that lack ownerReferences (heuristic)
echo
echo "== Heuristic: Secret blocks without ownerReferences =="
rg -n --glob '*.{yaml,yml}' "kind:\s*Secret\b" components/manifests || true

# 5) Repo-wide: find PVC definitions and any ownerReferences nearby (heuristic)
echo
echo "== PVC blocks and nearby ownerReferences (heuristic) =="
rg -n --glob '*.{yaml,yml}' "kind:\s*PersistentVolumeClaim\b" components/manifests || true
rg -n --glob '*.{yaml,yml}' "PersistentVolumeClaim" components/manifests | head -n 50

Repository: ambient-code/platform

Length of output: 198


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="components/manifests/templates/template-services.yaml"

echo "== Lines $FILE:35-75 =="
if [ -f "$FILE" ]; then
  awk 'NR>=35 && NR<=75 {printf "%d:%s\n", NR, $0}' "$FILE"
else
  echo "Missing file: $FILE"
fi

echo
echo "== ownerReferences in $FILE =="
rg -n "ownerReferences" "$FILE" || true

echo
echo "== kind: Secret / kind: PersistentVolumeClaim in $FILE (first 50 matches) =="
rg -n --glob '*.{yaml,yml}' "kind:\s*(Secret|PersistentVolumeClaim)\b" "$FILE" | head -n 50 || true

echo
echo "== ownerReferences in components/manifests (first 50 matches) =="
rg -n "ownerReferences" components/manifests | head -n 50 || true

echo
echo "== Heuristic: Secret blocks without ownerReferences (first 50 matches; may be noisy) =="
rg -n --glob '*.{yaml,yml}' "kind:\s*Secret\b" components/manifests | head -n 50 || true

echo
echo "== Heuristic: PVC blocks (first 50 matches) =="
rg -n --glob '*.{yaml,yml}' "kind:\s*PersistentVolumeClaim\b" components/manifests | head -n 50 || true

Repository: ambient-code/platform

Length of output: 6305


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="components/manifests/templates/template-services.yaml"

echo "== Secret block (lines 35-60) =="
awk 'NR>=35 && NR<=60 {printf "%d:%s\n", NR, $0}' "$FILE"

echo
echo "== PVC blocks region (lines 130-210) =="
awk 'NR>=130 && NR<=210 {printf "%d:%s\n", NR, $0}' "$FILE"

echo
echo "== ownerReferences occurrences in $FILE =="
rg -n "ownerReferences" "$FILE" || true

Repository: ambient-code/platform

Length of output: 2920


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== ownerReferences across repo (first 50 matches) =="
rg -n "ownerReferences" . | head -n 50 || true

echo
echo "== Search for ambient-api-server-db-data / postgresql-data / backend-state-pvc occurrences =="
rg -n "ambient-api-server-db-data|backend-state-pvc|postgresql-data|kind:\s*Secret|kind:\s*PersistentVolumeClaim" components/manifests | head -n 200 || true

echo
echo "== Look for kustomize patches that mention these names =="
rg -n "name:\s*ambient-api-server-db-data|name:\s*backend-state-pvc|name:\s*postgresql-data|name:\s*ambient-api-server$|sentry\.key" components/manifests --glob='*.yaml' --glob='*.yml' | head -n 200 || true

echo
echo "== Any overlays patching template-services.yaml resources? (heuristic search) =="
rg -n "template-services\.yaml" components/manifests || true

Repository: ambient-code/platform

Length of output: 16556


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="components/manifests/templates/validate.sh"
echo "== validate script: $FILE =="
if [ -f "$FILE" ]; then
  awk 'NR<=200{printf "%d:%s\n", NR, $0}' "$FILE"
else
  echo "Missing: $FILE"
fi

echo
echo "== template-services.yaml mentions in templates dir =="
rg -n "template-services\.yaml" components/manifests/templates || true

echo
echo "== ownerReferences generation: search templates dir for ownerReferences =="
rg -n "ownerReferences" components/manifests/templates || true

Repository: ambient-code/platform

Length of output: 767


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="components/manifests/templates/template-services.yaml"
echo "== PVC blocks (lines 145-195) =="
awk 'NR>=145 && NR<=195 {printf "%d:%s\n", NR, $0}' "$FILE"
echo
echo "== ownerReferences in this file =="
rg -n "ownerReferences" "$FILE" || true

Repository: ambient-code/platform

Length of output: 1393


Add controller metadata.ownerReferences to Secret/PVCs emitted by template-services.yaml

components/manifests/templates/template-services.yaml defines a standalone Secret (ambient-api-server) and standalone PVCs (ambient-api-server-db-data, backend-state-pvc, postgresql-data) with no metadata.ownerReferences, so garbage collection/cleanup won’t occur on partial teardown and secrets/data can be left behind.

As per repo policy: all child resources (Jobs, Secrets, PVCs) must have OwnerReferences set with controller owner refs.

Also applies to: 152-194

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/manifests/templates/template-services.yaml` around lines 39 - 48,
The Secret "ambient-api-server" and the PVCs "ambient-api-server-db-data",
"backend-state-pvc", and "postgresql-data" in template-services.yaml are missing
metadata.ownerReferences; add a controller-style ownerReference for each emitted
child resource so Kubernetes can garbage-collect them with their parent. Update
the resources' metadata.ownerReferences to include the parent's apiVersion,
kind, name and uid and set controller: true and blockOwnerDeletion as
appropriate (use the same parent resource used by the template that owns these
objects), ensuring the ownerReference fields are populated for manifest
generation in template-services.yaml.

Comment on lines +52 to +55
annotations:
description: Exposes the ambient-api-server REST API
service.beta.openshift.io/serving-cert-secret-name: ambient-api-server-tls
labels:

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

file="components/manifests/templates/template-services.yaml"

echo "Relevant TLS and route configuration:"
rg -n -C2 'service\.beta\.openshift\.io/serving-cert-secret-name: ambient-api-server-tls|--enable-https=false|--grpc-server-bindaddress=:9000|termination: reencrypt' "$file"

echo
echo "Checking whether the ambient-api-server deployment mounts the serving cert secret:"
if rg -n 'secretName: ambient-api-server-tls|tls\.crt|tls\.key|/etc/tls' "$file" >/dev/null; then
  rg -n -C2 'secretName: ambient-api-server-tls|tls\.crt|tls\.key|/etc/tls' "$file"
else
  echo "No ambient-api-server TLS material is mounted into the deployment."
fi

Repository: ambient-code/platform

Length of output: 1628


🏁 Script executed:

#!/bin/bash
set -euo pipefail

file="components/manifests/templates/template-services.yaml"

echo "Occurrences of ambient-api-server-tls:"
rg -n "ambient-api-server-tls" "$file" || true

echo
echo "Context around --enable-https=false and bindaddresses in ambient-api-server container:"
rg -n -C6 "--enable-https=false|--api-server-bindaddress=:8000|--grpc-server-bindaddress=:9000" "$file"

echo
echo "Search for /etc/tls mounts and tls.crt/tls.key locations: (focus on around ambient-api-server volumes/volumeMounts)"
rg -n -C2 "/etc/tls|tls\.crt|tls\.key|/configs/authentication|frontend-proxy-tls" "$file"

echo
echo "Route specs with termination: reencrypt and their backend service ports:"
rg -n -C6 "termination: reencrypt" "$file"

Repository: ambient-code/platform

Length of output: 711


🏁 Script executed:

#!/bin/bash
set -euo pipefail

file="components/manifests/templates/template-services.yaml"

echo "Occurrences of ambient-api-server-tls:"
rg -n -- "ambient-api-server-tls" "$file" || true

echo
echo "Context around ambient-api-server args (enable-https and bindaddresses):"
rg -n -C6 -e "--enable-https=false" -e "--api-server-bindaddress=:8000" -e "--grpc-server-bindaddress=:9000" "$file"

echo
echo "Search for ambient-api-server TLS secret/material references (mounts/volumeMounts/volumes):"
rg -n -C3 -e "ambient-api-server-tls" -e "/etc/tls" -e "tls\.crt|tls\.key" -e "secretName: ambient-api-server-tls" "$file" || true

echo
echo "Routes with termination: reencrypt, including their backend service/port targets:"
rg -n -C12 -e "termination: reencrypt" "$file"

Repository: ambient-code/platform

Length of output: 3515


🏁 Script executed:

#!/bin/bash
set -euo pipefail

file="components/manifests/templates/template-services.yaml"

echo "Service definitions for ambient-api-server (including ports targetPort):"
rg -n -C5 -e "kind: Service" -e "name: ambient-api-server" "$file"

echo
echo "Ports blocks near ambient-api-server container (if any):"
rg -n -C6 -e "containerPort" -e "name: api" -e "name: grpc" -e "api-server" "$file"

echo
echo "Look specifically for service port names api/grpc:"
rg -n -C4 -e "name: api" -e "name: grpc" "$file"

Repository: ambient-code/platform

Length of output: 13938


Reencrypt Routes will fail: ambient-api-server serves cleartext and never mounts ambient-api-server-tls

    annotations:
      description: Exposes the ambient-api-server REST API
      service.beta.openshift.io/serving-cert-secret-name: ambient-api-server-tls
    labels:

Both Routes (ambient-api-server and ambient-api-server-grpc) use termination: reencrypt and route to the ambient-api-server Service on targetPort: api (8000) and targetPort: grpc (9000). The Deployment, however, runs with --enable-https=false and --api-server-bindaddress=:8000, its probes use scheme: HTTP, and no ambient-api-server-tls secret is mounted into the pod (the only ambient-api-server-tls reference is the serving-cert annotation). The router will therefore try to establish TLS to ports 8000/9000, which don’t speak HTTPS.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/manifests/templates/template-services.yaml` around lines 52 - 55,
The Route is configured with termination: reencrypt but the ambient-api-server
Deployment runs with --enable-https=false and does not mount the
ambient-api-server-tls secret, so the router will attempt TLS to an HTTP port;
fix by either (A) enable HTTPS in the ambient-api-server Deployment (set
--enable-https=true, change --api-server-bindaddress to :8443 or desired HTTPS
port, mount the ambient-api-server-tls secret into the pod and update Service
targetPort to that HTTPS port) or (B) change the Route termination for
ambient-api-server and ambient-api-server-grpc from reencrypt to edge so no
backend TLS is expected (passthrough would not help, since it still requires the
backend to terminate TLS); update the Service/Route targetPort and any health
checks accordingly to match the chosen approach (referencing ambient-api-server,
ambient-api-server-tls, --enable-https, and --api-server-bindaddress).
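
A coarse way to flag this mismatch before deploying is sketched below. It is an assumption-laden heuristic rather than a real validator: the function name check_reencrypt and the sample file are hypothetical, and because it only greps for two strings it cannot tell which Route fronts which Deployment.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical heuristic: report a mismatch when one file contains both a
# reencrypt Route and a backend flag that disables HTTPS. No YAML parsing is
# done, so a hit only means the file deserves a closer look.
check_reencrypt() {
  local file=$1
  if grep -q 'termination: reencrypt' "$file" &&
     grep -qF -e '--enable-https=false' "$file"; then
    echo "mismatch"
    return 1
  fi
  echo "ok"
}

# Hypothetical sample imitating the relevant lines of template-services.yaml.
sample=$(mktemp)
cat > "$sample" <<'EOF'
- kind: Route
  spec:
    tls:
      termination: reencrypt
- kind: Deployment
  spec:
    containers:
    - command:
      - --enable-https=false
      - --api-server-bindaddress=:8000
EOF

# "|| true" keeps set -e from aborting the demo on the expected failure.
check_reencrypt "$sample" || true
```

The non-zero return code makes the function usable as a gate in a script such as validate.sh, if that is desired.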

Comment on lines +219 to +324
spec:
containers:
- command:
- /usr/local/bin/ambient-api-server
- serve
- --db-host-file=/secrets/db/db.host
- --db-port-file=/secrets/db/db.port
- --db-user-file=/secrets/db/db.user
- --db-password-file=/secrets/db/db.password
- --db-name-file=/secrets/db/db.name
- --enable-jwt=true
- --enable-authz=false
- --jwk-cert-file=/configs/authentication/jwks.json
- --enable-https=false
- --api-server-bindaddress=:8000
- --metrics-server-bindaddress=:4433
- --health-check-server-bindaddress=:4434
- --db-sslmode=require
- --db-max-open-connections=50
- --enable-db-debug=false
- --enable-metrics-https=false
- --http-read-timeout=5s
- --http-write-timeout=30s
- --cors-allowed-origins=*
- --cors-allowed-headers=X-Ambient-Project
- --enable-grpc=true
- --grpc-server-bindaddress=:9000
- --alsologtostderr
- -v=4
env:
- name: AMBIENT_ENV
value: production
image: ${IMAGE_AMBIENT_API_SERVER}:${IMAGE_TAG}
imagePullPolicy: Always
livenessProbe:
httpGet:
path: /api/ambient
port: 8000
scheme: HTTP
initialDelaySeconds: 15
periodSeconds: 5
name: api-server
ports:
- containerPort: 8000
name: api
protocol: TCP
- containerPort: 4433
name: metrics
protocol: TCP
- containerPort: 4434
name: health
protocol: TCP
- containerPort: 9000
name: grpc
protocol: TCP
readinessProbe:
httpGet:
httpHeaders:
- name: User-Agent
value: Probe
path: /healthcheck
port: 4434
scheme: HTTP
initialDelaySeconds: 20
periodSeconds: 10
resources:
limits:
cpu: 1
memory: 1Gi
requests:
cpu: 200m
memory: 512Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: false
volumeMounts:
- mountPath: /secrets/db
name: db-secrets
- mountPath: /secrets/service
name: app-secrets
- mountPath: /configs/authentication
name: auth-config
initContainers:
- command:
- /usr/local/bin/ambient-api-server
- migrate
- --db-host-file=/secrets/db/db.host
- --db-port-file=/secrets/db/db.port
- --db-user-file=/secrets/db/db.user
- --db-password-file=/secrets/db/db.password
- --db-name-file=/secrets/db/db.name
- --db-sslmode=require
- --alsologtostderr
- -v=4
image: ${IMAGE_AMBIENT_API_SERVER}:${IMAGE_TAG}
imagePullPolicy: Always
name: migration
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: false

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="components/manifests/templates/template-services.yaml"

# Show the exact referenced line ranges
sed -n '200,360p' "$FILE" | nl -ba | sed -n '1,200p'
echo "----"
sed -n '350,680p' "$FILE" | nl -ba | sed -n '1,260p'
echo "----"
sed -n '610,780p' "$FILE" | nl -ba | sed -n '1,220p'

# Extract container blocks by name (best-effort) to check securityContext/resources presence
echo "==== rg: container names with securityContext ===="
rg -n "name:\s*(api-server|backend-api|frontend|oauth-proxy|migration)\b" "$FILE"

echo "==== For each container, show nearby securityContext/resources ===="
for n in api-server backend-api frontend oauth-proxy migration; do
  echo "---- $n ----"
  # Print 80 lines of context around "name: $n" occurrences
  rg -n "name:\s*$n\b" -n "$FILE" || true
  rg -n -C 6 "name:\s*$n\b" "$FILE" || true
  rg -n -C 6 "name:\s*$n\b" "$FILE" | head -n 120 || true
done

# Check whether any runAsNonRoot/readOnlyRootFilesystem appear
echo "==== Search for securityContext policy fields ===="
rg -n "runAsNonRoot|readOnlyRootFilesystem|allowPrivilegeEscalation|capabilities:\s*|drop:\s*-\s*ALL" "$FILE"

# Check initContainer migration resources
echo "==== initContainers migration resources check ===="
rg -n -C 20 "initContainers:" "$FILE" | sed -n '1,220p'
rg -n -C 20 "name:\s*migration\b" "$FILE" | sed -n '1,220p'

Repository: ambient-code/platform

Length of output: 108


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="components/manifests/templates/template-services.yaml"

# helper to print with line numbers (awk-based)
print_range () {
  local start=$1 end=$2 label=$3
  echo "==== ${label}: lines ${start}-${end} ===="
  awk -v s="$start" -v e="$end" 'NR>=s && NR<=e {printf "%6d\t%s\n", NR, $0}' "$FILE"
  echo
}

# Show referenced areas
print_range 200 360 "range 200-360"
print_range 350 680 "range 350-680"
print_range 600 820 "range 600-820"

# Search for policy-related fields
echo "==== Search for securityContext policy fields ===="
rg -n "runAsNonRoot|readOnlyRootFilesystem|allowPrivilegeEscalation|capabilities:|drop:\s*-\s*ALL" "$FILE" || true

# Search for container securityContext blocks for relevant containers
for n in api-server migration backend-api frontend oauth-proxy; do
  echo "==== Context around container name: $n ===="
  rg -n -C 12 "name:\s*$n\b" "$FILE" || true
done

# Specifically look for initContainer migration resources
echo "==== Look for migration initContainer resources ===="
rg -n -C 30 "name:\s*migration\b" "$FILE" || true

# Also check if any pod-level securityContext exists in the same ranges
echo "==== Search for pod-level securityContext ===="
rg -n "^\s*securityContext:\s*$|securityContext:" "$FILE" || true

# Check if securityContext exists under containers only (rough heuristic)
echo "==== Count occurrences of container securityContext blocks ===="
rg -n "securityContext:" "$FILE" | wc -l

Repository: ambient-code/platform

Length of output: 37372


Harden pod/container SecurityContext for multiple workloads (restricted-cluster policy)

  • ambient-api-server and initContainer migration: both lack runAsNonRoot and set readOnlyRootFilesystem: false, although they do set allowPrivilegeEscalation: false and drop all capabilities (drop: [ALL]).
  • backend-api, frontend, oauth-proxy: containers have no securityContext, and the pod templates also define no pod-level securityContext.
  • initContainer migration: missing resources.requests and resources.limits.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/manifests/templates/template-services.yaml` around lines 219 -
324, Harden the pod and containers by adding a pod-level securityContext and
per-container securityContexts: for the api-server container (name: api-server)
set runAsNonRoot: true and make readOnlyRootFilesystem: true (currently false)
while keeping allowPrivilegeEscalation: false and capabilities.drop: [ALL]; for
the migration initContainer (name: migration) add runAsNonRoot: true, set
readOnlyRootFilesystem: true, and add resources.requests and resources.limits
(match sensible values used by api-server or cluster policy); also add
equivalent securityContext entries (runAsNonRoot: true, readOnlyRootFilesystem:
true, allowPrivilegeEscalation: false, capabilities.drop: [ALL]) for other
containers referenced in this template (backend-api, frontend, oauth-proxy) and
add a pod-level securityContext to enforce non-root UID defaults across the pod.

@mergify mergify Bot added the queued label Jun 3, 2026

mergify Bot commented Jun 3, 2026

Merge Queue Status

  • Entered queue: 2026-06-03 19:11 UTC · Rule: default
  • Checks skipped · PR is already up-to-date
  • Merged: 2026-06-03 19:11 UTC · at 9764ed09c3cd48c31b74ec4967f0edc797de2410 · squash

This pull request spent 50 seconds in the queue, including 10 seconds running CI.


@mergify mergify Bot merged commit 1b01260 into main Jun 3, 2026
70 checks passed
@mergify mergify Bot deleted the feat/add-openshift-templates branch June 3, 2026 19:11
@mergify mergify Bot removed the queued label Jun 3, 2026