ci: use native ARM64 runners instead of QEMU for multi-arch builds #842

Merged
mergify[bot] merged 11 commits into main from
ci/native-multiarch-builds
Mar 16, 2026

Conversation

@maskarb maskarb commented Mar 6, 2026

Summary

Replace QEMU-emulated cross-compilation with native per-architecture builders (ubuntu-latest for amd64, ubuntu-24.04-arm for arm64) and a manifest merge step. This eliminates SIGILL crashes caused by QEMU's incomplete instruction emulation and builds both arches in parallel for faster CI.

Changes

  • Split build-and-push into build (per-arch native runners) + merge-manifests (creates multi-arch manifest via docker buildx imagetools create)
  • Dynamic matrix: only build changed components, skip unchanged entirely
  • Applied to both components-build-deploy.yml (stage) and prod-release-deploy.yaml (production)
  • Removed stale public-api references

Speed gains

  • ARM64 builds run at native speed instead of ~5-10x slower QEMU emulation
  • Both architectures build in parallel on separate runners
  • Eliminates SIGILL crashes from QEMU's incomplete instruction emulation

Jira: RHOAIENG-52962

🤖 Generated with Claude Code

github-actions bot commented Mar 6, 2026

Claude Code Review - Summary: Replaces QEMU cross-compilation with native runners, adds merge-manifests job, uses dynamic matrix. Sound approach, two architectural concerns to address.

github-actions bot commented Mar 6, 2026

Claude Code Review

Summary

This PR replaces QEMU-emulated cross-compilation with native per-architecture runners (ubuntu-latest for amd64, ubuntu-24.04-arm for arm64), adds a merge-manifests job to assemble the final multi-arch manifest list, and pairs that with dynamic matrix generation to skip building unchanged components. The core goal (eliminating QEMU SIGILL crashes) is sound, the restructuring is clean, and the conditional logic is correct. Two architectural concerns are worth addressing before merge, and a few minor inconsistencies are noted below.


Issues by Severity

Blocker Issues: None.

Critical Issues: None.


Major Issues

1. Hardcoded arch suffixes in merge-manifests create fragile coupling

Files: .github/workflows/components-build-deploy.yml and .github/workflows/prod-release-deploy.yaml (merge-manifests run steps)

The build job defines architectures through a matrix (suffix: amd64 / suffix: arm64) but merge-manifests hardcodes those same suffixes directly in the docker buildx imagetools create command. Adding or removing an architecture requires updating both the build matrix and the merge-manifests shell command, and nothing in the workflow verifies that the two agree. The IMPORTANT comment in the build matrix is helpful, but it is absent from the merge-manifests step in prod-release-deploy.yaml. A composite action or shared script constructing the tag list from a single arch definition would eliminate this coupling.
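One way to picture the "single arch definition" idea is a small shell helper that derives the imagetools source tags from one list. This is only a sketch: the ARCHES variable, the manifest_cmd function, and the `<tag>-<arch>` naming are assumptions for illustration, not code from the PR. It prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical single source of truth for the architecture suffixes.
ARCHES="amd64 arm64"

# manifest_cmd IMAGE TAG
# Prints the `docker buildx imagetools create` command that would merge the
# per-arch images into one multi-arch tag. Assumes per-arch images were
# pushed as IMAGE:TAG-<arch>.
manifest_cmd() {
  image="$1"
  tag="$2"
  cmd="docker buildx imagetools create -t ${image}:${tag}"
  for arch in $ARCHES; do
    cmd="${cmd} ${image}:${tag}-${arch}"
  done
  printf '%s\n' "$cmd"
}

# Example: show the command for one component image.
manifest_cmd quay.io/ambient_code/vteam_backend abc1234
```

With this shape, adding an architecture would mean touching ARCHES only; the build matrix would still need to read from the same list for the coupling to disappear entirely.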

2. Matrix generation script is duplicated verbatim across both workflow files

Files: .github/workflows/components-build-deploy.yml and .github/workflows/prod-release-deploy.yaml (Build component matrices steps)

The jq-based filtering logic (gsub whitespace trimming, force-all handling, dispatch-vs-push branching) is copy-pasted between the two workflows. A shared composite action under .github/actions/build-matrix/ would centralise this so future changes only need to land in one place.
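The filtering behaviour described here (force-all, or a comma-separated selection with whitespace ignored) can be sketched without jq. The workflows themselves use jq; the plain-sh version below is a dependency-free approximation, and the function name select_components and the space-separated output format are assumptions made for the example.

```shell
#!/bin/sh
# Component names as they appear in this conversation.
ALL_COMPONENTS="frontend backend operator ambient-runner state-sync public-api ambient-api-server"

# select_components FORCE_ALL SELECTED
# FORCE_ALL: "true" selects every component.
# SELECTED:  comma-separated names; spaces and tabs are removed before
#            matching (stricter than trimming only the ends).
# Prints the matching names separated by spaces, in ALL_COMPONENTS order.
select_components() {
  force_all="$1"
  selected="$2"
  if [ "$force_all" = "true" ]; then
    printf '%s\n' "$ALL_COMPONENTS"
    return 0
  fi
  # Wrap the list in commas so each name can be matched as ",name,".
  wanted=",$(printf '%s' "$selected" | tr -d ' \t'),"
  out=""
  for name in $ALL_COMPONENTS; do
    case "$wanted" in
      *",${name},"*) out="${out}${out:+ }${name}" ;;
    esac
  done
  printf '%s\n' "$out"
}

select_components false " backend , state-sync"
```

Keeping one copy of a helper like this in a shared script or composite action is what the duplication concern amounts to in practice.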


Minor Issues

1. prod-release-deploy.yaml builds 5 components; components-build-deploy.yml builds 7 -- undocumented

prod-release-deploy.yaml omits public-api and ambient-api-server. If intentional (e.g. those are not part of a versioned release), a brief inline comment would prevent future contributors from treating it as an oversight.

2. prod-release-deploy.yaml build job has no has-builds guard

components-build-deploy.yml guards the build job with has-builds == true to avoid a matrix-with-empty-array error. The prod release workflow does not, relying on the assumption that a release always builds at least one component. Adding the guard or a non-empty assertion in the matrix step would make this invariant explicit.
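Both options mentioned here, a has-builds output and a non-empty assertion, are small in shell. The sketch below assumes the filtered component list is available as a space-separated string; the function names are hypothetical. In a workflow step the first function's output would typically be appended to the file named by GITHUB_OUTPUT.

```shell
#!/bin/sh
# has_builds_line COMPONENTS
# Prints a key=value line suitable for appending to "$GITHUB_OUTPUT".
has_builds_line() {
  if [ -n "$1" ]; then
    printf 'has-builds=true\n'
  else
    printf 'has-builds=false\n'
  fi
}

# require_builds COMPONENTS
# Release-style assertion: returns non-zero when nothing was selected,
# so the calling step can stop before any tagging or deploy happens.
require_builds() {
  if [ -z "$1" ]; then
    printf 'error: no components selected for build\n' >&2
    return 1
  fi
}

# Example: in a workflow this would be `has_builds_line "$FILTERED" >> "$GITHUB_OUTPUT"`.
has_builds_line "backend state-sync"
```

The guard form lets downstream jobs skip cleanly, while the assertion form turns an empty selection into a visible failure; which one fits depends on whether an empty release is ever legitimate.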


Positive Highlights

  • fail-fast: false correctly added to both build and merge-manifests matrices -- a single flaky ARM build will not cancel the amd64 builds or other components' builds.
  • Per-component, per-arch GHA cache scoping (scope=component-arch) is a meaningful improvement over the prior single shared cache bucket, which could cause evictions between unrelated components.
  • Removing the repetitive if: conditions copy-pasted six times per step in the old build-and-push job, in favour of pre-filtering the matrix at the source, is a significant readability and correctness improvement.
  • always() and !cancelled() on update-rbac-and-crd correctly allow CRD/RBAC-only deployments (where merge-manifests is skipped because no images changed) to still proceed -- this was a subtle gap in the old design.
  • The jq pipeline in Build component matrices correctly uses --arg to pass user-supplied dispatch inputs into jq rather than interpolating them directly into the filter expression, avoiding a potential injection vector.

Recommendations

  1. (Major) Add the matching IMPORTANT comment alongside the merge-manifests shell command in prod-release-deploy.yaml; consider a composite action for the matrix script to remove the duplication in a follow-up PR.
  2. (Minor) Document the 5-vs-7 component difference in prod-release-deploy.yaml with an inline comment, or add the missing components if the omission is unintentional.
  3. (Minor) Add a has-builds guard or non-empty assertion to the build job in prod-release-deploy.yaml for defensive consistency with components-build-deploy.yml.

maskarb and others added 5 commits March 10, 2026 14:59
Replace QEMU-emulated cross-compilation with native per-architecture
builders (ubuntu-latest for amd64, ubuntu-24.04-arm for arm64) and a
manifest merge step. This eliminates SIGILL crashes caused by QEMU's
incomplete instruction emulation and builds both arches in parallel
for faster CI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instead of starting jobs for all components and skipping steps inside,
detect-changes now outputs a JSON matrix of only the components that
need building. Unchanged components don't appear as jobs at all,
making the Actions UI much clearer about what actually built.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add missing permissions block to merge-manifests in prod-release-deploy
- Add fail-fast: false to merge-manifests strategy in both workflows
- Document arch-suffixed tag accumulation as known trade-off

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add sync comment on arch suffixes that must match merge-manifests
- Remove cache-to on PR builds to avoid evicting main-branch cache
- Revert cosmetic backend banner change (CI trigger workaround)
- Fix inconsistent 3-space job indent in prod-release-deploy.yaml

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@maskarb maskarb force-pushed the ci/native-multiarch-builds branch from b4b4efe to 8985952, March 10, 2026 19:02
coderabbitai bot commented Mar 10, 2026

Walkthrough

Replaced static per-component builds with a dynamic, multi-architecture matrix flow (amd64, arm64): added a "Build component matrices" step producing build-matrix, merge-matrix, and has-builds; introduced per-arch build jobs and a merge-manifests job; deployment jobs now depend on merged multi-arch manifests.
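The build matrix in this walkthrough is essentially the cross product of selected components and architectures, with each architecture pinned to its native runner. A rough sketch follows, assuming plain text rows instead of the JSON the workflows actually emit; runner_for and build_matrix_rows are hypothetical names, while the runner labels are the ones named in the PR summary.

```shell
#!/bin/sh
# runner_for ARCH
# Maps an architecture to the native runner label named in the PR summary.
runner_for() {
  case "$1" in
    amd64) printf 'ubuntu-latest\n' ;;
    arm64) printf 'ubuntu-24.04-arm\n' ;;
    *) return 1 ;;
  esac
}

# build_matrix_rows COMPONENTS
# Prints one "component arch runner" row per (component, arch) pair.
build_matrix_rows() {
  for component in $1; do
    for arch in amd64 arm64; do
      printf '%s %s %s\n' "$component" "$arch" "$(runner_for "$arch")"
    done
  done
}

build_matrix_rows "backend state-sync"
```

The merge matrix would then be the component list alone, since one merge-manifests job handles both architectures of a component.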

Changes

| Cohort / File(s) | Summary |
|---|---|
| Workflow matrix & outputs<br/>.github/workflows/components-build-deploy.yml, .github/workflows/prod-release-deploy.yaml | Added "Build component matrices" step that emits build-matrix, merge-matrix, and has-builds; replaced static component matrices with dynamic matrix-driven builds that feed component and arch. |
| Multi-arch build & push<br/>.github/workflows/components-build-deploy.yml, .github/workflows/prod-release-deploy.yaml | Introduced per-arch (amd64, arm64) build matrix entries, arch-specific cache keys, arch-suffix image tags, conditional PR vs push behavior, and per-arch image outputs (sha + arch suffix). |
| Manifest merging & deployment gating<br/>.github/workflows/components-build-deploy.yml, .github/workflows/prod-release-deploy.yaml | Added merge-manifests job to create multi-arch manifests from per-arch images; changed update-rbac-and-crd, deploy-to-openshift, and dispatch-based deploys to depend on merge-manifests rather than raw build jobs. |
| Component selection, tagging, and kustomize updates<br/>.github/workflows/components-build-deploy.yml, .github/workflows/prod-release-deploy.yaml | Reworked conditional logic to use build-matrix/has-builds instead of matrix.component.changed; added steps to compute per-component/arch image tags and expose them as step outputs; updated kustomize/operator env and agent-registry updates to apply only for components present in BUILT/merge-matrix. |
| Public API & registry handling<br/>.github/workflows/components-build-deploy.yml, .github/workflows/prod-release-deploy.yaml | Aligned public-api handling with multi-arch flow (arch-aware tagging and inclusion in merged manifests); adjusted kustomize image references and registry updates to use merged-manifest tags and per-arch tag computation. |

Sequence Diagram

sequenceDiagram
    participant Trigger as Workflow Trigger
    participant Detect as Detect/Release Job
    participant Matrix as Build Component<br/>Matrices Step
    participant Build as Build Job<br/>(per-arch)
    participant Registry as Container Registry
    participant Merge as Merge-Manifests Job
    participant Deploy as Deploy-to-OpenShift

    Trigger->>Detect: start workflow
    Detect->>Matrix: compute build/merge matrices
    Matrix-->>Build: emit build-matrix (component,arch)
    
    par Build amd64
        Build->>Registry: build amd64 image (sha + amd64 tag)
        Registry-->>Build: store amd64 image
    and Build arm64
        Build->>Registry: build arm64 image (sha + arm64 tag)
        Registry-->>Build: store arm64 image
    end

    Build-->>Merge: notify built images
    Merge->>Registry: create multi-arch manifests for components
    Registry-->>Merge: publish multi-arch manifests
    Merge-->>Deploy: multi-arch manifests ready
    Deploy->>Deploy: deploy manifests to OpenShift

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: replacing QEMU with native ARM64 runners for multi-arch builds, which aligns with the core objective of the changeset. |
| Description check | ✅ Passed | The description is directly related to the changeset, clearly explaining the workflow changes from build-and-push to build plus merge-manifests, dynamic matrices, and the performance benefits achieved. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 86.67% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
.github/workflows/prod-release-deploy.yaml (1)

293-302: ⚠️ Potential issue | 🟠 Major

Same public-api concern as components-build-deploy.yml.

The kustomization update omits public-api here as well. Ensure the manifest cleanup (or continued support) is addressed consistently across both workflows.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/prod-release-deploy.yaml around lines 293 - 302, The
kustomize update step omits the public-api image; add a line alongside the other
kustomize edit set image commands to update the public-api tag using the same
RELEASE_TAG variable (e.g., add kustomize edit set image
quay.io/ambient_code/vteam_public-api:latest=quay.io/ambient_code/vteam_public-api:${RELEASE_TAG}
or the repo's exact image name
"vteam_public-api"/"vteam-public-api"/"public-api" to match existing naming),
ensuring the RELEASE_TAG variable is used consistently with the other image
updates.
.github/workflows/components-build-deploy.yml (1)

328-336: ⚠️ Potential issue | 🟠 Major

Add public-api to kustomize image tag updates to match other services.

The workflow currently tags six services (frontend, backend, operator, claude_runner, state_sync, api_server) with commit-specific SHA tags, but omits public-api. The public-api deployment is active in the manifests and will continue using the latest tag, while all other services use pinned versions. Add public-api to the kustomize edit commands to maintain consistency:

kustomize edit set image quay.io/ambient_code/vteam_public_api:latest=quay.io/ambient_code/vteam_public_api:${{ steps.image-tags.outputs.public_api_tag }}

This ensures all deployed services use commit-pinned image tags rather than latest, reducing the risk of uncontrolled version changes in production.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/components-build-deploy.yml around lines 328 - 336, The
kustomize update step "Update kustomization with image tags" is missing the
public-api image so it remains pinned to "latest"; add a kustomize edit set
image command for the public-api image similar to the others using the output
variable steps.image-tags.outputs.public_api_tag (i.e., mirror the pattern used
for frontend/backend/operator/claude_runner/state_sync/api_server but for
quay.io/ambient_code/vteam_public_api mapping to the public_api_tag) so all
services are updated to commit-pinned tags.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/components-build-deploy.yml:
- Around line 79-136: The matrix-building logic in the "Build component
matrices" step (ALL_COMPONENTS, FILTER_MAP, FILTERED, BUILD_MATRIX,
MERGE_MATRIX, HAS_BUILDS) is duplicated elsewhere; extract this block into a
reusable workflow or composite action that accepts inputs (force_build_all,
components, path-filter outputs like frontend/backend/claude-runner/etc.) and
returns outputs (build-matrix, merge-matrix, has-builds); then replace the
inline shell block with a call to that reusable workflow (or composite) in both
this workflow and prod-release-deploy.yaml, mapping inputs to the same variables
and propagating the outputs BUILD_MATRIX/MERGE_MATRIX/HAS_BUILDS so behavior
remains identical.
- Around line 228-237: Add a short cross-reference comment above the "Create
multi-arch manifest for ${{ matrix.component.name }}" step (the docker buildx
imagetools create invocation) that points to the build job's architecture matrix
where the "-amd64" and "-arm64" tag suffixes are defined (the arch matrix in the
build job), so readers can quickly find the source of the suffixes and
understand the coupling between the build and manifest steps.

In @.github/workflows/prod-release-deploy.yaml:
- Around line 150-178: The matrix-building step (step id matrix) duplicates
logic found in components-build-deploy.yml: extract the ALL_COMPONENTS JSON and
the filtering logic that uses FORCE_ALL, SELECTED, FILTERED, BUILD_MATRIX and
MERGE_MATRIX into a single reusable workflow or composite action, expose outputs
build-matrix and merge-matrix, then replace the inline step here and in
components-build-deploy.yml with calls to that reusable workflow/action (passing
github.event.inputs.force_build_all and github.event.inputs.components and
consuming the outputs) so both workflows share the same implementation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 927164f1-0976-4d22-b998-bc8ddcf87bee

📥 Commits

Reviewing files that changed from the base of the PR and between cd0a4a9 and 8985952.

📒 Files selected for processing (10)
  • .github/workflows/components-build-deploy.yml
  • .github/workflows/e2e.yml
  • .github/workflows/prod-release-deploy.yaml
  • components/ambient-cli/cmd/acpctl/project/cmd.go
  • components/runners/ambient-runner/ambient_runner/bridges/claude/mock_client.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/session.py
  • components/runners/ambient-runner/scripts/capture-fixtures.py
  • components/runners/ambient-runner/tests/test_bridge_claude.py
  • components/runners/ambient-runner/tests/test_git_identity.py
  • docs/internal/developer/local-development/openshift.md

@maskarb maskarb force-pushed the ci/native-multiarch-builds branch from 8985952 to 95e1ede, March 10, 2026 19:10
mergify bot commented Mar 10, 2026

🧪 CI Insights

Here's what we observed from your CI run for 95e1ede.

🟢 All jobs passed!

But CI Insights is watching 👀

@ambient-code ambient-code bot modified the milestones: Merge Queue, Review Queue Mar 11, 2026
maskarb and others added 2 commits March 12, 2026 12:39
Add comment in merge-manifests step pointing back to the build job's
arch matrix where the -amd64/-arm64 suffixes are defined.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The public-api component still exists in the repo and should continue
to be built. It was removed from the deploy steps intentionally but
should not have been removed from the build/merge-manifests matrix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/components-build-deploy.yml:
- Line 93: The build matrix still publishes the image named
"quay.io/ambient_code/vteam_public_api" for the "public-api" component even
though deploy manifests no longer update that image; remove the "public-api"
entry from the job matrix in components-build-deploy.yml (the JSON object with
"name":"public-api" and "image":"quay.io/ambient_code/vteam_public_api") so CI
stops pushing that image, or alternatively update the deployment
kustomization/manifests that reference vteam_public_api to the new image
name—choose one consistent fix and apply it to the matrix entry and/or the
manifests referencing vteam_public_api.
- Line 91: The matrix entry name "ambient-runner" is inconsistent with the
workflow_dispatch input which expects "claude-runner"; update the matrix item
(the {"name":"ambient-runner",...} entry) to use the same selectable name
"claude-runner" (or alternatively change the workflow_dispatch.inputs.components
allowed value to "ambient-runner") so manual dispatches with
components=claude-runner correctly match the runner image; ensure the "name"
field and the workflow_dispatch input value are identical.
- Around line 116-121: The selective-dispatch branch can set
AMBIENT_CODE_RUNNER_IMAGE, STATE_SYNC_IMAGE and agent registry to a commit SHA
even when those components were not part of FILTERED/ALL_COMPONENTS; update the
dispatch logic (the FILTERED creation and downstream "deploy-with-dispatch"
step) to detect whether each component (e.g., runner, state-sync, agent) is
present in FILTERED and only export/set AMBIENT_CODE_RUNNER_IMAGE,
STATE_SYNC_IMAGE and registry SHA for components that will actually be
built/published—alternatively, set those env vars to blank/omit them and let the
receiving job compute defaults per-component. Ensure you reference the FILTERED
value (created by the jq select) to decide presence before assigning the three
image/env variables so you don’t publish tags for images that weren’t built.

In @.github/workflows/prod-release-deploy.yaml:
- Around line 159-160: The workflow matrix builds a new public-api image
(quay.io/ambient_code/vteam_public_api:${{ needs.release.outputs.new_tag }}) but
the deploy-to-openshift job never applies that new tag, so production stays on
the old image; update the deploy-to-openshift job to replace or patch the
production kustomization for the "public-api" image with the new tag (use
needs.release.outputs.new_tag) before applying manifests, or include
"public-api" in the list of images the job updates; target the matrix entry name
"public-api", the image string quay.io/ambient_code/vteam_public_api, the
outputs variable needs.release.outputs.new_tag, and the deploy-to-openshift job
so the production kustomization
(components/manifests/overlays/production/kustomization.yaml) is updated to the
new tag.
- Line 157: The workflow matrix entry name "ambient-runner" doesn't match the
release form input "claude-runner", so selecting claude-runner builds nothing;
update the matrix item name to "claude-runner" (i.e., replace the "name" value
"ambient-runner" with "claude-runner" in the matrix entry that currently
contains context "./components/runners", image
"quay.io/ambient_code/vteam_claude_runner", and dockerfile
"./components/runners/ambient-runner/Dockerfile") so it matches
workflow_dispatch.inputs.components and the release form.
- Around line 166-169: The current filtering logic (variables FILTERED,
ALL_COMPONENTS, SELECTED) allows partial releases but the deploy job
(deploy-to-openshift) unconditionally rewrites component images to
needs.release.outputs.new_tag causing non-built images to be referenced; fix by
making the release tag update driven by the same filtered matrix: propagate
FILTERED (or the selected component names) into the release job outputs and into
deploy-to-openshift inputs, then change deploy-to-openshift to only update image
tags for components present in that filtered list (i.e., only touch
frontend/backend/operator/runner/state-sync/api-server entries if they exist in
FILTERED) rather than applying new_tag unconditionally; alternatively, if you
prefer full-image guarantees, set FORCE_ALL=true when producing release tags so
all images are built before deploy.
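The component-aware deploy idea from the last item can be sketched as a loop that emits one kustomize edit set image command per built component and nothing for the rest. The helper names are hypothetical, the commands are printed rather than executed, and the mapping of ambient-api-server to vteam_api_server is an assumption; the other image names are taken from the matrix entries and kustomize lines quoted in this review.

```shell
#!/bin/sh
# image_for COMPONENT
# Maps a matrix component name to its image repository.
image_for() {
  case "$1" in
    frontend)           printf 'quay.io/ambient_code/vteam_frontend\n' ;;
    backend)            printf 'quay.io/ambient_code/vteam_backend\n' ;;
    operator)           printf 'quay.io/ambient_code/vteam_operator\n' ;;
    ambient-runner)     printf 'quay.io/ambient_code/vteam_claude_runner\n' ;;
    state-sync)         printf 'quay.io/ambient_code/vteam_state_sync\n' ;;
    public-api)         printf 'quay.io/ambient_code/vteam_public_api\n' ;;
    ambient-api-server) printf 'quay.io/ambient_code/vteam_api_server\n' ;;  # assumed mapping
    *) return 1 ;;
  esac
}

# set_image_cmds BUILT TAG
# Prints a `kustomize edit set image` command for each component in BUILT
# (space-separated). Components that were not built are left untouched, so
# no manifest ends up pointing at a tag that was never pushed.
set_image_cmds() {
  built="$1"
  tag="$2"
  for name in $built; do
    image="$(image_for "$name")" || continue
    printf 'kustomize edit set image %s:latest=%s:%s\n' "$image" "$image" "$tag"
  done
}

set_image_cmds "backend public-api" v1.2.3
```

Driving the deploy step from the same filtered list as the build keeps the "what was built" and "what gets retagged" sets identical by construction.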

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d5efc04b-f0be-4ddc-bf3b-2ee6c43ab5b2

📥 Commits

Reviewing files that changed from the base of the PR and between 8985952 and cab4672.

📒 Files selected for processing (2)
  • .github/workflows/components-build-deploy.yml
  • .github/workflows/prod-release-deploy.yaml

{"name":"frontend","context":"./components/frontend","image":"quay.io/ambient_code/vteam_frontend","dockerfile":"./components/frontend/Dockerfile"},
{"name":"backend","context":"./components/backend","image":"quay.io/ambient_code/vteam_backend","dockerfile":"./components/backend/Dockerfile"},
{"name":"operator","context":"./components/operator","image":"quay.io/ambient_code/vteam_operator","dockerfile":"./components/operator/Dockerfile"},
{"name":"ambient-runner","context":"./components/runners","image":"quay.io/ambient_code/vteam_claude_runner","dockerfile":"./components/runners/ambient-runner/Dockerfile"},

⚠️ Potential issue | 🟠 Major

Keep the runner's selectable name consistent with the dispatch input.

workflow_dispatch.inputs.components still tells operators to pass claude-runner, but this matrix entry is named ambient-runner. A manual dispatch with components=claude-runner now filters to an empty set, so the runner image is silently skipped.
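A silent empty match like this one can be turned into a visible failure by checking the dispatch input against the known matrix names before filtering. The sketch below is an illustration under assumptions: KNOWN mirrors the component names mentioned in this conversation, and unknown_components is a hypothetical helper, not something in the workflows.

```shell
#!/bin/sh
# Matrix names as they appear in this conversation.
KNOWN="frontend backend operator ambient-runner state-sync public-api ambient-api-server"

# unknown_components SELECTED
# SELECTED is the comma-separated dispatch input. Prints every requested
# name that is not a known matrix name, one per line.
unknown_components() {
  for name in $(printf '%s' "$1" | tr ',' ' '); do
    case " $KNOWN " in
      *" $name "*) ;;                   # known name, nothing to report
      *) printf '%s\n' "$name" ;;
    esac
  done
}

# Example: the stale name is reported instead of silently matching nothing.
bad="$(unknown_components "backend, claude-runner")"
if [ -n "$bad" ]; then
  printf 'unknown component(s): %s\n' "$bad"
fi
```

A step could exit non-zero when the result is non-empty, which would have surfaced the claude-runner versus ambient-runner mismatch at dispatch time.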

{"name":"operator","context":"./components/operator","image":"quay.io/ambient_code/vteam_operator","dockerfile":"./components/operator/Dockerfile"},
{"name":"ambient-runner","context":"./components/runners","image":"quay.io/ambient_code/vteam_claude_runner","dockerfile":"./components/runners/ambient-runner/Dockerfile"},
{"name":"state-sync","context":"./components/runners/state-sync","image":"quay.io/ambient_code/vteam_state_sync","dockerfile":"./components/runners/state-sync/Dockerfile"},
{"name":"public-api","context":"./components/public-api","image":"quay.io/ambient_code/vteam_public_api","dockerfile":"./components/public-api/Dockerfile"},

⚠️ Potential issue | 🟠 Major

public-api is still built here, but this workflow no longer deploys it.

This matrix still publishes quay.io/ambient_code/vteam_public_api, while the deploy paths below no longer update that image. Since components/manifests/base/public-api-deployment.yaml and components/manifests/overlays/production/kustomization.yaml:110-115 still reference vteam_public_api, a public-api-only push or dispatch will succeed in CI and leave the cluster on the old image.

{"name":"frontend","context":"./components/frontend","image":"quay.io/ambient_code/vteam_frontend","dockerfile":"./components/frontend/Dockerfile"},
{"name":"backend","context":"./components/backend","image":"quay.io/ambient_code/vteam_backend","dockerfile":"./components/backend/Dockerfile"},
{"name":"operator","context":"./components/operator","image":"quay.io/ambient_code/vteam_operator","dockerfile":"./components/operator/Dockerfile"},
{"name":"ambient-runner","context":"./components/runners","image":"quay.io/ambient_code/vteam_claude_runner","dockerfile":"./components/runners/ambient-runner/Dockerfile"},

⚠️ Potential issue | 🟠 Major

Keep the runner's selectable name consistent with the release input.

The release form still documents claude-runner, but the matrix key here is ambient-runner. Selecting claude-runner in workflow_dispatch.inputs.components now produces no runner build.

- Fix dispatch input description: claude-runner → ambient-runner to
  match the matrix entry name, preventing silent no-ops on dispatch
- Add public-api to deploy-to-openshift: output detection, image tag
  determination, and kustomize set image (was built but never deployed)
- Make deploy-with-dispatch conditional: only update operator env vars
  and agent registry ConfigMap for components that were actually built,
  preventing references to non-existent SHA tags
- Make prod release deploy component-aware: iterate built components
  instead of unconditionally setting all image tags, preventing partial
  releases from referencing non-existent release tags

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
.github/workflows/components-build-deploy.yml (1)

94-95: ⚠️ Potential issue | 🟠 Major

public-api dispatches still build an image that never gets deployed.

Adding public-api to the selectable/buildable matrix here is only half of the fix: deploy-with-dispatch still never runs the matching kustomize edit set image quay.io/ambient_code/vteam_public_api:... update. A public-api-only dispatch will publish fresh tags and leave the cluster on the previous image.

Suggested follow-up
       - name: Update kustomization with stage image tags
         working-directory: components/manifests/overlays/production
         run: |
           kustomize edit set image quay.io/ambient_code/vteam_frontend:latest=quay.io/ambient_code/vteam_frontend:stage
           kustomize edit set image quay.io/ambient_code/vteam_backend:latest=quay.io/ambient_code/vteam_backend:stage
           kustomize edit set image quay.io/ambient_code/vteam_operator:latest=quay.io/ambient_code/vteam_operator:stage
           kustomize edit set image quay.io/ambient_code/vteam_claude_runner:latest=quay.io/ambient_code/vteam_claude_runner:stage
           kustomize edit set image quay.io/ambient_code/vteam_state_sync:latest=quay.io/ambient_code/vteam_state_sync:stage
           kustomize edit set image quay.io/ambient_code/vteam_api_server:latest=quay.io/ambient_code/vteam_api_server:stage
+          kustomize edit set image quay.io/ambient_code/vteam_public_api:latest=quay.io/ambient_code/vteam_public_api:stage
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/components-build-deploy.yml around lines 94 - 95, The
workflow currently builds "public-api" but the deploy job never updates its
image; update the deploy step(s) that run "kustomize edit set image ..." (in the
deploy-with-disptach job) to handle the public-api matrix entry by adding a
branch that maps the matrix item name "public-api" to the image
quay.io/ambient_code/vteam_public_api and runs the corresponding kustomize edit
set image command, and ensure the deploy-with-disptach job is triggered for
public-api-only dispatches so a published tag is applied to the cluster image.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/prod-release-deploy.yaml:
- Around line 150-176: After computing FILTERED, validate it and abort the
workflow if it’s empty to prevent tagging/releases with no artifacts: add a
check after FILTERED is set that uses jq to test 'FILTERED | length == 0' and if
true either echo a clear error and exit 1 (to fail the job) or set a sentinel
output like has-builds=false (echo "has-builds=false" >> $GITHUB_OUTPUT) and
exit non-zero; otherwise proceed to compute BUILD_MATRIX and MERGE_MATRIX and
echo has-builds=true so downstream steps can gate on that output.
- Around line 303-320: The current loop only updates built components (BUILT) to
${RELEASE_TAG}, leaving unbuilt IMAGEs pointing at the repo ":latest"
placeholder; fix by first seeding kustomize with the actually deployed tags for
every comp_image and then overriding only built ones. Concretely: before the for
loop that iterates comp_image/COMP/IMAGE, query the cluster (oc/kubectl) to
extract the currently deployed image tag for each IMAGE and run kustomize edit
set image for that IMAGE to set IMAGE:${CURRENT_TAG}; then run the existing loop
to run kustomize edit set image ${IMAGE}:latest=${IMAGE}:${RELEASE_TAG} only
when COMP is in BUILT so built components are overridden but untouched services
retain their deployed tags. Ensure you reference RELEASE_TAG, BUILT, comp_image,
COMP, IMAGE and use kustomize edit set image consistently.
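
The seed-then-override idea in the second comment can be sketched as a dry run. This is a minimal sketch, not the workflow's actual step: `plan_image_tags` and `current_tag_for` are hypothetical names, the tags are made-up sample values, `BUILT` is assumed to be a comma-separated list, and the cluster query is replaced by a stub. It also collapses the two passes into one decision per image, which yields the same final tags.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the kustomize commands instead of running them.

# Hypothetical stand-in for a cluster query (a real step would ask oc/kubectl
# for the tag currently deployed for the given image). Tags are sample values.
current_tag_for() {
  case "$1" in
    quay.io/ambient_code/vteam_backend) echo "v1.2.0" ;;
    *) echo "v1.1.0" ;;
  esac
}

# Usage: plan_image_tags BUILT RELEASE_TAG comp=image [comp=image ...]
# BUILT is assumed to be a comma-separated list of component names.
plan_image_tags() {
  local built="$1" release_tag="$2"
  shift 2
  local comp_image comp image tag
  for comp_image in "$@"; do
    comp="${comp_image%%=*}"   # text before the first '='
    image="${comp_image#*=}"   # text after the first '='
    case ",${built}," in
      *",${comp},"*) tag="$release_tag" ;;              # built: use the release tag
      *) tag="$(current_tag_for "$image")" ;;           # unbuilt: keep the deployed tag
    esac
    echo "kustomize edit set image ${image}:latest=${image}:${tag}"
  done
}

plan_image_tags "frontend" "v1.3.0" \
  "frontend=quay.io/ambient_code/vteam_frontend" \
  "backend=quay.io/ambient_code/vteam_backend"
```

With only `frontend` built, the frontend image moves to the release tag while the backend image keeps whatever tag the stub reports, so no service is left on the `:latest` placeholder.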

---

Duplicate comments:
In @.github/workflows/components-build-deploy.yml:
- Around line 94-95: The workflow currently builds "public-api" but the deploy
job never updates its image; update the deploy step(s) that run "kustomize edit
set image ..." (in the deploy-with-disptach job) to handle the public-api matrix
entry by adding a branch that maps the matrix item name "public-api" to the
image quay.io/ambient_code/vteam_public_api and runs the corresponding kustomize
edit set image command, and ensure the deploy-with-disptach job is triggered for
public-api-only dispatches so a published tag is applied to the cluster image.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d1d54452-68f0-4f30-bb8a-caed9a68031f

📥 Commits

Reviewing files that changed from the base of the PR and between cab4672 and 590ce1e.

📒 Files selected for processing (2)
  • .github/workflows/components-build-deploy.yml
  • .github/workflows/prod-release-deploy.yaml

@ambient-code ambient-code bot removed this from the Review Queue milestone Mar 13, 2026
@ambient-code ambient-code deleted a comment from ambient-code bot Mar 16, 2026
@Gkrumbach07
Contributor

Review Queue Status

| Check | Status | Detail |
| --- | --- | --- |
| CI | FAIL | End-to-End Tests failing |
| Conflicts | pass | --- |
| Reviews | FAIL | 2 Critical CodeRabbit issues on prod-release-deploy.yaml: partial releases can deploy non-existent tags and leave unbuilt services on :latest |
| Jira | warn | No Jira reference found |

Action needed: Address Critical CodeRabbit findings on release workflow before merge. Fix E2E tests.

Auto-generated by Review Queue workflow. Updated when PR changes.

maskarb and others added 2 commits March 16, 2026 10:00
…ection

Unbuilt components were left pointing at the repo's `:latest` placeholder
because kustomize only updated built components. Now queries the cluster for
currently deployed image tags and seeds kustomize with them before overriding
built components with the release tag.

Also aborts the release job early if the filtered component list is empty to
prevent tagging/releases with no artifacts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
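
The early abort described in this commit might look roughly like the following. It is a sketch under assumptions rather than the committed step: `check_has_builds` is a hypothetical helper, `FILTERED` is assumed to be a JSON array of matrix entries, `jq` must be installed, and the `has-builds` output name is taken from the review suggestion.

```shell
#!/usr/bin/env bash
# Requires jq. The argument is assumed to be a JSON array of matrix entries.

# Hypothetical helper: records has-builds and fails when nothing was selected.
check_has_builds() {
  local filtered="$1"
  # Outside GitHub Actions, fall back to stdout so the sketch runs locally.
  local out="${GITHUB_OUTPUT:-/dev/stdout}"
  # jq -e exits non-zero when the expression is false, so an empty array
  # (or unparseable input) takes the abort branch.
  if jq -e 'length > 0' <<<"$filtered" >/dev/null 2>&1; then
    echo "has-builds=true" >> "$out"
  else
    echo "has-builds=false" >> "$out"
    echo "No components selected; aborting release." >&2
    return 1
  fi
}

check_has_builds '[{"name":"frontend"}]'
```

In a workflow step, a non-zero return would fail the job before any tag or release is created, and downstream jobs could additionally gate on the `has-builds` output.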

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/prod-release-deploy.yaml:
- Line 364: The component presence check using echo "$BUILT" | grep -q "$COMP"
is fragile to substrings; change the conditional that references BUILT and COMP
to use word-boundary matching (e.g., grep -w or an anchored regex) so only
whole-component names match, updating the if statement that tests BUILT against
COMP to use the word-boundary-safe grep invocation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f587f437-3c7b-4cc4-bc05-f804c9930805

📥 Commits

Reviewing files that changed from the base of the PR and between 590ce1e and 1d3a6e3.

📒 Files selected for processing (1)
  • .github/workflows/prod-release-deploy.yaml

Wrap BUILT list with commas and match against ,COMP, to prevent
substring false positives (e.g. a future "runner" component matching
"ambient-runner").

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
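
The comma-wrapping technique from this commit can be illustrated with a small helper. This is a sketch only: `is_built` is a hypothetical name, and `BUILT` is assumed to be a comma-separated list with no spaces.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds only when COMP is a whole entry in the
# comma-separated BUILT list, never a substring of another entry.
is_built() {
  local built="$1" comp="$2"
  case ",${built}," in
    *",${comp},"*) return 0 ;;
    *) return 1 ;;
  esac
}

BUILT="frontend,ambient-runner"
for comp in frontend runner ambient-runner; do
  if is_built "$BUILT" "$comp"; then
    echo "${comp}: built"
  else
    echo "${comp}: skipped"
  fi
done
```

Here `runner` is reported as skipped even though it is a substring of `ambient-runner`, because the match requires a comma on both sides of the name.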
@Gkrumbach07
Contributor

@Mergifyio queue

@mergify

mergify bot commented Mar 16, 2026

Merge Queue Status

This pull request spent 28 seconds in the queue, including 2 seconds running CI.


@mergify mergify bot added the queued label Mar 16, 2026
mergify bot added a commit that referenced this pull request Mar 16, 2026
@mergify mergify bot merged commit a82a9ee into main Mar 16, 2026
19 checks passed
@mergify mergify bot deleted the ci/native-multiarch-builds branch March 16, 2026 15:12
@mergify mergify bot removed the queued label Mar 16, 2026