feat: Switch container builds to GHCR and deprecate ECR registry #109

@drew

Problem

Container images are currently published to AWS ECR (524473328983.dkr.ecr.us-west-2.amazonaws.com/navigator/*). Additionally, many CI workflows and configs still reference the old repo name nv-agent-env instead of nemoclaw in GHCR image paths (e.g., ghcr.io/nvidia/nv-agent-env/...).

We should consolidate all container publishing to GHCR under the correct repo name and remove ECR entirely.

Proposed Solution

1. Update all nv-agent-env references to nemoclaw

The following files reference the old repo name and need updating:

| File | References |
| --- | --- |
| .github/workflows/publish.yml | ghcr.io/nvidia/nv-agent-env/ci:latest (lines 44, 88) |
| .github/workflows/ci-image.yml | ghcr.io/nvidia/nv-agent-env/ci (line 15) |
| .github/workflows/docker-build.yml | ghcr.io/nvidia/nv-agent-env/ci:latest, ghcr.io/nvidia/nv-agent-env (lines 35, 44) |
| .github/workflows/e2e-test.yml | ghcr.io/nvidia/nv-agent-env/* (lines 21, 31, 33, 43, 53) |
| .github/workflows/checks.yml | ghcr.io/nvidia/nv-agent-env/ci:latest (lines 23, 44, 84) |
| .opencode/plans/github-actions-ecr-publish.md | Multiple references (lines 155, 254) |

All should become ghcr.io/nvidia/nemoclaw/....
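
The rename above could be scripted rather than done by hand. The following is a minimal sketch, not the project's actual tooling: the function name `rewrite_image_refs` is hypothetical, and it assumes the old prefix should only be replaced when it is a complete path segment.

```python
import re

OLD_PREFIX = "ghcr.io/nvidia/nv-agent-env"
NEW_PREFIX = "ghcr.io/nvidia/nemoclaw"

# Match the old prefix only when it is not followed by a letter, digit,
# underscore, or hyphen, so a hypothetical "nv-agent-env-foo" repository
# would be left untouched.
_OLD_REF = re.compile(re.escape(OLD_PREFIX) + r"(?![A-Za-z0-9_-])")


def rewrite_image_refs(text: str) -> str:
    """Return text with every old GHCR image prefix replaced by the new one."""
    return _OLD_REF.sub(NEW_PREFIX, text)
```

Applied to the contents of each file listed in the table, this leaves tags and sub-paths such as `/ci:latest` intact and only swaps the repository name.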

2. Switch publish pipeline from ECR to GHCR

Files with ECR-specific logic to migrate or remove:

| File | What to change |
| --- | --- |
| .github/workflows/publish.yml | Remove ECR login/push (line 64: 524473328983.dkr.ecr.us-west-2.amazonaws.com), publish all images to GHCR instead |
| tasks/scripts/docker-publish-multiarch.sh | Remove ECR mode (--mode ecr), default to GHCR registry |
| tasks/docker.toml | Update docker:publish:cluster:multiarch task description and remove ECR reference |
| tasks/publish.toml | Update publish:main and publish:tag to target GHCR instead of ECR |
| .opencode/plans/github-actions-ecr-publish.md | Update plan to reflect GHCR-only approach |
| architecture/build-containers.md | Remove ECR documentation, update examples |
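
To illustrate the intended target of the migration, here is a rough sketch of how an existing ECR image reference might map to its GHCR counterpart. It assumes that the image name and tag after `navigator/` carry over unchanged, which this issue does not specify; `to_ghcr` is a hypothetical helper, not part of the repository.

```python
ECR_PREFIX = "524473328983.dkr.ecr.us-west-2.amazonaws.com/navigator/"
GHCR_PREFIX = "ghcr.io/nvidia/nemoclaw/"


def to_ghcr(image_ref: str) -> str:
    """Map an ECR image reference to its assumed GHCR equivalent.

    Assumption: everything after "navigator/" (image name and tag)
    is kept as-is under the new GHCR prefix.
    """
    if not image_ref.startswith(ECR_PREFIX):
        raise ValueError(f"not an ECR navigator image: {image_ref!r}")
    return GHCR_PREFIX + image_ref[len(ECR_PREFIX):]
```

For example, `to_ghcr(ECR_PREFIX + "server:latest")` would yield `ghcr.io/nvidia/nemoclaw/server:latest` under this assumption.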

3. Remove AWS ECR credentials and config

  • Remove AWS OIDC role/credentials setup from publish.yml
  • Remove AWS_ACCOUNT_ID / AWS_REGION defaults from scripts
  • Clean up any ECR-specific IAM references

Tasks

  • Rename all ghcr.io/nvidia/nv-agent-env references to ghcr.io/nvidia/nemoclaw across workflows
  • Rebuild and push the CI image to ghcr.io/nvidia/nemoclaw/ci:latest
  • Update publish.yml to push sandbox, server, and cluster images to GHCR instead of ECR
  • Update docker-publish-multiarch.sh to remove ECR mode
  • Update mise tasks in docker.toml and publish.toml
  • Update architecture/build-containers.md documentation
  • Remove AWS OIDC and ECR credential configuration from workflows
  • Verify all CI workflows pass with new image paths
  • Deprecation notice: document that ECR images are no longer published

Definition of Done

  • All container images (ci, sandbox, server, cluster) published exclusively to ghcr.io/nvidia/nemoclaw/*
  • Zero references to nv-agent-env in source-controlled files
  • Zero references to ECR registry in active workflows/scripts
  • All CI workflows passing with updated image paths
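
The "zero references" criteria above could be checked mechanically. The sketch below is one possible approach, not an existing script in the repository: `find_stale_refs` and `STALE_MARKERS` are hypothetical names, and the `.dkr.ecr.` marker is an assumption that matches any ECR hostname rather than only the one named in this issue.

```python
from pathlib import Path

# Substrings that should no longer appear once the migration is done.
# Assumption: ".dkr.ecr." is a good enough marker for any ECR registry host.
STALE_MARKERS = ("nv-agent-env", ".dkr.ecr.")


def find_stale_refs(root):
    """Return (relative_path, line_number, line) for each line with a stale marker."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        if ".git" in rel.parts:
            continue  # skip Git internals
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if any(marker in line for marker in STALE_MARKERS):
                hits.append((str(rel), number, line.strip()))
    return hits
```

Running `find_stale_refs(".")` from the repository root and expecting an empty list would cover the `nv-agent-env` criterion. Because the ECR criterion is scoped to active workflows and scripts, hits in documentation such as the deprecation notice would need to be filtered out by the caller.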
