chore(infra): bump prod.tag to 2026.4.5-bf9f699 — flip prod to extended image #323

Merged: prez2307 merged 1 commit into main from chore/bump-prod-openclaw-tag on Apr 20, 2026

@prez2307 (Contributor)

Summary

  • prod.tag was "bootstrap", so container-stack.ts:318-322 fell through to the legacy upstream alpine/openclaw:2026.4.5, which has no clawhub, gh, or uv baked in.
  • Flipping to 2026.4.5-bf9f699 (same tag dev has been running) so the extended multi-stage image rolls out to prod.
  • Env var CLAWHUB_WORKDIR=/home/node/.openclaw was already set on the base task def; it just couldn't help until the CLI was actually present.
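
For readers without container-stack.ts open, the fall-through described above can be sketched roughly as follows. This is an illustration under assumptions, not the code at container-stack.ts:318-322: the function name, the constant names, and the extended-image repository string are all hypothetical; only the "bootstrap" placeholder and the alpine/openclaw:2026.4.5 fallback come from this PR.

```typescript
// Hypothetical sketch of the tag fall-through; not the real container-stack.ts.
const BOOTSTRAP_TAG = "bootstrap";
const LEGACY_UPSTREAM_IMAGE = "alpine/openclaw:2026.4.5";

// Returns the image reference a task definition would use for a given tag.
// `extendedRepo` stands in for wherever the extended image is published
// (an assumption; the PR does not name the repository).
function resolveOpenclawImage(tag: string, extendedRepo: string): string {
  if (tag === BOOTSTRAP_TAG) {
    // Placeholder tag: fall back to the upstream image (no clawhub / gh / uv).
    return LEGACY_UPSTREAM_IMAGE;
  }
  // Any real tag selects the extended multi-stage image.
  return `${extendedRepo}:${tag}`;
}

const exampleRepo = "registry.example.invalid/openclaw-extended"; // placeholder
const prodBefore = resolveOpenclawImage("bootstrap", exampleRepo);
// prodBefore === "alpine/openclaw:2026.4.5"
const prodAfter = resolveOpenclawImage("2026.4.5-bf9f699", exampleRepo);
// prodAfter === "registry.example.invalid/openclaw-extended:2026.4.5-bf9f699"
```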

Rollout behavior

  • Deploy registers a new base task def revision and re-pins the backend's ECS_TASK_DEFINITION env to it.
  • Existing per-user services keep launching from whatever revision they were registered against at provision time, so they will NOT automatically pick up the new image. To get existing users onto the extended image after deploy, trigger a re-register (no-op resize_user_container or delete+recreate service) per user.
  • New provisions created after deploy use the new base.

Test plan

  • Prod deploy succeeds (isol8-prod-container + isol8-prod-service stacks).
  • After deploy, in a freshly provisioned prod container, running which clawhub returns a path and running echo $CLAWHUB_WORKDIR prints /home/node/.openclaw.
  • clawhub install <slug> lands at /home/node/.openclaw/skills/<slug>.

🤖 Generated with Claude Code

…ded image

Prod was stuck on the "bootstrap" placeholder, so container-stack.ts fell
through to the legacy upstream alpine/openclaw:2026.4.5 — which does not
have clawhub (or gh, uv, etc.) baked in, and agents on prod hit
"clawhub: not in PATH" when installing skills.

Flipping to the same tag dev has been running (2026.4.5-bf9f699) so the
multi-stage extended image rolls out to prod. After deploy, existing
per-user services still launch from the task def revision registered at
provision time — they'll need a re-register (resize or delete+recreate)
to pick up the new base image.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
prez2307 merged commit ed84428 into main on Apr 20, 2026
prez2307 added a commit that referenced this pull request Apr 20, 2026
…def export bump (#324)

PR #323's prod deploy rolled back because isol8-prod-container tried to
update its OpenClawTaskDef export (new revision arn) while isol8-prod-service
was still importing the old value with no pending template diff of its
own. CFN blocks export updates when the consumer has no in-flight change
to resequence alongside.

isol8-dev didn't hit this today because its container stack had no task-def
change in PR #323 (dev.tag unchanged, only prod.tag). Previous dev task-def
updates succeeded because service-stack had pending changes in the same
deploy, letting CDK/CFN order the export refresh.

A trivial DEPLOY_NONCE env var on the backend container forces a service-
stack diff this run, which lets CFN reorchestrate the export update. No
runtime impact. Revert (or replace with a proper SSM-parameter indirection)
once PR #323 lands in prod.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
prez2307 added a commit that referenced this pull request Apr 21, 2026
…K_DEFINITION to .family, drop nonce (#326)

PR #323's prod deploy rolled back and every subsequent deploy hits the
same CloudFormation lock: isol8-prod-container can't update its
OpenClawTaskDef export because isol8-prod-service imports it. The lock
is checked against the consumer's live template, so no amount of pending
diff on service-stack this run helps (PR #324's DEPLOY_NONCE theory was
wrong — confirmed in today's failed deploy).

Quick fix to unblock deploys and split the extended-image rollout across two PRs. This PR:

1. openclaw-version.json: revert prod.tag to "bootstrap" so container-
   stack has no task-def diff on this deploy. Prod's base image stays
   at alpine/openclaw:2026.4.5 (upstream) — where it is now, so zero
   runtime regression.
2. service-stack.ts: swap ECS_TASK_DEFINITION from
   props.container.openclawTaskDef.taskDefinitionArn to
   props.container.openclawTaskDef.family. That's an inlined static
   string ("isol8-prod-openclaw"), not an Fn::ImportValue — the cross-
   stack coupling disappears. On this deploy: container no-ops, service
   updates, and the OpenClawTaskDef export becomes unused.
3. Drop DEPLOY_NONCE from PR #324 (unnecessary once the cross-stack
   coupling is gone).

Follow-up PR re-bumps prod.tag to the extended image; with no consumer
left on the export, CFN updates the task-def revision freely.
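
To illustrate why swapping the ARN for .family removes the coupling: an ECS task-definition ARN ends in <family>:<revision>, so its value changes on every new revision, while the family string stays fixed. A minimal sketch follows; the helper name, region, account ID, and revision numbers are made up for illustration, and only the family name "isol8-prod-openclaw" comes from this commit.

```typescript
// Illustrative helper (not from the repo): split a task-definition ARN of the
// form arn:<partition>:ecs:<region>:<account>:task-definition/<family>:<revision>.
interface TaskDefRef {
  family: string;
  revision: number;
}

function parseTaskDefArn(arn: string): TaskDefRef {
  const match =
    /^arn:[^:]+:ecs:[^:]*:[^:]*:task-definition\/([^:\/]+):(\d+)$/.exec(arn);
  if (match === null) {
    throw new Error(`not a task-definition ARN: ${arn}`);
  }
  return { family: match[1], revision: Number(match[2]) };
}

// Region, account ID, and revision numbers below are placeholders.
const oldRef = parseTaskDefArn(
  "arn:aws:ecs:us-east-1:123456789012:task-definition/isol8-prod-openclaw:41",
);
const newRef = parseTaskDefArn(
  "arn:aws:ecs:us-east-1:123456789012:task-definition/isol8-prod-openclaw:42",
);

// The ARN differs between revisions, so an export carrying it must change;
// the family is identical, so a value derived from it never needs updating.
const familyIsStable = oldRef.family === newRef.family; // true
const revisionChanged = oldRef.revision !== newRef.revision; // true
```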

Trade-off: .family reintroduces the "latest-in-family" lookup PR #299
moved away from. In practice safe under current code — CLAWHUB_WORKDIR
is now on every clone (added in PR #277, inherited by all per-user
clones since), and the per-user access point is always overridden by
_build_register_kwargs_from_base, so cross-user leakage can't happen.
A real fix (SSM-parameter indirection on the ARN) is worth doing later
to restore revision-pinning without the cross-stack lock.
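
For context on the "latest-in-family" behaviour: when ECS is given a bare family name instead of family:revision, it uses the latest ACTIVE revision in that family. The sketch below is a simplified in-memory model of that lookup, not backend code from this repo; the type and function names and the revision numbers are hypothetical.

```typescript
// Simplified in-memory model of ECS task-definition resolution (illustrative).
interface TaskDefRevision {
  family: string;
  revision: number;
  active: boolean;
}

// `ref` is either "family" (latest ACTIVE revision) or "family:revision" (pinned).
function resolveTaskDefinition(
  ref: string,
  registered: TaskDefRevision[],
): TaskDefRevision | undefined {
  const [family, rev] = ref.split(":");
  const candidates = registered.filter((r) => r.family === family && r.active);
  if (rev !== undefined) {
    // Pinned: only the exact revision is acceptable.
    return candidates.find((r) => r.revision === Number(rev));
  }
  // Unpinned: pick the highest ACTIVE revision in the family.
  return candidates.reduce<TaskDefRevision | undefined>(
    (best, r) => (best === undefined || r.revision > best.revision ? r : best),
    undefined,
  );
}

// Placeholder revisions; 43 is deregistered (inactive) in this example.
const registeredRevisions: TaskDefRevision[] = [
  { family: "isol8-prod-openclaw", revision: 41, active: true },
  { family: "isol8-prod-openclaw", revision: 42, active: true },
  { family: "isol8-prod-openclaw", revision: 43, active: false },
];

const unpinned = resolveTaskDefinition("isol8-prod-openclaw", registeredRevisions);
// unpinned?.revision === 42 (moves whenever a newer revision is registered)
const pinned = resolveTaskDefinition("isol8-prod-openclaw:41", registeredRevisions);
// pinned?.revision === 41 (stays put until the reference itself changes)
```

The unpinned form is what makes the trade-off above matter: anything registered later in the same family silently becomes the base for new clones.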

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
prez2307 added a commit that referenced this pull request Apr 21, 2026
Re-applies PR #323 now that PR #329 decoupled the task-def ARN from a
cross-stack Fn::ImportValue via SSM. container-stack can freely register
a new task-def revision with the extended OpenClaw image — the SSM param
value tracks the new ARN automatically, and no consumer imports the
revision-embedded export anymore.

After deploy:
- New provisions land on the extended image (clawhub baked in).
- Existing per-user services still launch from the task-def revisions
  they were registered against at provision time. Roll them forward via
  POST /container/updates with owner_id:"all" (banner + Update Now), or
  force-apply per-owner in a follow-up.
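
A rough sketch of what the roll-forward request might look like. Only the path POST /container/updates and the owner_id:"all" field come from this commit message; the helper name and base URL are hypothetical, and the real endpoint may well require authentication headers and additional body fields not shown here.

```typescript
// Hypothetical request builder; it only assembles the request, it sends nothing.
interface ContainerUpdateRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildContainerUpdateRequest(
  baseUrl: string,
  ownerId: string,
): ContainerUpdateRequest {
  return {
    method: "POST",
    // Strip trailing slashes so the path is joined with exactly one "/".
    url: `${baseUrl.replace(/\/+$/, "")}/container/updates`,
    headers: { "Content-Type": "application/json" },
    // Assumption: owner_id is the only field needed; "all" targets every owner.
    body: JSON.stringify({ owner_id: ownerId }),
  };
}

// Placeholder host; substitute the real backend URL.
const rollForwardAll = buildContainerUpdateRequest(
  "https://backend.example.invalid/",
  "all",
);
// rollForwardAll.url  === "https://backend.example.invalid/container/updates"
// rollForwardAll.body === '{"owner_id":"all"}'
```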

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>